Using Investopedia’s definition of Insider Trading:
Insider trading involves trading in a public company’s stock by someone who has non-public, material information about that stock for any reason. Insider trading can be either illegal or legal depending on when the insider makes the trade.
The illegal kind does not interest me, as it carries penalties such as fines and up to 20 years in prison. An example is when a CEO shares information with a friend before the public finds out about it. Let’s say the CEO learns that the cure their pharmaceutical company is developing will actually make the disease worse. The friend then uses that information to sell the stock and profit from the sale before the official news comes out and the stock price plummets. The CEO and the friend could both be prosecuted.
So when is insider trading legal? It is legal as long as the transaction conforms to the rules set by the Securities and Exchange Commission (SEC). Those rules became law in 1934 under the Securities Exchange Act of 1934. When corporate insiders trade their own company’s stock, they must report their trades to the SEC. The paperwork they file is publicly available on the SEC website through the EDGAR system, short for Electronic Data Gathering, Analysis, and Retrieval.
Directors and major shareholders must publicly disclose when they buy or sell shares of their own company. Many investors and traders use this information to identify companies with investment potential. If insiders are using their own money to buy a large number of shares in their company, it is probably a good sign. They may have good reasons and information backing up their decision to buy.
For this blog post, I wrote a Python program that reads the insider trading data publicly available on the SEC website. Thanks to this post from Wrigters.io, whose tutorial did most of the heavy lifting, I was able to start coding quickly. I then refined the logic and displayed the results in an HTML table.
Here’s a sample output of the Python code

Areas of Improvement:
- The ticker symbols “hard-coded” in this Python code are FAANG (FB, AAPL, AMZN, NFLX and GOOG). This can be improved by searching the filings of all 12,000+ publicly traded companies.
- For this sample code, the “hard-coded” filing dates run from 2022-05-13 to 2022-06-13. Adding an option to filter by filing date would give more flexibility.
- The output is saved as an HTML file in the local file system. It would be nice to create an API or a web application.
- The code only displays buy and sell transactions on stocks. It can be improved by including options trades.
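As a rough sketch of the date-filter improvement, the function below takes the form type and the date range as parameters instead of hard-coding them. It uses only the standard library and assumes the columnar layout that the script later loads into a DataFrame (a dict of parallel lists with accessionNumber, filingDate and form keys). The function name and the sample accession numbers are made up for illustration.

```python
from datetime import date


def filter_recent_filings(recent, form_type, start, end):
    """Return filings of one form type filed between start and end (inclusive).

    `recent` is assumed to be a dict of parallel lists, one list per column.
    `start` and `end` are ISO dates such as "2022-05-13".
    """
    start_date = date.fromisoformat(start)
    end_date = date.fromisoformat(end)
    keys = list(recent.keys())
    rows = []
    # zip(*columns) turns the parallel lists into one tuple per filing
    for values in zip(*(recent[key] for key in keys)):
        row = dict(zip(keys, values))
        filed = date.fromisoformat(row["filingDate"])
        if row["form"] == form_type and start_date <= filed <= end_date:
            rows.append(row)
    return rows


# Made-up sample data in the assumed layout
sample_recent = {
    "accessionNumber": ["0000000000-22-000001", "0000000000-22-000002", "0000000000-22-000003"],
    "filingDate": ["2022-05-12", "2022-05-13", "2022-06-13"],
    "form": ["4", "4", "10-Q"],
}
selected = filter_recent_filings(sample_recent, "4", "2022-05-13", "2022-06-13")
print(selected)
```

The start and end values could then come from command-line arguments or a web form rather than from the source code.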
PART 1 – Setup Python project workspace
Step 1. Install PyCharm Community Edition from https://www.jetbrains.com/pycharm/download/
This tutorial uses pycharm-community-2021.1.2.exe, the latest stable release at the time of writing.
Step 2. Open PyCharm. Create a New Project with a virtual environment. Recent PyCharm versions set up a new virtual environment for the project by default.
Step 3. Create a new file named requirements.txt. Enter the following library names:
pandas
requests>=2.25.1
yfinance
beautifulsoup4>=4.9.1
lxml
Step 4. Select “Terminal” on the bottom panel. This will launch a command line. Run the following command to install the libraries listed in requirements.txt:
pip3 install -r requirements.txt
PART 2 – Writing the Python script
Write the following code in main.py
Full code
import pandas as pd
import re
import copy
import requests, json
import yfinance as yf
from bs4 import BeautifulSoup
from lxml import etree
from datetime import datetime as dt
headers = {"User-Agent": "website.com email@email.com"}
TICKERS_CIK_URL = "https://www.sec.gov/files/company_tickers.json"
FILING_XML_URL = "https://www.sec.gov/Archives/edgar/data/"
FILING_SUBMISSION_URL = "https://data.sec.gov/submissions/CIK"
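
def make_url(cik, row):
    # The folder name is the accession number without dashes; the file name keeps them.
    accession_number = row["accessionNumber"].replace("-", "")
    return f"{FILING_XML_URL}{cik}/{accession_number}/{row['accessionNumber']}.txt"


def get_element_text(doc, path):
    # Return the text of the first node matching the XPath, or "" if nothing matches.
    matches = doc.xpath(path)
    if len(matches) > 0:
        return matches[0].text
    return ""


def get_transaction_type(acquired_or_disposed):
    # Default to "" so an unexpected code does not raise UnboundLocalError.
    transaction_type = ""
    if acquired_or_disposed == "D":
        transaction_type = "Sale"
    elif acquired_or_disposed == "A":
        transaction_type = "Purchase"
    return transaction_type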


def get_transaction(elem2, total1, total2, ticker):
    transaction_date = ""
    transaction_code = ""
    valid_transaction = True
    acquired_or_disposed = ""
    transaction_form_type = ""
    # Totals are returned per transaction; the caller accumulates them.
    total1 = 0
    total2 = 0
    for elem3 in elem2.getchildren():
        if elem3.tag == "transactiondate":
            for elem4 in elem3.getchildren():
                if elem4.tag == "value":
                    transaction_date = elem4.text
        # In Form 4 XML the code and form type sit under <transactionCoding>
        # (tag names are lowercase here because html.parser lowercases them).
        if elem3.tag == "transactioncoding":
            for elem4 in elem3.getchildren():
                if elem4.tag == "transactioncode":
                    transaction_code = elem4.text
                if elem4.tag == "transactionformtype":
                    transaction_form_type = elem4.text
            # Keep only open-market sales (S) and purchases (P) reported on Form 4.
            if (transaction_code != "S" and transaction_code != "P") or (transaction_form_type != "4"):
                valid_transaction = False
                break
        if elem3.tag == "transactionamounts":
            shares = 0
            price = 0
            acquired_or_disposed = ""
            for elem4 in elem3.getchildren():
                if elem4.tag == "transactionacquireddisposedcode":
                    for elem5 in elem4.getchildren():
                        if elem5.tag == "value":
                            acquired_or_disposed = elem5.text
                elif elem4.tag == "transactionshares":
                    for elem5 in elem4.getchildren():
                        if elem5.tag == "value" and elem5.text is not None:
                            shares = float(elem5.text)
                elif elem4.tag == "transactionpricepershare":
                    for elem5 in elem4.getchildren():
                        if elem5.tag == "value" and elem5.text is not None:
                            price = float(elem5.text)
            if price == 0:
                # No price reported: fall back to that day's closing price.
                try:
                    # yfinance treats `end` as exclusive, so ask for one extra day.
                    next_day = (pd.Timestamp(transaction_date) + pd.Timedelta(days=1)).strftime("%Y-%m-%d")
                    ticker_data = yf.Ticker(ticker).history(start=transaction_date, end=next_day)
                    price = ticker_data["Close"].iloc[0]
                except Exception as e:
                    print(f"exception {e}")
                    price = 1
            if elem2.tag == "nonderivativetransaction" and acquired_or_disposed == "D":
                stock_sale_amt = price * shares
                total2 += stock_sale_amt
            elif elem2.tag == "nonderivativetransaction" and acquired_or_disposed == "A":
                stock_bought_amt = price * shares
                total1 += stock_bought_amt
    transaction_detail = {"acquired_or_disposed": acquired_or_disposed,
                          "total_stocks_bought_dollar": total1, "total_stocks_sold_dollar": total2,
                          "valid_transaction": valid_transaction}
    return transaction_detail


def create_bought_transaction(transaction):
    # Copy the filing summary and keep only the purchase side.
    t = copy.copy(transaction)
    t["acquired_or_disposed"] = "A"
    t["total_stocks_sold_dollar"] = 0.0
    return t


def create_sold_transaction(transaction):
    # Copy the filing summary and keep only the sale side.
    t = copy.copy(transaction)
    t["acquired_or_disposed"] = "D"
    t["total_stocks_bought_dollar"] = 0.0
    return t


def convert_int_to_text(num):
    # Abbreviate large amounts, e.g. 1500000 -> "1.5M"; always return a string.
    if num >= 1000000000:
        return str(round(float(num / 1000000000), 2)) + "B"
    elif num >= 1000000:
        return str(round(float(num / 1000000), 2)) + "M"
    elif num >= 1000:
        return str(round(float(num / 1000), 2)) + "K"
    else:
        return str(num)


def get_document(cik, row, ticker):
    transactions = []
    url = make_url(cik, row)
    print(url)
    res = requests.get(url, headers=headers)
    res.raise_for_status()
    soup = BeautifulSoup(res.content, "html.parser")
    # use a case insensitive search for the root node of the XML document
    docs = soup.find_all(re.compile("ownershipDocument", re.IGNORECASE))
    if len(docs) > 0:
        doc = etree.fromstring(str(docs[0]))
        if docs[0].name == "ownershipdocument":
            owner = get_element_text(doc, "/ownershipdocument/reportingowner/reportingownerid/rptownername")
            is_director = get_element_text(doc, "/ownershipdocument/reportingowner/reportingownerrelationship/isdirector")
            if is_director == "1" or is_director == "true":
                is_director = "Yes"
            elif is_director == "0" or is_director == "false":
                is_director = "No"
            is_officer = get_element_text(doc, "/ownershipdocument/reportingowner/reportingownerrelationship/isofficer")
            if is_officer == "1" or is_officer == "true":
                is_officer = "Yes"
            elif is_officer == "0" or is_officer == "false":
                is_officer = "No"
            title = get_element_text(doc, "/ownershipdocument/reportingowner/reportingownerrelationship/officertitle")
            if title == "" and is_director == "Yes":
                title = "Director"
            other_text = get_element_text(doc, "/ownershipdocument/reportingowner/reportingownerrelationship/othertext")
            if title == "" and other_text is not None:
                title = other_text
            if title is None:
                title = ""
            is_ten_percent_owner = get_element_text(doc, "/ownershipdocument/reportingowner/reportingownerrelationship/istenpercentowner")
            if is_ten_percent_owner == "1" or is_ten_percent_owner == "true":
                is_ten_percent_owner = "Yes"
            elif is_ten_percent_owner == "0" or is_ten_percent_owner == "false":
                is_ten_percent_owner = "No"
            issuer_trading_symbol = get_element_text(doc, "/ownershipdocument/issuer/issuertradingsymbol")
            security = get_element_text(doc, "//securitytitle/value")
            date = get_element_text(doc, "//transactiondate/value")
            total_stocks_bought_dollar = 0
            total_stocks_sold_dollar = 0
            transaction = {}
            valid_transaction = False
            for elem in doc.getchildren():
                # Filing-level details, refreshed on every pass through the loop
                transaction["symbol"] = ticker
                transaction["issuer_trading_symbol"] = issuer_trading_symbol
                transaction["owner"] = owner
                transaction["security"] = security
                transaction["date"] = date
                transaction["is_director"] = is_director
                transaction["is_officer"] = is_officer
                transaction["is_ten_percent_owner"] = is_ten_percent_owner
                transaction["title"] = title
                if elem.tag == "nonderivativetable":
                    transaction = {}
                    for elem2 in elem.getchildren():
                        if elem2.tag == "nonderivativetransaction":
                            total1 = total_stocks_bought_dollar
                            total2 = total_stocks_sold_dollar
                            transaction = get_transaction(elem2, total1, total2, issuer_trading_symbol)
                            if not transaction["valid_transaction"]:
                                valid_transaction = False
                            elif transaction["valid_transaction"]:
                                if transaction["acquired_or_disposed"] == "A":
                                    total_stocks_bought_dollar += transaction["total_stocks_bought_dollar"]
                                if transaction["acquired_or_disposed"] == "D":
                                    total_stocks_sold_dollar += transaction["total_stocks_sold_dollar"]
                                valid_transaction = True
            # Store the filing totals and the relative link to the filing
            transaction["total_stocks_bought_dollar"] = total_stocks_bought_dollar
            transaction["total_stocks_sold_dollar"] = total_stocks_sold_dollar
            accession_number = row["accessionNumber"].replace("-", "")
            transaction["url"] = f"{cik}/{accession_number}/{row['accessionNumber']}.txt"
        else:
            raise ValueError(f"Don't know how to process {docs[0].name}")
        if valid_transaction is True:
            # Emit one row for purchases and one for sales when both are present
            if transaction["total_stocks_bought_dollar"] != 0:
                bought_transaction = create_bought_transaction(transaction)
                transactions.append(bought_transaction)
            if transaction["total_stocks_sold_dollar"] != 0:
                sold_transaction = create_sold_transaction(transaction)
                transactions.append(sold_transaction)
    return transactions


def main():
    # The SEC asks automated clients to identify themselves with a User-Agent header.
    r = requests.get(TICKERS_CIK_URL, headers=headers)
    r.raise_for_status()
    companies = json.loads(r.content)
    tickers = ["FB", "AMZN", "AAPL", "NFLX", "GOOG"]
    cik_lookup = {val["ticker"]: val["cik_str"] for val in companies.values()}
    rows = []
    for ticker in tickers:
        print(ticker)
        # Handle errors per ticker so that one failure (for example a symbol
        # missing from the SEC list) does not stop the remaining tickers.
        try:
            cik = cik_lookup[ticker]
            edgar_filings = requests.get(f"{FILING_SUBMISSION_URL}{cik:0>10}.json", headers=headers).json()
            recents = pd.DataFrame(edgar_filings["filings"]["recent"])
            recents["filingDate"] = pd.to_datetime(recents["filingDate"])
            insider_q1 = recents[(recents["form"] == "4") &
                                 (recents["filingDate"] >= "2022-05-13") &
                                 (recents["filingDate"] <= "2022-06-13")]
            for i in range(len(insider_q1)):
                rows.extend(get_document(cik, insider_q1.iloc[i], ticker))
        except Exception as e:
            print(f"{ticker}: {e}")
    html = ("<html> <head> <title> </title> <link rel='stylesheet' href='/static/sorta.css'>"
            "<script src='/static/sort-table.js'></script></head>"
            "<body><table border='1' class='js-sort-table'><thead><tr> "
            "<th class='js-sort-string'><b>Ticker</b></th>"
            "<th class='js-sort-string'><b>Insider Trader</b></th> "
            "<th class='js-sort-string'><b>Issuer </b></th> "
            "<th class='js-sort-string'><b>Director?</b></th> "
            "<th class='js-sort-string'><b>Officer?</b></th> "
            "<th class='js-sort-string'><b>10% Owner</b></th> "
            "<th class='js-sort-string'><b>Title</b></th> "
            "<th class='js-sort-string'><b>Transaction Date</b></th> "
            "<th class='js-sort-number'><b>Value ($)</b></th> "
            "<th class='js-sort-string'><b>Type</b></th> "
            "<th class='js-sort-string'><b>Filing Document</b></th> </tr></thead>")
    for row in rows:
        row_url = f"<a href=\"{FILING_XML_URL}{row['url']}\" target=\"_blank\">Filing</a>"
        transaction_type = get_transaction_type(row['acquired_or_disposed'])
        total = 0
        if transaction_type == "Sale":
            total = round(row['total_stocks_sold_dollar'])
        elif transaction_type == "Purchase":
            total = round(row['total_stocks_bought_dollar'])
        formatted_total = "{:,}".format(total)
        html += (f"<tr> <td>{row['symbol']}</td> <td>{row['owner']}</td> <td>{row['issuer_trading_symbol']}</td> "
                 f"<td>{row['is_director']}</td> <td>{row['is_officer']}</td> <td>{row['is_ten_percent_owner']}</td> "
                 f"<td>{row['title']}</td><td>{row['date']}</td> "
                 f"<td align='right' title='${convert_int_to_text(total)}'>{formatted_total}</td> "
                 f"<td>{transaction_type}</td> <td>{row_url}</td></tr>")
    html += "</table><br><br></body></html>"
    now = dt.now()
    # A forward slash avoids the stray backslash escape and also works on Windows.
    # The output folder must already exist.
    filename = "output/edgar" + now.strftime("%m%d_%H%M%S")
    with open(f"{filename}.html", "w") as file:
        file.write(html)


if __name__ == "__main__":
    main()
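To make the parsing step easier to follow, here is a small offline sketch of the same idea: walk the non-derivative transactions of a Form 4 document and total the dollar value of purchases and sales. It is not the code above. It uses the standard library xml.etree.ElementTree instead of BeautifulSoup and lxml, and it runs on a tiny made-up sample document (issuer XYZ, invented amounts) rather than a real filing. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed sample that mimics the layout of a Form 4 document
SAMPLE_FORM4 = """<ownershipDocument>
  <issuer><issuerTradingSymbol>XYZ</issuerTradingSymbol></issuer>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionCoding><transactionFormType>4</transactionFormType><transactionCode>S</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>100</value></transactionShares>
        <transactionPricePerShare><value>12.5</value></transactionPricePerShare>
        <transactionAcquiredDisposedCode><value>D</value></transactionAcquiredDisposedCode>
      </transactionAmounts>
    </nonDerivativeTransaction>
    <nonDerivativeTransaction>
      <transactionCoding><transactionFormType>4</transactionFormType><transactionCode>P</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>40</value></transactionShares>
        <transactionPricePerShare><value>10</value></transactionPricePerShare>
        <transactionAcquiredDisposedCode><value>A</value></transactionAcquiredDisposedCode>
      </transactionAmounts>
    </nonDerivativeTransaction>
    <nonDerivativeTransaction>
      <transactionCoding><transactionFormType>4</transactionFormType><transactionCode>A</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>500</value></transactionShares>
        <transactionPricePerShare><value>0</value></transactionPricePerShare>
        <transactionAcquiredDisposedCode><value>A</value></transactionAcquiredDisposedCode>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>"""


def summarize_form4(xml_text):
    """Total the dollar value of purchases (code P) and sales (code S)."""
    root = ET.fromstring(xml_text)
    totals = {"bought": 0.0, "sold": 0.0}
    for tx in root.iter("nonDerivativeTransaction"):
        code = tx.findtext("transactionCoding/transactionCode", default="")
        if code not in ("P", "S"):
            continue  # skip awards, gifts and other transaction codes
        amounts = "transactionAmounts/"
        shares = float(tx.findtext(amounts + "transactionShares/value", default="0") or 0)
        price = float(tx.findtext(amounts + "transactionPricePerShare/value", default="0") or 0)
        side = tx.findtext(amounts + "transactionAcquiredDisposedCode/value", default="")
        if side == "A":
            totals["bought"] += shares * price
        elif side == "D":
            totals["sold"] += shares * price
    return totals


form4_totals = summarize_form4(SAMPLE_FORM4)
print(form4_totals)
```

Unlike the script above, this sketch does not look up a closing price when the filing reports a price of zero, so such transactions simply add nothing to the totals.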
PART 3 – Running the Python Code
Step 1. From the PyCharm Terminal, run
python main.py
Step 2. Open the file generated by the Python program. It is saved in the output folder, so create that folder in the project before running the script. The filename should look like this: edgar0621_212022.html
The last column contains a link to the filing in the SEC EDGAR archives. An example is this link to a filing for FB.
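For reference, the two EDGAR addresses the script relies on can be sketched as small helper functions. The URL patterns are the ones used in main.py. The function names are made up for this example, 320193 is used as a sample CIK, and the accession number is a placeholder rather than a real filing.

```python
ARCHIVES_URL = "https://www.sec.gov/Archives/edgar/data/"
SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK"


def submissions_url(cik):
    # The submissions endpoint expects the CIK zero-padded to 10 digits
    return f"{SUBMISSIONS_URL}{int(cik):010d}.json"


def filing_text_url(cik, accession_number):
    # The folder is the accession number without dashes; the file name keeps them
    folder = accession_number.replace("-", "")
    return f"{ARCHIVES_URL}{int(cik)}/{folder}/{accession_number}.txt"


print(submissions_url(320193))
print(filing_text_url(320193, "0000320193-22-000001"))  # placeholder accession number
```

The first address returns the list of recent filings for a company, and the second is the link shown in the Filing Document column.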

Congratulations!
You may also explore the official API documentation from the SEC EDGAR website https://www.sec.gov/edgar/sec-api-documentation to further improve collecting insider trading information.
