The Standard and Poor’s 500, or simply the S&P 500, is a stock market index tracking the performance of 500 large companies listed on stock exchanges in the United States. According to Investopedia, the S&P 500 is regarded as one of the best gauges of the performance of prominent American equities and, by extension, of the stock market overall.

The S&P 500 constituents are rebalanced on a quarterly basis. For this reason, if you are developing software that depends on the constituent list, you need a way to retrieve the latest list programmatically.

There are many ways to do this. To get the list for free in a simple and efficient way, we can use web scraping (or data scraping), a technique for collecting content and data from the internet. We will write the program in Python.

Wikipedia publishes the current S&P 500 component stocks. In this tutorial, we will extract the data from this page: https://en.wikipedia.org/wiki/List_of_S%26P_500_companies.

The Python program consists of fewer than 10 lines of code. Here are the steps to accomplish our objective:

Step 1. Install the required packages. In a terminal (for example, the one built into your favorite Python IDE), run these commands:

pip install pandas
pip install lxml
pip install numpy

Step 2. Import the packages in your Python file

import pandas as pd
import numpy as np

Step 3. Using the pandas library, write the code to scrape the Wikipedia page and extract its tables

sp500 = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')

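Here pandas reads the HTML of the Wikipedia list of S&P 500 companies and loads the contents of each table it finds into a DataFrame. The result, stored in the Python variable sp500, is a list of DataFrames, and the constituents table is the first one (index 0).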
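
Because pd.read_html returns a list with one DataFrame per table it finds, it can help to see that behaviour on a tiny example before pointing it at the live page. The sketch below is only an illustration: the HTML string and its rows are made-up stand-ins, not the real Wikipedia markup, and it assumes lxml is installed as in Step 1.

```python
from io import StringIO

import pandas as pd

# A made-up two-table page standing in for the real Wikipedia HTML.
SAMPLE_HTML = """
<table>
  <tr><th>Symbol</th><th>Security</th></tr>
  <tr><td>MMM</td><td>3M</td></tr>
  <tr><td>AOS</td><td>A. O. Smith</td></tr>
</table>
<table>
  <tr><th>Date</th><th>Reason</th></tr>
  <tr><td>2022-01-01</td><td>Example change</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> element.
tables = pd.read_html(StringIO(SAMPLE_HTML))
print(len(tables))

# In this sample the constituents table comes first, so index 0 selects it.
constituents = tables[0]
print(constituents.columns.tolist())

# match= keeps only the tables whose text matches the given string or regex.
by_match = pd.read_html(StringIO(SAMPLE_HTML), match="Symbol")
print(len(by_match))
```

Selecting by position (index 0) is the shortest option, while match= may hold up better if the order of tables on a page changes.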

Step 4. Extract the list of symbols from the table scraped from the Wikipedia page

sp500_list = np.array(sp500[0]['Symbol'])

The sp500_list variable will contain the symbols of all S&P 500 companies, as shown below:


['MMM' 'AOS' 'ABT' 'ABBV' 'ABMD' 'ACN' 'ATVI' 'ADM' 'ADBE' 'ADP' 'AAP'
 'AES' 'AFL' 'A' 'AIG' 'APD' 'AKAM' 'ALK' 'ALB' 'ARE' 'ALGN' 'ALLE' 'LNT'
 'ALL' 'GOOGL' 'GOOG' 'MO' 'AMZN' 'AMCR' 'AMD' 'AEE' 'AAL' 'AEP' 'AXP'
 'AMT' 'AWK' 'AMP' 'ABC' 'AME' 'AMGN' 'APH' 'ADI' 'ANSS' 'ANTM' 'AON'
 'APA' 'AAPL' 'AMAT' 'APTV' 'ANET' 'AIZ' 'T' 'ATO' 'ADSK' 'AZO' 'AVB'
 'AVY' 'BKR' 'BALL' 'BAC' 'BBWI' 'BAX' 'BDX' 'WRB' 'BRK.B' 'BBY' 'BIO'
 'TECH' 'BIIB' 'BLK' 'BK' 'BA' 'BKNG' 'BWA' 'BXP' 'BSX' 'BMY' 'AVGO' 'BR'
 'BRO' 'BF.B' 'CHRW' 'CDNS' 'CZR' 'CPT' 'CPB' 'COF' 'CAH' 'KMX' 'CCL'
 'CARR' 'CTLT' 'CAT' 'CBOE' 'CBRE' 'CDW' 'CE' 'CNC' 'CNP' 'CDAY' 'CERN'
 'CF' 'CRL' 'SCHW' 'CHTR' 'CVX' 'CMG' 'CB' 'CHD' 'CI' 'CINF' 'CTAS' 'CSCO'
 'C' 'CFG' 'CTXS' 'CLX' 'CME' 'CMS' 'KO' 'CTSH' 'CL' 'CMCSA' 'CMA' 'CAG'
 'COP' 'ED' 'STZ' 'CEG' 'COO' 'CPRT' 'GLW' 'CTVA' 'COST' 'CTRA' 'CCI'
 'CSX' 'CMI' 'CVS' 'DHI' 'DHR' 'DRI' 'DVA' 'DE' 'DAL' 'XRAY' 'DVN' 'DXCM'
 'FANG' 'DLR' 'DFS' 'DISH' 'DIS' 'DG' 'DLTR' 'D' 'DPZ' 'DOV' 'DOW' 'DTE'
 'DUK' 'DRE' 'DD' 'DXC' 'EMN' 'ETN' 'EBAY' 'ECL' 'EIX' 'EW' 'EA' 'EMR'
 'ENPH' 'ETR' 'EOG' 'EPAM' 'EFX' 'EQIX' 'EQR' 'ESS' 'EL' 'ETSY' 'RE'
 'EVRG' 'ES' 'EXC' 'EXPE' 'EXPD' 'EXR' 'XOM' 'FFIV' 'FDS' 'FAST' 'FRT'
 'FDX' 'FITB' 'FRC' 'FE' 'FIS' 'FISV' 'FLT' 'FMC' 'F' 'FTNT' 'FTV' 'FBHS'
 'FOXA' 'FOX' 'BEN' 'FCX' 'AJG' 'GRMN' 'IT' 'GE' 'GNRC' 'GD' 'GIS' 'GPC'
 'GILD' 'GL' 'GPN' 'GM' 'GS' 'GWW' 'HAL' 'HIG' 'HAS' 'HCA' 'PEAK' 'HSIC'
 'HSY' 'HES' 'HPE' 'HLT' 'HOLX' 'HD' 'HON' 'HRL' 'HST' 'HWM' 'HPQ' 'HUM'
 'HII' 'HBAN' 'IEX' 'IDXX' 'ITW' 'ILMN' 'INCY' 'IR' 'INTC' 'ICE' 'IBM'
 'IP' 'IPG' 'IFF' 'INTU' 'ISRG' 'IVZ' 'IPGP' 'IQV' 'IRM' 'JBHT' 'JKHY' 'J'
 'JNJ' 'JCI' 'JPM' 'JNPR' 'K' 'KEY' 'KEYS' 'KMB' 'KIM' 'KMI' 'KLAC' 'KHC'
 'KR' 'LHX' 'LH' 'LRCX' 'LW' 'LVS' 'LDOS' 'LEN' 'LLY' 'LNC' 'LIN' 'LYV'
 'LKQ' 'LMT' 'L' 'LOW' 'LUMN' 'LYB' 'MTB' 'MRO' 'MPC' 'MKTX' 'MAR' 'MMC'
 'MLM' 'MAS' 'MA' 'MTCH' 'MKC' 'MCD' 'MCK' 'MDT' 'MRK' 'FB' 'MET' 'MTD'
 'MGM' 'MCHP' 'MU' 'MSFT' 'MAA' 'MRNA' 'MHK' 'MOH' 'TAP' 'MDLZ' 'MPWR'
 'MNST' 'MCO' 'MS' 'MOS' 'MSI' 'MSCI' 'NDAQ' 'NTAP' 'NFLX' 'NWL' 'NEM'
 'NWSA' 'NWS' 'NEE' 'NLSN' 'NKE' 'NI' 'NDSN' 'NSC' 'NTRS' 'NOC' 'NLOK'
 'NCLH' 'NRG' 'NUE' 'NVDA' 'NVR' 'NXPI' 'ORLY' 'OXY' 'ODFL' 'OMC' 'OKE'
 'ORCL' 'OGN' 'OTIS' 'PCAR' 'PKG' 'PARA' 'PH' 'PAYX' 'PAYC' 'PYPL' 'PENN'
 'PNR' 'PEP' 'PKI' 'PFE' 'PM' 'PSX' 'PNW' 'PXD' 'PNC' 'POOL' 'PPG' 'PPL'
 'PFG' 'PG' 'PGR' 'PLD' 'PRU' 'PEG' 'PTC' 'PSA' 'PHM' 'PVH' 'QRVO' 'PWR'
 'QCOM' 'DGX' 'RL' 'RJF' 'RTX' 'O' 'REG' 'REGN' 'RF' 'RSG' 'RMD' 'RHI'
 'ROK' 'ROL' 'ROP' 'ROST' 'RCL' 'SPGI' 'CRM' 'SBAC' 'SLB' 'STX' 'SEE'
 'SRE' 'NOW' 'SHW' 'SBNY' 'SPG' 'SWKS' 'SJM' 'SNA' 'SEDG' 'SO' 'LUV' 'SWK'
 'SBUX' 'STT' 'STE' 'SYK' 'SIVB' 'SYF' 'SNPS' 'SYY' 'TMUS' 'TROW' 'TTWO'
 'TPR' 'TGT' 'TEL' 'TDY' 'TFX' 'TER' 'TSLA' 'TXN' 'TXT' 'TMO' 'TJX' 'TSCO'
 'TT' 'TDG' 'TRV' 'TRMB' 'TFC' 'TWTR' 'TYL' 'TSN' 'USB' 'UDR' 'ULTA' 'UAA'
 'UA' 'UNP' 'UAL' 'UNH' 'UPS' 'URI' 'UHS' 'VLO' 'VTR' 'VRSN' 'VRSK' 'VZ'
 'VRTX' 'VFC' 'VTRS' 'V' 'VNO' 'VMC' 'WAB' 'WMT' 'WBA' 'WBD' 'WM' 'WAT'
 'WEC' 'WFC' 'WELL' 'WST' 'WDC' 'WRK' 'WY' 'WHR' 'WMB' 'WTW' 'WYNN' 'XEL'
 'XYL' 'YUM' 'ZBRA' 'ZBH' 'ZION' 'ZTS']
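
A few symbols in the output above, such as BRK.B and BF.B, use a dot to mark the share class. Some data providers expect a different separator (Yahoo Finance, for example, writes BRK-B), so a small clean-up step can be useful before passing the list on. The helper name normalize_symbols and the default hyphen are assumptions made for this sketch, not part of pandas or of the original program.

```python
def normalize_symbols(symbols, separator="-"):
    """Return cleaned ticker strings with the share-class dot replaced.

    Hypothetical helper: the right separator depends on your data provider.
    """
    return [str(s).strip().upper().replace(".", separator) for s in symbols]


# A short hand-written sample; in practice you would pass sp500_list.
sample = ["MMM", "BRK.B", " bf.b ", "GOOGL"]
print(normalize_symbols(sample))
```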


Voilà! It's that simple.

Here is the full Python code for getting the list of S&P 500 companies' stock ticker symbols:

import pandas as pd
import numpy as np

sp500 = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
sp500_list = np.array(sp500[0]['Symbol'])
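
If the direct pd.read_html(url) call is refused by the server (for example with an HTTP 403 error), one possible workaround is to download the page yourself with an explicit User-Agent header and hand the text to pandas. The following is a minimal sketch under that assumption: fetch_html, extract_symbols and the User-Agent string are hypothetical names and values chosen for illustration, and looking the table up by its "Symbol" column is a design choice rather than the method used above.

```python
from io import StringIO
from urllib.request import Request, urlopen

import pandas as pd

URL = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"


def fetch_html(url, user_agent="sp500-tutorial-script/0.1"):
    """Download a page as text, sending an explicit User-Agent header.

    The User-Agent value is a placeholder; replace it with your own.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


def extract_symbols(html, column="Symbol"):
    """Return the values of `column` from the first table that has it."""
    for table in pd.read_html(StringIO(html)):
        if column in table.columns:
            return table[column].astype(str).tolist()
    raise ValueError(f"No table with a {column!r} column was found")


# Usage (needs network access):
# sp500_list = extract_symbols(fetch_html(URL))
```

Searching for the column by name instead of relying on sp500[0] means the code fails with a clear error if the page layout changes, rather than silently reading the wrong table.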