We are trying to parse an SEC EDGAR filing using Python. We want to extract the table "Sales By Segment Of Business" at line 21. This is the link to the document:
https://www.sec.gov/ix?doc=/Archives/edgar/data/200406/000020040621000057/jnj-20210704.htm
Below is the code we found online. All the data on the web page is under this tag:
<div id="dynamic-xbrl-form" class="position-relative">
We are not able to print this data.
from bs4 import BeautifulSoup
import requests
# Access page
cik = '200406'
filing_type = '10-K'
dateb = '20210704'
# Obtain HTML for search page
base_url = "https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK={}&type={}&dateb={}"
edgar_resp = requests.get(base_url.format(cik, filing_type, dateb))
edgar_str = edgar_resp.text
# Parse the search page and print it
doc_link = ''
soup = BeautifulSoup(edgar_str, 'html.parser')
print(soup)
Can anyone help us get this data? Any suggestion is helpful.
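The URL you mentioned is a dynamic page, so the HTML that requests receives does not contain the table. However, the page content is loaded from a static page, which appears to be the document named by the doc= parameter of that URL.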
You can scrape this page and extract the data.
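As a rough sketch of how the static address can be obtained, the snippet below pulls the `doc` query parameter out of the viewer URL and joins it onto the same host. The function name `static_url_from_viewer` is made up for illustration, and it assumes the viewer always carries the document path in `doc=`, as the URL in the question does.

```python
from urllib.parse import parse_qs, urljoin, urlsplit


def static_url_from_viewer(viewer_url):
    """Return the static document URL named by the viewer's `doc` parameter.

    Hypothetical helper: assumes the path after `doc=` is the static filing.
    """
    parts = urlsplit(viewer_url)
    doc_paths = parse_qs(parts.query).get("doc")
    if not doc_paths:
        raise ValueError("no 'doc' parameter found in the URL")
    # Join the document path onto the scheme and host of the viewer URL.
    return urljoin(f"{parts.scheme}://{parts.netloc}", doc_paths[0])
```

Calling it with the link from the question should give the same address without the `/ix?doc=` part.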
Here is the code that scrapes the data you need.
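A minimal sketch of such a scraper follows; treat it as one possible approach under stated assumptions rather than a definitive implementation. It assumes the heading text sits in a single text node, that the wanted table is the first `<table>` after the first match, and that SEC expects a `User-Agent` header identifying the client (the value shown is a placeholder). The helper names `fetch_html` and `table_after_heading` are made up for illustration.

```python
import re

import requests
from bs4 import BeautifulSoup

# Placeholder: replace with your own name and contact address.
HEADERS = {"User-Agent": "Sample Name sample@example.com"}


def fetch_html(url):
    """Download a page and return its HTML, raising on HTTP errors."""
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.text


def table_after_heading(html, heading):
    """Return the rows of the first <table> that follows the heading text.

    Each row is a list of non-empty cell strings. Returns [] when the
    heading or a following table is not found.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Case-insensitive search for the first text node containing the heading.
    node = soup.find(string=re.compile(re.escape(heading), re.IGNORECASE))
    if node is None:
        return []
    table = node.find_next("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [c.get_text(" ", strip=True) for c in tr.find_all(["th", "td"])]
        cells = [c for c in cells if c]  # drop empty spacer cells
        if cells:
            rows.append(cells)
    return rows
```

To use it, pass the static page address to `fetch_html` and the result to `table_after_heading` together with "Sales By Segment Of Business", then print the returned rows. If the phrase also appears earlier in the filing (for example in an index), the first match may not be the one you want, so check the output.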