I'm trying to scrape the links to the 400 models listed on this website: https://www.printables.com/model?category=14&fileType=fff&includeUserGcodes=1, which I refer to as webpage in my code below. However, when I run my code, I get no links.
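import requests
from bs4 import BeautifulSoup

webpage = 'https://www.printables.com/model?category=14&fileType=fff&includeUserGcodes=1'
User_agent = {'User-agent': 'Mozilla/5.0 (X11; CrOS i686 4319.74.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.57 Safari/537.36'}
r = requests.get(webpage, headers=User_agent).text
soup = BeautifulSoup(r, 'html5lib')
for link in soup.find_all('a', href=True):  # skip <a> tags that have no href
    print(link['href'])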
So I checked whether the links are present at all with print(soup.prettify()), and none of the desired links appear in the parsed HTML either. This led me to assume that the website doesn't allow scraping, but the response's status_code is 200, which suggests the request itself succeeded.
Is there a different approach I could take? Where else would these links be stored? Thank you.
The data is loaded from an external URL via JavaScript, so BeautifulSoup doesn't see it in the HTML that requests downloads. To get info about all items, you can use the following example:
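A minimal sketch of that approach, under stated assumptions rather than a confirmed API: open the browser's developer tools, watch the Network tab while the listing loads, and call the JSON endpoint you find there directly. The URL in API_URL is a placeholder, the helper names fetch_json and model_links are hypothetical, and the 'id' and 'slug' fields and the link pattern are assumptions about the response shape that you should check against what the endpoint actually returns. The standard library is used here so the sketch has no dependencies; requests would work just as well.

```python
import json
import urllib.request

# Placeholder, NOT the real endpoint: replace it with the request URL shown in
# the browser's Network tab (XHR/Fetch) while the listing page loads.
API_URL = "https://example.com/api/models"


def fetch_json(url, payload=None):
    """GET the URL, or POST payload as JSON if one is given; return decoded JSON."""
    headers = {"User-Agent": "Mozilla/5.0"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def model_links(items, base="https://www.printables.com/model/"):
    """Build page links, ASSUMING each item carries an 'id' and a 'slug' field."""
    return [f"{base}{item['id']}-{item['slug']}" for item in items]
```

Once you know where the list of items sits in the decoded response, pass that list to model_links. If the endpoint is paginated, repeat the request with the next offset or cursor until you have collected all 400 items.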
Prints: