How to scrape an Incapsula-protected website?

1.5k Views Asked by user1424739 At 20 October 2019 at 15:08

https://www.genecards.org/cgi-bin/carddisp.pl?gene=ZSCAN22

On the above webpage, if I click "See all 33", Chrome DevTools shows that the following GET request is sent.

https://www.genecards.org/gene/api/data/Enhancers?geneSymbol=ZSCAN22

Accessing this URL directly is blocked.

I have tried using Puppeteer. I can click "See all 33" with Puppeteer, but then I need to parse the resulting HTML. It would be best to get the results directly from https://www.genecards.org/gene/api/data/Enhancers?geneSymbol=ZSCAN22, but I am not sure how to capture that response after clicking "See all 33" with Puppeteer.

I am not sure whether Apify can help.

Can anybody let me know how to scrape it?

There is 1 solution below

mmblack On 12 March 2021 at 14:28

I used Selenium and it works fine:

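from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 style: the driver path goes through Service (adjust the path for your machine).
browser = webdriver.Chrome(service=Service("C:/src/webdriver/chromedriver.exe"))
genesLocations = 'https://www.genecards.org/cgi-bin/carddisp.pl?gene={}'

# Extract Genomic Locations
gene = 'ZSCAN22'
browser.get(genesLocations.format(gene))
# find_element_by_xpath was removed in Selenium 4; use find_element(By.XPATH, ...).
location = browser.find_element(By.XPATH, '//*[@id="genomic_location"]/div/div[3]/div/div')
print(location.text)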
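
The question asks for the raw response of the Enhancers endpoint rather than the rendered HTML. One possible approach, sketched here and not verified against this site, is to load the gene page in the Selenium-driven browser first and then request the API URL from inside that page, so the request carries the cookies the browser already holds. The helper names enhancers_url, fetch_enhancers and FETCH_JS are made up for illustration, and there is no guarantee that Incapsula lets the in-page request through.

```python
from urllib.parse import urlencode

# URLs taken from the question.
API_BASE = "https://www.genecards.org/gene/api/data/Enhancers"
CARD_URL = "https://www.genecards.org/cgi-bin/carddisp.pl?gene={}"


def enhancers_url(gene: str) -> str:
    """Build the Enhancers API URL seen in DevTools for a gene symbol."""
    return f"{API_BASE}?{urlencode({'geneSymbol': gene})}"


# JavaScript executed inside the loaded page. Selenium passes a completion
# callback as the last argument of an async script; calling it returns the
# value to Python.
FETCH_JS = """
const done = arguments[arguments.length - 1];
fetch(arguments[0], {credentials: 'include'})
  .then(r => r.text())
  .then(body => done(body))
  .catch(err => done('ERROR: ' + err));
"""


def fetch_enhancers(browser, gene: str) -> str:
    """Open the gene card, then request the API from within that page."""
    browser.get(CARD_URL.format(gene))
    browser.set_script_timeout(30)  # seconds to wait for the callback
    return browser.execute_async_script(FETCH_JS, enhancers_url(gene))


# Usage (needs Selenium and a Chrome driver):
#   from selenium import webdriver
#   browser = webdriver.Chrome()
#   print(fetch_enhancers(browser, "ZSCAN22"))
#   browser.quit()
```

The response body is returned as text because its format is not shown in the question; if it turns out to be JSON, json.loads can parse it afterwards.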
