How to "webscrape" a site containing a popup window, using python?

3.2k Views Asked by KALEB At 27 March 2020 at 06:28

I am trying to scrape a specific part of the Etherscan site with Python, since there is no API for this functionality. On the proxy contract checker page (the URL in the script below) you press Verify, and a popup then appears. What I need to scrape from that popup is the address, 0x0882477e7895bdc5cea7cb1552ed914ab157fe56, in the case where the popup shows the expected message.

I've written the Python script below as a starting point, but I don't know how to interact further with the site so that the popup appears and I can scrape the information. Is this possible?

from bs4 import BeautifulSoup
from requests import get

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0',
    'X-Requested-With': 'XMLHttpRequest',
}
url = "https://etherscan.io/proxyContractChecker?a=0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"
response = get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')

Thank You


αԋɱҽԃ αмєяιcαη On 27 March 2020 at 07:08 BEST ANSWER

import requests
from bs4 import BeautifulSoup

URL = "https://etherscan.io/proxyContractChecker"
HEADERS = {'User-Agent': 'Ahmed American :)'}


def main(address):
    with requests.Session() as req:
        # GET the page first to pick up the hidden ASP.NET form fields.
        r = req.get(URL, headers=HEADERS)
        soup = BeautifulSoup(r.content, 'html.parser')
        vs = soup.find("input", id="__VIEWSTATE").get("value")
        vsg = soup.find("input", id="__VIEWSTATEGENERATOR").get("value")
        ev = soup.find("input", id="__EVENTVALIDATION").get("value")
        data = {
            '__VIEWSTATE': vs,
            '__VIEWSTATEGENERATOR': vsg,
            '__EVENTVALIDATION': ev,
            'ctl00$ContentPlaceHolder1$txtContractAddress': address,
            'ctl00$ContentPlaceHolder1$btnSubmit': "Verify"
        }
        # POST the form back, which is what clicking Verify does in the browser.
        r = req.post(URL, params={'a': address}, data=data, headers=HEADERS)
        soup = BeautifulSoup(r.content, 'html.parser')
        # The address is the last space-separated word of the success alert.
        token = soup.find(
            "div", class_="alert alert-success").text.split(" ")[-1]
        print(token)


main("0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48")

Output:

0x0882477e7895bdc5cea7cb1552ed914ab157fe56

EnriqueBet On 27 March 2020 at 06:47

I disagree with @InfinityTM. The usual workflow for this kind of problem is to make a POST request to the website.

When you click Verify, the browser sends a POST request to the site. The original answer showed screenshots of that request, its headers, and its params.

You need to figure out how to send this POST request with the correct URL, headers, params, and cookies. Once the request succeeds, you will receive a response that contains the information you want to scrape, inside the div with class "alert alert-success".
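As a rough sketch of the "params" part, the helper below collects every hidden form input from the page returned by an initial GET and then adds the two visible fields. It assumes the form field names used in the other answer's code; HiddenInputCollector, build_payload, and the sample HTML are illustrative names and data, not anything Etherscan provides. It uses only the standard library's html.parser so it runs without extra installs.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_payload(html, address):
    """Return form data for the Verify POST, given the HTML of the GET response."""
    collector = HiddenInputCollector()
    collector.feed(html)
    payload = dict(collector.fields)
    # Assumption: these two field names are copied from the other answer's code.
    payload["ctl00$ContentPlaceHolder1$txtContractAddress"] = address
    payload["ctl00$ContentPlaceHolder1$btnSubmit"] = "Verify"
    return payload


# Hypothetical, trimmed-down page used only to demonstrate the helper.
sample_html = """
<form>
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz" />
  <input type="text" name="ctl00$ContentPlaceHolder1$txtContractAddress" />
</form>
"""
address = "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"
payload = build_payload(sample_html, address)
```

The resulting dict can be passed as data= to the session's post() call. Collecting all hidden inputs, rather than naming each one, means the payload should keep working if the page adds another hidden field.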

Summary

So the steps you need to follow are:

1. Navigate to the website and gather all the information (request URL, cookies, headers, and params) that you will need for your POST request.
2. Make the request with the requests library.
3. Once you get a <200> response, scrape the data you are interested in with BeautifulSoup.
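For the last step, splitting the alert text on spaces works as long as the address is the final word. A slightly more defensive sketch is to search the text for an Ethereum address pattern (0x followed by 40 hex characters). extract_address and the sample message below are made up for illustration; the real alert wording may differ.

```python
import re

# An Ethereum address is "0x" followed by exactly 40 hexadecimal characters.
ADDRESS_RE = re.compile(r"\b0x[0-9a-fA-F]{40}\b")


def extract_address(alert_text):
    """Return the first Ethereum address found in the text, or None."""
    match = ADDRESS_RE.search(alert_text)
    return match.group(0) if match else None


# Hypothetical alert text; only the address itself comes from the question.
sample_alert = (
    "Some success message ending with the implementation address: "
    "0x0882477e7895bdc5cea7cb1552ed914ab157fe56"
)
found = extract_address(sample_alert)
missing = extract_address("no address in this text")
```

Returning None when nothing matches lets the caller tell "verification failed" apart from a parsing error, instead of silently printing the last word of an unrelated message.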

Please let me know if this points you in the right direction! :D
