How to parse a <tbody> that is filled in by JavaScript with bs4?


This code shows the table as empty, but it is not: the web page fills the table with the help of some JS code.

I don't know how to parse it. Please tell me how to parse it.

import bs4 as bs
import urllib.request

source = urllib.request.urlopen('https://monerobenchmarks.info/').read()
soup = bs.BeautifulSoup(source,'lxml')

table = soup.find('table')
table_rows = table.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)

There are 3 best solutions below

Best answer

Use Selenium to render the page in a real browser, then pass the rendered HTML to bs4. Either the Chrome or the Firefox driver works:

from bs4 import BeautifulSoup as Soup

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://monerobenchmarks.info/")

page = Soup(driver.page_source, features='html.parser')

table = page.find('table')
table_rows = table.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)

driver.quit()  # close the browser once the rows have been printed
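
One caveat with the snippet above: `driver.page_source` is read as soon as `get()` returns, and the script may not have injected the rows yet at that point. A minimal sketch of waiting for the first body row, assuming the rows show up as `tr` elements inside the table's `tbody`. The selector `table tbody tr`, the 15-second timeout, and the helper names `fetch_rendered_html` and `rows_from_html` are assumptions for illustration, not something taken from the site:

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url, timeout=15):
    """Load url in Chrome and return the HTML once a body row exists."""
    # Imported here so the parsing helper below works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until at least one <tr> is present inside <tbody>;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "table tbody tr"))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on timeout


def rows_from_html(html):
    """Return the text of every <td>, one list per row that has data cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows hold <th>, not <td>, so they are skipped
            rows.append(cells)
    return rows


# Usage (needs Chrome, a matching driver, and network access):
# html = fetch_rendered_html("https://monerobenchmarks.info/")
# for row in rows_from_html(html):
#     print(row)
```

Keeping the parsing in its own function means it can be tried on any saved HTML without starting a browser.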

Answer 2

BeautifulSoup cannot wait for JavaScript to finish, so what you're asking for is impossible with BeautifulSoup alone. urllib only downloads the raw HTML (requests would behave the same way), and BeautifulSoup has no engine to execute JavaScript code.

What you want is Selenium.
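
A small offline illustration of that point, using a made-up HTML snippet (not the real page source) whose `tbody` is meant to be filled in by a script:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the server sends an empty <tbody>; the <script> would
# add the rows, but only in a browser that actually runs it.
STATIC_HTML = """
<table id="bench">
  <thead><tr><th>CPU</th><th>H/s</th></tr></thead>
  <tbody></tbody>
</table>
<script>
  document.querySelector("#bench tbody").innerHTML =
      "<tr><td>EXAMPLE CPU</td><td>1000</td></tr>";
</script>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# BeautifulSoup keeps the script as plain text and never executes it,
# so the body rows do not exist in the parsed tree.
body_rows = soup.find("tbody").find_all("tr")
header_cells = [th.get_text() for th in soup.find_all("th")]

print(body_rows)     # rows found inside <tbody>
print(header_cells)  # the static header cells are still parsed
```

The same thing happens with the downloaded page: the parser only ever sees what the server sent, not what the browser builds afterwards.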

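Answer 3

The page appears to load its table rows from a JSON endpoint, so you can request that endpoint directly and skip HTML parsing altogether:
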
import requests

r = requests.get("https://monerobenchmarks.info/s/om.php?draw=1&columns%5B0%5D%5Bdata%5D=0&columns%5B0%5D%5Bname%5D=&columns%5B0%5D%5Bsearchable%5D=true&columns%5B0%5D%5Borderable%5D=true&columns%5B0%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B0%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B1%5D%5Bdata%5D=1&columns%5B1%5D%5Bname%5D=&columns%5B1%5D%5Bsearchable%5D=true&columns%5B1%5D%5Borderable%5D=true&columns%5B1%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B1%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B2%5D%5Bdata%5D=2&columns%5B2%5D%5Bname%5D=&columns%5B2%5D%5Bsearchable%5D=true&columns%5B2%5D%5Borderable%5D=true&columns%5B2%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B2%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B3%5D%5Bdata%5D=3&columns%5B3%5D%5Bname%5D=&columns%5B3%5D%5Bsearchable%5D=true&columns%5B3%5D%5Borderable%5D=true&columns%5B3%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B3%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B4%5D%5Bdata%5D=4&columns%5B4%5D%5Bname%5D=&columns%5B4%5D%5Bsearchable%5D=true&columns%5B4%5D%5Borderable%5D=true&columns%5B4%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B4%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B5%5D%5Bdata%5D=5&columns%5B5%5D%5Bname%5D=&columns%5B5%5D%5Bsearchable%5D=true&columns%5B5%5D%5Borderable%5D=true&columns%5B5%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B5%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B6%5D%5Bdata%5D=6&columns%5B6%5D%5Bname%5D=&columns%5B6%5D%5Bsearchable%5D=true&columns%5B6%5D%5Borderable%5D=true&columns%5B6%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B6%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B7%5D%5Bdata%5D=7&columns%5B7%5D%5Bname%5D=&columns%5B7%5D%5Bsearchable%5D=true&columns%5B7%5D%5Borderable%5D=true&columns%5B7%5D%5Bsearch%5D%5Bvalue%5D=&columns%5B7%5D%5Bsearch%5D%5Bregex%5D=false&order%5B0%5D%5Bcolumn%5D=1&order%5B0%5D%5Bdir%5D=desc&start=0&length=10&search%5Bvalue%5D=&search%5Bregex%5D=false&_=1586421151150").json()


for item in r['data']:
    item = item[:7]
    item[0] = item[0].split("&")[0]
    print(item)

Output:

['AMD EPYC 7742', '44000', '225 W', 'XMRig 5.3', 'N/A', 'WINDOWS 10 x64', 'Dec, 2019']
['AMD RYZEN THREADRIPPER 3990X', '43800', '280 W', 'XMRig 5.7.0', 'SOURCE&#32;&#58;&#32;<a href="https://cryptomining-blog.com/11384-randomx-mining-performance-on-amd-ryzen-threadripper-3990x-processor-64c-128t/" target="_blank">CRYPTO MINING BLOG</a>', 'WINDOWS 10 x64', 'Feb, 2020']
['AMD EPYC 7742', '38732', '225 W', 'RandomX Benchmark Windows x64. SOURCE: <a href="https://redd.it/cqqt12" target="_blank">REDDIT</a>', 'N/A', 'WINDOWS 10 x64', 'Aug, 2019']
['AMD THREADRIPPER 3970X', '28900', '170 W', 'XMRig v5.5', 'SOURCE: <a href="https://redd.it/ei5ra6" target="_blank">REDDIT</a>', 'WINDOWS 10 x64', 'Dec, 2019']
['THREADRIPPER 3970X', '27703', '280 W', 'XMRIG 5.3.0', 'N/A', 'WINDOWS 10 x64', 'Dec, 2019']
['AMD EPYC 7502P', '25300', '200 W', 'XMRig 5.7.0', 'N/A', 'DEBIAN 9 x64', 'Mar, 2020']
['DUAL XEON PLATINUM 8136', '22500', '330 W', 'XMRIG 5.5.0', 'N/A', 'WINDOWS 10 x64', 'Feb, 2020']
['AMD THREADRIPPER 3960X', '20800', '130 W', 'XMRIG 5.5.1', 'cpu: &#177;3.1ghz, ppt: 130w vcore/soc offset: &#45;0.05v, ram: [email protected], 4x8gb, huge pages, 48 threads, power at wall &#126;212w (ax1600i, r vii)', 'WINDOWS 10 x64', 'Jan, 2020']
['AMD THREADRIPPER 2990WX', '20057', '379 W', 'XMRig 5.5.3b', '--threads 32 --randomx-1gb-pages', 'UBUNTU 18.04 x64', 'Feb, 2020']
['RYZEN 9 3950X', '19776', '250 W', 'XMRIG 5.5.3', '3950X @ 4.35ghz 1.3312v , 2x16gb 3733cl14', 'WINDOWS 10 x64', 'Feb, 2020']
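
The long query string in the request above follows a regular pattern: eight `columns[i][...]` groups, followed by ordering, paging, and search fields. It can therefore be generated instead of pasted. The sketch below assumes the endpoint accepts the same parameters when they are built this way. The helper names `datatables_params` and `clean_cell` are made up for illustration, and the `_` timestamp from the original URL is left out on the guess that it is only a cache-buster. `clean_cell` also decodes entities such as `&#58;` and drops the `<a>` tags visible in the output:

```python
import html
import re
from urllib.parse import urlencode

URL = "https://monerobenchmarks.info/s/om.php"


def datatables_params(num_columns=8, start=0, length=10,
                      order_column=1, order_dir="desc"):
    """Rebuild the query parameters seen in the URL above as a dict."""
    params = {"draw": 1}
    for i in range(num_columns):
        prefix = f"columns[{i}]"
        params[f"{prefix}[data]"] = i
        params[f"{prefix}[name]"] = ""
        params[f"{prefix}[searchable]"] = "true"
        params[f"{prefix}[orderable]"] = "true"
        params[f"{prefix}[search][value]"] = ""
        params[f"{prefix}[search][regex]"] = "false"
    params["order[0][column]"] = order_column
    params["order[0][dir]"] = order_dir
    params["start"] = start    # offset of the first row to return
    params["length"] = length  # number of rows per request
    params["search[value]"] = ""
    params["search[regex]"] = "false"
    return params


def clean_cell(cell):
    """Strip HTML tags first, then decode entities like &#58; into text."""
    without_tags = re.sub(r"<[^>]+>", "", cell)
    return html.unescape(without_tags).strip()


# Percent-encoded form of the parameters, in the same order as the original URL.
query = urlencode(datatables_params())

# Usage (needs the requests package and network access):
# import requests
# data = requests.get(URL, params=datatables_params(length=25)).json()
# for item in data["data"]:
#     print([clean_cell(c) for c in item[:7]])
```

If the endpoint honours them, changing `start` and `length` would page through more than the first ten rows. The tag-stripping regex is a rough heuristic that is adequate for the simple `<a>` links shown here, not a general HTML sanitizer.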