Unable to select the correct coronavirus table data


I am trying to scrape coronavirus data and post it in a tweet, but I can't figure out how to start my loop from a particular line of the scraped output.

Source - https://www.worldometers.info/coronavirus/#countries

(https://i.stack.imgur.com/OZpNH.png)

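# `soup` is assumed to be a BeautifulSoup object built from the page's HTML
results = soup.find(id='main_table_countries_today')
content = results.find_all('td')
# print each cell's text; `entries` only exists inside the loop
for entries in content:
    print(entries.text.strip())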

Ideally, I should get a list without blank entries, but it seems I am getting data from the wrapper even though I specified the table ID.

Image 1

Image 1 shows the extra data.

Image 2 shows where I wanted the non-spaced data to start.

I need to put the country names into a list based on this output.

The loop I am trying to run uses the modulus operator on the line number, since a new country starts every 11th line:

countries = []
i = 1
for entry in content:
    # every 11th cell, counting from the first, should be a country name
    if i % 11 == 1:
        countries.append(entry.text.strip())
    i += 1

print(countries)

However, the above doesn't give the right result, because USA doesn't start at line 1 due to the extra entries at the top.

Either I need to use a better ID or figure out how to exclude the extra stuff at the top
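
One way to exclude the extra entries at the top while keeping the "every Nth line" idea is to skip a fixed number of leading cells and then slice with a step. This is only a minimal sketch: the list below is made-up placeholder data standing in for the stripped cell texts, and `skip` and `row_width` are hypothetical names whose values would have to be checked against the real output.

```python
# Flattened cell texts, as find_all('td') would yield after .text.strip().
# Placeholder values only; this is not real data from the page.
cells = [
    "", "",             # hypothetical extra cells before the first country
    "USA", "a", "b",    # one row: country name followed by its figures
    "Spain", "c", "d",
]

skip = 2        # assumed number of extra leading cells
row_width = 3   # 11 in the question; 3 here to keep the sample short

# Start at the first country cell, then take every row_width-th cell.
countries = cells[skip::row_width]
print(countries)
```

This still breaks if the number of leading cells or columns changes, so it is fragile in the same way as the modulus loop.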

Any suggestions? How do I go about this? Is there a better way than relying on the line # and modulus?
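
A possible alternative to counting lines is to walk the table row by row (`tr`) and read the first cell of each row, skipping rows whose first cell is empty. The sketch below uses a small, hypothetical HTML snippet rather than the real page, and it assumes the country name sits in the first `td` of each row, which should be verified against the actual markup.

```python
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified stand-in for the real table.
# The cell values are placeholders, not real figures.
html = """
<table id="main_table_countries_today">
  <thead><tr><th>Country</th><th>Total Cases</th></tr></thead>
  <tbody>
    <tr><td></td><td></td></tr>
    <tr><td> USA </td><td>1</td></tr>
    <tr><td> Spain </td><td>2</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find(id="main_table_countries_today")

countries = []
for row in table.find_all("tr"):
    cells = row.find_all("td")
    if not cells:
        continue  # header row contains only <th> cells
    name = cells[0].get_text(strip=True)
    if name:  # skip rows whose first cell is blank
        countries.append(name)

print(countries)
```

Because each row is handled on its own, this does not depend on the number of columns or on how many extra cells come before the first country.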

FYI - I am a beginner when it comes to Python
