I am trying to scrape coronavirus data and post it in a tweet, but I can't figure out how to start looping from a certain line.
Source - https://www.worldometers.info/coronavirus/#countries
(https://i.stack.imgur.com/OZpNH.png)
# soup is the BeautifulSoup object for the source page
results = soup.find(id='main_table_countries_today')
content = results.find_all('td')
# print the stripped text of every cell
for entries in content:
    print(entries.text.strip())
Ideally, I should get a list without blank entries from the table.
But it seems I am getting data from the wrapper even though I specified the table ID.
Image 1 = the extra data
Image 2 = where I wanted the non-spaced data to start
I need to put the country names into a list based on this.
The loop I am trying to run uses the modulus operator on the line number, since a new country starts every 11 lines.
countries = []
i = 1
for entry in content:
    # lines 1, 12, 23, ... should each be a country name
    if i % 11 == 1:
        countries.append(entry.text.strip())
    i += 1
print(countries)
However, the above doesn't work, because USA doesn't start at line 1 due to the extra data at the top.
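To make the offset problem concrete, here is a minimal sketch on made-up data. The cell values, the number of extra leading cells, and the helper name `first_of_each_row` are all assumptions for illustration, not the real page contents.

```python
# Made-up flat list of cell texts: 2 extra cells, then rows of 11 cells each.
# (The real page has different values and possibly a different number of extras.)
COLUMNS_PER_ROW = 11
extra = ["", "World"]
rows = [["USA"] + ["x"] * 10, ["Spain"] + ["y"] * 10, ["Italy"] + ["z"] * 10]
cells = extra + [cell for row in rows for cell in row]


def first_of_each_row(cells, skip=0, width=COLUMNS_PER_ROW):
    """Return every `width`-th cell, starting after `skip` leading cells."""
    return cells[skip::width]


# Without skipping, the extra cells shift the pattern and the wrong cells are picked.
print(first_of_each_row(cells))
# Skipping the extra cells lines the slice up with the country names.
print(first_of_each_row(cells, skip=2))
```

The slice `cells[skip::width]` does the same job as the counter and modulus check, but it only works if the number of leading extras is known and every row really has the same number of cells.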
Either I need to use a better ID or figure out how to exclude the extra stuff at the top.
Any suggestions? How do I go about this? Is there a better way than relying on the line number and modulus?
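One possible alternative, sketched here under assumptions rather than tested against the live page, is to walk the table row by row (`tr`) and take one cell from each row, so no line counting is needed. The HTML below is invented for illustration; only the table id comes from the code above, and the country name is assumed to be in the first `td` of each row.

```python
from bs4 import BeautifulSoup

# Invented HTML for illustration; only the table id matches the code above.
html = """
<div id="wrapper">
  <table id="main_table_countries_today">
    <thead><tr><th>Country</th><th>Total Cases</th></tr></thead>
    <tbody>
      <tr><td> USA </td><td>100</td></tr>
      <tr><td> Spain </td><td>50</td></tr>
    </tbody>
  </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find(id="main_table_countries_today")

countries = []
for row in table.find_all("tr"):
    cells = row.find_all("td")
    if not cells:  # header rows contain only <th> cells, so skip them
        continue
    # Assumption: the country name is the first cell of each data row.
    countries.append(cells[0].text.strip())

print(countries)
```

On the real page the country name may sit in a different column, and there may be summary rows that need filtering, so the index `0` would have to be checked against the actual table.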
FYI - I am a beginner when it comes to Python.