I am trying to scrape a website, but I have not been able to adapt my code so that it accepts several URLs at once. Currently it works with only one URL at a time.
The current code is:
import requests
from bs4 import BeautifulSoup
import lxml
import pandas as pd
from urllib.request import urlopen
from urllib.error import HTTPError
from urllib.error import URLError

try:
    html = urlopen("http://google.com")
except HTTPError as e:
    print(e)
except URLError:
    print("error")
else:
    res = BeautifulSoup(html.read(), "html5lib")
    tags = res.findAll("div", {"itemtype": "http://schema.org/LocalBusiness"})
    title = res.title.text
    print(title)
    for tag in tags:
        print(tag)
Could someone help me modify it so that I can pass in several URLs, something like this?
html = urlopen ("url1, url2, url3")
Wrap the repeatable parts of your code in a function and use a list:
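Here is a minimal sketch of what such a function could look like. The name urlhelper is taken from the call shown in this answer, the body is the code from the question moved inside a loop, and the parser argument is my own addition (an assumption, not part of the question) so that the parser can be swapped without editing the function.

```python
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

from bs4 import BeautifulSoup


def urlhelper(urls, parser="html5lib"):
    """Fetch every URL in `urls` and print its title and LocalBusiness divs.

    `parser` is a hypothetical convenience argument; the question's code
    always used "html5lib", which stays the default here.
    """
    for url in urls:  # one pass of the original code per URL
        try:
            html = urlopen(url)
        except HTTPError as e:
            print(e)
        except URLError:
            print("error")
        else:
            with html:  # close the response once it has been read
                res = BeautifulSoup(html.read(), parser)
            tags = res.find_all("div", {"itemtype": "http://schema.org/LocalBusiness"})
            title = res.title.text
            print(title)
            for tag in tags:
                print(tag)
```

Because the try/except sits inside the loop, a URL that fails with HTTPError or URLError only prints its error and the loop moves on to the next one. Note that real URLs need a scheme such as http:// or https://; a bare placeholder string like "url1" makes urlopen raise ValueError, which this sketch does not catch.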
Call the function with a list of URLs: urlhelper(["url1", "url2", "etc"])
The key concept to understand here is the "for" loop, which tells Python to iterate over each element in the list.
I recommend reading up on iterators and lists for more info:
https://www.w3schools.com/python/python_lists.asp
https://www.w3schools.com/python/python_iterators.asp
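To make the "for" idea concrete, here is a small sketch showing that a "for" loop over a list does roughly the same thing as asking the list for an iterator and calling next() on it until it runs out. The variable names and the placeholder strings are only for illustration.

```python
urls = ["url1", "url2", "url3"]  # placeholder strings, not real URLs

# A "for" loop visits each element of the list in order.
visited = []
for url in urls:
    visited.append(url)

# Roughly what the loop does behind the scenes: get an iterator,
# then call next() until StopIteration signals the end.
iterator = iter(urls)
manual = []
while True:
    try:
        manual.append(next(iterator))
    except StopIteration:
        break

print(visited == manual)
```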