I'm fairly new to Python and am trying to make a web parser for a stock app. I'm essentially using urllib to open the desired webpage for each stock in the argument list and reading the full contents of the HTML for that page. Then I'm slicing that to find the quote I'm looking for. The method I've implemented works, but I doubt it's the most efficient means of achieving this result. I've spent some time looking into other potential methods for reading files more rapidly, but none seem to pertain to web scraping. Here's my code:
from urllib.request import urlopen
def getQuotes(stocks):
    quoteList = {}
    for stock in stocks:
        html = urlopen("https://finance.google.com/finance?q={}".format(stock))
        webpageData = html.read()
        scrape1 = webpageData.split(str.encode('<span class="pr">\n<span id='))[1].split(str.encode('</span>'))[0]
        scrape2 = scrape1.split(str.encode('>'))[1]
        quote = bytes.decode(scrape2)
        quoteList[stock] = float(quote)
    return quoteList
print(getQuotes(['FB', 'GOOG', 'TSLA']))
Thank you all so much in advance!
Here's that implementation in Beautiful Soup and requests:
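A minimal sketch of that approach (it assumes the same Google Finance URL from the question, and that the page still serves the <span class="pr"> markup the question's byte-slicing targets):

import requests
from bs4 import BeautifulSoup

def getQuotes(stocks):
    quoteList = {}
    for stock in stocks:
        # requests handles the HTTP round trip that urlopen did before
        response = requests.get("https://finance.google.com/finance?q={}".format(stock))
        soup = BeautifulSoup(response.text, "html.parser")
        # locate the <span class="pr"> element instead of slicing raw bytes;
        # its stripped inner text is the quote itself
        quoteList[stock] = float(soup.find("span", class_="pr").get_text(strip=True))
    return quoteList

print(getQuotes(['FB', 'GOOG', 'TSLA']))

This does the same work as the chain of split() calls, but the parser keeps working if whitespace or attribute order in the page changes.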
As mentioned in the comments, you may want to look into multithreading or grequests.
Using grequests to make asynchronous HTTP requests:
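A sketch of how that could look here (same URL and markup assumptions as above): grequests.get builds the request objects without sending them, and grequests.map fires them all concurrently, returning the responses in input order.

import grequests
from bs4 import BeautifulSoup

def getQuotes(stocks):
    urls = ["https://finance.google.com/finance?q={}".format(stock) for stock in stocks]
    # build the unsent request objects, then send them all at once;
    # map() returns the responses in the same order as the inputs
    responses = grequests.map(grequests.get(url) for url in urls)
    quoteList = {}
    for stock, response in zip(stocks, responses):
        soup = BeautifulSoup(response.text, "html.parser")
        quoteList[stock] = float(soup.find("span", class_="pr").get_text(strip=True))
    return quoteList

print(getQuotes(['FB', 'GOOG', 'TSLA']))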
Update: here's a modified version from Dusty Phillips' Python 3 Object-oriented Programming that uses the built-in threading module:
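Since this is adapted from the book rather than quoted, the following is a sketch of the pattern it uses (the QuoteGetter class name is mine): one Thread subclass instance per stock, all started before any is waited on so the HTTP requests overlap, then joined and read back.

from threading import Thread
from urllib.request import urlopen
from bs4 import BeautifulSoup

class QuoteGetter(Thread):
    def __init__(self, stock):
        super().__init__()
        self.stock = stock
        self.quote = None

    def run(self):
        # each thread fetches and parses a single stock's page
        html = urlopen("https://finance.google.com/finance?q={}".format(self.stock))
        soup = BeautifulSoup(html.read(), "html.parser")
        self.quote = float(soup.find("span", class_="pr").get_text(strip=True))

def getQuotes(stocks):
    threads = [QuoteGetter(stock) for stock in stocks]
    for thread in threads:
        thread.start()  # kick off every request before waiting on any
    for thread in threads:
        thread.join()   # wait for all fetches to finish
    return {thread.stock: thread.quote for thread in threads}

print(getQuotes(['FB', 'GOOG', 'TSLA']))

Because the work here is I/O-bound, the threads spend most of their time blocked on the network, so the GIL doesn't prevent the requests from overlapping.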