I started learning web scraping yesterday, and I can't extract the data I want from the content I get back from this website.
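from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup

try:
    html = urlopen('https://shopee.tw/api/v4/search/search_items?by=relevancy&keyword=%E9%81%8B%E5%8B%95%E9%9E%8B&limit=60&newest=0&order=desc&page_type=search&scenario=PAGE_GLOBAL_SEARCH&version=2')
except HTTPError as e:
    print(e)
except URLError as e:
    print('The server is not found')
else:
    print('worked')
    # Parse only when the request succeeded, so html is always defined here
    bs = BeautifulSoup(html, 'html5lib')
    bodydata = bs.find_all('body')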
How can I get a specific dictionary out of bodydata? I tried:
for x in bodydata:
    print(x.get_text())
but the result is a string. How can I get a particular dictionary if bodydata looks like this?
{"bff_meta":null,"error":null,"error_msg":null,"reserved_keyword":null,"suggestion_algorithm":null,"algorithm":"eyJzZWFyY2giOiIwLmEuNjE2N0BPUEtVVVVFSklXSVBSWEtSVVNaQUVST1ZRUFZaWE5URk1FT05SWTBEWVRZSkcxRVRHVkVTMlpKM1BRNzBaQzExNUdPMTgyIn0=","total_count":3177,"nomore":false,"items":[{"item_basic":{"itemid":10288392472,"shopid":451862210,....
Based on your example above:
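The text in your example is JSON, so one way to get real dictionaries is to parse it with the standard `json` module instead of treating it as HTML. Below is a minimal sketch of that step only, not a definitive solution. `SAMPLE_BODY` and `extract_item_basics` are names made up for illustration, and the sample is a trimmed stand-in that keeps only a few fields visible in your excerpt, so it runs without a network call.

```python
import json

# Hypothetical, trimmed stand-in for the response text; a real response
# has many more fields. With urlopen, the text would come from something
# like: raw_text = html.read().decode("utf-8")
SAMPLE_BODY = (
    '{"error": null, "total_count": 3177, "nomore": false, '
    '"items": [{"item_basic": {"itemid": 10288392472, "shopid": 451862210}}]}'
)


def extract_item_basics(raw_text):
    """Parse JSON text and return the list of "item_basic" dicts."""
    payload = json.loads(raw_text)  # str -> dict
    items = payload.get("items") or []  # tolerate a missing or null "items"
    return [item["item_basic"] for item in items if "item_basic" in item]


payload = json.loads(SAMPLE_BODY)
rows = extract_item_basics(SAMPLE_BODY)

print(payload["total_count"])  # top-level keys are ordinary dict lookups
print(rows[0]["itemid"])       # each row is a plain dict
```

Assuming pandas is installed, passing the returned list to `pandas.DataFrame(rows)` gives one row per item; which columns appear depends on which fields you keep.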
This returns a dataframe like the one below, 60 rows × 22 columns: