I'm a bit new to Python. I've been trying to scrape a page using Beautiful Soup and output the results in JSON format using simplejson.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import json as simplejson
webpages = (
    "page1.html",
    "page2.html",
    "page3.html"
)
my_dict = {}
for webpage in webpages:
    soup = BeautifulSoup(open(webpage))
    title = soup.title.string
    body = soup.find(id="bodyText")
    my_dict['title'] = title
    my_dict['body'] = str(body)
print simplejson.dumps(my_dict,indent=4)
I'm only getting the results of the last page. Can someone tell me where I'm going wrong?
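Indentation can work wonders in Python. Only the last line (the print statement) needs to be indented so that it sits inside the for loop. As written, it runs once after the loop has finished, when my_dict holds only the values from the last page.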
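To see the effect without the parsing step, here is a minimal sketch of the same loop. The (title, body) pairs are hypothetical stand-ins for what Beautiful Soup would return, and the helper name dump_each is made up for illustration. It uses the standard json module and the print() function form rather than the print statement from the question.

```python
import json

# Hypothetical stand-ins for the parsed pages: (title, body) pairs.
pages = [
    ("Page 1", "<div id=\"bodyText\">first</div>"),
    ("Page 2", "<div id=\"bodyText\">second</div>"),
    ("Page 3", "<div id=\"bodyText\">third</div>"),
]


def dump_each(pages):
    """Return one JSON string per page, serialized inside the loop."""
    outputs = []
    my_dict = {}
    for title, body in pages:
        my_dict['title'] = title
        my_dict['body'] = body
        # Indented, so this runs once per page, before the next
        # iteration overwrites the 'title' and 'body' keys.
        outputs.append(json.dumps(my_dict, indent=4))
    return outputs


for text in dump_each(pages):
    print(text)
```

Moving the json.dumps call back out of the loop would serialize my_dict only once, with the values from the last page.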
Or, if you really want all the data in one dictionary, you could try:
So the code may look like:
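One possible sketch of the single-dictionary approach is to key the outer dictionary by page name, so each page gets its own entry instead of overwriting the previous one. The inline HTML strings below are hypothetical stand-ins for page1.html and page2.html so the example runs without files, and the function name collect_pages is made up for illustration. It assumes bs4 is installed and uses Python's built-in "html.parser".

```python
import json

from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the files on disk.
pages = {
    "page1.html": "<html><head><title>One</title></head>"
                  "<body><div id='bodyText'>first</div></body></html>",
    "page2.html": "<html><head><title>Two</title></head>"
                  "<body><div id='bodyText'>second</div></body></html>",
}


def collect_pages(pages):
    """Build one dictionary with a separate entry for every page."""
    all_pages = {}
    for name, html in pages.items():
        soup = BeautifulSoup(html, "html.parser")
        # A fresh inner dict per page, stored under the page name,
        # so earlier pages are not overwritten.
        all_pages[name] = {
            "title": str(soup.title.string),
            "body": str(soup.find(id="bodyText")),
        }
    return all_pages


# Serialize once, after the loop, now that every page is in the dict.
print(json.dumps(collect_pages(pages), indent=4))
```

Here the single print after the loop is fine, because the dictionary has accumulated every page by then. To read real files, you could pass open(name) to BeautifulSoup as in the question.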