I'm new to the re library in Python. I am doing some web scraping and would like to match some string patterns and append the values to a list. For instance:
parking = []
rooms = []
toilets = []
attribute = soup.find('ul',{'class':'specs-list'}).find_all('li')
for a in attribute:
    print(a.text)
Output of the loop for the listing at index 0:
Metters
50 m²
Rooms
2
Toilets
1
Output of the loop for the listing at index 1:
Metters
50 m²
parking
1
spends
340
So, for example, I want to match the titles by name and, if a title exists in a, append its value to the corresponding list.
Pseudocode:
for a in attribute:
    if a contains "Rooms":
        rooms.append(a)
    if a contains "Parking":
        parking.append(a)
    if a contains "Toilets":
        toilets.append(a)
    if a contains none of the strings above:
        rooms.append(nan)
        parking.append(nan)
        toilets.append(nan)
I use BeautifulSoup for the web scraping, and the value of attribute looks like this.
Output of the attribute variable for index 0:
[<li class="specs-item">
<strong>Metters</strong>
<span>50 m²</span>
</li>,<li class="specs-item">
<strong>Rooms</strong>
<span>2</span>
</li>,<li class="specs-item">
<strong>Toilets</strong>
<span>1</span>
</li>,<li class="specs-item">
<strong>Spends</strong>
<span>340</span></li>]
attribute has a length of 5, and each value has markup similar to the above, but the titles and values differ: some contain parking, rooms, and toilets, others have just toilets and rooms, and so on.
This should help you:
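A minimal sketch of the list-based approach, assuming each li keeps its title in a strong tag and its value in a span tag, as in the question's markup. The markup for the second listing is an assumption reconstructed from the text printed in the question, and the names pages, room, toilet and park are only illustrative. Plain string comparison is enough for this, so re is probably not needed.

```python
import math
from bs4 import BeautifulSoup

# First listing: markup from the question. Second listing: markup ASSUMED
# from the printed text in the question (parking, but no rooms or toilets).
pages = [
    """<ul class="specs-list">
         <li class="specs-item"><strong>Metters</strong><span>50 m²</span></li>
         <li class="specs-item"><strong>Rooms</strong><span>2</span></li>
         <li class="specs-item"><strong>Toilets</strong><span>1</span></li>
         <li class="specs-item"><strong>Spends</strong><span>340</span></li>
       </ul>""",
    """<ul class="specs-list">
         <li class="specs-item"><strong>Metters</strong><span>50 m²</span></li>
         <li class="specs-item"><strong>parking</strong><span>1</span></li>
         <li class="specs-item"><strong>spends</strong><span>340</span></li>
       </ul>""",
]

parking, rooms, toilets = [], [], []

for page in pages:
    soup = BeautifulSoup(page, "html.parser")
    attribute = soup.find("ul", {"class": "specs-list"}).find_all("li")

    # Start each listing with NaN so a missing spec still yields one entry.
    room = toilet = park = math.nan
    for li in attribute:
        # Lowercase the title so "Parking" and "parking" both match.
        title = li.find("strong").get_text(strip=True).lower()
        value = li.find("span").get_text(strip=True)
        if title == "rooms":
            room = value
        elif title == "toilets":
            toilet = value
        elif title == "parking":
            park = value

    # One append per list per listing keeps the three lists aligned.
    rooms.append(room)
    toilets.append(toilet)
    parking.append(park)

print(rooms)    # ['2', nan]
print(toilets)  # ['1', nan]
print(parking)  # [nan, '1']
```

The values stay as strings here; convert them with int() if you need numbers.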
Output for the li values provided by you:
Edit:
Though this works, I feel that having so many lists is not a good approach. Instead, you can use a dictionary. This is how you can achieve the same output using a dictionary:
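A rough sketch of the dictionary version, under the same assumptions: the title is in a strong tag, the value in a span tag, and the second listing's markup is reconstructed from the printed text in the question. WANTED, specs and found are hypothetical names chosen for illustration.

```python
import math
from bs4 import BeautifulSoup

# First listing: markup from the question. Second listing: markup ASSUMED
# from the printed text in the question.
pages = [
    """<ul class="specs-list">
         <li class="specs-item"><strong>Metters</strong><span>50 m²</span></li>
         <li class="specs-item"><strong>Rooms</strong><span>2</span></li>
         <li class="specs-item"><strong>Toilets</strong><span>1</span></li>
         <li class="specs-item"><strong>Spends</strong><span>340</span></li>
       </ul>""",
    """<ul class="specs-list">
         <li class="specs-item"><strong>Metters</strong><span>50 m²</span></li>
         <li class="specs-item"><strong>parking</strong><span>1</span></li>
         <li class="specs-item"><strong>spends</strong><span>340</span></li>
       </ul>""",
]

WANTED = ("rooms", "toilets", "parking")  # titles to collect, lowercased
specs = {key: [] for key in WANTED}       # one list per wanted title

for page in pages:
    soup = BeautifulSoup(page, "html.parser")
    attribute = soup.find("ul", {"class": "specs-list"}).find_all("li")

    # Map every lowercased title in this listing to its value.
    found = {
        li.find("strong").get_text(strip=True).lower(): li.find("span").get_text(strip=True)
        for li in attribute
    }

    # Append the value if the title was present, NaN otherwise.
    for key in WANTED:
        specs[key].append(found.get(key, math.nan))

print(specs)
# {'rooms': ['2', nan], 'toilets': ['1', nan], 'parking': [nan, '1']}
```

Adding another title to collect then only means adding one more entry to WANTED.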
Output:
I feel that this is a better approach. But it is all up to you. Choose whichever best suits your task.