BeautifulSoup: getting the content of specific children


I'm currently working on a crawler with BeautifulSoup. I want to get the text of specific children of an unordered list.

So the webpage is basically like this:

    <div class="product-list-item--usp-list">
        <ul class="unordered-list">
            <li>a</li>
            <li>b</li>
            <li>c</li>
        </ul>
    </div>

I'm currently only getting the content of the 0th child (a). I only want the content of the first and second children (b and c). My code looks like this:

    a = item.find("ul", class_="unordered-list").li
    b = item.find("ul", class_="unordered-list").li

So I tried this:

    a = item.find("ul", class_="unordered-list").li[1]
    b = item.find("ul", class_="unordered-list").li[2]

This was my error:

   a = item.find("ul", class_="unordered-list").li[1]
  File "/usr/local/lib/python2.7/dist-packages/bs4/element.py", line 905, in __getitem__
    return self.attrs[key]
KeyError: 1
[Finished in 2.9s with exit code 1]

My question is: how do I get the content of child[1] and child[2]? Thanks in advance!

Best answer

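`.li` only ever returns the first `<li>` tag, and indexing a tag (as in `.li[1]`) looks up an attribute of that tag rather than a sibling, which is why you get the `KeyError`. Use `find_all` to get a list of all the `<li>` tags and slice it instead. You could do it like this: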

>>> from bs4 import BeautifulSoup
>>> s = """<div class="product-list-item--usp-list">
    <ul class="unordered-list">
        <li>a</li>
        <li>b</li>
        <li>c</li>
    </ul>
</div>"""
>>> soup = BeautifulSoup(s, "html.parser")  # name the parser explicitly to avoid bs4's "no parser specified" warning
>>> foo = soup.find("ul", class_="unordered-list")
>>> [i.text for i in foo.find_all('li')[1:]]
['b', 'c']
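
If you need the two values in separate variables, as in the question, the same idea can be written as a small script. This is only a minimal sketch: it assumes the markup is the same as in the question and uses the built-in `html.parser`; the names `html`, `items`, `b` and `c` are just illustrative. The last few lines show why the original attempt failed.

```python
from bs4 import BeautifulSoup

# Markup assumed to match the snippet from the question.
html = """
<div class="product-list-item--usp-list">
    <ul class="unordered-list">
        <li>a</li>
        <li>b</li>
        <li>c</li>
    </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
ul = soup.find("ul", class_="unordered-list")

# find_all returns a list-like ResultSet, so ordinary indexing works on it.
items = ul.find_all("li")
b = items[1].get_text(strip=True)
c = items[2].get_text(strip=True)
print(b, c)

# Indexing a Tag is an attribute lookup, not a child lookup.
# That is why `.li[1]` raised KeyError in the question.
print(ul["class"])
try:
    ul.li[1]
except KeyError as exc:
    print("KeyError:", exc)
```

If the list might have fewer than three items on some pages, check `len(items)` before indexing to avoid an `IndexError`.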