Best visible content extractor available

110 Views Asked by najeeb At 27 July 2025 at 20:30

My application needs the visible content from a given URL: just the text, with no HTML and no header or footer data. Right now I am using BeautifulSoup and boilerpipe for this, but in some rare cases I don't get enough data, or I get the wrong data. Is there any other tool that does this better? Programming language is not a barrier.

There is 1 solution below

eLRuLL On 02 January 2017 at 13:19

I would recommend using XPath or CSS selectors directly for content extraction; both are already implemented in a simple way in the parsel module.

For a complete suite covering both web crawling and content extraction, Scrapy would be my preferred option.

And if you want to visually select which parts of the HTML to extract, I would recommend Portia.

Hope that helped.
