How to extract the content of an HTML attribute


I'm developing a Java program that consists of a web crawler and parser. I download the HTML source of a web page using Jsoup, and I want to extract the src and alt attributes of the images in order to write them to a CSV file. The problem is that I can't find a way to remove the alt="" and src="" wrappers. I don't want them in my CSV file; I just want the picture URL and its description. Does anyone have an idea? Here is what I do:

Document html = Jsoup.connect(url).get();
Elements titres = html.select("img[src$=.jpg], div[class$=price] ");

There is 1 best solution below


Thank you for your answer, but as this was a professional project, I had already found another way to do it. For those who would like to know how I did it: attr() returns only the attribute's value, without the src="" wrapper.

String image = titres.get(i).attr("src");
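
Since the goal was to write the values to a CSV file, here is a minimal sketch of that last step in plain Java. It is not from the original post: the class name ImageCsv, the helpers escape and toRow, and the sample URLs are all hypothetical, and the strings passed in stand in for whatever attr("src") and attr("alt") return. It assumes the common RFC 4180 convention of quoting a field only when it contains a comma, a quote, or a line break, which matters because alt descriptions often contain commas.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Jsoup or of the original answer.
class ImageCsv {

    // Quote a field only when it contains a comma, a double quote, or a
    // line break; embedded double quotes are doubled (RFC 4180 convention).
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Build one CSV row from an image URL and its description.
    static String toRow(String src, String alt) {
        return escape(src) + "," + escape(alt);
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for attr("src") / attr("alt").
        List<String> rows = new ArrayList<>();
        rows.add(toRow("http://example.com/cat.jpg", "A cat, sleeping"));
        rows.add(toRow("http://example.com/dog.jpg", "Dog"));

        for (String row : rows) {
            System.out.println(row);
        }
    }
}
```

In a real crawler you would probably call toRow once per element inside the loop over titres and write each row to a file instead of printing it.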