I'm developing a Java program that consists of a web crawler and parser.
I download the HTML source of a web page using Jsoup, and I want to extract the src and alt attribute values in order to write them to a CSV file.
The problem is, I can't find a way to remove the alt="" and src="".
I don't want them in my CSV file; I just want the picture URL and its description. Does anyone have an idea?
Here is what I do:
Document html = Jsoup.connect(url).get();
Elements titres = html.select("img[src$=.jpg], div[class$=price] ");
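For readers with the same problem, here is a minimal sketch of one possible approach. It is not necessarily what was used in the end. It relies on Jsoup's Element.attr(), which returns only the attribute value without the surrounding alt="" or src="", and on Element.absUrl(), which resolves a relative src against the document's base URI. The class name ImageCsvRows, the helpers extract and csvField, and the example.com base URI are made up for illustration, and the sketch only handles the img part of the selector, not the price div.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ImageCsvRows {

    // Hypothetical helper: returns one {url, description} pair per matching <img>.
    static List<String[]> extract(String html, String baseUri) {
        // Parsing a string keeps the example offline; with Jsoup.connect(url).get()
        // the base URI is taken from the fetched page instead.
        Document doc = Jsoup.parse(html, baseUri);
        List<String[]> rows = new ArrayList<>();
        for (Element img : doc.select("img[src$=.jpg]")) {
            String url = img.absUrl("src"); // attribute value only, resolved to an absolute URL
            String alt = img.attr("alt");   // attribute value only, "" if the attribute is absent
            rows.add(new String[] { url, alt });
        }
        return rows;
    }

    // Hypothetical helper: quote a field for CSV, doubling any embedded double quotes.
    static String csvField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String html = "<img src=\"/pics/cat.jpg\" alt=\"A cat\">";
        for (String[] row : extract(html, "http://example.com/")) {
            System.out.println(csvField(row[0]) + "," + csvField(row[1]));
        }
    }
}
```

Quoting each field is a precaution, since an alt text may itself contain commas or quotes that would otherwise break the CSV columns.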
Thank you for your answer, but as it was a professional project, I've already found another way to do it. For those who would like to know how I did