I am trying to read an HTML page from a URL. The page contains something like this:
<html>
<head>
<title>
Title
</title>
</head>
<body>
Name1 Age1 Hometown1<br>
Name2 Age2 Hometown2<br>
Name3 Age3 Hometown3<br>
</body>
</html>
I read it with a method readData(String[] urls), where urls is an array of one or more URL strings. I'm only interested in what's inside the HTML body of each URL, so I loop while readLine() != null and keep the lines for which contains("<br>") is true. However, my code appears to read only the first line of the body block (the line right after <body>, which is what I want) and does not continue through the following lines up to </body>. How can I make my code read past the first line?
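import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public void readData(String[] urls) {
    for (int i = 0; i < urls.length; i++) {
        // Accumulates every line of this page that contains a <br> tag
        String str = "";
        try {
            URL url = new URL(urls[i]);
            URLConnection conn = url.openConnection();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String s;
                while ((s = in.readLine()) != null) {
                    if (s.contains("<br>")) {
                        str += s;
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}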
EDIT1: The issue appears to be that the entire input arrives as one line rather than as multiple lines, as I expected. How would I partition that one line into multiple lines so that I can read each one?
EDIT2: Thanks everyone, I've figured that out. I still read the input as a single long String, but I now partition it into a String array using .split() and read each element of that. However, there is a new problem: only the first element of String[] urls is being read. I want to read every element of urls, but nothing beyond the first one gets processed. Any ideas?
I may be completely wrong about this, but if your data appears to contain newlines, they may actually be carriage returns.
Check out String.split(). Also check out the difference between \n and \r. You can try something like

String textStr[] = yourString.split("\\r?\\n");

Just as a side note, StringBuilder was built for this.
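To make the split suggestion a bit more concrete, here is a minimal sketch. It assumes the whole page has already been read into one String, and it treats <br> tags as record separators in addition to \r and \n. That is my assumption about what you want, based on your sample page. The class name BodyLineSplitter and the helpers bodyOf and splitRecords are made up for illustration. They are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class BodyLineSplitter {

    // Returns the text between <body> and </body>.
    // Assumption: lowercase tags without attributes, as in the sample page.
    // If the tags are not found, the input is returned unchanged.
    static String bodyOf(String html) {
        int start = html.indexOf("<body>");
        int end = html.indexOf("</body>");
        if (start < 0 || end < start) {
            return html;
        }
        return html.substring(start + "<body>".length(), end);
    }

    // Splits on <br> tags (any case, optional slash) and on line
    // terminators (\r\n, \n, or a lone \r), dropping blank pieces.
    static List<String> splitRecords(String text) {
        List<String> records = new ArrayList<>();
        for (String piece : text.split("(?i)<br\\s*/?>|\\r?\\n|\\r")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                records.add(trimmed);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // The sample page from the question, arriving as one long line
        String page = "<html><head><title>Title</title></head><body>"
                + "Name1 Age1 Hometown1<br>Name2 Age2 Hometown2<br>"
                + "Name3 Age3 Hometown3<br></body></html>";

        // StringBuilder avoids creating a new String on every append
        StringBuilder out = new StringBuilder();
        for (String record : splitRecords(bodyOf(page))) {
            out.append(record).append('\n');
        }
        System.out.print(out);
    }
}
```

Run against the sample page, this prints the three "Name Age Hometown" records, one per line. For anything beyond simple pages like this one, a real HTML parser would probably be more robust than indexOf and a regex.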