First, some background: I'm trying to solve a question an interviewer asked me recently. I had to write code that uses the URL below and returns a JSON response: https://losangeles.craigslist.org/
This is what I did: 1) I created a web client and made an HttpURLConnection request to fetch the HTTP response.
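import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import org.json.JSONArray;

public static JSONArray getSearchResults(String arg) {
    JSONArray jsonArray = null;
    HttpURLConnection conn = null;
    try {
        // QueryString is not a JDK class; it is assumed to be a helper that URL-encodes the parameter
        QueryString qs = new QueryString("query", arg);
        URL url = new URL("https://toronto.craigslist.ca/search?" + qs);
        conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        // The server returns an HTML page, so ask for text/html
        conn.setRequestProperty("Accept", "text/html");
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            throw new RuntimeException("Failed : HTTP error code : " + status);
        }
        StringBuilder output = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                // readLine() strips the line terminator, so put it back
                output.append(line).append('\n');
            }
        }
        jsonArray = convertToJson(output);
        System.out.println(" JSON response : " + jsonArray.toString(2));
    } catch (IOException e) {
        // MalformedURLException is a subclass of IOException, so one catch covers both
        e.printStackTrace();
    } finally {
        if (conn != null) {
            conn.disconnect();
        }
    }
    return jsonArray;
}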
2) Below is my function to convert the response into JSON:
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public static JSONArray convertToJson(StringBuilder response) {
    JSONArray jsonArr = new JSONArray();
    if (response == null) {
        return jsonArr;
    }
    try {
        Document document = Jsoup.parse(response.toString());
        // Each search hit is rendered as an element with the class "result-row"
        for (Element row : document.getElementsByClass("result-row")) {
            Elements resultsDate = row.getElementsByClass("result-date");
            // getElementsByClass expects a single class name, so match on "result-title" only
            Elements resultsTitle = row.getElementsByClass("result-title");
            if (resultsDate.isEmpty() || resultsTitle.isEmpty()) {
                continue; // skip rows without a date or title instead of throwing
            }
            JSONObject jsonObj = new JSONObject();
            jsonObj.put("date", resultsDate.get(0).text());
            jsonObj.put("title", resultsTitle.get(0).text());
            jsonArr.put(jsonObj);
        }
    } catch (JSONException e) {
        e.printStackTrace();
    }
    return jsonArr;
}
The response I received was the whole HTML page (I used Postman to make the requests). Since I only had a few hours to solve this question and was not sure how to parse an entire HTML page, I ended up using a third-party library called jsoup. I was not 100% happy about it, but I had no other option.
I have not heard back from them, and I am curious whether this was the worst approach and, if so, what better options there are. They did not say anything about which technology I could use, but since the role I was interviewing for involved Java/J2EE, I decided to implement this in Java (not Node.js). Thanks!
If you only need an XML parser, one is built into the JRE core API. Keep in mind that HTML is not actually based on XML, so this only works when the page is well-formed XHTML.
Even in the SE version, the packages needed for parsing exist:
Take a look at these classes; they are the most important ones for parsing or creating an XML/HTML file:
And here is a simple example for HTML:
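A minimal sketch of the built-in route, assuming the input is well-formed XHTML (real-world pages often are not, in which case the parser throws a SAXException). The class name XhtmlTitles, the SAMPLE markup, and the "result-title" class are hypothetical and chosen only for illustration. The calls themselves (DocumentBuilderFactory, DocumentBuilder.parse, getElementsByTagName) come from the standard javax.xml.parsers and org.w3c.dom packages.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XhtmlTitles {
    // Hypothetical sample markup; it must be well-formed XML for this parser to accept it.
    static final String SAMPLE =
        "<html><body>"
        + "<p class=\"result-row\"><a class=\"result-title\">First post</a></p>"
        + "<p class=\"result-row\"><a class=\"result-title\">Second post</a></p>"
        + "</body></html>";

    // Returns the text of every <a> element whose class attribute is exactly "result-title".
    static List<String> extractTitles(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
        List<String> titles = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element a = (Element) anchors.item(i);
            if ("result-title".equals(a.getAttribute("class"))) {
                titles.add(a.getTextContent());
            }
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractTitles(SAMPLE)); // prints [First post, Second post]
    }
}
```

The DOM API has no class-based lookup, so the sketch filters anchors by comparing the class attribute by hand. For pages that are not valid XML, a lenient HTML parser such as jsoup is likely still the more practical choice.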