How to create a program to sift local Craigslist finds?


I started learning to program a little over a week ago and am as green as they come. I have never taken any computer science courses, but I am being mentored by a very good professional programmer (my boyfriend). As a challenge for myself, I have to create a program that sifts through Craigslist and continuously emails me thumbnails and links to Craigslist finds that match specifications I have set, based on things I like, such as vintage, modern, etc. The program will be written in Java because that is what I aim to get a job in eventually. My question is, how would one go about this task? I am not looking for it to be done for me, just for some help. Thank you in advance for all your help.

Best,

Paula


Interesting question, with lots of potential answers.

I’ll choose to answer by describing how I’d go about trying to solve the problem. Now, since you’ll be working with Craigslist, a web service, the first thing I’d do is find out what kind of API it has. Searching Google for "craigslist api", the first few pages of hits suggest that there actually is no Craigslist API. This is disappointing and slightly surprising for a web service these days, and means that you’ll probably have to get your hands dirty and scrape the actual HTML code. This is not really something I’d do in Java, although I admit that that may be partly because I don’t know the Java HTTP APIs. So I’ll just provide an outline of what the program could do:

  1. The input to the program is the search terms, and they only need to be provided once at the beginning. That’s what command line arguments are used for, so each argument could be a search term. What does it mean to provide multiple search terms? Probably either that they all must match, or that at least one must match. You decide.
  2. The interesting part of the program consists of a main loop which repeats the following:
    1. Fetch the Craigslist search result page that you need. @Danny suggested you look at Apache HttpClient, and that sounds good to me.
    2. Extract the information you need from the HTML. (This is the hard part, and I’ll leave it as an exercise; feel free to start a new question if you need help with this.)
    3. Hmm, here you need to know which results are new and which you’ve already seen. Since this information should persist even if you restart the program, you’ll probably want to store it in a file. In that case, you’d open the file here, and for each line in the file, go through each of your results and remove that result if it’s found in the file. The file could contain one URL per line, for example.
    4. Now you’ll try mailing the results to yourself. @Danny suggested you check out an API called "JavaMail", so that’s what I’d try first. (Attaching the thumbnails to the e-mails is going to be a little trickier, so I suggest skipping that part at first.) If sending the e-mail doesn’t succeed, you could just skip saving the new results to your file of already-seen URLs, so that they get included next time instead (have your boyfriend explain try/catch to you). If it does succeed, just open the file and put the new URLs at the end.
    5. You’re probably going to want to print some information to the screen about what you just did, so that you get some feedback on what’s happening when you use the program. Then you’d just sleep for a while (say, one minute) before going back around to the start of the main loop and doing it all over again.
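
To make steps 3 and 4 a bit more concrete, here is a minimal sketch of the "already seen" bookkeeping. It assumes the fetching and HTML extraction have already produced a list of result URLs. The class name `SeenListings`, the `Mailer` interface, and the example URLs are all made up for illustration; in a real program the `Mailer` would wrap whatever JavaMail code you end up writing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SeenListings {

    /** Hypothetical stand-in for the e-mail step; a real one would use JavaMail. */
    interface Mailer {
        void send(List<String> urls) throws Exception;
    }

    /** Reads the already-seen URLs, one per line. A missing file means nothing seen yet. */
    static Set<String> loadSeen(Path seenFile) throws IOException {
        if (!Files.exists(seenFile)) {
            return new HashSet<>();
        }
        return new HashSet<>(Files.readAllLines(seenFile, StandardCharsets.UTF_8));
    }

    /** Keeps only URLs that are not in 'seen', preserving order and dropping repeats. */
    static List<String> filterNew(List<String> found, Set<String> seen) {
        Set<String> known = new HashSet<>(seen);
        List<String> fresh = new ArrayList<>();
        for (String url : found) {
            if (known.add(url)) { // add() returns false if the URL was already known
                fresh.add(url);
            }
        }
        return fresh;
    }

    /** Appends the given URLs to the end of the file, creating it if needed. */
    static void markSeen(Path seenFile, List<String> urls) throws IOException {
        Files.write(seenFile, urls, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /**
     * One pass of steps 3 and 4: filter, mail, and record the URLs only if
     * mailing worked. Returns true if something was mailed and recorded.
     */
    static boolean processOnce(List<String> found, Path seenFile, Mailer mailer)
            throws IOException {
        List<String> fresh = filterNew(found, loadSeen(seenFile));
        if (fresh.isEmpty()) {
            return false;
        }
        try {
            mailer.send(fresh);
        } catch (Exception e) {
            // Mail failed: do not record the URLs, so they are retried next time.
            System.err.println("Mail failed, will retry next round: " + e.getMessage());
            return false;
        }
        markSeen(seenFile, fresh);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path seenFile = Files.createTempFile("seen-urls", ".txt");
        List<String> found = Arrays.asList("https://example.org/a", "https://example.org/b");
        Mailer printer = urls -> System.out.println("Would mail: " + urls);
        processOnce(found, seenFile, printer); // both URLs are new
        processOnce(found, seenFile, printer); // nothing new the second time
    }
}
```

In the real program, the main loop would call something like `processOnce` with freshly scraped URLs, print a status line, and then call `Thread.sleep` before going around again. Using a `Set` for the seen URLs is a small shortcut over comparing every file line against every result, but it does the same job.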

If you need help with any of these parts, start a separate question. If you need help with how to get started writing Java programs, search Google for a beginner’s tutorial and you should turn up plenty. When you get the hang of it, I suggest trying an IDE like Eclipse, which is a complicated but sophisticated tool that can help you out in a number of ways. IDEs come and go, though, and it’s always good to know how to program in a particular language using only the basic language tools (the java and javac programs, in your case).

Anyway, good luck, and happy hacking! :-)