A program that gathers search results from Google and Yahoo

I want to search Google and Yahoo for forum and blog posts limited to a specific country. The results will be saved to a database for sorting and further processing.

From each search result, I need:

  • the URL itself
  • date and time
  • the domain

I am working on a program that accepts keywords as input, automatically searches Google and Yahoo, and saves the results to a database.

function OnLoad() {
  // Create a search control
  var searchControl = new google.search.SearchControl();

  // Add in a full set of searchers
  var localSearch = new google.search.LocalSearch();
  searchControl.addSearcher(localSearch);
  searchControl.addSearcher(new google.search.WebSearch());
  searchControl.addSearcher(new google.search.VideoSearch());
  searchControl.addSearcher(new google.search.BlogSearch());
  searchControl.addSearcher(new google.search.NewsSearch());
  searchControl.addSearcher(new google.search.ImageSearch());
  searchControl.addSearcher(new google.search.BookSearch());
  searchControl.addSearcher(new google.search.PatentSearch());

  // Set the Local Search center point
  localSearch.setCenterPoint("New York, NY");

  // tell the searcher to draw itself and tell it where to attach
  searchControl.draw(document.getElementById("searchcontrol"));

  // execute an initial search
  searchControl.execute("VW GTI");
}
google.setOnLoadCallback(OnLoad);

This code is from the Google AJAX Search API. However, there does not seem to be a way to specify the domain, country, or date and time as search criteria. Moreover, it returns the results as HTML, which is hard to split into individual entries and save to the DB.

EDITED to describe my specific problem.

There is 1 best solution below

Parsing the raw HTML should be your last resort here. If they change the markup, you will have to rewrite your parser. That is pretty much guaranteed to happen before the "3 years" time period that you mentioned for Google's AJAX Search API.
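
If you can get the results as structured data instead of HTML, turning each one into a database row is straightforward. Below is a minimal sketch, assuming each result arrives as a plain object with a `url` field. That shape is hypothetical and not taken from any particular search API. The names `toRecord` and `filterByCountryTld` are made up for illustration. The timestamp is the time the result was retrieved, not the time the page was published, and the country filter is only a rough heuristic based on the top-level domain.

```javascript
// Hypothetical input shape: { url: "..." } per search result.
// This is an assumption for illustration, not a real search API response.
function toRecord(result, fetchedAt = new Date()) {
  // The built-in URL class throws a TypeError on a malformed URL.
  const parsed = new URL(result.url);
  return {
    url: parsed.href,
    domain: parsed.hostname, // the URL parser already lowercases the host
    fetchedAt: fetchedAt.toISOString(), // retrieval time, not publication time
  };
}

// Rough country filter: keep records whose domain ends in the given
// country-code TLD. Many country-specific sites use .com, so this will
// miss some results.
function filterByCountryTld(records, tld) {
  const suffix = "." + tld.toLowerCase();
  return records.filter((record) => record.domain.endsWith(suffix));
}

// Example input with placeholder URLs.
const results = [
  { url: "https://forum.example.de/thread/42" },
  { url: "http://blog.example.com/post?id=7" },
];

const fetchedAt = new Date("2012-01-01T00:00:00Z");
const records = results.map((result) => toRecord(result, fetchedAt));
const german = filterByCountryTld(records, "de");

console.log(german);
```

Each record has the three fields asked for in the question, so it maps directly onto a table with `url`, `domain`, and a timestamp column.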