I want to search Google and Yahoo for forum and blog posts, limited to a specific country. The results will be saved to a database for sorting and further processing.
From each search result, I need:
- the URL itself
- date and time
- the domain
I am working on a program that accepts keywords as input, automatically searches Google and Yahoo, and saves the results to a database.
function OnLoad() {
  // Create a search control
  var searchControl = new google.search.SearchControl();

  // Add in a full set of searchers
  var localSearch = new google.search.LocalSearch();
  searchControl.addSearcher(localSearch);
  searchControl.addSearcher(new google.search.WebSearch());
  searchControl.addSearcher(new google.search.VideoSearch());
  searchControl.addSearcher(new google.search.BlogSearch());
  searchControl.addSearcher(new google.search.NewsSearch());
  searchControl.addSearcher(new google.search.ImageSearch());
  searchControl.addSearcher(new google.search.BookSearch());
  searchControl.addSearcher(new google.search.PatentSearch());

  // Set the Local Search center point
  localSearch.setCenterPoint("New York, NY");

  // Tell the search control to draw itself and where to attach
  searchControl.draw(document.getElementById("searchcontrol"));

  // Execute an initial search
  searchControl.execute("VW GTI");
}
google.setOnLoadCallback(OnLoad);
This code is from the Google AJAX Search API. However, there does not seem to be a way to specify the domain, country, or date and time as search criteria. Moreover, it returns the results as HTML, which is hard to slice up and save to the database as individual search result entries.
EDITED to describe my specific problem.
Parsing the raw HTML should be your last resort here. If they change the markup, you have to redesign your parser. That is pretty much guaranteed to happen before the "3 years" period that you mentioned for Google's AJAX Search API runs out.
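If you can get the results as structured objects rather than rendered HTML, pulling out the fields you listed (URL, domain, date and time) becomes straightforward. Here is a minimal sketch of that step. It assumes each result is a plain object with a `url` string and an optional `publishedDate` string; those field names and the `toRecord` helper are made up for illustration and are not taken from any particular API, so adjust them to whatever your search source actually returns.

```javascript
// Sketch: turn one structured search result into a flat record for the database.
// The input shape ({ url, publishedDate }) is an assumption for illustration only.
function toRecord(result, retrievedAt) {
  // The standard URL class does the parsing; it throws a TypeError on a malformed URL.
  var parsed = new URL(result.url);

  // publishedDate is optional; an unparseable value is stored as null.
  var published = result.publishedDate ? new Date(result.publishedDate) : null;
  var publishedOk = published && !isNaN(published.getTime());

  return {
    url: parsed.href,
    domain: parsed.hostname,
    publishedAt: publishedOk ? published.toISOString() : null,
    // When the result was fetched; defaults to "now" if the caller passes nothing.
    retrievedAt: (retrievedAt || new Date()).toISOString()
  };
}
```

For example, `toRecord({ url: "http://forum.example.de/thread/42" })` gives a record whose `domain` is `forum.example.de`. You could then filter on the domain suffix as a rough country filter, keeping in mind that a country-code domain is only an approximation of where a site is actually based.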