I'm wondering whether it is possible to get a site's "characteristic" color. For instance, TechCrunch would be green, ReadWriteWeb would be red, CNN also red, Microsoft bluish, PHP purple, and so on.
It doesn't have to be accurate, just a best guess.
Some ideas I have in mind:
- parse all CSS rules and find the one matching the most elements
- parse all CSS rules and find the background colors of the elements with the largest dimensions
- get the body element's background image and find its predominant color (is this possible for an image?)
- somehow find the site's "header" (the first element in the DOM with a CSS background property set?) and get its background
Also I would need a way to eliminate blacks, greys and white.
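For the filtering step, one possible approach is to convert each color to HSV and reject anything with low saturation (greys and whites) or low brightness (blacks). The following is only a sketch in Python: the function name `is_neutral` and both threshold values are assumptions chosen for illustration, not tuned numbers.

```python
import colorsys

def is_neutral(rgb, min_saturation=0.2, min_value=0.15):
    """Return True for colors that read as black, grey, or white.

    rgb is an (r, g, b) tuple with channels in 0..255.
    The thresholds are illustrative guesses and would need tuning.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    _hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    # Low saturation covers greys and whites; low value covers near-blacks.
    return saturation < min_saturation or value < min_value
```

Any candidate color for which `is_neutral` returns True would simply be skipped before picking a winner.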
Is this feasible? Do you have any other ideas?
P.S. Sorry for my English
OK, here is a seriously unorthodox approach:
Use a screen-capturing package [1][2] to render the given URL to a raster image (such as a PNG). Then analyse that image by sampling its pixels: either compute an average colour, or use a threshold to group similar pixels into "colour groups" and count them. Whether you take the average or the most frequent colour group depends on what matters most to you; either way you should get a fairly accurate representation of the predominant colour of the page.
[1] http://cutycapt.sourceforge.net/ [2] http://weblogs.mozillazine.org/roc/archives/2005/05/rendering_web_p.html
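The "colour group" idea could look roughly like the Python sketch below. It assumes the screenshot has already been decoded into a sequence of `(r, g, b)` tuples by some imaging library; the function name `dominant_colour` and the `bucket_size` parameter are made up for illustration, and fixed-size buckets are just one simple way to do the grouping.

```python
from collections import Counter

def dominant_colour(pixels, bucket_size=32):
    """Return the centre of the most common colour bucket, or None.

    pixels is an iterable of (r, g, b) tuples with channels in 0..255.
    Each channel is divided into buckets of bucket_size values, so
    similar shades are counted together as one "colour group".
    """
    counts = Counter(
        tuple(channel // bucket_size for channel in pixel) for pixel in pixels
    )
    if not counts:
        return None
    bucket, _count = counts.most_common(1)[0]
    half = bucket_size // 2
    # Map the winning bucket back to a representative RGB value.
    return tuple(min(255, index * bucket_size + half) for index in bucket)
```

A smaller `bucket_size` distinguishes more shades but splits the votes between them; a larger one merges more shades into a single group. Near-black, grey, and white pixels would have to be dropped before counting, otherwise the page background is likely to win.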