I have seen that many sites provide free HTTP proxy lists (for example, this site), and I want to write a script that searches the internet for HTTP proxies rather than scraping those sites.
I googled a lot but couldn't find any papers or blog posts about methods of scanning the web for HTTP proxies.
Any ideas will be appreciated.
A few months ago I needed the same thing, and in the end I abandoned the idea of finding them by googling, because the proxies found that way were old or expired.
I approached the problem another way, and now I get ~1K fresh proxies every hour or so (which is enough for me).
As part of my last project (a full-featured scraping platform based on ZeroMQ/Mongo/PHP/CasperJS), I built a free-proxy crawler that I think does what you need, except that it targets specific free-proxy sites (15 in my case). Using simple XPath/regex extraction (PHP/curl on the raw HTML, CasperJS on the browser-evaluated HTML), it pulls out the proxy lists, verifies each proxy's availability, and geo-IPs them so they can be filtered by region, performance, etc.
I suggest you do the same: first identify valid free-proxy sources, then scrape them as frequently as you need (most of them update their free lists every hour or so).
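To illustrate the extract-and-verify steps, here is a minimal sketch in Python (my crawler is PHP/curl + CasperJS, so this is an analogue, not my actual code; the `ip:port` layout and the test URL are assumptions that vary per source site):

```python
import re
import urllib.request

# Matches proxies listed in "ip:port" form; sites that split IP and port
# into separate table cells need XPath or a site-specific regex instead.
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})\b")

def extract_proxies(html):
    """Pull ip:port pairs out of raw HTML with a regex."""
    return [f"{ip}:{port}" for ip, port in PROXY_RE.findall(html)]

def is_alive(proxy, timeout=5):
    """Verify availability by fetching a known URL through the proxy."""
    handler = urllib.request.ProxyHandler({"http": f"http://{proxy}"})
    opener = urllib.request.build_opener(handler)
    try:
        opener.open("http://example.com/", timeout=timeout)
        return True
    except Exception:
        return False
```

Run `extract_proxies` over each source page on a schedule, then filter the result through `is_alive` (in parallel, ideally, since dead proxies cost a full timeout each) before adding proxies to your pool.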
Hope it helps