Block bots by their Java user-agent string?


My logs show a lot of hits that crawl most of the top-level pages of my site and report a Java version as the user agent (the referrer field itself is just "-").

The Java version varies: Java/1.6.0_04, Java/1.4.1_04, Java/1.7.0_25, etc.

Sometimes, but not always, /contact/ returns a 404, while none of the other pages listed below do.

According to Project Honey Pot, the IPs almost always belong to spam harvesters and bots:

78.129.252.190 - - [24/Jan/2014:01:28:52 -0800] "GET / HTTP/1.1" 200 6728 "-" "Java/1.6.0_04" 198 7082
78.129.252.190 - - [24/Jan/2014:01:28:55 -0800] "GET /about HTTP/1.1" 301 - "-" "Java/1.6.0_04" 203 352
78.129.252.190 - - [24/Jan/2014:01:28:55 -0800] "GET /about/ HTTP/1.1" 200 29933 "-" "Java/1.6.0_04" 204 30330
78.129.252.190 - - [24/Jan/2014:01:28:56 -0800] "GET /articles-columns HTTP/1.1" 301 - "-" "Java/1.6.0_04" 214 363
78.129.252.190 - - [24/Jan/2014:01:28:57 -0800] "GET /articles-columns/ HTTP/1.1" 200 29973 "-" "Java/1.6.0_04" 215 30370
78.129.252.190 - - [24/Jan/2014:01:28:58 -0800] "GET /contact HTTP/1.1" 301 - "-" "Java/1.6.0_04" 205 354
78.129.252.190 - - [24/Jan/2014:01:28:58 -0800] "GET /contact/ HTTP/1.1" 200 47424 "-" "Java/1.6.0_04" 206 47827
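
To make it clear which field carries the Java string, here is a minimal Python sketch that splits lines like the ones above into named fields. It assumes the Apache combined log format and ignores the two extra numbers at the end of each line. The names `LOG_PATTERN`, `JAVA_AGENT`, `parse_line`, and `is_java_client` are made up for illustration; this is not a blocking solution, only a way to inspect the log.

```python
import re

# Assumed layout (Apache combined format):
# host ident user [time] "request" status bytes "referrer" "user agent"
# Anything after the user-agent field (two extra numbers in the sample) is ignored.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Matches agents such as Java/1.6.0_04 or Java/1.7.0_25.
JAVA_AGENT = re.compile(r'^Java/\d')


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_java_client(line):
    """True when the User-Agent field (not the referrer) starts with Java/<digit>."""
    fields = parse_line(line)
    return fields is not None and bool(JAVA_AGENT.match(fields["user_agent"]))


sample = ('78.129.252.190 - - [24/Jan/2014:01:28:52 -0800] "GET / HTTP/1.1" '
          '200 6728 "-" "Java/1.6.0_04" 198 7082')
fields = parse_line(sample)
print(fields["referrer"], fields["user_agent"])  # prints: - Java/1.6.0_04
```

Run against the sample line, the referrer comes out as "-" and the Java version lands in the user-agent field, so any filter would have to match on the user agent.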

What are they looking for? A vulnerability?

Can I block these visits by their Java user-agent string? If so, how? With a PHP function?

Or should I block them by IP? I know how to do that in .htaccess, but it is a less proactive method.
