The problem
I'm testing an HTTP proxy that wraps a SOCKS proxy (Tor). It works fine for normal URLs, but I'm getting strange results with some .onion addresses.
In this example, I'm pointing at "the hidden wiki". The output looks like garbage:
$ curl --proxy localhost:8118 http://kpvz7ki2v5agwt35.onion/
m�AO�@�����ۑp��ĖPbj
Background
Using the torch hidden service works ok:
$ curl --proxy localhost:8118 http://xmh57jrzrnw6insl.onion/
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>TORCH: Tor Search!</title>...
Similarly, normal URLs seem ok:
$ curl --proxy localhost:8118 https://check.torproject.org/ | grep Congratulations
<img alt="Congratulations. Your browser is configured to use Tor." src="/images/tor-on.png">
Congratulations. Your browser is configured to use Tor.<br>
The proxy is Polipo, run with the following configuration:
proxyName = "localhost"
proxyAddress = "127.0.0.1"
proxyPort = 8118
allowedClients = 127.0.0.1
allowedPorts = 1-65535
cacheIsShared = false
chunkHighMark = 67108864
socksParentProxy = "localhost:9050"
socksProxyType = socks5
diskCacheRoot = ""
localDocumentRoot = ""
disableLocalInterface = true
disableConfiguration = true
disableVia = true
dnsUseGethostbyname = yes
maxConnectionAge = 5m
maxConnectionRequests = 120
serverMaxSlots = 8
serverSlots = 2
tunnelAllowedPorts = 1-65535
Possible causes
My thoughts on possible causes:
- The server is responding with garbage as some kind of anti-web-crawler measure.
- There is something wrong with the way I'm handling the response.
- Polipo is messing it up.
- Something else...
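One way to test the second hypothesis is to check whether the body is simply compressed. This is only a sketch: `looks_gzipped` and the file name `body.bin` are hypothetical, and the check recognises gzip only, so a negative result does not rule out deflate or another encoding.

```shell
# looks_gzipped FILE: succeeds if FILE begins with the gzip magic bytes 1f 8b.
# (Hypothetical helper name; it detects gzip only, not deflate or other encodings.)
looks_gzipped() {
  magic=$(head -c 2 "$1" | od -An -tx1 | tr -d '[:space:]')
  [ "$magic" = "1f8b" ]
}

# Example use (needs the proxy from this question to be running):
#   curl -s --proxy localhost:8118 -o body.bin http://kpvz7ki2v5agwt35.onion/
#   looks_gzipped body.bin && gunzip -c body.bin
```

curl's `-i` option prints the response headers, so any `Content-Encoding` header would be visible, and `--compressed` asks curl to request a compressed response and decode it itself.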
Thoughts?
There are several issues. First, make sure that NTP is running and the system clock is synced. If it isn't, you will have precisely this problem.
If I may ask, why aren't you just calling the SOCKS port on 9050 that Tor itself provides?
Also, why not specify exactly which protocol you want: SOCKS 4, 4a, or 5? With curl, you can even specify that DNS resolution is done through Tor. If you don't do that, then of course your DNS resolver won't know where the hidden services are. Put this line in your .curlrc file
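The line itself is not shown in this post. One candidate, offered as an assumption rather than the original text, is curl's `socks5-hostname` option, which makes curl pass the hostname to the SOCKS5 proxy for resolution. The sketch below appends it to a curl config file; the helper name `add_tor_socks_line` is hypothetical, and `127.0.0.1:9050` assumes Tor is listening on its default SOCKS port on the local machine.

```shell
# add_tor_socks_line FILE: append a SOCKS5 proxy entry to a curl config file
# unless the exact line is already present. Hypothetical helper; the address
# 127.0.0.1:9050 assumes Tor's default SOCKS port on the local machine.
add_tor_socks_line() {
  line='socks5-hostname = "127.0.0.1:9050"'
  grep -qxF -- "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Example use:
#   add_tor_socks_line "$HOME/.curlrc"
```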
and then
curl -v http://xmh57jrzrnw6insl.onion/
will return the page correctly, provided the Tor network isn't overloaded. Check in the Tor Browser to make sure the hidden services are up.