Using Request, HttpNtlmAuth to make a system call with authentication

10k Views Asked by At

I've got the following snip of code the has the audacity to tell me it is "FAIL to load undefnied" (the nerve...) I'm trying to pass my authenticated session to a system call that uses javascript.

    import requests
    from requests_ntlm import HttpNtlmAuth
    from subprocess import call
    # THIS WORKS - 200 returned


    s = requests.Session()
    r = s.get("http://example.com",auth=HttpNtlmAuth('domain\MyUserName','password'))
    call(["phantomjs", "yslow.js", r.url])

The issue is when "calL" gets called - all I get is the following

FAIL to load undefined.

Im guessing that just passing the correct authenticated session should work - but the question is how do I do it such that I can extract the info I want. Out of all the other attempts this has been the most fruitful. Please help - thanks!

1

There are 1 best solutions below

0
On BEST ANSWER

There seem to be couple things going on here so I'll address them one-by-one.

  1. The subprocess module in python is meant to be used to call out to the system as if you were using the command line. It knows nothing of "authenticated session"s and the command line (or shell) has no knowledge of how to use a python object, like a session, to work with phantomjs.

  2. phantomjs has python bindings since version 1.8 so I would expect this might be made easier by using them. I have not used them, however, so I can not tell you with certainty that they will be helpful.

I looked at yslow's website and there appears to be no way to pass it the content that you are downloading with requests. Even then, the content would not have everything (for example: any externally hosted javascript that would be loaded by selenium/phantomjs or a browser, is not loaded by requests)

yslow seems as though it normally just downloads the URL for you and performs its analysis. When the website is behind NTLM, however, it first sends the client a 401 response which should indicate to the client that it must authenticate. Further, information is sent to the client that tells it how to authenticate and provides it parameters to use when authenticating for NTLM. This is how requests_ntlm works with requests. The first request is made and generates a 401 response, then the authentication handler generates the proper header(s) and re-sends the request which is why you see the 200 response bound to r.

yslow accepts a JSON representation of the headers you want to send so you can try to use the headers found in r.request.headers but I doubt they will work.

In short, this is not a question that the people who normally follow the requests tag can help you with. And looking at the documentation for yslow it seems that it (technically) does not support authentication of any type. yslow developers might argue though that it supports Basic Authentication because it allows you to specify headers.