I'm trying to build an automatic captcha input system: it recognizes the figures in a captcha image with deep learning and enters them automatically. For that, the captcha image has to be fed into the deep learning system.
The captcha image on the web page is given in an img src, but the src does not end with a file extension such as jpg or png.
It looks like this (an example):
<img src="/nn/mm/captchaimg?/kk=image">
The image shown in the browser changes after a while (about 1 to 2 minutes). If the page containing it is reloaded, a different captcha image is shown, which means that in a web crawler the image changes with each request to the server.
How can I save this image, given these properties, from the HTML in a crawler?
I'm now doing it with goquery in Golang.
I need to do the following:
1. Save the captcha image from the web page fetched by the crawler.
2. Get the figures in that captcha image using my deep learning system.
3. Enter the figures into the form on the page from step 1 and submit it, keeping the same session (the page must not be reloaded or requested again).
I have already built the deep learning system for Step 2 (it has been tested and works well).
But I have no idea how to do Steps 1 and 3.
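One way Steps 1 and 3 might fit together is sketched below, assuming the server ties the captcha to a session cookie (this is not confirmed for the site in question). A single http.Client with a cookie jar loads the page once, requests the img src with the same cookies, and posts the form without reloading the page. The stand-in httptest server, the cookie name sid, the /login and /submit paths, the form field captcha, and the answer 1234 are all hypothetical. In a real crawler the src would come from goquery (for example doc.Find("img").Attr("src")) and the answer from the deep learning system.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"net/url"
)

// newSessionClient returns a client whose cookie jar is reused by every request.
func newSessionClient() (*http.Client, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	return &http.Client{Jar: jar}, nil
}

// resolveSrc turns a relative img src into an absolute URL for the page.
func resolveSrc(pageURL, src string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	u, err := base.Parse(src)
	if err != nil {
		return "", err
	}
	return u.String(), nil
}

// fetch performs a GET with the shared client and returns the body bytes.
func fetch(client *http.Client, target string) ([]byte, error) {
	resp, err := client.Get(target)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", target, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// check stops the sketch on the first error.
func check(err error) {
	if err != nil {
		log.Fatal(err)
	}
}

func main() {
	// Hypothetical stand-in for the real site: the page sets a session
	// cookie, and the captcha endpoint refuses requests without it.
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		http.SetCookie(w, &http.Cookie{Name: "sid", Value: "abc", Path: "/"})
		fmt.Fprint(w, `<img src="/nn/mm/captchaimg?/kk=image">`)
	})
	mux.HandleFunc("/nn/mm/captchaimg", func(w http.ResponseWriter, r *http.Request) {
		if _, err := r.Cookie("sid"); err != nil {
			http.Error(w, "no session", http.StatusForbidden)
			return
		}
		fmt.Fprint(w, "fake-image-bytes")
	})
	mux.HandleFunc("/submit", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "answer=%s", r.FormValue("captcha"))
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	client, err := newSessionClient()
	check(err)

	// Step 1: load the page once, then request the image with the same cookies.
	pageURL := srv.URL + "/login"
	_, err = fetch(client, pageURL) // a real crawler would parse this body with goquery
	check(err)
	imgURL, err := resolveSrc(pageURL, "/nn/mm/captchaimg?/kk=image")
	check(err)
	img, err := fetch(client, imgURL)
	check(err)

	// Step 2 (placeholder): the deep learning system would read img here.
	answer := "1234"

	// Step 3: post the form with the same client, without reloading the page.
	resp, err := client.PostForm(srv.URL+"/submit", url.Values{"captcha": {answer}})
	check(err)
	defer resp.Body.Close()
	fmt.Printf("downloaded %d image bytes; submit status: %s\n", len(img), resp.Status)
}
```

The point of the design is that the page GET, the image GET, and the form POST all go through the same client, so whatever cookies the first response sets are sent with the later two requests. If the real site also uses hidden form fields or tokens, those would have to be read from the page and included in the POST as well.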
Any advice will be helpful to me.
Thank you in advance.
There is a lot of documentation about downloading images in a Go crawler.
But I cannot find a way to download an image that changes every time its URL is requested (I want to download the image that belongs to my first request for the page that contains it).
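Since the src has no file extension, one possible approach is to look at the downloaded bytes instead of the URL. The sketch below uses http.DetectContentType from the standard library to guess the format and pick an extension before writing the file. The helper names extFor and saveCaptcha and the fallback extension .bin are assumptions made for illustration, and the sample bytes are only a PNG signature standing in for a real download.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// extFor guesses a file extension from the leading bytes of the image.
func extFor(data []byte) string {
	switch http.DetectContentType(data) {
	case "image/png":
		return ".png"
	case "image/jpeg":
		return ".jpg"
	case "image/gif":
		return ".gif"
	default:
		return ".bin" // unknown format: keep the bytes, use a neutral extension
	}
}

// saveCaptcha writes data to baseName plus the detected extension
// and returns the file name that was used.
func saveCaptcha(baseName string, data []byte) (string, error) {
	name := baseName + extFor(data)
	if err := os.WriteFile(name, data, 0o644); err != nil {
		return "", err
	}
	return name, nil
}

func main() {
	// Only the PNG signature; in a crawler this would be the response body
	// of the captcha request.
	sample := []byte("\x89PNG\r\n\x1a\n")
	fmt.Println(extFor(sample))
}
```

If the deep learning system can accept raw bytes, the file may not be needed at all and the response body could be passed on directly.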