How to crawl a login-protected site or page?


I want to crawl a site that requires a login to view its pages. I can crawl the pages available to guests, but how do I crawl the login-protected ones? It would be great if somebody could share the steps to configure, or skip, the authentication mechanism so that I can crawl such pages with StormCrawler.

Thank you very much in advance.

Best answer

If the site is protected by HTTP Basic authentication, you can set the following keys, with your credentials as their values, in the configuration of your topology:

http.basicauth.user
http.basicauth.password

See the wiki page on configuration for details.
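
As a rough illustration, the two keys could be added to the crawler's YAML configuration file like this. This is only a sketch: it assumes the usual layout in which settings sit under a top-level `config:` map, and the user name and password shown are placeholders, not real values.

```yaml
# Sketch of a configuration fragment (assumed file layout: settings under "config:").
config:
  # Credentials sent with HTTP Basic authentication.
  # Both values below are placeholders; replace them with your own.
  http.basicauth.user: "my-user"
  http.basicauth.password: "my-secret"
```

Since the password is stored in plain text, it is probably wise to keep this file out of version control.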