Xenu Link checker


I want to use an application that checks for broken links, and I have learned that Xenu is one such tool. I do not have access to the site's internal aspx/http files on a drive. The problem I am facing is that the website requires the user to be authenticated. After logging in, I need to crawl the site to determine which links are broken.

As an example, take mail.google.com. After typing the username and password, we are served different URLs. If I give Xenu (or a similar program) a link such as mail.google.com, it will not be able to fetch the URLs inside mail.google.com, which are of the form /mail/u/0/?shva=1#inbox/ and so on. There lies the problem.

With minimal scripting, how can I give Xenu (or a similar app) the ability to log in when it is given the external URL (mail.google.com in this example), so that it can then carry out its usual crawl?

Thanks
Balaji S


Xenu can be used with an authenticated user as long as the cookies are persistent. You will need to enable cookies in Xenu and log in once yourself using IE.

From their FAQ:

By default, cookies are disabled, and Xenu rejects all cookies. If you need cookies because

  • you have used Internet Explorer to authenticate yourself before starting a run
  • to prevent the server from delivering URLs with a session ID

then you can enable the cookies in the advanced options dialog. (This has been available since Version 1.2g) Warning: You should not use this option if you have links that delete data, e.g. a database or a shop - you are risking data loss!!!

You can enable cookies in the Options menu. Click Preferences and switch to the Advanced tab.

For single-page applications (like Gmail) you will also need to configure Xenu to parse JavaScript. This is done by modifying the ini file (traditionally at C:\Program Files (x86)\Xenu135\Xenu.ini) and adding the following line under [Options]:

Javascript=[Jj]ava[Ss]cript: *[_a-zA-Z0-9]+ *\( *['"]((/|ftp://|https?://)[^'"]+)['"]

There are several variations provided in their FAQ, but I didn't get them to work perfectly.
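
To see what the pattern above is meant to capture, here is a minimal Python sketch that runs the same expression over some made-up markup. It assumes Xenu's regex flavor behaves like Python's `re` for this pattern, which I have not verified; the `extract_js_urls` helper and the sample HTML are hypothetical and only for illustration.

```python
import re

# The pattern from the Xenu.ini line above, without the "Javascript=" key.
# Assumption: Xenu interprets it the same way Python's re module does.
JS_LINK = re.compile(
    r"""[Jj]ava[Ss]cript: *[_a-zA-Z0-9]+ *\( *['"]((/|ftp://|https?://)[^'"]+)['"]"""
)


def extract_js_urls(html):
    """Return the URL captured by group 1 for each javascript: call in html."""
    return [m.group(1) for m in JS_LINK.finditer(html)]


# Hypothetical markup, invented for this example.
sample = '''
<a href="javascript:openWin('/mail/help.html')">Help</a>
<a href="JavaScript: popup( 'https://example.com/a?b=1' )">A</a>
<a href="javascript:void(0)">No URL here</a>
<a href="javascript:go('relative/page.html')">Relative path</a>
'''

print(extract_js_urls(sample))
```

Only the first two links are picked up: the pattern requires a quoted first argument that starts with /, ftp://, http:// or https://, so calls like void(0) or ones with a relative path are skipped. That may be one reason a variation works on one site and not on another.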