Watch changes in a .jspx page with session hash


I want to watch a web page for changes, but I don't know how to get a URL that I can add to a watcher.

This is how you reach the desired page. First, go to this page. (If you get a Service Temporarily Unavailable error, try forcing a reload with Shift+click on Reload, or open it in a new tab with Alt+Enter.)

There are several items beginning with "Convocatoria". Every item has several sections (OPE, Tipo de personal, etc.) followed by 4 links. The page you are taken to when you click the 3rd link (Tablón de anuncios) is the one I'm interested in. Choose one item (the first, for example) and click Tablón de anuncios. The highlighted part is what I want to watch, i.e. whether any line is added to the table (there is only one at the time of writing).

Text to watch highlighted

Inspecting the element (that 3rd link) shows

<a onclick="submitForm('_id2',1,{source:'_id63'});return false;" class="x8w" href="#">Tablón de anuncios</a>

The function submitForm is in https://sede.cordoba.es:4443/seleccion/web-empleo/adf/jsLibs/Common10_1_3_4_0.js (line 2749). If I set a breakpoint at the beginning of that function and step through it, I reach line 3013:

form.submit();

That is as far as I can get. The page loads, but the resources loaded after that are only images, CSS, JS, etc. The only document I see in the Network tab has the same URL as the first page (...faces/empleo.jspx).

  • A valid answer would be "This is the URL you need", if you can easily infer how submitForm('_id2',1,{source:'_id63'}) transforms into that URL, or an explanation of the steps needed to find it out.

  • Or a documented source and an explanation of why this is impossible (without hacking the server or so).


Solution in Android (or so it seems)

Looking for info on this topic I stumbled upon this free Android app (there's also a pro version, but you don't need it for this) so I tried it. It has an embedded browser: you can open a page, navigate wherever you want to, select an element (or group of them) inside that final page and watch it/them.

I navigated to the page I needed, selected the table and the program showed me only the text inside it to compare for changes, so I think when there's a new row it will alert me.

Each alert can be configured to run JavaScript or not, so I enabled it. So far it has shown no alert, which suggests it reaches the desired page and not the first one I opened, but I can't be sure until there is a change. I don't know when that will happen, but I'll report the outcome here.

I would have preferred a watch from Windows (and I'll keep searching) but for now I feel confident enough. When watching pages I prefer to use two different solutions just in case and here all the more.

Almost a solution

The curl command provided by Marks Polakovs gave me an idea. If you can download the intended page (with curl or whatever) to a file.htm inside a folder, it's easy to start a localhost web server (I know PHP and Python can do it, but there must be plenty of others) that serves that folder, and then watch localhost/file.htm for changes.

All you have to do then is to run a Windows task every x minutes/hours to refresh file.htm. I suppose a headless browser (PhantomJS or the like) would do the job of downloading, virtually clicking the link and saving the page, but I actually don't know how to do it.
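The comparison step of that idea can be sketched independently of how file.htm gets downloaded. This is only a rough illustration: the function names tableRowsText and addedRows are made up for this example, and it assumes the watched content is a plain HTML table whose rows can be picked out with a naive regular expression (no nested tables).

```javascript
// Reduce an HTML fragment to its visible text, one string per table row.
// Regex-based and deliberately naive: enough for a simple table,
// not a general HTML parser.
function tableRowsText(html) {
  const rows = html.match(/<tr\b[^>]*>[\s\S]*?<\/tr>/gi) || [];
  return rows
    .map((row) =>
      row
        .replace(/<[^>]+>/g, " ") // drop tags
        .replace(/\s+/g, " ")     // collapse whitespace
        .trim()
    )
    .filter((line) => line.length > 0);
}

// Return the rows that appear in the current copy but not in the previous one.
function addedRows(previousHtml, currentHtml) {
  const seen = new Set(tableRowsText(previousHtml));
  return tableRowsText(currentHtml).filter((line) => !seen.has(line));
}
```

A scheduled task could keep the previous download, call addedRows with the old and new copies, and raise an alert whenever the result is not empty.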

No answer provides a working proof of concept of its idea but at least Marks Polakovs tried it and gave valuable hints, so I'll upvote it and the system will assign him the bounty.

EECOLOR On

The form, including the URL that the data is posted to, is in the document itself. In your specific example the form looks like this:

<form id="_id2" ... method="POST" action="/seleccion/web-empleo/faces/empleo.jspx;jsessionid=efba550217d9f5adbe8cbe05c049fbc967ac5e5fe755adb2a53bf3a41aa5eaa8.e3iMchmQbhuNe3aLbN8Rax8Nay0">

So, if you want to intercept calls to this url you can do the following:

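// Keep a reference to the original function, then wrap it.
var _oldSubmitForm = submitForm;
submitForm = function (form, doValidate, parameters) {
  // Log the URL the form is about to be posted to.
  console.log(document.forms[form].action);
  return _oldSubmitForm(form, doValidate, parameters);
};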

I would not recommend doing it like this, but I don't know the situation you are in and how much control you have over the generated HTML.


Update

As cdlvcdlv explained, the above is not enough. The situation here is quite tricky. The desired content is only rendered when the form is posted with specific information. Just posting the form will not work as the server expects a Java session to be available.

The only way to get the server to display the desired page is by performing the following steps:

If you want a service that can be given a URL and simply loads the page, you need to set up a server yourself to perform the steps above. This server would perform the 2 actions and return the appropriate content.

To summarize: there is no way to get the correct page rendered using a single GET request.
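One plausible reading of the 2 actions is: first GET the landing page so the server opens a session, then POST the form back with the parameters of the clicked link. The sketch below follows that reading and is not a tested solution. It assumes the session id travels in the form's action URL (as in the form tag quoted above), that the state token sits in a hidden input named oracle.adf.faces.STATE_TOKEN, and that Node 18+ with its global fetch is available. The names formAction, hiddenValue and fetchBoard are invented for this example.

```javascript
// Pull the action URL (including ;jsessionid=...) out of the form with the given id.
function formAction(html, formId) {
  const tag = html.match(new RegExp(`<form\\b[^>]*\\bid="${formId}"[^>]*>`, "i"));
  if (!tag) return null;
  const action = tag[0].match(/\baction="([^"]*)"/i);
  return action ? action[1] : null;
}

// Read the value of an <input> by name (assumption: the state token is a hidden input).
function hiddenValue(html, name) {
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const tag = html.match(new RegExp(`<input\\b[^>]*\\bname="${escaped}"[^>]*>`, "i"));
  if (!tag) return null;
  const value = tag[0].match(/\bvalue="([^"]*)"/i);
  return value ? value[1] : null;
}

async function fetchBoard(startUrl) {
  // Action 1: GET the landing page; the server opens a session.
  const landing = await (await fetch(startUrl)).text();
  const action = formAction(landing, "_id2");
  const token = hiddenValue(landing, "oracle.adf.faces.STATE_TOKEN");
  if (!action || !token) throw new Error("form or state token not found");

  // Action 2: POST the form back to its own action URL, as the link's onclick would.
  const body = new URLSearchParams({
    "oracle.adf.faces.FORM": "_id2",
    "oracle.adf.faces.STATE_TOKEN": token,
    event: "",
    source: "_id63",
  });
  const response = await fetch(new URL(action, startUrl), { method: "POST", body });
  return response.text();
}
```

If the server also relies on a session cookie, this sketch would have to forward it in the second request; whether that is needed here is not something the thread establishes.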

Geuis On

There's no way to "watch" the url changing. Clicking on that link submits the form back to the same source url, triggering a complete page reload.

Marks Polakovs On

Stepping through the code for the onclick handler you see that at some point it finds the form with ID _id2 (line 2775) and, after adding some parameters through hidden fields ({source:'_id63'}), submits it. (To find this out, open Common10_1_3_4_0.js in the Chrome DevTools, place a breakpoint on line 2758, and step through the execution.) The request itself is a POST to https://sede.cordoba.es:4443/seleccion/web-empleo/faces/empleo.jspx with the following POST parameters:

oracle.adf.faces.FORM: _id2
oracle.adf.faces.STATE_TOKEN: 5
event: 
source: _id63

Some of them were there already in the form's HTML, while others were added by submitForm().

So, to answer your question, your HTTP watcher will need to support POST, and if it does, add that URL with those parameters to it.

EDIT: here's a cURL command (escaped for Windows cmd):

curl -X POST "https://sede.cordoba.es:4443/seleccion/web-empleo/faces/empleo.jspx" --data "oracle.adf.faces.FORM=_id2^&oracle.adf.faces.STATE_TOKEN=9^&event=^&source=_id63"
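For a watcher that is scripted rather than run from cmd, the same request can be sketched in JavaScript. This is an illustration only: it assumes Node 18+ (global fetch), and the names BOARD_URL, boardRequestBody and postBoard are made up for the example. The state token is 5 in the parameter listing and 9 in the command above, which suggests it changes between page loads, so the sketch takes it as an argument instead of hard-coding it.

```javascript
const BOARD_URL =
  "https://sede.cordoba.es:4443/seleccion/web-empleo/faces/empleo.jspx";

// Encode the four POST parameters as application/x-www-form-urlencoded.
function boardRequestBody(stateToken) {
  return new URLSearchParams({
    "oracle.adf.faces.FORM": "_id2",
    "oracle.adf.faces.STATE_TOKEN": String(stateToken),
    event: "",
    source: "_id63",
  }).toString();
}

// Send the POST and return the response HTML.
async function postBoard(stateToken) {
  const response = await fetch(BOARD_URL, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: boardRequestBody(stateToken),
  });
  return response.text();
}
```

Calling boardRequestBody(9) yields the same data string as the curl command, without the cmd-specific ^ escapes.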