Is there a way to detect whether an HTML file contains JavaScript? And can we stop the JavaScript in that HTML from running, in Node.js?
I know we can stop the HTML rendering altogether by changing the response content-type from text/html to text/plain, but I'm trying to find a way to disable only the JS.
Please let me know if it's even possible. Thanks.
I'm guessing you're sending the file to a browser from Node.js (you talked about changing the content type header).
To do this, you'll need to:
1. Parse the file with an HTML parser (there are a few available for Node.js). Be sure it's one that normalizes input, so that (for instance) an href whose scheme is written with character references, such as <a href="&#106;avascript:codeHere()">xxx</a>, is normalized to <a href="javascript:codeHere()">...</a>. (Thanks Quentin for emphasizing that!)
2. Using the resulting document model, remove:
   - any script elements
   - any onxyz attributes (onclick, onmouseover) on elements. For instance, <div onclick="..." should be changed to <div ....
   - any URL attributes (like href on a elements) that use the javascript: scheme. For instance, <a href="javascript:codeHere()" should be changed to <a href="#" or similar (if you remove href entirely, that works too, but the link will no longer automatically be a tabstop etc.). (This is where normalization in the parser is important.)
3. Serialize the resulting document model to HTML and send it to the browser.
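As a rough illustration of the removal step, here is a minimal sketch that walks an already-parsed tree. The node shape ({ tag, attrs, children }, with text nodes as plain strings), the URL_ATTRS list, and the names stripScripts and isJavaScriptUrl are all assumptions made for this example. A real parser will give you its own document model, and the attribute list here is probably not exhaustive, so treat this as a starting point rather than a complete sanitizer.

```javascript
// Hypothetical node shape, standing in for whatever your HTML parser produces:
//   element: { tag: "a", attrs: { href: "..." }, children: [ ... ] }
//   text:    a plain string
// Assumed (not exhaustive) list of attributes that can hold a URL.
const URL_ATTRS = new Set(["href", "src", "action", "formaction", "xlink:href"]);

// Browsers tolerate leading whitespace and embedded tabs/newlines in a URL,
// so strip every control character and space before checking the scheme.
// This is deliberately conservative: it may flag a few harmless values.
function isJavaScriptUrl(value) {
  const cleaned = String(value).replace(/[\u0000-\u0020]/g, "");
  return /^javascript:/i.test(cleaned);
}

// Returns a cleaned copy of the tree; the input is not modified.
function stripScripts(node) {
  if (typeof node === "string") return node; // text node: nothing to strip

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    if (lower.startsWith("on")) continue; // drop onclick, onmouseover, ...
    if (URL_ATTRS.has(lower) && isJavaScriptUrl(value)) {
      attrs[name] = "#"; // keep the attribute so links stay focusable
      continue;
    }
    attrs[name] = value;
  }

  const children = (node.children || [])
    .filter((c) => typeof c === "string" || c.tag.toLowerCase() !== "script")
    .map(stripScripts);

  return { tag: node.tag, attrs, children };
}

// Usage: a small tree containing all three kinds of script carrier.
const doc = {
  tag: "div",
  attrs: { onclick: "doSomething()" },
  children: [
    { tag: "script", attrs: {}, children: ["alert(1)"] },
    { tag: "a", attrs: { href: "JavaScript:codeHere()" }, children: ["xxx"] },
  ],
};
const clean = stripScripts(doc);
console.log(JSON.stringify(clean, null, 2));
```

The check only works on attribute values the parser has already decoded, which is why the normalization point in step 1 matters. You would still need to serialize the cleaned tree back to HTML with your parser before sending it.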