Is there a way to detect whether an HTML file contains JavaScript? And can we stop the JavaScript in that HTML from running, in Node.js?
I know we can stop the HTML rendering altogether by changing the response content type from `text/html` to `text/plain`. But I'm trying to figure out some way to stop only the JS.
Kindly let me know if it's even possible. Thanks.
I'm guessing you're sending the file to a browser from Node.js (you talked about changing the content type header).
To do this, you'll need to:
1. Parse the file with an HTML parser (there are a few available for Node.js). Be sure it's one that normalizes input, so that (for instance) an entity-encoded URL such as `<a href="&#106;avascript:codeHere()">xxx</a>` is normalized to `<a href="javascript:codeHere()">...</a>`. (Thanks Quentin for emphasizing that!)
2. Using the resulting document model, remove:
   - any `script` elements
   - any `onxyz` attributes (`onclick`, `onmouseover`) on elements. For instance, `<div onclick="..."` should be changed to `<div ...`.
   - any URL attributes (like `href` on `a` elements) that use the `javascript:` scheme. For instance, `<a href="javascript:codeHere()"` should be changed to `<a href="#"` or similar (if you remove `href` entirely, that works too, but the link will no longer automatically be a tab stop etc.). (This is where normalization in the parser is important.)
3. Serialize the resulting document model to HTML and send it to the browser.
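The removal step above can be sketched roughly as follows. This is only an illustration under stated assumptions. The node shape (`tag`, `attrs`, `children`, `text`) is hypothetical and does not match any particular parser's API. The `URL_ATTRS` list is illustrative and not exhaustive. The function names `isJavaScriptUrl` and `stripScripts` are made up for this example. Parsing and serialization are left to whichever HTML parser you choose, and the same checks would be applied to that parser's own node objects.

```javascript
// Hypothetical node shape (an assumption, not a real parser's API):
//   element: { tag: "a", attrs: { href: "..." }, children: [...] }
//   text:    { text: "..." }

// Attributes that can carry a URL; illustrative, not exhaustive.
const URL_ATTRS = new Set(["href", "src", "action", "formaction", "xlink:href"]);

// True if the (already entity-decoded) value uses the javascript: scheme.
// Browsers ignore leading whitespace/control characters and embedded
// tabs/newlines in URLs, so those are stripped before the comparison.
function isJavaScriptUrl(value) {
  const cleaned = String(value)
    .replace(/[\t\n\r]/g, "")
    .replace(/^[\u0000-\u0020]+/, "")
    .toLowerCase();
  return cleaned.startsWith("javascript:");
}

// Returns a cleaned copy of the node, or null if the node should be dropped.
function stripScripts(node) {
  if (node.text !== undefined) return { text: node.text };
  if (node.tag.toLowerCase() === "script") return null;

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    if (lower.startsWith("on")) continue; // onclick, onmouseover, ...
    if (URL_ATTRS.has(lower) && isJavaScriptUrl(value)) {
      attrs[name] = "#"; // keep the attribute so links remain tab stops
      continue;
    }
    attrs[name] = value;
  }

  const children = (node.children || [])
    .map(stripScripts)
    .filter((child) => child !== null);
  return { tag: node.tag, attrs, children };
}

// Example input: a div with an inline handler, a script, and a javascript: link.
const tree = {
  tag: "div",
  attrs: { onclick: "doSomething()", class: "box" },
  children: [
    { tag: "script", attrs: {}, children: [{ text: "alert(1)" }] },
    { tag: "a", attrs: { href: " JaVa\tScript:codeHere()" }, children: [{ text: "xxx" }] },
  ],
};
const cleanedTree = stripScripts(tree);
console.log(JSON.stringify(cleanedTree));
```

Dropping every attribute whose name starts with `on` is deliberately broad, which avoids having to maintain a list of event handler names. Note that the `javascript:` check only works reliably on attribute values the parser has already decoded, which is why normalization in step 1 matters.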