I have an API that receives a string containing HTML and stores it in a database. I'm using the node-html-parser package to perform some logic on the HTML.
Among other things, I want to remove any potentially malicious scripts. According to the documentation, the package should be able to do this when instructed via the options object (see the 'Global Methods' heading in the previous link).
My code:
const parser = require('node-html-parser');
const html = `<p>My text</p><script></script>`;
const options = {
  blockTextElements: {
    script: false
  }
};
const root = parser.parse(html, options);
return ({ html: root.innerHTML });
I tried modifying the options object with script: true, noscript: false, and noscript: true as well, but none of these removed the script tags from the HTML.
Am I doing something wrong?
It seems that node-html-parser is somewhat buggy with script: false, but we can still use the library to work with the DOM. My solution is to use querySelectorAll to find all the <script> tags and remove them, so the final solution might look like: