This must've been asked before, but I couldn't find it.
I want to allow my users to enter text into an HTML form, and later to display that text in the webpage, exactly as it was written, avoiding:
1. XSS attacks
2. Encoded punctuation being displayed (e.g. %2C instead of a comma, + instead of a space)
3. Unexpected results due to < or > being used and the browser treating it as part of the HTML
The form enctype is the default application/x-www-form-urlencoded. I'm not sure if I really need this enctype, but for various reasons I'm sticking with it for now.
I've seen that I can partially fix (2) by using decodeURI or decodeURIComponent, although neither converts + back to a space.
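To illustrate the + problem, and one built-in that may sidestep it: URLSearchParams parses application/x-www-form-urlencoded strings and, unlike decodeURIComponent, turns + into a space. This is only a minimal sketch, assuming the raw encoded string is available to the script; the field name comment and the sample value are made up for the example.

```javascript
// Hypothetical raw form data for a field named "comment" (the name is an assumption).
const raw = "comment=Hello%2C+world+%2B+more";

// decodeURIComponent decodes %2C and %2B but leaves the literal + signs alone.
const viaDecode = decodeURIComponent(raw.split("=")[1]);
// viaDecode is "Hello,+world+++more"

// URLSearchParams follows the form-urlencoded rules: + becomes a space,
// and the percent-escapes are decoded as well.
const viaParams = new URLSearchParams(raw).get("comment");
// viaParams is "Hello, world + more"
```

If this holds up, the manual replace of + would not be needed for the decoding step, though the result still has to be escaped before it goes into the HTML.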
For the rest, isn't there also a built-in function I can use? The only libraries I found were server-side ones for .NET or Java. I didn't find anything for doing it client-side in JavaScript, but I did find plenty of stern warnings that if you roll your own code, you'll probably make subtle mistakes.
For now I'm using the myDecode function below, which seems to work, but I can't believe it's the best way.
function myDecode(string) {
    // First convert + to space, since decodeURIComponent may introduce new + characters that were previously encoded
    // Then use decodeURIComponent to convert all other punctuation
    // Then escape HTML special characters
    return htmlEscape( decodeURIComponent( string.replace(/\+/g, " ") ) );
}
function htmlEscape(string) {
    return string.replace( /&/g, "&amp;" ) // remember to do & first, otherwise you'll mess up the subsequent escaping!
                 .replace( /</g, "&lt;" )
                 .replace( />/g, "&gt;" )
                 .replace( /\"/g, "&quot;" )
                 .replace( /\'/g, "&#39;" );
}
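As a point of comparison with htmlEscape: when the text is put into the page from client-side script, the DOM's textContent property treats the assigned string as plain text rather than markup, so no manual escaping is involved. A minimal sketch, assuming the page has an element to write into; the helper name showUserText and the element id output are hypothetical.

```javascript
// Hypothetical helper: put user text into an element without any HTML parsing.
// `element` is assumed to be a DOM node, e.g. document.getElementById("output").
function showUserText(element, text) {
  // textContent stores the string as text; tags inside it are not interpreted.
  element.textContent = text;
}
```

In a browser this might be called as showUserText(document.getElementById("output"), decodedText), where decodedText is the already URL-decoded input. It only helps when the value is inserted via the DOM, not when HTML is built by string concatenation.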
My test is that the user can enter the below text and have it displayed as-written, without any changes and without running the script:
<script>alert( "Gotcha! + & + " );</script>
But I don't know if that's a strong enough test.
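One way the test might be widened is to run several awkward inputs through the same pipeline and check two things: that no raw <, >, or quote characters survive, and that undoing the escaping gives back exactly what was typed. The sketch below repeats the two functions so it runs on its own. The formEncode helper is only a rough stand-in for what the browser sends, and the sample inputs are made up, so passing it is no proof of safety.

```javascript
// Same two functions as in the question, repeated so the sketch is self-contained.
function htmlEscape(string) {
  return string.replace(/&/g, "&amp;") // & first, so later entities are not re-escaped
               .replace(/</g, "&lt;")
               .replace(/>/g, "&gt;")
               .replace(/"/g, "&quot;")
               .replace(/'/g, "&#39;");
}
function myDecode(string) {
  return htmlEscape(decodeURIComponent(string.replace(/\+/g, " ")));
}

// Hypothetical approximation of form encoding: percent-encode, then use + for spaces.
function formEncode(text) {
  return encodeURIComponent(text).replace(/%20/g, "+");
}

// Made-up sample inputs covering tags, attribute breakouts, quotes, %, + and non-ASCII.
const samples = [
  '<script>alert( "Gotcha! + & + " );</script>',
  "<img src=x onerror=alert(1)>",
  '" onmouseover="alert(1)',
  "it's 100% fine, 1+1=2",
  "&lt; already escaped &amp; friends",
  "caf\u00e9 \u2603",
];

const results = samples.map((text) => {
  const escaped = myDecode(formEncode(text));
  // The escaped output must not contain raw markup characters or quotes.
  const safe = !/[<>"']/.test(escaped);
  // Reversing the escaping (with & last) must reproduce the original input.
  const restored = escaped.replace(/&lt;/g, "<").replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"').replace(/&#39;/g, "'").replace(/&amp;/g, "&");
  return { text, safe, roundTrips: restored === text };
});
```

Each entry in results records whether that sample came out escaped and whether it round-trips; any entry with safe or roundTrips set to false would point at a case the current functions mishandle.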
This is just a small hobby project with no sensitive data and very few users, so it doesn't have to be totally bulletproof. But it would be nice to know how to do things the right way.