Our web service has been hit with some Zalgo text and I'm trying to come up with a good solution for the future. Our policy is to accept all user input and save it in permanent storage (we correctly encode the input for our backend so this part is okay). During the output phase, we run the original user input through a filter/parser with a whitelist to avoid XSS attacks and other mayhem. Lately some users have found the world of Zalgo and they just love to cause trouble for other people with it.
As I see it, Zalgo text is just a piece of Unicode text that leaks out of its intended container. As a result, I think automatically removing all complex combining characters is too drastic a defence. Does anybody know a CSS trick to force the Zalgo text to be contained in a given parent element without some nasty side effects?
For example, if I have
<section class="userinput">
... user input here ...
</section>
how can I make sure that the user input does not leak outside the borders of section.userinput? I guess overflow: hidden or clip: rect(...) could be the correct answer, but do you know something better for this use case? Preferably I could still use section.userinput { max-height: 200vh; } or something similar to prevent users from creating artificially long comments. If some comment were longer than 200vh, it should have a scroll bar for that comment alone. Normally there should be just a single scroll bar for the whole page.
Note that I'm trying to combat the problem in the visual domain only. I'm perfectly happy to accept any valid UTF-8 sequence as user input and I'm okay if a messed-up user comment results in that user comment looking like crap. I'm only trying to keep that crap from overflowing all over the place. Specifically, I'm not trying to block the Zalgo text or to filter Zalgo-like text before display.
After testing an example case with Firefox and Chrome, I would say the best option is to use the declaration overflow: auto. Using overflow: hidden would make sense only if possible scrollbars are considered worse than losing user content. The overflow: auto declaration falls back to scrollbars automatically if the content does not fit, and it still forces clipping to the selected element. The declaration clip: rect(0,auto,auto,0); is no good because it only works with position: absolute; and without overflow: visible. See an example without overflow: auto for a comparison. The above examples are inlined here as snippets:
An example without safeguards against Zalgo text:
An example with safeguards against bleeding Zalgo text:
With overflow: auto applied to each user input container (the only difference is the v1 class on the body element; you could also try class v2 to understand why it doesn't work for this purpose):
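As a rough sketch of what the safeguarded variant might look like: the selector and the max-height value come from the question, while scoping the rule under body.v1 is only an assumption about how the v1 class is used, so the rules in the actual example may differ.

```css
/* Sketch only: the body.v1 scoping is an assumption, not the original snippet. */
body.v1 section.userinput {
  /* Clip everything painted outside the box, stacked combining marks
     included, and show scrollbars only when the content does not fit. */
  overflow: auto;

  /* From the question: very long comments get their own scroll bar
     instead of stretching the page. */
  max-height: 200vh;
}
```

With this in place, the markup from the question (section class="userinput" wrapping the user input) needs no changes; removing the v1 class from body turns the safeguard off again for comparison.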