I have been working on this problem for two days without any real luck. I am using ASP.NET Web API 2 with jQuery AJAX on the client side.
I have an edit box for entering memo text. The allowable characters are ^[©a-zA-Z0-9\u0900-\u097f,\.\s\-\'\"!?\(\)\[\]]+$ plus two tags, <LineBreak/> and <Link attr="value"/> (the Link tag may have a couple more attributes).
The problem is that NO other tags are allowed, which means that even a simple <br/> should be rejected. This negative check is proving to be a bit complicated.
I would like help formulating a regex for JavaScript on the client side and a C# DataAnnotation check on the server side.
What you're attempting to do is sanitize user input; however, using JavaScript and regex is the wrong way to go about it.
Don't concern yourself with validating user input on the front end, at least not yet. The focus should be on validating it server side first, and the best tool for the job is HtmlSanitizer.
I've mocked up a demo on dotnetfiddle.net using that library for you to play with
Edit
Regex isn't made for this type of task. To sanitize HTML properly you need to parse the document, meaning its tags, attributes, and the values within those attributes, into a tree-like structure, because there are too many edge cases to cover with regex alone. Regex is better suited to scraping data from a source whose structure is already predictable; user input isn't one of those things.
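To illustrate the kind of edge case meant here, below is a minimal JavaScript sketch of a naive regex filter being bypassed. The function name naiveStrip and the payload are made up for illustration; this is not how HtmlSanitizer works, only a demonstration of why a single regex pass over the raw string is fragile.

```javascript
// Hypothetical naive filter: delete anything that looks like a <script> or </script> tag.
function naiveStrip(input) {
  return input.replace(/<\/?script[^>]*>/gi, "");
}

// The attacker nests a tag inside a broken tag. Removing the inner
// tags glues the outer fragments back together into a working tag.
const payload = "<scr<script>ipt>alert(1)</scr</script>ipt>";
const result = naiveStrip(payload);

console.log(result); // <script>alert(1)</script>
```

A single pass removes the inner tags and thereby reassembles the outer ones. Looping until the string stops changing handles this one case, but the same game repeats with event-handler attributes, encodings, and malformed markup, which is why a real parser is the safer foundation.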
Even though your use case is simple enough, you're still allowing users to type in HTML that will be redisplayed to other users in its raw form, so anything you miss will give you a headache down the line.
Here's the XSS Filter Evasion Cheat Sheet from OWASP. If regex could cover everything listed there, I would say fine, but achieving that in regex is so difficult that it just doesn't make sense.
HtmlSanitizer, on the other hand, does cover the issues listed on that cheat sheet. It's actively maintained and built specifically for this sort of application. It's also not bulky by any means: it can handle large sanitization tasks with processing times in the 50-100ms range.