Concerns regarding potential Regex misuse?


I have the following regex, ^[a-zA-Z](.*)[a-zA-Z]$, on both the JavaScript and PHP side, which I have been using to validate a person's name and message fields on a contact form (no database interaction). It ensures that the first and last characters in the field are letters and allows anything in between.

My concerns are:

  1. For this type of functionality, should I be bothered with trying to validate a person's name or message? The only thing I am validating or rather protecting against is any malicious input.
  2. I'm unsure what type of attacks I could leave my site open to, if the only thing I do is check that the fields aren't empty.

Are these valid concerns? If I start catering for all types of Name and Message scenarios, I'm going to end up with a very long expression that will become too difficult to maintain. So is it really worth it, or is there a bare-minimum regex that I should use for these two fields to protect against malicious attacks/scripting?

(PS - I've just been reminded by one of my co-workers about names beginning with an " ! ")

THANK YOU!!


There are 2 best solutions below

Best answer

Now someone named Dieter Voß can’t use your contact form anymore. That’s bad.
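To make that concrete, here is a quick JavaScript check of the pattern from the question against a few inputs. The sample inputs other than the name above are my own illustrations, not taken from the original post.

```javascript
// The pattern from the question: the first and last characters must be
// ASCII letters; anything except a line break is allowed in between.
const namePattern = /^[a-zA-Z](.*)[a-zA-Z]$/;

// Sample inputs (illustrative assumptions).
const samples = [
  "Dieter Voß",                  // ends in ß, which is outside a-zA-Z
  "A",                           // the pattern needs at least two characters
  "Hi,\nplease call me back",    // "." does not match a newline by default
  "a<script>alert(1)</script>b", // markup wrapped in two letters
];

for (const input of samples) {
  console.log(JSON.stringify(input), namePattern.test(input));
}
```

The first three inputs are rejected although they are legitimate, while the last one passes, so the pattern turns away real users without keeping markup out.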

If you don’t have any database interaction and the data is sent to someone via e-mail or the like (as opposed to being displayed publicly on the web), then there’s not much of a security concern to protect yourself against. I’d recommend simply doing no check at all (except maybe whether the fields are empty).

Disclaimer: Without knowing the rest of the code, any statement about possible security implications may be wrong.

Second answer
  1. For this type of functionality, should I be bothered with trying to validate a person's name or message? The only thing I am validating or rather protecting against is any malicious input.

Protecting against malicious input is different from validating that a person's name is indeed a person's name. Because people's names can be so varied (containing all sorts of characters), it may be better just to ensure that the field has at least one non-whitespace character in it and be done with it. Some people have only one name, and some people have a surname that is only one letter.
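A minimal sketch of that "at least one non-whitespace character" check in JavaScript might look like the following. The function name `hasContent` is hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: returns true when the value is a string that
// contains at least one non-whitespace character.
function hasContent(value) {
  return typeof value === "string" && /\S/.test(value);
}

console.log(hasContent("Dieter Voß")); // a name with a non-ASCII letter
console.log(hasContent("O"));          // a one-letter name
console.log(hasContent("  \t\n"));     // whitespace only
```

This accepts one-letter names and names with non-ASCII characters, and rejects only empty or whitespace-only input.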

  2. I'm unsure what type of attacks I could leave my site open to, if the only thing I do is check that the fields aren't empty.

Protecting against malicious input is a different problem. It involves, for example, escaping the entry (or passing it as a bound parameter) before it is entered into a database, but this is (probably) a job for your database abstraction or framework.

You will also have to ensure that whenever you output such data as part of an HTML page, you use htmlspecialchars() on it. This protects against cross-site scripting (inclusion of HTML tags in what should be plain text). This is (probably) a job for your template system or view layer.
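If some of the output is built on the JavaScript side instead, a rough counterpart to htmlspecialchars() could be sketched as below. `escapeHtml` and `HTML_ESCAPES` are hypothetical names, not a built-in API, and the exact set of characters escaped here is my assumption about what is needed for text placed in element content or quoted attributes.

```javascript
// Hypothetical helper, not a built-in: replaces the characters that have
// special meaning in HTML with their entity equivalents.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<script>alert("x")</script>'));
// prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

Escaping at output time, as here, leaves the stored or e-mailed text untouched, so a name like Dieter Voß or O'Brien is never rejected at input.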