I've read a lot of debate on the pros and cons of sanitising user input and there doesn't appear to be a definitive answer either way.

My scenario is that I am collecting email addresses via an HTML/jQuery form so that they can be used in a mailing list. There will be no retrieval from the database at this stage, and therefore no use of JSON, XML, etc.

Do I need to be worried about sanitising user input or not? A good number of people seem to be saying that sanitisation on the way in isn't needed whilst others say you should never underestimate the need to sanitise whenever you can.

Does anybody have any thoughts that would make this clearer?

  • There are many controversial issues in software development, but I didn't know that sanitizing user input would be one. Commented Jan 6, 2012 at 21:03
  • "You should never underestimate the need to sanitize whenever you can." Commented Jan 6, 2012 at 21:03
  • It's better to be safe than sorry, right? Is there a reason you don't want to sanitize the input? Commented Jan 6, 2012 at 21:04
  • @wescrow No, no reason at all - I just wanted to build some kind of consensus so I get it entirely right. Commented Jan 6, 2012 at 21:13
  • See also: Method for sanitizing user input -and- What are the best PHP input sanitizing functions? Commented Jan 6, 2012 at 21:20

6 Answers

1

Two things are important at this point:

  • Ensuring the user can't compromise your data: prevent SQL injection. See the SQL injection documentation here:

    http://php.net/manual/en/security.database.sql-injection.php

  • Validating the email address to check that the user entered a well-formed address:

    http://www.linuxjournal.com/article/9585
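As a rough illustration of the first point, here is a minimal sketch of a parameterised insert. It uses Python's built-in sqlite3 module with an in-memory database purely so the example is self-contained; the `subscribers` table and the `save_email` helper are made-up names, not anything from the question. In PHP the same idea is expressed with prepared statements.

```python
import sqlite3


def save_email(conn, email):
    """Store an e-mail using a bound parameter instead of string concatenation."""
    # The "?" placeholder makes the driver treat `email` strictly as data,
    # so quotes or SQL keywords inside it cannot change the statement.
    conn.execute("INSERT INTO subscribers (email) VALUES (?)", (email,))
    conn.commit()


# Hypothetical schema, assumed only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (email TEXT NOT NULL)")

# A hostile-looking value is stored as plain text; the table survives.
save_email(conn, "x'); DROP TABLE subscribers; --")
```

The point is that the value never becomes part of the SQL text, so no escaping or stripping of the input is needed to keep the query safe.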

1 Comment

Is validating an email really the second most important point related to data sanitisation?
1

Always do it. It will only take a few more minutes of your time. There really isn't a downside to it. Why risk it?

1 Comment

In some cases, sanitization can present issues. You're changing user data. It really depends on the data and the application. Sanitize where sanitization is relatively low-risk, and improves your life as a person who has to work with the data. For all else, validate.
1

Sanitizing all input, regardless of whether it will be used for output, is always a good idea, for the simple reason that it is input and will therefore be acted upon in some way by code, a compiler, or the system. You may not need (depending on your use cases) to validate all the input (for example, checking that an email address is in the format of an email address versus checking that it is a valid, working address), but you should at least apply a minimal set of sanitization functions to prevent XSS and SQL injection.
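For the XSS half of that, a minimal sketch, assuming the stored address is one day echoed back into an HTML page. `render_subscriber` is a hypothetical helper; `html.escape` is from Python's standard library.

```python
import html


def render_subscriber(email):
    """Return a list item with the e-mail escaped for an HTML context."""
    # html.escape converts &, <, > and (by default) both quote characters
    # into entities, so the value is displayed rather than interpreted.
    return "<li>" + html.escape(email) + "</li>"


print(render_subscriber("<script>alert(1)</script>@example.com"))
```

Escaping at the point of output like this leaves the stored data unchanged, which sidesteps the concern about sanitization altering what the user actually typed.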

0

All user input should be sanitized. You can't trust a user to only submit valid input. That's not the way it works. There's always someone that'll try to test your code for weaknesses.

This goes for e-mail addresses as well. You should verify that it's a valid e-mail address before submitting it to the database.
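A minimal sketch of such a check, assuming a deliberately loose pattern is acceptable: something before the @, something after it, and at least one dot in the domain. The pattern and the `looks_like_email` name are illustrative choices, not a complete implementation of the e-mail grammar, and only a confirmation message can prove that an address really works.

```python
import re

# Loose shape check: no whitespace, exactly one "@", a dot in the domain part.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(value):
    """Return True if `value` has the rough shape of an e-mail address."""
    # fullmatch (rather than match with "$") also rejects a trailing newline.
    return len(value) <= 254 and EMAIL_RE.fullmatch(value) is not None


for candidate in ["user@example.com", "no-at-sign", "a b@example.com"]:
    print(candidate, looks_like_email(candidate))
```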

2 Comments

@Fleep: Validation is the pre-condition to Sanitization
@hakre: That's not true. There are cases where you'd want to sanitize prior to validation - many cases in fact. Consider the case above. Let's say we were getting an e-mail from input. It makes sense to sanitize it (remove trailing/leading whitespace, make it lowercase) PRIOR to validating it (e.g., spaces/newlines aren't legal in e-mails, but the user may have just accidentally hit a space before/after). Sanitizing data doesn't JUST mean protecting against SQL-injection/XSS/CSRF. It also means cleaning up the data so it's usable.
0

Sanitization needs really vary based on use case and data type. For example, you're asking a user for an e-mail address. You will probably need to see if that e-mail already exists in your mailing list. If you don't have to, no problem. If you need to avoid duplicates and your mailing list system doesn't support its own clean de-duping, it's generally safe and recommended to:

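  1. trim() the e-mail input. Leading/trailing whitespace isn't a meaningful part of an e-mail address, so removing it presents no risk, and it makes matching against e-mails more reliable/consistent.
  2. strtolower() the e-mail input. In practice e-mail addresses are treated as case-insensitive (strictly, the SMTP specification allows the part before the @ to be case-sensitive, but mail providers very rarely distinguish on it), so there's no real risk, and it makes matching more reliable/consistent.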
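The two steps above can be sketched as follows. This is Python rather than PHP, so `strip()` and `lower()` stand in for trim() and strtolower(); `normalize_email` and `add_unique` are hypothetical names, and the in-memory set is only a stand-in for whatever the mailing list actually stores.

```python
def normalize_email(raw):
    """Strip surrounding whitespace and lowercase the address."""
    return raw.strip().lower()


def add_unique(emails, raw):
    """Add the normalised address to `emails`; return False if already present."""
    email = normalize_email(raw)
    if email in emails:
        return False
    emails.add(email)
    return True


subscribers = set()
add_unique(subscribers, "  Alice@Example.COM ")
# The second spelling normalises to the same key, so it is reported as a duplicate.
print(add_unique(subscribers, "alice@example.com"))
```

Normalising before the duplicate check (and before any validation) means the comparison always runs against one canonical form.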

0

Input validation refers to the process of validating all the input to an application before using it. Input validation is absolutely critical to application security, and most application risks involve tainted input at some level. Many applications do not plan input validation, and leave it up to the individual developers. This is a recipe for disaster, as different developers will certainly all choose a different approach, and many will simply leave it out in the pursuit of more interesting development.

Read more: Data Validation
