Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Page 1 of 2 12 LastLast
Results 1 to 15 of 18
  1. #1
    Regular Coder LearningCoder's Avatar
    Join Date
    Jan 2011
    Location
    The Pleiades
    Posts
    925
    Thanks
    76
    Thanked 29 Times in 29 Posts

    Sanitizing or validating input first.

    Hi, when receiving user input from a form, should I be sanitizing or validating my user input first?

    I would guess you should sanitize the input first to make sure there are no threats from what the entered (bad code etc) and then validate it to make sure the data is what you require (making sure they entered a valid email for instance, or numbers for a telephone).

    I would like to know different peoples points of views!

    Please let me know your thoughts.

    Kind regards,

    LC.

  • #2
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,994
    Thanks
    4
    Thanked 2,662 Times in 2,631 Posts
    No, you should validate first. Sanitizing is preformed to handle the datastorage level which is the last step. There's no point in approaching a datastorage level if the business rules don't pass the validation phase. If you require a number and you're given a string, that's an error so you send them back to the form.
    Sanitizing with the exception of stripslashes if required (not necessary as of 5.4) is not necessary if you use SQL with prepared statements and would in fact corrupt the input instead.

  • #3
    Master Coder felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, Australia
    Posts
    6,642
    Thanks
    0
    Thanked 649 Times in 639 Posts
    Sanitizing is not something you would need to do where you know that the data has just been validated. It is something you might do on data read from a database if you want to be certain that any tampering with the data in the database will not be able to do any harm when it is subsequently processed.
    Stephen
    Learn Modern JavaScript - http://javascriptexample.net/
    Helping others to solve their computer problem at http://www.felgall.com/

    Don't forget to start your JavaScript code with "use strict"; which makes it easier to find errors in your code.

  • #4
    Regular Coder LearningCoder's Avatar
    Join Date
    Jan 2011
    Location
    The Pleiades
    Posts
    925
    Thanks
    76
    Thanked 29 Times in 29 Posts
    Thank you both for your replies.

    What I want to do is make the data secure i.e escape any harmful characters.

    The characters I'm looking to deal with are < > " ' \ /

    I'm not sure if it is enough to just deal with them or are there others? I've read so much on the functions to use before inserting data and people have posted so many different methods so it isn't very clear to me what to use.

    At the moment, I check every field that has a value to see if they contain any of those characters and execute a str_replace() on them to convert them to their html entity equivalent like so:

    PHP Code:
    for($n=0;$n<$array_length;$n++){

         if(empty(
    $post_data[$n])){
             
    $post_data[$n] = NULL;
         }
         else{
             
    $post_data[$n] = str_replace(array("<",">","'","\"","\\"),array("&lt;","&gt;","&apos;","&quot;","\"),$post_data[$n]);
             
             switch(
    $n){
                 case 
    0//name
                     
    if (!ctype_alpha($post_data)) {
                         
    $errors[] = "You can only enter alphabetical characters in the name field.";
                     }
                 break;

                 case 
    1//email
                     
    if(!filter_var($post_data[$n], FILTER_VALIDATE_EMAIL)){
                         
    $errors[] = "You entered an invalid email";
                     }
                 break;
                 
                 case 
    2//phone number
                     
    if(!ctype_digit($post_data[$n])){
                         
    $errors[] = "You entered an incorrect phone number.";
                     }
                 break;
                 
                 case 
    3//additional comments
                     
    if(!strlen($post_data) <= 400){
                         
    $errors[] = "";
                     }
                 break;
             }
         }
         

    Can someone please clarify what EXACTLY how to make data safe.

    I know this question has come up so many times but honestly there are so many examples. I want some advice from people whose judgement I can trust.

    Kind regards,

    LC.
    Last edited by LearningCoder; 11-12-2012 at 08:45 PM.

  • #5
    Master Coder felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, Australia
    Posts
    6,642
    Thanks
    0
    Thanked 649 Times in 639 Posts
    Those characters are not generically harmful. They are only harmful in specific situations. The way to resolve that is to identify the specific situation where they would be harmful and either change the way your code works so that situation doesn't occur or to escape the characters if tchanging the code isn't possible.

    For interacting with a mysql database you should be using mysqli or PDO and then you can use separate prepare and bind statements where the SQL and the data are kept completely separate and so there is no way the data can be confused with the SQL. If you use the antiquated mysql interface where you can't perform this separation then you'd use mysql_real_escape_strong() to escape the data and so reduce the chances of certain characters being harmful.

    For outputting to HTML there isn't any way to separate the data from the tags and so you would use htmlspecialchars() to convert the potetially harmful characters into their entity code equivalents and so avoid having them mistake n for HTML.
    Stephen
    Learn Modern JavaScript - http://javascriptexample.net/
    Helping others to solve their computer problem at http://www.felgall.com/

    Don't forget to start your JavaScript code with "use strict"; which makes it easier to find errors in your code.

  • Users who have thanked felgall for this post:

    LearningCoder (11-12-2012)

  • #6
    Regular Coder LearningCoder's Avatar
    Join Date
    Jan 2011
    Location
    The Pleiades
    Posts
    925
    Thanks
    76
    Thanked 29 Times in 29 Posts
    My intention was to use a mysqli prepared statement once my code reached a certain point (i.e I've checked the data is exactly what it should be).

    The thing I am worried about mostly is someone entering some harmful code into the form which could damage the database. I have 2 textarea fields in the form which allow up to 400 characters in each.

    From what you said, I gather that you would only secure your data for certain purposes, so upon inserting to the database, use mysqli_real_escape_string()? I thought we needed to use functions like html htmlentities($var, ENT_QUOTES) also before using any data in a query to make sure we convert any single or double quotes as they could corrupt the query?

    Thank you for your reply.

    Kind regards,

    LC.

  • #7
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,994
    Thanks
    4
    Thanked 2,662 Times in 2,631 Posts
    If you are using prepared statements, sanitizing it is not necessary and will corrupt the input if you use it. So don't bother with an escape on it if you bind. Issue a stripslashes if magic_quotes_gpc is enabled first.

    As mentioned, tagging is not a security issue, it is a data issue. HTML tags will not harm your storage integrity, it will damage the output of the data. Use htmlspecialchars or htmlentities to translate them, use strip_tags to remove them, or whatever else you want to do. This is a part of the business rules for your input, which is a verification process. If you intend to keep them, I'd keep them in their raw state using the actual characters in question. Display them using htmlspecialchars or htmlentities instead.

  • #8
    Regular Coder LearningCoder's Avatar
    Join Date
    Jan 2011
    Location
    The Pleiades
    Posts
    925
    Thanks
    76
    Thanked 29 Times in 29 Posts
    Quote Originally Posted by Fou-Lu View Post
    If you are using prepared statements, sanitizing it is not necessary and will corrupt the input if you use it. So don't bother with an escape on it if you bind. Issue a stripslashes if magic_quotes_gpc is enabled first.
    When you say don't bother with an escape do you mean I don't need send it to use sayaddslashes()?

    Quote Originally Posted by Fou-Lu
    As mentioned, tagging is not a security issue, it is a data issue. HTML tags will not harm your storage integrity, it will damage the output of the data. Use htmlspecialchars or htmlentities to translate them, use strip_tags to remove them, or whatever else you want to do. This is a part of the business rules for your input, which is a verification process. If you intend to keep them, I'd keep them in their raw state using the actual characters in question. Display them using htmlspecialchars or htmlentities instead.
    So it is ok to send characters such as < > into the database, but I need to convert them when reading the data out?

    What about forward and backward slashes? Are they harmful with post data?

    Also, with double and single quotes do I need to add slashes to the input to make sure it doesn't affect the statement or does it work differently with prepared statements??

    Sorry to keep asking the same questions but this is a real issue of mine when working with user input. Once I understand I can just follow the same rule of thumb every time I have to deal with input.

    Kind regards,

    LC.
    Last edited by LearningCoder; 11-13-2012 at 02:30 AM.

  • #9
    Master Coder felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, Australia
    Posts
    6,642
    Thanks
    0
    Thanked 649 Times in 639 Posts
    [QUOTE=LearningCoder;1291415]So it is ok to send characters such as < > into the database, but I need to convert them when reading the data out?[/code]

    No you only convert them when writing to HTML. Everywhere else leave them along.

    Quote Originally Posted by LearningCoder View Post
    What about forward and backward slashes? Are they harmful with post data?
    No.

    Quote Originally Posted by LearningCoder View Post
    Also, with double and single quotes do I need to add slashes to the input to make sure it doesn't affect the statement or does it work differently with prepared statements??
    It works differently in BIND statements - the data never gets anywhere near the PREPARE statement.


    Validate user input and if you want to be really sure then sanitize input from other sources such as the database - just in case it has been tampered with.
    Escape output only where necessary to keep the data from being confused with the surrounding code.
    Stephen
    Learn Modern JavaScript - http://javascriptexample.net/
    Helping others to solve their computer problem at http://www.felgall.com/

    Don't forget to start your JavaScript code with "use strict"; which makes it easier to find errors in your code.

  • #10
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,994
    Thanks
    4
    Thanked 2,662 Times in 2,631 Posts
    That's right. When you separate a prepared statement from the data and bind, it now becomes incapable of corrupting the SQL statement itself. So if input data exists of 1; DROP TABLE atable;--, it cannot break a prepared statement. The data goes in exactly as it shows. Addslashes itself should never be used for SQL data since it only accommodates the quotations and backslashes, not null entries, so this is why we used a real_escape_string from the sql libraries. Until magic_quotes_gpc is actually gone from common use (gone as of 5.4), we still need to check for it and issue a stripslashes if its enabled to get the original data provided back.
    So SQL wise, it is absolutely fine to run <script> tags in it if you want. Its just data, it doesn't have the ability to parse it. Just like I can also put PHP code, or Java code, or whatever else I want into the db itself. Typically I suggest to leave it as is when inserting into a database, and convert it as necessary for display. Its easy to get hitched into wanting to insert it with the html entities encoded, but that is only because PHP makes conversion so easy. Languages like Java for example require either manual conversion or extended libraries in order to convert them. The difference with Java is I can explicitly tell an item if its capable of rendering HTML or if it just treats it as data, unlike PHP.

  • #11
    Regular Coder LearningCoder's Avatar
    Join Date
    Jan 2011
    Location
    The Pleiades
    Posts
    925
    Thanks
    76
    Thanked 29 Times in 29 Posts
    Quote Originally Posted by Felgall
    It works differently in BIND statements - the data never gets anywhere near the PREPARE statement.


    Validate user input and if you want to be really sure then sanitize input from other sources such as the database - just in case it has been tampered with.
    Escape output only where necessary to keep the data from being confused with the surrounding code.
    So let's say the user submits details in the form, it sends to database. All good.

    Maybe I want to setup an email to send to users where I bind the data to result variables. (for instance, being able to address the user by their name rather than having to hardcode each).

    Say the entered "O'connor". If I do not escape that data, what would be the need in using any htmlspecialchars() or htmlentities() as I would want it to be displayed the same way it was inserted wouldn't I?

    Kind regards,

    LC.
    Last edited by LearningCoder; 11-13-2012 at 03:00 AM.

  • #12
    Regular Coder LearningCoder's Avatar
    Join Date
    Jan 2011
    Location
    The Pleiades
    Posts
    925
    Thanks
    76
    Thanked 29 Times in 29 Posts
    Quote Originally Posted by Fou-Lu View Post
    That's right. When you separate a prepared statement from the data and bind, it now becomes incapable of corrupting the SQL statement itself. So if input data exists of 1; DROP TABLE atable;--, it cannot break a prepared statement. The data goes in exactly as it shows. Addslashes itself should never be used for SQL data since it only accommodates the quotations and backslashes, not null entries, so this is why we used a real_escape_string from the sql libraries. Until magic_quotes_gpc is actually gone from common use (gone as of 5.4), we still need to check for it and issue a stripslashes if its enabled to get the original data provided back.
    So SQL wise, it is absolutely fine to run <script> tags in it if you want. Its just data, it doesn't have the ability to parse it. Just like I can also put PHP code, or Java code, or whatever else I want into the db itself. Typically I suggest to leave it as is when inserting into a database, and convert it as necessary for display. Its easy to get hitched into wanting to insert it with the html entities encoded, but that is only because PHP makes conversion so easy. Languages like Java for example require either manual conversion or extended libraries in order to convert them. The difference with Java is I can explicitly tell an item if its capable of rendering HTML or if it just treats it as data, unlike PHP.
    So with using a prepared statement, I do not have to do anything to the input in terms of making sure it doesn't contain any data because it isn't able to parse the code and 'execute' it?

    Do we still have to use mysqli_real_escape_string() with prepared statements?

  • #13
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,994
    Thanks
    4
    Thanked 2,662 Times in 2,631 Posts
    No, using escaping will corrupt the data. O'Conner would be inserted into the database as O\'Conner, not as O'Conner. You do not want to execute an escape of any kind on a bind, the data goes in exactly as presented. Its also why you have to check for magic quotes and issue a stripslash if its enabled.

  • #14
    Regular Coder LearningCoder's Avatar
    Join Date
    Jan 2011
    Location
    The Pleiades
    Posts
    925
    Thanks
    76
    Thanked 29 Times in 29 Posts
    Ah ok, so I literally just make sure the data they entered is what I want i.e digits for a phone number, a valid email etc. Once my code gets far enough and I'm ready to insert, I simply just call the prepared statement script and insert the data using no character encoding or escaping functions?

    Kind regards,

    LC.

  • #15
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,994
    Thanks
    4
    Thanked 2,662 Times in 2,631 Posts
    That's right, you just need to validate and verify the data, then bind it. If your rules specify it must be a phone number, use a pattern match and verify or convert the format to your liking.

  • Users who have thanked Fou-Lu for this post:

    LearningCoder (11-13-2012)


  •  
    Page 1 of 2 12 LastLast

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •