Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 7 of 7
  1. #1
    New Coder
    Join Date
    Mar 2010
    Posts
    40
    Thanks
    7
    Thanked 0 Times in 0 Posts

    Need help with mysql 'IN' statement...

    Hi folks, I'm no expert with mysql so I'm hoping one of you geniuses will be able to spot the flaw in the statement that I'm about to show:

    SELECT *
    FROM profiles
    WHERE coursearea IN (3)


    +------+------+------------+
    | id | name | coursearea |
    +------+------+------------+
    | 1 | John | 3,16 |
    | 2 | Ram | 3,16,21 |
    | 3 | Jack | 3 |
    | 4 | Jill | 3,12,24 |
    +------+------+------------+--------------------+

    This gives me 4 results

    (Yes I know you will tell me off because one of the cells contains multiple values but bear with me)

    If I then change the query to this:

    SELECT *
    FROM profiles
    WHERE coursearea IN (16)


    I'm expecting 2 results but not getting any.

    What query should I be using to get the correct results?

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    26,695
    Thanks
    80
    Thanked 4,511 Times in 4,475 Posts
    Well, you goofed on your DB design and now you pay the price for it.

    First of all IN() isn't at all usable for this. Even if it were, you would want to be using something like
    Code:
       WHERE 16 IN ( coursearea )
    But that's not how IN works. In can *ONLY* find things in a list of values. And you do not *HAVE* a "list".

    It may look like a list to human eyes, but to SQL it is only *one* field. It's just a field that HAPPENS to have commas in it. SQL makes no distinction whatsoever about characters *with* field values. You could as well have had 3$16$21 or 3!16!21 and it would be all the same to the database.

    The *PROPER* design for this kind of thing is to use a separate table.

    That is, you would have:
    Code:
    TABLE: profiles
    +------+------+
    | id | name | 
    +------+------+
    | 1 | John |
    | 2 | Ram | 
    +------+------+
    
    TABLE: profileCourseareas
    +------+------+
    | id | area | 
    +------+------+
    | 1 |    3 |
    | 1 |   16 |
    | 2 |    3 | 
    | 2 |   16 | 
    | 2 |   21 | 
    +------+------+
    Your profileCoursearea table would be defined something like this:
    Code:
    CREATE TABLE profileCoursearea (
        id int REFERENCES profiles(id),
        area int REFERENCES coursearea(id) /* or whatever the field name is in the coursearea table */
    );
    Is it too late to convince you to fix your DB design?
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #3
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    26,695
    Thanks
    80
    Thanked 4,511 Times in 4,475 Posts
    If you are unwilling to correct the DB design, then the best you can do is something ugly like this:
    Code:
    SELECT * FROM profiles 
    WHERE CONCAT(',', courseareas,',') LIKE '%,16,%'
    If doing this from PHP, for example, you'd code something like
    Code:
    $sql = "SELECT * FROM profiles WHERE CONCAT(',', courseareas,',') LIKE '%," . $areanum . "16,%'";
    The reason you need to add the commas to the front and back of both courseareas and the value are looking for is to avoid "false positives".

    Suppose, for example, the areanumber you are looking for is 1.

    If you didn't do as I show, you'd code something such as
    Code:
    SELECT * FROM profiles WHERE courseareas LIKE '%1%'
    But then the '%1%' would match 16 and 21 and 17 and...

    With the commas, ,16, matches only ,16,

    See why you should change your DB design?

    And this is just the tip of the iceberg.

    Using YOUR db design, how would you answer a question such as this:
    How many students are there that have two or more courseareas in common?
    or even
    How many students have exactly two courseareas in common and have at least one of them in the range of 21 through 30?
    You can't do it with your design. Not in any even close to efficient way.

    Whereas with a NORMALIZED design both queries are trivial.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    26,695
    Thanks
    80
    Thanked 4,511 Times in 4,475 Posts
    If you care, the reason that even *THIS* query works
    Code:
    SELECT *
    FROM profiles
    WHERE coursearea IN (3)
    Is because MySQL is so "sloppy" (some would say "forgiving").

    What happens: Because your IN() expression was a number (that is, the 3), MySQL attempts to convert all your coursearea field values to a number.

    In the process, it does its best to convert strings such as '3,16,21' to a number and succeeds in converting '3' before it encounters the comma, which causes a conversion error. Some other DBs might get this far, but then they would stop on that conversion error and report a type mismatch. MySQL ignores the error and accepts what has been converted so far and so indeed finds that, *AS A NUMBER*, the value '3,16,21' is just 3.

    As I said sloppy. I think MySQL is way way too forgiving and so it leads many people (yourself included) into believing that their code is okay when, in fact, you find out later (sometimes years later) that it is not.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #5
    New Coder
    Join Date
    Mar 2010
    Posts
    40
    Thanks
    7
    Thanked 0 Times in 0 Posts

    Accept with a sinking heart!

    Hi,

    Yes, I had a sneaking suspicion that this would be the case. The database has been passed on to me to sort and this was my fear. However, it's not too late to change the design and add in a couple more tables (one for the courseareas and one for the relationship).

    Thanks for confirming.

  • #6
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    26,695
    Thanks
    80
    Thanked 4,511 Times in 4,475 Posts
    Whether or not to rebuild the DB is up to you. If it's not used much, and you know you will never have to answer questions such as those I showed you, you can probably get away with leaving it alone.

    But if it's used a lot, in the long run you'll find that rebuilding it will pay huge dividends.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    jockturner (12-03-2011)

  • #7
    New Coder
    Join Date
    Mar 2010
    Posts
    40
    Thanks
    7
    Thanked 0 Times in 0 Posts
    Thanks, Have rebuilt the database now and learnt something new. A good day.

    Thanks for your help.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •