Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Page 1 of 2 12 LastLast
Results 1 to 15 of 24
  1. #1
    Regular Coder
    Join Date
    Apr 2007
    Posts
    317
    Thanks
    24
    Thanked 3 Times in 3 Posts

    Querying the most frequent numbers in a database

    Hi Guys,

    I am trying to make a quick little lotto number tool on my website and I'm not sure how to write a couple of the queries, hopefully one of you can help.

    I have a database with lotto numbers similar to this.

    ID B1 B2 B3 B4 B5 S1 S2
    1 20 19 22 34 45 45 26
    2 7 19 34 45 53 18 10


    Now the columns prefixed with B are the white lotto balls, and columns prefixed with S are the supplemental balls or powerball if you are in the u.s.

    Query 1:
    I would like to query the white balls to see which numbers show up most frequent.

    Query 2:
    Using the white balls, i would like to query to see which numbered pairs come up the most, in this example you will see 19 and 34 being a pair.

    If someone could shed some light on this would be great. I appreciate any and all help. Thanks in advance.

  • #2
    New Coder
    Join Date
    Oct 2007
    Posts
    84
    Thanks
    0
    Thanked 8 Times in 8 Posts
    I would probably set the tables up a little differently.

    Code:
    create table lottery_drawings (
    lottery_id int not null auto_increment,
    draw_date date,
    location text,
    primary key (lottery_id)) engine = innodb;
    
    create table lottery_results (
     ball_id int not null auto_increment,
     lottery_id int not null,
     ball_number int not null,
     ball_type varchar(1) not null,
     primary key(ball_id)) engine = innodb;
    Then you would be able to run a query like:
    Code:
    select
      ball_number,
      count(ball_number),
      ball_type
    from
      lottery_results
    group by
      ball_number, ball_type;
    This would give you the count of the number as either a white ball or a 'special' ball.

    You could also modify the query to find the percentage of times that ball has been drawn against the total number of draws, as well as how hot that number is in a given month or year.

  • #3
    Regular Coder
    Join Date
    Apr 2007
    Posts
    317
    Thanks
    24
    Thanked 3 Times in 3 Posts
    Thanks for the help. How could I also use this to query the most paired numbers as well?

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,118
    Thanks
    80
    Thanked 4,554 Times in 4,518 Posts
    Quote Originally Posted by macleodjb View Post
    How could I also use this to query the most paired numbers as well?
    You can't. But assuming you modified your DB as he suggested, then you could write a different query to find the pairs.

    But it will be a messy and complex query.

    Let's see...

    First of all, I disagree a little bit with Todd's DB design. Let me correct him:
    Code:
    create table lottery_results (
     lottery_id int not null,
     ball_number int not null,
     ball_type varchar(1) not null,
     primary key(lottery_id, ball_number) ) engine = innodb;
    There is no reason at all to have that auto_increment ball_id in that table. The ball_id in no way helps identify a particular record in a useful way.

    So get rid of it and change the primary key to a *composite* key of (lottery_id, ball_number) and you will have a useful key that (a) uniquely identifies each ball and (b) makes looking them up by lottery and ball_number as fast as you want and need.

    So then...

    Just to FIND all the the pairs in the table:
    Code:
    CREATE VIEW v_ball_pairs AS
        SELECT L1.lottery_id, L1.ball_number AS ball1, L2.ball_number AS ball2
        FROM lottery_results AS L1, lottery_results AS L2
        WHERE L1.lottery_id = L2.lottery_id
        AND L1.ball_number < L2.ball_number
        AND L1.ball_type = 'W'
        AND L2.ball_type = 'W';
    And then to find pairs that match from different lotteries:
    Code:
    SELECT P1.lottery_id AS lottery1, P2.lottery_id AS lottery2, P1.ball1, P2.ball2
    FROM v_ball_pairs AS P1, v_ball_pairs AS P2
    WHERE P1.lottery_id < P2.lottery_id
    AND P1.ball1 = P2.ball1
    AND P1.ball2 = P2.ball2
    Wasn't that fun? There will be a quiz on Friday.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #5
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,118
    Thanks
    80
    Thanked 4,554 Times in 4,518 Posts
    Oh, I forgot! You wanted to know the *number* of cross-lottery pairs and find which pairs occurred most often.

    Okay.
    Code:
    SELECT P1.lottery_id AS lottery1, P2.lottery_id AS lottery2, 
           P1.ball1, P2.ball2, COUNT(*) AS pairCount
    FROM v_ball_pairs AS P1, v_ball_pairs AS P2
    WHERE P1.lottery_id < P2.lottery_id
    AND P1.ball1 = P2.ball1
    AND P1.ball2 = P2.ball2
    GROUP BY P1.lottery_id, P2.lottery_id, P1.ball1, P2.ball1
    ORDER BY pairCount DESC LIMIT 20;
    The LIMIT there is arbitrary. LIMIT 1 or LIMIT 5 or however many top results you want.

    ************

    EDIT: The use of P2.ball2 in the SELECT and GROUP BY was actually a typo. I meant to use P1.ball2 both places. But since P1.ball2 = P2.ball2, it doesn't matter. Will work the same either way.

    RE-EDIT: This query is *WRONG*!!! See the revised query in post #9.
    Last edited by Old Pedant; 12-15-2010 at 08:24 PM.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #6
    Regular Coder
    Join Date
    Apr 2007
    Posts
    317
    Thanks
    24
    Thanked 3 Times in 3 Posts
    wow, I appreciate the help here. i am trying to follow what exactly you are doing in this statement but i am completely lost. Can you break it down in laymens terms what you are doing. Thanks.

  • #7
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,118
    Thanks
    80
    Thanked 4,554 Times in 4,518 Posts
    Okay...when you create a VIEW, it's just kind of like a macro. You can then later use the view multiple times in another query without repeating all the code in the VIEW.

    SO:
    Code:
    create a view and give it a name:
    CREATE VIEW v_ball_pairs AS
        we want to get all possible pairs of ball numbers for *each* possible lottery
        so of course we need a lottery_id and two ball numbers:
        SELECT L1.lottery_id, L1.ball_number AS ball1, L2.ball_number AS ball2
        we have to join the lottery_results table TO ITSELF, because each record
        in that table only holds one ball_number; by joining to itself, we can get
        two ball_number values
        FROM lottery_results AS L1, lottery_results AS L2
        we join the table to itself, but we insist that the two copies of the table
        use the same lottery_id, because that's what we want: to balls from same lottery
        WHERE L1.lottery_id = L2.lottery_id
        and here is where we specify that the two ball numbers can *NOT* match
        by using < instead of !=, we ensure that we get each pair only once, with
        the first ball being the lower numbered one
        AND L1.ball_number < L2.ball_number
        and then we just make sure that both balls are white balls
        AND L1.ball_type = 'W'
        AND L2.ball_type = 'W';
    That's part one. Have any trouble following that?
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #8
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,118
    Thanks
    80
    Thanked 4,554 Times in 4,518 Posts
    Code:
    so this is the same principle as the view, but since this
    gets the final result we don't make it a view.
    Again we want two lottery numbers (ids) and two ball numbers,
    where the two ball numbers are both found, as a pair, in each lottery
    SELECT P1.lottery_id AS lottery1, P2.lottery_id AS lottery2, P1.ball1, P2.ball2
    but this time, instead of joining two tables, we join two identical views!
    FROM v_ball_pairs AS P1, v_ball_pairs AS P2
    The rest is the same idea:  We want a pair of lotteries, and to make sure
    we don't get each pair twice we get the lower numbered lottery first
    WHERE P1.lottery_id < P2.lottery_id
    and then we insist that the pair of balls from each one be the same
    AND P1.ball1 = P2.ball1
    AND P1.ball2 = P2.ball2
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #9
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,118
    Thanks
    80
    Thanked 4,554 Times in 4,518 Posts
    Finally, my TOTALS query was *WRONG*!!! WAY WAY too complex. I was after a correlation index, which is far beyond what you needed.

    Much simpler to get what you wanted!

    Here it is revised:
    Code:
    SELECT ball1, ball2, COUNT(*) AS pairCount
    FROM v_ball_pairs
    GROUP BY ball1, ball1
    ORDER BY pairCount DESC LIMIT 20;
    That will show you all possible pairs of balls *AND* how many times that pair appears in any lotteries. It orders the results by the count descending, meaning it shows those pairs that occur more often first. And it limits the results to (arbitrary...you can change) 20.

    Explaining:
    Code:
    we just need to know what the ball numbers are and how many times
    that particular pair of numbers appeared
    SELECT ball1, ball2, COUNT(*) AS pairCount
    we again use the view, for convenience.  Recall, the view shows us
    all possible pairs in all possible lotteries.
    FROM v_ball_pairs
    to get the count for each pair, we just need to GROUP BY that particular pair
    GROUP BY ball1, ball2
    and then we want those pairs that occur most often to show up first
    we limit the results to the top 20.
    ORDER BY pairCount DESC LIMIT 20;
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #10
    Regular Coder
    Join Date
    Apr 2007
    Posts
    317
    Thanks
    24
    Thanked 3 Times in 3 Posts
    Ok thanks for that explanation. I can somewhat understand that. However, I am thinking that there is a flaw unless you tell me otherwise. If i use the database structure that you set up, it does not contain the date the number was drawn. So, how will your query distinguish between The number 7 from August 10th versus the number 7 from December 10th. The pairing needs to come from the same drawing date. If i am wrong let me know, then i must not have understood thoroughly. Thanks a lot so far for the help.

  • #11
    New Coder
    Join Date
    Oct 2007
    Posts
    84
    Thanks
    0
    Thanked 8 Times in 8 Posts
    You have your parent table lottery_drawings that is referenced by your results table's lottery_id column. That parent has the date drawn as well as a field to add a 'location', although you could go a step further and have another table lottery location and have that referenced by a foreign key.

  • Users who have thanked toddandrae for this post:

    macleodjb (12-16-2010)

  • #12
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,118
    Thanks
    80
    Thanked 4,554 Times in 4,518 Posts
    Translating Todd's comment:

    If you use his main table, as given:
    Code:
    create table lottery_drawings (
    lottery_id int not null auto_increment,
    draw_date date,
    location text,
    primary key (lottery_id)) engine = innodb;
    Then you will have records in there such as
    Code:
    lottery_id   draw_date    location 
             1   2010-9-10    NY
             2   2010-9-10    NJ
             3   2010-9-11    NY
    etc.
    And then in the other table (modified to show the relationship, as you really should do it):
    Code:
    create table lottery_results (
     lottery_id int not null references lottery_drawings(lottery_id),
     ball_number int not null,
     ball_type varchar(1) not null,
     primary key(lottery_id, ball_number) ) engine = innodb;
    you will have records such as:
    Code:
    lottery_id  ball_number  ball_type
              1           23    W
              1           31    W 
              1           67    W
     ...
             17           2     W
             17          31     W
             17          23     W
    ...
    So the lottery_id identifies WHICH drawing the balls come from.

    That VIEW that I showed you would return results such as
    Code:
    lottery_id  ball1  ball2
             1     23     31
             1     23     67
             1     31     67
           ...
            17      2     23
            17      2     31
            17     23     31
           ...
    Okay?
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    macleodjb (12-16-2010)

  • #13
    Regular Coder
    Join Date
    Apr 2007
    Posts
    317
    Thanks
    24
    Thanked 3 Times in 3 Posts
    Excellent, I believe i understand it now. Thanks for taking the time to break it down for an idiot like me. Very much appreciated.

  • #14
    Regular Coder
    Join Date
    Apr 2007
    Posts
    317
    Thanks
    24
    Thanked 3 Times in 3 Posts
    I had a couple of errors when i put this into sql so i had a make a couple changes. I'm not sure if this altered any of your code or not but it was the best i could come up with my knowledge.

    Code:
    SELECT        TOP (20) ball1, ball2, COUNT(*) AS pairCount
    FROM            WhiteBallPairs
    GROUP BY ball1, ball2
    ORDER BY pairCount DESC
    I was trying to get triple pairs but I can't get the syntax correct for the view

    This is what i had
    Code:
    SELECT        L1.drawing_id, L1.result_number AS ball1, L2.result_number AS ball2, L3.result_number AS ball3
    FROM            dbo.game_results AS L1 INNER JOIN
                             dbo.game_results AS L2 INNER JOIN
                           dbo.game_results AS L3 ON L1.drawing_id = L2.drawing_id AND L2.drawing_id= L3.drawing_id AND L1.result_number < L2.result_number AND L2.result_number < L3.result_number
    WHERE        (L1.result_type = 'WBall') AND (L2.result_type = 'WBall') AND (L3.result_type = 'WBall')
    Last edited by macleodjb; 01-07-2011 at 06:45 PM.

  • #15
    Regular Coder
    Join Date
    Apr 2007
    Posts
    317
    Thanks
    24
    Thanked 3 Times in 3 Posts
    Ok I think i got the triple pair down, not sure, let me know if this is correct.

    Code:
    SELECT        L1.drawing_id, L1.result_number AS ball1, L2.result_number AS ball2, L3.result_number AS ball3
    FROM            dbo.game_results AS L1 INNER JOIN
                             dbo.game_results AS L2 ON L1.drawing_id = L2.drawing_id AND L1.result_number < L2.result_number INNER JOIN
                             dbo.game_results AS L3 ON L2.drawing_id = L3.drawing_id AND L2.result_number < L3.result_number
    WHERE        (L1.result_type = 'WBall') AND (L2.result_type = 'WBall') AND (L3.result_type = 'WBall')


  •  
    Page 1 of 2 12 LastLast

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •