Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 7 of 7
  1. #1
    Senior Coder
    Join Date
    Dec 2005
    Location
    Slovenia
    Posts
    1,994
    Thanks
    120
    Thanked 76 Times in 76 Posts

    something for brain training

    Google visits my site and I log things:

    data
    Code:
    page1  date1
    page2  date2
    page3  date3
    page4  date4
    page1  date5
    page2  date6
    page1  date7
    Need to drill out average date difference
    for any page visit.
    so if date 1,5,7 would be 30.01.2001,10.01.2001,01.01.2001 respectively
    the result for page1 would be 15 (approx)

    how to do that, if possible at all ?
    Found a flower or bug and don't know what it is ?
    agrozoo.net galery
    if you don't spot search button at once, there is search form:
    agrozoo.net galery search

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,650
    Thanks
    80
    Thanked 4,636 Times in 4,598 Posts
    It's possible, but it won't be an efficient query. I had to do something like this once upon a time.

    Depending on how often you will do this and how much data there is to process, you might be best off doing it in a stored procedure and building a temporary table that holds the pairs.

    How much data is there? How often will you run the query?
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #3
    Senior Coder
    Join Date
    Dec 2005
    Location
    Slovenia
    Posts
    1,994
    Thanks
    120
    Thanked 76 Times in 76 Posts
    like 100.000 records now.
    once or twice, just to see how google responds to my
    <changefreq>
    suggestions.
    Found a flower or bug and don't know what it is ?
    agrozoo.net galery
    if you don't spot search button at once, there is search form:
    agrozoo.net galery search

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,650
    Thanks
    80
    Thanked 4,636 Times in 4,598 Posts
    Here is one way to do it.

    You don't HAVE to create the views to make it work--you can do it with nested queries--but it's easier to see how it works by using the views.
    COMMENT: First, here is the data I am using. Like my table name?

    Code:
    mysql> select * from bubikol;
    +------+------------+
    | page | ondate     |
    +------+------------+
    |    1 | 2001-01-30 |
    |    2 | 2010-04-15 |
    |    3 | 2011-10-10 |
    |    1 | 2001-01-10 |
    |    2 | 2010-02-15 |
    |    3 | 2011-10-08 |
    |    1 | 2001-01-01 |
    +------+------------+
    COMMENT: This first view just organizes the data in a reason able way.
    Code:
    create view bview as
    select b1.page, b2.ondate AS startdate, b1.ondate AS enddate
    from bubikol as b1, bubikol as b2
    where b1.page = b2.page
    and b1.ondate > b2.ondate;
    
    COMMENT: Here's a "dump" so you can see what it is doing:
    
    mysql> select * from bview;
    +------+------------+------------+
    | page | startdate  | enddate    |
    +------+------------+------------+
    |    1 | 2001-01-10 | 2001-01-30 |
    |    2 | 2010-02-15 | 2010-04-15 |
    |    3 | 2011-10-08 | 2011-10-10 |
    |    1 | 2001-01-01 | 2001-01-30 |
    |    1 | 2001-01-01 | 2001-01-10 |
    +------+------------+------------+
    COMMENT: Then this second view throws out the rows from the first view that we don't want.
    Code:
    create view bview2 as
    select page, startdate, min(enddate) as enddate
    from bview
    group by page, startdate;
    
    COMMENT: Again, a "dump" so you can see what it did:
    
    mysql> select * from bview2;
    +------+------------+------------+
    | page | startdate  | enddate    |
    +------+------------+------------+
    |    1 | 2001-01-01 | 2001-01-10 |
    |    1 | 2001-01-10 | 2001-01-30 |
    |    2 | 2010-02-15 | 2010-04-15 |
    |    3 | 2011-10-08 | 2011-10-10 |
    +------+------------+------------+
    COMMENT: See how it worked? It chooses the enddate closest to the startdate in each case.


    ***************************************
    COMMENT: This next dump is just so you can see how to get the number of days in each case.
    We don't actually use this query, per se, in the answer.

    Code:
    select page, startdate, enddate, datediff(enddate,startdate) as daysdiff
    from bview2
    order by page, startdate;
    
    +------+------------+------------+----------+
    | page | startdate  | enddate    | daysdiff |
    +------+------------+------------+----------+
    |    1 | 2001-01-01 | 2001-01-10 |        9 |
    |    1 | 2001-01-10 | 2001-01-30 |       20 |
    |    2 | 2010-02-15 | 2010-04-15 |       59 |
    |    3 | 2011-10-08 | 2011-10-10 |        2 |
    +------+------------+------------+----------+
    *******************************

    COMMENT: And the final result:

    Code:
    select page, AVG(DATEDIFF(enddate,startdate)) as averagedays
    from bview2
    group by page
    order by page;
    
    +------+-------------+
    | page | averagedays |
    +------+-------------+
    |    1 |     14.5000 |
    |    2 |     59.0000 |
    |    3 |      2.0000 |
    +------+-------------+
    Okay?
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #5
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,650
    Thanks
    80
    Thanked 4,636 Times in 4,598 Posts
    If you care, here the whole thing is as a single query, no views:
    Code:
    SELECT page, AVG(DATEDIFF(enddate,startdate)) as averagedays
    FROM ( 
           SELECT page, startdate, min(enddate) as enddate
           FROM (
                  SELECT b1.page, b2.ondate AS startdate, b1.ondate AS enddate
                  FROM bubikol as b1, bubikol as b2
                  WHERE b1.page = b2.page AND b1.ondate > b2.ondate 
                ) AS phase1
           GROUP BY page, startdate
         ) AS phase2
    GROUP BY page
    ORDER BY page;
    If it's not obvious, it is the "phase2" (the bview2) that will eat up performance time.

    And I tried adding indexes on both fields in my "bubikol" table and found that only the index on page makes any difference at all (according to MySQL's EXPLAIN feature) and even then not that much.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #6
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,650
    Thanks
    80
    Thanked 4,636 Times in 4,598 Posts
    NOTE: Adding an index on page field helps a little, according to EXPLAIN. Adding one on the ondate field does nothing.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    BubikolRamios (02-17-2012)

  • #7
    Senior Coder
    Join Date
    Dec 2005
    Location
    Slovenia
    Posts
    1,994
    Thanks
    120
    Thanked 76 Times in 76 Posts
    aha, I did not thought of min(enddate)
    as
    and b1.ondate > b2.ondate;
    joins to multiple dates.
    Found a flower or bug and don't know what it is ?
    agrozoo.net galery
    if you don't spot search button at once, there is search form:
    agrozoo.net galery search


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •