Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 5 of 5
  1. #1
    New to the CF scene
    Join Date
    Dec 2011
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Question SQLQuery for Trailing Days

    I am having trouble writing a sql statement. Here is the scenario:

    - My table contains the member_id and calendar_date that members of my social community logged in as follows:

    CREATE TABLE LOGIN_HISTORY
    (
    Login_Key int NOT NULL AUTO_INCREMENT, -- unique identifier for the login event
    Member_ID int NOT NULL, -- the member who logged in
    Login_Date DATETIME NOT NULL -- the date they logged in on
    )

    I am trying to write a SQL query that will allow me to create a trend chart of the following:

    - How many members who logged in during the past 7-14 days also logged in during the past 1-7 days
    - For example (given today is Jan 14), if a member logged into the community during Jan 1 - 7, they would get counted if they also logged in during Jan 8 - 14.

    Since this is a trend of percent of returning users over time, I believe I need to run a 7 trailing days calculation in SQL and also be able to calculate the denominator (all logins within the period) as well as the numerator (all who return during the next 7 days).

    Here is some sample data that would be in the source table (DDL specified above):
    Login_Key Member_ID Login_Date
    1 123 12/27/11
    2 123 12/27/11 (logged in twice on this date, could have multiple logins on a day but they we do not count more than the first)
    3 123 1/12/12
    4 123 1/13/12

    In case this question is difficult to follow, here is an example of what the result set might look like (if it is slightly different, I can make some adjustments, group by, etc): * Record_No included for the discussion below *

    Week_Ending_Date Member_ID Logged_In_Prior_Week Logged_In_Current_Week Record_No
    1/10/12 123 YES NO 1 (prior week = 12/27/11 - 1/2/12; current week = 1/3/12 - 1/10/12)
    1/11/12 123 NO NO 2 (prior week = 12/28/11 - 1/3/12; current week = 1/4/12 - 1/11/12)
    1/12/12 123 NO YES 3 (prior week = 12/29/11 - 1/4/12; current week = 1/5/12 - 1/12/12)
    1/13/12 123 NO YES 4 (prior week = 12/30/11 - 1/5/12; current week = 1/6/12 - 1/13/12)

    I will count the records that meet my criteria (record 1) and then divide (Logged_In_Current_Week/Logged_In_Prior_Week) and trend it across all of the days (even though Member_ID 123 did not contribute to 1/1/12, 1/11/12, and 1/13/12, other members likely would.

    This is not as simple as looking for the week portion of the date and grouping by that as each day the trend report would change (as opposed to changing every seven days). I think I need a trailing 7-day calculation or a moving window. I also need to keep this flexible as someone will invariably want the same calculation for a 30-day look back (with prior period = 60-30 days ago and current period = 30-1 days ago).

    I have a DATE dimension table that has a record for each day if that helps. If this is not possible with a query, then I will have to write a cursor to load a table and read from that but it seems possible with a query.

    Thank you for contributing some thoughts.

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,565
    Thanks
    80
    Thanked 4,620 Times in 4,583 Posts
    First, a minor comment: Your LOGIN_KEY field is a waste of time and space. By definition, it is that table's PRIMARY KEY (since an AUTO_INCREMENT field must be the primary key) and yet no other table uses it as a FOREIGN KEY, so it's actually just clogging up your database.

    A better way to have defined that table would probably have been
    Code:
    CREATE TABLE LOGIN_HISTORY
    (
        Member_ID int NOT NULL, -
        Login_Date DATETIME NOT NULL,
        PRIMARY KEY (Member_ID, Login_Date)
    )
    You'd probably get better performance from that.

    Anyway...

    I assume you want to do this trend analysis on a per MEMBER_ID basis, given the example you showed?

    Yes, having that separate table with all dates would be very handy. The query without it would be pretty complex, but with it it's not hard at all.

    I'll assume your all-dates table looks something like this:
    Code:
    CREATE TABLE allDates ( 
        theDate DATETIME );
    (and of course it could have other fields, but we don't need them for this query). And I'll also assume the existence of an "allMembers" table.

    So...let's tackle it:
    Code:
    SELECT A.theDate, 
           M.member_ID, 
           IF(COUNT(H2.*)>0,'yes','no') AS priorweek, 
           IF(COUNT(H1.*)>0,'yes','no') AS currentweek
    FROM allDates AS A 
    INNER JOIN allMembers AS M
    LEFT JOIN Login_History AS H1 
         ON DATE(H1.Login_Date) BETWEEN DATE_SUB(A.theDate,INTERVAL 6 DAY) AND A.theDate
            AND H1.member_ID = M.member_id
    LEFT JOIN Login_History AS H2 
         ON DATE(H2.Login_Date) BETWEEN DATE_SUB(A.theDate,INTERVAL 13 DAY) AND DATE_SUB(A.theDate,INTERVAL 7 DAY)
            AND H2.member_ID = M.member_id
    GROUP BY M.member_id, A.theDate
    ORDER BY M.member_id, A.theDate
    If you want to limit it to a single member_id, just add
    Code:
    WHERE M.member_id = $memid
    just before the GROUP BY.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #3
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,565
    Thanks
    80
    Thanked 4,620 Times in 4,583 Posts
    Above is untested, off the top of my head. It should work, but if not let me know.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #4
    New to the CF scene
    Join Date
    Dec 2011
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks. I agree with the ID column... point well taken.

    There is no "ON" clause for the first inner join. Is this intentional?

    I am going to test it out.

    Thanks again.

  • #5
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,565
    Thanks
    80
    Thanked 4,620 Times in 4,583 Posts
    > There is no "ON" clause for the first inner join. Is this intentional?

    Yes. And it's possible that MySQL may choke on that. If so, give it a dummy ON clause, such as ON 1 = 1.

    We *want* all possible days and all possible members, so we want the full cartesian product--all rows from both tables with no limitations.

    If it's not obvious, you can limit the dates test by adding
    Code:
    WHERE a.theDate BETWEEN '2012-1-1' AND '2012-1-31'
    or whatever date range you want.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.


  •  

    Tags for this Thread

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •