  1. #1
    New to the CF scene
    Join Date
    May 2010
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Search engine with random

    Hi guys!

    I'm not sure this is the right place for this topic.

    I don't need code; I can write it myself.
    I only need the algorithm.

    I want to create a search engine that returns results in random order, but there are some problems. A search may return 50,000 results or more, and there is also a "load more" function. So I need to remember which results have already been displayed, because otherwise the script may show the same results again when the user clicks the "load more" button.

    I think one good way is for JavaScript to remember the IDs of the displayed results (I'm using AJAX across the whole site). When the user sends a "load more" request, JavaScript passes PHP an array of the results already displayed. I can then take that array, for example called $displayed, and build a new query:
    "SELECT * FROM table WHERE id NOT IN ($displayed)"

    But once the user has loaded 10,000 results, the array's element count will be more than 200,000.

    Is there any better way?

    Thanks a lot.
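
A minimal sketch of the "send back the displayed IDs" idea described above. The thread is about PHP and MySQL; this uses Python with an in-memory SQLite database only so that it runs as-is. The `items` table, its columns, and the `load_more` helper are hypothetical names, not anything from the original site. The IDs are bound as parameters rather than pasted into the SQL string, since they come from the browser.

```python
import sqlite3

def load_more(conn, displayed_ids, page_size=20):
    """Return up to page_size random rows whose id is not in displayed_ids."""
    ids = [int(i) for i in displayed_ids]  # IDs come from the client: validate them
    where = ""
    if ids:
        placeholders = ",".join("?" * len(ids))  # "?,?,?" for three IDs
        where = f"WHERE id NOT IN ({placeholders})"
    sql = f"SELECT id, title FROM items {where} ORDER BY RANDOM() LIMIT ?"
    return conn.execute(sql, ids + [page_size]).fetchall()

# Hypothetical data: 50 rows standing in for the search results.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"item {i}") for i in range(1, 51)])

displayed = []  # what the browser would send back with each "load more" request
while True:
    page = load_more(conn, displayed)
    if not page:
        break
    displayed.extend(row[0] for row in page)
```

Each call excludes everything already shown, so no row is repeated. The cost is that the excluded list grows with every page, and databases generally cap the number of bound parameters or the statement length, so this is only a sketch for modest result counts.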

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,965
    Thanks
    79
    Thanked 4,429 Times in 4,394 Posts
    You really expect some human being to load and look at 10,000 results?????

    In general, I try never to show a user more than a screenful of data. So no more than 10 to 20 records at a time, though I suppose you might get to 100 with some kinds of displays.

    Still, 10,000 record IDs at, say, 6 characters each (about 60,000 characters) is a lot, but not overwhelming. I don't think that's a really bad solution at all.

    You will have to use <form method=post> of course, as that's way too much data for a query string, but I can't see that this is bad.
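
To put a rough number on that estimate, here is a small Python calculation. The ID values are made up (10,000 six-digit IDs), and it assumes they are sent as one comma-separated string:

```python
# 10,000 hypothetical IDs, 100000..109999, each exactly 6 digits long
ids = [str(100000 + i) for i in range(10_000)]

payload = ",".join(ids)          # what the browser might POST back
payload_chars = len(payload)     # 60,000 digits plus 9,999 commas

print(payload_chars)  # 69999
```

That is about 70 KB per request as plain ASCII, which supports the point that it belongs in a POST body rather than a query string.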
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #3
    Regular Coder
    Join Date
    Apr 2011
    Posts
    286
    Thanks
    2
    Thanked 39 Times in 39 Posts
    Sounds like your search results already have an ID associated with them, and depending on how random you need it, or whether it needs to be random between users and sessions, you can approach it slightly differently.

    The way I would do it is to store the search ID, order ID and row ID in a table, then paginate that. The user's session is matched to a search ID, the order ID is what the results are sorted by, and the row ID points to the result you want to display.
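
One possible reading of that cache table, sketched in Python with SQLite so it runs on its own (the thread itself assumes PHP and MySQL). The table name `search_cache` and the helpers `cache_search` and `fetch_page` are hypothetical. The random order is decided once, when the search is cached, so "load more" becomes an ordinary range query and the client never has to send back the IDs it has already seen.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE search_cache (
        search_id INTEGER NOT NULL,   -- which cached search this row belongs to
        order_id  INTEGER NOT NULL,   -- position in the shuffled result order
        row_id    INTEGER NOT NULL,   -- ID of the actual result row
        PRIMARY KEY (search_id, order_id)
    )
""")

def cache_search(conn, search_id, matching_row_ids, rng=random):
    """Shuffle the matching IDs once and store them with their position."""
    shuffled = list(matching_row_ids)
    rng.shuffle(shuffled)
    conn.executemany(
        "INSERT INTO search_cache (search_id, order_id, row_id) VALUES (?, ?, ?)",
        [(search_id, pos, row_id) for pos, row_id in enumerate(shuffled)],
    )

def fetch_page(conn, search_id, page, page_size=20):
    """Return the row IDs for one page of a cached search, in stored order."""
    start = page * page_size
    rows = conn.execute(
        "SELECT row_id FROM search_cache "
        "WHERE search_id = ? AND order_id >= ? AND order_id < ? "
        "ORDER BY order_id",
        (search_id, start, start + page_size),
    ).fetchall()
    return [r[0] for r in rows]

# Hypothetical search that matched rows 1..50.
cache_search(conn, search_id=1, matching_row_ids=range(1, 51))
pages = [fetch_page(conn, 1, p) for p in range(3)]
```

The session only needs to remember the search ID and the current page number.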

    Depending on how random it needs to be, you can reuse the search ID by creating a new table when the time comes and giving new users the results stored in that new table. When it is time to rotate again (the old results expire), you rotate the tables and drop the oldest one. The reason for this is the cost of a DELETE operation on a large number of rows in a single table: creating and dropping tables, or truncating them, is faster than selectively deleting rows. You could use InnoDB for the table and not care, but InnoDB will probably be slower for this operation than MyISAM (assuming you're using MySQL).

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,965
    Thanks
    79
    Thanked 4,429 Times in 4,394 Posts
    Wojjie: And what happens if you have 3,875 users? You would create a temporary table, in random order, for each one of them???? Not practical.

  • #5
    Regular Coder
    Join Date
    Apr 2011
    Posts
    286
    Thanks
    2
    Thanked 39 Times in 39 Posts
    One table of results applies when you reuse those results for all users, not per user; the second table holds new results for newer users, or for users whose results have expired.

    If you want random results for EACH user, you can use a single table, or you can use the same two-table setup so that you don't have to delete expired results. Instead, you drop the second, older table, rename the remaining table to be the second table, and create a new first table.

    At no point do you have more than 2 tables at a time.

    table a: new results
    table b: old results, on the brink of being deleted when table a rotates to table b
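
The rotation described above can be sketched as follows, again in Python with SQLite purely as a runnable stand-in for MySQL. The table names `results_a` and `results_b`, the column layout, and the `rotate` helper are assumptions made for illustration.

```python
import sqlite3

SCHEMA = "(search_id INTEGER, order_id INTEGER, row_id INTEGER)"

def rotate(conn):
    """Expire the old table wholesale instead of deleting rows one by one."""
    conn.execute("DROP TABLE IF EXISTS results_b")             # oldest results go
    conn.execute("ALTER TABLE results_a RENAME TO results_b")  # current becomes old
    conn.execute(f"CREATE TABLE results_a {SCHEMA}")           # fresh, empty table

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE results_a {SCHEMA}")
conn.execute(f"CREATE TABLE results_b {SCHEMA}")
conn.execute("INSERT INTO results_b VALUES (1, 0, 42)")  # old, about to expire
conn.execute("INSERT INTO results_a VALUES (2, 0, 7)")   # current

rotate(conn)

tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
a_rows = conn.execute("SELECT * FROM results_a").fetchall()
b_rows = conn.execute("SELECT * FROM results_b").fetchall()
```

After `rotate`, there are still exactly two tables: the previously current rows now live in `results_b`, and `results_a` is empty and ready for new searches.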

  • #6
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,965
    Thanks
    79
    Thanked 4,429 Times in 4,394 Posts
    Okay, if his specifications would allow it, that's a reasonable scheme.

    But I can easily envision specs that wouldn't allow that.

  • #7
    Regular Coder
    Join Date
    Apr 2011
    Posts
    286
    Thanks
    2
    Thanked 39 Times in 39 Posts
    It can easily be modified to follow most specs, except where there is no ID relating to the row that needs to be displayed.

    The main problem with search results is speed, so you need to cache them in a table, and rotating tables is the least costly way to get rid of old rows.

