  1. #1
    Regular Coder
    Join Date
    Apr 2003
    Location
    Montreal, QC
    Posts
    340
    Thanks
    3
    Thanked 2 Times in 2 Posts

    Most efficient way to update multiple rows?

    I have a PHP application with a database where lots of rows are updated quite often. Currently I am updating the rows by sending a bunch of separate UPDATE statements. Does anyone know whether it would be more efficient to use a single query with a CASE expression (essentially a SQL switch statement), as described here?

    Essentially, should I replace 500 queries like this:

    Code:
    UPDATE table SET row1=someval WHERE key=keyval;
    with 1 query like this:

    Code:
    UPDATE table SET row1=CASE key
    WHEN keyval THEN someval
    WHEN otherkeyval THEN someotherval
    ... 500 times....
    ELSE row1 END;

  • #2
    Regular Coder
    Join Date
    Apr 2003
    Location
    Montreal, QC
    Posts
    340
    Thanks
    3
    Thanked 2 Times in 2 Posts

    The Answer:

    After some experimentation and research, it seems that the single query is faster IF you use a WHERE clause saying WHERE key IN (list,of,key,values) AND the number of rows being updated is small enough. On my local system I was able to update 10,000 rows in around one second; however, when I expanded that to 100,000 rows it took almost 200 seconds.

    In comparison, issuing 100,000 individual UPDATE queries takes about 15 seconds.
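
    For reference, a minimal sketch of that combined form (assuming MySQL; table, row1, and key are the placeholder names used above, backticked where needed because table and key are reserved words, and the key/value pairs are made up for illustration):

    Code:
    UPDATE `table`
    SET row1 = CASE `key`
        WHEN 11 THEN 'aaa'
        WHEN 33 THEN 'bbb'
        WHEN 57 THEN 'ccc'
        ELSE row1
    END
    WHERE `key` IN (11, 33, 57);
    Assuming `key` is indexed, the WHERE clause lets the optimizer touch only the listed rows instead of scanning the whole table, and the ELSE row1 branch is then just a safety net.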

  • #3
    UE Antagonizer Fumigator's Avatar
    Join Date
    Dec 2005
    Location
    Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
    Posts
    7,691
    Thanks
    42
    Thanked 637 Times in 625 Posts
    Wouldn't this solution also require that the column gets updated with the same value for all 100k rows, whereas the use of CASE WHEN would give you the flexibility to update with a different value for each of the 100k rows?

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,603
    Thanks
    80
    Thanked 4,634 Times in 4,596 Posts
    I think he means that he does
    Code:
    UPDATE table SET row1 = 'aaa' WHERE key = 11;
    UPDATE table SET row1 = 'bbb' WHERE key = 33;
    UPDATE table SET row1 = 'ccc' WHERE key = 57;
    UPDATE table SET row1 = 'ddd' WHERE key = 85;
    ... etc. ...
    That is, 100K keys with 100K *different* values.

    So it probably makes lots of sense, especially if the key is a primary key. Each of those individual operations is about as simple a thing as the DB can possibly do, whereas the CASE WHEN almost surely involves the DB's creating some complex execution plan.
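
    A quick way to sanity-check that, assuming MySQL and that `key` really is the primary key (or at least indexed), is to ask the optimizer how it would locate one of those rows:

    Code:
    -- EXPLAIN on an equivalent SELECT shows how a single-key lookup is resolved;
    -- an equality match on a primary/unique key typically reports type = const
    -- (a one-row index lookup) rather than ALL (a full table scan).
    EXPLAIN SELECT * FROM `table` WHERE `key` = 11;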

  • #5
    UE Antagonizer Fumigator's Avatar
    Join Date
    Dec 2005
    Location
    Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
    Posts
    7,691
    Thanks
    42
    Thanked 637 Times in 625 Posts
    I was referring to the single query WHERE key IN (x,y,z) which would only work if the column was being updated with a single value.
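
    In other words, on its own the IN list only covers the case where every listed row gets the same new value, something like this (made-up values; names backticked because table and key are reserved words):

    Code:
    UPDATE `table` SET row1 = 'same_value_for_all_rows' WHERE `key` IN (11, 33, 57);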

  • #6
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,603
    Thanks
    80
    Thanked 4,634 Times in 4,596 Posts
    Oh, DOH on me. Yes, of course.

    Yeah, not sure how he expected that to work.

  • #7
    Regular Coder
    Join Date
    Apr 2003
    Location
    Montreal, QC
    Posts
    340
    Thanks
    3
    Thanked 2 Times in 2 Posts
    I'm back - and there's been some action!

    I did mean this:

    Code:
    UPDATE table SET row1 = 'aaa' WHERE key = 11;
    UPDATE table SET row1 = 'bbb' WHERE key = 33;
    UPDATE table SET row1 = 'ccc' WHERE key = 57;
    UPDATE table SET row1 = 'ddd' WHERE key = 85;
    ... etc. ...
    And you would think that would be the fastest way to do things; however, it *seems* to be faster to use a single UPDATE query with the more complicated CASE structure.

    I'm guessing this is because the overhead involved in initiating 100k individual queries is worse than the cost of the single tricky query.

  • #8
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    27,603
    Thanks
    80
    Thanked 4,634 Times in 4,596 Posts
    Ummmm...that's the opposite of what you said earlier.

    I quote your earlier post:
    when I expand that to 100,000 rows it takes almost 200 seconds.

    In comparison, issuing 100,000 individual UPDATE queries takes about 15 seconds.
    I assume you mis-measured the time, before?

    Yes, there is a *lot* of overhead making a query from a server side language to a DB. Each one of the queries has to be sent across the "wire" (probably shared memory or an anonymous pipe) from the Web server to the DB server.
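
    If you do stay with individual UPDATEs, one hedged way to trim part of that per-statement cost is to send them inside a single transaction, so the server isn't committing and flushing after every row. A sketch, assuming an InnoDB table and the same placeholder names as above:

    Code:
    START TRANSACTION;
    UPDATE `table` SET row1 = 'aaa' WHERE `key` = 11;
    UPDATE `table` SET row1 = 'bbb' WHERE `key` = 33;
    UPDATE `table` SET row1 = 'ccc' WHERE `key` = 57;
    -- ... the rest of the individual updates ...
    COMMIT;
    Note that this only amortizes the commit/flush cost on the database side; each statement still has to cross the "wire" from PHP, so it doesn't remove the round-trip overhead described above.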

    Can I ask a question? Is there possibly some way to do the UPDATEs algorithmically?? That is, "if the old value is between 17 and 33 then the new value is 27" or some sort of other rule-based system? If so, and if there are only a (relative) handful of such rules, then maybe doing all this in a Stored Proc would be by far the fastest?
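
    For what it's worth, a rule of that kind would collapse a whole range of rows into one set-based statement, along the lines of (using the made-up numbers from the sentence above):

    Code:
    -- one rule = one set-based UPDATE, with no per-row round trips at all
    UPDATE `table` SET row1 = 27 WHERE row1 BETWEEN 17 AND 33;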

  • #9
    Regular Coder
    Join Date
    Apr 2003
    Location
    Montreal, QC
    Posts
    340
    Thanks
    3
    Thanked 2 Times in 2 Posts
    "Ummmm...that's the opposite of what you said earlier."

    And that's what happens when I go out for the afternoon - I forget things!

    At 10,000 rows the single query is faster; at 100,000 rows multiple queries are faster. The single query seems to take WAY more than 10 times as long at 100,000 rows as at 10,000, so somewhere between 10k and 100k rows multiple queries become the faster option. I'm guessing that at some point in there the dataset gets big enough that disk has to be used as temporary storage, or the CASE structure simply becomes complicated enough that it's slower than the per-query overhead.

    Unfortunately there's no algorithmic way to determine which rows need updating. It's recording the results of an outside process and is extremely random.

  • #10
    UE Antagonizer Fumigator's Avatar
    Join Date
    Dec 2005
    Location
    Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
    Posts
    7,691
    Thanks
    42
    Thanked 637 Times in 625 Posts
    I would guess the exact same thing: at some point MySQL has to start spilling to disk, which slows things down dramatically. It may be worth your while to experiment until you find the sweet-spot number of rows you can update in one query before that happens, then run the update as a few batches of that size; it could save you a few seconds.
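
    A sketch of what that batching might look like, assuming (hypothetically) that the sweet spot turned out to be around 10,000 rows per statement; only a few keys are shown per batch to keep the example readable:

    Code:
    -- batch 1 (in practice, up to the sweet-spot number of keys per statement)
    UPDATE `table`
    SET row1 = CASE `key`
        WHEN 11 THEN 'aaa'
        WHEN 33 THEN 'bbb'
        WHEN 57 THEN 'ccc'
        ELSE row1
    END
    WHERE `key` IN (11, 33, 57);

    -- batch 2, same shape, next slice of keys
    UPDATE `table`
    SET row1 = CASE `key`
        WHEN 85 THEN 'ddd'
        WHEN 92 THEN 'eee'
        ELSE row1
    END
    WHERE `key` IN (85, 92);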

