Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 15 of 15
  1. #1
    Regular Coder jfreak53's Avatar
    Join Date
    May 2004
    Location
    Guatemala
    Posts
    477
    Thanks
    19
    Thanked 10 Times in 10 Posts

    MySQL Union query is EXTREMELY SLOW

    Ok, so here I go again. Once again the original programmers of this system made a bunch of views and then unioned two of them, then a final query to select ha ha

    So I have this huge view that is taking forever, I mean I run it on my Linux server and I have to stop it before it finishes it takes so long! Over 5 Min!!

    Of course the original on MSSQL didn't, but that's them, now on MySQL it is horrible. It runs on 4 tables, two of which have over 70K records in it. Yes all columns that it calls on are indexed and a few in each are PK's.

    Any ideas on how to optimize this bugger?

    Here is the horrible query:

    Code:
    SELECT     bank_movimientos_bancos_enca_temp.no_cuenta, bank_movimientos_bancos_enca_temp.codigo_empresa AS empresa, 
                          bank_movimientos_bancos_enca_temp.tipo_movimiento AS tipdoc, bank_movimientos_bancos_enca_temp.no_movimiento AS no_partida, 
                          bank_movimientos_bancos_enca_temp.fechacontable AS fecha, bank_movimientos_bancos_deta_temp.codigo_cuenta AS ccon, 
                          bank_movimientos_bancos_deta_temp.debito_credito AS tipotrn, bank_movimientos_bancos_deta_temp.correlativo AS linea, 
                          bank_movimientos_bancos_enca_temp.descripcion, bank_movimientos_bancos_deta_temp.valor AS monto, 
                          bank_movimientos_bancos_deta_temp.compra, bank_movimientos_bancos_deta_temp.venta, bank_movimientos_bancos_deta_temp.promedio
    FROM         bank_movimientos_bancos_enca_temp INNER JOIN
                          bank_movimientos_bancos_deta_temp ON 
                          bank_movimientos_bancos_enca_temp.codigo_empresa = bank_movimientos_bancos_deta_temp.codigo_empresa AND 
                          TRIM(bank_movimientos_bancos_enca_temp.no_cuenta) = TRIM(bank_movimientos_bancos_deta_temp.no_cuenta) AND 
                          bank_movimientos_bancos_enca_temp.no_movimiento = bank_movimientos_bancos_deta_temp.no_movimiento
    UNION ALL
    SELECT     bank_movimientos_bancos_enca_hist.no_cuenta, bank_movimientos_bancos_enca_hist.codigo_empresa AS empresa, 
                          bank_movimientos_bancos_enca_hist.tipo_movimiento AS tipdoc, bank_movimientos_bancos_enca_hist.no_movimiento AS no_partida, 
                          bank_movimientos_bancos_enca_hist.fechacontable AS fecha, bank_movimientos_bancos_deta_hist.codigo_cuenta AS ccon, 
                          bank_movimientos_bancos_deta_hist.debito_credito AS tipotrn, bank_movimientos_bancos_deta_hist.correlativo AS linea, 
                          bank_movimientos_bancos_enca_hist.descripcion, bank_movimientos_bancos_deta_hist.valor AS monto, bank_movimientos_bancos_deta_hist.compra, 
                          bank_movimientos_bancos_deta_hist.venta, bank_movimientos_bancos_deta_hist.promedio
    FROM         bank_movimientos_bancos_enca_hist INNER JOIN
                          bank_movimientos_bancos_deta_hist ON 
                          bank_movimientos_bancos_enca_hist.codigo_empresa = bank_movimientos_bancos_deta_hist.codigo_empresa AND 
                          TRIM(bank_movimientos_bancos_enca_hist.no_cuenta) = TRIM(bank_movimientos_bancos_deta_hist.no_cuenta) AND 
                          bank_movimientos_bancos_enca_hist.no_movimiento = bank_movimientos_bancos_deta_hist.no_movimiento
    It is just so horrible ha ha

    Thanks for any advise.
    "FORTRAN is not a language. It's a way of turning a multi-million dollar mainframe, into a $50 programmable scientific calculator."
    http://www.microfastcat.com -- FastCat Software, the fastest software on the NET!
    http://www.microthosting.com -- Free reseller web hosting, Hosting, VPS, FREE SMALL HOSTING!!!
    http://www.microtronix-tech.com -- Web design and programming

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,880
    Thanks
    79
    Thanked 4,421 Times in 4,386 Posts
    How long does it take to run the individual "branches" of the UNION?

    That is, how long does just THIS query take:
    Code:
    SELECT     bank_movimientos_bancos_enca_temp.no_cuenta, bank_movimientos_bancos_enca_temp.codigo_empresa AS empresa, 
                          bank_movimientos_bancos_enca_temp.tipo_movimiento AS tipdoc, bank_movimientos_bancos_enca_temp.no_movimiento AS no_partida, 
                          bank_movimientos_bancos_enca_temp.fechacontable AS fecha, bank_movimientos_bancos_deta_temp.codigo_cuenta AS ccon, 
                          bank_movimientos_bancos_deta_temp.debito_credito AS tipotrn, bank_movimientos_bancos_deta_temp.correlativo AS linea, 
                          bank_movimientos_bancos_enca_temp.descripcion, bank_movimientos_bancos_deta_temp.valor AS monto, 
                          bank_movimientos_bancos_deta_temp.compra, bank_movimientos_bancos_deta_temp.venta, bank_movimientos_bancos_deta_temp.promedio
    FROM         bank_movimientos_bancos_enca_temp INNER JOIN
                          bank_movimientos_bancos_deta_temp ON 
                          bank_movimientos_bancos_enca_temp.codigo_empresa = bank_movimientos_bancos_deta_temp.codigo_empresa AND 
                          TRIM(bank_movimientos_bancos_enca_temp.no_cuenta) = TRIM(bank_movimientos_bancos_deta_temp.no_cuenta) AND 
                          bank_movimientos_bancos_enca_temp.no_movimiento = bank_movimientos_bancos_deta_temp.no_movimiento
    And then how long for the other one?
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #3
    Regular Coder jfreak53's Avatar
    Join Date
    May 2004
    Location
    Guatemala
    Posts
    477
    Thanks
    19
    Thanked 10 Times in 10 Posts
    Wow, sorry I should have done that before I posted. What is amazing is that neither one of them would finish for me, I got bored before they did. The longest I waited was 30 min. for the first one and 15 for the second. Both on my server but neither would finish.

    I checked indexes and all tables have indexes on all columns referenced, plus 90% of the columns are PK's.

    Here is the record count, maybe that is it with the joins:

    bank_movimientos_bancos_deta_hist = 15342
    bank_movimientos_bancos_deta_temp = 80364
    bank_movimientos_bancos_enca_hist = 7598
    bank_movimientos_bancos_enca_temp = 37841
    So if that is it, what can I do?
    "FORTRAN is not a language. It's a way of turning a multi-million dollar mainframe, into a $50 programmable scientific calculator."
    http://www.microfastcat.com -- FastCat Software, the fastest software on the NET!
    http://www.microthosting.com -- Free reseller web hosting, Hosting, VPS, FREE SMALL HOSTING!!!
    http://www.microtronix-tech.com -- Web design and programming

  • #4
    Regular Coder
    Join Date
    Apr 2011
    Posts
    286
    Thanks
    2
    Thanked 39 Times in 39 Posts
    If they all are defined in the primary key or regular keys, than the problem could lie with performing the TRIM() operation, as it might not use those keys. Is it possible to TRIM() the data before populating it?

  • Users who have thanked Wojjie for this post:

    jfreak53 (05-04-2011)

  • #5
    Regular Coder jfreak53's Avatar
    Join Date
    May 2004
    Location
    Guatemala
    Posts
    477
    Thanks
    19
    Thanked 10 Times in 10 Posts
    Wow, I can't believe I didn't try that !! Thank you my friend!! I didn't think trim would take that much ha ha

    It must be late if I'm not trying things before posting, I need to stop looking at code for the day!! ha ha

    Thanks guys, alls working now!
    "FORTRAN is not a language. It's a way of turning a multi-million dollar mainframe, into a $50 programmable scientific calculator."
    http://www.microfastcat.com -- FastCat Software, the fastest software on the NET!
    http://www.microthosting.com -- Free reseller web hosting, Hosting, VPS, FREE SMALL HOSTING!!!
    http://www.microtronix-tech.com -- Web design and programming

  • #6
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,880
    Thanks
    79
    Thanked 4,421 Times in 4,386 Posts
    Quote Originally Posted by Wojjie View Post
    If they all are defined in the primary key or regular keys, than the problem could lie with performing the TRIM() operation, as it might not use those keys. Is it possible to TRIM() the data before populating it?
    That was great intuition!

    MySQL is really really dumb about using indexes, compared to SQL Server and Oracle, say.

    It certainly wouldn't use an index on that TRIM()'ed field if the field was part of a composite index, for example.

    I wouldn't have thought that it couldn't use a simply-indexed field with TRIM(), but it sure doesn't surprise me that it doesn't.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #7
    Regular Coder
    Join Date
    Apr 2011
    Posts
    286
    Thanks
    2
    Thanked 39 Times in 39 Posts
    Quote Originally Posted by Old Pedant View Post
    That was great intuition!

    MySQL is really really dumb about using indexes, compared to SQL Server and Oracle, say.

    It certainly wouldn't use an index on that TRIM()'ed field if the field was part of a composite index, for example.

    I wouldn't have thought that it couldn't use a simply-indexed field with TRIM(), but it sure doesn't surprise me that it doesn't.
    Well with MySQL, any operation performed on any column throws out the use of the index, as it is no longer an exact match to the column. Though I guess other SQL servers have optimizations for that, and it would make sense for an SQL engine to index the trimmed data, and just keep track of the whitespace on either side separately somehow, but then again, that could slow down the engine just because of an assumption that a small percentage of the people might do.

  • #8
    Regular Coder jfreak53's Avatar
    Join Date
    May 2004
    Location
    Guatemala
    Posts
    477
    Thanks
    19
    Thanked 10 Times in 10 Posts
    Question Ped, I don't think we have had this discussion before. Would it be hoove of me or might it increase speed if I switched to another DB structure like PgSQL? I just tried PgSQL on one of the queries without the trim and it is 4 seconds faster it seems.

    I have more testing to do of course, right now I am going to transfer over the entire 300MB MySQL DB to PgSQL and try it with both views in place and the final query to see if it's faster at all.

    What are your opinions? Would I gain anything that you know of from switching over to this database type?
    "FORTRAN is not a language. It's a way of turning a multi-million dollar mainframe, into a $50 programmable scientific calculator."
    http://www.microfastcat.com -- FastCat Software, the fastest software on the NET!
    http://www.microthosting.com -- Free reseller web hosting, Hosting, VPS, FREE SMALL HOSTING!!!
    http://www.microtronix-tech.com -- Web design and programming

  • #9
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,880
    Thanks
    79
    Thanked 4,421 Times in 4,386 Posts
    Never used PgSQL, so can't offer an opinion on it.

    But I can't see why you shouldn't at least try the experiment.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #10
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,880
    Thanks
    79
    Thanked 4,421 Times in 4,386 Posts
    Quote Originally Posted by Wojjie View Post
    Well with MySQL, any operation performed on any column throws out the use of the index, as it is no longer an exact match to the column. Though I guess other SQL servers have optimizations for that...
    Yes. It's called "do the operation on the index if possible instead of insisting on doing it on the actual field in the record."

    If the optimize realizes that, by doing the TRIM( ) on the index, joining based on that, is going to be better than insisting on doing TRIM( ) on the actual field...well, the optimizer should do so.

    MySQL simply doesn't consider it worth while to analyze whether it is worth performing operations on index data vs. field data.

    Don't get me wrong, it was probably a really smart decision, back 10 years ago, to get something up and running that was at least pretty darned good. But that was 10 years ago.

    The problem is the same problem you see with many open source projects: After the first couple of versions, all that's left to do is ugly grunt work, and the founders of the project have gone on to more interesting things. There's no money to be made in making improvements, there's no big glory accruing to those who would do so, so they don't happen.

    Truthfully, it's nearly amazing how many improvements *HAVE* occurred in MySQL over the years.

    But neither is it surprising that nobody has spent tens of thousands of man hours on improving query optimization.

    Fun and glory are great motivators for new projects. Profit is a better motivator for continuing projects.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • #11
    Regular Coder jfreak53's Avatar
    Join Date
    May 2004
    Location
    Guatemala
    Posts
    477
    Thanks
    19
    Thanked 10 Times in 10 Posts
    Ha ha well said PED ha ha

    Now I ran a preliminary query on the PGSQL DB. I copied the tables over necessary for the two views and the final query, then I copied the Views over.

    Done. Now I ran the query on MySQL then on PgSQL on my local machine. On MySQL 2 Min 10Sec.!!! Holy crap ha ha

    On PgSQL on my local machine, 71 Sec.!!! OMG That is amazing, and the most amazing thing is, there are no indexes on these tables yet since for some reason it didn't copy those. So I am going to put in indexes and PK's on them all where they need to be then re-run test, I will let you know.
    "FORTRAN is not a language. It's a way of turning a multi-million dollar mainframe, into a $50 programmable scientific calculator."
    http://www.microfastcat.com -- FastCat Software, the fastest software on the NET!
    http://www.microthosting.com -- Free reseller web hosting, Hosting, VPS, FREE SMALL HOSTING!!!
    http://www.microtronix-tech.com -- Web design and programming

  • #12
    Regular Coder jfreak53's Avatar
    Join Date
    May 2004
    Location
    Guatemala
    Posts
    477
    Thanks
    19
    Thanked 10 Times in 10 Posts
    Well surprisingly not a difference at all with the indexes and PK's. Still 70 seconds for the final query to display results. But compared to 2 Min!! Plus on the final server this will go faster since it's a server.

    Any opinions on this out there, anyone have any good or bad experience using PgSQL?? Especially compared to MySQL?
    "FORTRAN is not a language. It's a way of turning a multi-million dollar mainframe, into a $50 programmable scientific calculator."
    http://www.microfastcat.com -- FastCat Software, the fastest software on the NET!
    http://www.microthosting.com -- Free reseller web hosting, Hosting, VPS, FREE SMALL HOSTING!!!
    http://www.microtronix-tech.com -- Web design and programming

  • #13
    Regular Coder
    Join Date
    Apr 2011
    Posts
    286
    Thanks
    2
    Thanked 39 Times in 39 Posts
    Even if you can do a TRIM on the index, it still won't be as fast as it would be without having to do it, and I don't see the point of implementing support for TRIM on indexes or manipulation, as you should be optimizing for that anyway, if anything implementing that would promote laziness. Also, having to analyze the approach before every query would unnecessarily increase the amount of operations for most scenarios that do not require that form of optimization.

    One thing I do notice though, that the other, commercial, SQL databases tend to be more consistent in performance, while MySQL tends to fluctuate depending on the query performed, but can be faster. Though the instant you introduce something complex, with scaling, things become better for the commercial product, though that might have changed more recently, I don't know.

    I do agree though that improvements can still be made, there are many things I would change that make no sense. For instance, from my own experience, it is faster to have InnoDB on a tmpfs with enough buffer to hold the database in memory as well, than it is to have MEMORY tables. This mainly due to table level locking being used for MEMORY tables, which makes sense at first, but then you have to consider that most servers have multiple processors, and you are basically removing the ability to fully utilize all of them with write operations with table level locking.

    So what I would do is invest in removing table level locking completely from MyISAM/etc, and perhaps not go as far as row level locking, but perhaps "partition" level, so it still is quick in certain circumstances.

  • #14
    Regular Coder
    Join Date
    Apr 2011
    Posts
    286
    Thanks
    2
    Thanked 39 Times in 39 Posts
    Quote Originally Posted by jfreak53 View Post
    Well surprisingly not a difference at all with the indexes and PK's. Still 70 seconds for the final query to display results. But compared to 2 Min!! Plus on the final server this will go faster since it's a server.

    Any opinions on this out there, anyone have any good or bad experience using PgSQL?? Especially compared to MySQL?
    Doing a second post to answer you question, as the other one was long and not concerning this directly.

    PgSQL can be faster in quite a few situations than MySQL, but in my experience, when I tried it, I found that you had to "vacuum" regularly, or run out of hard drive space as fast as you were updating/inserting. The reason why is due to PgSQL (at the time that I tested it, might of changed since then) not reusing rows when updating/deleting/inserting, instead it would create a new row at the end of the table and mark the old one for deletion, but not physically removing it till you performed maintenance on the table.

    So if the same design exists, I would not suggest PgSQL on a heavily updated database unless you have time to figure out a good maintenance strategy for it. Then again, there might be a good solution to the problem I outlined, I only spent a week using PgSQL, and that was quite a few years ago.

    MySQL on the other hand can still be quite fast, just it tends to rely on heavy memory caching for some of its speed. Also, I would suggest trying InnoDB tables, I typically only use MyISAM on tables that mainly have INSERT operations, like for tables that are used as log files in a way, which sometimes I just rename the table and create a blank one in its place, which then I process the renamed table separately.

    Lastly, PgSQL seems to be more feature rich than MySQL, they almost seem like they have different goals or directions though that might have changed as well.

    Here is a good page comparing and giving various information about the two:
    http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL

  • #15
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,880
    Thanks
    79
    Thanked 4,421 Times in 4,386 Posts
    Quote Originally Posted by Wojjie View Post
    Even if you can do a TRIM on the index, it still won't be as fast as it would be without having to do it,
    Of course.

    and I don't see the point of implementing support for TRIM on indexes or manipulation, as you should be optimizing for that anyway, if anything implementing that would promote laziness.
    TRIM is just one example. Another example would be something like
    Code:
    SELECT * FROM table WHERE YEAR(somedatefield) = 2010
    Assuming somedatefield is indexed, this would be a trivial operation and would be easily optimized by, say, SQL Server. But not by MySQL.

    Yes, I know, there's a way around it:
    Code:
    SELECT * FROM table WHERE somedatefield >= '2010-1-1' AND somedatefield < '2011-1-1'
    (Or use BETWEEN if the field is truly date-only with no times.)

    But I shouldn't have to rewrite my queries like that. The optimizer should be able to perform operations on indexes as part of its standard work.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •