MySQL's braindead subselect handling...

| | Comments (3) | TrackBacks (0)
Or: Why MySQL is not a database but only a "database".

Imagine this query:

mysql> select summaries0_.bw_eventid as bw1_1_, summaries0_.bw_strid as bw2_1_, bwstring1_.bw_id as bw1_70_0_, bwstring1_.bwseq as bwseq70_0_, bwstring1_.bw_lang as bw3_70_0_, bwstring1_.bw_value as bw4_70_0_ from bw_event_summaries summaries0_ left outer join bw_strings bwstring1_ on summaries0_.bw_strid=bwstring1_.bw_id where summaries0_.bw_eventid in (select bweventobj1_.eventid from bw_recurrences bwrecurren0_ inner join bw_events bweventobj1_ on bwrecurren0_.masterid=bweventobj1_.eventid where (bwrecurren0_.bw_rstart_floating='T' and bwrecurren0_.start_date<'20100322T000000Z' and (bwrecurren0_.end_date>'20100315T000000Z' or bwrecurren0_.start_date=bwrecurren0_.end_date and bwrecurren0_.end_date>='20100315T000000Z') or (bwrecurren0_.bw_rstart_floating is null) and bwrecurren0_.start_date<'20100321T230000Z' and (bwrecurren0_.end_date>'20100314T230000Z' or bwrecurren0_.start_date=bwrecurren0_.end_date and bwrecurren0_.end_date>='20100314T230000Z')) and bweventobj1_.bw_col_path='/user/pascal.gienger/Training' );

+--------+--------+-----------+------------+-----------+--------------------------+
| bw1_1_ | bw2_1_ | bw1_70_0_ | bwseq70_0_ | bw3_70_0_ | bw4_70_0_                |
+--------+--------+-----------+------------+-----------+--------------------------+
|   2251 |   2674 |      2674 |          0 | NULL      | MLBf                     |
|   2493 |   2933 |      2933 |          0 | NULL      | Cvlqxpqz                 |
|   2848 |   3322 |      3322 |          0 | NULL      | Vlqehfhva                |
|   2927 |   3405 |      3405 |          0 | NULL      | fb.nt-Lkqivqm            |
|   3057 |   3557 |      3557 |          0 | NULL      | Glqsd                    |
|   3116 |   3616 |      3616 |          0 | NULL      | KGWG                     |
|   3212 |   3718 |      3718 |          0 | NULL      | MlwfQPE                  |
|   3361 |   3873 |      3873 |          0 | NULL      | Eklmzmmlsfeei            |
|   3573 |   4098 |      4098 |          0 | NULL      | HYX-Rjdmmtyubvbqeczvwxz  |
|   3622 |   4149 |      4149 |          0 | NULL      | Uayfhjjlge               |
|   3741 |   4283 |      4283 |          0 | NULL      | Sflnh QK/Flkudduiwmkbdqz |
+--------+--------+-----------+------------+-----------+--------------------------+
11 rows in set (1 min 1.69 sec)
1 Minute!!!!! After seperating the outer and inner join the same table displays just after some seconds:
1. Take the inner join:

mysql> select bweventobj1_.eventid from bw_recurrences bwrecurren0_ inner join bw_events bweventobj1_ on bwrecurren0_.masterid=bweventobj1_.eventid where (bwrecurren0_.bw_rstart_floating='T' and bwrecurren0_.start_date<'20100321T000000Z' and (bwrecurren0_.end_date>'20100314T000000Z' or bwrecurren0_.start_date=bwrecurren0_.end_date and bwrecurren0_.end_date>='20100314T000000Z') or (bwrecurren0_.bw_rstart_floating is null) and bwrecurren0_.start_date<'20100320T230000Z' and (bwrecurren0_.end_date>'20100313T230000Z' or bwrecurren0_.start_date=bwrecurren0_.end_date and bwrecurren0_.end_date>='20100313T230000Z')) and bweventobj1_.bw_col_path='/user/pascal.gienger/Training';

+---------+
| eventid |
+---------+
|    3622 |
|    3573 |
|    3212 |
|    3116 |
|    2251 |
|    2927 |
|    2493 |
|    3057 |
|    2848 |
|    3212 |
|    3361 |
|    2493 |
|    3057 |
|    3741 |
|    3212 |
+---------+
15 rows in set (0.03 sec)


0.03 seconds (!).

2. Take this as a result for the outer join:

mysql> select summaries0_.bw_eventid as bw1_1_, summaries0_.bw_strid as bw2_1_, bwstring1_.bw_id as bw1_70_0_, bwstring1_.bwseq as bwseq70_0_, bwstring1_.bw_lang as bw3_70_0_, bwstring1_.bw_value as bw4_70_0_ from bw_event_summaries summaries0_ left outer join bw_strings bwstring1_ on summaries0_.bw_strid=bwstring1_.bw_id WHERE summaries0_.bw_eventid IN (3622,3573,3212,3116,2251,2927,2493,3057,2848,3212,3361,2493,3057,3741,3212);

+--------+--------+-----------+------------+-----------+--------------------------+
| bw1_1_ | bw2_1_ | bw1_70_0_ | bwseq70_0_ | bw3_70_0_ | bw4_70_0_                |
+--------+--------+-----------+------------+-----------+--------------------------+
|   2251 |   2674 |      2674 |          0 | NULL      | MLBf                     |
|   2493 |   2933 |      2933 |          0 | NULL      | Cvlqxpqz                 |
|   2848 |   3322 |      3322 |          0 | NULL      | Vlqehfhva                |
|   2927 |   3405 |      3405 |          0 | NULL      | fb.nt-Lkqivqm            |
|   3057 |   3557 |      3557 |          0 | NULL      | Glqsd                    |
|   3116 |   3616 |      3616 |          0 | NULL      | KGWG                     |
|   3212 |   3718 |      3718 |          0 | NULL      | MlwfQPE                  |
|   3361 |   3873 |      3873 |          0 | NULL      | Eklmzmmlsfeei            |
|   3573 |   4098 |      4098 |          0 | NULL      | HYX-Rjdmmtyubvbqeczvwxz  |
|   3622 |   4149 |      4149 |          0 | NULL      | Uayfhjjlge               |
|   3741 |   4283 |      4283 |          0 | NULL      | Sflnh QK/Flkudduiwmkbdqz |
+--------+--------+-----------+------------+-----------+--------------------------+
11 rows in set (0.00 sec)

MySQL, please rest in peace.

Is there any tiny config option to change this behaviour?
Tables are in InnoDB format.

MySQL 5.1.44, icc version, innodb parameters:

innodb_data_home_dir = /var/mysql_5144/
innodb_data_file_path = ibdata1:100M:autoextend
innodb_log_group_home_dir = /var/mysql_5144/
innodb_buffer_pool_size = 400M
innodb_additional_mem_pool_size = 2M
innodb_log_file_size = 100M
innodb_log_buffer_size = 8M
innodb_lock_wait_timeout = 50
innodb_doublewrite=0
innodb_flush_log_at_trx_commit=0
innodb_thread_concurrency=8
innodb_flush_method=O_DIRECT

0 TrackBacks

Listed below are links to blogs that reference this entry: MySQL's braindead subselect handling....

TrackBack URL for this entry: http://southbrain.com/mt/mt-tb.cgi/164

3 Comments

If you push the subquery in to a "derived table" (a subquery in the FROM clause), then you will see better performance - it will only be executed once, rather than once for every row returned from the other statement (which is what happens in your case above):

select summaries0_.bw_eventid as bw1_1_, summaries0_.bw_strid as bw2_1_, bwstring1_.bw_id as bw1_70_0_, bwstring1_.bwseq as bwseq70_0_, bwstring1_.bw_lang as bw3_70_0_, bwstring1_.bw_value as bw4_70_0_ from bw_event_summaries summaries0_ left outer join bw_strings bwstring1_ on summaries0_.bw_strid=bwstring1_.bw_id join (select bweventobj1_.eventid from bw_recurrences bwrecurren0_ inner join bw_events bweventobj1_ on bwrecurren0_.masterid=bweventobj1_.eventid where (bwrecurren0_.bw_rstart_floating='T' and bwrecurren0_.start_date'20100314T000000Z' or bwrecurren0_.start_date=bwrecurren0_.end_date and bwrecurren0_.end_date>='20100314T000000Z') or (bwrecurren0_.bw_rstart_floating is null) and bwrecurren0_.start_date'20100313T230000Z' or bwrecurren0_.start_date=bwrecurren0_.end_date and bwrecurren0_.end_date>='20100313T230000Z')) and bweventobj1_.bw_col_path='/user/pascal.gienger/Training') s on summaries0_.bw_eventid = s.eventid;

Thank you for your reply!

The problem is as follows: It's not me generating this statement, it's the Hibernate-Framework which is used in this software product.

I did not expect a response from an Oracle guy here - and I am positively touched.

Problem remains that the MySQL parser/optimizer is the place where big performance gains could be made.

Indeed, I recognized the hibernate aliasing in your query.. ;)

Sometimes Hibernate does dumb things when it comes to the MySQL optimizer (well, maybe the MySQL optimizer could do with a little work too of course, but you deal with what you've got first and foremost, when it comes to abstraction layers, imho).

There is work underway for various subquery optimization tasks, this being one of them, such as:

http://forge.mysql.com/worklog/task.php?id=1110

Leave a comment

June 2010

Sun Mon Tue Wed Thu Fri Sat
    1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30      

About

This blog is owned by:

Pascal Gienger
Kanzleistr. 14
78462 Konstanz
Phone +49 7531 584298
Fax +49 7531 584298-9

Phone USA 1-678-791-4182

YouTube Channel: pascalgienger