SQL - why does the LEFT JOIN screw up this SQL?

SQL - why does the LEFT JOIN screw up this SQL?

Author
Discussion

mystomachehurts

Original Poster:

11,669 posts

251 months

Monday 12th November 2007
quotequote all
Big events table, +70million records, has a protocol_id column which is indexed, that I want to join to the protocols table so that I get the protocol name instead of its number.

So if I select the DISTINCT protocol_id from the events table I get a very fast result of 9 rows.

But, as you can see, when I now include a JOIN to get the protocol name it seems to resort into using a full table scan.

I've tried an INNER JOIN, that reverses the table lookup but it still want's to pull 79million records from the events table.

How come it doesn't just do the DISTINCT lookup on the protocol_id (which is very very fast) and then go find the protocols.filter for the 9 rows it found?

Is there a way I can code around this?

Many thanks.




mysql> select distinct(events.protocol_id) from events;
+-------------+
| protocol_id |
+-------------+
| 98 |
| 99 |
| 101 |
| 104 |
| 110 |
| 118 |
| 121 |
| 123 |
| 124 |
+-------------+
9 rows in set (0.00 sec)

mysql> explain select distinct(events.protocol_id) from events;
+----+-------------+--------+-------+---------------+----------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+----------------+---------+------+------+--------------------------+
| 1 | SIMPLE | events | range | NULL | ind_protocolid | 8 | NULL | 111 | Using index for group-by |
+----+-------------+--------+-------+---------------+----------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)



mysql> explain select distinct(events.protocol_id), protocols.filter
from events
LEFT JOIN protocols on events.protocol_id = protocols.protocol_id;

+----+-------------+-----------+-------+-----------------+-----------------+---------+-------------------------------------+----------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+-------+-----------------+-----------------+---------+-------------------------------------+----------+------------------------------+
| 1 | SIMPLE | events | index | NULL | ind_protocolid | 8 | NULL | 78549035 | Using index; Using temporary |
| 1 | SIMPLE | protocols | ref | idx_protocol_id | idx_protocol_id | 8 | network_analyser.events.protocol_id | 1 | |
+----+-------------+-----------+-------+-----------------+-----------------+---------+-------------------------------------+----------+------------------------------+







mystomachehurts

Original Poster:

11,669 posts

251 months

Monday 12th November 2007
quotequote all
Sussed it... needed to use a sub-quey instead


mysql> select protocols.filter FROM protocols WHERE protocols.protocol_id IN (SELECT distinct(events.protocol_id) FROM events);
+--------+
| filter |
+--------+
| bgan |
| cncp |
| rhcp |
| mcpcp |
| atm |
| sscop |
| eth |
| udp |
| ip |
+--------+
9 rows in set (0.03 sec)


whygee02

3,377 posts

201 months

Monday 12th November 2007
quotequote all
Luckily you must be running a recent version of mysql, sub SELECTs were only introduced to mysql recently !!