18、读懂MySQL中的执行计划（下）

接上一讲

POSSIBLE_KEYS

possible_keys列显示查询单元能使用range、ref等访问路径访问的索引。执行计划最终选择的索引在keys列显示。是否使用索引，以及使用哪个索引，取决于优化器对各种访问方式的成本评估，还跟表连接的顺序和连接算法也有关系。

下面这个例子中，t1表的possible_keys有索引idx_abc，但是当t1表作为驱动表时，就无法使用索引idx_abc。

mysql> explain select * from tab t1, tab t2 where t1.a = t2.a and t1.b = t2.b;


+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key     | key_len | ref               | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | t1    | ALL  | idx_abc       | NULL    | NULL    | NULL              | 9913 |   100.00 | NULL  |
|  1 | SIMPLE      | t2    | ref  | idx_abc       | idx_abc | 8       | rep.t1.a,rep.t1.b |    1 |   100.00 | NULL  |
+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+

KEY

key列显示执行计划实际使用的索引。如果key列为NULL，则说明查询单元没有使用索引。对于index_merge访问路径，key列中会显示多个索引。

KEY_LEN

key_len列显示执行计划使用到的索引列的总长度。根据key_len可以推算出执行计划使用到了索引中的哪几个字段。key_len根据索引字段的类型和字段是否为空计算得到。对于字符类型如varchar、char，key_len为字符数乘以单个字符的最大可能字节数。对于每个可变长类型如varchar，key_len额外加2。对于每个可以为空的字段，key_len额外加1。

 create table t_k(
     a varchar(20), 
     b char(20), 
     key idx_a(a,b)
) engine=innodb charset=utf8mb4;




mysql> explain select * from t_k where  a='x';
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | t_k   | ref  | idx_a         | idx_a | 83      | const |    1 |   100.00 | Using index |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+




mysql> explain select * from t_k where  a='x' and b='x';


+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref         | rows | filtered | Extra                    |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+
|  1 | SIMPLE      | t_k   | ref  | idx_a         | idx_a | 164     | const,const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+

上面这个例子中，SQL 1使用索引中的字段a，字段A的类型为varchar(20)，字符集为utf8mb4，key_len为20 * 4 + 2 + 1 = 83。

SQL 2使用到了字段A和B，字段A的key_len为83，字段B的key_len为20 * 4 + 1 = 81，整体Key_len为字段A和字段B相加164。

REF

ref列显示用来进行索引查找的值，ref的取值可能是以下几种情况：

const：使用常量匹配
db.tab.c：使用驱动表的某个字段匹配
func：使用某个函数的计算结果匹配。可以在执行explain后使用show warnings命令查看转换后的SQL。

ROWS

查询单元需要访问的记录数。对于InnoDB引擎，这里的记录数是一个预估的行数，跟实际执行过程中真实访问的记录数可能会有一些差异。对于全表扫描和全索引扫描，这里的行数从统计信息中获取。对于索引访问（type为ref或range），rows通过访问索引评估得到，或通过索引的统计信息计算得到。对应派生表，rows通过一些规则评估得到。

一般来说，rows越大，说明查询单元需要访问的记录数越多，执行时间越长。

FILTERED

filtered字段单位为百分比，取值范围为0-100，表示经过where子句中的条件过滤后，满足条件的记录数相对于rows列中显示的行数所占的百分比。使用公示rows * filtered / 100可以得到优化器预估的查询单元返回的记录数。如果当前的查询单元作为驱动表，那么这里的记录数还决定了被驱动的查询单元需要执行多次。

优化器中有一系列固定的规则来计算filtered的取值。你可以在分析表（analyze table）的时候给字段添加直方图，使优化器能更精确地计算filtered。参数optimizer_switch中的选项condition_fanout_filter用来控制是否开启条件过滤。

EXTRA

Extra列中显示了执行计划额外的一些重要信息。

using where

如果访问路径为ALL或index，Extra中没用using where，说明查询需要读取整个表或索引的数据。

Range checked for each record (index map: 0x n)

如果Extra中出现了“Range checked for each record”，那么查询的性能很可能不太好。这里index map是索引编号的位图信息。

mysql> explain select * from tab a, tab b where  a.id > b.id and a.c > b.c;


+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
|  1 | SIMPLE      | a     | ALL  | PRIMARY       | NULL | NULL    | NULL | 9913 |   100.00 | NULL                                           |
|  1 | SIMPLE      | b     | ALL  | PRIMARY       | NULL | NULL    | NULL | 9913 |    11.11 | Range checked for each record (index map: 0x1) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+

Using index; Using temporary

Extra中出现Using temporary，说明用到了临时表。

mysql> explain select b,a,count(*) from  tab group by b,a;


+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | tab   | index | idx_abc       | idx_abc | 12      | NULL | 9913 |   100.00 | Using index; Using temporary |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+

Using index for skip scan

查询条件没有传入索引的前缀字段，又用到了覆盖索引时，MySQL可能会使用skip scan。如果前缀列的唯一值很低，skip scan也可能会有不错的性能。

mysql> explain select c from tab where b=1;


+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                  |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 8       | NULL |  991 |   100.00 | Using where; Using index for skip scan |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+

Using index

使用了覆盖索引，也就是查询中所有列都包含在索引中。

no matching row in const table

说明表里面不存在满足条件的记录。

mysql> explain select * from tab where id = 12345;


+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table  type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                          |
+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
|  1 | SIMPLE      | NULL   NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | no matching row in const table |
+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+

Using index for group-by

mysql> explain select a, min(b) from tab group by a;


+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 4       | NULL |    4 |   100.00 | Using index for group-by |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+

LooseScan

LooseScan，以及FirstMatch、Start temporary、End temporary，都是子查询自动改写为表连接后的执行方式，我们在后续的子查询这一讲中再具体介绍。

mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)


mysql> explain select * from tab ta 
  where a=1 and c in (select a from tab tb);


+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra                  |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
|  1 | SIMPLE      | tb    | index | idx_abc       | idx_abc | 12      | NULL  | 9913 |     0.03 | Using index; LooseScan |
|  1 | SIMPLE      | ta    | ref   | idx_abc       | idx_abc | 4       | const | 3333 |    10.00 | Using index condition  |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+

FirstMatch(ta)

mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)


mysql> explain select * from tab ta where a=1 and c in (select c from tab tb);
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra                                                                   |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
|  1 | SIMPLE      | ta    | ref   | idx_abc       | idx_abc | 4       | const | 3333 |   100.00 | NULL                                                                    |
|  1 | SIMPLE      | tb    | index | NULL          | idx_abc | 12      | NULL  | 9913 |    10.00 | Using where; Using index; FirstMatch(ta); Using join buffer (hash join) |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+

Start temporary, End temporary

mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)


mysql> explain select * from tab ta where id in (select b from tab tb where b < 10);


+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref      | rows | filtered | Extra                                     |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
|  1 | SIMPLE      | tb    | index  | NULL          | idx_abc | 12      | NULL     | 9913 |    33.33 | Using where; Using index; Start temporary |
|  1 | SIMPLE      | ta    | eq_ref | PRIMARY       | PRIMARY | 4       | rep.tb.b |    1 |   100.00 | End temporary                             |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+

Using index condition

使用到了索引下推条件。

mysql> explain select * from tab where a=1 and c=1;


+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | tab   | ref  | idx_abc       | idx_abc | 4       | const | 3333 |    10.00 | Using index condition |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+

Using filesort

说明查询需要排序。

mysql> explain select * from tab order by b;


+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | tab   | ALL  | NULL          | NULL | NULL    | NULL | 9913 |   100.00 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+

Using join buffer (hash join)

被驱动表缺少合适的索引时，MySQL会考虑使用Hash连接算法。

mysql> explain select * from tab t1, tab t2 where t1.a = 1 and t1.c=t2.c;


+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra                                      |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
|  1 | SIMPLE      | t1    | ref  | idx_abc       | idx_abc | 4       | const | 3333 |   100.00 | NULL                                       |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL    | NULL    | NULL  | 9913 |    10.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+

Using join buffer (Batched Key Access)

表关联时，使用了BKA优化。和MRR类似，BKA也是为了减少查询的随机IO的数量。

mysql> explain select /*+ BKA(tb) */ * 
  from tab ta, tab tb where ta.a = tb.a;


+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref      | rows | filtered | Extra                                  |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
|  1 | SIMPLE      | ta    | ALL  | idx_abc       | NULL    | NULL    | NULL     | 9913 |   100.00 | NULL                                   |
|  1 | SIMPLE      | tb    | ref  | idx_abc       | idx_abc | 4       | rep.ta.a | 3304 |   100.00 | Using join buffer (Batched Key Access) |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+

Using MRR

查询使用了MRR（Multi-Range Read），MRR主要是为了减少回表查询数据时随机IO的数量。下面这个例子中使用了BKA提示，强制优化器使用MRR。

mysql> explain select /*+ BKA(tab) */ * 
  from tab 
  where a=1 and b in (1,2,3);


+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 8       | NULL |    3 |   100.00 | Using index condition; Using MRR |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+

Using sort_union(…), Using union(…), Using intersect(…)

mysql> explain select * 
  from t_merge 
  where b=2 and c=2 and d=1;
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+
| id | select_type | table   | type        | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                       |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_bd,idx_cd | idx_cd,idx_bd | 8,8     | NULL |    2 |    66.99 | Using intersect(idx_cd,idx_bd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+




mysql> explain select * 
  from t_merge 
  where (b=2 and d=1) or (c=2 and d=1);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+
| id | select_type | table   | type        | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                   |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8     | NULL |  108 |   100.00 | Using union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+




mysql> explain select * 
  from t_merge 
  where (b=2 and d between 1 and 2) or (d=1 and c=2);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
| id | select_type | table   | type        | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                        |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8     | NULL |  165 |   100.00 | Using sort_union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+

总结

出于完整性的考虑，这一讲中我们使用了四十多个SQL语句，演示并解释了MySQL执行计划的各种输出。在实际工作中，一般也不会遇到这里的每一种情况。

执行计划有几个信息要重点关注：

通过ID、SELECT_TYPE、TABLE这几列可以了解语句整体的连接、嵌套结构。
TYPE列为ref、range时，才是我们平时说的用到了索引。type为index时，实际上是使用了全索引扫描。
ROWS列是优化器评估的需要从到存储引擎里访问的记录的数量，这个数量对性能有直接的影响。
EXTRA列里面提供了执行计划的额外信息，对这里出现的内容要有大致的了解。

你也可以使用FORMAT=TREE，以树的形式显示执行计划，有时候这样可能会更直观。

思考题

这一讲中有这么一个SQL：

1	select id, a, (select avg(b) from tab where a=t1.a) from tab t1;

在我们的测试表中，字段A的唯一值有三个，所以理论上，最优的情况下只需要将3个不同的A的值分别传入子查询中（select avg(b) from tab where a = t1.a），并将计算结果缓存起来，这样子查询只需要执行3次。

MariaDB实际上就有这样的处理，因此在执行上面这个SQL时，速度比较快。

mariadb> select id, a, (select avg(b) from tab where a=t1.a) from tab t1 order by id;
+------+---+---------------------------------------+
| id   | a | (select avg(b) from tab where a=t1.a) |
+------+---+---------------------------------------+
|    0 | 0 |                             4999.5000 |
|    1 | 1 |                             4999.0000 |
|    2 | 2 |                             5000.0000 |
......
| 9999 | 0 |                             4999.5000 |
+------+---+---------------------------------------+
10000 rows in set (0.01 sec)

但是MySQL中，同样的表结构和数据，配置一样的服务器，执行这个SQL时，执行时间是好几个数量级。

mysql> select id, a, (select avg(b) from tab where a=t1.a) from tab t1 order by id;
+------+------+---------------------------------------+
| id   | a    | (select avg(b) from tab where a=t1.a) |
+------+------+---------------------------------------+
|    0 |    0 |                             4999.5000 |
|    1 |    1 |                             4999.0000 |
|    2 |    2 |                             5000.0000 |
......
| 9999 |    0 |                             4999.5000 |
+------+------+---------------------------------------+
10000 rows in set (4 min 17.90 sec)

对于这种情况，你会怎么解决呢？

期待你的思考，欢迎在留言区中与我交流。如果今天的课程让你有所收获，也欢迎转发给有需要的朋友。我们下节课再见！