MySQL 8.0 Reference Manual（读书笔记49节--Optimizing SELECT Statements(4)）-摩杜云开发者社区

11 Multi-Range Read Optimization

Reading rows using a range scan on a secondary index can result in many random disk accesses to the base table when the table is large and not stored in the storage engine's cache. With the Disk-Sweep Multi-Range Read (MRR) optimization, MySQL tries to reduce the number of random disk access for range scans by first scanning the index only and collecting the keys for the relevant rows. Then the keys are sorted and finally the rows are retrieved from the base table using the order of the primary key. The motivation【ˌmoʊtɪˈveɪʃn 动机;动力;诱因;】 for Disk-sweep MRR is to reduce the number of random disk accesses and instead achieve【əˈtʃiːv 实现;(凭长期努力)达到(某目标、地位、标准);完成;成功;】 a more sequential scan of the base table data.

The Multi-Range Read optimization provides these benefits:

• MRR enables data rows to be accessed sequentially rather than in random order, based on index tuples. The server obtains a set of index tuples that satisfy the query conditions, sorts them according to data row ID order, and uses the sorted tuples to retrieve data rows in order. This makes data access more efficient and less expensive.

• MRR enables batch processing of requests for key access for operations that require access to data rows through index tuples, such as range index scans and equi-joins that use an index for the join attribute. MRR iterates over a sequence of index ranges to obtain qualifying index tuples. As these results accumulate, they are used to access the corresponding data rows. It is not necessary to acquire all index tuples before starting to read data rows.

The MRR optimization is not supported with secondary indexes created on virtual generated columns. InnoDB supports secondary indexes on virtual generated columns.

The following scenarios illustrate【ˈɪləstreɪt 说明;(用示例、图画等)解释;显示…存在;加插图于;表明…真实;给(书等)做图表;】 when MRR optimization can be advantageous:

Scenario A: MRR can be used for InnoDB and MyISAM tables for index range scans and equi-join operations.

1. A portion of the index tuples are accumulated in a buffer.

2. The tuples in the buffer are sorted by their data row ID.

3. Data rows are accessed according to the sorted index tuple sequence.

Scenario B: MRR can be used for NDB tables for multiple-range index scans or when performing an equi-join by an attribute.

1. A portion of ranges, possibly single-key ranges, is accumulated in a buffer on the central node where the query is submitted.

2. The ranges are sent to the execution nodes that access data rows.

3. The accessed rows are packed into packages and sent back to the central node.

4. The received packages with data rows are placed in a buffer.

5. Data rows are read from the buffer.

When MRR is used, the Extra column in EXPLAIN output shows Using MRR.

InnoDB and MyISAM do not use MRR if full table rows need not be accessed to produce the query result. This is the case if results can be produced entirely on the basis on information in the index tuples (through a covering index); MRR provides no benefit.

Two optimizer_switch system variable flags provide an interface to the use of MRR optimization. The mrr flag controls whether MRR is enabled. If mrr is enabled (on), the mrr_cost_based flag controls whether the optimizer attempts to make a cost-based choice between using and not using MRR (on) or uses MRR whenever possible (off). By default, mrr is on and mrr_cost_based is on.

For MRR, a storage engine uses the value of the read_rnd_buffer_size system variable as a guideline for how much memory it can allocate for its buffer. The engine uses up to read_rnd_buffer_size bytes and determines the number of ranges to process in a single pass.

12 Block Nested-Loop and Batched Key Access Joins

In MySQL, a Batched Key Access (BKA) Join algorithm is available that uses both index access to the joined table and a join buffer. The BKA algorithm supports inner join, outer join, and semijoin operations, including nested outer joins. Benefits of BKA include improved join performance due to more efficient table scanning. Also, the Block Nested-Loop (BNL) Join algorithm previously used only for inner joins is extended and can be employed for outer join and semijoin operations, including nested outer joins.

The following sections discuss the join buffer management that underlies the extension of the original BNL algorithm, the extended BNL algorithm, and the BKA algorithm.

12.1 Join Buffer Management for Block Nested-Loop and Batched Key Access Algorithms

MySQL can employ join buffers to execute not only inner joins without index access to the inner table, but also outer joins and semijoins【半连接;】 that appear after subquery flattening. Moreover, a join buffer can be effectively used when there is an index access to the inner table.

The join buffer management code slightly more efficiently utilizes【ˈjuːtəlaɪzɪz 利用;使用;应用;运用;】 join buffer space when storing the values of the interesting row columns: No additional bytes are allocated in buffers for a row column if its value is NULL, and the minimum number of bytes is allocated for any value of the VARCHAR type.

The code supports two types of buffers, regular and incremental. Suppose that join buffer B1 is employed to join tables t1 and t2 and the result of this operation is joined with table t3 using join buffer B2:

• A regular join buffer contains columns from each join operand. If B2 is a regular join buffer, each row r put into B2 is composed of the columns of a row r1 from B1 and the interesting columns of a matching row r2 from table t3.

• An incremental join buffer contains only columns from rows of the table produced by the second join operand. That is, it is incremental to a row from the first operand buffer. If B2 is an incremental join buffer, it contains the interesting columns of the row r2 together with a link to the row r1 from B1.

Incremental join buffers are always incremental relative to a join buffer from an earlier join operation, so the buffer from the first join operation is always a regular buffer. In the example just given, the buffer B1 used to join tables t1 and t2 must be a regular buffer.

Each row of the incremental buffer used for a join operation contains only the interesting columns of a row from the table to be joined. These columns are augmented with a reference to the interesting columns of the matched row from the table produced by the first join operand. Several rows in the incremental buffer can refer to the same row r whose columns are stored in the previous join buffers insofar as all these rows match row r.

Incremental buffers enable less frequent copying of columns from buffers used for previous join operations. This provides a savings in buffer space because in the general case a row produced by the first join operand can be matched by several rows produced by the second join operand. It is unnecessary to make several copies of a row from the first operand. Incremental buffers also provide a savings in processing time due to the reduction in copying time.

In MySQL 8.0, the block_nested_loop flag of the optimizer_switch system variable works as follows:

• Prior to MySQL 8.0.20, it controls how the optimizer uses the Block Nested Loop join algorithm.

• In MySQL 8.0.18 and later, it also controls the use of hash joins.

• Beginning with MySQL 8.0.20, the flag controls hash joins only, and the block nested loop algorithm is no longer supported.

The batched_key_access flag controls how the optimizer uses the Batched Key Access join algorithms.

By default, block_nested_loop is on and batched_key_access is off.

12.2 Block Nested-Loop Algorithm for Outer Joins and Semijoins

The original implementation【ˌɪmpləmɛnˈteɪʃən 实施;执行;完成;贯彻;工具;生效;仪器;】 of the MySQL BNL algorithm was extended to support outer join and semijoin operations (and was later superseded by the hash join algorithm).

When these operations are executed with a join buffer, each row put into the buffer is supplied with a match flag.

If an outer join operation is executed using a join buffer, each row of the table produced by the second operand is checked for a match against each row in the join buffer. When a match is found, a new extended row is formed (the original row plus columns from the second operand) and sent for further extensions by the remaining join operations. In addition, the match flag of the matched row in the buffer is enabled. After all rows of the table to be joined have been examined, the join buffer is scanned. Each row from the buffer that does not have its match flag enabled is extended by NULL complements (NULL values for each column in the second operand) and sent for further extensions by the remaining join operations.

In MySQL 8.0, the block_nested_loop flag of the optimizer_switch system variable works as follows:

• Prior to MySQL 8.0.20, it controls how the optimizer uses the Block Nested Loop join algorithm.

• In MySQL 8.0.18 and later, it also controls the use of hash joins.

• Beginning with MySQL 8.0.20, the flag controls hash joins only, and the block nested loop algorithm is no longer supported.

In EXPLAIN output, use of BNL for a table is signified when the Extra value contains Using join buffer (Block Nested Loop) and the type value is ALL, index, or range.

12.3 Batched Key Access Joins

MySQL implements a method of joining tables called the Batched Key Access (BKA) join algorithm. BKA can be applied when there is an index access to the table produced by the second join operand. Like the BNL join algorithm, the BKA join algorithm employs a join buffer to accumulate the interesting columns of the rows produced by the first operand of the join operation. Then the BKA algorithm builds keys to access the table to be joined for all rows in the buffer and submits these keys in a batch to the database engine for index lookups. The keys are submitted to the engine through the Multi-Range Read (MRR) interface.After submission【səbˈmɪʃn 提交;(向法官提出的)看法，意见;屈服;投降;呈递;归顺;提交(或呈递)的文件、建议等;】 of the keys, the MRR engine functions perform lookups in the index in an optimal way, fetching the rows of the joined table found by these keys, and starts feeding the BKA join algorithm with matching rows. Each matching row is coupled with a reference to a row in the join buffer.

When BKA is used, the value of join_buffer_size defines how large the batch of keys is in each request to the storage engine. The larger the buffer, the more sequential access is made to the right hand table of a join operation, which can significantly improve performance.

For BKA to be used, the batched_key_access flag of the optimizer_switch system variable must be set to on. BKA uses MRR, so the mrr flag must also be on. Currently, the cost estimation for MRR is too pessimistic. Hence, it is also necessary for mrr_cost_based to be off for BKA to be used. The following setting enables BKA:

mysql> SET optimizer_switch='mrr=on,mrr_cost_based=off,batched_key_access=on';

There are two scenarios【sɪˈnɛrioʊz 方案;设想;预测;(电影或戏剧的)剧情梗概;】 by which MRR functions execute:

• The first scenario is used for conventional disk-based storage engines such as InnoDB and MyISAM. For these engines, usually the keys for all rows from the join buffer are submitted to the MRR interface at once. Engine-specific MRR functions perform index lookups for the submitted keys, get row IDs (or primary keys) from them, and then fetch rows for all these selected row IDs one by one by request from BKA algorithm. Every row is returned with an association reference that enables access to the matched row in the join buffer. The rows are fetched by the MRR functions in an optimal way: They are fetched in the row ID (primary key) order. This improves performance because reads are in disk order rather than random order.

• The second scenario is used for remote storage engines such as NDB. A package of keys for a portion【ˈpɔːrʃn 部分;(食物的)一份，一客;分享的部分;分担的责任;】 of rows from the join buffer, together with their associations, is sent by a MySQL Server (SQL node) to MySQL Cluster data nodes. In return, the SQL node receives a package (or several packages) of matching rows coupled with corresponding associations. The BKA join algorithm takes these rows and builds new joined rows. Then a new set of keys is sent to the data nodes and the rows from the returned packages are used to build new joined rows. The process continues until the last keys from the join buffer are sent to the data nodes, and the SQL node has received and joined all rows matching these keys. This improves performance because fewer key-bearing packages sent by the SQL node to the data nodes means fewer round trips between it and the data nodes to perform the join operation.

With the first scenario, a portion of the join buffer is reserved to store row IDs (primary keys) selected by index lookups and passed as a parameter to the MRR functions.

There is no special buffer to store keys built for rows from the join buffer. Instead, a function that builds the key for the next row in the buffer is passed as a parameter to the MRR functions.

In EXPLAIN output, use of BKA for a table is signified when the Extra value contains Using join buffer (Batched Key Access) and the type value is ref or eq_ref.

12.4 Optimizer Hints for Block Nested-Loop and Batched Key Access Algorithms

In addition to using the optimizer_switch system variable to control optimizer use of the BNL and BKA algorithms session-wide, MySQL supports optimizer hints to influence the optimizer on a per-statement basis.

To use a BNL or BKA hint to enable join buffering for any inner table of an outer join, join buffering must be enabled for all inner tables of the outer join.

13 Condition Filtering

In join processing, prefix【priːfɪks 前缀(缀于单词前以改变其意义的字母或字母组合);(人名前的)称谓;前置代号(置于前面的单词或字母、数字);】 rows are those rows passed from one table in a join to the next. In general, the optimizer attempts to put tables with low prefix counts early in the join order to keep the number of row combinations from increasing rapidly【'ræpɪdli 迅速;迅速地;高速;急速地;急促;】. To the extent that the optimizer can use information about conditions on rows selected from one table and passed to the next, the more accurately【ˈækjərətli 准确地;准确地，精确；正确，正确地;】 it can compute row estimates and choose the best execution plan.

Without condition filtering, the prefix row count for a table is based on the estimated number of rows selected by the WHERE clause according to whichever access method the optimizer chooses. Condition filtering enables the optimizer to use other relevant conditions in the WHERE clause not taken into account by the access method, and thus improve its prefix row count estimates. For example, even though there might be an index-based access method that can be used to select rows from the current table in a join, there might also be additional conditions for the table in the WHERE clause that can filter (further restrict) the estimate for qualifying rows passed to the next table.

A condition contributes to the filtering estimate only if:

• It refers to the current table.

• It depends on a constant value or values from earlier tables in the join sequence.

• It was not already taken into account by the access method.

In EXPLAIN output, the rows column indicates the row estimate for the chosen access method, and the filtered column reflects the effect of condition filtering. filtered values are expressed as percentages【pərˈsɛntɪdʒɪz 百分比;百分率;提成;利润的分成;】. The maximum value is 100, which means no filtering of rows occurred. Values decreasing from 100 indicate increasing amounts of filtering.

The prefix row count (the number of rows estimated to be passed from the current table in a join to the next) is the product of the rows and filtered values. That is, the prefix row count is the estimated row count, reduced by the estimated filtering effect. For example, if rows is 1000 and filtered is 20%, condition filtering reduces the estimated row count of 1000 to a prefix row count of 1000 × 20% = 1000 × .2 = 200.

Consider the following query:

SELECT *
 FROM employee JOIN department ON employee.dept_no = department.dept_no
 WHERE employee.first_name = 'John'
 AND employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01';

Suppose that the data set has these characteristics:

• The employee table has 1024 rows.

• The department table has 12 rows.

• Both tables have an index on dept_no.

• The employee table has an index on first_name.

• 8 rows satisfy this condition on employee.first_name:

employee.first_name = 'John'

• 150 rows satisfy this condition on employee.hire_date:

employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01'

• 1 row satisfies both conditions:

employee.first_name = 'John'
AND employee.hire_date BETWEEN '2018-01-01' AND '2018-06-01'

Without condition filtering, EXPLAIN produces output like this:

+----+------------+--------+------------------+---------+---------+------+----------+
| id | table      | type   | possible_keys    | key     | ref     | rows | filtered |
+----+------------+--------+------------------+---------+---------+------+----------+
| 1  | employee   | ref    | name,h_date,dept | name    | const   | 8    | 100.00   |
| 1  | department | eq_ref | PRIMARY          | PRIMARY | dept_no | 1    | 100.00   |
+----+------------+--------+------------------+---------+---------+------+----------+

For employee, the access method on the name index picks up the 8 rows that match a name of 'John'. No filtering is done (filtered is 100%), so all rows are prefix rows for the next table: The prefix row count is rows × filtered = 8 × 100% = 8.

With condition filtering, the optimizer additionally takes into account conditions from the WHERE clause not taken into account by the access method. In this case, the optimizer uses heuristics【[hjuˈrɪstɪks 启发式;探索法;】 to estimate a filtering effect of 16.31% for the BETWEEN condition on employee.hire_date. As a result, EXPLAIN produces output like this:

+----+------------+--------+------------------+---------+---------+------+----------+
| id | table      | type   | possible_keys    | key     | ref     | rows | filtered |
+----+------------+--------+------------------+---------+---------+------+----------+
| 1  | employee   | ref    | name,h_date,dept | name    | const   | 8    | 16.31    |
| 1  | department | eq_ref | PRIMARY          | PRIMARY | dept_no | 1    | 100.00   |
+----+------------+--------+------------------+---------+---------+------+----------+

Now the prefix row count is rows × filtered = 8 × 16.31% = 1.3, which more closely reflects actual data set.

Normally, the optimizer does not calculate the condition filtering effect (prefix row count reduction) for the last joined table because there is no next table to pass rows to. An exception occurs for EXPLAIN: To provide more information, the filtering effect is calculated for all joined tables, including the last one.

To control whether the optimizer considers additional filtering conditions, use the condition_fanout_filter flag of the optimizer_switch system variable. This flag is enabled by default but can be disabled to suppress condition filtering (for example, if a particular query is found to yield better performance without it).

If the optimizer overestimates【ˌoʊvərˈestɪmeɪts 高估;】 the effect of condition filtering, performance may be worse than if condition filtering is not used. In such cases, these techniques may help:

• If a column is not indexed, index it so that the optimizer has some information about the distribution of column values and can improve its row estimates.

• Similarly, if no column histogram【ˈhɪstəɡræm 直方图;（统计学的）直方图，矩形图;】 information is available, generate a histogram.

• Change the join order. Ways to accomplish this include join-order optimizer hints，STRAIGHT_JOIN immediately following the SELECT, and the STRAIGHT_JOIN join operator.

• Disable condition filtering for the session:

SET optimizer_switch = 'condition_fanout_filter=off';

Or, for a given query, using an optimizer hint:

SELECT /*+ SET_VAR(optimizer_switch = 'condition_fanout_filter=off') */ ...

14 Constant-Folding Optimization

Comparisons between constants and column values in which the constant value is out of range or of the wrong type with respect to the column type are now handled once during query optimization rather row-by-row than during execution. The comparisons that can be treated in this manner are >, >=, <, <=, <>/!=, =, and <=>.

Consider the table created by the following statement:

CREATE TABLE t (c TINYINT UNSIGNED NOT NULL);

The WHERE condition in the query SELECT * FROM t WHERE c < 256 contains the integral constant 256 which is out of range for a TINYINT UNSIGNED column. Previously, this was handled by treating both operands as the larger type, but now, since any allowed value for c is less than the constant, the WHERE expression can instead be folded as WHERE 1, so that the query is rewritten as SELECT * FROM t WHERE 1.

This makes it possible for the optimizer to remove the WHERE expression altogether. If the column c were nullable (that is, defined only as TINYINT UNSIGNED) the query would be rewritten like this:

SELECT * FROM t WHERE ti IS NOT NULL

Folding is performed for constants compared to supported MySQL column types as follows:

• Integer column type. Integer types are compared with constants of the following types as described here:

• Integer value. If the constant is out of range for the column type, the comparison is folded to 1 or IS NOT NULL, as already shown.

If the constant is a range boundary【ˈbaʊndri 边界;界限;分界线;使球越过边界线的击球(得加分);】, the comparison is folded to =. For example (using the same table as already defined):

mysql> EXPLAIN SELECT * FROM t WHERE c >= 255;
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: t
 partitions: NULL
 type: ALL
possible_keys: NULL
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 5
 filtered: 20.00
 Extra: Using where
1 row in set, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
*************************** 1. row ***************************
 Level: Note
 Code: 1003
Message: /* select#1 */ select `test`.`t`.`ti` AS `ti` from `test`.`t` where (`test`.`t`.`ti` = 255)
1 row in set (0.00 sec)

• Floating- or fixed-point value. If the constant is one of the decimal types (such as DECIMAL, REAL, DOUBLE, or FLOAT) and has a nonzero decimal portion, it cannot be equal; fold accordingly. For other comparisons, round up or down to an integer value according to the sign, then perform a range check and handle as already described for integer-integer comparisons.

A REAL value that is too small to be represented as DECIMAL is rounded to .01 or -.01 depending on the sign, then handled as a DECIMAL.

• String types. Try to interpret the string value as an integer type, then handle the comparison as between integer values. If this fails, attempt to handle the value as a REAL.

• DECIMAL or REAL column. Decimal types are compared with constants of the following types as described here:

Integer value. Perform a range check against the column value's integer part. If no folding results, convert the constant to DECIMAL with the same number of decimal places as the column value, then check it as a DECIMAL (see next).
DECIMAL or REAL value. Check for overflow (that is, whether the constant has more digits in its integer part than allowed for the column's decimal type). If so, fold. If the constant has more significant fractional digits than column's type, truncate the constant. If the comparison operator is = or <>, fold. If the operator is >= or <=, adjust the operator due to truncation. For example, if column's type is DECIMAL(3,1), SELECT * FROM t WHERE f >= 10.13 becomes SELECT * FROM t WHERE f > 10.1. If the constant has fewer decimal digits than the column's type, convert it to a constant with same number of digits. For underflow of a REAL value (that is, too few fractional digits to represent it), convert the constant to decimal 0.
String value. If the value can be interpreted as an integer type, handle it as such. Otherwise, try to handle it as REAL.

• FLOAT or DOUBLE column. FLOAT(m,n) or DOUBLE(m,n) values compared with constants are handled as follows:

If the value overflows the range of the column, fold.

If the value has more than n decimals, truncate, compensating during folding. For = and <> comparisons, fold to TRUE, FALSE, or IS [NOT] NULL as described previously; for other operators, adjust the operator.

If the value has more than m integer digits, fold.

Limitations. This optimization cannot be used in the following cases:

1. With comparisons using BETWEEN or IN.

2. With BIT columns or columns using date or time types.

3. During the preparation phase for a prepared statement, although it can be applied during the optimization phase when the prepared statement is actually executed. This due to the fact that, during statement preparation, the value of the constant is not yet known.