Preface
In the early stages of application development the data volume is small, so developers focus mainly on implementing features. As production data grows, many SQL statements start to expose performance problems with an increasing impact on production, and these problematic SQL statements can become the bottleneck of overall system performance.
If you see things differently, please leave a comment to correct me so we can learn together!
SQL optimization general steps
- 1. Locate the SQL statements with poor execution efficiency, e.g. via the slow query log.
- 2. Use EXPLAIN to analyze the SQL execution plan.
Focus on the type, rows, filtered, and Extra columns.
The common type values, listed from lowest to highest efficiency, are:
- ALL: full table scan
- index: full index scan
- range: index range scan, common with <, <=, >=, BETWEEN, IN, and similar operators
- ref: non-unique index lookup, or leftmost-prefix lookup on a unique index, returning the rows matching a single value; often seen in joins
- eq_ref: like ref, but on a unique index or primary key in a join, returning at most one row
- const/system: at most one matching row; MySQL reads it once and treats its columns as constants, e.g. a primary-key or unique-index lookup
- null: MySQL returns the result without accessing any table or index
Although efficiency increases from top to bottom, the cost model still decides. Suppose there are two indexes, idx1(a, b, c) and idx2(a, c), and the SQL is select * from t where a = 1 and b in (1, 2) order by c; with idx1 the type is range, with idx2 it is ref. But when the number of rows idx2 would scan is roughly five times (or more) that of idx1, the optimizer picks idx1; otherwise it picks idx2.
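A minimal sketch of the scenario above (table and index names taken from the example; which index is chosen depends on the actual data distribution):

```sql
-- Hypothetical table matching the example above.
CREATE TABLE t (
  id INT PRIMARY KEY,
  a INT, b INT, c INT,
  KEY idx1 (a, b, c),
  KEY idx2 (a, c)
);

-- Inspect which index the optimizer picks and the resulting type:
EXPLAIN SELECT * FROM t WHERE a = 1 AND b IN (1, 2) ORDER BY c;
-- type = range if idx1 is chosen, ref if idx2 is chosen,
-- depending on the estimated row counts for each plan.
```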
Extra
- Using filesort: MySQL needs an extra pass to return rows in sorted order. It scans the rows according to the join type, saving the sort key and a row pointer for every row matching the WHERE clause, then sorts the keys and retrieves the rows in sorted order.
- Using temporary: a temporary table is used to hold intermediate results; performance is particularly poor and this needs to be optimized
- Using index: the SELECT is served by a covering index, avoiding access to the table's data rows, which is efficient. If Using where appears at the same time, it means not all conditions could be resolved by the index lookup alone.
- Using index condition: new in MySQL 5.6; Using index condition means ICP (Index Condition Pushdown) is in effect: filtering is done at the storage-engine layer using the index, rather than at the server layer, reducing the number of rows that have to be read back from the table.
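A sketch of how Using index shows up, using a hypothetical orders table:

```sql
-- Hypothetical table; the composite index covers the first query.
CREATE TABLE orders (
  id INT PRIMARY KEY,
  shop_id INT, order_no VARCHAR(32), amount DECIMAL(10, 2),
  KEY idx_shop_orderno (shop_id, order_no)
);

-- Only indexed columns are selected, so Extra shows "Using index"
-- (covering index, no table row access):
EXPLAIN SELECT shop_id, order_no FROM orders WHERE shop_id = 1;

-- Selecting a non-indexed column forces a lookup back to the row,
-- so the covering-index optimization no longer applies:
EXPLAIN SELECT amount FROM orders WHERE shop_id = 1;
```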
show profile analysis
Shows the states the executing thread passes through and the time spent in each.
It is disabled by default; enable it with set profiling = 1; (note that SHOW PROFILE has been deprecated since MySQL 5.7 in favor of the Performance Schema).
SHOW PROFILES ;
SHOW PROFILE FOR QUERY #{id};
trace
The trace shows how the optimizer chooses an execution plan. Through the trace output we can see why the optimizer chose execution plan A instead of execution plan B.
set optimizer_trace="enabled=on";
set optimizer_trace_max_mem_size=1000000;
select * from information_schema.optimizer_trace;
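A complete trace session, sketched (the traced query is a hypothetical placeholder; remember to turn the trace off afterwards, since it consumes memory per session):

```sql
SET optimizer_trace = "enabled=on";
SET optimizer_trace_max_mem_size = 1000000;

-- Run the statement you want to analyze, then read its trace:
SELECT * FROM t WHERE a = 1;  -- hypothetical query
SELECT * FROM information_schema.optimizer_trace;

-- Disable tracing when done:
SET optimizer_trace = "enabled=off";
```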
Identify the problem and adopt corresponding measures
- Optimize the indexes
- Optimize the SQL statement: rewrite the SQL, split IN queries into batches, split time-range queries into segments, paginate by filtering from the last row seen
- Use another implementation: ES, a data warehouse, etc.
- Handle data fragmentation
Scenario analysis
Case 1: leftmost-prefix match
index
KEY `idx_shopid_orderno` (`shop_id`,`order_no`)
SQL statement
select * from _t where order_no=''
Index matching works from left to right. For this query to use the index via order_no, either the query condition must also carry shop_id, or the index (shop_id, order_no) must be reordered.
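Two possible fixes, sketched (either satisfy the leftmost prefix, or give order_no its own leading index):

```sql
-- Option 1: include the leading index column in the condition.
SELECT * FROM _t WHERE shop_id = 1 AND order_no = '';

-- Option 2: add an index led by order_no.
ALTER TABLE _t ADD KEY idx_orderno (order_no);
```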
Case 2: implicit conversion
index
KEY `idx_mobile` (`mobile`)
SQL statement
select * from _user where mobile=12345678901
An implicit conversion is equivalent to applying an operation to the indexed column, which disables the index. mobile is a character column, but a numeric literal is passed; MySQL then performs an implicit conversion on the column, invalidating the index, so a string literal should be used for the match instead.
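The fix is simply to match the column's type with a string literal:

```sql
-- mobile is a character column: quote the value so no implicit
-- cast is applied to the indexed column.
SELECT * FROM _user WHERE mobile = '12345678901';
```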
Case 3: large-offset paging
index
KEY `idx_a_b_c` (`a`, `b`, `c`)
SQL statement
select * from _t where a = 1 and b = 2 order by c desc limit 10000, 10;
For large-offset paging scenarios, first try to remove the requirement at the product level. If it cannot be removed, there are two optimizations:
One is to pass in the last row of the previous page, i.e. the value of c above, and filter with "c < xxx" (keyset pagination); but this usually requires changing the interface contract, which is not always feasible.
The other is a deferred join, which reduces how often SQL goes back to the table; note that the index must fully cover the filter and sort columns for this to help. The SQL becomes:
SELECT
t1.*
FROM
_t t1,
( SELECT id FROM _t WHERE a = 1 AND b = 2 ORDER BY c DESC LIMIT 10000, 10 ) t2
WHERE
t1.id = t2.id;
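The first approach (keyset pagination) can be sketched as follows, where :last_c stands for the c value of the last row of the previous page (a hypothetical placeholder, not real client syntax):

```sql
-- Fetch the next page directly from the index, skipping the
-- large OFFSET entirely. :last_c comes from the previous page.
SELECT * FROM _t
WHERE a = 1 AND b = 2 AND c < :last_c
ORDER BY c DESC
LIMIT 10;
```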
Case 4: in + order by
index
KEY `idx_shopid_status_created` (`shop_id`, `order_status`, `created_at`)
SQL statement
SELECT
*
FROM
_order
WHERE
shop_id = 1
AND order_status IN ( 1, 2, 3 )
ORDER BY
created_at DESC
LIMIT 10
At the bottom layer, MySQL executes an IN query as an n*m search, similar to a UNION, but more efficient than UNION.
When estimating the cost of an IN query (cost = number of tuples * average IO), the number of tuples is obtained by probing the index for each value in the IN list one by one, which is relatively slow, so MySQL defines a threshold, eq_range_index_dive_limit. Since 5.6, once the IN list exceeds this threshold, the per-value probes no longer participate in the cost calculation.
As a result, plan selection can become inaccurate. The default is 200, i.e. once the IN condition has more than 200 values, the cost estimate for IN degrades and MySQL may choose the wrong index.
The fix: swap order_status and created_at in the index, and adjust the SQL to a deferred join.
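A sketch of that fix (index order swapped, plus a deferred join; table and index names follow the example above):

```sql
-- Swap created_at before order_status so the index delivers rows
-- already sorted by created_at; order_status is filtered via ICP.
ALTER TABLE _order ADD KEY idx_shopid_created_status
  (shop_id, created_at, order_status);

-- Deferred join: page over ids via the index, then fetch full rows.
SELECT o.*
FROM _order o,
     ( SELECT id FROM _order
       WHERE shop_id = 1 AND order_status IN (1, 2, 3)
       ORDER BY created_at DESC LIMIT 10 ) t
WHERE o.id = t.id;
```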
Case 5: a range query blocks index use for the columns after it
index
KEY `idx_shopid_created_status` (`shop_id`, `created_at`, `order_status`)
SQL statement
SELECT
*
FROM
_order
WHERE
shop_id = 1
AND created_at > '2021-01-01 00:00:00'
AND order_status = 10
Range predicates also include IN and BETWEEN. Here the range on created_at blocks index use for order_status, which comes after it in the index.
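A sketch of the fix: put the equality column before the range column so both can use the index:

```sql
-- Equality columns first, range column (created_at) last.
ALTER TABLE _order ADD KEY idx_shopid_status_created
  (shop_id, order_status, created_at);

SELECT * FROM _order
WHERE shop_id = 1
  AND order_status = 10
  AND created_at > '2021-01-01 00:00:00';
```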
Case 6: negative conditions (not equal, not in) cannot use the index for fast lookups (ICP can still apply)
select * from _order where shop_id=1 and order_status not in (1,2)
select * from _order where shop_id=1 and order_status != 1
On indexed columns, avoid NOT, !=, <>, !<, !>, NOT EXISTS, NOT IN, NOT LIKE, etc.
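When the full set of status values is known, a negative condition can be rewritten as a positive IN list, sketched here with assumed status values:

```sql
-- Instead of: order_status NOT IN (1, 2)
-- enumerate the statuses you do want (values 3..5 are assumed):
SELECT * FROM _order
WHERE shop_id = 1
  AND order_status IN (3, 4, 5);
```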
Case 7: the optimizer chooses not to use the index
If the amount of data to be accessed is small, the optimizer will still choose the secondary index; but when the accessed rows account for a large share of the whole table (typically around 20%), the optimizer switches to finding the data through the clustered index instead.
select * from _order where order_status = 1
Fetch all unpaid orders. Such orders are generally rare; even so, with an index built on order_status the optimizer may still decline to use it when the matching rows make up too large a share of the table.
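If you are certain the qualifying rows are few, the optimizer's choice can be overridden, though this should be a last resort (idx_order_status is a hypothetical index name):

```sql
-- FORCE INDEX overrides the optimizer's cost-based choice.
SELECT * FROM _order FORCE INDEX (idx_order_status)
WHERE order_status = 1;
```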
Case 8: complex queries
select sum(amt) from _t where a = 1 and b in (1, 2, 3) and c > '2020-01-01';
select * from _t where a = 1 and b in (1, 2, 3) and c > '2020-01-01' limit 10;
If the goal is to compute statistics over the data, consider solving it with a data warehouse;
if the business genuinely needs such complex queries, it may be better not to keep pushing them through SQL, but to solve them another way, for example with ES.
Case 9: mixing asc and desc
select * from _t where a=1 order by b desc, c asc
Mixing DESC and ASC in one ORDER BY prevents the index from being used for sorting (MySQL 8.0's descending indexes lift this limitation).
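From MySQL 8.0 onward, a descending index matching the mixed sort order can serve this query, sketched here with a hypothetical index name:

```sql
-- MySQL 8.0+: index key parts match the mixed ORDER BY directions.
ALTER TABLE _t ADD KEY idx_a_bdesc_casc (a, b DESC, c ASC);

SELECT * FROM _t WHERE a = 1 ORDER BY b DESC, c ASC;
```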
Case 10: big data volumes
For the push service's data storage, the volume can be very large. After weighing the options, MySQL was chosen in the end, retaining data with a 7-day expiry.
Note that frequently deleting expired data leaves the table fragmented, so contact the DBA to defragment it.
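Defragmentation after heavy deletes can be done with OPTIMIZE TABLE; for InnoDB this rebuilds the table, so run it in a maintenance window (_push_msg is a hypothetical table name):

```sql
-- Rebuild the table to reclaim space left by deleted rows.
OPTIMIZE TABLE _push_msg;
```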
Author: Wolf Lord
cnblogs.com/powercto/p/14410128.html