MySQL Optimization Learning Notes (3)

To be honest, I prepared the material for this part a long time ago but did not know how to organize it. I finally worked out a structure last night.
Unless stated otherwise, the articles in this series discuss the InnoDB storage engine.

Before starting to read this article, it is recommended to read:

SQL query model and troubleshooting
The SQL chapter of LeetCode's four chapters (1)
MySQL Optimization Learning Notes (1)
MySQL Optimization Learning Notes (2)

Introducing the buffer pool

Introduction

We have already covered what happens when a SQL statement is sent from the client to the MySQL server. If that is unfamiliar, I recommend reading "MySQL Optimization Learning Notes (1)" first.

So far we have taken a macro view: after a SQL statement reaches the MySQL server, the storage engine fetches the corresponding data from disk. But disk is very slow compared with memory, even when the disk is a solid state drive, so going to disk for every read would be too slow. For tables that use InnoDB as the storage engine, MySQL's developers therefore designed a cache so that data does not have to be fetched from disk every time. When InnoDB handles a client request and needs to access a record in a page, it loads the complete page into memory; even if we only need a single record, the entire page is loaded. Once the page is in memory it can be read and written. Afterwards the memory occupied by the page is not released right away. The page stays cached, so that a later request for the same page does not have to go to disk again.

To cache pages from disk, the MySQL server requests a block of contiguous memory from the operating system when it starts. This block of memory is called the buffer pool.

SHOW ENGINE INNODB STATUS;

You can see the basic status of the buffer pool:

[Figure: 缓存池.png (buffer pool status)]

By default, a cache page in the Buffer Pool is the same size as a page on disk: 16KB. Each cache page has some control information associated with it, including the page number, the address of the cache page in the Buffer Pool, and so on. The control information takes up the same amount of memory for every cache page, and we call that memory a control block. Control blocks and cache pages correspond one to one and both live in the buffer pool, with the control blocks placed before the cache pages, like this:

[Figure: 缓存页与控制块.png (cache pages and control blocks)]

What is the fragment in the figure? After the allocated memory has been divided into pairs of control blocks and cache pages, the space left over may be too small for another pair; that leftover is the fragment. As an extreme example, assume a control block takes as much memory as a cache page (16KB each) and the buffer pool is only 57KB. That is enough for just one control block and cache page pair (32KB), and the remaining 25KB is a memory fragment.
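The 57KB example can be checked with a few lines of Python. This is only an arithmetic sketch: the function name `split_pool` is made up for illustration, and the 16KB control block is the simplifying assumption from the example, not the real control block size.

```python
KB = 1024


def split_pool(pool_bytes, page_bytes, control_block_bytes):
    """Return (number of control block + cache page pairs, leftover bytes)."""
    pair_bytes = page_bytes + control_block_bytes
    pairs = pool_bytes // pair_bytes
    return pairs, pool_bytes - pairs * pair_bytes


# Hypothetical 57KB pool, with control block and cache page both 16KB.
pairs, fragment = split_pool(57 * KB, 16 * KB, 16 * KB)
print(pairs, fragment // KB)  # prints: 1 25
```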

Each control block is roughly 5% of the size of a cache page; in MySQL 5.7.21 it is 808 bytes. The innodb_buffer_pool_size we set does not include the memory occupied by control blocks. In other words, the contiguous memory that InnoDB requests from the operating system for the Buffer Pool is generally about 5% larger than the value of innodb_buffer_pool_size. The size of the Buffer Pool can be specified in the MySQL configuration file my.ini. Here is the Buffer Pool configuration of my MySQL installation:
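To see where the "about 5%" comes from, here is a rough estimate in Python. It uses only the two figures quoted above (16KB pages, 808-byte control blocks in MySQL 5.7.21); the function name is hypothetical and the calculation ignores any other bookkeeping memory InnoDB may allocate.

```python
PAGE_BYTES = 16 * 1024        # default page size from the text
CONTROL_BLOCK_BYTES = 808     # control block size quoted for MySQL 5.7.21


def control_block_overhead(pool_size_bytes):
    """Estimate the extra memory needed for control blocks."""
    n_pages = pool_size_bytes // PAGE_BYTES
    return n_pages * CONTROL_BLOCK_BYTES


pool = 8 * 1024 * 1024                      # innodb_buffer_pool_size=8M
overhead = control_block_overhead(pool)     # 512 pages * 808 bytes = 413696
ratio_pct = round(overhead / pool * 100, 2) # 4.93
```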

InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and row data. The bigger you set this the less disk I/O is needed to
access data in tables. On a dedicated database server you may set this parameter up to 80% of the machine physical memory size. Do not set it
too large, though, because competition of the physical memory may cause paging in the operating system. Note that on 32bit systems you might be limited to 2-3.5G of user level memory per process, so do not set it too high.
In other words, InnoDB uses the buffer pool to cache both index and row data, and the larger the pool, the less disk I/O is needed. On a dedicated database server this parameter can be set to as much as 80% of the machine's physical memory. Do not set it too high, though, or competition for physical memory between processes may cause the operating system to start paging. (I have forgotten most of what I knew about how operating systems manage memory and plan to relearn it this year. If you know the topic well, feel free to leave a comment.) Note that on 32-bit systems each process may be limited to 2-3.5G of user-level memory.
innodb_buffer_pool_size=8M
The total buffer pool size is 8M.
innodb_buffer_pool_instances=8
This does not mean that there are actually 8 buffer pool instances: with a pool this small the setting has no effect, as explained below.

innodb_buffer_pool_size / innodb_buffer_pool_instances is the size of each buffer pool instance. When innodb_buffer_pool_size is less than 1G, the innodb_buffer_pool_instances setting has no effect. MySQL officially recommends configuring multiple buffer pool instances when innodb_buffer_pool_size is greater than 1G.
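The rule can be written down as a small Python helper. This is a sketch, not MySQL's code: `instance_layout` is a made-up name, and it assumes that "no effect" means a single instance is used when the pool is smaller than 1G.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def instance_layout(pool_size, instances):
    """Return (effective number of instances, size of each instance)."""
    if pool_size < GB:
        instances = 1   # assumption: the setting is ignored below 1G
    return instances, pool_size // instances


small = instance_layout(8 * MB, 8)   # (1, 8M): the 8 is ignored
large = instance_layout(8 * GB, 8)   # (8, 1G per instance)
```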

The free linked list

When MySQL starts, it initializes the Buffer Pool: it first requests memory for the Buffer Pool from the operating system and then divides it into pairs of control blocks and cache pages. At this point the cache pages hold no data. As MySQL starts receiving queries, InnoDB begins fetching pages from disk, and the control blocks and cache pages start to fill up. To decide which cache page should receive a page read from disk, we need to know which cache pages are free and which are in use, and this is where the control blocks come in handy. MySQL's developers link the control blocks of all free cache pages together as nodes of a linked list, which we call the free linked list. Right after initialization every cache page is free, so the control block of every cache page is on the free linked list.

[Figure: 空闲链表.png (free linked list)]

The yellow node is the head node of the linked list. It records the first and last nodes of the list and the number of nodes currently in it. The head node does not take much memory: in MySQL 5.7.21 each head node occupies only 40 bytes. With the free linked list in place, things are easy. Whenever a page needs to be loaded from disk into the buffer pool, a free cache page is taken from the free linked list, the control block of that cache page is filled in (with the page number and similar information), and the node is removed from the free linked list to show that the cache page is now in use.
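The bookkeeping described above can be sketched in Python. This is a toy model under stated assumptions: `ControlBlock` and `FreeList` are hypothetical names, a `deque` stands in for InnoDB's linked list, and a control block here holds only a frame index and a page number.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlBlock:
    frame_index: int                # which cache page this block describes
    page_no: Optional[int] = None   # disk page held in the cache page, if any


class FreeList:
    """Free linked list: control blocks of cache pages that are not in use."""

    def __init__(self, n_frames):
        # Right after initialization every cache page is free.
        self.blocks = deque(ControlBlock(i) for i in range(n_frames))

    def __len__(self):
        return len(self.blocks)

    def take(self, page_no):
        """Claim a free cache page for disk page `page_no`."""
        if not self.blocks:
            return None             # nothing free; eviction is handled elsewhere
        block = self.blocks.popleft()
        block.page_no = page_no     # fill in the control block
        return block
```

Calling `FreeList(3).take(42)` returns the control block for frame 0 with `page_no` set to 42 and leaves two blocks on the list.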

LRU linked list

The Buffer Pool is essentially a block of contiguous memory that InnoDB requests from the operating system, and it is limited in size. We cannot load every disk page into memory, so how should we cache data pages, or in other words, what caching strategy should we use? Ideally, whenever a data page is accessed it is already in the buffer pool. The strategy used to approach that ideal is LRU (Least Recently Used): when there are no free cache pages left in the Buffer Pool, the cache pages that have been used least recently are evicted. But how do we know which cache pages have been used recently and which have not? We can implement the LRU strategy with a linked list:

When we need to visit a page:

If the page is not in the buffer pool, then when it is loaded from disk into a cache page, the control block of that cache page is inserted at the head of the linked list.
If the page is already cached in the buffer pool, the control block of that page is moved directly to the head of the linked list.

The tail of the linked list is then the least recently used cache page. When the free cache pages in the Buffer Pool run out, cache pages are evicted from the tail of the LRU linked list.
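A minimal Python sketch of this plain LRU strategy follows. It is not InnoDB's implementation: `SimpleLRU` and the `load_from_disk` callback are hypothetical, and an `OrderedDict` plays the role of the linked list, with the last entry acting as the head.

```python
from collections import OrderedDict


class SimpleLRU:
    """Plain LRU cache of pages; the last entry is the head (most recent)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page number -> page contents

    def access(self, page_no, load_from_disk):
        if page_no in self.pages:
            # Already cached: move it to the head of the list.
            self.pages.move_to_end(page_no)
            return self.pages[page_no]
        if len(self.pages) >= self.capacity:
            # No free cache page left: evict from the tail.
            self.pages.popitem(last=False)
        data = load_from_disk(page_no)
        self.pages[page_no] = data   # a newly loaded page goes to the head
        return data
```

With a capacity of 2, accessing pages 1, 2, 1, 3 evicts page 2, because page 1 was touched more recently.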

The problem is not yet fully solved. The LRU strategy is sound at a macro level, but it needs a few small patches to handle two special situations in practice:

A query that has to scan the whole table (no usable index, or no WHERE clause). Scanning the whole table means accessing every data page of the table. If the table has many records and is relatively large, every page in the buffer pool may be replaced, and other queries then have to load their pages from disk all over again. Such full table scans do not run often, but each run replaces the contents of the buffer pool, which seriously affects other queries and greatly reduces the cache hit ratio (cache hit ratio = the number of page accesses that find the page already in the buffer pool, divided by the total number of page accesses).
MySQL's prediction. Just as we try to predict an opponent's moves when playing League of Legends, MySQL makes predictions about query requests, which we call read-ahead. Read-ahead means that InnoDB loads not only the pages a query asks for but also some additional data pages it expects to be needed soon. Depending on how it is triggered, read-ahead comes in two types:
- Linear read-ahead
- Random read-ahead
  To introduce these two types of read-ahead, we need to understand how MySQL organizes data, so here is a brief overview. So far we know that InnoDB manages data with the page as the basic unit and that each index corresponds to a B+ tree. Each node of the B+ tree is a data page. The data pages do not have to be physically contiguous, because a doubly linked list between them maintains their order. The leaf nodes of InnoDB's clustered index store complete user records, which is what is meant by "the index is the data, and the data is the index".

To manage data pages better, MySQL builds the concept of a tablespace (also called a file space) on top of data pages. A tablespace is an abstract concept that can correspond to one or more real files (different tablespaces may correspond to different numbers of files). Each tablespace is divided into many pages, and our table data is stored in the data pages of a tablespace.

InnoDB tablespaces are further divided into several types:

System tablespace: a MySQL server has exactly one system tablespace. In the versions from MySQL 5.5.7 to MySQL 5.6.6, table data is stored in the system tablespace by default.
Independent tablespace: each table has its own tablespace, stored in a .ibd file named after the table.

Where is this data stored on disk? We can run:

SHOW VARIABLES LIKE 'datadir';

to view the data directory:

[Figure: 数据目录.png (data directory)]

Let's see what's in this data directory:

[Figure: 磁盘上的数据目录.png (data directory on disk)]

A total of six folders correspond to six databases. Let's go to the studydatabase folder to take a look.

[Figure: 文件夹下的表.png (tables in the folder)]

There are three tables in studydatabase: score, student and student_info. Each table corresponds to two files: the .frm file named after the table stores the table's structure, and the .ibd file stores its data.

Within a tablespace, MySQL manages pages using the concept of an extent. With 16KB pages, 64 consecutive pages form one extent (1MB). With extents in mind, let's return to MySQL's read-ahead:

Linear read-ahead

If the number of pages in an extent that are accessed sequentially reaches the value of the innodb_read_ahead_threshold system variable, InnoDB asynchronously reads all pages of the next extent into the Buffer Pool.

Random read-ahead

If 13 consecutive pages of an extent are already cached in the Buffer Pool, regardless of whether they were read sequentially, MySQL asynchronously loads the remaining pages of that extent into the buffer pool. Random read-ahead is enabled with innodb_random_read_ahead and is off by default.
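The two triggers can be expressed as simple predicates. The Python below is a toy model, not InnoDB's logic: the function names are made up, pages are identified by their offset (0 to 63) inside one extent, and the linear check assumes the trigger fires once the sequential count reaches the threshold.

```python
EXTENT_PAGES = 64   # pages per extent with 16KB pages (from the text)
RANDOM_RUN = 13     # consecutive cached pages that trigger random read-ahead


def linear_read_ahead(sequential_pages, threshold):
    """True when enough pages of an extent were accessed sequentially.

    `threshold` plays the role of innodb_read_ahead_threshold.
    """
    return sequential_pages >= threshold


def random_read_ahead(cached_offsets, enabled=False):
    """True when 13 consecutive pages of the extent are already cached.

    `enabled` plays the role of innodb_random_read_ahead (off by default).
    """
    if not enabled:
        return False
    run = 0
    for offset in range(EXTENT_PAGES):
        run = run + 1 if offset in cached_offsets else 0
        if run >= RANDOM_RUN:
            return True
    return False
```

For example, `random_read_ahead(set(range(20, 33)), enabled=True)` is true because offsets 20 through 32 form a run of 13 cached pages.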

Prediction is a good thing, but if the prediction is wrong, the read-ahead pages still end up at the head of the LRU linked list. If the buffer pool is not large, they push pages that are actually in use off the tail of the LRU list, which greatly reduces the cache hit ratio.

To deal with these two situations, MySQL divides the LRU linked list into two segments in a certain proportion:

One segment holds cache pages that are used very frequently; this part of the list is called hot data, or the young area.
The other segment holds cache pages that are used less frequently; this part of the list is called cold data, or the old area.

Pages are not fixed in the hot or cold area: a page in the cold area can become a page in the hot area. We can run:

SHOW VARIABLES LIKE 'innodb_old_blocks_pct';

to see the proportion of the LRU list taken by the old area:

[Figure: 热冷区域比例.png (ratio of hot and cold areas)]

By default, the old area accounts for 37% of the list. With this split in place, the patches for the two situations above are easy to apply:

For read-ahead: when a data page is first loaded into the buffer pool, it is placed at the head of the old area. Read-ahead pages that are rarely accessed then gradually drift to the tail of the list and are evicted.
For infrequent queries such as full table scans: when a cache page in the old area is accessed for the first time, the access time is recorded in its control block. If a later access falls within a certain interval of that first access, the page is not moved from the old area to the head of the young area. The interval is controlled by innodb_old_blocks_time, which can be viewed with:
```
 SHOW VARIABLES LIKE 'innodb_old_blocks_time';
```
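The two patches can be combined into a small Python model. This is a simplified sketch rather than InnoDB's actual algorithm: `MidpointLRU` and its methods are hypothetical names, time is passed in explicitly (in seconds) to keep the example deterministic, and the young and old areas are kept as two ordered sets instead of one list with a midpoint pointer.

```python
from collections import OrderedDict


class MidpointLRU:
    """Toy LRU list split into a young area and an old area."""

    def __init__(self, capacity, old_pct=37, old_blocks_time=1.0):
        self.capacity = capacity
        # The old area gets old_pct percent of the pages (at least one).
        self.young_capacity = capacity - max(1, capacity * old_pct // 100)
        self.old_blocks_time = old_blocks_time
        self.young = OrderedDict()   # ordered set; the last key is the head
        self.old = OrderedDict()     # ordered set; the last key is the head
        self.first_access = {}       # page -> time of its first access

    def _insert_at_old_head(self, page):
        self.old[page] = None
        while len(self.young) + len(self.old) > self.capacity:
            victim, _ = self.old.popitem(last=False)   # evict the old tail
            self.first_access.pop(victim, None)

    def prefetch(self, page):
        """Read-ahead: load a page that no query has asked for yet."""
        if page not in self.young and page not in self.old:
            self._insert_at_old_head(page)

    def access(self, page, now):
        """A query touches `page` at time `now`."""
        if page in self.young:
            self.young.move_to_end(page)
            return
        if page not in self.old:
            self._insert_at_old_head(page)
        first = self.first_access.setdefault(page, now)
        if now - first >= self.old_blocks_time:
            # Accessed again after the interval: promote to the young head.
            del self.old[page]
            self.young[page] = None
            if len(self.young) > self.young_capacity:
                demoted, _ = self.young.popitem(last=False)
                self.old[demoted] = None   # young tail becomes old head
```

In this model, a page that is touched twice in quick succession during a scan stays in the old area and is soon evicted, while a page touched again after `old_blocks_time` moves to the young area and survives the scan.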

In fact, these two patches alone are not enough and more tuning is needed, but that is beyond the scope of this article.

Managing the flush linked list

Suppose we modify a piece of data: the page is first loaded into a cache page and then the cache page is modified directly, so the cache page no longer matches the page on disk. Such a page is called a dirty page. We could of course write the change to the corresponding disk page immediately after modifying the cache page, but frequent interaction with the disk hurts performance, since disk reads and writes are slow. So MySQL does not synchronize each modification to disk immediately; it does so at some later point in time.

But if we do not synchronize immediately, how do we know which pages in the Buffer Pool are dirty? The linked list comes to the rescue once more: the control blocks of all dirty pages are added as nodes to a linked list. Because the cache pages on this list need to be flushed to disk, it is called the flush linked list.
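The idea of deferring writes and tracking dirty pages can be sketched in Python. Everything here is illustrative: `DirtyPageTracker` is a made-up class, two dictionaries stand in for the buffer pool and the disk, and a plain list stands in for the flush linked list.

```python
class DirtyPageTracker:
    """Toy model of modifying cached pages and flushing them later."""

    def __init__(self):
        self.cache = {}        # page number -> contents in the buffer pool
        self.disk = {}         # page number -> contents on disk
        self.flush_list = []   # page numbers of dirty pages, oldest first

    def modify(self, page_no, contents):
        """Change a cached page without touching the disk."""
        self.cache[page_no] = contents
        if page_no not in self.flush_list:
            self.flush_list.append(page_no)   # the page is now dirty

    def flush(self):
        """Write every dirty page to disk and empty the flush list."""
        for page_no in self.flush_list:
            self.disk[page_no] = self.cache[page_no]
        flushed = len(self.flush_list)
        self.flush_list.clear()
        return flushed
```

Modifying the same page twice adds it to the flush list only once, and the disk sees only the latest contents when `flush()` runs.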

InnoDB also has other linked lists, such as the unzip LRU linked list used to manage decompressed pages. When we looked at the buffer pool configuration above, innodb_buffer_pool_instances was set to 8. The Buffer Pool is essentially a block of contiguous memory that InnoDB requests from the operating system, and in a multi-threaded environment access to the linked lists in the buffer pool has to be protected by locks. When the buffer pool is large and concurrency is high, a single buffer pool can become a bottleneck. So when a single buffer pool is particularly large, we can split it into several smaller Buffer Pool instances. Each instance is independent, so threads accessing different instances do not interfere with each other, which improves the ability to handle concurrency.

Some notes on buffer pools

Before MySQL 5.7.5, the size of the Buffer Pool could only be specified when MySQL started. From MySQL 5.7.5 onward, MySQL supports resizing the buffer pool at runtime. MySQL requests memory from the operating system in units of chunks, so the buffer pool can be viewed as being made up of several chunks, each representing a contiguous range of memory. The chunk size is specified by innodb_buffer_pool_chunk_size in the configuration file.

To ensure that each buffer pool instance contains the same number of chunks, innodb_buffer_pool_size must be a multiple of innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances. If it is not, MySQL adjusts it automatically. For example, if innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances = 2G and you specify 9G, MySQL automatically increases it to 10G.
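The rounding rule is easy to reproduce in Python. This is an arithmetic sketch with a made-up function name; the 128M chunk size and 16 instances are assumed values chosen only so that their product is the 2G from the example.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def round_up_pool_size(requested, chunk_size, instances):
    """Round `requested` up to a multiple of chunk_size * instances."""
    unit = chunk_size * instances
    return -(-requested // unit) * unit   # ceiling division, then scale back


# Assumed values: 128M chunks and 16 instances give a 2G unit.
adjusted = round_up_pool_size(9 * GB, 128 * MB, 16)   # 10G
```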

If, when the server starts, innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances is greater than innodb_buffer_pool_size, then innodb_buffer_pool_chunk_size is set to innodb_buffer_pool_size / innodb_buffer_pool_instances. For example, suppose innodb_buffer_pool_size = 2G, innodb_buffer_pool_chunk_size = 128M and innodb_buffer_pool_instances = 32. The product 128M × 32 is 4G, which is greater than 2G, so innodb_buffer_pool_chunk_size is adjusted to 2G / 32 = 64M.
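The startup adjustment of the chunk size can be checked the same way. Again this is only an arithmetic sketch with a hypothetical function name, using the numbers from the example.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def effective_chunk_size(pool_size, chunk_size, instances):
    """Shrink the chunk size when chunk_size * instances exceeds the pool."""
    if chunk_size * instances > pool_size:
        return pool_size // instances
    return chunk_size


# 128M * 32 = 4G is larger than the 2G pool, so the chunk size becomes 64M.
chunk = effective_chunk_size(2 * GB, 128 * MB, 32)
```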

Conclusion

The motivation for this article was a question: MySQL queries are still fairly slow even though indexes are being used effectively, so how can MySQL be made faster? After reading the buffer pool chapter of the Nuggets (Juejin) booklet "How MySQL Works: Understanding MySQL from the Root", my answer is to check whether the buffer pool is configured sensibly. Most of the content here is drawn from that chapter and reorganized around questions.


北冥有只鱼
