In the previous course, "Take you into the SQL engine", you learned the basics of the SQL engine and got an overall picture of how a SQL statement flows between its modules. Among those modules, the SQL execution engine serves as the "hands and feet": it carries out the execution plan produced by the optimizer, reads data through the storage engine, processes it, and returns the result to the client. The execution engine therefore directly affects SQL execution performance.
At 19:30 on September 15, the fifth issue of the Internal Verification Tutorial will walk you through the basic theory of SQL execution engines and the implementation of the OceanBase execution engine, including how it completes inserts, deletes, updates, and queries in milliseconds. Learning how the OceanBase execution engine is implemented will be very helpful for students who want to take part in the OceanBase database competition or work on research and development of the SQL layer.
What problems can this issue help you solve?
1. As an enterprise-level native distributed database, how does OceanBase realize its parallel execution capability?
2. How does OceanBase implement a vectorized execution engine?
3. Hands-on MiniOB series: how is the MiniOB Date type implemented?
Live broadcast preview: get the knowledge while it is "fresh"
In the last issue, you learned the key concepts of the SQL engine. After a SQL statement is parsed and the query optimizer selects the lowest-cost plan from the candidate execution plans, the actual execution is handed over to the execution engine. So how is the SQL execution engine designed and implemented? This issue answers that question in detail by walking through the implementation of OceanBase's SQL execution engine.
After years of evolution, OceanBase's SQL execution engine has moved from the traditional volcano model with row-by-row iterative execution to vectorized execution, which greatly improves single-core execution capability. In addition, because it can execute SQL in parallel, it makes full use of system resources to process a user request in parallel, thereby reducing response time (RT).
As Figure 1 shows, OceanBase's second-generation execution engine keeps the traditional volcano model, in which data is iterated row by row, but it is a rewrite of the old engine. The new implementation introduces strongly typed computation, pre-allocates memory, uses expression lists to describe row information, separates static meta information from data, and re-implements the expression and operator computation framework. As a result, the engine improved greatly.
Figure 1 Evolution of OceanBase SQL execution engine
Moving from the second generation's row-by-row iteration to the third generation's vectorized processing brought a large further optimization. The motivation is that OceanBase's user scenarios include not only simple OLTP queries but also OLAP queries such as report analysis and business decision-making. OLAP queries process large amounts of data and take a long time. The traditional row-by-row model makes one iteration per row, which leads to relatively high virtual function call overhead and poor cache friendliness. To solve this problem, OceanBase implemented a vectorized engine, and the effect is remarkable: in the TPC-H 30T scenario, the vectorized engine performs nearly 3 times as well as the non-vectorized engine, and for Q1, an aggregation-heavy, compute-intensive query, performance improves by about 10 times. This tutorial will also introduce more of the implementation details of OceanBase vectorization.
Figure 2 Row-by-row execution of OceanBase execution engine
Figure 3 Vectorized execution of OceanBase execution engine
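To make the contrast between row-by-row iteration and vectorized execution concrete, here is a minimal sketch in Python. It is not OceanBase's implementation (which is written in C++); the classes `RowScan`, `RowFilter`, `BatchScan`, `BatchFilter` and the helper functions are hypothetical names invented for illustration. The sketch only counts how many times the scan operator is called in each model, as a stand-in for the per-row call overhead mentioned above.

```python
from typing import Callable, List, Optional, Tuple

Row = int  # assumption: a single integer column keeps the sketch short


class RowScan:
    """Volcano-style scan: each next() call returns one row, or None at the end."""

    def __init__(self, data: List[Row]) -> None:
        self.data, self.pos, self.calls = data, 0, 0

    def next(self) -> Optional[Row]:
        self.calls += 1
        if self.pos >= len(self.data):
            return None
        row = self.data[self.pos]
        self.pos += 1
        return row


class RowFilter:
    """Pulls rows from its child one at a time and returns the next match."""

    def __init__(self, child: RowScan, pred: Callable[[Row], bool]) -> None:
        self.child, self.pred = child, pred

    def next(self) -> Optional[Row]:
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None


class BatchScan:
    """Vectorized-style scan: each call returns up to batch_size rows."""

    def __init__(self, data: List[Row], batch_size: int) -> None:
        self.data, self.batch_size = data, batch_size
        self.pos, self.calls = 0, 0

    def next_batch(self) -> List[Row]:
        self.calls += 1
        batch = self.data[self.pos:self.pos + self.batch_size]
        self.pos += len(batch)
        return batch  # an empty list signals the end of the data


class BatchFilter:
    """Applies the predicate to a whole batch per call."""

    def __init__(self, child: BatchScan, pred: Callable[[Row], bool]) -> None:
        self.child, self.pred = child, pred

    def next_batch(self) -> List[Row]:
        while True:
            batch = self.child.next_batch()
            if not batch:
                return []
            selected = [r for r in batch if self.pred(r)]
            if selected:  # skip batches in which nothing matched
                return selected


def run_rows(data: List[Row], pred: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    scan = RowScan(data)
    op = RowFilter(scan, pred)
    out: List[Row] = []
    while (row := op.next()) is not None:
        out.append(row)
    return out, scan.calls


def run_batches(data: List[Row], pred: Callable[[Row], bool],
                batch_size: int = 256) -> Tuple[List[Row], int]:
    scan = BatchScan(data, batch_size)
    op = BatchFilter(scan, pred)
    out: List[Row] = []
    while (batch := op.next_batch()):
        out.extend(batch)
    return out, scan.calls
```

For 1000 input rows and a batch size of 256, both drivers return the same result, but the row-based scan is called 1001 times while the batch-based scan is called 5 times. In a compiled engine, that reduction in calls is what cuts virtual function call overhead, and processing a contiguous batch in a tight loop is what improves cache friendliness.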
For SQL requests in OLAP scenarios, a large amount of user data must be analyzed, users expect results as quickly as possible, and single-core serial execution capability is limited. The key to returning results quickly is to make full use of system resources through parallel execution. OceanBase implemented parallel execution early on; it supports large-scale, highly concurrent processing and makes full use of the machine resources in a cluster. In the TPC-H 30T standard test scenario, we used 5120 CPU hyperthreads across 64 machines to serve each user request, so a request that would otherwise take tens of minutes completes in a few seconds.
Figure 4 Parallel execution framework of OceanBase execution engine
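The general idea behind parallel execution of a single request can be sketched as follows: split the input into slices, let each worker compute a partial result, and have a coordinator merge the partial results. This is a minimal, single-machine illustration in Python, not OceanBase's parallel execution framework; the function names `split`, `partial_agg`, and `parallel_avg` and the `dop` (degree of parallelism) parameter are assumptions made for this example. Note also that Python threads do not give real CPU parallelism for this kind of work, so the sketch shows only the structure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split(data: List[int], dop: int) -> List[List[int]]:
    """Split data into at most `dop` roughly equal slices."""
    step = max(1, (len(data) + dop - 1) // dop)
    return [data[i:i + step] for i in range(0, len(data), step)]


def partial_agg(part: List[int]) -> Tuple[int, int]:
    """Worker side: compute a partial (sum, count) over one slice."""
    return sum(part), len(part)


def parallel_avg(data: List[int], dop: int = 4) -> float:
    """Coordinator side: fan out partial aggregates, then merge them."""
    if dop < 1:
        raise ValueError("dop must be at least 1")
    with ThreadPoolExecutor(max_workers=dop) as pool:
        partials = list(pool.map(partial_agg, split(data, dop)))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    if count == 0:
        raise ValueError("cannot average an empty input")
    return total / count
```

An average is computed from partial sums and counts rather than from partial averages, because partial averages over slices of different sizes cannot be merged correctly without their counts.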
OceanBase is a distributed database with a shared-nothing architecture. Parallel execution meets the need to process a single request concurrently in AP scenarios (large data volumes, long processing times). In TP scenarios (short execution times, high user QPS), however, a single SQL request may take only a few milliseconds or even a few hundred microseconds, so it is generally executed serially, because the scheduling overhead of parallel execution would exceed the cost of the TP request itself. To process TP requests efficiently in a distributed environment, OceanBase provides multiple execution modes, including local execution, remote execution, and DAS execution.
Figure 5 Multiple execution modes of the OceanBase execution engine
Currently, the OceanBase SQL execution engine meets users' needs for HTAP execution capabilities well. In the future, we will further optimize the execution engine for column storage and new hardware.
For more details, join the official "From 0 to 1 Database Internal Verification Tutorial" course at 19:30 on September 15.
Appendix:
Internal Verification Tutorial Issue 1 | The first step to becoming a kernel developer: building a research and development environment
Internal Verification Tutorial Issue 2 | Take you to uncover the mystery of the database storage structure
Internal Verification Tutorial Issue 3 | Why can indexes make queries faster?
Internal Verification Tutorial Issue 4 | Take you into the database SQL engine
Course playback
Sign up for the live broadcast: https://open.oceanbase.com/activities/4921877?id=4921912