9
头图
Related reading: words! Summary of Java basic interview questions/knowledge points! (The latest version for 2021)

This article was posted before, but I recently refactored it and fixed a lot of minor issues. So, sync again with the public account!

The content is very hard! It is strongly recommended that friends take about 10 minutes to read it again!

MySQL basics

Introduction to relational databases

As the name implies, a relational database is a database based on a relational model. The relational model indicates the connection between the data stored in the database (one-to-one, one-to-many, many-to-many).

In relational databases, our data is stored in various tables (such as user tables), and each row in the table stores a piece of data (such as a user's information).

Most relational databases use SQL to manipulate data in the database. In addition, most relational databases support the four major characteristics of transactions (ACID).

What are the common relational databases in

MySQL, PostgreSQL, Oracle, SQL Server, SQLite (SQLite is used for the storage of WeChat local chat records).......

Introduction to MySQL

MySQL is a relational database, mainly used for persistent storage of some data in our system, such as user information.

Since MySQL is an open source, free and relatively mature database, MySQL is widely used in various systems. Anyone can download it under the license of GPL (General Public License) and modify it according to individual needs. The default port number for MySQL is 3306 .

Storage engine

Storage engine related commands

View all storage engines provided by MySQL

mysql> show engines;

查看MySQL提供的所有存储引擎

From the above figure, we can see that the current default storage engine of MySQL is InnoDB, and among all storage engines in version 5.7, only InnoDB is a transactional storage engine, which means that only InnoDB supports transactions.

View MySQL's current default storage engine

We can also view the default storage engine through the following command.

mysql> show variables like '%storage_engine%';

View table storage engine

show table status like "table_name" ;

查看表的存储引擎

The difference between MyISAM and InnoDB

Before MySQL 5.5, the MyISAM engine was the default storage engine for MySQL, and it was a blessing for a while.

Although the performance of MyISAM is not bad, various features are also good (such as full-text indexing, compression, spatial functions, etc.). However, MyISAM does not support transactions and row-level locks, and the biggest flaw is that it cannot be safely recovered after a crash.

After version 5.5, MySQL introduced InnoDB (transactional database engine). After MySQL version 5.5, the default storage engine is InnoDB. Boy, be sure to remember this InnoDB, you use this storage engine every time you use MySQL database, right?

Closer to home! Let's briefly compare the two below:

1. Does it support row-level lock

MyISAM only has table-level locking, while InnoDB supports row-level locking and table-level locking. The default is row-level locking.

In other words, MyISAM locks the entire table with one lock, which is so simple in the case of concurrent writing! This is why InnoDB has better performance when writing concurrently!

2. Whether to support transaction

MyISAM does not provide transaction support.

InnoDB provides transaction support, with the ability to commit and rollback transactions.

3. Does it support foreign keys

MyISAM does not support it, but InnoDB supports it.

🌈 Expand it a bit:

Generally, we don't recommend using foreign keys at the database level, as it can be solved at the application level. However, this will threaten the consistency of the data. Whether to use foreign keys or not depends on your project.

4. Does it support safe recovery after an abnormal database crash?

MyISAM does not support it, but InnoDB supports it.

The database using InnoDB will ensure that the database is restored to the state before the crash when the database is restarted after an abnormal crash. This recovery process relies on redo log .

🌈 Expand it a bit:

  • The MySQL InnoDB engine uses redo log (redo log) ensure the durability of the , and undo log (rollback log) to ensure the atomicity of the .
  • MySQL InnoDB engine uses lock mechanism , MVCC and other means to ensure transaction isolation (the default supported isolation level is REPEATABLE-READ ).
  • After ensuring the durability, atomicity, and isolation of transactions, consistency can be guaranteed.

5. Whether to support MVCC

MyISAM does not support it, but InnoDB supports it.

To be honest, this comparison is a bit nonsense, after all, MyISAM does not even support row-level locks.

MVCC can be seen as an upgrade of row-level locks, which can effectively reduce locking operations and improve performance.

About the choice of MyISAM and InnoDB

Most of the time we use the InnoDB storage engine. In some read-intensive situations, using MyISAM is also appropriate. However, the premise is that your project does not mind the shortcomings of MyISAM not supporting transactions, crash recovery, etc. (but ~ we generally do!).

There is a sentence in "MySQL High Performance" that says:

Don't easily believe in empirical statements such as "MyISAM is faster than InnoDB". This conclusion is often not absolute. In many known scenarios, the speed of InnoDB can be beyond the reach of MyISAM, especially for applications that use a clustered index, or where the data that needs to be accessed can be put into memory.

Under normal circumstances, we choose InnoDB is no problem, but in some cases you do not care about scalability and concurrency, do not need transaction support, and do not care about security recovery after a crash, choosing MyISAM is a good choice . But under normal circumstances, we all need to consider these issues.

Therefore, for the business systems we develop daily, you can hardly find any reason to use MyISAM as the storage engine of your own MySQL database.

Locking mechanism and InnoDB lock algorithm

Locks used by MyISAM and InnoDB storage engines:

  • MyISAM uses table-level locking.
  • InnoDB supports row-level locking and table-level locking, the default is row-level locking

table-level lock and row-level lock comparison:

  • table-level lock: MySQL locks maximum granularity of . It locks the entire table in the current operation, which is simple to implement, consumes less resources, and locks quickly, without deadlocks. It has the largest locking granularity, the highest probability of triggering lock conflicts, and the lowest concurrency. Both MyISAM and InnoDB engines support table-level locks.
  • row-level lock: MySQL lock is a kind of lock with the smallest granularity of , which only locks the current operation row. Row-level locks can greatly reduce conflicts in database operations. Its locking granularity is the smallest, and the concurrency is high, but the overhead of locking is also the largest, and the locking is slow and deadlock will occur.

There are three lock algorithms for

  • Record lock: record lock, a lock on a single row record
  • Gap lock: gap lock, lock a range, excluding the record itself
  • Next-key lock: record+gap adjacent key lock, lock a range, including the record itself

Query cache

When the query statement is executed, the cache is first queried. However, it will be removed after MySQL 8.0 because this feature is not very practical

my.cnf added the following configuration, restart MySQL to open query cache

query_cache_type=1
query_cache_size=600000

MySQL can also open the query cache by executing the following command

set global  query_cache_type=1;
set global  query_cache_size=600000;

As above, will directly return the result 160dec78dbaf44 in the cache under the same query conditions and data after the query cache is . The query conditions here include the query itself, the current database to be queried, the client protocol version number and other information that may affect the results. Therefore, any difference in any character between any two queries will result in a cache miss. In addition, if the query contains any user-defined functions, stored functions, user variables, temporary tables, and system tables in the MySQL library, the query results will not be cached.

After the cache is established, MySQL's query cache system will track each table involved in the query. If these tables (data or structure) change, all cached data related to this table will be invalidated.

cache can improve the query performance of the database, the cache also brings additional overhead. A cache operation is required after each query, and it must be destroyed after invalidation. Therefore, you should be cautious when you open the query cache, especially for write-intensive applications. If it is turned on, pay attention to reasonably control the size of the cache space. Generally speaking, it is appropriate to set the size to tens of MB. In addition, can also use sql_cache and sql_no_cache to control whether a query statement needs to be cached:

select sql_no_cache count(*) from usr;

Affairs

What is a business?

In a word, the transaction is a logical set of operations, either all of them are executed or none of them are executed.

can give a simple example?

The most classic transaction is often cited as an example of transfer. If Xiao Ming wants to transfer 1,000 yuan to Xiaohong, this transfer will involve two key operations:

  1. Reduce Xiao Ming's balance by 1,000 yuan
  2. Increase Xiaohong's balance by 1,000 yuan.

The transaction will regard these two operations as a logical whole, and the operations contained in this whole will either succeed or fail.

In this way, there will be no situation where Xiao Ming's balance decreases while Xiao Hong's balance does not increase.

What is a database transaction?

Database affairs are the most exposed in our daily development. If your project is a monolithic architecture, you are often exposed to database transactions.

Usually, when we are talking about transactions, if we do not specifically refer to the distributed transaction , it often refers to the database transaction .

What is the role of database transactions?

Simply put: database transactions can ensure that multiple operations on the database (that is, SQL statements) form a logical whole. The database operations that constitute this logical whole follow: either all executed successfully, or not executed at all.

# 开启一个事务
START TRANSACTION;
# 多条 SQL 语句
SQL1,SQL2...
## 提交事务
COMMIT;

In addition, relational databases (for example: MySQL , SQL Server , Oracle etc.) transactions have ACID characteristics:

事务的特性

What is the ACID feature?

  1. atomicity ( Atomicity ): The transaction is the smallest unit of execution and division is not allowed. The atomicity of the transaction ensures that the actions are either all completed or completely ineffective;
  2. consistency ( Consistency ): Before and after the transaction is executed, the data remains consistent. For example, in the transfer business, the total amount of the transferor and the payee should be unchanged regardless of whether the transaction is successful or not;
  3. isolation ( Isolation ): When concurrently accessing the database, a user's transaction is not interfered by other transactions, and the database is independent between concurrent transactions;
  4. Persistence ( Durabilily ): After a transaction is committed. Its changes to the data in the database are persistent, and even if the database fails, it should not have any impact on it.

the realization principle of 160dec78dbb1de data transaction?

Let's take MySQL's InnoDB engine as an example to briefly talk about it.

The MySQL InnoDB engine uses redo log (redo log) ensure the durability of the , and undo log (rollback log) to ensure the atomicity of the .

The MySQL InnoDB engine uses lock mechanism , MVCC and other means to ensure transaction isolation (the default isolation level supported is REPEATABLE-READ ).

After ensuring the durability, atomicity, and isolation of transactions, consistency can be guaranteed.

What problems do concurrent transactions bring?

In a typical application, multiple transactions run concurrently, often operating the same data to complete their tasks (multiple users operate on the same data). Although concurrency is necessary, it may cause the following problems.

  • dirty read (Dirty read): when a transaction is accessing data and modify the data, and this modification has not been committed to the database, then another transaction also accessed the data, and then used the data. Because this data is data that has not yet been committed, the data read by another transaction is "dirty data", and the operation based on "dirty data" may be incorrect.
  • Lost to modify: means that when a transaction reads a data, another transaction also accesses the data, then after the data is modified in the first transaction, the second transaction also modifies this data. In this way, the modification result in the first transaction is lost, so it is called a lost modification. For example: Transaction 1 reads data A=20 in a table, transaction 2 also reads A=20, transaction 1 modifies A=A-1, transaction 2 also modifies A=A-1, the final result A=19, transaction The modification of 1 is lost.
  • Unrepeatable read: refers to reading the same data multiple times in one transaction. Before this transaction is over, another transaction also accesses the data. Then, between the two read data in the first transaction, the data read twice in the first transaction may be different due to the modification of the second transaction. This happens when the data read twice in a transaction is different, so it is called non-repeatable read.
  • Phantom read: Phantom read is similar to non-repeatable read. It occurs when a transaction (T1) reads a few rows of data, and then another concurrent transaction (T2) inserts some data. In the subsequent query, the first transaction (T1) will find that there are more records that do not exist originally, as if an illusion has occurred, so it is called a phantom reading.

difference between non-repeatable reading and phantom reading:

The focus of non-repeatable reading is modification. For example, reading a record multiple times finds that the value of some of the columns is modified. The focus of phantom reading is adding or deleting. For example, reading a record multiple times finds that the number of records increases or decreases.

What are the transaction isolation levels?

The SQL standard defines four isolation levels:

  • READ-UNCOMMITTED (read uncommitted): lowest isolation level, allowing to read uncommitted data changes, may cause dirty reads, phantom reads or non-repeatable reads .
  • READ-COMMITTED (read committed): allows to read data that has been committed by concurrent transactions, can prevent dirty reads, but phantom reads or non-repeatable reads may still occur .
  • REPEATABLE-READ (repeatable read): the same field multiple times, the results are consistent, unless the data is modified by its own transaction, can prevent dirty reads and non-repeatable reads, but there are still phantom reads May occur .
  • SERIALIZABLE (serializable): highest isolation level, fully compliant with the ACID isolation level. All transactions are executed one by one, so that there is no possibility of interference between transactions. That is, can prevent dirty reads, non-repeatable reads, and phantom reads .

Isolation levelDirty readNon-repeatablePhantom reading
READ-UNCOMMITTED
READ-COMMITTED×
REPEATABLE-READ××
SERIALIZABLE×××

What is the default isolation level of MySQL?

The default isolation level supported by the MySQL InnoDB storage engine is REPEATABLE-READ (re- ) 160dec78dbb609. We can use the SELECT @@tx_isolation; command to view, the MySQL 8.0 command is changed to SELECT @@transaction_isolation;

mysql> SELECT @@tx_isolation;
+-----------------+
| @@tx_isolation  |
+-----------------+
| REPEATABLE-READ |
+-----------------+

needs to be noted here: the difference from the SQL standard is that the InnoDB storage engine REPEATABLE-READ (re- ) 160dec78dbb677 transaction isolation level, so the generation of phantom reads can be avoided. It is different from other database systems (such as SQL Server). Therefore, the default isolation level supported by the InnoDB storage engine is REPEATABLE-READ (re- ) 160dec78dbb67b can fully guarantee the isolation requirements of the transaction, that is, it has reached the SQL standard SERIALIZABLE (serializable) isolation level.

🐛 Problem correction: MySQL InnoDB's REPEATABLE-READ (repeatable) does not guarantee the avoidance of phantom reads, and requires the application to use locked reads to ensure. The mechanism used by this lock is Next-Key Locks.

Because the lower the isolation level, the fewer locks the transaction requests, so the isolation level of most database systems is READ-COMMITTED (read submitted content) , but what you need to know is that the InnoDB storage engine uses REPEATABLE-READ by default ( ) 160dec78dbb6c5 will not have any performance loss.

InnoDB storage engine SERIALIZABLE (serializable) isolation level in the case of distributed transaction

🌈 Expand it (the following content is taken from "MySQL Technical Insider: InnoDB Storage Engine (2nd Edition)" Chapter 7.7):

The InnoDB storage engine provides support for XA transactions and supports the implementation of distributed transactions through XA transactions. Distributed transaction refers to allowing multiple independent transactional resources to participate in a global transaction. Transaction resources are usually relational database systems, but they can also be other types of resources. The global transaction requires that all participating transactions are either committed or rolled back, which improves the original ACID requirements of the transaction. In addition, when using distributed transactions, the transaction isolation level of the InnoDB storage engine must be set to SERIALIZABLE.

postscript

Finally, I recommend a very good Java tutorial open source project: JavaGuide . When I was preparing for the autumn recruitment interview in my junior year, I created the JavaGuide project. Currently this project has 100k+ stars, related reading: "1049 days, 100K! Simple review! " .

It is very helpful for you to learn Java and prepare for the interview in the direction of Java! As the author said, this is a: Java learning + interview guide covering the core knowledge that most Java programmers need to master!

related suggestion:

I am Guide brother, embrace open source and like cooking. Open source project JavaGuide author, Github: Snailclimb-Overview . In the next few years, I hope to continue to improve JavaGuide, and strive to help more friends who learn Java! mutual encouragement! Hoo! Click to view my 2020 work report!

reference


JavaGuide
9k 声望1.9k 粉丝

你好,我是 Guide,开源项目 JavaGuide 作者。