The database is a hurdle that has to be crossed in the development of the full stack. It is inevitable that everyone will learn and use relevant knowledge. When I checked the information recently, I found that many contents on the Internet are either very deep or out of structure. For some hopes, it is only used. The people below are not very friendly. I just had the opportunity to learn MySQL recently. I concentrated on learning MySQL for some time and made some notes. Each concept basically has code examples, and each line is typed. Readers can copy the code directly to the command line to run it. Hope it helps everyone~ 😜
The knowledge introduced in this article is not very deep. The target users are those who have zero or weak foundations of MySQL. They can help establish some concepts of MySQL. At least they know how to get to Baidu when they encounter related problems, and they will not encounter back-end feedback. The database file cannot be read.
For those who don’t know anything about Docker and CentOS, you can check out <Hand-in-hand with Docker from entry to practice> and about the basic knowledge necessary for getting started with CentOS> 16136d5c2e93ab two articles, anyway, Docker and CentOS will also be used sooner or later😂
All codes are stored on Github , and can be cloned to read and execute by themselves.
CentOS version: 7.6
MySQL version: 8.0.21
The above mind map can be added to the official account at the end of the article to reply "mysql mind map" to obtain the xmind source file.
1. What is a database
A database is a collection of data stored in an organized way. Think of it as a file cabinet.
1.1 Basic information
MySQL database belongs to MySQL AB, headquartered in Sweden, and later acquired by oracle. Is currently the most popular relational database.
advantage:
- Low cost: open source code, generally free trial;
- High performance: fast execution;
- Simple: It is easy to install and use.
1.2 MySQL installation
MySQL recommends using Docker to install, just a few lines of commands will be installed, see <Hand-in-hand with Docker from entry to practice>
My command here is:
# 创建mysql容器
docker run -d -p 3307:3306 -e MYSQL_ROOT_PASSWORD=888888 \
-v /Users/sherlocked93/Personal/configs/mysql.d:/etc/mysql/conf.d \
-v /Users/sherlocked93/Personal/configs/data:/var/lib/mysql \
--name localhost-mysql mysql
# 创建好之后进入 mysql 容器:
docker exec -it localhost-mysql bash
# 登录 mysql
mysql -u root -p888888
If you have installed navicate on your machine, you can refer to the following configuration
Fill in the configuration after selecting New Connection:
You can see the contents of your database.
That's it, the effect is as follows:
Without Docker, you can go to the official website MySQL Community Server download the corresponding version of the MySQL installation package. The Community Server version is free. You can download and install it. Basically, just continue to the next step.
Stop talking nonsense, let's start the knowledge irrigating directly below!
2. MySQL is simple to use
2.1 Database related terms
Concepts and terms related to the database:
- database (database) A container for storing organized data;
- table (table) A structured list of a certain type of data;
- schema (schema) Information about the layout and characteristics of databases and tables;
- column (column) a field in the table, all tables are composed of one or more columns;
- data type (datatype) the type of data allowed;
- row (row) a record in the table;
- primary key (primary key) A column (or a group of columns) whose value can uniquely distinguish each row in the table;
- Foreign key (foreign key) A column in the table that contains the primary key value of another table and defines the relationship between the two tables.
- clause (clause) SQL statement is composed of clauses, some clauses are required, and some are optional. For example, the from clause of the select statement.
2.2 Primary key
The concept of the primary key is very important. It uniquely identifies a single or multiple columns in each row of the table called the primary key. The primary key is used to represent a specific row.
Although the primary key is not always required, you should try to ensure that each table has a primary key defined to facilitate future data manipulation and management. Without a primary key, different rows cannot be distinguished, and it is difficult to update or delete specific rows in the table.
Any column in the table can be used as the primary key, as long as it meets the following conditions:
- No two rows have the same primary key value;
- Each row must have a primary key value (primary key columns do not allow NULL values).
When using multiple columns as the primary key, the above conditions must be applied to all the columns that make up the primary key, and the combination of all column values must be unique (the value of a single column may not be unique).
A few generally recognized best habits are:
- Do not update the value in the primary key column;
- Do not reuse the value of the primary key column;
- Do not use values that may change in the primary key column.
2.3 grammar specification
Grammar specification:
- Enter
help
or\h
get help; - It is not case sensitive, but it is recommended that keywords be capitalized, and table names and column names are lowercase;
- It is best to end each command with a semicolon
;
or\g
, just press Enter without executing the command; - Each command can be indented and line-wrapped as needed;
- Use
#
for multi-line comments, and/* ... */
for multi-line comments; - Enter
quit
orexit
launch the MySQL command line;
Grammatical characteristics:
- Case insensitive;
- It can be written on one or more lines, and can be divided into multiple lines for easy reading and debugging;
- Keywords cannot be abbreviated or branched;
- Each clause is generally written separately;
- Recommend the use of indentation to improve the readability of sentences;
Common simple commands:
mysql -u root -p # –h 主机名 –u 用户名 -P 端口号 –p密码,注意-p跟密码之间不能加空格其他可以加可以不加
select version(); # 查看 mysql 服务版本
show databases; # 查看所有数据库,注意最后有 s
create database [库名]; # 创建库
use [库名]; # 打开指定的库
show tables; # 查看当前库的所有表
show tables from [库名]; # 查看其他库的所有表
desc [表名]; # 查看表结构
create table [表名] ( # 创建表
列名 列类型,
列名 列类型,
);
drop database [库名]; # 删除库
drop table [表名]; # 删除表
exit; # 退出
2.4 Create a table and fill in the data
First of all, we have the whole point of data to facilitate the code demonstration later.
mysql -u root -p888888 # 输入用户名密码,进入mysql命令行
Then download the file create.sql on Github and run it (you can also copy the contents of the file directly to the MySQL command line for execution).
If you are using navicate, after creating a connection to localhost-mysql in the previous chapter, just run it:
Similarly, run another file populate.sql to fill in the data in each table.
After running, you can see the table in the library in the command line show tables
2.5 Relationship table
Briefly introduce the table just created.
For data classification and processing, customers, vendors, order orders, order information orderitems, product records productnotes, and products tables store different information respectively.
For example, in the vendor information table vendors, each vendor has a unique identifier, that is, the primary key vend_id, and each product in the products product table also has a primary key prod_id, and a field vendor_id vendor ID and vendor table vend_id correspondence, which is foreign key .
If you want to find the corresponding supplier information through the product ID, then use the foreign key to find the information in another table. The foreign key prevents each product from repeatedly saving the detailed information of the supplier. Just save the ID of the supplier. When the supplier's information changes, such as the mailbox and address, there is no need to change the data of each row one by one, just change the supply. Corresponding supplier information in the quotient table.
The benefits of this:
- Supplier information is not repeated, so no time and space are wasted;
- If the supplier information changes, only a single record in the vendors table can be updated, and the data in the related table does not need to be changed;
- Since there is no duplication of data, it is obvious that the data is consistent, which makes it easier to process the data.
scalability (scale), able to adapt to the increasing workload without failure. A well-designed database or application is called scale well.
2.6 Data Type
The MySQL data type defines the rules for what data can be stored in a column and how the data is stored.
Numerical
Integer: Tinyint
, Smallint
, Mediumint
, Int
( Integer
), Bigint
, which can be unsigned and signed, and signed by default.
- If you don’t set the unsigned, the default is signed. If you want to set unsigned, you can add the keyword
unsigned
- If the inserted value exceeds the range of the integer, an out of range exception will be reported, and the critical value will be inserted;
- If the length is not set, there will be a default length.
Decimal
- Fixed points:
dec(M,D)
,decimal(M,D)
- Floating point numbers:
float(M, D)
,double(M, D)
M is the integer part + the decimal part, and D is the decimal part. Both M and D can be omitted. If it is decimal
, M defaults to 10 and D defaults to 0.
Character type
- Shorter text:
char(n)
andvarchar(n)
represents the number of characters, not the number of bytes. - Longer text:
text
(long text data),blob
(longer binary data). binary
andvarbinary
used to save short binary.enum
used to save the enumeration.set
used to save the collection.
Date and time type
date
format YYYY-MM-DD, save the date;time
format HH:MM:SS, save time;year
format YYYY, preservation year;datetime
format YYYY-MM-DD HH:MM:SS, save date + time, range1000-9999
, not subject to time zone impression;timestamp
timestamp, the format saves the date + time, the range is1970-2038
, which is affected by the time zone;
3. Retrieve data select
select
statement used to query is probably the most commonly used. It is used to retrieve information from one or more tables. A select
statement must give at least two pieces of information: what you want to choose and where to choose.
# 基本语法
select [查询列表] from [表名];
# 查询单个/多个/所有字段
select cust_name from customers;
select cust_name,cust_city,cust_address from customers;
select `select` from customers; # 如果某字段是关键字,可以用 ` 符号包起来
select * from customers; # * 表示所有
# 查询常量值/表达式/函数
select 100;
select 'zhangsan';
select 100%98;
select version();
3.1 Deduplication distinct
The result of the query may have multiple duplicate values, you can use the 06136d5c2ea9df keyword to distinct
select order_num from orderitems; # 有一些重复值
select distinct order_num from orderitems; # 将重复值去重
3.2 Limit the result limit
The select statement returns all matching rows, which may be every row in the specified table. In order to return to the first or previous rows, the clause limit
limit m
means returning the first m rows found, limit m,n
means returning the n rows starting from the mth row, and you can also use limit m offset n
have the same meaning as before.
Note that the index of the first row retrieved is row 0.
3.3 fully qualified table names and column names
In some cases, the statement may use fully qualified listings and table names:
select orderitems.order_num from mysql_demo1.orderitems;
# 上面这句等价于
select order_num from orderitems;
4. Sort and retrieve data order by
In the previous chapter orderitems
is not sorted. Under normal circumstances, the returned order is the order in which it appears in the underlying table. You can sort the retrieved data order by
asc
and desc
can be used to specify the sorting direction. order by asc
ascending order, order by desc
descending order, if not written, the default is ascending order.
order by
clause can support a single field, multiple fields, expressions, functions, and aliases, which are generally placed at the end of the sentence, except for limit
.
select * from orderitems order by item_price; # 按item_price升序排列
# 先按 quantity 升序排列,quantity 的值一样时按 item_price 的值升序排列
select * from orderitems order by quantity,item_price;
# 先按 quantity 降序排列,quantity 的值一样时按 item_price 的值升序排列
select * from orderitems order by quantity desc,item_price;
# 找到最贵订单
select * from orderitems order by item_price desc limit 1;
5. Filter data where
Use the where
from
clause to add filter conditions and filter data.
# 基本语法
select [查询列表] from [表名] where [筛选条件] order by [排序条件];
Filter by conditional expression >
, =
, <
, >=
, <=
, !=
, <>
, <=>
# 找出产品价格为 2.5 的产品名字
select prod_name, prod_price from products where prod_price=2.5;
# 找出产品价格小于等于 10 的产品名字,并按产品价格降序排列
select prod_name, prod_price from products where prod_price <= 10 order by prod_price desc;
# 找到供应商 id 不为 1003 的产品,!= 和 <> 含义一样,都是不等于
select vend_id, prod_name from products where vend_id <> 1003;
5.1 Range check between and
Use between ... and ...
specify the start value and end value of the required range to achieve the effect of range query.
Note that the between and
around 06136d5c2eacf2 are in the order of small and large, so they cannot be adjusted.
# 查询产品价格在 3 到 10 内的产品
select prod_name, prod_price from products where prod_price between 3 and 10;
# 单独使用 and 也可以打到这个效果
select prod_name, prod_price from products where prod_price <= 10 and prod_price >= 3;
5.2 Null value check is (not) null
When creating a table, you can specify that certain columns may not contain a value, that is, null
, null
represents no value, which is different from a field containing 0, an empty string, or only containing spaces.
Using is null
or is not null
can be used to determine whether a value is null
.
illustrate:
- Equal to
=
and not equal to<>
,!=
cannot be used to judgenull
is null
andis not null
can be used to judgenull
<=>
security equal sign can be used to determinenull
# 找出顾客中邮箱不是 null 的顾客信息
select * from customers where cust_email is not null;
# 使用安全等于号来判断 null 也是可以的
select * from customers where cust_email <=> null;
5.3 The logical AND operator and
operator (operator) is used to connect or change the keywords in the where clause. It is also called the logical operator (logical operator).
The above mentioned the and
operator, through this operator can increase the qualification:
# 找出供应商为 1003 提供的价格小于等于 10 的产品
select * from products where vend_id = 1003 and prod_price <= 10;
5.4 logical OR operator or
or
operator is the and
operator. This is a logical OR operator, which returns rows that match any condition:
# 找出id为 1003 或 1001 的供应商
select * from products where vend_id = 1003 or vend_id = 1001;
When and
and or
appear at the same time, and
will be processed first, such as this sentence:
select * from products where vend_id = 1001 or vend_id = 1003 and prod_price >= 10;
This sentence will process and
first, which means that vend_id
is 1003 and prod_price
greater than or equal to 10, or vend_id
is 1001.
In this case, you can add parentheses:
select * from products where (vend_id = 1001 or vend_id = 1003) and prod_price >= 10;
The result of this search is a vend_id
is 1001 or 1003 and whose prod_price
greater than or equal to 10.
and
you use the where
clause with the 06136d5c2eb085 and or
operators, you should use parentheses to explicitly group the operators. Don't rely too much on the default calculation order, even if it is what you want, and using parentheses can eliminate ambiguity and increase readability.
5.5 Range operator in (set)
Use the in
operator to specify the range of conditions, and each condition in the range can be matched. in
The value to be matched is placed in parentheses:
# 找出id为 1003 或 1001 的供应商
select * from products where vend_id in (1001, 1003);
in
operator can be replaced with or
. It is recommended to use in
in the following situations:
- When there are more options,
in
is clearer and more intuitive; - When using
in
, the order of calculation is easier to manage (because fewer operators are used); in
generally faster than theor
operator list;in
is that it can contain otherselect
statements, making it possible to build thewhere
clause more dynamically.
5.6 logical negation operator not
not
operator can be in
and between and
to indicate the negation of the range:
# 找出id不为 1001 1003 的产品
select * from products where vend_id not in (1001, 1003);
# 选择产品价格不在 5 到 15 之间的产品
select * from products where prod_price not between 5 and 15;
5.7 like operator
For example, if you want to find anvil
, you can search through the like
operator. like
indicates that the following search pattern uses wildcard matching instead of direct equal matching.
Operator%
The most commonly used wildcard is the %
operator. %
represents any number of characters, including no characters.
# 找出产品名字以 jet 开头的产品
select * from products where prod_name like 'jet%';
# 找出产品名中含有 on 的产品
select * from products where prod_name like '%on%';
# 找出产品名以 s 开头,以 e 结束的产品
select * from products where prod_name like 's%e';
Note that %
cannot match null
.
Operator_
_
represents any single character.
select * from products where prod_name like '_ ton anvil';
In addition, the translation uses \
, such as \_
# 找到描述中有 % 的产品
select * from products where prod_desc like '%\%%';
Notice:
- Don't overuse wildcards. If other operators can achieve the same purpose, other operators should be used.
- When you really need to use wildcards, don't use them at the beginning of the search pattern unless absolutely necessary. Put the wildcard at the beginning of the search pattern, the search is the slowest.
5.8 Regular expression regexp
Regarding regular expressions, you can take a brief look at the "Regular Expressions Must Know and Know" .
Use the regexp
keyword to indicate the regular expression that matches the following:
# 找到产品名以 1000 或 anvil 结尾的产品
select * from products where prod_name regexp '1000|anvil$';
Regular expressions are translated using \\
, for example, you want to find .
this character instead of the regular .
wildcard, use \\.
, in order to transfer \
this character, use \\\
# 找到产品名以 . 字符开头的产品
select * from products where prod_name regexp '^\\.';
6. Calculated fields
Sometimes we need to retrieve the converted, calculated, or formatted data directly from the database instead of retrieving the data and then reformatting it in the client application or reporting program. At this time, we need to calculate the field.
6.1 Alias as
The queried virtual table can be given an alias for easy understanding and good readability. In addition, if the query field has the same name, you can use the alias as
to distinguish it.
# 使用 as 关键字
select cust_name as name from customers;
# as 关键字也可以直接省略
select cust_name name from customers;
# 可以给不同字段分别起别名
select cust_name name, cust_city location from customers;
6.2 Splicing concat
If you want to connect multiple fields into one field, you can use the splicing field function concat
:
# 将供应商的名字和地点拼接好后返回,并命名为 vend
select concat(vend_name, '(', vend_country, ')') vend from vendors;
Note that if any data in the middle is null
, the result of splicing is also null
.
So for some null
, you need to use the ifnull
function to judge. The first parameter is the field to be judged, and the second parameter is the result you want to return null
# 将顾客信息拼接起来
select concat(cust_name, '(', ifnull(cust_email, '-'), ')') customerInfo from customers;
If there are spaces before and after the data in the table, you can use the rtrim()
function to remove the spaces on the right, ltrim()
remove the spaces on the left, or trim()
remove the spaces before and after:
# 将顾客信息处理后拼接起来
select concat(rtrim(vend_name), '(', trim(vend_country), ')') vend from vendors;
6.3 Arithmetic calculation+-*/
Basic arithmetic operators are also supported select
# 计算订单每种总额,并按照总金额降序排列
select prod_id as id, quantity, quantity*item_price as totalPrice
from orderitems order by totalPrice desc;
The basic operators of addition, subtraction, multiplication and division are all supported +
, -
, *
, /
.
7. Data processing functions
trim()
function that removes the first space of the data introduced earlier is the data processing function. In addition to this, there are many other types of data processing functions:
- Text functions used to process text strings, such as deleting or filling values, and converting values to uppercase or lowercase.
- Numerical functions used to perform arithmetic operations on numeric data, such as returning absolute values and performing algebraic operations.
- Date and time functions used to process date and time values and extract specific components from these values, for example, return the difference between two dates, check date validity, etc.
- System functions, such as returning user login information, checking version details.
If you don’t know how to use a function, you can use the help
command. For example, help substr
can get the usage and examples of substr
7.1 Character functions
function | illustrate |
---|---|
left() 、right() | Returns the characters on the left and right of the string |
length() | Return string length |
lower() 、upper() | Returns the lowercase and uppercase of the string |
rtrim() 、ltrim() 、trim() | Remove the spaces on the right, left, and both sides |
locate() | Find a substring of a string |
soundex() | Returns the sundex value of the string |
substring() | Returns the character of the substring |
subset() | Returns the character of the substring ( substring use of 06136d5c30b389) |
instr() | Returns the index of the first occurrence of the substring, without returning 0 |
replace() | String replacement |
lpad() 、rpad() | Left fill, right fill |
Example:
# upper、lower 将姓变大写,名变小写,然后拼接
select concat(upper(last_name), lower(first_name)) 姓名 from employees;
# 姓名中首字符大写,其他字符小写然后用_拼接,显示出来
select concat(upper(substr(last_name, 1, 1)), '_', lower(substr(last_name, 2))) from employees;
# substr 截取字符串,sql 中索引从 1 开始而不是0
select substr('hello world', 3); # llo world
select substr('hello world', 2, 3); # ell
# instr 返回子串第一次出现的索引,没找到返回 0
select instr('abcdefg', 'cd'); # 3
# trim 减去字符串首尾的空格或指定字符
select trim(' hello '); # hello
select trim('aa' from 'aaabaabaaaaaa'); # abaab
# lpad 用指定的字符实现左填充指定长度
select lpad('he', 5, '-'); # ---he
# rpad 用指定的字符实现左填充指定长度
select rpad('he', 5, '-*'); # he-*-
# replace 替换
select replace('abcabcabc', 'bc', '--'); # a--a--a--
7.2 Mathematical functions
function | illustrate |
---|---|
round() | rounding |
ceil() | Rounded up |
floor() | Round down |
truncate() | Keep a few decimal places |
mod() | Take the remainder |
abs() | Return absolute value |
rand() | Returns a random number |
Example:
# round 四舍五入,第二个参数是小数点后保留的位数
select round(-1.55); # -2
select round(1.446, 2); # 1.45
# ceil 向上取整
select ceil(1.001); # 2
select ceil(1.000); # 1
select ceil(-1.001); # -1
# floor 向下取整
select floor(1.001); # 1
select floor(1.000); # 1
select floor(-1.001); # -2
# truncate 小数点后截断几位
select truncate(1.297, 1); # 1.2
select truncate(1.297, 2); # 1.29
# mod 取余和%同理,符号与被除数一样
select mod(10, -3); # 1
select mod(-10, -3); # -1
select mod(-10, 3); # -1
select 10%3; # 1
7.3 Date functions
function | illustrate |
---|---|
now() | Returns the current system date and time |
curate() 、current_date | Returns the current system date, excluding time |
curtime() 、current_time | Returns the current time, excluding the date |
year() 、month() 、day() 、hour() 、minute() 、second() | Get the specified part of the time, year, month, day, hour, minute, second |
str_todate() | Convert the characters in the date format to the date in the specified format |
date_format() | Convert date to specified format character |
Example:
# now 返回当前系统日期和时间
select now(); # 2020-07-08 12:29:56
# curdate,current_date 返回当前系统日期,不包括时间
select curdate(); # 2020-07-08
# curtime,current_time 返回当前时间,不包括日期
select curtime(); # 12:29:56
# year... 获取时间指定部分,年、月、日、小时、分钟、秒
select year(now()); # 2020
select month(now()); # 7
select monthname(now()); # July
select day(now()); # 8
select dayname(now()); # Wednesday
select hour(now()); # 12
select minute(now()); # 29
select second(now()); # 56
select month(order_date) from orders;
# str_to_date 将日期格式的字符转换成指定格式的日期
select str_to_date('1-9-2021', '%d-%c-%Y'); # 2020-09-01
select * from orders where order_date = str_to_date('2005-09-01', '%Y-%m-%d');
# date_format 将日期转换成指定格式的字符
select date_format(now(), '%Y年%m月%d日'); # 2020年09月01日
select order_num orderId,date_format(order_date, '%Y/%m') orderMonth from orders;
Date format:
Format specifier | Function |
---|---|
%Y | Four years |
%y | Two years |
%m | Month(01,02,...12) |
%c | Month (1,2,...12) |
%d | Day (01, 02,...) |
%e | Day (1,2,...) |
%H | Hour (24 hour clock) |
%h | Hour (12-hour clock) |
%i | Minutes (00,01,...59) |
%s | Seconds (00,01,...59) |
7.4 Aggregate functions
Aggregate function (aggregate function) A function that runs on a row group to calculate and return a single value.
function | illustrate |
---|---|
avg() | Returns the average value of a column |
count() | Returns the number of rows in a column |
max() 、min() | Return the maximum and minimum values of a column (null values are ignored) |
sum() | Returns the sum of a column (ignoring null values) |
Example:
# 计算产品价格平均值
select avg(prod_price) as avgPrice from products;
# 计算供应商id为 1003 提供的产品的平均价格
select avg(prod_price) as avgPrice from products where vend_id = 1003;
# 计算价格最大的产品价格
select max(prod_price) as maxPrice from products;
# 计算顾客总数
select count(*) from customers;
# 计算具有 email 的顾客数
select count(cust_email) from cutomers;
# 计算产品价格总和
select sum(prod_price) from products;
# 计算订单为 20005 的订单总额
select sum(item_price * quantity) totalPrice from orderitems where order_num = 20005;
# 计算产品具有的不同的价格的平均数
select avg(distinct prod_price) avgPrice from products where vend_id = 1003;
# 同时计算产品总数、价格最小值、最大值、平均数
select count(*) prodNums, min(prod_price) priceMin, max(prod_price) priceMax, avg(prod_price) priceAvg from products;
8. Packet data
The previous aggregate functions are where
clause, such as querying the average price of a supplier’s products, but if you want to return the average price of the products provided by each supplier, what should you do? Deal with it. This should use grouping, which allows data to be divided into multiple logical groups, so that each group can be aggregated and calculated.
8.1 Create group by
Using the group by
clause can instruct MySQL to sort and group the data by a certain data, and then aggregate each group instead of the entire result set.
# 分别查询每个供应商提供的产品种类数
select vend_id, count(*) num_prods from products group by vend_id;
# 查询每个供应商提供的产品的平均价格
select vend_id, avg(prod_price) avg_price from products group by vend_id;
Notice:
group by
clause can contain any number of columns. This makes it possible to nest groups and provide more detailed control over data grouping.group by
is nested in the 06136d5c30ce9b clause, the data will be summarized on the last specified grouping. In other words, when creating a group, all the specified columns are calculated together (so data cannot be retrieved from individual columns).group by
clause must be a retrieval column or a valid expression (but not an aggregate function). If you use an expressionselect
, you must specify the same expressiongroup by
Cannot use aliases.- Except for aggregate calculation statements, each column in the
select
group by
clause. - If the grouping column has a null value, then null will be returned as a grouping. If there are multiple rows of null values in the column, they will be grouped together.
group by
clause must appear inwhere
after clause,order by
before clause.
8.2 Filtering group having
In addition to using group by
group data, MySQL also allows the use of the having
keyword to filter the groups, specifying which groups to include and which groups to exclude.
The grammatical sequence is as follows:
# 语法顺序
select [查询列表] from [表名] where [筛选条件] group by [分组列表] having [分组后筛选] order by [排序列表] limit [要检索行数];
where
filtering does not have the concept of grouping. It specifies rows instead of groups. For grouping filtering, use the having
clause. In fact, all types of where
clauses learned so far can be replaced with having
.
Regarding having
and where
, here is another way to understand, where
before data grouping, and having
filters after data grouping. where
excluded by 06136d5c30d0a2 are not included in the grouping, which may change the calculated value, thereby affecting the groupings filtered out based on these values in the having
Can use pre-group screening where
, priority is given to pre-group screening.
# 找到提供大于 2 个产品的供应商,并列出其提供的产品数量,这里使用 having 来过滤掉产品数不大于2的供应商
select vend_id, count(*) prodCount from products group by vend_id having prodCount > 2;
# 找到供应商提供的商品平均价格大于 10 的供应商,并且按平均价格降序排列
select vend_id, avg(prod_price) avgPrice from products group by vend_id having avgPrice > 10 order by avgPrice desc;
9. Subqueries
subquery (subquery), a query nested in other queries.
9.1 Filtering using subqueries
When another complete select
statement is nested in select
statement is called a subquery or inner query, and the outer select
statement is called a main query or an outer query.
All the previous queries are in the same table. If the information we want to get is scattered in two or more tables, for example, we want to get the customer ID from the orders table, and then use the customer ID to go to the customer table custormers to find the corresponding customer information.
# 首先在 orderitems 表中找到产品 TNT2 对应的订单编号
select order_num from orderitems where prod_id = 'TNT2'
# 然后在 orders 表中找到订单编号对应的顾客 id
select cust_id from orders where order_num in (
select order_num from orderitems where prod_id = 'TNT2');
# 然后去 customers 表中找到顾客 id 对应的顾客名字
select cust_id, cust_name from customers where cust_id in (
select cust_id from orders where order_num in (
select order_num from orderitems where prod_id = 'TNT2'));
There are actually three statements here. The innermost subquery returns a list of order numbers, which is used in the where
clause of the outer subquery. The outer subquery returns a list of customer IDs, which are used in the where clause of the outermost query. The outermost query finally returns the required data.
There is no limit to the number of subqueries that can be nested, but due to performance limitations in actual use, too many subqueries cannot be nested.
9.2 Related subqueries
related subquery (correlated subquery) A subquery involving an external query.
Another way to use subqueries is to create calculated fields. Suppose you need to display the total number of orders for each customer in the customers table. The order and the corresponding customer ID are stored in the orders table.
# 首先找到用户 ID 对应的订单数量
select count(*) from orders where cust_id = 10003;
# 然后将其作为一个 select 子句,将用户 id
select cust_name, cust_state, (
select count(*) from orders where orders.cust_id = customers.cust_id) as order_count
from customers order by order_count desc;
Note the above where orders.cust_id = customers.cust_id
. This type of subquery is called a correlated subquery. At any time, as long as the column name may be ambiguous, you must use the fully qualified syntax (the table name and column name are separated by a period).
10. Join table
If the data to be checked is scattered in multiple tables, how to use a single select
statement to find the data? It can be done by using joins.
Connection is a mechanism for select
statement, so it is called a connection. Using a special syntax, you can join multiple tables to return a set of output, and join the correct rows in the associated table at runtime.
Maintain referential integrity : When using relational tables, it is very important to insert only valid data in relational columns. If you insert into the products table products produced by suppliers that have a supplier ID that does not appear in the vendors table, these products are inaccessible because they are not associated with a certain supplier.
To prevent this from happening, you can instruct MySQL to only allow legal values in the vendor ID column of the products table (that is, the vendors that appear in the vendors table). This is to maintain referential integrity, which is achieved by specifying the primary key and foreign key in the definition of the table.
10.1 Create a connection
The creation of the join is very simple, specify all the tables to be joined and how they are related.
# 列出产品的供应商及其价格
select vend_name, prod_name, prod_price from vendors, products
where vendors.vend_id = products.vend_id order by prod_price desc;
Here, after where
, the fully qualified column name is used to specify that MySQL matches the vender_id column of the vendor table and the vender_id field of the products table.
When the quoted column may be ambiguous, you must use the fully qualified column name, because MySQL does not know which column you are referring to.
When joining two tables, what is actually done is to pair each row of one table with each row of the other table, so the where
clause is used as the filter condition to filter out only the column where vendors.vend_id = products.vend_id
that contains the specified join condition, and there is no where clause. , Will return the length product of the two tables. This is called Cartesian product (cartesian product). You can run this sentence to see:
# 返回两个表长度乘积行
select vend_name, prod_name, prod_price from vendors, products;
All connections should always use the connection condition, otherwise a Cartesian product will be obtained.
10.2 Join multiple tables
A select
statement can also join multiple tables. For example, you need to list the product information, order information, and supplier information of an order, and the product information you are looking for is scattered in the three tables of supplier, product, and order information.
# 将订单 20005 的产品信息、订单信息、供应商信息找出来
select prod_name, vend_name, prod_price, quantity
from orderitems,
products,
vendors
where products.vend_id = vendors.vend_id
and orderitems.prod_id = products.prod_id
and order_num = 20005;
Here, and
used to connect multiple join conditions, which defines what is used as an association between the three tables.
Note: MySQL can be very resource intensive to associate multiple tables to process joins at runtime. Do not join unnecessary tables. The more tables that are connected, the more severe the performance degradation.
Here you can use joins to implement the example in Section 9.1. Previously, it was implemented using a subquery to obtain the customer ID from the orders table, and then use the customer ID to go to the customer table custormers to find the corresponding customer information.
# 使用联结来实现 9.1 的例子
select customers.cust_id, cust_name
from orders, customers, orderitems
where orders.order_num = orderitems.order_num
and customers.cust_id = orders.cust_id
and prod_id = 'TNT2'; # 由于三个表中只有一个表有 prod_id,所以不需要限定表名
Here to mention, not only columns can be aliased, but tables can also be aliased. The usage is the same as that of column aliases:
# 把前面这个句子起别名
select c.cust_id, cust_name
from orders o, customers c, orderitems oi
where o.order_num = oi.order_num
and c.cust_id = o.cust_id
and prod_id = 'TNT2';
This not only shortens the SQL statement, but also allows the select
statement. The aliases created at the same time can be used not only in the select
clause, but also in the where
, order by
clauses and other parts of the statement.
10.3 inner join
The join used so far is called equivalent join (equijoin), which is based on the equality test between two tables, also known as internal join . In fact, a slightly different syntax can be used for this type of connection to clearly specify the type of connection. The following select
statement returns exactly the same data as the previous example:
# 列出产品的供应商及其价格
select vend_name, prod_name, prod_price
from vendors
inner join products
on vendors.vend_id = products.vend_id;
The join condition here uses the on
clause instead of where
. Both of these grammars can achieve results. Although the use of the where
clause to define the connection is indeed relatively simple, the use of a clear connection grammar can ensure that the connection conditions are not forgotten, and sometimes this can also affect performance.
10.4 Self-connection
For example, a product has a quality problem, and now I want to find out all the product information provided by the supplier of this product. According to the subquery described earlier, we can first find the supplier of the corresponding product, and then find the list of products with this supplier ID:
# 先找到产品 ID 为 TNT1 的供应商 ID,然后找到对应供应商 ID 提供的产品列表
select prod_id, prod_name, vend_id
from products
where vend_id in (
select vend_id
from products
where prod_id = 'TNT1'
);
It can be achieved using subqueries, and it can be done using joins. This is self-joining:
# 自联结
select p1.prod_id, p1.prod_name, p1.vend_id
from products p1,
products p2
where p1.vend_id = p2.vend_id
and p2.prod_id = 'TNT1';
The two tables of the self-join query are the same table, so the products table needs to be aliased separately for distinction, and select
clause also need to be qualified, because the same fields appear in both tables.
Self-join is usually used as an external statement to replace the subquery statement used when retrieving data from the same table. Although the final result is the same, sometimes processing joins is much faster than processing subqueries. You should try two methods to determine which one has better performance.
10.5 Natural Connection
Whenever a table is joined, there should be at least one listed in more than one table (columns being joined). The standard join returns all data, even the same column appears multiple times. Natural joins exclude multiple occurrences, so that each column is returned only once.
Natural connection is that you select only those unique columns. This is generally done by using wildcards for the table and explicit subsets for all other table columns.
# 自选择唯一的通配符只对第一个表使用。所有其他列明确列出,所以没有重复的列被检索出来。
select v.*, p.prod_id
from vendors v,
products p
where v.vend_id = p.vend_id;
10.6 outer join
In some cases, the join contains rows that have no associated rows in the related table. This type of join is called external join .
for example:
- Count how many orders each customer has placed, including those customers who have not yet placed an order;
- List all products and the quantity ordered, including products that no one ordered;
- Calculate the average sales volume, including those customers who have not yet placed an order.
At this time, the link needs to contain those rows that have no associated rows.
For example, to retrieve all users and their orders, list those without orders:
# 内部联结,查找用户对应的订单
select c.cust_id, o.order_num
from customers c
inner join orders o
on c.cust_id = o.cust_id;
# 左外部联结,将没有下单过的顾客行也列出来
select c.cust_id, o.order_num
from customers c
left outer join orders o
on c.cust_id = o.cust_id;
# 右外部联结,列出所有订单及其顾客,这样没下单过的顾客就不会被列举出来
select c.cust_id, o.order_num
from customers c
right outer join orders o
on c.cust_id = o.cust_id;
When using the outer join
syntax, you must use the right
or left
keyword to specify the table that includes all its rows. right
refers to the table on the right of outer join
left
refers to the table on the left of outer join
The above uses left outer join
to select all rows from the left table custermers of the from
In order to select all rows from the table on the right, right outer join
should be used.
The left outer join can be from
or where
. Which one is more convenient for you.
10.7 Using joins with aggregate functions
For example, if you want to retrieve the number of orders placed by a customer, write 0 even if there is no one. At this time, use grouping and the count
to count the number:
# 找到每个顾客所下订单的数量,并降序排列
select c.cust_id, c.cust_name, count(o.order_num) count_orders
from customers c
left outer join orders o on c.cust_id = o.cust_id
group by c.cust_id
order by count_orders desc;
Because even if the customer does not place an order, it will still be in the result, so put the customer table on the left and use the left outer link.
11. Combination query
MySQL allows multiple select statements to be executed and the results are returned as a single query result set. These combined queries are usually called unions or compound queries.
There are two situations where you need to use combined queries:
- Return similar structured data from different tables in a single query;
- Perform multiple queries on a single table, and return data in a single query.
In most cases, the combined query can be replaced by a single query where
You can try these two methods for specific scenarios to see which one performs better for a particular query.
11.1 Create a combined query union
When the query results come from multiple tables, but there is no correlation between the multiple tables, a combined query is often used at this time. Put the keyword union
between each select
statement.
# 比如需要列出商品价格小于等于 10 而且是供应商 ID 为 1005 或 1003 的产品信息
select prod_id, prod_name, prod_price, vend_id from products where prod_price <= 10
union
select prod_id, prod_name, prod_price, vend_id from products where vend_id in (1005, 1003);
# 实际上这句也可以通过 where 语句代替
select prod_id, prod_name, prod_price from products where prod_price <= 10 or vend_id in (1005, 1003);
- In some cases, such as more complex filter conditions and the need to retrieve data from multiple tables,
union
may be simpler to use 06136d5c30dcb4. union
Each query must contain the same columns, expressions, and aggregate functions, but each column does not need to be listed in the same order.- The column data types must be compatible, the types need not be exactly the same, but must be implicitly converted by the database management system.
- The sorting of the combined query
order by
can only appear after the lastselect
statement, and the differentselect
statements cannot be sorted separately.
11.2 Include or cancel duplicate rows union (all)
Two lines of union
separate statements may return duplicate rows, but the actual result of the previous example does not contain duplicate rows. This is because the union
keyword automatically removes the duplicate rows. If you don’t want to remove duplicates, you can use the union all
keyword. .
# 不去重重复行
select prod_id, prod_name, prod_price, vend_id from products where prod_price <= 10
union all
select prod_id, prod_name, prod_price, vend_id from products where vend_id in (1005, 1003);
If duplicate rows are required, the where
keyword cannot be used to achieve the same effect at this time.
12. Addition, deletion and modification of data
What I said earlier is all about querying data. This chapter will add, delete, and modify data.
12.1 Insert into
Data insertion uses the insert
keyword, which can insert one row, multiple rows of data, or the results of certain queries.
# 插入一条数据到顾客表中
insert into customers
values (null, 'Zhang San', '001 street', 'ShangHai', 'SH', '666666', 'ZH', null, null);
Here insert a piece of data into the customer table, the data stored in each table column needs to values
, given in the order in which the table was created. If a column has no value, give null. Although the attribute of the first data corresponding to the cust_id column is not null
, this column is auto_increment, which is self-increasing. MySQL will automatically ignore the null you give and automatically increase the value and then fill it.
However, it is not safe to use the above values
clause, because this method of injecting data depends entirely on the order of the input data. If the table structure changes, it will cause the input data to be misplaced.
The safe way to insert data is like this:
# 安全但繁琐的插入方式
insert into customers(cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
values ('Zhang San', '001 street', 'ShangHai', 'SH', '666666', 'ZH', null, null);
Here, the column names corresponding to the data in the following parentheses are given in the previous parenthesis, so that even if the table structure or order changes, the data can be inserted correctly.
You can see that the column cust_id is omitted. When the following conditions are met, the column can be omitted:
- The column is defined to allow null values;
- The default value is given in this column when the table is defined, which means that if no value is given, the default value will be used.
If it cannot be omitted but it is omitted, an error will be reported.
insert
operation can be time-consuming, especially when there are many indexes that need to be updated, and it can reduce the performance of the select statement waiting to be processed. If data retrieval is the most important, you can by insert
and into
add keywords between low_priority
, reduce insert
priority statement, which also applies to the below mentioned update
and delete
statements.
12.2 Insert multiple rows
insert
statement introduced above can insert one row at a time. If you want to insert multiple rows at a time, listing the column names every time is more cumbersome. You can use the following method:
# 插入多个行
insert into customers(cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
values ('Zhang San', '001 street', 'ShangHai', 'SH', '666666', 'ZH', null, null),
('Li Si', '002 street', 'BeiJing', 'BJ', '878787', 'ZH', null, '123123@163.com');
values
clause, continue to enclose the field in parentheses to add a new line, with a comma in between. This can improve the performance of database processing, because a single insert
statement can handle multiple inserts faster than using multiple insert
statements.
12.3 Insert the retrieved data insert select
insert
can select
statement into the table, which is insert select
. For example, you want to insert data queried from another table into this table:
# 从别的表中找出数据,并插入 customers 表中
insert into customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
select id, name, address, city, state, zip, country, contact, email
from custnew;
select
can also omit the cust_id, so that a new cust_id can be automatically generated for insert
In addition, you can see that the column name of the select statement does not correspond to the column name after the insert into
insert select
uses position correspondence. The first column returned by the select statement corresponds to cust_id, and the second column corresponds to cust_name, which corresponds in turn . select
statement, you can also add the where
clause to filter.
12.4 Modify data update
update
statement is used to modify the data in the table. update
. Don't forget to add the where
clause, because if you are not careful, all rows in the table will be updated.
# 更新 id 为 10005 的用户的信息
update customers set cust_email = '888@qq.com' where cust_id = 10005;
where
clause is not used here update
will update the cust_email field of all rows in this table, so be careful.
To delete the value of a certain row and column, you can modify the value to null.
The way to update multiple fields is also very simple:
# 更新多个字段
update customers
set cust_email = '666@qq.com',
cust_contact = 'S Zhang'
where cust_id = 10005;
If you update
statement, and an error occurs when updating one or more of these rows, the entire update
operation is cancelled (all rows updated before the error occurred are restored to their original values). Even for the error occurs, continue to be updated, you can update
use after ignore
keyword.
update
statement can use subqueries to update the column data with the data retrieved select
12.5 Delete data delete
delete
statement can be used to delete specific rows or all rows from the table. Be careful when using the delete
statement. Don't forget to add the where
clause, because you will delete all rows in the table if you are not careful.
# 删除顾客表中顾客 id 为 10008 的行
delete from customers where cust_id = 10008;
If the where
clause is removed, then all rows in this table are deleted, but this table is not deleted, and another statement drop
is used to delete the table. In addition, the faster statement to delete all rows in a table is truncate table
, because delete
deletes data row by row, while truncate
deletes the original table and rebuilds a table.
Note that update
and delete
, because MySQL has not been revoked. It is recommended to use select
for testing before using it to prevent data loss due to incorrect clauses of where
13. Create and manipulate tables
13.1 create table
create.sql file that was executed for the whole point of data. It can be opened with VSCode/Webstorm/Navivate/ text. The first line of this file except for the comment is the table creation statement:
CREATE TABLE customers
(
cust_id int NOT NULL AUTO_INCREMENT,
cust_name char(50) NOT NULL ,
cust_address char(50) NULL ,
cust_city char(50) NULL ,
cust_state char(5) NULL ,
cust_zip char(10) NULL ,
cust_country char(50) NULL DEFAULT 'ZH', # 指定默认值
cust_contact char(50) NULL ,
cust_email char(255) NULL ,
PRIMARY KEY (cust_id) # 指定主键
) ENGINE=InnoDB;
From here you can see the format of create table
If you want to create a table does not exist at the time, it should be in front of the table name, create table
after adding if not exists
. This will first check whether the table name already exists, and create it if it does not exist.
For auto_increment
, each table can only have one auto_increment
, and it must be indexed. When y
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。