83
头图

The database is a hurdle that has to be crossed in the development of the full stack. It is inevitable that everyone will learn and use relevant knowledge. When I checked the information recently, I found that many contents on the Internet are either very deep or out of structure. For some hopes, it is only used. The people below are not very friendly. I just had the opportunity to learn MySQL recently. I concentrated on learning MySQL for some time and made some notes. Each concept basically has code examples, and each line is typed. Readers can copy the code directly to the command line to run it. Hope it helps everyone~ 😜

The knowledge introduced in this article is not very deep. The target users are those who have zero or weak foundations of MySQL. They can help establish some concepts of MySQL. At least they know how to get to Baidu when they encounter related problems, and they will not encounter back-end feedback. The database file cannot be read.

For those who don’t know anything about Docker and CentOS, you can check out <Hand-in-hand with Docker from entry to practice> and about the basic knowledge necessary for getting started with CentOS> 16136d5c2e93ab two articles, anyway, Docker and CentOS will also be used sooner or later😂

All codes are stored on Github , and can be cloned to read and execute by themselves.

CentOS version: 7.6

MySQL version: 8.0.21

The above mind map can be added to the official account at the end of the article to reply "mysql mind map" to obtain the xmind source file.

1. What is a database

A database is a collection of data stored in an organized way. Think of it as a file cabinet.

1.1 Basic information

MySQL database belongs to MySQL AB, headquartered in Sweden, and later acquired by oracle. Is currently the most popular relational database.

advantage:

  1. Low cost: open source code, generally free trial;
  2. High performance: fast execution;
  3. Simple: It is easy to install and use.

1.2 MySQL installation

MySQL recommends using Docker to install, just a few lines of commands will be installed, see <Hand-in-hand with Docker from entry to practice>

My command here is:

# 创建mysql容器
docker run -d -p 3307:3306 -e MYSQL_ROOT_PASSWORD=888888 \
-v /Users/sherlocked93/Personal/configs/mysql.d:/etc/mysql/conf.d \
-v /Users/sherlocked93/Personal/configs/data:/var/lib/mysql \
--name localhost-mysql mysql

# 创建好之后进入 mysql 容器:
docker exec -it localhost-mysql bash

# 登录 mysql
mysql -u root -p888888

If you have installed navicate on your machine, you can refer to the following configuration

选择 New Connection

Fill in the configuration after selecting New Connection:

navicat配置

You can see the contents of your database.

That's it, the effect is as follows:

image-20210830202717904

Without Docker, you can go to the official website MySQL Community Server download the corresponding version of the MySQL installation package. The Community Server version is free. You can download and install it. Basically, just continue to the next step.

Stop talking nonsense, let's start the knowledge irrigating directly below!

2. MySQL is simple to use

2.1 Database related terms

Concepts and terms related to the database:

  1. database (database) A container for storing organized data;
  2. table (table) A structured list of a certain type of data;
  3. schema (schema) Information about the layout and characteristics of databases and tables;
  4. column (column) a field in the table, all tables are composed of one or more columns;
  5. data type (datatype) the type of data allowed;
  6. row (row) a record in the table;
  7. primary key (primary key) A column (or a group of columns) whose value can uniquely distinguish each row in the table;
  8. Foreign key (foreign key) A column in the table that contains the primary key value of another table and defines the relationship between the two tables.
  9. clause (clause) SQL statement is composed of clauses, some clauses are required, and some are optional. For example, the from clause of the select statement.

2.2 Primary key

The concept of the primary key is very important. It uniquely identifies a single or multiple columns in each row of the table called the primary key. The primary key is used to represent a specific row.

Although the primary key is not always required, you should try to ensure that each table has a primary key defined to facilitate future data manipulation and management. Without a primary key, different rows cannot be distinguished, and it is difficult to update or delete specific rows in the table.

Any column in the table can be used as the primary key, as long as it meets the following conditions:

  1. No two rows have the same primary key value;
  2. Each row must have a primary key value (primary key columns do not allow NULL values).

When using multiple columns as the primary key, the above conditions must be applied to all the columns that make up the primary key, and the combination of all column values must be unique (the value of a single column may not be unique).

A few generally recognized best habits are:

  1. Do not update the value in the primary key column;
  2. Do not reuse the value of the primary key column;
  3. Do not use values that may change in the primary key column.

2.3 grammar specification

Grammar specification:

  1. Enter help or \h get help;
  2. It is not case sensitive, but it is recommended that keywords be capitalized, and table names and column names are lowercase;
  3. It is best to end each command with a semicolon ; or \g , just press Enter without executing the command;
  4. Each command can be indented and line-wrapped as needed;
  5. Use # for multi-line comments, and /* ... */ for multi-line comments;
  6. Enter quit or exit launch the MySQL command line;

Grammatical characteristics:

  1. Case insensitive;
  2. It can be written on one or more lines, and can be divided into multiple lines for easy reading and debugging;
  3. Keywords cannot be abbreviated or branched;
  4. Each clause is generally written separately;
  5. Recommend the use of indentation to improve the readability of sentences;

Common simple commands:

mysql -u root -p      # –h 主机名 –u 用户名 -P 端口号 –p密码,注意-p跟密码之间不能加空格其他可以加可以不加
select version();     # 查看 mysql 服务版本
show databases;       # 查看所有数据库,注意最后有 s

create database [库名]; # 创建库
use [库名];             # 打开指定的库

show tables;           # 查看当前库的所有表        
show tables from [库名];   # 查看其他库的所有表               
desc [表名];               # 查看表结构   

create table [表名] (      # 创建表            
    列名 列类型,
    列名 列类型,
);

drop database [库名];     # 删除库  
drop table [表名];        # 删除表   

exit;                    # 退出

2.4 Create a table and fill in the data

First of all, we have the whole point of data to facilitate the code demonstration later.

mysql -u root -p888888     # 输入用户名密码,进入mysql命令行

Then download the file create.sql on Github and run it (you can also copy the contents of the file directly to the MySQL command line for execution).

If you are using navicate, after creating a connection to localhost-mysql in the previous chapter, just run it:

Similarly, run another file populate.sql to fill in the data in each table.

After running, you can see the table in the library in the command line show tables

2.5 Relationship table

Briefly introduce the table just created.

For data classification and processing, customers, vendors, order orders, order information orderitems, product records productnotes, and products tables store different information respectively.

For example, in the vendor information table vendors, each vendor has a unique identifier, that is, the primary key vend_id, and each product in the products product table also has a primary key prod_id, and a field vendor_id vendor ID and vendor table vend_id correspondence, which is foreign key .

If you want to find the corresponding supplier information through the product ID, then use the foreign key to find the information in another table. The foreign key prevents each product from repeatedly saving the detailed information of the supplier. Just save the ID of the supplier. When the supplier's information changes, such as the mailbox and address, there is no need to change the data of each row one by one, just change the supply. Corresponding supplier information in the quotient table.

The benefits of this:

  1. Supplier information is not repeated, so no time and space are wasted;
  2. If the supplier information changes, only a single record in the vendors table can be updated, and the data in the related table does not need to be changed;
  3. Since there is no duplication of data, it is obvious that the data is consistent, which makes it easier to process the data.

scalability (scale), able to adapt to the increasing workload without failure. A well-designed database or application is called scale well.

2.6 Data Type

The MySQL data type defines the rules for what data can be stored in a column and how the data is stored.

Numerical

Integer: Tinyint , Smallint , Mediumint , Int ( Integer ), Bigint , which can be unsigned and signed, and signed by default.

  1. If you don’t set the unsigned, the default is signed. If you want to set unsigned, you can add the keyword unsigned
  2. If the inserted value exceeds the range of the integer, an out of range exception will be reported, and the critical value will be inserted;
  3. If the length is not set, there will be a default length.

Decimal

  1. Fixed points: dec(M,D) , decimal(M,D)
  2. Floating point numbers: float(M, D) , double(M, D)

M is the integer part + the decimal part, and D is the decimal part. Both M and D can be omitted. If it is decimal , M defaults to 10 and D defaults to 0.

Character type

  1. Shorter text: char(n) and varchar(n) represents the number of characters, not the number of bytes.
  2. Longer text: text (long text data), blob (longer binary data).
  3. binary and varbinary used to save short binary.
  4. enum used to save the enumeration.
  5. set used to save the collection.

Date and time type

  1. date format YYYY-MM-DD, save the date;
  2. time format HH:MM:SS, save time;
  3. year format YYYY, preservation year;
  4. datetime format YYYY-MM-DD HH:MM:SS, save date + time, range 1000-9999 , not subject to time zone impression;
  5. timestamp timestamp, the format saves the date + time, the range is 1970-2038 , which is affected by the time zone;

3. Retrieve data select

select statement used to query is probably the most commonly used. It is used to retrieve information from one or more tables. A select statement must give at least two pieces of information: what you want to choose and where to choose.

# 基本语法
select [查询列表] from [表名];

# 查询单个/多个/所有字段
select cust_name from customers;
select cust_name,cust_city,cust_address from customers;
select `select` from customers;               # 如果某字段是关键字,可以用 ` 符号包起来
select * from customers;                      # * 表示所有                

# 查询常量值/表达式/函数
select 100;
select 'zhangsan';
select 100%98;
select version();

3.1 Deduplication distinct

The result of the query may have multiple duplicate values, you can use the 06136d5c2ea9df keyword to distinct

select order_num from orderitems;           # 有一些重复值
select distinct order_num from orderitems;  # 将重复值去重

3.2 Limit the result limit

The select statement returns all matching rows, which may be every row in the specified table. In order to return to the first or previous rows, the clause limit

limit m means returning the first m rows found, limit m,n means returning the n rows starting from the mth row, and you can also use limit m offset n have the same meaning as before.

Note that the index of the first row retrieved is row 0.

image-20210831110913412

3.3 fully qualified table names and column names

In some cases, the statement may use fully qualified listings and table names:

select orderitems.order_num from mysql_demo1.orderitems;
# 上面这句等价于
select order_num from orderitems;

4. Sort and retrieve data order by

In the previous chapter orderitems is not sorted. Under normal circumstances, the returned order is the order in which it appears in the underlying table. You can sort the retrieved data order by

asc and desc can be used to specify the sorting direction. order by asc ascending order, order by desc descending order, if not written, the default is ascending order.

order by clause can support a single field, multiple fields, expressions, functions, and aliases, which are generally placed at the end of the sentence, except for limit .

select * from orderitems order by item_price;           # 按item_price升序排列

# 先按 quantity 升序排列,quantity 的值一样时按 item_price 的值升序排列
select * from orderitems order by quantity,item_price;  

# 先按 quantity 降序排列,quantity 的值一样时按 item_price 的值升序排列
select * from orderitems order by quantity desc,item_price;  

# 找到最贵订单
select * from orderitems order by item_price desc limit 1;

5. Filter data where

Use the where from clause to add filter conditions and filter data.

# 基本语法
select [查询列表] from [表名] where [筛选条件] order by [排序条件];

Filter by conditional expression > , = , < , >= , <= , != , <> , <=>

# 找出产品价格为 2.5 的产品名字
select prod_name, prod_price from products where prod_price=2.5;

# 找出产品价格小于等于 10 的产品名字,并按产品价格降序排列
select prod_name, prod_price from products where prod_price <= 10 order by prod_price desc;

# 找到供应商 id 不为 1003 的产品,!= 和 <> 含义一样,都是不等于
select vend_id, prod_name from products where vend_id <> 1003;

5.1 Range check between and

Use between ... and ... specify the start value and end value of the required range to achieve the effect of range query.

Note that the between and around 06136d5c2eacf2 are in the order of small and large, so they cannot be adjusted.

# 查询产品价格在 3 到 10 内的产品
select prod_name, prod_price from products where prod_price between 3 and 10;

# 单独使用 and 也可以打到这个效果
select prod_name, prod_price from products where prod_price <= 10 and prod_price >= 3;

5.2 Null value check is (not) null

When creating a table, you can specify that certain columns may not contain a value, that is, null , null represents no value, which is different from a field containing 0, an empty string, or only containing spaces.

Using is null or is not null can be used to determine whether a value is null .

illustrate:

  1. Equal to = and not equal to <> , != cannot be used to judge null is null and is not null can be used to judge null
  2. <=> security equal sign can be used to determine null
# 找出顾客中邮箱不是 null 的顾客信息
select * from customers where cust_email is not null;

# 使用安全等于号来判断 null 也是可以的
select * from customers where cust_email <=> null;

5.3 The logical AND operator and

operator (operator) is used to connect or change the keywords in the where clause. It is also called the logical operator (logical operator).

The above mentioned the and operator, through this operator can increase the qualification:

# 找出供应商为 1003 提供的价格小于等于 10 的产品
select * from products where vend_id = 1003 and prod_price <= 10;

5.4 logical OR operator or

or operator is the and operator. This is a logical OR operator, which returns rows that match any condition:

# 找出id为 1003 或 1001 的供应商
select * from products where vend_id = 1003 or vend_id = 1001;

When and and or appear at the same time, and will be processed first, such as this sentence:

select * from products where vend_id = 1001 or vend_id = 1003 and prod_price >= 10;

This sentence will process and first, which means that vend_id is 1003 and prod_price greater than or equal to 10, or vend_id is 1001.

In this case, you can add parentheses:

select * from products where (vend_id = 1001 or vend_id = 1003) and prod_price >= 10;

The result of this search is a vend_id is 1001 or 1003 and whose prod_price greater than or equal to 10.

and you use the where clause with the 06136d5c2eb085 and or operators, you should use parentheses to explicitly group the operators. Don't rely too much on the default calculation order, even if it is what you want, and using parentheses can eliminate ambiguity and increase readability.

5.5 Range operator in (set)

Use the in operator to specify the range of conditions, and each condition in the range can be matched. in The value to be matched is placed in parentheses:

# 找出id为 1003 或 1001 的供应商
select * from products where vend_id in (1001, 1003);

in operator can be replaced with or . It is recommended to use in in the following situations:

  1. When there are more options, in is clearer and more intuitive;
  2. When using in , the order of calculation is easier to manage (because fewer operators are used);
  3. in generally faster than the or operator list;
  4. in is that it can contain other select statements, making it possible to build the where clause more dynamically.

5.6 logical negation operator not

not operator can be in and between and to indicate the negation of the range:

# 找出id不为 1001 1003 的产品
select * from products where vend_id not in (1001, 1003);

# 选择产品价格不在 5 到 15 之间的产品
select * from products where prod_price not between 5 and 15;

5.7 like operator

For example, if you want to find anvil , you can search through the like operator. like indicates that the following search pattern uses wildcard matching instead of direct equal matching.

Operator%

The most commonly used wildcard is the % operator. % represents any number of characters, including no characters.

# 找出产品名字以 jet 开头的产品
select * from products where prod_name like 'jet%';

# 找出产品名中含有 on 的产品
select * from products where prod_name like '%on%';

# 找出产品名以 s 开头,以 e 结束的产品
select * from products where prod_name like 's%e';

Note that % cannot match null .

Operator_

_ represents any single character.

select * from products where prod_name like '_ ton anvil';

In addition, the translation uses \ , such as \_

# 找到描述中有 % 的产品
select * from products where prod_desc like '%\%%';

Notice:

  1. Don't overuse wildcards. If other operators can achieve the same purpose, other operators should be used.
  2. When you really need to use wildcards, don't use them at the beginning of the search pattern unless absolutely necessary. Put the wildcard at the beginning of the search pattern, the search is the slowest.

5.8 Regular expression regexp

Regarding regular expressions, you can take a brief look at the "Regular Expressions Must Know and Know" .

Use the regexp keyword to indicate the regular expression that matches the following:

# 找到产品名以 1000 或 anvil 结尾的产品
select * from products where prod_name regexp '1000|anvil$';

Regular expressions are translated using \\ , for example, you want to find . this character instead of the regular . wildcard, use \\. , in order to transfer \ this character, use \\\

# 找到产品名以 . 字符开头的产品
select * from products where prod_name regexp '^\\.';

6. Calculated fields

Sometimes we need to retrieve the converted, calculated, or formatted data directly from the database instead of retrieving the data and then reformatting it in the client application or reporting program. At this time, we need to calculate the field.

6.1 Alias as

The queried virtual table can be given an alias for easy understanding and good readability. In addition, if the query field has the same name, you can use the alias as to distinguish it.

# 使用 as 关键字
select cust_name as name from customers;

# as 关键字也可以直接省略
select cust_name name from customers;

# 可以给不同字段分别起别名
select cust_name name, cust_city location from customers;

6.2 Splicing concat

If you want to connect multiple fields into one field, you can use the splicing field function concat :

# 将供应商的名字和地点拼接好后返回,并命名为 vend
select concat(vend_name, '(', vend_country, ')') vend from vendors;

Note that if any data in the middle is null , the result of splicing is also null .

So for some null , you need to use the ifnull function to judge. The first parameter is the field to be judged, and the second parameter is the result you want to return null

# 将顾客信息拼接起来
select concat(cust_name, '(', ifnull(cust_email, '-'), ')') customerInfo from customers;

If there are spaces before and after the data in the table, you can use the rtrim() function to remove the spaces on the right, ltrim() remove the spaces on the left, or trim() remove the spaces before and after:

# 将顾客信息处理后拼接起来
select concat(rtrim(vend_name), '(', trim(vend_country), ')') vend from vendors;

6.3 Arithmetic calculation+-*/

Basic arithmetic operators are also supported select

# 计算订单每种总额,并按照总金额降序排列
select prod_id as id, quantity, quantity*item_price as totalPrice 
from orderitems order by totalPrice desc;

The basic operators of addition, subtraction, multiplication and division are all supported + , - , * , / .

7. Data processing functions

trim() function that removes the first space of the data introduced earlier is the data processing function. In addition to this, there are many other types of data processing functions:

  1. Text functions used to process text strings, such as deleting or filling values, and converting values to uppercase or lowercase.
  2. Numerical functions used to perform arithmetic operations on numeric data, such as returning absolute values and performing algebraic operations.
  3. Date and time functions used to process date and time values and extract specific components from these values, for example, return the difference between two dates, check date validity, etc.
  4. System functions, such as returning user login information, checking version details.

If you don’t know how to use a function, you can use the help command. For example, help substr can get the usage and examples of substr

7.1 Character functions

functionillustrate
left()right()Returns the characters on the left and right of the string
length()Return string length
lower()upper()Returns the lowercase and uppercase of the string
rtrim()ltrim()trim()Remove the spaces on the right, left, and both sides
locate()Find a substring of a string
soundex()Returns the sundex value of the string
substring()Returns the character of the substring
subset()Returns the character of the substring ( substring use of 06136d5c30b389)
instr()Returns the index of the first occurrence of the substring, without returning 0
replace()String replacement
lpad()rpad()Left fill, right fill

Example:

# upper、lower 将姓变大写,名变小写,然后拼接
select concat(upper(last_name), lower(first_name)) 姓名 from employees;

# 姓名中首字符大写,其他字符小写然后用_拼接,显示出来
select concat(upper(substr(last_name, 1, 1)), '_', lower(substr(last_name, 2))) from employees;

# substr 截取字符串,sql 中索引从 1 开始而不是0
select substr('hello world', 3);      # llo world
select substr('hello world', 2, 3);   # ell

# instr 返回子串第一次出现的索引,没找到返回 0
select instr('abcdefg', 'cd');        # 3

# trim 减去字符串首尾的空格或指定字符
select trim('   hello    ');              # hello
select trim('aa' from 'aaabaabaaaaaa');   # abaab

# lpad 用指定的字符实现左填充指定长度
select lpad('he', 5, '-');   # ---he

# rpad 用指定的字符实现左填充指定长度
select rpad('he', 5, '-*');   # he-*-

# replace 替换
select replace('abcabcabc', 'bc', '--');   # a--a--a--

7.2 Mathematical functions

functionillustrate
round()rounding
ceil()Rounded up
floor()Round down
truncate()Keep a few decimal places
mod()Take the remainder
abs()Return absolute value
rand()Returns a random number

Example:

# round 四舍五入,第二个参数是小数点后保留的位数
select round(-1.55);      # -2
select round(1.446, 2);   # 1.45

# ceil 向上取整
select ceil(1.001);   # 2
select ceil(1.000);   # 1
select ceil(-1.001);  # -1

# floor 向下取整
select floor(1.001);   # 1
select floor(1.000);   # 1
select floor(-1.001);  # -2

# truncate 小数点后截断几位
select truncate(1.297, 1);  # 1.2
select truncate(1.297, 2);  # 1.29

# mod 取余和%同理,符号与被除数一样
select mod(10, -3);  # 1
select mod(-10, -3); # -1
select mod(-10, 3);  # -1
select 10%3;         # 1

7.3 Date functions

functionillustrate
now()Returns the current system date and time
curate()current_dateReturns the current system date, excluding time
curtime()current_timeReturns the current time, excluding the date
year()month()day()hour()minute()second()Get the specified part of the time, year, month, day, hour, minute, second
str_todate()Convert the characters in the date format to the date in the specified format
date_format()Convert date to specified format character

Example:

# now 返回当前系统日期和时间
select now();    # 2020-07-08 12:29:56

# curdate,current_date 返回当前系统日期,不包括时间
select curdate();  # 2020-07-08

# curtime,current_time 返回当前时间,不包括日期
select curtime();  # 12:29:56

# year... 获取时间指定部分,年、月、日、小时、分钟、秒
select year(now());      # 2020
select month(now());     # 7
select monthname(now()); # July
select day(now());       # 8
select dayname(now());   # Wednesday
select hour(now());      # 12
select minute(now());    # 29
select second(now());    # 56
select month(order_date) from orders;

# str_to_date 将日期格式的字符转换成指定格式的日期
select str_to_date('1-9-2021', '%d-%c-%Y');    # 2020-09-01
select * from orders where order_date = str_to_date('2005-09-01', '%Y-%m-%d');

# date_format 将日期转换成指定格式的字符
select date_format(now(), '%Y年%m月%d日');   # 2020年09月01日
select order_num orderId,date_format(order_date, '%Y/%m') orderMonth from orders;

Date format:

Format specifierFunction
%YFour years
%yTwo years
%mMonth(01,02,...12)
%cMonth (1,2,...12)
%dDay (01, 02,...)
%eDay (1,2,...)
%HHour (24 hour clock)
%hHour (12-hour clock)
%iMinutes (00,01,...59)
%sSeconds (00,01,...59)

7.4 Aggregate functions

Aggregate function (aggregate function) A function that runs on a row group to calculate and return a single value.

functionillustrate
avg()Returns the average value of a column
count()Returns the number of rows in a column
max()min()Return the maximum and minimum values of a column (null values are ignored)
sum()Returns the sum of a column (ignoring null values)

Example:

# 计算产品价格平均值
select avg(prod_price) as avgPrice from products;

# 计算供应商id为 1003 提供的产品的平均价格
select avg(prod_price) as avgPrice from products where vend_id = 1003;

# 计算价格最大的产品价格
select max(prod_price) as maxPrice from products;

# 计算顾客总数
select count(*) from customers;

# 计算具有 email 的顾客数
select count(cust_email) from cutomers;

# 计算产品价格总和
select sum(prod_price) from products;

# 计算订单为 20005 的订单总额 
select sum(item_price * quantity) totalPrice from orderitems where order_num = 20005;

# 计算产品具有的不同的价格的平均数
select avg(distinct prod_price) avgPrice from products where vend_id = 1003;

# 同时计算产品总数、价格最小值、最大值、平均数
select count(*) prodNums, min(prod_price) priceMin, max(prod_price) priceMax, avg(prod_price) priceAvg from products;

8. Packet data

The previous aggregate functions are where clause, such as querying the average price of a supplier’s products, but if you want to return the average price of the products provided by each supplier, what should you do? Deal with it. This should use grouping, which allows data to be divided into multiple logical groups, so that each group can be aggregated and calculated.

8.1 Create group by

Using the group by clause can instruct MySQL to sort and group the data by a certain data, and then aggregate each group instead of the entire result set.

# 分别查询每个供应商提供的产品种类数
select vend_id, count(*) num_prods from products group by vend_id;

# 查询每个供应商提供的产品的平均价格
select vend_id, avg(prod_price) avg_price from products group by vend_id;

Notice:

  1. group by clause can contain any number of columns. This makes it possible to nest groups and provide more detailed control over data grouping.
  2. group by is nested in the 06136d5c30ce9b clause, the data will be summarized on the last specified grouping. In other words, when creating a group, all the specified columns are calculated together (so data cannot be retrieved from individual columns).
  3. group by clause must be a retrieval column or a valid expression (but not an aggregate function). If you use an expression select , you must specify the same expression group by Cannot use aliases.
  4. Except for aggregate calculation statements, each column in the select group by clause.
  5. If the grouping column has a null value, then null will be returned as a grouping. If there are multiple rows of null values in the column, they will be grouped together.
  6. group by clause must appear in where after clause, order by before clause.

8.2 Filtering group having

In addition to using group by group data, MySQL also allows the use of the having keyword to filter the groups, specifying which groups to include and which groups to exclude.

The grammatical sequence is as follows:

# 语法顺序
select [查询列表] from [表名] where [筛选条件] group by [分组列表] having [分组后筛选] order by [排序列表] limit [要检索行数];

where filtering does not have the concept of grouping. It specifies rows instead of groups. For grouping filtering, use the having clause. In fact, all types of where clauses learned so far can be replaced with having .

Regarding having and where , here is another way to understand, where before data grouping, and having filters after data grouping. where excluded by 06136d5c30d0a2 are not included in the grouping, which may change the calculated value, thereby affecting the groupings filtered out based on these values in the having

Can use pre-group screening where , priority is given to pre-group screening.

# 找到提供大于 2 个产品的供应商,并列出其提供的产品数量,这里使用 having 来过滤掉产品数不大于2的供应商
select vend_id, count(*) prodCount from products group by vend_id having prodCount > 2;

# 找到供应商提供的商品平均价格大于 10 的供应商,并且按平均价格降序排列
select vend_id, avg(prod_price) avgPrice from products group by vend_id having avgPrice > 10 order by avgPrice desc;

9. Subqueries

subquery (subquery), a query nested in other queries.

9.1 Filtering using subqueries

When another complete select statement is nested in select statement is called a subquery or inner query, and the outer select statement is called a main query or an outer query.

All the previous queries are in the same table. If the information we want to get is scattered in two or more tables, for example, we want to get the customer ID from the orders table, and then use the customer ID to go to the customer table custormers to find the corresponding customer information.

# 首先在 orderitems 表中找到产品 TNT2 对应的订单编号
select order_num from orderitems where prod_id = 'TNT2'

# 然后在 orders 表中找到订单编号对应的顾客 id
select cust_id from orders where order_num in (
  select order_num from orderitems where prod_id = 'TNT2');
  
# 然后去 customers 表中找到顾客 id 对应的顾客名字
select cust_id, cust_name from customers where cust_id in ( 
  select cust_id from orders where order_num in (  
    select order_num from orderitems where prod_id = 'TNT2'));

There are actually three statements here. The innermost subquery returns a list of order numbers, which is used in the where clause of the outer subquery. The outer subquery returns a list of customer IDs, which are used in the where clause of the outermost query. The outermost query finally returns the required data.

There is no limit to the number of subqueries that can be nested, but due to performance limitations in actual use, too many subqueries cannot be nested.

9.2 Related subqueries

related subquery (correlated subquery) A subquery involving an external query.

Another way to use subqueries is to create calculated fields. Suppose you need to display the total number of orders for each customer in the customers table. The order and the corresponding customer ID are stored in the orders table.

# 首先找到用户 ID 对应的订单数量
select count(*) from orders where cust_id = 10003;

# 然后将其作为一个 select 子句,将用户 id 
select cust_name, cust_state, (
  select count(*) from orders where orders.cust_id = customers.cust_id) as order_count 
  from customers order by order_count desc;

Note the above where orders.cust_id = customers.cust_id . This type of subquery is called a correlated subquery. At any time, as long as the column name may be ambiguous, you must use the fully qualified syntax (the table name and column name are separated by a period).

10. Join table

If the data to be checked is scattered in multiple tables, how to use a single select statement to find the data? It can be done by using joins.

Connection is a mechanism for select statement, so it is called a connection. Using a special syntax, you can join multiple tables to return a set of output, and join the correct rows in the associated table at runtime.

Maintain referential integrity : When using relational tables, it is very important to insert only valid data in relational columns. If you insert into the products table products produced by suppliers that have a supplier ID that does not appear in the vendors table, these products are inaccessible because they are not associated with a certain supplier.

To prevent this from happening, you can instruct MySQL to only allow legal values in the vendor ID column of the products table (that is, the vendors that appear in the vendors table). This is to maintain referential integrity, which is achieved by specifying the primary key and foreign key in the definition of the table.

10.1 Create a connection

The creation of the join is very simple, specify all the tables to be joined and how they are related.

# 列出产品的供应商及其价格
select vend_name, prod_name, prod_price from vendors, products 
where vendors.vend_id = products.vend_id order by prod_price desc;

Here, after where , the fully qualified column name is used to specify that MySQL matches the vender_id column of the vendor table and the vender_id field of the products table.

When the quoted column may be ambiguous, you must use the fully qualified column name, because MySQL does not know which column you are referring to.

When joining two tables, what is actually done is to pair each row of one table with each row of the other table, so the where clause is used as the filter condition to filter out only the column where vendors.vend_id = products.vend_id that contains the specified join condition, and there is no where clause. , Will return the length product of the two tables. This is called Cartesian product (cartesian product). You can run this sentence to see:

# 返回两个表长度乘积行
select vend_name, prod_name, prod_price from vendors, products;

All connections should always use the connection condition, otherwise a Cartesian product will be obtained.

10.2 Join multiple tables

A select statement can also join multiple tables. For example, you need to list the product information, order information, and supplier information of an order, and the product information you are looking for is scattered in the three tables of supplier, product, and order information.

# 将订单 20005 的产品信息、订单信息、供应商信息找出来
select prod_name, vend_name, prod_price, quantity
from orderitems,
     products,
     vendors
where products.vend_id = vendors.vend_id
  and orderitems.prod_id = products.prod_id
  and order_num = 20005;

Here, and used to connect multiple join conditions, which defines what is used as an association between the three tables.

Note: MySQL can be very resource intensive to associate multiple tables to process joins at runtime. Do not join unnecessary tables. The more tables that are connected, the more severe the performance degradation.

Here you can use joins to implement the example in Section 9.1. Previously, it was implemented using a subquery to obtain the customer ID from the orders table, and then use the customer ID to go to the customer table custormers to find the corresponding customer information.

# 使用联结来实现 9.1 的例子
select customers.cust_id, cust_name
from orders, customers, orderitems
where orders.order_num = orderitems.order_num
  and customers.cust_id = orders.cust_id
  and prod_id = 'TNT2';                                # 由于三个表中只有一个表有 prod_id,所以不需要限定表名

Here to mention, not only columns can be aliased, but tables can also be aliased. The usage is the same as that of column aliases:

# 把前面这个句子起别名
select c.cust_id, cust_name
from orders o, customers c, orderitems oi
where o.order_num = oi.order_num
  and c.cust_id = o.cust_id
  and prod_id = 'TNT2';

This not only shortens the SQL statement, but also allows the select statement. The aliases created at the same time can be used not only in the select clause, but also in the where , order by clauses and other parts of the statement.

10.3 inner join

The join used so far is called equivalent join (equijoin), which is based on the equality test between two tables, also known as internal join . In fact, a slightly different syntax can be used for this type of connection to clearly specify the type of connection. The following select statement returns exactly the same data as the previous example:

# 列出产品的供应商及其价格
select vend_name, prod_name, prod_price
from vendors
         inner join products
                    on vendors.vend_id = products.vend_id;

The join condition here uses the on clause instead of where . Both of these grammars can achieve results. Although the use of the where clause to define the connection is indeed relatively simple, the use of a clear connection grammar can ensure that the connection conditions are not forgotten, and sometimes this can also affect performance.

10.4 Self-connection

For example, a product has a quality problem, and now I want to find out all the product information provided by the supplier of this product. According to the subquery described earlier, we can first find the supplier of the corresponding product, and then find the list of products with this supplier ID:

# 先找到产品 ID 为 TNT1 的供应商 ID,然后找到对应供应商 ID 提供的产品列表
select prod_id, prod_name, vend_id
from products
where vend_id in (
    select vend_id
    from products
    where prod_id = 'TNT1'
);

It can be achieved using subqueries, and it can be done using joins. This is self-joining:

# 自联结
select p1.prod_id, p1.prod_name, p1.vend_id
from products p1,
     products p2
where p1.vend_id = p2.vend_id
  and p2.prod_id = 'TNT1';

The two tables of the self-join query are the same table, so the products table needs to be aliased separately for distinction, and select clause also need to be qualified, because the same fields appear in both tables.

Self-join is usually used as an external statement to replace the subquery statement used when retrieving data from the same table. Although the final result is the same, sometimes processing joins is much faster than processing subqueries. You should try two methods to determine which one has better performance.

10.5 Natural Connection

Whenever a table is joined, there should be at least one listed in more than one table (columns being joined). The standard join returns all data, even the same column appears multiple times. Natural joins exclude multiple occurrences, so that each column is returned only once.

Natural connection is that you select only those unique columns. This is generally done by using wildcards for the table and explicit subsets for all other table columns.

# 自选择唯一的通配符只对第一个表使用。所有其他列明确列出,所以没有重复的列被检索出来。
select v.*, p.prod_id
from vendors v,
     products p
where v.vend_id = p.vend_id;

10.6 outer join

In some cases, the join contains rows that have no associated rows in the related table. This type of join is called external join .

for example:

  • Count how many orders each customer has placed, including those customers who have not yet placed an order;
  • List all products and the quantity ordered, including products that no one ordered;
  • Calculate the average sales volume, including those customers who have not yet placed an order.

At this time, the link needs to contain those rows that have no associated rows.

For example, to retrieve all users and their orders, list those without orders:

# 内部联结,查找用户对应的订单
select c.cust_id, o.order_num
from customers c
         inner join orders o
                    on c.cust_id = o.cust_id;
                    
# 左外部联结,将没有下单过的顾客行也列出来
select c.cust_id, o.order_num
from customers c
         left outer join orders o
                         on c.cust_id = o.cust_id;
                         
# 右外部联结,列出所有订单及其顾客,这样没下单过的顾客就不会被列举出来
select c.cust_id, o.order_num
from customers c
         right outer join orders o
                          on c.cust_id = o.cust_id;

When using the outer join syntax, you must use the right or left keyword to specify the table that includes all its rows. right refers to the table on the right of outer join left refers to the table on the left of outer join The above uses left outer join to select all rows from the left table custermers of the from In order to select all rows from the table on the right, right outer join should be used.

The left outer join can be from or where . Which one is more convenient for you.

10.7 Using joins with aggregate functions

For example, if you want to retrieve the number of orders placed by a customer, write 0 even if there is no one. At this time, use grouping and the count to count the number:

# 找到每个顾客所下订单的数量,并降序排列
select c.cust_id, c.cust_name, count(o.order_num) count_orders
from customers c
         left outer join orders o on c.cust_id = o.cust_id
group by c.cust_id
order by count_orders desc;

Because even if the customer does not place an order, it will still be in the result, so put the customer table on the left and use the left outer link.

11. Combination query

MySQL allows multiple select statements to be executed and the results are returned as a single query result set. These combined queries are usually called unions or compound queries.

There are two situations where you need to use combined queries:

  1. Return similar structured data from different tables in a single query;
  2. Perform multiple queries on a single table, and return data in a single query.

In most cases, the combined query can be replaced by a single query where You can try these two methods for specific scenarios to see which one performs better for a particular query.

11.1 Create a combined query union

When the query results come from multiple tables, but there is no correlation between the multiple tables, a combined query is often used at this time. Put the keyword union between each select statement.

# 比如需要列出商品价格小于等于 10 而且是供应商 ID 为 1005 或 1003 的产品信息
select prod_id, prod_name, prod_price, vend_id from products where prod_price <= 10
union
select prod_id, prod_name, prod_price, vend_id from products where vend_id in (1005, 1003);

# 实际上这句也可以通过 where 语句代替
select prod_id, prod_name, prod_price from products where prod_price <= 10 or vend_id in (1005, 1003);
  1. In some cases, such as more complex filter conditions and the need to retrieve data from multiple tables, union may be simpler to use 06136d5c30dcb4.
  2. union Each query must contain the same columns, expressions, and aggregate functions, but each column does not need to be listed in the same order.
  3. The column data types must be compatible, the types need not be exactly the same, but must be implicitly converted by the database management system.
  4. The sorting of the combined query order by can only appear after the last select statement, and the different select statements cannot be sorted separately.

11.2 Include or cancel duplicate rows union (all)

Two lines of union separate statements may return duplicate rows, but the actual result of the previous example does not contain duplicate rows. This is because the union keyword automatically removes the duplicate rows. If you don’t want to remove duplicates, you can use the union all keyword. .

# 不去重重复行
select prod_id, prod_name, prod_price, vend_id from products where prod_price <= 10
union all
select prod_id, prod_name, prod_price, vend_id from products where vend_id in (1005, 1003);

If duplicate rows are required, the where keyword cannot be used to achieve the same effect at this time.

12. Addition, deletion and modification of data

What I said earlier is all about querying data. This chapter will add, delete, and modify data.

12.1 Insert into

Data insertion uses the insert keyword, which can insert one row, multiple rows of data, or the results of certain queries.

# 插入一条数据到顾客表中
insert into customers
values (null, 'Zhang San', '001 street', 'ShangHai', 'SH', '666666', 'ZH', null, null);

Here insert a piece of data into the customer table, the data stored in each table column needs to values , given in the order in which the table was created. If a column has no value, give null. Although the attribute of the first data corresponding to the cust_id column is not null , this column is auto_increment, which is self-increasing. MySQL will automatically ignore the null you give and automatically increase the value and then fill it.

However, it is not safe to use the above values clause, because this method of injecting data depends entirely on the order of the input data. If the table structure changes, it will cause the input data to be misplaced.

The safe way to insert data is like this:

# 安全但繁琐的插入方式
insert into customers(cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
values ('Zhang San', '001 street', 'ShangHai', 'SH', '666666', 'ZH', null, null);

Here, the column names corresponding to the data in the following parentheses are given in the previous parenthesis, so that even if the table structure or order changes, the data can be inserted correctly.

You can see that the column cust_id is omitted. When the following conditions are met, the column can be omitted:

  1. The column is defined to allow null values;
  2. The default value is given in this column when the table is defined, which means that if no value is given, the default value will be used.

If it cannot be omitted but it is omitted, an error will be reported.

insert operation can be time-consuming, especially when there are many indexes that need to be updated, and it can reduce the performance of the select statement waiting to be processed. If data retrieval is the most important, you can by insert and into add keywords between low_priority , reduce insert priority statement, which also applies to the below mentioned update and delete statements.

12.2 Insert multiple rows

insert statement introduced above can insert one row at a time. If you want to insert multiple rows at a time, listing the column names every time is more cumbersome. You can use the following method:

# 插入多个行
insert into customers(cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
values ('Zhang San', '001 street', 'ShangHai', 'SH', '666666', 'ZH', null, null),
       ('Li Si', '002 street', 'BeiJing', 'BJ', '878787', 'ZH', null, '123123@163.com');

values clause, continue to enclose the field in parentheses to add a new line, with a comma in between. This can improve the performance of database processing, because a single insert statement can handle multiple inserts faster than using multiple insert statements.

12.3 Insert the retrieved data insert select

insert can select statement into the table, which is insert select . For example, you want to insert data queried from another table into this table:

# 从别的表中找出数据,并插入 customers 表中
insert into customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
select id, name, address, city, state, zip, country, contact, email
from custnew;

select can also omit the cust_id, so that a new cust_id can be automatically generated for insert In addition, you can see that the column name of the select statement does not correspond to the column name after the insert into insert select uses position correspondence. The first column returned by the select statement corresponds to cust_id, and the second column corresponds to cust_name, which corresponds in turn . select statement, you can also add the where clause to filter.

12.4 Modify data update

update statement is used to modify the data in the table. update . Don't forget to add the where clause, because if you are not careful, all rows in the table will be updated.

# 更新 id 为 10005 的用户的信息
update customers set cust_email = '888@qq.com' where cust_id = 10005;

where clause is not used here update will update the cust_email field of all rows in this table, so be careful.

To delete the value of a certain row and column, you can modify the value to null.

The way to update multiple fields is also very simple:

# 更新多个字段
update customers
set cust_email   = '666@qq.com',
    cust_contact = 'S Zhang'
where cust_id = 10005;

If you update statement, and an error occurs when updating one or more of these rows, the entire update operation is cancelled (all rows updated before the error occurred are restored to their original values). Even for the error occurs, continue to be updated, you can update use after ignore keyword.

update statement can use subqueries to update the column data with the data retrieved select

12.5 Delete data delete

delete statement can be used to delete specific rows or all rows from the table. Be careful when using the delete statement. Don't forget to add the where clause, because you will delete all rows in the table if you are not careful.

# 删除顾客表中顾客 id 为 10008 的行
delete from customers where cust_id = 10008;

If the where clause is removed, then all rows in this table are deleted, but this table is not deleted, and another statement drop is used to delete the table. In addition, the faster statement to delete all rows in a table is truncate table , because delete deletes data row by row, while truncate deletes the original table and rebuilds a table.

Note that update and delete , because MySQL has not been revoked. It is recommended to use select for testing before using it to prevent data loss due to incorrect clauses of where

13. Create and manipulate tables

13.1 create table

create.sql file that was executed for the whole point of data. It can be opened with VSCode/Webstorm/Navivate/ text. The first line of this file except for the comment is the table creation statement:

CREATE TABLE customers
(
  cust_id      int       NOT NULL AUTO_INCREMENT,
  cust_name    char(50)  NOT NULL ,
  cust_address char(50)  NULL ,
  cust_city    char(50)  NULL ,
  cust_state   char(5)   NULL ,
  cust_zip     char(10)  NULL ,
  cust_country char(50)  NULL DEFAULT 'ZH',  # 指定默认值
  cust_contact char(50)  NULL ,
  cust_email   char(255) NULL ,
  PRIMARY KEY (cust_id)          # 指定主键
) ENGINE=InnoDB;

From here you can see the format of create table

If you want to create a table does not exist at the time, it should be in front of the table name, create table after adding if not exists . This will first check whether the table name already exists, and create it if it does not exist.

For auto_increment , each table can only have one auto_increment , and it must be indexed. When y


SHERlocked93
6.4k 声望4.9k 粉丝