Issue 34: MySQL table redundancy design

introduction:

In the previous article, I introduced the standardized table design of MySQL paradigm. Paradigm design has the following advantages:
How to eliminate data redundancy to the extreme, thereby reducing the additional occupation of the disk by the relational table.
The relationship between the various tables is very clear and very readable.

text:

But the paradigm design also has disadvantages:

The table paradigm is standardized. The higher the level, the greater the number of tables. For example, 2NF may have more tables than 1NF, 3NF may have more tables than 2NF, and so on.
The more tables there are, the more tables may need to be associated when querying. We know that the cost of retrieving multiple tables is much larger than the cost of retrieving a single table.

To sum up, we need to combine the advantages of paradigm design and find ways to solve the shortcomings of paradigm design. The idea brought by this is to allow a certain degree of redundancy in data and use space for time. For example, the current microservice design, NOSQL database, etc. do not consider the paradigm standard theory at all.

This kind of thinking is the focus of today, and it is called anti-paradigm for short.

Anti-paradigm is to reduce the original high-level paradigm design to a low-level paradigm design through a certain amount of redundancy to reduce the shortcomings of the increase in the number of tables brought by the paradigm design. For example, a table that satisfies BCNF can be reduced to 3NF, or even to 2NF, to 1NF through redundancy of certain fields. Some scenarios do not even need to meet 1NF for query performance. For example, in table t1, originally there are 100 fields, of which 5 are commonly used, and the remaining 95 are not commonly used. Then you can integrate these 95 fields into a large object field, such as a JSON type field.

Next, we use a simple example to see how the anti-paradigm can streamline query statements and improve efficiency.

The following 5 relationship tables represent the employee table, department table, salary table, employee-department relationship table, employee-salary relationship table.

Employee table:

(debian-ytt1:3500)|(ytt)>desc employee;
+-----------------+------------------+------+-----+---------+-------+
| Field           | Type             | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+-------+
| employee_number | varchar(64)      | NO   | PRI | NULL    |       |
| employee_name   | varchar(64)      | YES  |     | NULL    |       |
| gender          | char(1)          | YES  |     | NULL    |       |
| age             | tinyint unsigned | YES  |     | NULL    |       |
| register_date   | date             | YES  |     | NULL    |       |
+-----------------+------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)

Department table:


(debian-ytt1:3500)|(ytt)>desc dept;
+------------+------------------+------+-----+---------+-------+
| Field      | Type             | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+-------+
| dept_id    | tinyint unsigned | NO   | PRI | NULL    |       |
| dept_name  | varchar(64)      | YES  |     | NULL    |       |
| found_date | datetime         | YES  |     | NULL    |       |
+------------+------------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

Salary table:


(debian-ytt1:3500)|(ytt)>desc salary;
+--------------+------------------+------+-----+---------+-------+
| Field        | Type             | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+-------+
| salary_level | tinyint unsigned | NO   | PRI | NULL    |       |
| base_salary  | decimal(10,2)    | YES  |     | NULL    |       |
| extra_salary | decimal(10,2)    | YES  |     | NULL    |       |
+--------------+------------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

Employee and salary relationship table:

(debian-ytt1:3500)|(ytt)>desc employee_vs_salary;
+-----------------+------------------+------+-----+---------+-------+
| Field           | Type             | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+-------+
| employee_number | varchar(64)      | NO   | PRI | NULL    |       |
| salary_level    | tinyint unsigned | NO   | PRI | NULL    |       |
+-----------------+------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Employee and department relationship table:


(debian-ytt1:3500)|(ytt)>desc employee_vs_dept;
+-----------------+------------------+------+-----+---------+-------+
| Field           | Type             | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+-------+
| employee_number | varchar(64)      | NO   | PRI | NULL    |       |
| dept_id         | tinyint unsigned | NO   | PRI | NULL    |       |
| is_manager      | char(1)          | YES  | MUL | NULL    |       |
+-----------------+------------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

Now there are the following query requirements:

Query the name of the department manager of each department and the corresponding department name.
Query the name of the first employee in each department.
List the name of the employee with the highest salary in the "Service Department".
Count the number of men and women in each department.

Next, let's implement the above-mentioned query requirements separately.

The first requirement requires the association of the three tables employee, dept, and employee_vs_dept to get the result.

(debian-ytt1:3500)|(ytt)>select
    ->   a.employee_name as '管理者名字',
    ->   b.dept_name as '所属部门'
    -> from
    ->   employee as a,
    ->   dept as b,
    ->   employee_vs_dept as c
    -> where
    ->   a.employee_number = c.employee_number
    ->   and b.dept_id = c.dept_id
    ->   and c.is_manager = '是';
+-----------------+-----------------+
| 管理者名字      | 所属部门        |
+-----------------+-----------------+
| 小张            | 研发部          |
| 小徐            | 服务部          |
| 小李            | 材料部          |
| 小倩            | 财务部          |
| 小娜            | 人力资源部      |
| 小聂            | 市场部          |
| 小婷            | 公关部          |
| 小北            | 销售部          |
| 小婉            | 行政部          |
+-----------------+-----------------+
9 rows in set (0.00 sec)

For the second requirement, assuming that no employee has resigned so far, then three tables of employee, dept, and employee_vs_dept are also required to associate to get the result. It can be seen that the first employee in each department has been promoted to the head of the department.

(debian-ytt1:3500)|(ytt)>select
    ->   a.employee_name as '员工姓名',
    ->   b.dept_name as '所属部门'
    -> from
    ->   employee as a,
    ->   dept as b,
    ->   (
    ->     select
    ->       min(employee_number) as employee_number,
    ->       dept_id
    ->     from
    ->       employee_vs_dept
    ->     group by
    ->       dept_id
    ->   ) c
    -> where
    ->   a.employee_number = c.employee_number
    ->   and b.dept_id = c.dept_id;
+---------------+-----------------+
| 员工姓名       | 所属部门       |
+---------------+-----------------+
| 小娜          | 人力资源部      |
| 小聂          | 市场部          |
| 小北          | 销售部          |
| 小婷          | 公关部          |
| 小徐          | 服务部          |
| 小婉          | 行政部          |
| 小李          | 材料部          |
| 小倩          | 财务部          |
| 小张          | 研发部          |
+---------------+-----------------+
9 rows in set (0.02 sec)

For the third requirement, the five tables of employee, dept, employee_vs_dept, salary, and employee_vs_salary need to be jointly queried to obtain the result.

(debian-ytt1:3500)|(ytt)>select
    ->   a.employee_name as '员工姓名',
    ->   b.dept_name as '所属部门'
    -> from
    ->   employee as a,
    ->   dept as b,
    ->   employee_vs_dept as c,
    ->   (
    ->     select
    ->       b.employee_number
    ->     from
    ->       (
    ->         select
    ->           max(salary_level) as salary_level
    ->         from
    ->           salary
    ->       ) as a,
    ->       employee_vs_salary as b
    ->     where
    ->       a.salary_level = b.salary_level
    ->   ) as d
    -> where
    ->   a.employee_number = c.employee_number
    ->   and b.dept_id = c.dept_id
    ->   and d.employee_number = a.employee_number
    ->   and b.dept_name = '服务部';
+---------------+-----------+
| 员工姓名       | 所属部门 |
+---------------+-----------+
| 小郑4826      | 服务部    |
...
| 小王2381      | 服务部    |
+---------------+-----------+
93 rows in set (0.01 sec)

For the fourth requirement, you need to combine the query tables dept, employee, and employee_vs_dept to get the result

(debian-ytt1:3500)|(ytt)>select
    ->   a.dept_name as '部门名称',
    ->   sum(case b.gender when '男' then 1 end) as '男',
    ->   sum(case b.gender when '女' then 1 end) as '女'
    -> from
    ->   dept as a,
    ->   employee as b,
    ->   employee_vs_dept as c
    -> where
    ->   a.dept_id = c.dept_id
    ->   and b.employee_number = c.employee_number
    -> group by
    ->   a.dept_id;
+-----------------+------+------+
| 部门名称        | 男   | 女   |
+-----------------+------+------+
| 人力资源部      |  982 | 1179 |
| 市场部          | 1036 | 1163 |
| 销售部          | 1078 | 1200 |
| 公关部          | 1031 | 1190 |
| 服务部          | 1043 | 1199 |
| 行政部          | 1017 | 1224 |
| 材料部          |  956 | 1180 |
| 财务部          | 1076 | 1219 |
| 研发部          | 1037 | 1190 |
+-----------------+------+------+
9 rows in set (0.05 sec)

For the above four requirements, if you want to find out the results, you must perform a joint query on multiple tables.

The overhead of joint query is very large. In order to eliminate unnecessary joint query, the table cannot be designed according to the paradigm concept at this time. A certain anti-paradigm idea is required. For each requirement, the necessary redundant columns can be added to simplify the query. .

Next, let's look at how to add the necessary redundant columns for each table to improve query performance for the above several requirements.

For the first type, you can add the department name and whether it is the person in charge of the department to the employee table.

(debian-ytt1:3500)|(ytt)>create table employee2 like employee;
Query OK, 0 rows affected (0.07 sec)

(debian-ytt1:3500)|(ytt)>insert into employee2 select * from employee;
Query OK, 20000 rows affected (1.69 sec)
Records: 20000  Duplicates: 0  Warnings: 0

(debian-ytt1:3500)|(ytt)>alter table employee2 add dept_name varchar(64), add is_manager char(1);
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

(debian-ytt1:3500)|(ytt)>UPDATE employee2 AS a,
    ->          employee_vs_dept AS b,
    ->          dept AS c SET a.dept_name = c.dept_name,
    ->          a.is_manager = b.is_manager
    -> WHERE a.employee_number = b.employee_number
    ->         AND b.dept_id = c.dept_id;

Query OK, 20000 rows affected (4.40 sec)
Rows matched: 20000  Changed: 20000  Warnings: 0

(debian-ytt1:3500)|(ytt)>alter table employee2 add key idx_is_manager(is_manager);
Query OK, 0 rows affected (0.34 sec)
Records: 0  Duplicates: 0  Warnings: 0

After adding redundant columns, you only need to query the employee table, and there is no need to associate more tables.


(debian-ytt1:3500)|(ytt)>select employee_name '管理者名字', dept_name '所属部门' from employee2 where is_manager = '是';
+-----------------+-----------------+
| 管理者名字      | 所属部门        |
+-----------------+-----------------+
| 小张            | 研发部          |
| 小徐            | 服务部          |
| 小李            | 材料部          |
| 小倩            | 财务部          |
| 小娜            | 人力资源部      |
| 小聂            | 市场部          |
| 小婷            | 公关部          |
| 小北            | 销售部          |
| 小婉            | 行政部          |
+-----------------+-----------------+
9 rows in set (0.00 sec)

At this time, for the second requirement, you only need to query the employee table.

(debian-ytt1:3500)|(ytt)>SELECT employee_name '员工姓名',dept_name '所属部门'
    -> FROM
    ->     (SELECT employee_name,
    ->         dept_name,
    ->          row_number() over(partition by dept_name
    ->     ORDER BY  employee_number) AS rn
    ->     FROM employee2 ) t
    -> WHERE rn = 1;
+--------------+-----------------+
| 员工姓名     | 所属部门        |
+--------------+-----------------+
| 小娜         | 人力资源部      |
| 小婷         | 公关部          |
| 小聂         | 市场部          |
| 小徐         | 服务部          |
| 小李         | 材料部          |
| 小张         | 研发部          |
| 小婉         | 行政部          |
| 小倩         | 财务部          |
| 小北         | 销售部          |
+--------------+-----------------+
9 rows in set (0.08 sec)

For the third requirement, you only need to add a salary grade field to the employee table again.

(debian-ytt1:3500)|(ytt)>alter table employee2 add salary_level tinyint unsigned;
Query OK, 0 rows affected (0.09 sec)
Records: 0  Duplicates: 0  Warnings: 0

(debian-ytt1:3500)|(ytt)>UPDATE employee2 AS a,
    ->          employee_vs_salary AS b SET a.salary_level = b.salary_level
    -> WHERE a.employee_number = b.employee_number;
Query OK, 20000 rows affected (1.77 sec)
Rows matched: 20000  Changed: 20000  Warnings: 0

(debian-ytt1:3500)|(ytt)>alter table employee2 add key idx_salary_level (salary_level);
Query OK, 0 rows affected (0.34 sec)
Records: 0  Duplicates: 0  Warnings: 0

At this time, the query is simplified to:

(debian-ytt1:3500)|(ytt)>SELECT employee_name '员工姓名',dept_name '所属部门'
    -> FROM employee2
    -> WHERE salary_level =
    ->     (SELECT max(salary_level)
    ->     FROM salary)
    ->         AND dept_name ='服务部';
+--------------+--------------+
| 员工姓名     | 所属部门     |
+--------------+--------------+
| 小郑4826     | 服务部       |
...
+--------------+--------------+
93 rows in set (0.00 sec)

Also for the fourth requirement, you only need to query the employee table:

(debian-ytt1:3500)|(ytt)>select dept_name '部门名称', sum(if(gender='男',1,0)) '男',sum(if(gender='女',1,0
+-----------------+------+------+
| 部门名称        | 男   | 女   |
+-----------------+------+------+
| 人力资源部      |  982 | 1179 |
| 公关部          | 1031 | 1190 |
| 市场部          | 1036 | 1163 |
| 服务部          | 1043 | 1199 |
| 材料部          |  956 | 1180 |
| 研发部          | 1037 | 1190 |
| 行政部          | 1017 | 1224 |
| 财务部          | 1076 | 1219 |
| 销售部          | 1078 | 1200 |
+-----------------+------+------+
9 rows in set (0.05 sec)

Summarize:

Paradigm design standards are the foundation of relational databases. The anti-paradigm concept is not a standard, but a way to simplify query statements and improve query performance to customize the table structure. The purpose is to make query statements more concise and more efficient. .

What else do you want to know about the technical content of MySQL? Hurry up and leave a message to tell the editor!

Issue 34: MySQL table redundancy design

introduction:

text:

But the paradigm design also has disadvantages:

Next, we use a simple example to see how the anti-paradigm can streamline query statements and improve efficiency.

Now there are the following query requirements:

Next, let's implement the above-mentioned query requirements separately.

For the above four requirements, if you want to find out the results, you must perform a joint query on multiple tables.

Summarize:

爱可生开源社区

引用和评论

SQLShift 重大更新：Oracle→PostgreSQL 存储过程转换功能上线！

MySQL慢查询日志：性能优化的终极指南

MySQL 备份 Shell 脚本：支持远程同步与阿里云 OSS 备份

《SQL应用场景解析：如何通过SQL解决实际业务问题》

Devin 发布 DeepWiki，2 星的项目直接装出万星的气场

php+mysql 搭建一个在线游戏网站目前已有2000+游戏【代码解析一】

一键实现 Oracle 数据整库同步至 Apache Doris

Issue 34: MySQL table redundancy design

introduction:

text:

But the paradigm design also has disadvantages:

Next, we use a simple example to see how the anti-paradigm can streamline query statements and improve efficiency.

Now there are the following query requirements:

Next, let's implement the above-mentioned query requirements separately.

For the above four requirements, if you want to find out the results, you must perform a joint query on multiple tables.

Summarize:

爱可生开源社区

引用和评论

SQLShift 重大更新：Oracle→PostgreSQL 存储过程转换功能上线！​​

MySQL慢查询日志：性能优化的终极指南

MySQL 备份 Shell 脚本：支持远程同步与阿里云 OSS 备份

《SQL应用场景解析：如何通过SQL解决实际业务问题》

Devin 发布 DeepWiki，2 星的项目直接装出万星的气场

php+mysql 搭建一个在线游戏网站目前已有2000+游戏【代码解析 一】

一键实现 Oracle 数据整库同步至 Apache Doris

SQLShift 重大更新：Oracle→PostgreSQL 存储过程转换功能上线！

php+mysql 搭建一个在线游戏网站目前已有2000+游戏【代码解析一】