Certain rules must be followed when the relational database is designed. In particular, the database design paradigm is now briefly introduced 1NF (first normal form), 2NF (second normal form), 3NF (third normal form) and BCNF, and the fourth and fifth normal forms will be introduced later.
When you design a database, if you can meet these paradigms, you are a master of database design.
(1NF) : In each specific relationship r in the relational pattern R, if each attribute value is the smallest unit of data that cannot be divided, then R is said to be a relationship in the first normal form.
Example: If the employee number, name, and phone number form a table (a person may have an office phone number and a home phone number), there are three ways to standardize into 1NF: One is to repeatedly store the employee number and name . In this way, the keyword can only be a phone number. second is the employee number as the keyword , and the phone number is divided into two attributes: third is the employee number as the keyword , but it is mandatory that each record can only have one phone number. Of the above three methods, the first method is the most undesirable, and the latter two cases are selected according to actual conditions.
(2NF) : If all non-primary attributes in the relational pattern R (U, F) completely depend on any candidate key, then the relation R is said to belong to the second normal form. Example: Course selection relationship SCI (SNO, CNO, GRADE, CREDIT), where SNO is the student number, CNO is the course number, GRADE is the grade, and CREDIT is the credit. Based on the above conditions, the keywords are combined keywords (SNO, CNO). There are the following problems when using the above relational model in applications:
- a. Data redundancy , assuming that the same course is taken by 40 students, the credits will be repeated 40 times.
- b. Update exception . If the credits of a certain course are adjusted, the CREDIT value of the corresponding tuple must be updated, and the credits of the same course may be different.
- c. Insert exception . If a new course is planned to be opened, since no one takes an elective course, there is no student number key word, and the course and credits can only be deposited when someone elects.
- d. Delete the exception . If the student has completed the course, delete the elective record from the current database. Some courses have not yet been taken by freshmen, so the courses and credit records cannot be saved.
Reason : The non-keyword attribute CREDIT depends only on the CNO function, that is, CREDIT partially depends on the combined keywords (SNO, CNO) instead of completely. Solution: Divide into two relational modes SC1 (SNO, CNO, GRADE) and C2 (CNO, CREDIT). New relationships include two relationship patterns between them by SC1 foreign key in CNO linked, then the natural connection when necessary, restore the original relationship.
(3NF) : If all non-primary attributes in the relational pattern R(U, F) do not have any transfer trust to any candidate keywords, then the relation R is said to belong to the third normal form. Example: For example, S1 (SNO, SNAME, DNO, DNAME, LOCATION) each attribute represents student ID, name, department, department name, department address. The keyword SNO determines each attribute. Since it is a single keyword, there is no problem of partial dependence. It must be 2NF. However, there must be a lot of redundancy in this relationship. Several attributes DNO, DNAME, and LOCATION where the student is located will be stored repeatedly, inserted, deleted, and modified, which will also produce a situation similar to the above example. Reason: There is a transitive dependency in the relationship. That is, SNO -> DNO (and DNO -> SNO does not exist), DNO -> LOCATION, so the decision of the keyword SNO on the LOCATION function is achieved by transitively dependent on SNO -> LOCATION. In other words, SNO does not directly determine the non-primary attribute LOCATION. solves the purpose : no transitive dependency can be left in each relational model. Solution : Divided into two relations S (SNO, SNAME, DNO), D (DNO, DNAME, LOCATION) Note: relationship S cannot be without the foreign keyword DNO. Otherwise, the connection between the two relationships is lost.
BCNF : If all attributes (including primary attributes and non-primary attributes) of the relationship pattern R (U, F) do not pass any candidate keywords that depend on R, then the relationship R is said to belong to BCNF. Or the relational model R, if each determinant contains keywords (rather than being contained by keywords), then the relational model of BCNF. Example: Parts management relationship model WPE (WNO, PNO, ENO, QNT) respectively table warehouse number, part number, employee number, and quantity. There are the following conditions: a. There are multiple employees in a warehouse. b. An employee only works in one warehouse. c. One type of accessories in each warehouse is handled by a dedicated person, but one person can manage several accessories. d. The accessories of the same model can be placed in several warehouses. analysis:
- (1) QNT cannot be determined by the above PNO, which is determined by the combination of attributes (WNO, PNO), and there is a functional dependency (WNO, PNO) -> ENO.
- (2) Since one kind of accessory in each warehouse is in charge of a special person, and one person can manage several kinds of accessories, there are combination attributes (WNO, PNO) to determine the person in charge, with (WNO, PNO) -> ENO.
Because an employee only works in one warehouse, there is ENO -> WNO. Since one kind of accessory in each warehouse is in charge of a special person, and an employee only works in one warehouse, there is (WNO, PNO) -> QNT. Look for candidate keywords, because (WNO, PNO) -> QNT, (WNO, PNO) -> ENO, so (WNO, PNO) can determine the entire tuple, which is a candidate keyword. According to ENO->WNO, (ENO, PNO)->QNT, (ENO, PNO) can also determine the entire tuple as another candidate key. attributes ENO, WNO, and PNO are all primary attributes, and there is only one non-primary attribute QNT. is completely functionally dependent on any candidate keyword, and is directly dependent, so the relationship model is 3NF. Analyze the main attributes. Because ENO->WNO, the main attribute ENO is the determining factor of WNO, but it is not a keyword itself, but a part of the combined keyword. This causes the main attribute WNO to partially depend on another candidate keyword (ENO, PNO), because (ENO, PNO) -> ENO but the reverse is not true, and PNO -> WNO, so (ENO, PNO) -> WNO It is transitive dependency. Although there is no transitive dependence of non-primary attributes on candidate keywords, the transitive dependence of primary attributes on candidate keywords will also cause trouble. For example, a new employee is assigned to work in the warehouse, but is temporarily in the internship stage and is not independently responsible for the management tasks of certain accessories. Cannot be inserted into the relationship due to missing part of the PNO of the keyword. Also, if a person changes to take responsibility for safety regardless of the accessories, the employee will be deleted when the accessories are deleted.
- solution: divided into management EP (ENO, PNO, QNT), the key word is (ENO, PNO) work EW (ENO, WNO) its key word is ENO
- Disadvantages: has poor retention of functional dependence after decomposition.
In this case, due to the decomposition, the functional dependency (WNO, PNO) -> ENO is lost, thus destroying the original semantics. It does not reflect that a certain kind of part in each warehouse is handled by a dedicated person. It is possible that a part is managed by two or more people at the same time. Therefore, the relationship model after decomposition reduces some integrity constraints. A relationship is decomposed into multiple relations. To make the decomposition meaningful, the minimum requirement is not to lose the original information after decomposition. This information includes not only the data itself, but also the mutual constraints between the data represented by the functional dependence. decomposition is to achieve a higher degree of standardization, but two issues must be considered when decomposing: lossless connectivity and maintaining functional dependence. Sometimes it is often impossible to achieve both lossless connectivity and complete functional dependence. There are trade-offs as needed.
The four paradigms from 1NF to BCNF have the following relationship: BCNF contains 3NF contains 2NF contains 1NF
summary:
- Purpose : The purpose of standardization is to make the structure more reasonable, eliminate storage exceptions, make data redundancy as small as possible, and facilitate insertion, deletion and update
- Principle : Follow the principle of "one thing, one place" of conceptual simplification, that is, a relationship model describes an entity or a connection between entities. The essence of norms is the simplification of concepts.
- Method : Decompose the relational pattern projection into two or more relational patterns. Requirement: The set of decomposed relationship patterns should be "equivalent" to the original relationship pattern, that is, the original relationship can be restored without losing information through natural connection, and reasonable connections between attributes are maintained.
- Note : A relational pattern knot decomposition can get a set of different relational patterns, that is to say, the decomposition method is not unique. The requirement of minimum redundancy must be realized on the premise that the decomposed database can express all the information of the original database. Its fundamental goal is to save storage space, avoid data inconsistencies, improve the efficiency of relational operations, and meet application requirements at the same time. In fact, it is not necessarily required that all modes reach BCNF. Sometimes deliberately keeping some redundancy may be more convenient for data query. This is especially true for database systems that are not updated frequently and are frequently queried. In relational databases, in addition to functional dependence, there are problems of multi-value dependence and connection dependence, which puts forward higher-level standardization requirements such as the fourth normal form and the fifth normal form. Here, I will talk about it later.
Dear friends, what do you think after reading it? In fact, any book on the basic theory of databases will talk about these things. Considering that many netizens are renunciations halfway through, they come to make databases. I am looking for a large copy of a book. If you have any questions, please don't ask me. Go and find a book on relational database theory. Maybe it will be of great help to you. It is said that the above is the basic theory. Please think about it. Have you considered following the above paradigms when you are doing database design? Have you thought about it when database design is not doing well. Compared with the above, which paradigm is violated? Few of the database designs that I have seen are in line with the above paradigms. Generally speaking, everyone can follow the first paradigm, and there are very few people who fully comply with the second and third paradigms. Those who follow must be You are a master of database design. The BCNF paradigm has fewer opportunities to appear and will destroy the integrity. You can ignore it when designing. Of course, you can solve its shortcomings through triggers in ORACLE. When we do design together in the future, we also hope that everyone will abide by the above paradigms.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。