An in-depth understanding of aggregates in domain-driven design

Author | Songhua
Source | Alibaba Technical Official Account

The aggregate pattern is one of the harder DDD patterns to understand, and it is a key obstacle on the DDD learning curve. A well-designed aggregate expresses business consistency clearly and leads to a clean implementation. A poorly designed aggregate, or a design with no concept of aggregates at all, does the opposite.

The concept of an aggregate is not complicated. This article returns to the essence of aggregates and offers some practical suggestions on how to define and apply them.

One: What core problem do aggregates solve

Let's first look at the definition of an aggregate in the DDD Reference.

Cluster the entities and value objects into aggregates and define boundaries around each. Choose one entity to be the root of each aggregate, and allow external objects to hold references to the root only. Define properties and invariants for the aggregate as a whole and give enforcement responsibility to the root or to some designated framework mechanism.

This is typical "pattern language": it explains what an aggregate is, what an aggregate root is, and how to use aggregates. The problem with pattern language, however, is that it is highly condensed. Readers who already know the pattern find it easy to follow, but the readers who most need the explanation, those unfamiliar with the concepts, are likely to find it incomprehensible. To understand the nature of a pattern in depth, we have to return to the core problem it is trying to solve.

There is a famous saying in the field of software architecture:

"The architecture is not determined by the function of the system, but by the non-functional attributes of the system."

Put plainly: if you ignore factors such as performance, robustness, portability, modifiability, development cost, and time constraints, then any architecture and any method can deliver the system's functions, and the project can always be completed. What differs is the development time, the future maintenance cost, and how easily functions can be extended.

Reality, of course, does not let us ignore these factors. We always want the system to do well on understandability, maintainability, and extensibility, so that the business goals behind it can be reached quickly and economically. In practice, however, an unsuitable design method can add to the system's complexity. Let's look at an example first:

Assume the problem domain is an internal office-supplies procurement system.

Employees of an enterprise can submit a purchase request through the system. A request contains office supplies of several types and quantities (called purchase items). (1)
The supervisor is responsible for approving the purchase request. (2)
After approval, the system generates several orders, grouped by supplier. (3)

For the same problem there are several design approaches, such as database-centric design, object-oriented design, and DDD design as "correct OO".

With a database-centric modeling approach, database design comes first. I do see many teams still working this way and spending a lot of time discussing the database structure. To keep the diagram small, we show only the tables related to the purchase request. The structure is shown in the figure below:

Figure 1 Design from the perspective of the database

If you reason directly at a level as low as database design, the design is tedious and error-prone, and, more importantly, you face more complex problems in enforcing business rules and guaranteeing data consistency. For example:

If a purchase request is deleted, its purchase items and their associations must be deleted too. In database design, this constraint can be enforced with foreign keys (for example, with cascading deletes).
If several users process related data concurrently, a complicated locking mechanism may be needed. For example, if the approver is approving a purchase request while the submitter is modifying one of its purchase items, the data under review may become stale, or the update to the purchase item may fail.
If several related pieces of data are updated together, some updates may succeed while others fail. In database design, this constraint has to be enforced with transactions.

Every one of these problems has a solution. But, first, the discussion of the model has moved into implementation too early; it is disconnected from business concepts and makes continuous collaboration with business people difficult. Second, technical details become entangled with the details of business rules, so it is easy to attend to one and miss the other. Is there an approach that lets us focus on the problem domain instead of getting stuck in such technical details?

Object-oriented technology and ORM (Object-Relational Mapping) help us raise the level of abstraction. In the object-oriented world, the structure we see is this:

Figure 2 Design from the perspective of traditional OO

The object-oriented approach raises the level of abstraction and hides unnecessary technical details. For example, we no longer need to care about foreign keys and association tables. There are fewer model elements to care about, and complexity drops accordingly. However, traditional object-oriented methods impose no strict implementation constraints on how business rules are enforced. For example:

From a business perspective, once a purchase request has been approved, updating its purchase items again should be illegal. In the object-oriented world, however, you cannot stop programmers from writing code like this:

...
PurchaseRequest purchaseRequest = getPurchaseRequest(requestId);
PurchaseItem item = purchaseRequest.getItem(itemId);
item.setQuantity(1000);
savePurchaseItem(item);

Statement 1 gets an instance of the purchase request; statement 2 gets one item in the request. Statements 3 and 4 modify the purchase item and save it. If the purchase request has already been approved, couldn't this modification easily exceed the budget approved for the request?

Of course, programmers can add checks in the code to ensure consistency: whenever a purchase request is modified or saved, check the status of the purchaseRequest, and forbid the change if the status is not draft. But a PurchaseItem object can be obtained anywhere in the code and passed between methods, so with a careless OO design this business logic ends up scattered everywhere. Without design constraints, implementing this check reliably is not easy.

Let us go back to the essential question: if a purchase item is separated from its purchase request, does it have any value on its own? It does not. And if it has no value on its own, then is a change that is nominally made to a purchase item really a change to the purchase item, or is it in essence a change to the purchase request?

If we accept the conclusion that "modifying a purchase item is also modifying the purchase request", then we should not treat the purchase item and the purchase request separately, but as shown in the following figure:

Figure 3 Using aggregation to encapsulate objects

We organize the "purchase request" and the "purchase items" into a larger whole, called an "aggregate". Business logic internal to this aggregate, such as "after the purchase request is approved, its purchase items can no longer be changed", should be built into the aggregate. To achieve this, we agree that every operation on a purchase item (adding, deleting, modifying, and so on) is an operation on the purchase request object.

In other words, in the world of DDD there should never be a savePurchaseItem() method. It should be replaced by purchaseRequest.modifyPurchaseItem() and purchaseRequestRepository.save(purchaseRequest).

In the new object relationship, the purchase request acts as the gatekeeper (that is, the "aggregate root"), and the purchase items become internal data of the aggregate. Because the aggregate is now a whole, operations on it can only be carried out through the purchase request object, so business consistency can be guaranteed. This is in fact a more accurate description of the relationship between the objects: although purchase requests and purchase items are both modeled as objects, they are not equals. Purchase items are subordinate to the purchase request, and they are meaningful only as part of that whole.
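The gatekeeping role of the aggregate root can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the author's code: only PurchaseRequest, PurchaseItem, and modifyPurchaseItem() come from the text, while the Status enum, addPurchaseItem(), approve(), quantityOf(), and the exception types are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Member of the aggregate. It is never loaded, modified, or saved on its own.
final class PurchaseItem {
    private final String itemId;
    private int quantity;

    PurchaseItem(String itemId, int quantity) {
        this.itemId = itemId;
        this.quantity = quantity;
    }

    String getItemId() { return itemId; }
    int getQuantity() { return quantity; }

    // Package-private on purpose: by convention only PurchaseRequest calls it.
    void changeQuantity(int quantity) { this.quantity = quantity; }
}

// Aggregate root: the single entry point for every change to purchase items.
final class PurchaseRequest {
    enum Status { DRAFT, APPROVED } // hypothetical status values

    private final String requestId;
    private final Map<String, PurchaseItem> items = new LinkedHashMap<>();
    private Status status = Status.DRAFT;

    PurchaseRequest(String requestId) { this.requestId = requestId; }

    String getRequestId() { return requestId; }
    Status getStatus() { return status; }

    void addPurchaseItem(String itemId, int quantity) {
        requireDraft();
        items.put(itemId, new PurchaseItem(itemId, quantity));
    }

    void modifyPurchaseItem(String itemId, int newQuantity) {
        requireDraft();
        find(itemId).changeQuantity(newQuantity);
    }

    int quantityOf(String itemId) { return find(itemId).getQuantity(); }

    void approve() { status = Status.APPROVED; }

    private PurchaseItem find(String itemId) {
        PurchaseItem item = items.get(itemId);
        if (item == null) throw new IllegalArgumentException("Unknown item: " + itemId);
        return item;
    }

    // The business rule lives in one place instead of being scattered over callers.
    private void requireDraft() {
        if (status != Status.DRAFT) {
            throw new IllegalStateException("Request " + requestId + " is no longer a draft");
        }
    }
}
```

With this shape, the earlier sequence of item.setQuantity(1000) followed by savePurchaseItem(item) has no equivalent. A caller would invoke modifyPurchaseItem() on the request and then save the whole request through its repository, and the call is rejected once the request has been approved.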

The essence of an aggregate is to establish a boundary coarser than a single object, gathering closely related objects into one business object. The aggregate root serves as the entry point for external interaction, which ensures the consistency of the interrelated objects. Used well, aggregates make it easier to keep business rules consistent, reduce coupling between objects, make the design easier to understand, and reduce the chance of errors.

By organizing objects into aggregates, we build a new layer of encapsulation on top of the basic object level. Encapsulation simplifies concepts and hides details, so the number of model elements that the outside has to care about drops further, and complexity with it. However, introducing an encapsulation boundary raises new questions. For example, product information is also a meaningful part of a purchase item, so should the product be included in the "purchase request" aggregate? Should the submitter and approver be included as well? If we want business rules to be consistent with little effort, wouldn't it be best to put all business-related objects together? If some objects belong in an aggregate and others do not, is there a clear guiding principle? The next section answers these questions.

Two: Principles for dividing aggregates

As one level in DDD's object system, aggregates should also follow the principles of high cohesion and low coupling. This article proposes that the objects within an aggregate boundary should satisfy the following heuristic rules:

Life cycle consistency
Problem domain consistency
Scene frequency consistency
As few elements in the aggregate as possible

1 Life cycle consistency

Life cycle consistency means that the objects inside the aggregate boundary are existentially dependent on the aggregate root: if the aggregate root disappears, all other elements in the aggregate should disappear with it. In the preceding example, if the aggregate root (the purchase request) does not exist, its purchase items lose their meaning. By contrast, products and the users who submit or approve requests have no such relationship with the purchase request.

Life cycle consistency can be argued by contradiction: if an object is still meaningful after the aggregate root disappears, then the system must have some other way to access that object, which contradicts the definition of an aggregate. Therefore the other elements in the aggregate must become invalid once the aggregate root disappears. Violating life cycle consistency also causes serious implementation problems. Let's look at an example:

The life cycle of the User object is not consistent with that of the purchase request. Now suppose two pieces of code run in parallel:

Code 1 (for example, modifying a purchase request) obtains a purchase request object, modifies it, and saves it. Note that because the User object is embedded in the PurchaseRequest, the User object is saved at the same time.

PurchaseRequest r = purchaseRequestRepository.findOne(id);
// ... some modifications
purchaseRequestRepository.save(r);

Code 2 (for example, user management) obtains the submitter of the same purchase request and modifies it.

User user = userRepo.findOne(r.getSubmitter().getId());
// ... some modifications
userRepo.save(user);

This leads to a completely unacceptable consequence: the outcome of the modification to the User object is nondeterministic, because whichever save runs last overwrites the other. So, for any object whose membership in an aggregate is unclear, ask: if this object leaves the context of this aggregate, does it still have value on its own? If the answer is yes, the object should not be included in the aggregate:

The User object behind the submitter/approver still has reason to exist when separated from the PurchaseRequest.
The Product object can exist on its own when separated from the PurchaseRequest.

So these two objects do not belong to the purchase request aggregate.
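The overwrite described in this example can be reproduced with a small in-memory simulation. The sketch below is an assumption-laden illustration, not the author's code: the two repositories are plain maps that copy objects on load and save to mimic separate persistence contexts, the cascading save stands in for what an ORM would do with an embedded User, and the email and remark fields are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class User {
    final String id;
    String email; // hypothetical attribute
    User(String id, String email) { this.id = id; this.email = email; }
    User copy() { return new User(id, email); }
}

// Incorrect design: the request embeds a full User instead of referring to it by Id.
final class PurchaseRequest {
    final String id;
    final User submitter;
    String remark = ""; // hypothetical attribute
    PurchaseRequest(String id, User submitter) { this.id = id; this.submitter = submitter; }
}

final class UserRepo {
    private final Map<String, User> rows = new HashMap<>();
    User findOne(String id) { return rows.get(id).copy(); }
    void save(User user) { rows.put(user.id, user.copy()); }
}

final class PurchaseRequestRepository {
    private final Map<String, String> remarks = new HashMap<>();
    private final Map<String, String> submitterIds = new HashMap<>();
    private final UserRepo userRepo;

    PurchaseRequestRepository(UserRepo userRepo) { this.userRepo = userRepo; }

    PurchaseRequest findOne(String id) {
        PurchaseRequest r = new PurchaseRequest(id, userRepo.findOne(submitterIds.get(id)));
        r.remark = remarks.get(id);
        return r;
    }

    void save(PurchaseRequest r) {
        remarks.put(r.id, r.remark);
        submitterIds.put(r.id, r.submitter.id);
        userRepo.save(r.submitter); // cascading save of the embedded User
    }
}

final class LostUpdateDemo {
    // Interleaves "Code 1" and "Code 2" and returns the email that ends up stored.
    static String run() {
        UserRepo userRepo = new UserRepo();
        PurchaseRequestRepository requestRepo = new PurchaseRequestRepository(userRepo);
        userRepo.save(new User("u1", "old@example.com"));
        requestRepo.save(new PurchaseRequest("pr1", userRepo.findOne("u1")));

        PurchaseRequest r = requestRepo.findOne("pr1"); // Code 1 loads the request
        User user = userRepo.findOne(r.submitter.id);   // Code 2 loads the user
        user.email = "new@example.com";
        userRepo.save(user);                            // Code 2 saves its change
        r.remark = "urgent";
        requestRepo.save(r);                            // Code 1 saves a stale User copy

        return userRepo.findOne("u1").email;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Under these assumptions the stored email ends up as old@example.com: the change made by user management is silently lost because the request was saved last. With the opposite interleaving the new value would survive, which is the nondeterminism the text describes.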

2 Problem domain consistency

The second principle is problem domain consistency. In effect, this is the same constraint as the bounded context: an aggregate is a tactical pattern, and the model it represents must lie within a single bounded context.

Principle 1 says that life cycle consistency can be used as the basis for dividing aggregates, but people may disagree about whether an object "is still meaningful apart from another object". For example, if a purchase request is deleted, is an order generated from that request still of value? The order example may lead into a different dispute, which can be sidestepped at the business-process level (as long as the order exists, the purchase request cannot be deleted), so let us switch to a very similar example:

Consider an online forum where users can comment on other users' articles. The article should obviously be an aggregate root. If the article is deleted, its comments appear to disappear with it. So can comments belong to the article aggregate?

Let us consider whether comments might have other uses. On a book website, for example, users can comment on books. If we let the article aggregate hold the comment objects just because deleting an article logically entails deleting its comments, we clearly restrict where comments can be applied. The plain fact is that the concept of a comment is essentially far removed from the concept of an article. So we have a new principle that takes precedence over principle 1: objects that do not belong to the same problem domain should not appear in the same aggregate. Readers familiar with DDD will recognize that this corresponds to the strategic pattern of the bounded context. For reasons of space, we will not go into it further here.

Figure 4 Problem domain consistency

Because an aggregate root cannot guarantee consistency outside its aggregate, we rely on "eventual consistency" between aggregates. For example, when an article is deleted, an article-deleted message is sent. When the comment system receives the message, it deletes the comments that belong to the article.
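The article and comment flow can be sketched in Java with an in-memory queue standing in for a real message broker. All names here (ArticleDeleted, MessageQueue, ArticleService, CommentService, deliverPending()) are hypothetical, and delivery is triggered by hand so that the window of temporary inconsistency is visible.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Message announcing that an article has been deleted.
final class ArticleDeleted {
    final String articleId;
    ArticleDeleted(String articleId) { this.articleId = articleId; }
}

// Stand-in for a message broker: messages are queued and delivered later.
final class MessageQueue {
    private final Queue<ArticleDeleted> pending = new ArrayDeque<>();
    private final List<Consumer<ArticleDeleted>> subscribers = new ArrayList<>();

    void subscribe(Consumer<ArticleDeleted> subscriber) { subscribers.add(subscriber); }
    void publish(ArticleDeleted message) { pending.add(message); }

    void deliverPending() {
        ArticleDeleted message;
        while ((message = pending.poll()) != null) {
            for (Consumer<ArticleDeleted> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }
}

// Owns the article aggregate; knows nothing about comments.
final class ArticleService {
    private final Map<String, String> titlesById = new HashMap<>();
    private final MessageQueue queue;

    ArticleService(MessageQueue queue) { this.queue = queue; }

    void publishArticle(String id, String title) { titlesById.put(id, title); }
    boolean exists(String id) { return titlesById.containsKey(id); }

    void deleteArticle(String id) {
        if (titlesById.remove(id) != null) {
            queue.publish(new ArticleDeleted(id));
        }
    }
}

// Owns the comment aggregate; comments refer to their subject by Id only,
// so the same service could also hold comments on books.
final class CommentService {
    private final Map<String, List<String>> commentsBySubject = new HashMap<>();

    CommentService(MessageQueue queue) {
        queue.subscribe(message -> commentsBySubject.remove(message.articleId));
    }

    void addComment(String subjectId, String text) {
        commentsBySubject.computeIfAbsent(subjectId, key -> new ArrayList<>()).add(text);
    }

    int countFor(String subjectId) {
        List<String> comments = commentsBySubject.get(subjectId);
        return comments == null ? 0 : comments.size();
    }
}
```

Between deleteArticle() and deliverPending() the comments still exist even though the article is gone. That temporary gap is what "eventual" means here, and it is the price of keeping the two aggregates independent.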

3 Scene frequency consistency

The two principles above are enough to identify most aggregates. Still, some more complicated cases remain. Consider the relationship between "product", "version", and "feature" in software development. Do "product" and "version" belong to the same problem domain? The relationship between these concepts may not be as clear as that between "article" and "comment". That is fine, because we have another heuristic rule to resolve the ambiguity: the principle of "scene frequency consistency".

A scene, or scenario, is a concrete description of a business use case; it reflects how users use the system to achieve a business goal. We can observe which operations on domain objects, such as viewing and modifying, each scenario involves. Consistent operation frequency across scenarios is a key sign that objects belong inside the same aggregate. Objects that are often operated on together usually belong to the same aggregate, while objects that are rarely dealt with at the same time should generally not be placed in one aggregate.

Take the three concepts "product", "version", and "feature" in the following figure as an example. A product does contain many features, and these features are released through a series of versions. However, operations at the product level, such as listing all products, need neither the details of a particular feature nor information about a particular version. During version planning we do use the feature list, but most of the time we do not look at feature details, and we are even less likely to edit a feature description while planning a version.

Figure 5 Inappropriate aggregation

According to this principle, we divide them into the following three aggregates:

Figure 6 More reasonable aggregation

Dividing aggregates by scene consistency also benefits the implementation. If objects that are not operated on in the same scenario are placed in one aggregate, every operation on one object has to load all the information of the others, which is pointless. At the implementation level, if loosely related objects sit in the same aggregate, they are often modified concurrently by different scenarios, which raises the likelihood of conflicts. Therefore, objects that are operated on in different scenarios, or an object that is used across different scenarios, should be considered for separate aggregates.
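One possible reading of this three-way split, sketched in Java. The class and method names are assumptions made for illustration, and plain String identifiers stand in for Id value objects to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Aggregate 1: all that a "list all products" scenario needs to load.
final class Product {
    private final String id;
    private final String name;

    Product(String id, String name) { this.id = id; this.name = name; }

    String getId() { return id; }
    String getName() { return name; }
}

// Aggregate 2: version planning works with feature Ids, not feature details.
final class Version {
    private final String id;
    private final String productId;
    private final List<String> plannedFeatureIds = new ArrayList<>();

    Version(String id, String productId) { this.id = id; this.productId = productId; }

    String getId() { return id; }
    String getProductId() { return productId; }

    void planFeature(String featureId) {
        if (!plannedFeatureIds.contains(featureId)) {
            plannedFeatureIds.add(featureId);
        }
    }

    List<String> getPlannedFeatureIds() {
        return Collections.unmodifiableList(plannedFeatureIds);
    }
}

// Aggregate 3: a feature description is edited without loading any version.
final class Feature {
    private final String id;
    private final String productId;
    private String description;

    Feature(String id, String productId, String description) {
        this.id = id;
        this.productId = productId;
        this.description = description;
    }

    String getId() { return id; }
    String getProductId() { return productId; }
    String getDescription() { return description; }

    void changeDescription(String description) { this.description = description; }
}
```

Because Version holds only feature Ids, editing a feature description and planning a version touch different aggregates, so the two scenarios cannot conflict with each other on save.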

4 Smallest possible aggregate

The essence of an aggregate is to manage the complexity that consistency problems bring. So if none of the three kinds of consistency above would be broken, objects do not need to be placed in the same aggregate. In the object-oriented world, most aggregates consist of only one business concept (that is, a class name and its attributes in the domain model, plus the Id objects discussed below).

By the analysis above, in the purchase request example the purchase request, its attributes (such as status and submission time), and its purchase items belong to one aggregate. Products and users, however, do not belong to the purchase request aggregate. How, then, are these aggregates related? We introduce a new kind of value object to solve this, as shown in the figure below. The figure also marks whether each object is a value object or an entity.

Figure 7 Refined aggregate encapsulation

In the purchase request aggregate, only the aggregate root (the purchase request) is an entity; all other objects, including the Id objects used as external references, are value objects.

The corresponding code is as follows:
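A minimal Java sketch of such a structure, under stated assumptions: the class names UserId and ProductId, the field names, and the approve() method are illustrative and are not taken from the figure. It shows the purchase request as the only entity, with purchase items and external references modeled as immutable value objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Value object: refers to the User aggregate root by identity only.
final class UserId {
    private final String value;
    UserId(String value) { this.value = Objects.requireNonNull(value); }
    String getValue() { return value; }
    @Override public boolean equals(Object o) {
        return o instanceof UserId && ((UserId) o).value.equals(value);
    }
    @Override public int hashCode() { return value.hashCode(); }
}

// Value object: refers to the Product aggregate root by identity only.
final class ProductId {
    private final String value;
    ProductId(String value) { this.value = Objects.requireNonNull(value); }
    String getValue() { return value; }
    @Override public boolean equals(Object o) {
        return o instanceof ProductId && ((ProductId) o).value.equals(value);
    }
    @Override public int hashCode() { return value.hashCode(); }
}

// Value object inside the aggregate: immutable, no identity of its own.
final class PurchaseItem {
    private final ProductId productId;
    private final int quantity;

    PurchaseItem(ProductId productId, int quantity) {
        if (quantity <= 0) throw new IllegalArgumentException("quantity must be positive");
        this.productId = Objects.requireNonNull(productId);
        this.quantity = quantity;
    }

    ProductId getProductId() { return productId; }
    int getQuantity() { return quantity; }
}

// Aggregate root and the only entity in the aggregate.
final class PurchaseRequest {
    private final String id;
    private final UserId submitterId;
    private UserId approverId; // stays null until the request is approved
    private final List<PurchaseItem> items = new ArrayList<>();

    PurchaseRequest(String id, UserId submitterId) {
        this.id = Objects.requireNonNull(id);
        this.submitterId = Objects.requireNonNull(submitterId);
    }

    void addItem(ProductId productId, int quantity) {
        items.add(new PurchaseItem(productId, quantity));
    }

    void approve(UserId approverId) {
        this.approverId = Objects.requireNonNull(approverId);
    }

    String getId() { return id; }
    UserId getSubmitterId() { return submitterId; }
    UserId getApproverId() { return approverId; }
    List<PurchaseItem> getItems() { return Collections.unmodifiableList(items); }
}
```

Loading a PurchaseRequest in this form never pulls in a User or a Product. A caller that needs the submitter's name would look it up separately by UserId.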

Introducing Id value objects raises several questions worth discussing.

First, introducing Id value objects breaks up large aggregates and speeds up queries, but in some scenarios it means the referenced information has to be fetched with a second query, and the ORM's eager or lazy fetching can no longer be used to traverse the object graph. Is this a loss? The short answer is no. Do not be tempted by the apparent convenience of nesting objects that are not part of the aggregate; it causes more trouble than it saves. Such lookups should be handled by an outside service, such as an application-layer service.

Second, can the extra Id value objects introduced to break up aggregates be regarded as part of the domain model or of the "ubiquitous language"? My view is that they are part of DDD's implementation mechanism. They belong to the domain model, but their visibility should be limited to the development team.

There is no need to discuss these concepts with business people. Communicate with them using only the entities, value objects, domain services, and domain events identified in the problem domain. Id value objects, repositories, and factories, as well as aggregates and aggregate roots, are left for implementers to understand and use. They are still part of the domain model and of the ubiquitous language, but just as a view can selectively omit information, these concepts should be left out of communication with business people and out of business descriptions.

Third, note that an Id object may only refer to the Id of another aggregate root. Because only aggregate roots can be referenced from outside, the Id of an aggregate root should be globally unique. Objects inside an aggregate, whether entities or value objects, only need Ids that are unique within the aggregate.

Three: Implementation considerations

1 Define repositories and factories around aggregates

The Factory pattern and the Repository pattern are both DDD patterns on the implementation side. In the pattern relationship diagram in the DDD Reference, factories and repositories are connected not only to aggregates but also to entities, and factories are even connected to value objects. This article holds, however, that these connections differ in strength and in value.

The Factory pattern obviously exists to separate the construction of objects from their use, but in the context of DDD it carries a deeper meaning. The relationships among the objects in an aggregate may be complicated, and business consistency must be guaranteed, so constructing aggregates through a factory encapsulates that complexity better. Admittedly, the Factory pattern is also useful for constructing complex entities and value objects outside of aggregates, but that matters only at the design or implementation level and has nothing to do with the business model.

Although aggregate factories and general object factories share the name Factory, the factory that DDD designs around the aggregate as its basic unit does more to reduce system complexity. As a design constraint, the only factory visible outside an aggregate should be the aggregate factory. (Factories for domain events are also meaningful, but domain events are somewhat removed from the topic of this article, so we will not discuss them here.)

The Repository pattern does not simply mean persistence, and it is not a database access layer, so do not misread it. More importantly, a repository is the storage mechanism for an aggregate: the outside world can access the aggregate only through its repository. The repository manages the aggregate as a whole. As a design constraint, therefore, an aggregate can have only one repository, the one named after the aggregate root. No other object in the aggregate should have a repository.
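The two constraints, one factory and one repository per aggregate, could look roughly like this in Java. The sketch rests on assumptions: the in-memory implementation, the generated "PR-n" identifiers, and the rule that a request needs at least one item are all hypothetical and serve only to give the factory something to enforce.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Simplified aggregate root; items are kept as product Id -> quantity.
final class PurchaseRequest {
    private final String id;
    private final String submitterId;
    private final Map<String, Integer> quantities = new LinkedHashMap<>();

    PurchaseRequest(String id, String submitterId) {
        this.id = id;
        this.submitterId = submitterId;
    }

    void addItem(String productId, int quantity) { quantities.put(productId, quantity); }

    String getId() { return id; }
    String getSubmitterId() { return submitterId; }
    int quantityOf(String productId) { return quantities.getOrDefault(productId, 0); }
}

// Aggregate factory: the only place that assembles a complete, valid aggregate.
final class PurchaseRequestFactory {
    private final AtomicLong sequence = new AtomicLong();

    PurchaseRequest create(String submitterId, Map<String, Integer> requestedItems) {
        if (requestedItems.isEmpty()) {
            // Hypothetical invariant, used only for illustration.
            throw new IllegalArgumentException("a purchase request needs at least one item");
        }
        PurchaseRequest request =
                new PurchaseRequest("PR-" + sequence.incrementAndGet(), submitterId);
        for (Map.Entry<String, Integer> entry : requestedItems.entrySet()) {
            request.addItem(entry.getKey(), entry.getValue());
        }
        return request;
    }
}

// Repository named after the aggregate root. It stores and returns whole
// aggregates; there is deliberately no repository for purchase items.
interface PurchaseRequestRepository {
    Optional<PurchaseRequest> findOne(String id);
    void save(PurchaseRequest request);
}

final class InMemoryPurchaseRequestRepository implements PurchaseRequestRepository {
    private final Map<String, PurchaseRequest> store = new HashMap<>();

    @Override
    public Optional<PurchaseRequest> findOne(String id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public void save(PurchaseRequest request) {
        store.put(request.getId(), request);
    }
}
```

The repository interface exposes only whole-aggregate operations, so a database-backed implementation could replace the in-memory one without changing any caller.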

Figure 8 Aggregate and repository

2 The code structure is consistent with the aggregation

Careful readers will have noticed that the package organization in the figure above also matches the aggregate, with the name of the aggregate root used as the package name. This is my habitual way of organizing code. I treat the aggregate as one level of the code structure (with other levels above it, such as bounded contexts and modules), and I put all of the aggregate's entities (including the aggregate root), value objects, repository, factory, and so on into the same package. The code structure then closely mirrors the structure of the domain model, which narrows the representational gap and helps manage the complexity of the object world.

3 Aggregates must not cross deployment boundaries

Deployment boundaries are a complex topic, and this article covers only what relates to aggregates. First, if the system uses a microservice architecture, the deployment boundary should match the bounded context boundary. Do not let the deployment granularity be coarser than the bounded context; this gives better business flexibility and scalability. Second, the smallest service boundary must not be finer than an aggregate, or many data consistency problems will follow: consistency between microservices generally has to rely on eventual consistency, so an aggregate that spans deployment boundaries would be a disaster for consistency. I have seen unreasonable advice on dividing microservices in some books, such as turning the create, read, update, and delete operations of each object into a service. In my opinion this advice is wrong.

4 Aggregation improves system performance and scalability

Many people are troubled by inefficient queries in ORM. Why does this happen? The earlier example makes it clear. We add Spring JPA annotations to the incorrect aggregate example from before:

Without the concept of an aggregate, or with an incorrectly oversized aggregate, every query for a PurchaseRequest pulls a large number of objects out of the system and consumes a lot of computing resources. What if the User is itself a very large object? It is a case of "pull up the radish and the mud comes with it". Naturally, performance cannot be good.

Some readers may say: I don't use eager fetch, I can use lazy fetch. Yes, that is indeed better for performance, but unfortunately the data access context then has to be kept open the whole time, which greatly increases the chance of errors and also makes distributed design harder.

Small aggregates do not have this problem at all. Each object accessed (in fact, each aggregate) is not very large and contains just the data required; data integrity and business integrity are guaranteed; and the system can easily be scaled horizontally, so performance and scalability are both satisfied.

Four: Summary

Modeling is one of the ways we understand the real world and reduce the complexity of problems. As one level of domain modeling, aggregates provide information hiding, raise the level of abstraction, encapsulate closely related business logic, ensure the consistency of system data, and improve system performance through appropriate boundaries.

This article discussed the definition and value of aggregates. In a nutshell:

An aggregate is a level of modeling in the object-oriented world. It hides fine-grained objects and constrains the coupling between objects.
An aggregate is a consistency boundary and an encapsulation of closely related objects. It encapsulates entities and value objects and takes the most important entity as the aggregate root. As the aggregate's only external entry point, the aggregate root ensures the consistency of business rules and data.

This article also discussed four heuristic rules for identifying aggregates:

Life cycle consistency
Problem domain consistency
Scene frequency consistency
As few elements in the aggregate as possible

From an implementation perspective, the granularity of repositories and factories should match that of aggregates, and the code structure and deployment structure can also be aligned with aggregates. An implementation that is consistent with the domain model is exactly the goal and value of domain-driven design as "correct OO".

