My thoughts on Spring Circular Dependencies

Foreword

Even today, many people still argue about circular dependencies, and many interviewers love to ask about them, sometimes asking about nothing else in Spring. In China, this seems to have become a must-learn topic for Spring, and many people talk about it as if it were a feature. I think it is a stain on the many good designs in the Spring framework: an implementation that compromises for bad design. Note that there are no circular dependencies in the entire Spring project itself. Is that because the Spring project is too simple? On the contrary, Spring is more complex than most projects. Likewise, the Spring Boot 2.6.0 release notes state that circular references are prohibited by default. If support is required, it has to be enabled manually with spring.main.allow-circular-references=true (it used to be enabled by default), but it is strongly recommended to break the circular dependencies by changing the project.

In this article, I want to share my thoughts on circular dependencies. Before that, I will walk you through some background on how they arise and how Spring handles them.

Dependency injection

Since circular dependencies occur in the process of dependency injection, let's briefly review the process of dependency injection.

Case:

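// Each class lives in its own file in the package com.my.demo
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.stereotype.Component;

@Component
public class Bar {
}

@Component
public class Foo {

    // Bar is injected into this field by Spring
    @Autowired
    private Bar bar;
}

@ComponentScan(basePackages = "com.my.demo")
public class Main {

    public static void main(String[] args) {
        // Scans com.my.demo and registers Foo and Bar as beans
        AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext(Main.class);
        context.getBean("foo");
    }

}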

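The above is a very simple introductory Spring example in which Bar is injected into Foo. The injection happens inside getBean("foo"). (For non-lazy singletons, Spring already calls getBean internally while the context starts up, so the explicit call here simply returns the finished bean; the steps are the same either way.)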

The process is as follows:

1. Find the corresponding BeanDefinition by the name passed in ("foo"). If you don't know what a BeanDefinition is, think of it as an object that encapsulates the class information of a bean, through which Spring can obtain the beanClass and the annotations declared on it.

2. Instantiate the beanClass from the BeanDefinition through reflection to get what we call the bean (foo).

3. Parse the beanClass information and find the field (bar) marked with the @Autowired annotation.

4. Use the field name (bar) to call context.getBean('bar') again, repeating the steps above.

5. Set the obtained bean (bar) on the bar field of foo.

That is a simplified description of the process.
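The five steps can be sketched without Spring. This is a minimal illustration under stated assumptions, not Spring's implementation: the `@Inject` annotation and the `NaiveContainer`, `DemoFoo` and `DemoBar` classes are hypothetical names invented for this example, and dependencies are resolved by field name only.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for @Autowired so the sketch runs without Spring.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Inject {
}

class DemoBar {
}

class DemoFoo {
    @Inject
    DemoBar bar;
}

// Hypothetical mini container; the name -> class map plays the role of the BeanDefinition.
class NaiveContainer {
    private final Map<String, Class<?>> definitions = new HashMap<>();

    void register(String beanName, Class<?> beanClass) {
        definitions.put(beanName, beanClass);
    }

    Object getBean(String beanName) {
        // 1. Look up the definition by bean name.
        Class<?> beanClass = definitions.get(beanName);
        if (beanClass == null) {
            throw new IllegalArgumentException("No bean named " + beanName);
        }
        try {
            // 2. Instantiate the bean through reflection.
            Constructor<?> constructor = beanClass.getDeclaredConstructor();
            constructor.setAccessible(true);
            Object bean = constructor.newInstance();
            // 3. Find the fields marked for injection.
            for (Field field : beanClass.getDeclaredFields()) {
                if (field.isAnnotationPresent(Inject.class)) {
                    // 4. Resolve the dependency by field name (recursive call).
                    Object dependency = getBean(field.getName());
                    // 5. Set the dependency on the bean.
                    field.setAccessible(true);
                    field.set(bean, dependency);
                }
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not create bean " + beanName, e);
        }
    }
}
```

After registering `foo` and `bar`, calling `getBean("foo")` returns a `DemoFoo` whose `bar` field is set. Note that this sketch has no cache at all, so every call creates new objects.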

What is a circular dependency

A circular dependency means that A depends on B while B also depends on A, so the two form a cycle. In the example above, if Bar also depended on Foo, we would have a circular dependency.

How Spring solves circular dependencies

The getBean process is essentially a recursive function, and a recursive function needs a termination condition. In getBean, the recursion terminates during property filling, once a bean has no further dependencies to resolve. In the process above, Foo depends on Bar. What happens if Bar also depends on Foo?

1. Create a Foo object

2. When filling the property, it is found that the Foo object depends on Bar

3. Create the Bar object

4. When filling the property, it is found that the Bar object depends on Foo

5. Create a Foo object

6. When filling the property, it is found that the Foo object depends on Bar....


Obviously, recursion has become an infinite loop at this time. How to solve such a problem?

Add cache

We can add a cache to this process: put the foo object into the cache right after instantiating it, look in the cache first on every getBean call, and create the object only when it is not found there.

The cache is a Map, the key is beanName, and the value is Bean. The process after adding the cache is as follows:

1. getBean('foo')

2. Get foo from cache, if not found, create foo

3. After the creation, put foo into the cache

4. When filling the property, it is found that the Foo object depends on Bar

5. getBean('bar')

6. Get bar from cache, if not found, create bar

7. After the creation is completed, put the bar into the cache

8. When filling the property, it is found that the Bar object depends on Foo

9. getBean('foo')

10. Get foo from the cache; it is found, so return it

11. Set foo to the bar property and return the bar object

12. Set bar to the foo property and return

After adding a layer of caching in the above process, we found that it can indeed solve the problem of circular dependencies.
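The twelve steps can be traced in a small sketch. It is a simplification under stated assumptions: `CachedBean` and `SingleCacheContainer` are hypothetical names, dependencies are declared by name instead of being discovered through annotations, and nothing here is thread-safe.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bean type: a name plus the beans injected into it.
class CachedBean {
    final String name;
    final Map<String, CachedBean> properties = new HashMap<>();

    CachedBean(String name) {
        this.name = name;
    }
}

// Hypothetical single-cache container (not thread-safe).
class SingleCacheContainer {
    // beanName -> names of the beans it depends on
    private final Map<String, List<String>> dependencies = new HashMap<>();
    // beanName -> bean, filled right after instantiation
    private final Map<String, CachedBean> cache = new HashMap<>();

    void register(String beanName, String... dependsOn) {
        dependencies.put(beanName, Arrays.asList(dependsOn));
    }

    CachedBean getBean(String beanName) {
        CachedBean cached = cache.get(beanName);
        if (cached != null) {
            return cached; // cache hit: this is where the recursion stops
        }
        CachedBean bean = new CachedBean(beanName); // create
        cache.put(beanName, bean);                  // cache before filling properties
        List<String> names = dependencies.getOrDefault(beanName, Collections.emptyList());
        for (String dependency : names) {
            bean.properties.put(dependency, getBean(dependency)); // fill properties
        }
        return bean;
    }
}
```

With `foo` registered as depending on `bar` and `bar` on `foo`, `getBean("foo")` terminates, and the `foo` stored inside `bar` is the same instance that is returned to the caller.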

Null pointers appear in multiple threads

As you may have noticed, this design breaks down when multiple threads are involved.

Let's assume two threads are calling getBean('foo').

1. Thread 1 is about to fill the properties, that is, it has just put foo into the cache

2. Thread 2 is slightly behind and is about to get foo from the cache

Now suppose thread 1 is suspended and thread 2 runs. Thread 2 executes the logic of getting foo from the cache and succeeds, because thread 1 has just put foo there, but at this point foo has not been populated with its properties!

If thread 2 takes this foo object, whose bar has not been set yet, and happens to use the bar property, it gets a NullPointerException, which is unacceptable!
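The interleaving can be reproduced deterministically. In the sketch below, two latches force the order described above; `EarlyFoo` and `CacheRaceDemo` are hypothetical names, and the latches only stand in for an unlucky thread schedule that would otherwise happen by chance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class EarlyFoo {
    volatile Object bar; // stays null until the properties are filled
}

class CacheRaceDemo {
    // Returns the value of foo.bar as seen by a second thread that reads foo from
    // the cache while the first thread is paused before filling the properties.
    static Object barSeenBySecondThread() {
        Map<String, EarlyFoo> cache = new ConcurrentHashMap<>();
        CountDownLatch fooCached = new CountDownLatch(1);
        CountDownLatch resumeThreadOne = new CountDownLatch(1);

        Thread threadOne = new Thread(() -> {
            EarlyFoo foo = new EarlyFoo();
            cache.put("foo", foo);          // foo becomes visible to other threads
            fooCached.countDown();
            try {
                resumeThreadOne.await();    // simulate thread one being suspended
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            foo.bar = new Object();         // fill properties, too late for thread two
        });
        threadOne.start();

        try {
            fooCached.await();
            EarlyFoo seen = cache.get("foo"); // thread two: cache hit, nothing to create
            Object bar = seen.bar;            // null here; calling a method on it would throw
            resumeThreadOne.countDown();
            threadOne.join();
            return bar;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

`CacheRaceDemo.barSeenBySecondThread()` returns `null`: the second thread received a `foo` whose `bar` had not been set yet.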

So how do we solve this new problem?

Lock

The easiest way to solve the multithreading problem is to lock.

We can lock it before [Get from Cache] and unlock it after [Fill Properties].

In this way, thread 2 must wait for thread 1 to complete the entire getBean process before getting the foo object in the cache.
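A minimal sketch of this locking scheme follows. `LockedBean` and `LockingContainer` are hypothetical names, and the choice of `ReentrantLock` is an assumption of this example: the lock has to be reentrant because getBean calls itself while filling properties.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class LockedBean {
    final String name;
    final Map<String, LockedBean> properties = new HashMap<>();

    LockedBean(String name) {
        this.name = name;
    }
}

// Hypothetical container that holds one lock for the whole getBean process.
class LockingContainer {
    private final Map<String, List<String>> dependencies = new HashMap<>();
    private final Map<String, LockedBean> cache = new HashMap<>();
    // Reentrant, so the recursive getBean call made while filling properties does not deadlock.
    private final ReentrantLock lock = new ReentrantLock();

    // Call before the container is shared between threads.
    void register(String beanName, String... dependsOn) {
        dependencies.put(beanName, Arrays.asList(dependsOn));
    }

    LockedBean getBean(String beanName) {
        lock.lock(); // lock before [Get from Cache]
        try {
            LockedBean cached = cache.get(beanName);
            if (cached != null) {
                return cached;
            }
            LockedBean bean = new LockedBean(beanName);
            cache.put(beanName, bean);
            List<String> names = dependencies.getOrDefault(beanName, Collections.emptyList());
            for (String dependency : names) {
                bean.properties.put(dependency, getBean(dependency));
            }
            return bean;
        } finally {
            lock.unlock(); // unlock after [Fill Properties]
        }
    }
}
```

Every caller, including one that only wants a bean that was finished long ago, now has to acquire the lock.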

We know that locking can solve multi-threading problems, but we also know that locking can cause performance problems.

Think about it: the lock exists to ensure that an object taken from the cache is complete. But what if all the objects in the cache are already complete, or at least some of them are?

Suppose we have three objects A, B, C

1. The A object has been created, and the A object in the cache is complete

2. The B object is still being created, and some properties of the B object in the cache have not been filled yet

3. The C object has not been created

Now we call getBean('A'). What should we expect? Should the A object be returned directly from the cache, or should we wait until the lock is acquired before getting it?

Obviously, we would rather get the A object directly and return it, because we know it is complete and there is no need to acquire the lock.

However, the design above cannot meet this requirement.

L2 cache

The question above boils down to this: how do we distinguish a complete object from an incomplete one? If we know the object is complete, we can return it directly; if it is incomplete, we need to acquire the lock.

We can do this by adding another level of cache. The first-level cache stores complete objects, and the second-level cache stores incomplete objects. Since the latter are put into the cache right after the bean is created, we call them early objects here.

Now, when we need the A object, we only have to check whether the first-level cache contains it. If it does, the A object is complete and can be used directly. If not, the A object either has not been created yet or is still being created, so we continue with the logic of lock --> get the object from the second-level cache --> create the object.

At this point the process is as follows:

1. getBean('foo')

2. Get foo from the first-level cache: not found

3. Lock

4. Get foo from the second-level cache: not found

5. Create the foo object

6. Put the foo object into the second level cache

7. Fill properties

8. Put the foo object into the first level cache, at this time the foo object is already a complete object

9. Delete the foo object in the second level cache

10. Unlock and return

Based on this process, let's simulate what happens when a circular dependency occurs.
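One way to run that simulation is the sketch below. `TwoLevelBean` and `TwoLevelContainer` are hypothetical names. The second look at the first-level cache after taking the lock is an addition of this sketch that is not in the step list: without it, a thread that waited for the lock would create the bean a second time.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TwoLevelBean {
    final String name;
    final Map<String, TwoLevelBean> properties = new HashMap<>();

    TwoLevelBean(String name) {
        this.name = name;
    }
}

// Hypothetical container with a lock-free first level and a locked second level.
class TwoLevelContainer {
    private final Map<String, List<String>> dependencies = new HashMap<>();
    // First level: complete beans, read without locking.
    private final Map<String, TwoLevelBean> completeBeans = new ConcurrentHashMap<>();
    // Second level: early beans, only accessed while holding the lock.
    private final Map<String, TwoLevelBean> earlyBeans = new HashMap<>();

    // Call before the container is shared between threads.
    void register(String beanName, String... dependsOn) {
        dependencies.put(beanName, Arrays.asList(dependsOn));
    }

    TwoLevelBean getBean(String beanName) {
        TwoLevelBean complete = completeBeans.get(beanName);
        if (complete != null) {
            return complete; // fast path, no lock needed
        }
        synchronized (this) {
            // Re-check (added in this sketch): another thread may have finished
            // the bean while this one was waiting for the lock.
            complete = completeBeans.get(beanName);
            if (complete != null) {
                return complete;
            }
            TwoLevelBean early = earlyBeans.get(beanName);
            if (early != null) {
                return early; // circular reference: hand out the early object
            }
            TwoLevelBean bean = new TwoLevelBean(beanName);
            earlyBeans.put(beanName, bean);
            List<String> names = dependencies.getOrDefault(beanName, Collections.emptyList());
            for (String dependency : names) {
                bean.properties.put(dependency, getBean(dependency));
            }
            completeBeans.put(beanName, bean); // now complete
            earlyBeans.remove(beanName);
            return bean;
        }
    }
}
```

With `foo` and `bar` depending on each other, the inner `getBean("foo")` issued while filling `bar` misses the first level, finds the early `foo` in the second level, and returns it, so the cycle is closed with a single `foo` instance.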

Now both the object completeness problem and our performance requirements are met. Perfect!

Proxy object

As you know, Java has not only ordinary objects but also proxy objects. Does the process still hold up when a proxy object is involved in a circular dependency?

Let's first look at when the proxy object is created.

In Spring, creating the proxy object is the last step, which is what we often call [after initialization].

Now, let's try to add this part of the logic to the previous process

Obviously, the final foo object is actually a proxy object, but the object bar depends on is still a normal foo object!

Therefore, when there is a circular dependency of proxy objects, the previous process cannot meet the requirements!
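The mismatch is easy to reproduce in a sketch. `RawBean`, `BeanProxy` and `LateProxyContainer` are hypothetical names, the proxy is modeled as a plain wrapper object, and locking is left out to keep the example short.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RawBean {
    final String name;
    final Map<String, Object> properties = new HashMap<>();

    RawBean(String name) {
        this.name = name;
    }
}

// Stand-in for a proxy: it simply wraps the original object.
class BeanProxy {
    final RawBean target;

    BeanProxy(RawBean target) {
        this.target = target;
    }
}

// Hypothetical two-level container that creates proxies only [after initialization].
class LateProxyContainer {
    private final Map<String, List<String>> dependencies = new HashMap<>();
    private final Set<String> proxiedNames = new HashSet<>();
    private final Map<String, Object> completeBeans = new HashMap<>();
    private final Map<String, RawBean> earlyBeans = new HashMap<>();

    void register(String beanName, boolean proxied, String... dependsOn) {
        dependencies.put(beanName, Arrays.asList(dependsOn));
        if (proxied) {
            proxiedNames.add(beanName);
        }
    }

    Object getBean(String beanName) {
        Object complete = completeBeans.get(beanName);
        if (complete != null) {
            return complete;
        }
        RawBean early = earlyBeans.get(beanName);
        if (early != null) {
            return early; // always the raw object, never the proxy
        }
        RawBean raw = new RawBean(beanName);
        earlyBeans.put(beanName, raw);
        List<String> names = dependencies.getOrDefault(beanName, Collections.emptyList());
        for (String dependency : names) {
            raw.properties.put(dependency, getBean(dependency));
        }
        // [after initialization]: the proxy is created only now.
        Object exposed = proxiedNames.contains(beanName) ? new BeanProxy(raw) : raw;
        completeBeans.put(beanName, exposed);
        earlyBeans.remove(beanName);
        return exposed;
    }
}
```

If `foo` is registered as proxied and in a cycle with `bar`, `getBean("foo")` returns a `BeanProxy`, while `bar.properties.get("foo")` is still the raw `RawBean`.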

So how should this problem be solved?

Ideas

The reason for the problem is that when the bar object gets the foo object, the foo object obtained from the L2 cache is an ordinary object.

So can we add a check here: determine whether the foo object is going to be proxied and, if so, create a proxy object for foo and return that proxy object, proxy_foo?

Let's assume for now that this solution is feasible, and see whether it causes other problems.

Walking through the flow again, we find a problem: proxy_foo is created twice!

1. In the getBean('foo') process, a proxy_foo is created after the properties are filled

2. When getBean('bar') fills its properties and foo is obtained from the cache, another proxy_foo is created

And the two proxy_foo objects are not the same! Although both reference the same foo object, this is still unacceptable.
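The double creation can be made visible with a sketch. `TargetBean`, `TargetProxy` and `EagerProxyContainer` are hypothetical names, and the proxy is again modeled as a plain wrapper; the only change to the two-level flow is that an early object is wrapped on its way out of the cache.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TargetBean {
    final String name;
    final Map<String, Object> properties = new HashMap<>();

    TargetBean(String name) {
        this.name = name;
    }
}

class TargetProxy {
    final TargetBean target;

    TargetProxy(TargetBean target) {
        this.target = target;
    }
}

// Hypothetical container that proxies early objects when they leave the cache.
class EagerProxyContainer {
    private final Map<String, List<String>> dependencies = new HashMap<>();
    private final Set<String> proxiedNames = new HashSet<>();
    private final Map<String, Object> completeBeans = new HashMap<>();
    private final Map<String, TargetBean> earlyBeans = new HashMap<>();

    void register(String beanName, boolean proxied, String... dependsOn) {
        dependencies.put(beanName, Arrays.asList(dependsOn));
        if (proxied) {
            proxiedNames.add(beanName);
        }
    }

    Object getBean(String beanName) {
        Object complete = completeBeans.get(beanName);
        if (complete != null) {
            return complete;
        }
        TargetBean early = earlyBeans.get(beanName);
        if (early != null) {
            // First creation of proxy_foo: on the way out of the early cache.
            return proxiedNames.contains(beanName) ? new TargetProxy(early) : early;
        }
        TargetBean raw = new TargetBean(beanName);
        earlyBeans.put(beanName, raw);
        List<String> names = dependencies.getOrDefault(beanName, Collections.emptyList());
        for (String dependency : names) {
            raw.properties.put(dependency, getBean(dependency));
        }
        // Second creation of proxy_foo: [after initialization].
        Object exposed = proxiedNames.contains(beanName) ? new TargetProxy(raw) : raw;
        completeBeans.put(beanName, exposed);
        earlyBeans.remove(beanName);
        return exposed;
    }
}
```

Both `getBean("foo")` and `bar.properties.get("foo")` are now proxies around the same target, but they are two different proxy instances.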

How should this problem be solved?

L3 cache

We know that the two proxy_foo objects are different, but how should the program know? Suppose we could add a marker saying that the foo object has already been proxied, so that the program uses the existing proxy instead of creating another one. Would that solve the problem?

This marker cannot be something like flag=true or flag=false, because even if the program knows that foo has been proxied, it still has to get hold of proxy_foo. In other words, we have to find a place to store proxy_foo.

At this time, we need to add another level of cache.

The logic is as follows:

1. When foo is obtained from the cache and foo is proxied, proxy_foo is put into this level of cache.

2. In the getBean('foo') process, when creating a proxy object, first check whether there is a proxy object in the cache, and if so, use the proxy object

You may have a question here: shouldn't we first check whether the third-level cache has an entry, and only then create proxy_foo? Why create it regardless?

Yes, proxy_foo is created here no matter what, and only afterwards do we check the third-level cache. If it has an entry, we use that one, and the proxy_foo just created is discarded.

The reason is this: the proxy object is created by a post-processor during the bean's [after initialization] phase, and post-processors can be implemented by users. In other words, Spring cannot control this part of the logic.

Suppose we implement a post-processor ourselves whose job is not to create a proxy object proxy_foo but to replace foo with dog. If we followed the previous idea (only checking whether the bean is to be proxied), we would run into this problem: getBean('foo') returns dog, but the bar object depends on foo.

But suppose we treat the logic of [creating a proxy object] as just one of many possible post-processor implementations:

1. When fetching foo from the cache, a series of post-processors are called, and then the final result returned by the post-processor is placed in the L3 cache.

2. When getBean('foo'), a series of post-processors are also called, and then the object corresponding to foo is obtained from the L3 cache, and it is used when it is obtained, otherwise the post-processor is used to return the result.

You will find that, whatever you do, the object returned by getBean('foo') and the foo that the bar object depends on are always the same object.
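The two rules can be sketched as follows. This is an illustration of the idea under stated assumptions, not Spring's code: `CoreBean`, `CoreProxy` and `PostProcessingContainer` are hypothetical names, a post-processor is modeled as a `BiFunction` from (bean, beanName) to the object that should be exposed, and locking is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

class CoreBean {
    final String name;
    final Map<String, Object> properties = new HashMap<>();

    CoreBean(String name) {
        this.name = name;
    }
}

class CoreProxy {
    final CoreBean target;

    CoreProxy(CoreBean target) {
        this.target = target;
    }
}

// Hypothetical container with a third cache for early post-processor results.
class PostProcessingContainer {
    private final Map<String, List<String>> dependencies = new HashMap<>();
    private final List<BiFunction<Object, String, Object>> postProcessors = new ArrayList<>();
    private final Map<String, Object> completeBeans = new HashMap<>();
    private final Map<String, Object> earlyBeans = new HashMap<>();   // raw early objects
    private final Map<String, Object> earlyResults = new HashMap<>(); // what dependents received

    void register(String beanName, String... dependsOn) {
        dependencies.put(beanName, Arrays.asList(dependsOn));
    }

    void addPostProcessor(BiFunction<Object, String, Object> postProcessor) {
        postProcessors.add(postProcessor);
    }

    private Object applyPostProcessors(Object bean, String beanName) {
        Object result = bean;
        for (BiFunction<Object, String, Object> postProcessor : postProcessors) {
            result = postProcessor.apply(result, beanName);
        }
        return result;
    }

    Object getBean(String beanName) {
        Object complete = completeBeans.get(beanName);
        if (complete != null) {
            return complete;
        }
        Object earlyResult = earlyResults.get(beanName);
        if (earlyResult != null) {
            return earlyResult; // later early fetches reuse the first result
        }
        Object early = earlyBeans.get(beanName);
        if (early != null) {
            // Rule 1: post-process on the way out and remember the result.
            Object processed = applyPostProcessors(early, beanName);
            earlyResults.put(beanName, processed);
            return processed;
        }
        CoreBean raw = new CoreBean(beanName);
        earlyBeans.put(beanName, raw);
        List<String> names = dependencies.getOrDefault(beanName, Collections.emptyList());
        for (String dependency : names) {
            raw.properties.put(dependency, getBean(dependency));
        }
        // Rule 2: post-processors always run, but an early result wins.
        Object exposed = applyPostProcessors(raw, beanName);
        Object handedOut = earlyResults.remove(beanName);
        if (handedOut != null) {
            exposed = handedOut;
        }
        completeBeans.put(beanName, exposed);
        earlyBeans.remove(beanName);
        return exposed;
    }
}
```

Whether the registered post-processor wraps `foo` in a `CoreProxy` or swaps it for a completely different object, the object stored in `bar` and the one returned by `getBean("foo")` are identical.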

The above is Spring's solution to circular dependencies

My thoughts on the design of this part of Spring

First, let's review Spring's overall design. Spring uses a three-level cache.

1. The first-level cache (singletonObjects) stores complete bean objects

2. The second-level cache (singletonFactories) stores anonymous functions

3. The third-level cache (earlySingletonObjects) stores the objects returned by the anonymous functions in the second-level cache

Yes, Spring wraps the two steps we described, [get foo from the second-level cache, call the post-processors], directly into an anonymous function.

Its structure is as follows:

private final Map<String, ObjectFactory<?>> singletonFactories = new HashMap<>(16);

@FunctionalInterface
public interface ObjectFactory<T> {

    T getObject() throws BeansException;

}

The content of the function is to call a series of post-processors

protected Object getEarlyBeanReference(String beanName, RootBeanDefinition mbd, Object bean) {
  Object exposedObject = bean;
  if (!mbd.isSynthetic() && hasInstantiationAwareBeanPostProcessors()) {
    for (BeanPostProcessor bp : getBeanPostProcessors()) {
      if (bp instanceof SmartInstantiationAwareBeanPostProcessor) {
        SmartInstantiationAwareBeanPostProcessor ibp = (SmartInstantiationAwareBeanPostProcessor) bp;
        exposedObject = ibp.getEarlyBeanReference(exposedObject, beanName);
      }
    }
  }
  return exposedObject;
}

For this part of the design, there has always been some controversy: how many levels of caching can be used in Spring to solve circular dependencies?

Point one

A second-level cache is enough to resolve circular dependencies between ordinary objects, but a third-level cache is required when proxy objects have circular dependencies.

This is also the common view.

This view asks whether bugs appear in a circular dependency when only the second-level cache is used, and concludes that ordinary objects are fine but proxy objects are not.

In other words: when there are several circular dependencies and the object is fetched from the cache several times, is the same object returned each time?

For example, the A object depends on the B object, the B object depends on the A object and the C object, and the C object depends on the A object.

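During getBean('A'), the A object is fetched from the cache once while B's properties are being filled and a second time while C's properties are being filled.

So in this flow, the A object is fetched from the cache twice.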

Now, let's think about it in conjunction with the process of getting an object from the cache.

Logic when there is only L2 cache:

1. Call the anonymous function in the second level cache to get the object

2. Return the object

Suppose the anonymous function returns the original object and no proxy is created (strictly speaking, no post-processor logic runs).

Then the A object returned by each [call to the anonymous function in the second-level cache] is the same.

So we conclude that ordinary objects work fine with only the second-level cache.

Now suppose the anonymous function triggers the proxy-creation logic and returns a proxy object.

Then every [call to the anonymous function in the second-level cache] creates a proxy object.

Each proxy object is a new object, so the A object returned each time is different.

So we conclude that proxy objects run into problems with only the second-level cache.

So why is the L3 cache okay?

The logic of the third-level cache:

1. First try to get the object from the third-level cache: not found

2. Call the anonymous function in the second-level cache to get the object

3. Put the object into the third-level cache

4. Delete the anonymous function from the second-level cache

5. Return the object

Therefore, the anonymous function is called to create a proxy object when it is acquired from the cache for the first time, and each subsequent acquisition is directly retrieved from the third-level cache and returned.

All in all, this view is valid.

But I prefer to put this point of view in a more rigorous way: when the objects returned by the anonymous function are consistent each time, the second-level cache is sufficient; when the objects returned by the anonymous function are inconsistent each time, a third-level cache is required.
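The difference between the two lookups can be shown in isolation. In this sketch, `EarlyReferenceCache` and its two method names are hypothetical, `Supplier` stands in for the anonymous function, and the numbering of the levels follows this article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical holder for the two early-reference levels discussed here.
class EarlyReferenceCache {
    // "Second level" in this article: anonymous functions
    private final Map<String, Supplier<Object>> factories = new HashMap<>();
    // "Third level" in this article: what the anonymous functions returned
    private final Map<String, Object> earlyResults = new HashMap<>();

    void addFactory(String beanName, Supplier<Object> factory) {
        factories.put(beanName, factory);
    }

    // Second level only: every lookup runs the anonymous function again.
    Object getWithFactoriesOnly(String beanName) {
        Supplier<Object> factory = factories.get(beanName);
        return factory == null ? null : factory.get();
    }

    // With the third level: the anonymous function runs at most once.
    Object getWithResultCache(String beanName) {
        Object result = earlyResults.get(beanName);
        if (result == null) {
            Supplier<Object> factory = factories.remove(beanName);
            if (factory != null) {
                result = factory.get();
                earlyResults.put(beanName, result);
            }
        }
        return result;
    }
}
```

With a factory such as `() -> new Object()`, which returns a different object on every call, `getWithFactoriesOnly` yields two different objects for two lookups, while `getWithResultCache` yields the same object both times. With a factory that always returns the same instance, the two methods behave identically.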

Point two

This view is my own: from a design point of view, only the third-level cache can guarantee the extensibility and robustness of the framework.

When we review the conclusion of point one, there is an obvious contradiction: how can Spring know whether the object returned by the anonymous function is the same every time?

The logic in the anonymous function is to call a series of post-processors, and post-processors are customizable.

It means that what is returned by the anonymous function is not controlled by Spring itself.

If we look at this problem with the third-level cache in mind, we find that whether or not the anonymous function returns the same object each time, the third-level cache resolves circular dependencies correctly.

From a design point of view, the third-level cache design covers everything the second-level cache design can do.

So we can conclude that the design with a third-level cache is more extensible and more robust than the design with only a second-level cache.

If you designed the Spring framework according to point one, you would need to add a lot of conditional logic. With point two, you only need to add one more level of cache.

Summary

My original intention for this article was to write down my thoughts on Spring's circular dependencies, but to make them clear I had to describe in detail how Spring's design resolves circular dependencies.

So in the end, my own thoughts take up only a few sentences, because most of them are already written in the chapter [How Spring solves circular dependencies].

Finally, I hope everyone can gain something. If you have any questions, you can ask me, or leave your thoughts in the comment area.

If this article was helpful to you, please like, follow, and share it. Your support is what motivates me to keep writing. Thank you very much!
Personal blog space: https://zijiancode.cn


阿紫
