A painful production incident caused by ThreadLocal: my three-month year-end bonus is gone, and I may be fired.

How it all started

A mouse teasing a cat: looking for trouble.

A few days ago, work was not too busy. To show that I was proactive and technically strong, and to leave a good impression on my leaders, I went looking through the project code for something to optimize.

To my surprise, I actually found something.

And that is where the disaster began!

Some users reported that the order list query interface was a bit slow, so I logged the time spent in each step. It turned out that before querying orders, the interface has to look up user information by user ID, and that lookup calls a service provided by the user team. When the network is slow, a single call can take up to 200 milliseconds.

As the order query interface calls down through its layers, the user information lookup is invoked several times. I could have queried once at the top layer and passed the result down layer by layer, but that would mean changing code in many places, which is a lot of trouble.

I wondered whether I could add a local cache for the user information so that I would not have to call the user service every time. ThreadLocal came to mind: I had heard that senior programmers use ThreadLocal, and I wanted to try it too.

My understanding was that ThreadLocal data is private to a thread: once the call ends, the thread is destroyed, and the data in the ThreadLocal goes with it.

That sounded thread safe, so it should be fine.
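
The thread-private part of that reasoning can be checked with a minimal sketch. The class and method names here are hypothetical, made up for this illustration: one thread stores a value, then a second, brand-new thread reads the same ThreadLocal and gets null, because each thread has its own copy.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ThreadIsolationDemo {

    // Hypothetical holder, used only for this illustration.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Stores a name on one thread, then returns what a second, fresh thread reads. */
    static String valueSeenByAnotherThread(String name) throws InterruptedException {
        // The writer thread sets its own copy and then terminates.
        Thread writer = new Thread(() -> CURRENT_USER.set(name));
        writer.start();
        writer.join();

        // The reader is a different thread, so it starts with an empty copy.
        AtomicReference<String> seen = new AtomicReference<>("not read yet");
        Thread reader = new Thread(() -> seen.set(CURRENT_USER.get()));
        reader.start();
        reader.join();
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(valueSeenByAnotherThread("alice"));
    }
}
```

This only holds as long as every piece of work really does run on its own new thread, which is exactly the assumption being made at this point in the story.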

Hands-on

First I wrote a ThreadLocal utility class to store and fetch the user information:

/**
 * @author 一灯
 * @apiNote Caches user information locally
 **/
public class ThreadLocalUtil {

    // Store the user information in a ThreadLocal
    private static final ThreadLocal<User> threadLocal = new ThreadLocal<>();

    /**
     * Get the user information
     */
    public static User getUser() {
        // If the ThreadLocal holds no user information, parse it from the request and store it
        if (threadLocal.get() == null) {
            threadLocal.set(UserUtil.parseUserFromRequest());
        }
        return threadLocal.get();
    }

}

Then, in the order query interface, I call this utility method to get the user information and use it to query the orders. Perfect.

/**
 * Get the order list
 */
public List<Order> getOrderList() {
    // 1. Get the user information from the ThreadLocal cache
    User user = ThreadLocalUtil.getUser();
    // 2. Call the order service to get the order list for this user
    return orderService.getOrderList(user);
}

Self-testing, QA, acceptance, release: the interface got faster with a whoosh, and everything looked perfect.

I was already fantasizing about a promotion and a raise, marrying a rich and beautiful wife, and reaching the pinnacle of my life.

Things backfire

An hour after the release, the on-call group chat blew up.

One after another, users reported that orders they had just placed were missing, while other users reported extra orders inexplicably showing up in their order lists.

I was stunned. I had never run into anything like this, and as more and more users complained, I was overwhelmed.

My leader made the call on the spot: "Xiao Deng, what on earth are you doing? Roll the service back, now."

Half an hour later the rollback was complete, and the users gradually calmed down.

Troubleshooting

Once the production incident was contained, I immediately started investigating the cause.

After countless rounds of logging and debugging, I finally located the problem.

ThreadLocal is indeed thread private, and the data in ThreadLocal will be cleaned up after the thread is destroyed.

The problem is that server-side frameworks and containers such as Tomcat, Jetty, Spring Boot, and Dubbo do not create a new thread for each request. They create a thread pool, and all requests share the threads in that pool.

A thread is not destroyed after it finishes a request. It goes back to the pool and later handles requests from other users. Because getUser() only parses the request when the ThreadLocal is empty, a reused thread kept returning the user information left behind by an earlier request, so users ended up seeing other users' orders.
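
The reuse described above can be reproduced in a few lines. This is a minimal sketch, not the production code: the class name, the String stand-in for the User object, and the one-thread pool standing in for the server's request pool are all assumptions made for illustration. The getUser method mirrors the null check from the utility class.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PooledThreadLeakDemo {

    // Hypothetical stand-in for the cached user; a String keeps the sketch self-contained.
    private static final ThreadLocal<String> CACHED_USER = new ThreadLocal<>();

    /** Mirrors the article's getUser(): only reads the "request" when nothing is cached. */
    static String getUser(String userFromRequest) {
        if (CACHED_USER.get() == null) {
            CACHED_USER.set(userFromRequest);
        }
        return CACHED_USER.get();
    }

    /** Handles two requests on a one-thread pool and returns the user the second one sees. */
    static String userSeenBySecondRequest(String first, String second) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First request: the worker thread caches its user and is never cleaned up.
            pool.submit(() -> getUser(first)).get();
            // Second request: same worker thread, so the cached value is still there.
            return pool.submit(() -> getUser(second)).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // The second request belongs to "bob", yet it is served with "alice".
        System.out.println(userSeenBySecondRequest("alice", "bob"));
    }
}
```

With a larger pool the effect is intermittent rather than constant, since it depends on which worker picks up each request, which fits reports trickling in gradually after release.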

Solution

The fix is to call the remove method to clear the ThreadLocal data once you are done with it.

/**
 * @author 一灯
 * @apiNote Caches user information locally
 **/
public class ThreadLocalUtil {

    // Store the user information in a ThreadLocal
    private static final ThreadLocal<User> threadLocal = new ThreadLocal<>();

    /**
     * Get the user information
     */
    public static User getUser() {
        // If the ThreadLocal holds no user information, parse it from the request and store it
        if (threadLocal.get() == null) {
            threadLocal.set(UserUtil.parseUserFromRequest());
        }
        return threadLocal.get();
    }

    /**
     * Remove the user information
     */
    public static void removeUser() {
        threadLocal.remove();
    }

}

Wrap the business code in try/catch, and clear the ThreadLocal data in the finally block so that the cleanup runs whether or not the call succeeds.

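/**
 * Get the order list
 */
public List<Order> getOrderList() {
    // 1. Get the user information from the ThreadLocal cache
    User user = ThreadLocalUtil.getUser();
    // 2. Call the order service to get the order list for this user
    try {
        return orderService.getOrderList(user);
    } catch (Exception e) {
        // Keep the original exception as the cause so the stack trace is not lost
        throw new RuntimeException(e.getMessage(), e);
    } finally {
        // 3. Remove the user information once the ThreadLocal is no longer needed
        ThreadLocalUtil.removeUser();
    }
}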
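
One possible variation, sketched under the assumption that the cleanup can live in a single wrapper at the request entry point instead of in every business method. The names withUserCleanup and PooledThreadCleanupDemo are hypothetical, and a one-thread pool again stands in for the server's request pool. With the remove call in finally, the second request on the reused thread gets its own user.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

final class PooledThreadCleanupDemo {

    // Hypothetical stand-in for the cached user; a String keeps the sketch self-contained.
    private static final ThreadLocal<String> CACHED_USER = new ThreadLocal<>();

    /** Same lazy lookup as in the utility class: cache on first use within a request. */
    static String getUser(String userFromRequest) {
        if (CACHED_USER.get() == null) {
            CACHED_USER.set(userFromRequest);
        }
        return CACHED_USER.get();
    }

    /** Runs one request body and always clears the ThreadLocal before the thread is reused. */
    static <T> T withUserCleanup(Supplier<T> requestBody) {
        try {
            return requestBody.get();
        } finally {
            CACHED_USER.remove();
        }
    }

    /** Handles two requests on a one-thread pool and returns the user the second one sees. */
    static String userSeenBySecondRequest(String first, String second) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> firstRequest = () -> withUserCleanup(() -> getUser(first));
            Callable<String> secondRequest = () -> withUserCleanup(() -> getUser(second));
            pool.submit(firstRequest).get();
            return pool.submit(secondRequest).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // The worker thread is reused, but the slot was cleared, so this prints "bob".
        System.out.println(userSeenBySecondRequest("alice", "bob"));
    }
}
```

Within a single request the value is still cached across repeated getUser calls, so the original goal of avoiding duplicate lookups is kept.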

Incident rating

If more than 100,000 users are affected, more than 100,000 records are wrong, or the financial loss exceeds 1,000,000, the incident is rated P1 and the annual performance rating is C.

All I wanted was to optimize performance, speed up the interface, and give my leaders a good impression of my technical ability and proactive attitude.

Well, great. Not only is the year-end bonus gone, I may not even keep my job.

Sleeping with my butt uncovered: I have really made a name for myself this time!

Incident summary

After this incident, I drew the following lessons:

  1. If nothing is broken, don't mess with it.
  2. If you don't have a diamond drill, don't take on porcelain work.
  3. Don't aim for merit; aim to avoid mistakes.
  4. Dengzi, the waters of refactoring and optimization run too deep for you to handle.

This series is updated regularly. Search for "一灯架构" (One Light Architecture) on WeChat to read more technical articles as soon as they come out.
