
Hello, I'm why.

A few days ago I saw a question on SegmentFault (思否) about how synchronized is used, and I found it very interesting. It is actually a real question I was asked three years ago when interviewing at a company. At the time I didn't know what the interviewer was trying to test, so my answer wasn't particularly good. I researched it afterwards and have remembered it ever since.

So when I saw this question it felt very familiar, and I'd like to share it with you:

First, to make it easy for you to reproduce the problem as you read, here is code that can be run directly. I hope you'll take it and run it yourself if you have time:

import java.util.concurrent.TimeUnit;

public class SynchronizedTest {

    public static void main(String[] args) {
        // Two threads, "why" and "mx", compete for the same 10 tickets.
        Thread why = new Thread(new TicketConsumer(10), "why");
        Thread mx = new Thread(new TicketConsumer(10), "mx");
        why.start();
        mx.start();
    }
}

class TicketConsumer implements Runnable {

    // Shared by all instances. Note that this is a boxed Integer, not an int.
    private volatile static Integer ticket;

    public TicketConsumer(int ticket) {
        TicketConsumer.ticket = ticket;
    }

    @Override
    public void run() {
        while (true) {
            // Prints: "<thread> starts grabbing ticket <n>, object before locking: <hash>"
            System.out.println(Thread.currentThread().getName() + "开始抢第" + ticket + "张票,对象加锁之前:" + System.identityHashCode(ticket));
            synchronized (ticket) {
                // Prints: "<thread> grabs ticket <n>, object successfully locked: <hash>"
                System.out.println(Thread.currentThread().getName() + "抢到第" + ticket + "张票,成功锁到的对象:" + System.identityHashCode(ticket));
                if (ticket > 0) {
                    try {
                        // Simulate the delay of grabbing a ticket
                        TimeUnit.SECONDS.sleep(1);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    // Prints: "<thread> grabbed ticket <n>, ticket count minus one"
                    System.out.println(Thread.currentThread().getName() + "抢到了第" + ticket-- + "张票,票数减一");
                } else {
                    return;
                }
            }
        }
    }
}

The program logic is very simple. It simulates grabbing tickets: there are 10 tickets in total, and two threads are started to grab them.

The tickets are a shared resource consumed by two threads, so to ensure thread safety, TicketConsumer uses the synchronized keyword.

This is the kind of example everyone writes when first learning synchronized. The expected result: 10 tickets, two people grabbing, and each ticket grabbed by exactly one person.

But the actual output looks like this (I've only captured the beginning of the log):

There are three boxed parts in the screenshot.

In the top part, two people are competing for the 10th ticket. Judging from the log output there is no problem at all: in the end only one of them grabs the ticket, and then the competition for the 9th ticket begins.

But the competition for the 9th ticket, boxed below, is a bit confusing:

why抢到第9张票,成功锁到的对象:288246497
mx抢到第9张票,成功锁到的对象:288246497

(In English: both why and mx print "grabs ticket 9, object successfully locked: 288246497".)

Why did both of them grab the 9th ticket, and why is the object they successfully locked the same?

This is baffling.

How can two threads hold the same lock and both execute the business logic?

So the asker's questions emerge:

  • 1. Why does synchronized not take effect?
  • 2. Why does System.identityHashCode print the same value for the lock object?

Why didn't it work?

Let's take the first question first.

First, the log output tells us clearly that synchronized failed in the second round, when the 9th ticket was being grabbed.

Theory tells us that if synchronized fails, there must be a problem with the lock.

If there is only one lock and multiple threads compete for it, synchronized works without any problem.

However, the two threads here are not mutually exclusive, which means there must be more than one lock.

This is a conclusion we can deduce from theory.

With the conclusion in hand, how can I prove that "there is more than one lock"?

A thread that gets inside synchronized must have obtained a lock, so I only need to look at which lock each thread holds.

So how do you see which lock a thread holds?

The jstack command, which prints thread stacks. Got it?

This information is in the thread stack. We just need to pull it out and look.
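If you would rather pull the same information out from inside the program than from jstack, here is a minimal sketch using the standard java.lang.management API. The class and method names (HeldLockInspector, firstHeldMonitorHash, holderReportsSameLock) are made up for illustration. Note that it reports identity hash codes rather than the addresses jstack prints, and it assumes a JVM that supports reporting locked monitors, as HotSpot does.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;

class HeldLockInspector {

    /** Identity hash code of the first monitor held by the named thread, or -1 if it holds none. */
    static int firstHeldMonitorHash(String threadName) {
        // dumpAllThreads(true, false): include locked monitors, skip ownable synchronizers.
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(true, false);
        for (ThreadInfo info : infos) {
            if (info.getThreadName().equals(threadName)) {
                MonitorInfo[] held = info.getLockedMonitors();
                return held.length > 0 ? held[0].getIdentityHashCode() : -1;
            }
        }
        return -1;
    }

    /** Starts a thread that holds the given lock and checks that the dump reports that same object. */
    static boolean holderReportsSameLock(Object lock) {
        CountDownLatch locked = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                locked.countDown();
                awaitQuietly(release); // keep holding the lock until the dump has been taken
            }
        }, "holder");
        holder.start();
        awaitQuietly(locked);
        int reported = firstHeldMonitorHash("holder");
        release.countDown();
        try {
            holder.join(); // make sure no stale "holder" thread is left behind
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return reported == System.identityHashCode(lock);
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(holderReportsSameLock(new Object()));
    }
}
```

Calling firstHeldMonitorHash("why") and firstHeldMonitorHash("mx") while the ticket program is sleeping would be one way to compare the two locks without leaving the JVM.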

How do you get the thread stack in IDEA?

This is a small debugging trick in IDEA that has appeared many times in my previous articles.

First, to make it easier to capture the thread stack, I changed the sleep time here to 10s:

After running the program, click the "camera" icon here:

Each click produces one dump, taken at the moment of the click:

Since I need to observe the first two lock acquisitions, and a thread sleeps for 10s each time it enters the lock, I click once during the first 10s after startup and once during the second 10s.

To view the data more conveniently, I click the following icon to copy the dump:

A lot of information gets copied, but we only need to care about the two threads why and mx.

Here is the relevant information from the first dump:

The mx thread is in the BLOCKED state, waiting for the lock at address 0x000000076c07b058.

The why thread is in the TIMED_WAITING state. It is sleeping, which means it has grabbed the lock and is executing the business logic. And the lock it holds is, as it happens, the very 0x000000076c07b058 that the mx thread is waiting for.

Judging from the log output, the first ticket was indeed grabbed by the why thread:

The dump shows the two threads competing for the same lock, so the first round is fine.

OK, let's look at the second dump:

This time both threads are in TIMED_WAITING and both are sleeping, which means both have obtained a lock and entered the business logic.

But a closer look shows that the two threads are not holding the same lock.

The lock held by mx is 0x000000076c07b058.

The lock held by why is 0x000000076c07b048.

Since they are not the same lock, there is no competition between the threads, so both can enter synchronized and execute the business logic. That both threads are sleeping is exactly what you would expect.

Next, I'll put the two dumps side by side so the comparison is more intuitive:

Let's call 0x000000076c07b058 "lock one" and 0x000000076c07b048 "lock two".

Then the process goes like this:

why acquires lock one and executes the business logic, while mx blocks waiting for lock one.

why releases lock one. mx, which was waiting for lock one, wakes up, acquires it, and goes on to execute the business logic.

At the same time, why acquires lock two and executes the business logic.

The thread stacks do prove that synchronized did not take effect because the lock changed.

The thread stacks also explain why System.identityHashCode prints the same value for the lock object.

At the time of the first dump, ticket is 10 for both threads. mx failed to grab the lock and is blocked by synchronized.

The why thread executes ticket--, and ticket becomes 9. But the monitor that the mx thread is blocked on still belongs to the ticket=10 object. mx is still waiting in that monitor's _EntryList and is not affected by the change to ticket.

Therefore, when the why thread releases the lock, the mx thread acquires it, continues executing, and finds that ticket=9.

Meanwhile why has obtained a new lock, so it can also enter the synchronized block, where it too finds that ticket=9.

Well, since ticket is 9 for both of them, how could System.identityHashCode be different?

By rights, after releasing lock one, why should go on competing with mx for lock one. Instead it picked up a new lock from somewhere.

So the question becomes: why did the lock change?

Who moved my lock?

The analysis above confirms that the lock really did change. At this point you are furious, slap the table, and shout: "Which fool moved my lock? Are you kidding me?"

In my experience, don't get worked up just yet. Keep reading and you will find that the clown is actually you:

After grabbing a ticket, the code executes ticket--. And isn't ticket the very object you are locking on?
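To see this in isolation, here is a small sketch with no threads involved. The class name LockReferenceDemo and its method are invented for this illustration. It only shows that after a decrement, an Integer variable refers to a different object than before:

```java
class LockReferenceDemo {

    /** Returns true if the variable still points at the same object after a decrement. */
    static boolean sameObjectAfterDecrement(int start) {
        Integer ticket = start;   // autoboxing into an Integer object
        Integer before = ticket;  // the object a thread would currently be locking on
        ticket--;                 // unbox, subtract 1, box the result into another Integer
        return before == ticket;  // reference comparison, not equals()
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAfterDecrement(10)); // false
    }
}
```

An Integer is immutable, so ticket-- cannot change the object in place. It can only point the variable at another object, and synchronized (ticket) then locks whatever the variable points to at that moment.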

At this point you slap your thigh, the penny drops, and you tell the onlookers, "No big deal, my hand just slipped."

So, with a wave of the hand, you change the lock to this:

synchronized (TicketConsumer.class)

Using the class object as the lock object ensures the uniqueness of the lock.
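For reference, a compact sketch of what the fixed consumer might look like. SafeTicketConsumer, SOLD, and sell are names I made up here. I dropped the sleep and the logging and record the sold tickets in a list instead, so that the result can be checked:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SafeTicketConsumer implements Runnable {

    private static int ticket;
    // Records every ticket number handed out so the result can be checked afterwards.
    private static final List<Integer> SOLD = Collections.synchronizedList(new ArrayList<>());

    @Override
    public void run() {
        while (true) {
            // The class object never changes, so both threads always compete for one lock.
            synchronized (SafeTicketConsumer.class) {
                if (ticket <= 0) {
                    return;
                }
                SOLD.add(ticket--);
            }
        }
    }

    /** Sells the given number of tickets with two threads and returns the ticket numbers sold. */
    static List<Integer> sell(int total) {
        synchronized (SafeTicketConsumer.class) {
            ticket = total;
            SOLD.clear();
        }
        Thread why = new Thread(new SafeTicketConsumer(), "why");
        Thread mx = new Thread(new SafeTicketConsumer(), "mx");
        why.start();
        mx.start();
        try {
            why.join();
            mx.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(SOLD);
    }

    public static void main(String[] args) {
        System.out.println(sell(10).size()); // 10: each ticket is sold exactly once
    }
}
```

Since the lock no longer has to be the counter, ticket can be a plain int here. A private static final Object would serve equally well as the lock.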

I've verified it and it works without any problem. Perfect, job done.

But is it really done?

In fact, there is one more small detail about why the lock object changed that I haven't mentioned yet.

It's hidden in the bytecode.

We can inspect the bytecode with the javap command, and we see this:

What is this Integer.valueOf?

It's the familiar Integer cache covering -128 to 127.

In other words, our program involves unboxing and boxing, and the boxing step calls Integer.valueOf. Specifically, this happens in the ticket-- operation.

For an Integer, the same object is returned whenever the value is within the cache range. Outside the cache range, a new object is created on every call.
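A quick sketch to check the cache behavior yourself. IntegerCacheDemo and boxesToSameObject are illustrative names. One caveat: only -128 to 127 is guaranteed to be cached. The upper bound can be raised with JVM options, so I deliberately don't promise an output for 200.

```java
class IntegerCacheDemo {

    /** True if boxing the same int value twice yields the same object. */
    static boolean boxesToSameObject(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // reference comparison
    }

    public static void main(String[] args) {
        // Values from -128 to 127 are always served from the cache.
        System.out.println(boxesToSameObject(10)); // true
        // Outside that range the answer depends on the JVM's cache upper bound.
        System.out.println(boxesToSameObject(200));
        // Cached values with different numbers are still different objects.
        System.out.println(Integer.valueOf(10) == Integer.valueOf(9)); // false
    }
}
```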

This is standard interview-memorization material (what Chinese developers call "baguwen"). So what am I trying to say by emphasizing it here?

It's very simple. Change the code a little and you will see.

I changed the initial number of tickets from 10 to 200, which is outside the cache range. The program's output is as follows:

Clearly, from the very first log line, the two threads are not using the same lock.

This is what I described above: because the value is outside the cache range, new Integer(200) is executed twice. That gives two different objects, and when they are used as locks they are two different locks. (Note that for this run the static modifier has been removed from ticket.)

Change it back to 10, run it again, and you will get a feel for it:

From the log output there is only one lock this time, so only one thread grabs the ticket.

Because 10 is within the cache range, the same object is fetched from the cache every time.

The point of this short passage is this: everyone knows that Integer has a cache. But when that fact is mixed in with other things, you have to analyze the problems the cache causes, and that teaches you more than memorizing dry facts.

but...

Our initial ticket is 10. After ticket--, it becomes 9, which is also within the cache range. So why does the lock change?

If you have this question, I urge you to think again.

10 is 10 and 9 is 9.

Although both are within the cache range, they are two different objects to begin with, each created with new when the cache is built:

Why am I adding this silly-looking remark?

Because some of the articles I've seen online about similar questions are not written clearly, and they leave readers with the mistaken impression that "all values in the cache range are the same object", which misleads beginners.

In a word: please don't use an Integer as a lock object. You can't keep it under control.

but...

Stack Overflow

However, while writing this article I came across a similar question on Stack Overflow.

This person's problem is that he knows an Integer shouldn't be used as a lock object, but his requirements seem to force him to use one.

https://stackoverflow.com/questions/659915/synchronizing-on-an-integer-value

Let me describe his problem.

First, look at the place labeled ①. His program first tries to get the data from the cache. If it isn't cached, the program loads it from the database and then puts it in the cache.

Very simple and clear logic.

But he is thinking about concurrency. If multiple threads request the same id at the same time and the data for that id is not in the cache, every one of those threads will query the database and update the cache.

He describes this query-and-store action as "fairly expensive".

To put it bluntly, the action is "heavy", and it is best not to repeat it.

So it's enough to let a single thread perform the fairly expensive operation.

That leads him to the code at the place labeled ②.

He uses synchronized to lock on the id. Unfortunately, the id is an Integer.

At the place labeled ③ he says it himself: different Integer objects do not share a lock, so synchronized is useless.

Strictly speaking, his statement is not rigorous. From the analysis above we know that Integers with the same value inside the cache range do share one lock. "Sharing" here means competing for it.

But obviously, the range of his ids is certainly larger than the Integer cache range.

So the question is: what can be done about it?

The first thing that came to mind when I saw this question was: I seem to implement this kind of requirement all the time, so how do I do it?

After thinking for a few seconds it dawned on me: oh, everything is a distributed application nowadays, and I just use Redis for the lock.

I had never thought about this problem at all.

Suppose Redis is not available and this is a single-instance application. How would you solve it then?

Before looking at the top-voted answer, let's look at a comment below the question:

It starts with three letters: FYI.

It doesn't matter if you don't know what that means, because it's not the point.

But as you know, my English is excellent, so I'll teach you a little English along the way.

FYI is a common English abbreviation. It stands for "for your information".

So you know he must have attached some reference material. The comment says: Brian Goetz mentioned in his Devoxx 2018 talk that we should not use an Integer as a lock.

This link takes you straight to that part of the talk. It takes less than 30 seconds, so you can practice your listening: https://www.youtube.com/watch?v=4r2Wg-TY7gU&t=3289s

So here comes another question.

Who is Brian Goetz, and why does his word carry authority?

He is the Java Language Architect at Oracle, one of the people who develop the Java language itself. Impressed yet?

He is also the author of "Java Concurrency in Practice", a book I have recommended many times.

Now that we have the expert's endorsement, let me show you what the top-voted answer says.

I won't go into the first part. It covers the points we discussed earlier: don't use Integer, because of values inside and outside the cache range...

Pay attention to the underlined part. I'll translate it for you, adding my own understanding:

If you really must use an Integer as a lock, you need a Map or Set of Integers. By mapping through the collection, you can guarantee that what you get back is exactly the one instance you want, and that instance can be used as the lock.

Then he gives this code snippet:

It uses a ConcurrentHashMap and its putIfAbsent method to do the mapping.

For example, no matter how many times locks.putIfAbsent(200, 200) is called, the map contains only one Integer object with the value 200. The map's own semantics guarantee this, so it needs little explanation.

But this guy is very thorough. In case someone can't get their head around it, he explains it once more.

First, he says you could also write it like this:

But that comes with a small cost: an Object gets created on each access.

To avoid this, he simply stores the Integer itself in the Map. What does that achieve? How is it different from using the Integer itself directly?

He explains it like this, which is really what I meant above by "the map's own semantics guarantee this":

When you call get() on a Map, the keys are compared with the equals() method.

Two different Integer instances with the same value are judged equal by equals().

Therefore you can pass any number of different Integer instances of "new Integer(5)" as arguments to getCacheSyncObject, and you will always get back only the first instance passed in with that value.

That's the idea:

To sum up in one sentence: the Map does the mapping. No matter how many Integers you create with new, they are all mapped to the same Integer, which guarantees that there is only one lock even outside the Integer cache range.
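Here is a minimal sketch of the mapping idea as I understand it. It is my own reconstruction, not the answer's exact code. The method name getCacheSyncObject comes from the answer, while the class name IdLocks and the field LOCKS are made up. It also assumes the set of ids is small enough that a map which never shrinks is acceptable.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class IdLocks {

    // Maps each id value to the first Integer instance that was seen for it.
    private static final ConcurrentMap<Integer, Integer> LOCKS = new ConcurrentHashMap<>();

    /** Returns the one canonical Integer instance to lock on for this id value. */
    static Object getCacheSyncObject(Integer id) {
        // putIfAbsent is atomic: it returns the existing instance, or null if ours was stored.
        Integer existing = LOCKS.putIfAbsent(id, id);
        return existing != null ? existing : id;
    }

    public static void main(String[] args) {
        Object first = getCacheSyncObject(1000);
        Object second = getCacheSyncObject(1000);
        System.out.println(first == second); // true: equal values map to one lock object
    }
}
```

Usage would then look like synchronized (IdLocks.getCacheSyncObject(id)) { ... } around the cache lookup and database load.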

Besides the top-voted answer, there are two other answers I want to mention.

The first is this one:

Never mind what he actually says. I was shocked when I saw the literal translation of this phrase:

skin this cat ???

So cruel.

I figured the literal translation couldn't be right and that it must be slang. So I looked it up, and it turned out to be this:

A little free English lesson for you. You're welcome.

The second answer worth noting is the last one:

This guy tells you to read Section 5.6 of "Java Concurrency in Practice", which has the answer you are looking for.

As it happens I had the book at hand, so I opened it and took a look.

Section 5.6 is titled "Building an efficient, scalable result cache":

Wow. I read the section closely and found it to be a real treasure.

Look at the sample code in the book:


Isn't it exactly the same as the code from the person who asked the question?

Both fetch from the cache and, if the value isn't there, build it.

The difference is that the book puts synchronized on the method. But the book also says this is the worst solution and is only there to introduce the problem.

It then builds up to a relatively good solution using ConcurrentHashMap, putIfAbsent, and FutureTask.

You can see that the book attacks the problem from a different angle. It doesn't get tangled up in synchronized at all. From the second version onward, synchronized is simply removed.
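To give a feel for that approach, here is a rough sketch in the same spirit. It is not the book's listing. ResultCache and loader are names I chose, and the loader function stands in for the expensive database query. The idea is that the first thread to ask for a key registers a FutureTask, and every later thread waits on that same task instead of repeating the work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

class ResultCache<K, V> {

    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // the expensive computation, e.g. a database query

    ResultCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        Future<V> future = cache.get(key);
        if (future == null) {
            Callable<V> compute = () -> loader.apply(key);
            FutureTask<V> task = new FutureTask<>(compute);
            // Atomic check-then-act: only one task per key ever gets registered.
            future = cache.putIfAbsent(key, task);
            if (future == null) {
                future = task;
                task.run(); // this thread won the race, so it does the work
            }
        }
        try {
            return future.get(); // the other threads block here until the result is ready
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            cache.remove(key, future); // don't keep a failed computation in the cache
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

With something like new ResultCache<Integer, String>(id -> "row-" + id), calling get(200) repeatedly runs the loader once and afterwards returns the cached result.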

After reading the solution in the book, it suddenly hit me: the solution given in the answer above does solve the problem, but it always felt a bit off and I couldn't say why. It turns out I had been fixated on synchronized, and my thinking was boxed in from the start.

The book gives four pieces of code in total, each improving on the previous one. Since the book explains them very clearly, I won't go into detail. Just flip through the book.

If you don't have the book, you can find the original text by searching online for "Building an efficient, scalable result cache".

I've shown you the way. Go for it.

This article is also published on my personal blog. You're welcome to drop by:

https://www.whywhy.vip/

why技术