Last week, a thread on the Linux kernel mailing list about whether the kernel should adopt a modern C language standard caught the industry's attention. The Linux open source community has now announced that the kernel's C language version will be upgraded to C11, and the change is expected to take effect with version 5.18 in May this year.
This sudden decision finally brings an upgrade to the C dialect that the Linux kernel has used for 30 years.
As we all know, convincing Linus Torvalds, the famously stubborn father of Linux, is no easy task. So why did he finally relent this time? It came about somewhat by coincidence.
The story goes back to last week's discussion in the Linux community.
A "chain reaction" caused by a bug
A doctoral student named Jakob Koschel was researching speculative execution vulnerabilities related to the kernel's linked-list primitives when he discovered a problem. The Linux kernel makes extensive use of doubly linked lists defined by struct list_head:
struct list_head {
    struct list_head *next, *prev;
};
Developers can build a linked list of any struct type by embedding a struct list_head inside that type. The kernel also provides a large number of functions and macros for traversing and manipulating such lists. One of them is list_for_each_entry(), a macro disguised as a control structure.
As it happens, the problem lies with this macro.
Suppose the kernel contains the following structure:
struct foo {
    int fooness;
    struct list_head list;
};
The list member can then be used to create a doubly linked list of foo structures.
Assuming that a struct list_head named foo_list has been declared as the head of such a list, the following code traverses it:
struct foo *iterator;
list_for_each_entry(iterator, &foo_list, list) {
    do_something_with(iterator);
}
/* Should not use iterator here */
The list parameter tells the macro the name of the list_head member inside the foo structure. The loop body executes once for each element in the list, with iterator pointing to that element.
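To make the mechanics concrete, here is a minimal user-space sketch of how such a traversal macro can work. It is an illustration under stated assumptions, not the kernel's implementation: the names list_for_each_entry_sketch, list_init_sketch, list_add_tail_sketch, and sum_fooness are hypothetical, and the macro takes the entry type as an explicit argument so that it does not need the GNU typeof extension used by the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

struct foo {
    int fooness;
    struct list_head list;
};

/* Recover a pointer to the enclosing struct from a pointer to its member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Simplified traversal macro (hypothetical name). As in C89-style code,
 * pos must be declared by the caller before the loop.
 */
#define list_for_each_entry_sketch(pos, type, head, member)          \
    for ((pos) = container_of((head)->next, type, member);           \
         &(pos)->member != (head);                                   \
         (pos) = container_of((pos)->member.next, type, member))

/* An empty list is a head that points to itself in both directions. */
static void list_init_sketch(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link node in just before head, i.e. at the tail of the list. */
static void list_add_tail_sketch(struct list_head *node,
                                 struct list_head *head)
{
    assert(node != NULL && head != NULL);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk the list and add up the fooness of every entry. */
static int sum_fooness(struct list_head *foo_list)
{
    struct foo *iterator;
    int sum = 0;

    list_for_each_entry_sketch(iterator, struct foo, foo_list, list) {
        sum += iterator->fooness;
    }
    return sum;
}
```

The loop stops when the embedded list member of the current entry is the head itself, which is why the head needs no enclosing foo structure.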
This is what led to a bug in the USB subsystem: after the loop exits, the iterator passed to the macro is still in scope and can still be used. That is dangerous, because if the loop ran to completion the iterator no longer points to a valid list entry.
So Koschel submitted a patch that fixed the bug by rewriting the offending code so that it no longer used the iterator after the loop ended. He then sent Linus Torvalds a patch series, the Speculative Safe List Iterator proposal, aimed at the speculative execution vulnerability related to kernel linked lists.
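The difference between the two patterns can be sketched in user space as follows. This is an assumption-laden illustration rather than the actual USB code or Koschel's patch: find_foo_buggy, find_foo_safe, and list_for_each_entry_sketch are hypothetical names, and the macro takes the entry type explicitly to avoid the typeof extension.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

struct foo {
    int fooness;
    struct list_head list;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified traversal macro (hypothetical name); pos is declared outside. */
#define list_for_each_entry_sketch(pos, type, head, member)          \
    for ((pos) = container_of((head)->next, type, member);           \
         &(pos)->member != (head);                                   \
         (pos) = container_of((pos)->member.next, type, member))

/* Problematic pattern: the iterator is used after the loop. */
static struct foo *find_foo_buggy(struct list_head *foo_list, int wanted)
{
    struct foo *iterator;

    assert(foo_list != NULL);
    list_for_each_entry_sketch(iterator, struct foo, foo_list, list) {
        if (iterator->fooness == wanted)
            break;
    }
    /*
     * If nothing matched, iterator was computed from the list head, which
     * is not embedded in a struct foo. The result is a non-NULL pointer
     * that does not refer to a real entry.
     */
    return iterator;
}

/* Safer pattern: only a pointer set inside the loop escapes it. */
static struct foo *find_foo_safe(struct list_head *foo_list, int wanted)
{
    struct foo *iterator;
    struct foo *found = NULL;

    assert(foo_list != NULL);
    list_for_each_entry_sketch(iterator, struct foo, foo_list, list) {
        if (iterator->fooness == wanted) {
            found = iterator;
            break;
        }
    }
    return found;
}
```

A caller of find_foo_buggy cannot detect the "not found" case with a NULL check, whereas find_foo_safe returns NULL whenever no entry matches.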
The father of Linux is finally persuaded
Initially, Linus Torvalds did not seem to like the patch very much and did not see what it had to do with speculative execution vulnerabilities. After Koschel explained it in detail, Linus concluded that it was just an ordinary bug.
However, things were not that simple, and Linus quickly identified the real problem: the iterator passed to the list traversal macro has to be declared in a scope outside the loop itself.
The root cause of this class of bug is that C89 does not allow a variable to be declared inside the loop statement.
While the Linux kernel is developing rapidly, it still relies on some very old tools. One example is that kernel code is still written against the 1989 version of the C language standard, a standard that was written before the kernel project even started more than 30 years ago.
Macros like list_for_each_entry() basically always leak the last HEAD entry out of the loop, just because iterator variables cannot be declared in the loop itself.
If the list traversal macro could declare the iterator itself, the iterator would not be visible outside the loop and this kind of problem could not arise.
However, because the kernel is stuck on the C89 standard, variables cannot be declared in the loop statement.
So Linus decided, "Let's upgrade." Perhaps it was time to move to the C99 standard. Although C99 is itself more than 20 years old, it is at least newer than C89 and allows variables to be declared in loops.
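As a rough sketch of that idea, a C99-style macro can declare the iterator in the for statement so that it goes out of scope when the loop ends. The macro name list_for_each_entry_scoped, its extra type argument, and the helper sum_fooness_scoped are assumptions made for this illustration; this is not a macro taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

struct foo {
    int fooness;
    struct list_head list;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical C99 variant: pos is declared by the macro inside the for
 * statement, so it exists only for the duration of the loop.
 */
#define list_for_each_entry_scoped(type, pos, head, member)          \
    for (type *pos = container_of((head)->next, type, member);       \
         &pos->member != (head);                                     \
         pos = container_of(pos->member.next, type, member))

static int sum_fooness_scoped(struct list_head *foo_list)
{
    int sum = 0;

    assert(foo_list != NULL);
    list_for_each_entry_scoped(struct foo, iterator, foo_list, list) {
        sum += iterator->fooness;
    }
    /* iterator is out of scope here; referring to it would not compile. */
    return sum;
}
```

With this shape, accidental use of the iterator after the loop becomes a compile-time error instead of a latent runtime bug. The trade-off in this sketch is a changed macro signature, which every existing caller would have to adopt.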
Since C89 is so outdated, why has the kernel not moved on over the years? Linus explained, "This is because we had some weird issues with some old gcc compiler versions that couldn't be upgraded at will."
However, now that the Linux kernel has raised the minimum requirements for gcc to version 5.1, those weird bugs of the past should be gone.
Another core developer, Arnd Bergmann, also weighed in. He thinks it is possible to upgrade to C11 or even higher, but upgrading to C17 or C2x would break support for gcc-5/6/7, so upgrading to C11 is easier to achieve.
Eventually, Linus Torvalds backed the idea, announcing that he would "try it out early in the 5.18 merge window".
While the move to C11 may expose some unexpected bugs, if all goes well, the next Linux kernel release will officially move to C11. What are your thoughts on this upgrade? You are welcome to share them in the comments below.