Hello everyone, this is Liang Xu.
If you studied computer science, you almost certainly learned C in school. It is the ancestor of many high-level languages, and studying it in depth gives you a deeper understanding of computer architecture, operating systems, memory management, and other low-level topics. That is why I keep stressing in my live streams that everyone should learn this language well.
However, even the most experienced programmers write all kinds of bugs. This article looks at five bugs that crop up easily when learning or using C, and how to avoid them.
This article is mainly for beginners; veterans can skip it (although, in fact, many veterans still make these basic mistakes)~
1. Variables are not initialized
When a program starts, the system allocates memory that the program uses to store data. If you define a local variable without initializing it, its value is indeterminate: it holds whatever happened to be in that memory.
But this is not absolute: some environments "clear" the memory when the program starts, so every variable happens to start at zero. For the sake of portability, it is best to always initialize your variables. This is a good habit that every competent software engineer should develop.
Let's take a look at the following example program that uses several variables and two arrays:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int i, j, k;
    int numbers[5];
    int *array;

    puts("These variables are not initialized:");
    printf(" i = %d\n", i);
    printf(" j = %d\n", j);
    printf(" k = %d\n", k);

    puts("This array is not initialized:");
    for (i = 0; i < 5; i++) {
        printf(" numbers[%d] = %d\n", i, numbers[i]);
    }

    puts("malloc an array ...");
    array = malloc(sizeof(int) * 5);
    if (array) {
        puts("This malloc'ed array is not initialized:");
        for (i = 0; i < 5; i++) {
            printf(" array[%d] = %d\n", i, array[i]);
        }
        free(array);
    }

    /* done */
    puts("Ok");
    return 0;
}
This program does not initialize its variables, so their values may be random and are not necessarily zero. Running it on my computer produces the following:
These variables are not initialized:
i = 0
j = 0
k = 32766
This array is not initialized:
numbers[0] = 0
numbers[1] = 0
numbers[2] = 4199024
numbers[3] = 0
numbers[4] = 0
malloc an array ...
This malloc'ed array is not initialized:
array[0] = 0
array[1] = 0
array[2] = 0
array[3] = 0
array[4] = 0
Ok
In these results, i and j happen to be zero, but k is 32766. In the numbers array, most elements also happen to be zero, except for the third one (4199024).
Compiling the same program on a different operating system may produce different results. So don't assume that your result is the correct and only one; you must consider portability.
For example, this is the result of the same program running on FreeDOS:
These variables are not initialized:
i = 0
j = 1074
k = 3120
This array is not initialized:
numbers[0] = 3106
numbers[1] = 1224
numbers[2] = 784
numbers[3] = 2926
numbers[4] = 1224
malloc an array ...
This malloc'ed array is not initialized:
array[0] = 3136
array[1] = 3136
array[2] = 14499
array[3] = -5886
array[4] = 219
Ok
As you can see, the results are almost completely different from the ones above. Initializing your variables will therefore save you a lot of unnecessary trouble and make future debugging easier.
2. Array out of bounds
In the computing world, counting starts from 0, but there are always people who forget this, intentionally or not. For example, if an array has 10 elements, someone always tries to read the last one with array[10]...
Don't ask how I know; I have written that myself...
Beginners make this kind of basic mistake especially often. Let's see what happens when an array access goes out of bounds.
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int i;
    int numbers[5];
    int *array;

    /* test 1 */
    puts("This array has five elements (0 to 4)");

    /* initialize the array */
    for (i = 0; i < 5; i++) {
        numbers[i] = i;
    }

    /* oops, this goes beyond the array bounds: */
    for (i = 0; i < 10; i++) {
        printf(" numbers[%d] = %d\n", i, numbers[i]);
    }

    /* test 2 */
    puts("malloc an array ...");
    array = malloc(sizeof(int) * 5);
    if (array) {
        puts("This malloc'ed array also has five elements (0 to 4)");

        /* initialize the array */
        for (i = 0; i < 5; i++) {
            array[i] = i;
        }

        /* oops, this goes beyond the array bounds: */
        for (i = 0; i < 10; i++) {
            printf(" array[%d] = %d\n", i, array[i]);
        }
        free(array);
    }

    /* done */
    puts("Ok");
    return 0;
}
Note that the program initializes all five elements of the numbers array (0 to 4) but then reads elements 0 to 9, going out of bounds. The first five values are correct, but heaven knows what the values after them will be:
This array has five elements (0 to 4)
numbers[0] = 0
numbers[1] = 1
numbers[2] = 2
numbers[3] = 3
numbers[4] = 4
numbers[5] = 0
numbers[6] = 4198512
numbers[7] = 0
numbers[8] = 1326609712
numbers[9] = 32764
malloc an array ...
This malloc'ed array also has five elements (0 to 4)
array[0] = 0
array[1] = 1
array[2] = 2
array[3] = 3
array[4] = 4
array[5] = 0
array[6] = 133441
array[7] = 0
array[8] = 0
array[9] = 0
Ok
So when you write code, you must know the bounds of your arrays. We got lucky here because the out-of-bounds accesses were only reads; had the program written to that memory, it could have corrupted other data or crashed with a core dump!
3. String overflow
In the C programming language, a string is a sequence of char values ending with a null character ('\0'), so it can also be treated as an array. You therefore need to avoid going past the end of a string as well. When that happens, it is called a string overflow.
A simple way to test string overflow is to read data with the gets function. The gets function is very dangerous because it does not know how much data the receiving string can hold; it naively reads whatever the user types. It is so dangerous that it was removed from the language in C11.
If the user's input is short, everything is fine, but if the input is longer than the receiving string, the result can be disastrous.
Let's demonstrate this phenomenon below:
#include <stdio.h>
#include <string.h>

int main()
{
    char name[10]; /* Such as "Beijing" */
    int var1 = 1, var2 = 2;

    /* show initial values */
    printf("var1 = %d; var2 = %d\n", var1, var2);

    /* this is bad .. please don't use gets */
    puts("Where do you live?");
    gets(name);

    /* show ending values (strlen returns size_t, so print it with %zu) */
    printf("<%s> is length %zu\n", name, strlen(name));
    printf("var1 = %d; var2 = %d\n", var1, var2);

    /* done */
    puts("Ok");
    return 0;
}
In this code, the receiving array has a length of 10, which leaves room for 9 characters plus the terminating null character. As long as the input is shorter than 10 characters, the program runs without problems.
For example, enter the city Beijing, which has 7 characters:
var1 = 1; var2 = 2
Where do you live?
Beijing
<Beijing> is length 7
var1 = 1; var2 = 2
Ok
The Welsh village of Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch has one of the longest place names in the world. The string has 58 characters, far more than the 10 that name can hold. If you enter this string, other locations in the program's memory, such as var1 and var2, may be overwritten:
var1 = 1; var2 = 2
Where do you live?
Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch
<Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch> is length 58
var1 = 2036821625; var2 = 2003266668
Ok
Segmentation fault (core dumped)
Before aborting, the program overwrote other parts of memory with the long string. Note that var1 and var2 no longer hold their starting values of 1 and 2.
So we need a safer way to read user input. For example, the getline function (defined by POSIX) is a good choice. It allocates enough memory to store whatever the user types, so the user cannot accidentally cause an overflow by entering a long string.
4. Repeated release of memory
One of the rules of good C programming is that if you allocate memory, you must free it.
We can use the malloc function to request memory for arrays and strings. The system reserves a block of memory and returns a pointer to its starting address. When we are done with the memory, we must remember to release it with the free function, after which the system marks the memory as unused.
However, you may call free only once for each allocation. Calling free a second time on the same pointer causes undefined behavior and may break your program.
Here is a simple example:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *array;

    puts("malloc an array ...");
    array = malloc(sizeof(int) * 5);
    if (array) {
        puts("malloc succeeded");
        puts("Free the array...");
        free(array);
    }

    /* oops, this frees the same memory a second time: */
    puts("Free the array...");
    free(array);

    puts("Ok");
    return 0;
}
Running this program triggers a core dump on the second call to free:
malloc an array ...
malloc succeeded
Free the array...
Free the array...
free(): double free detected in tcache 2
Aborted (core dumped)
So how do you avoid calling free more than once? One of the simplest methods is to put the malloc and free statements in the same function.
If you put malloc in one function and free in another, then a flaw in the program's logic can easily lead to free being called multiple times.
5. Using an invalid file pointer
Files are a very common way to store data. For example, you can store a program's configuration in a file named config.dat. When the program runs, it can open this file and read the configuration.
The ability to read data from files is therefore important to every programmer. But what if the file you want to read does not exist?
To read a file in C, you generally first open it with the fopen function, which returns a stream pointer for the file.
If the file does not exist or your program cannot read it, fopen returns NULL. What happens if we go ahead and use that pointer anyway? Let's take a look:
#include <stdio.h>

int main()
{
    FILE *pfile;
    int ch;

    puts("Open the FILE.TXT file ...");
    pfile = fopen("FILE.TXT", "r");

    /* you should check if the file pointer is valid, but we skipped that */
    puts("Now display the contents of FILE.TXT ...");
    while ((ch = fgetc(pfile)) != EOF) {
        printf("<%c>", ch);
    }
    fclose(pfile);

    /* done */
    puts("Ok");
    return 0;
}
When you run this program and the file FILE.TXT does not exist, fopen returns NULL, so pfile is NULL. If we then try to read from pfile, the program crashes immediately with a core dump:
Open the FILE.TXT file ...
Now display the contents of FILE.TXT ...
Segmentation fault (core dumped)
Therefore, we must always check that the file pointer is valid. For example, after calling fopen to open the file, test if (pfile != NULL) to make sure the pointer is usable.
Summary
No matter how experienced they are, programmers are likely to make mistakes, so we must be rigorous when we write code. If you develop good habits and add a little extra code to check for these five types of errors, you can avoid serious C programming bugs.
Which of these five common mistakes have you made yourself? Leave a comment to share with everyone, and let's see who is the king of bugs!