Foreword

Anyone who has written programs on Linux has probably had to work out how much memory a process occupies, or has been asked: how much (physical) memory does your program occupy when it runs?

Usually we check this with the top command, which reports three important indicators: VIRT, RES and SHR. What do they mean?

That is the question this article sets out to answer. If you dig deeper, you may also ask where the physical memory occupied by the process actually goes. The top command cannot tell you that, but the smaps file provided by the proc file system can: it lists in detail how the current process uses physical memory.

This article will be divided into three parts:

1. Briefly explain two important concepts: virtual memory and resident memory;

2. Explain what the three fields VIRT, RES and SHR in the top command actually mean;

3. Introduce the format of the smaps file. By analyzing this file, we can see in detail how a process uses physical memory: how much is taken up by mmap-ed files, how much by dynamic memory allocation, how much by the function call stack, and so on.

1. Two concepts about memory

To understand what the top command reports about memory usage, we must first understand two concepts: virtual memory and resident memory.

(1) Virtual memory

The first thing to emphasize is that virtual memory is not physical memory. Although both names contain the word "memory", they are two different concepts. A process that occupies a large amount of virtual memory does not necessarily occupy a large amount of physical memory. Virtual memory is a logical memory space that the operating system kernel designs for managing the address space of each process.

The pointers in our programs are addresses in this virtual memory space. For example, after writing a C++ program we compile it with g++. The addresses the compiler works with are virtual addresses: the program has not run yet, so there are no physical addresses to speak of. Every instruction and every piece of data the program may need while running must lie within the virtual memory space.

Since virtual memory is a logical (fictitious) memory space, a program can only run on a physical machine if some mechanism maps this fictitious space onto physical memory (the real space on the RAM sticks). This is what the page table in the operating system does.

The kernel maintains a separate page table for each process in the system. The basic idea is that each section of virtual memory the program needs to access while running is mapped, through the page table, to a section of physical memory. When the CPU accesses a virtual address, it looks up the page table to find and access the corresponding physical address. A "page" is the basic unit in which virtual memory is mapped to physical memory.

The following figure shows the relationship between virtual memory space and physical memory space, which are linked by the page table. Each shaded part of a virtual memory space is mapped to the part of the physical memory space with the same shading. The gray parts of a virtual memory space have no counterpart in physical memory; that is, they are not mapped. This follows the principle of "on-demand mapping": the virtual memory space is very large, and many parts of it may never be accessed during a run of the program, so there is no need to map those parts to physical memory.
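The lookup described above can be sketched in a few lines of Python. This is only a toy model: it assumes a 4 KiB page size and a flat, single-level table stored in a dictionary, whereas real page tables are multi-level structures walked by hardware. The names translate and toy_table are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; 4 KiB is common but not universal


def translate(page_table, virtual_address):
    """Translate a virtual address with a toy single-level page table.

    page_table maps virtual page numbers to physical frame numbers.
    Returns None when the page is not mapped (a real CPU would raise a page fault).
    """
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame_number = page_table.get(page_number)
    if frame_number is None:
        return None
    # The offset inside the page is unchanged; only the page number is replaced.
    return frame_number * PAGE_SIZE + offset


# Hypothetical mapping: virtual page 0 -> frame 7, virtual page 2 -> frame 3.
toy_table = {0: 7, 2: 3}
print(hex(translate(toy_table, 0x2010)))  # virtual page 2, offset 0x10
print(translate(toy_table, 0x1000))       # virtual page 1 is not mapped
```

The point of the sketch is that translation works on whole pages: the page number selects an entry in the table, and the offset within the page is carried over as-is.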
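[Figure 1: the virtual memory spaces of process A and process B mapped to physical memory through page tables]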

So far, we have basically explained what virtual memory is.

To sum up, virtual memory is an imaginary memory space, and the part that needs to be accessed in the virtual memory space will be mapped to the physical memory space during the running of the program. A large virtual memory space can only mean that the accessible space during the running of the program is relatively large, but it does not mean that the physical memory space is also large.
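To make the difference between "reserved" and "actually mapped" concrete, here is a small simulation. It is a toy model rather than how the kernel works internally: the class ToyAddressSpace, its methods, and the 4 KiB page size are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class ToyAddressSpace:
    """Toy model: virtual pages are reserved up front, frames are assigned on first access."""

    def __init__(self, reserved_pages):
        self.reserved_pages = reserved_pages
        self.page_table = {}   # virtual page number -> physical frame number
        self.next_frame = 0

    def touch(self, virtual_address):
        page_number = virtual_address // PAGE_SIZE
        if page_number >= self.reserved_pages:
            raise ValueError("address outside the reserved virtual range")
        if page_number not in self.page_table:
            # First access: map the page now (the toy equivalent of a page fault).
            self.page_table[page_number] = self.next_frame
            self.next_frame += 1

    def virtual_kib(self):
        return self.reserved_pages * PAGE_SIZE // 1024

    def resident_kib(self):
        return len(self.page_table) * PAGE_SIZE // 1024


space = ToyAddressSpace(reserved_pages=1024)   # 4 MiB of virtual space
for address in (0, 100, 4096, 8192 * 10):      # touches pages 0, 0, 1 and 20
    space.touch(address)
print(space.virtual_kib(), space.resident_kib())
```

Only three distinct pages are touched, so the toy process has 4096 KiB of virtual space but just 12 KiB mapped to physical frames.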

(2) Resident memory

Resident memory, as the name suggests, is the physical memory that is mapped into the virtual memory space of a process. In the figure above, the shaded parts of the physical memory space are all resident memory. For example, A1, A2, A3 and A4 are the resident memory of process A; B1, B2 and B3 are the resident memory of process B.

The resident memory of a process is the physical memory the process actually occupies. When we ask how much memory a process uses, we usually mean how much resident memory it uses, not how much virtual memory, because a large virtual memory size does not imply a large physical memory footprint.

2. VIRT, RES and SHR in the top command

So far we have covered the two concepts of virtual memory and resident memory. Next, let's look at what VIRT, RES and SHR mean in the top command.

Once the concept of virtual memory is clear, VIRT is easy to explain. VIRT is the size of the virtual memory space of the process. For process A in Figure 1, it is the sum of A1, A2, A3, A4 and the gray parts. In other words, VIRT covers both the part that has already been mapped to physical memory and the part that has not.

RES is the size of the part of the virtual memory space that has been mapped to physical memory. For process A in Figure 1, it is the sum of A1, A2, A3 and A4. So, to see how much memory a process occupies while running, look at RES rather than VIRT.

Finally, let's take a look at what SHR means.

SHR is short for "shared" and represents the size of the shared memory occupied by the process. In the figure above, A4 in the virtual memory space of process A and B3 in the virtual memory space of process B are both mapped to the same A4/B3 part of the physical memory space. This looks strange at first glance.

Why does this happen?

The programs we write depend on many external dynamic libraries (.so), such as libc.so and the dynamic loader ld.so. Only one copy of each such library is kept in physical memory. When a process needs the library at run time, the dynamic loader maps that same memory into the virtual memory space of the process. The same thing happens when multiple processes communicate with each other through shared memory.

As a result, the virtual memory space of different processes will be mapped to the same physical memory space. This part of the physical memory space is actually shared by multiple processes, so we call them shared memory, which is represented by SHR.

Besides the memory it shares with other processes, a process also has memory that is exclusively its own. To estimate the size of this exclusive memory, subtract the SHR value from the RES value.
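The same three numbers can be read without top. As a rough sketch: on Linux, /proc/<pid>/statm lists memory sizes in pages, and its first three fields (total size, resident, shared) are assumed here to correspond to VIRT, RES and SHR as shown by top. The function name top_style_memory and the sample line are made up for illustration.

```python
def top_style_memory(statm_text, page_size=4096):
    """Convert the first three /proc/<pid>/statm fields (counted in pages) to KiB."""
    size_pages, resident_pages, shared_pages = (int(x) for x in statm_text.split()[:3])
    virt = size_pages * page_size // 1024
    res = resident_pages * page_size // 1024
    shr = shared_pages * page_size // 1024
    # RES - SHR is the estimate of exclusive memory described in the text.
    return {"VIRT": virt, "RES": res, "SHR": shr, "RES-SHR": res - shr}


# Made-up sample line; on a Linux machine, read the real one from /proc/<pid>/statm.
sample = "25000 1500 600 10 0 900 0"
print(top_style_memory(sample))
```

On a real system the page size should come from os.sysconf("SC_PAGE_SIZE") rather than the assumed default of 4096 bytes.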

3. The smaps file of the process

Through the top command, we can already see the virtual memory size of the process (VIRT), the physical memory it occupies (RES), and the memory it shares with other processes (SHR). But that is all top offers. What if we want answers to questions like these:

• How is the virtual memory space of the process distributed? For example, how much space does the heap occupy, how much do file mappings (mmap) occupy, and how much does the stack occupy?

• Has any of the process's memory been swapped out to swap space, and if so, how much?

• For a data file opened with mmap, how many of its pages in memory are dirty pages that have not yet been written back to disk?

• For a data file opened with mmap, how many of its pages are currently in memory, and how many are still on disk and have not been loaded into the page cache?

None of the above questions can be answered by the top command, but sometimes these questions are exactly what we need to answer when we analyze and optimize the performance bottleneck of the program. Fortunately, there are more solutions to problems in the world than the problems themselves. Linux provides a smaps file for each process through the proc file system. By analyzing this file, we can answer the above questions one by one.

In the smaps file, each record (as shown in the figure below) describes one contiguous area in the virtual memory space of the process. The first line of a record gives, from left to right, the address range, the permission flags, the offset into the mapped file, the device number, the inode, and the file path. For a detailed explanation, see understanding-linux-proc-id-maps.
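As an illustration of that first line, the sketch below splits it into named fields. The function name parse_mapping_header and the sample line (including the path /usr/bin/example) are made up; the field order follows the description above.

```python
def parse_mapping_header(line):
    """Split the first line of a smaps record into named fields."""
    # The path is optional and may contain spaces, so split at most 5 times.
    parts = line.split(None, 5)
    start_hex, end_hex = parts[0].split("-")
    start, end = int(start_hex, 16), int(end_hex, 16)
    return {
        "start": start,
        "end": end,
        "size_kib": (end - start) // 1024,
        "perms": parts[1],           # e.g. r-xp: read, execute, private
        "offset": int(parts[2], 16),
        "device": parts[3],          # major:minor
        "inode": int(parts[4]),
        "path": parts[5].strip() if len(parts) > 5 else "",
    }


# Made-up sample header line for a file-backed mapping.
header = parse_mapping_header("00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example")
print(header["size_kib"], header["perms"], header["path"])
```

The size computed from the address range should match the Size field of the same record.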

The meanings of the next 8 fields are as follows:

• Size: The size of the mapped area in the virtual memory space.

• Rss: How much space the mapped area currently occupies in physical memory.

• Shared_Clean: The size of the pages shared with other processes that have not been modified.

• Shared_Dirty: The size of the pages shared with other processes that have been modified.

• Private_Clean: The size of the private pages that have not been modified.

• Private_Dirty: The size of the private pages that have been modified.

• Swap: The size of the non-mmap memory (also called anonymous memory, such as memory dynamically allocated by malloc) that has been swapped out to swap space because physical memory ran short.

• Pss: The size of the physical memory used by this area after amortization (some memory, such as mmap-ed memory, may be shared with other processes). For example, if the physical memory mapped by this area is also mapped by one other process and its size is 1000KB, this process is charged half of it, so Pss=500KB.
[Figure: a sample record in a smaps file]
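The Pss arithmetic from the last item can be written out as a short sketch. It is a simplification: it works on whole regions rather than individual pages, and the function name pss_kib and the input numbers are made up for illustration.

```python
def pss_kib(private_kib, shared_regions):
    """Proportional share: private memory plus each shared region divided among its sharers.

    shared_regions is a list of (size_kib, number_of_processes_mapping_it) pairs.
    """
    return private_kib + sum(size / sharers for size, sharers in shared_regions)


# The example from the text: 1000KB mapped by two processes.
print(pss_kib(0, [(1000, 2)]))
# A made-up process with 200KB private memory and two shared regions.
print(pss_kib(200, [(1000, 2), (300, 3)]))
```

Because every sharer is charged only its fraction, adding up Pss across processes does not count shared memory more than once, unlike adding up Rss.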

With such detailed information in smaps about how virtual memory is mapped to physical memory, you should now be able to answer the four questions raised above by analyzing this file.

I hope this article has given you a clearer understanding of the virtual memory and physical memory of a process, helps you read the memory figures in the output of top more accurately, and lets you analyze the memory usage of a process further through the smaps file.
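One possible way to do that analysis is sketched below. It sums the kB-valued fields of each record by category. The category names, the function summarize_smaps, and the sample text are illustrative choices, not part of the smaps format; real smaps records contain more fields than shown here.

```python
import re
from collections import defaultdict

HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")   # first line of a record
FIELD = re.compile(r"^(\w+):\s+(\d+) kB")        # e.g. "Rss:      64 kB"


def summarize_smaps(text):
    """Sum the kB-valued smaps fields per mapping category.

    Categories: "heap", "stack", "file" (path starts with "/"), and
    "other" (anonymous and special mappings).
    """
    totals = defaultdict(lambda: defaultdict(int))
    category = None
    for line in text.splitlines():
        if HEADER.match(line):
            parts = line.split(None, 5)
            path = parts[5].strip() if len(parts) > 5 else ""
            if path == "[heap]":
                category = "heap"
            elif path.startswith("[stack"):
                category = "stack"
            elif path.startswith("/"):
                category = "file"
            else:
                category = "other"
            continue
        field = FIELD.match(line)
        if field and category is not None:
            totals[category][field.group(1)] += int(field.group(2))
    return totals


# Made-up, shortened sample; on Linux, read the real text from /proc/<pid>/smaps.
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example
Size:                328 kB
Rss:                 200 kB
Private_Dirty:         0 kB
Swap:                  0 kB
01000000-01021000 rw-p 00000000 00:00 0 [heap]
Size:                132 kB
Rss:                  64 kB
Private_Dirty:        64 kB
Swap:                  8 kB
7ffc00000000-7ffc00021000 rw-p 00000000 00:00 0 [stack]
Size:                132 kB
Rss:                  16 kB
Private_Dirty:        16 kB
Swap:                  0 kB
"""

totals = summarize_smaps(SAMPLE)
for name in ("heap", "stack", "file"):
    print(name, totals[name]["Rss"], "kB resident")
print("swapped out:", sum(c["Swap"] for c in totals.values()), "kB")
print("file pages not in memory:", totals["file"]["Size"] - totals["file"]["Rss"], "kB")
```

Rss per category answers the distribution question, the Swap total answers the swap question, the dirty fields of file-backed records give the amount not yet written back, and Size minus Rss of a file mapping shows how much of it is not currently in memory.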
———————————————
Copyright statement: This article is an original article by the CSDN blogger "JD Cloud Developer" and follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this statement for reprinting.
Original link: https://blog.csdn.net/jdcdev_/article/details/126847305

