Writing an OS kernel from scratch: implementing processes



In the previous articles we built a framework for running and scheduling threads. This article builds on top of threads and moves on to process management.

The concepts of thread and process, and the differences between them, are well-worn interview material, so I will not repeat them here. In a small project such as scroll, the thread is the focus and the skeleton, because the thread is the basic unit of execution. A process is just a higher-level wrapper around one or more threads and is mainly responsible for resource management. In Linux, for example, the resources managed per process include:

  • virtual memory;
  • file descriptors;
  • signal handling;
  • ......

Our project is relatively simple and does not involve a complex file system or signals, so the main responsibility of a process is to manage memory. This article first defines the process structure and then focuses on two of its functions:

  • user stack management;
  • page management;

In the next few articles we will show how the OS loads and runs a user executable, along with the implementation of functions such as fork / exec, all of which operate on processes.

process structure

We define process_struct, which plays the role of the process control block (PCB); the Linux counterpart is task_struct:

struct process_struct {
  uint32 id;
  char name[32];
  enum process_status status;
  // map { tid -> threads }
  hash_table_t threads;
  // allocate user space stack for threads
  bitmap_t user_thread_stack_indexes;
  // exit code
  int32 exit_code;
  // page directory
  page_directory_t page_dir;
};

typedef struct process_struct pcb_t;

Only the most important fields are listed here, and the comments explain each one. For now, this simple structure is sufficient.

user stack allocation

As mentioned in the previous article, each thread of a process has its own stack in user space, so the process is responsible for assigning stack locations to its threads. This is very simple: the stacks are laid out one after another near the top of the user address space, just below 3 GB.

For example, we fix the address of the topmost stack and give every stack 64 KB. Allocation then needs nothing more than a bitmap:

#define USER_STACK_TOP   0xBFC00000  // 0xC0000000 - 4MB
#define USER_STACK_SIZE  65536       // 64KB

struct process_struct {
  // ...
  bitmap_t user_thread_stack_indexes;
  // ...
};

The create_new_user_thread function contains the step that allocates a stack for a user thread:

// Find a free index from bitmap.
uint32 stack_index;
if (!bitmap_allocate_first_free(&process->user_thread_stack_indexes, &stack_index)) {
  // No free stack slot is left in this process.
  return nullptr;
}

// Set user stack top.
thread->user_stack_index = stack_index;
uint32 thread_stack_top = USER_STACK_TOP - stack_index * USER_STACK_SIZE;

Note that the full function takes a lock around this allocation (omitted from the excerpt above), because several threads of the same process may be competing for a stack slot.
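To make the address arithmetic concrete, here is a minimal, self-contained sketch of the same idea. It is not the project's code: `stack_bitmap_t`, `stack_bitmap_alloc`, `stack_bitmap_free`, `stack_top_for` and the limit of 32 slots are hypothetical stand-ins for `bitmap_t` and its helpers, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USER_STACK_TOP   0xBFC00000u  // 0xC0000000 - 4MB, as in the article
#define USER_STACK_SIZE  65536u       // 64KB per thread stack
#define MAX_USER_STACKS  32u          // hypothetical limit: one 32-bit word of slots

// Hypothetical stand-in for bitmap_t: bit i set means stack slot i is taken.
typedef struct {
  uint32_t bits;
} stack_bitmap_t;

// Claims the lowest free slot; returns false when every slot is in use.
bool stack_bitmap_alloc(stack_bitmap_t* bm, uint32_t* index) {
  for (uint32_t i = 0; i < MAX_USER_STACKS; i++) {
    if (!(bm->bits & (1u << i))) {
      bm->bits |= (1u << i);
      *index = i;
      return true;
    }
  }
  return false;
}

// Releases a slot so that a later thread can reuse it.
void stack_bitmap_free(stack_bitmap_t* bm, uint32_t index) {
  bm->bits &= ~(1u << index);
}

// Stacks grow downwards, so slot i starts USER_STACK_SIZE * i below the top.
uint32_t stack_top_for(uint32_t index) {
  return USER_STACK_TOP - index * USER_STACK_SIZE;
}
```

With these definitions, slot 0 gets a stack top of 0xBFC00000 and slot 1 gets 0xBFBF0000. When a thread exits and its slot is freed, the next allocation picks up the lowest free slot again.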

page management

The second responsibility of a process is to manage its virtual memory. Virtual memory is isolated per process, and each process keeps its own page directory and page tables. When switching threads, if the next thread belongs to a different process, the page directory must be reloaded. This shows up in the context switch:

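void do_context_switch() {
  // ...
  if (old_thread->process != next_thread->process) {
    // Different process: the page directory must be reloaded here.
    // ...
  }
  // ...
}

void process_switch(pcb_t* process) {
  // Reloads the page directory of `process` (body not shown in this excerpt).
  // ...
}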
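The decision above can be modelled in user space. The sketch below is only an illustration under stated assumptions: `process_t`, `thread_t`, `fake_cr3`, `switch_address_space` and `context_switch` are hypothetical names, and the CR3 register is simulated by a plain variable rather than written with a privileged instruction.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical, simplified stand-ins for the kernel's process and thread types.
typedef struct {
  uint32_t page_dir_phys;  // physical address of the process's page directory
} process_t;

typedef struct {
  process_t* process;      // owning process
} thread_t;

// Simulated CR3 register; a real kernel would load CR3 with this value.
uint32_t fake_cr3 = 0;
uint32_t cr3_reload_count = 0;

void switch_address_space(process_t* process) {
  fake_cr3 = process->page_dir_phys;
  cr3_reload_count++;
}

void context_switch(thread_t* old_thread, thread_t* next_thread) {
  // Threads of one process share a page directory, so the reload (and the
  // TLB flush that comes with it) is skipped when the process is unchanged.
  if (old_thread->process != next_thread->process) {
    switch_address_space(next_thread->process);
  }
  // Saving and restoring registers and stacks would follow here.
}
```

Switching between two threads of the same process leaves `cr3_reload_count` untouched, while switching to a thread of another process increments it once.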

Copy page table

Obviously, each process needs its own page directory when it is created. Generally speaking, though, apart from a few original kernel processes created during kernel initialization, a new process is forked from an existing one. This is especially true of user-mode processes.

As a digression, you may have wondered why a process has to be forked from an existing one. Why can't it be created from nothing and then load a new program to run? You are probably familiar with how fork is used on Linux: the return value of fork tells you whether you are in the parent or the child, and in most cases fork is immediately followed by exec. Instead of going to that trouble, why not do it all in a single system call, for example:

int create_process(char* program, int argc, char** argv);

This could completely replace the fork + exec combination.

Unix does it this way partly for historical reasons and partly because of its design philosophy. There is plenty of discussion about this online, with supporters and opponents, and it is hard to say who is right. Since we are beginners, let's simply follow Unix and create processes with fork.
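For readers who have not used it, the fork + exec paradigm discussed above looks roughly like this on a POSIX system. This is ordinary user-space code, not part of our kernel, and `spawn_and_wait` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs argv[0] with the given arguments and returns its exit status,
// or -1 if the child could not be created or did not exit normally.
int spawn_and_wait(char* const argv[]) {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;    // fork failed, no child exists
  }
  if (pid == 0) {
    // Child: fork returned 0. Replace this process image with the program.
    execvp(argv[0], argv);
    _exit(127);   // only reached if execvp failed
  }
  // Parent: fork returned the child's pid. Wait for the child to finish.
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The same function body runs in both processes after fork, and the return value is the only thing that tells them apart. Anything the child does between fork and exec, such as redirecting file descriptors, is one of the usual arguments in favour of keeping the two calls separate.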

The complete fork will be covered in detail in a later article on system calls. This article discusses only one very important step of fork: copying the page tables. Right after a fork, the child's virtual memory is exactly the same as the parent's, which is why two almost identical processes are running afterwards. The reason is that the child's page tables are copied from the parent, so their contents are identical. Because both sets of page tables point at the same physical frames, this also saves memory.

As long as both processes only read the memory, all is well. Once either of them writes, parent and child obviously cannot keep sharing that page and must go their separate ways. This is where copy-on-write comes in, which we implement here.

The code for this section is mainly in the clone_crt_page_dir function.

The first step is to create a new page directory, which occupies one page. We allocate a physical frame and a virtual page for it (the page must be page-aligned) and then manually map one to the other. From then on the new page directory can be accessed directly through its virtual address.

int32 new_pd_frame = allocate_phy_frame();
uint32 copied_page_dir = (uint32)kmalloc_aligned(PAGE_SIZE);
map_page_with_frame(copied_page_dir, new_pd_frame);

Next, we set up the page table mappings in the new page directory. As mentioned before, all processes share the kernel space, so the 256 page tables of the kernel space are shared.

Therefore pde[768] ~ pde[1023] are the same in the page directory of every process and can simply be copied.

pde_t* new_pd = (pde_t*)copied_page_dir;
pde_t* crt_pd = (pde_t*)PAGE_DIR_VIRTUAL;

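for (uint32 i = 768; i < 1024; i++) {
  pde_t* new_pde = new_pd + i;
  if (i == 769) {
    // Entry 769 points at the new page directory itself.
    new_pde->present = 1;
    new_pde->rw = 1;
    new_pde->user = 1;
    new_pde->frame = new_pd_frame;
  } else {
    // Kernel page tables are shared: copy the current entry as is.
    *new_pde = *(crt_pd + i);
  }
}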

Note, however, that one pde is special: entry 769. As described in detail in the earlier article on virtual memory, the 4MB region covered by entry 769 of the 4GB address space is used to map the 1024 page tables themselves. Entry 769 therefore has to point to the process's own page directory, which is what the i == 769 branch in the loop above does.
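The index arithmetic behind entries 768 and 769 can be checked with a few helper functions. This is a sketch for a 32-bit, two-level x86 layout with 4KB pages; the function names are hypothetical, and the assumption that `PAGE_TABLES_VIRTUAL` equals `769 << 22` follows from the description above rather than from the project's headers.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

// Index of the page directory entry covering a virtual address:
// the top 10 bits. Each pde spans 4MB.
uint32_t pde_index(uint32_t vaddr) {
  return vaddr >> 22;
}

// Index of the page table entry within its page table: the middle 10 bits.
uint32_t pte_index(uint32_t vaddr) {
  return (vaddr >> 12) & 0x3FFu;
}

// With pde[769] pointing at the page directory itself, page table i becomes
// visible at this virtual address (assumed: PAGE_TABLES_VIRTUAL + i * PAGE_SIZE).
uint32_t page_table_vaddr(uint32_t i) {
  return (769u << 22) + i * PAGE_SIZE;
}

// The page directory then appears as "page table 769" inside that region.
uint32_t page_dir_vaddr(void) {
  return page_table_vaddr(769u);
}
```

Under these assumptions the kernel space starts at `pde_index(0xC0000000) == 768`, the page tables are visible from 0xC0400000, and the page directory itself appears at 0xC0701000.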

After the kernel space is done, the next step is to copy the page tables of the user space. Each page table is copied, and the corresponding pde in the new page directory is set to point to the copy. Note that only the page tables are copied, not the pages they refer to, so the virtual memory of parent and child is backed by exactly the same physical pages:

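int32 new_pt_frame = allocate_phy_frame();
// Assumed: the declaration below and the memcpy call are cut off in this
// excerpt and are filled in by analogy with the page directory code above.
uint32 copied_page_table = (uint32)kmalloc_aligned(PAGE_SIZE);

// Copy page table and set ptes copy-on-write.
map_page_with_frame(copied_page_table, new_pt_frame);
memcpy((void*)copied_page_table,
       (void*)(PAGE_TABLES_VIRTUAL + i * PAGE_SIZE),
       PAGE_SIZE);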

The page table is copied in the same way as the page directory: we manually allocate a physical frame and a virtual page and map them, so that all memory operations go through virtual addresses.

Next comes the critical step. Parent and child share all user-space memory, but writes must be isolated. To achieve this, the corresponding ptes in both the parent's and the child's page tables are temporarily marked read-only. A write by either process then triggers a page fault. The page fault handler copies the page and points the pte at the new copy, which achieves the isolation:

// ...
crt_pte->rw = 0;
new_pte->rw = 0;
// ...
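The loop around those two assignments might look like the following sketch. It is a simplified model rather than the project's code: the `pte_t` layout is reduced to the fields used here, and `share_page_table_cow` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES_PER_TABLE 1024

// Simplified pte holding only the fields this sketch needs.
typedef struct {
  uint32_t present : 1;
  uint32_t rw      : 1;
  uint32_t user    : 1;
  uint32_t frame   : 20;
} pte_t;

// Copies one parent page table into the child and write-protects every
// present entry on both sides, so the first write from either process faults.
void share_page_table_cow(pte_t* parent_pt, pte_t* child_pt) {
  for (int i = 0; i < ENTRIES_PER_TABLE; i++) {
    pte_t* crt_pte = &parent_pt[i];
    pte_t* new_pte = &child_pt[i];
    *new_pte = *crt_pte;  // same frame, same flags
    if (!crt_pte->present) {
      continue;           // nothing is mapped here, nothing to protect
    }
    crt_pte->rw = 0;
    new_pte->rw = 0;
  }
}
```

After the call both tables refer to the same frames, and every mapped page is read-only in both processes until a write fault resolves it.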

copy-on-write exception handling

Copy-on-write was introduced above. Once a write triggers the page fault, the problem has to be resolved in the page fault handler.

The condition that identifies this type of page fault is:

if (pte->present && !pte->rw && is_write)

That is, the page is mapped but marked read-only, and the access that caused the page fault is a write.
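On x86 the kind of access can be read from the error code that the CPU pushes for a page fault: bit 0 is set when the page was present, bit 1 when the access was a write, and bit 2 when the fault happened in user mode. The sketch below assumes that `is_write` is derived from that error code; the macro names and `is_cow_candidate` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PF_ERR_PRESENT 0x1u  // bit 0: protection violation on a present page
#define PF_ERR_WRITE   0x2u  // bit 1: the faulting access was a write
#define PF_ERR_USER    0x4u  // bit 2: the fault happened in user mode

// Returns true when a fault looks like a write to a present, read-only page.
// pte_present and pte_rw mirror the pte fields tested in the condition above.
bool is_cow_candidate(uint32_t error_code, bool pte_present, bool pte_rw) {
  bool is_write = (error_code & PF_ERR_WRITE) != 0;
  return pte_present && !pte_rw && is_write;
}
```

A read fault, or a write to a page that is not mapped at all, does not match and has to be handled by other branches of the handler.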

We use a global hash table to record, for each frame, how many processes currently share it as a result of fork. Every time a copy-on-write fault is handled, the frame's reference count is decremented by 1. If references remain, the page must be copied. Otherwise this process holds the last reference, so it can use the frame exclusively and the pte can simply be marked rw = true:

int32 cow_refs = change_cow_frame_refcount(pte->frame, -1);
if (cow_refs > 0) {
  // Allocate a new frame for 'copy' on write.
  frame = allocate_phy_frame();
  void* copy_page = (void*)COPIED_PAGE_VADDR;
  map_page_with_frame_impl((uint32)copy_page, frame);
  // Assumed: the copy call is cut off in this excerpt; a page-sized memcpy fits.
  memcpy(copy_page,
         (void*)(virtual_addr / PAGE_SIZE * PAGE_SIZE),
         PAGE_SIZE);
  pte->frame = frame;
  pte->rw = 1;

  release_pages((uint32)copy_page, 1, false);
} else {
  // Last sharer: keep the frame and just make it writable again.
  pte->rw = 1;
}
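The reference-counting logic can be exercised in user space with a small simulation. Everything here is a hypothetical model: physical memory is an array, `cow_refs` is a plain array instead of a hash table, and the rule that the first share of a frame sets its count to 2 (parent plus child) is an assumption chosen to match the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE  4096
#define NUM_FRAMES 8

typedef struct {
  uint32_t present : 1, rw : 1, frame : 20;
} pte_t;

uint8_t phys_mem[NUM_FRAMES][PAGE_SIZE];  // simulated physical frames
int cow_refs[NUM_FRAMES];                 // sharers per frame (0 = not shared)
int next_free_frame = 1;                  // trivial bump allocator; frame 0 is in use

int alloc_frame(void) {
  assert(next_free_frame < NUM_FRAMES);
  return next_free_frame++;
}

// Models fork for a single page: the child maps the same frame, both ptes
// become read-only, and the frame's sharer count is updated.
void share_frame(pte_t* parent, pte_t* child) {
  *child = *parent;
  parent->rw = 0;
  child->rw = 0;
  int* refs = &cow_refs[parent->frame];
  *refs = (*refs == 0) ? 2 : *refs + 1;
}

// Resolves a write fault on a present, read-only pte.
void handle_cow_fault(pte_t* pte) {
  int refs = --cow_refs[pte->frame];
  if (refs > 0) {
    // Others still use the old frame: give this process a private copy.
    int frame = alloc_frame();
    memcpy(phys_mem[frame], phys_mem[pte->frame], PAGE_SIZE);
    pte->frame = (uint32_t)frame;
  }
  // Either way this process now owns its frame exclusively.
  pte->rw = 1;
}
```

In this model the first process to write receives a private copy of the page, and the last remaining sharer keeps the original frame without any copying.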


This article is only the beginning of process support. It defined the basic data structure of a process and implemented its memory management, which is one of the most important responsibilities of a process in this project. In the next few articles we will actually create processes and load user executables from disk to run them, using the classic fork + exec combination.

naive programmer