
From mbr to loader

Following the previous article, BIOS boot to real mode, this article starts writing the loader. First, review the disk image and memory layout diagram:

For now we only need to look at the memory layout below 1 MB, mainly the yellow mbr and the blue loader. In the previous article the mbr was loaded into memory and execution reached its final instruction, jmp LOADER_BASE_ADDR (0x8000), which jumps to the loader. Now the loader itself needs to be implemented.

The work of the loader

In general, the work of loader mainly includes the following items:

  • Set up the GDT (Global Descriptor Table), initialize the kernel code and data segment registers, and bring the CPU into protected mode;
  • Create the kernel page directory and page tables, enable virtual memory, and enter paging mode;
  • Load the kernel image into memory and jump to the kernel code, at which point control of the system is handed over to the kernel;

As you can see, the loader does a lot of work, and it already touches some core parts of the x86 architecture. To understand and implement the loader, you therefore need the following background:

  • GDT, segmented memory addressing, segment registers, protected mode;
  • Virtual memory, page directory, page tables;
  • The ELF file format, because the kernel will be compiled and linked into a file of this format;

loader implementation

As before, here is the link to the project code, src/boot/loader.S, for your reference.

This source file is already fairly large, especially since it is written in assembly, and it also contains many utility and printing functions. To avoid getting lost, here are the most important key points (functions), which represent the work the loader needs to do:

# Entry point
loader_start

# Initialize the GDT and enter protected mode
setup_protection_mode
protection_mode_entry

# Initialize the kernel page directory and page tables
setup_page

# Load and enter the kernel
init_kernel

Next we implement these functions one by one. In this article, we first initialize the GDT and enter 32-bit protected mode.

Enter the loader

Before we start, let's look at the code at the beginning of the loader. As with the mbr, we first declare the starting memory address of the loader code, 0x8000. This is by design: the mbr loads the loader from disk to memory address 0x8000 and jumps there, so the loader's addressing must start from this address.

; LOADER_BASE_ADDR = 0x8000
SECTION loader vstart=LOADER_BASE_ADDR

Next comes the first instruction of the loader, jmp loader_start. It is a simple jump to loader_start, where the actual execution of the loader begins:

loader_entry:
  jmp loader_start

; global data
; ...

loader_start:
  call clear_screen
  call setup_protection_mode

If you are not familiar with this style of assembly code, it may look strange: why the jmp, and what is the part in the middle that gets skipped? The answer is that the middle holds the data we want to define, similar to global variables defined in a .c file. A bunch of strings for printing are defined there, as well as the crucial GDT.

You may have realized that instructions and data in assembly source can be freely mixed, and their layout in the final assembled binary follows the layout of the source exactly. So you can arrange your instructions and data however you like, as long as the instruction stream flows correctly and never runs off into data. Of course, the start of the loader at 0x8000 must be the entry code, because that is the jump address agreed upon with the mbr. Everything else you are free to arrange as you wish.
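The "jump over the data" layout can be modeled with a few lines of Python. This is only a sketch under assumptions that do not come from loader.S: the entry jump is taken to be a 16-bit near jmp (opcode 0xE9 plus a 2-byte displacement), the data area is a 240-byte placeholder, and a single hlt byte (0xF4) stands in for the code at loader_start.

```python
# Minimal model of a flat binary image: entry jmp, then data, then code.
# Assumptions (not taken from loader.S): near jmp encoding 0xE9 + rel16,
# a 240-byte placeholder data area, and hlt (0xF4) as the only "code".
LOADER_BASE_ADDR = 0x8000
JMP_NEAR_LEN = 3  # opcode 0xE9 plus a 2-byte little-endian displacement


def encode_jmp_near(from_offset: int, to_offset: int) -> bytes:
    """Encode a near jmp; the displacement is relative to the next instruction."""
    rel16 = (to_offset - (from_offset + JMP_NEAR_LEN)) & 0xFFFF
    return b"\xE9" + rel16.to_bytes(2, "little")


global_data = b"hello\x00" * 40  # hypothetical strings, GDT entries, ...
loader_start_offset = JMP_NEAR_LEN + len(global_data)

# The image is laid out exactly in source order: jmp, data, code.
image = encode_jmp_near(0, loader_start_offset) + global_data + b"\xF4"

# Decode the jump the way the CPU would after fetching it at 0x8000.
rel16 = int.from_bytes(image[1:3], "little")
jump_target = LOADER_BASE_ADDR + JMP_NEAR_LEN + rel16
loader_start_addr = LOADER_BASE_ADDR + loader_start_offset
```

Here jump_target and loader_start_addr come out equal, so the instruction stream never runs into the data bytes that sit between the entry point and the real code.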

Initialize the GDT table

Now for the global data section mentioned above. You can skip the print strings I added and go directly to the definition of the GDT. The GDT entries are defined here, and each entry occupies 8 bytes, or 64 bits. For the meaning and field format of the GDT, you can refer to JamesM's kernel development tutorials, which I recommended earlier. These are historical baggage of the x86 architecture. I don't want to spend ink explaining them again, but our code must implement them and obey their rules.

The first entry of the GDT is reserved and not used. The fourth is the video memory segment descriptor, which is not strictly necessary and can be ignored. So we only need to pay attention to the second and third entries, which are:

  • Kernel code segment ( kernel code ) descriptor;
  • Kernel data segment ( kernel data ) descriptor;

We use the dd pseudo-instruction to define these two segment descriptors:

CODE_DESC:
  dd DESC_CODE_LOW_32
  dd DESC_CODE_HIGH_32

DATA_DESC:
  dd DESC_DATA_LOW_32
  dd DESC_DATA_HIGH_32

DESC_CODE_LOW_32, DESC_CODE_HIGH_32, DESC_DATA_LOW_32 and DESC_DATA_HIGH_32 are all defined in src/boot/boot.inc; you can verify each bit against the tutorial mentioned above. Again, this is boring, tedious, meticulous but unavoidable work. It is not difficult; all it takes is the patience to read the manual.
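To make the bit layout concrete, here is a rough sketch that packs a base, limit, access byte and flags nibble into the low and high 32-bit halves of a descriptor. The function name and the constants (a flat 4 GiB segment, access bytes 0x9A and 0x92, flags 1100b) are illustrative assumptions; the actual values in src/boot/boot.inc may differ, for example in the type bits.

```python
def encode_descriptor(base: int, limit: int, access: int, flags: int):
    """Return (low_32, high_32) for one 8-byte segment descriptor."""
    # Low dword: base bits 15..0 in the upper half, limit bits 15..0 below.
    low = ((base & 0xFFFF) << 16) | (limit & 0xFFFF)
    # High dword: base 23..16, access byte, limit 19..16, flags, base 31..24.
    high = (
        ((base >> 16) & 0xFF)
        | ((access & 0xFF) << 8)
        | (((limit >> 16) & 0xF) << 16)
        | ((flags & 0xF) << 20)
        | (((base >> 24) & 0xFF) << 24)
    )
    return low, high


# Assumed example values, not read from boot.inc:
FLAGS_4K_32BIT = 0b1100  # G=1 (4 KiB granularity), D/B=1 (32-bit segment)
ACCESS_CODE = 0x9A       # present, ring 0, code, readable
ACCESS_DATA = 0x92       # present, ring 0, data, writable

code_low, code_high = encode_descriptor(0, 0xFFFFF, ACCESS_CODE, FLAGS_4K_32BIT)
data_low, data_high = encode_descriptor(0, 0xFFFFF, ACCESS_DATA, FLAGS_4K_32BIT)
```

With these assumed inputs the code descriptor comes out as 0x0000FFFF / 0x00CF9A00 and the data descriptor as 0x0000FFFF / 0x00CF9200, which is a convenient pattern to compare against whatever your own include file defines.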


For readers who are not very familiar with assembly, it is worth explaining the dd pseudo-instruction. dd means define double word (4 bytes); similarly there are db (byte) and dw (word, 2 bytes). When one of them appears in the assembly source, the data that follows it is written into the assembled binary at that position. Here you can see once more how closely assembly maps to the resulting binary: it is almost a literal translation.
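As a small illustration of what these pseudo-instructions emit, the sketch below mimics db, dw and dd with Python's struct module. x86 is little-endian, so the least significant byte is written first. The helper names and the two example values are placeholders chosen for this sketch, not constants from the project.

```python
import struct


def db(value: int) -> bytes:
    """1 byte, like `db value`."""
    return struct.pack("<B", value)


def dw(value: int) -> bytes:
    """2 bytes, little-endian, like `dw value`."""
    return struct.pack("<H", value)


def dd(value: int) -> bytes:
    """4 bytes, little-endian, like `dd value`."""
    return struct.pack("<I", value)


# Two dd lines in a row produce one 8-byte GDT entry.
# The values are illustrative only.
entry = dd(0x0000FFFF) + dd(0x00CF9A00)
```

Printing entry.hex() shows the bytes exactly as they would sit in the binary, low dword first and each dword stored least significant byte first.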

Enter protected mode

After setting up the GDT, we can enter protected mode:

; enable A20
in al, 0x92
or al, 0000_0010b
out 0x92, al

; load GDT
lgdt [gdt_ptr]

; enable protected mode - set cr0 bit 0
mov eax, cr0
or eax, 0x00000001
mov cr0, eax

; refresh pipeline
jmp dword SELECTOR_CODE:protection_mode_entry

Here the lgdt instruction loads the GDT, and setting bit 0 of cr0 turns on protected mode. Then a far jump initializes the cs segment register to the kernel code segment. Note that the value of cs cannot be changed with a mov instruction; it must be set implicitly through a jump.

After the jump, execution arrives at protection_mode_entry, where the kernel data segment registers are initialized:
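The memory operand that lgdt reads is a 6-byte structure: a 16-bit limit (table size in bytes minus one) followed by a 32-bit base address. The sketch below builds such a structure and also shows the cr0 bit operation. The GDT address 0x8003, the entry count of 4 and the starting cr0 value are hypothetical numbers chosen for illustration; the real gdt_ptr in loader.S is filled in by the assembler.

```python
import struct

GDT_ENTRY_SIZE = 8   # every descriptor is 8 bytes
CR0_PE = 1 << 0      # protection enable bit of cr0


def encode_gdt_ptr(gdt_base: int, entry_count: int) -> bytes:
    """Build the 6-byte lgdt operand: 16-bit limit, then 32-bit base."""
    limit = entry_count * GDT_ENTRY_SIZE - 1
    return struct.pack("<HI", limit, gdt_base)


# Hypothetical values: 4 entries located at address 0x8003.
gdt_ptr = encode_gdt_ptr(0x8003, 4)

# `or eax, 0x00000001` sets only bit 0 and leaves the other bits alone.
cr0_before = 0x00000010           # hypothetical value read from cr0
cr0_after = cr0_before | CR0_PE
```

The limit is size minus one because it names the last valid byte offset inside the table, so four entries give a limit of 31 (0x1F).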

protection_mode_entry:
  ; set data segments
  mov ax, SELECTOR_DATA
  mov ds, ax
  mov es, ax
  mov ss, ax

  ; set video segment
  mov ax, SELECTOR_VIDEO
  mov gs, ax

At this point, the initialization of protected mode is complete. Next comes the setup_page function, a key part of the loader, which starts building the kernel's virtual memory; that is left to the next article.


navi