Series catalog
- Preface
- Preparation work
- BIOS boot to real mode
- GDT and protected mode
- A preliminary
- load and enter the kernel
- display and print
- Global Descriptor Table GDT
- Interrupt handling
- improving virtual memory
- implementing heap and malloc
- first kernel thread
- multi-thread switching
- lock and multi-thread synchronization
- enter user mode
- process
- system call
- Simple file system
- Load executable program
- keyboard driver
- run shell
From mbr to loader
Following the previous article, BIOS boot to real mode, this article starts writing the loader.
First, let's review the disk image and memory layout diagram:
At this stage we only need to look at the memory layout below 1MB, mainly the yellow mbr and blue loader regions. In the previous article, mbr was loaded into memory and execution reached the jmp LOADER_BASE_ADDR (0x8000) instruction in mbr, which jumps to the loader. The loader itself now needs to be implemented.
The work of the loader
In general, the loader's work consists of the following:
- Set up the GDT (Global Descriptor Table), initialize the kernel code and data segment registers, and bring the CPU into protected mode;
- Create the kernel page directory and page tables, turn on virtual memory, and enter paging mode;
- Load the kernel image into memory and jump into the kernel code, at which point control of the system is handed over to the kernel.
As you can see, the loader has a lot to do, and it touches some core parts of the x86 architecture. To understand and implement the loader, you need the following background:
- GDT, segmented memory addressing, segment registers, protected mode;
- Virtual memory, page directory, page tables;
- The ELF file format, because the kernel will be compiled and linked into a file of this format.
Loader implementation
As before, here is the link to the code in my project, src/boot/loader.S, for your reference.
This source file is already fairly large, especially since it is written in assembly, and it also contains many utility and printing functions. To keep from getting lost, here are the most important key points (functions), which represent the work the loader has to do:
# Entry point
loader_start
# Initialize the GDT and enter protected mode
setup_protection_mode
protection_mode_entry
# Initialize the kernel page directory and page tables
setup_page
# Load and enter the kernel
init_kernel
Next we implement these functions one by one. In this article, we first initialize the GDT and enter 32-bit protected mode.
Enter the loader
Before we start, let's look at the code at the beginning of the loader. As with mbr, we first define the starting memory address of the loader code, 0x8000. This is by design: mbr loads the loader from disk to memory address 0x8000 and jumps there, so the loader's addressing must start from this address.
; LOADER_BASE_ADDR = 0x8000
SECTION loader vstart=LOADER_BASE_ADDR
Next comes the loader's first instruction, jmp loader_start. It is a simple jump: we jump to loader_start to begin the actual execution of the loader:
loader_entry:
jmp loader_start
; global data
; ...
loader_start:
call clear_screen
call setup_protection_mode
If you are not familiar with this style of assembly, you may find it strange: why the jmp? What is the part in the middle being skipped? The answer is that the middle holds the data we want to define, similar to global variables defined in a .c file. A bunch of strings for printing are defined there, as well as the crucial GDT.
You may have realized that instructions and data can be freely mixed in assembly source, and that their layout in the final binary follows the layout of the source exactly. So you can place your instructions and data wherever you like, as long as the instruction stream flows correctly and never runs off into data. Of course, the starting position of the loader, 0x8000, must be the entry code, because this is the jump address agreed upon with mbr. As for the rest, you can arrange it freely.
Initialize the GDT table
Coming to the global data section mentioned above, you can skip the strings I added for printing and go straight to the definition of the GDT. The GDT entries are defined here, and each entry occupies 8 bytes, or 64 bits. For the meaning and field format of the GDT, you can refer to JamesM's kernel development tutorials, which I recommended earlier. These are historical baggage of the x86 architecture. I don't want to spend ink explaining them again, but our code must implement and obey its rules.
The first entry of the GDT is reserved and unused. The fourth is the video memory segment descriptor, which is not strictly necessary and can be ignored. So we only need to pay attention to the second and third entries:
- Kernel code segment descriptor;
- Kernel data segment descriptor.
We use the dd pseudo-instruction to define these two segment descriptors:
CODE_DESC:
dd DESC_CODE_LOW_32
dd DESC_CODE_HIGH_32
DATA_DESC:
dd DESC_DATA_LOW_32
dd DESC_DATA_HIGH_32
DESC_CODE_LOW_32, DESC_CODE_HIGH_32, DESC_DATA_LOW_32 and DESC_DATA_HIGH_32 are all defined in src/boot/boot.inc, and you can verify each bit against the tutorial referenced above. Again, this is boring, tedious and meticulous work, but it cannot be avoided. It is not difficult. All it takes is the patience to read the manual.
For readers who are not very familiar with assembly, it is worth explaining what dd does. dd means define double word (4 bytes), and similarly there are db (byte) and dw (word, 2 bytes). When they appear in assembly source, the data that follows is written into the assembled binary at that position. Here you can once again see the relationship between assembly source and the resulting binary, which is almost a literal translation.
Enter protected mode
After setting up the GDT, we can enter protected mode:
; enable A20
in al, 0x92
or al, 0000_0010b
out 0x92, al
; load GDT
lgdt [gdt_ptr]
; enable protected mode - set cr0 bit 0
mov eax, cr0
or eax, 0x00000001
mov cr0, eax
; refresh pipeline
jmp dword SELECTOR_CODE:protection_mode_entry
Note that the lgdt instruction loads the GDT, and that setting bit 0 of cr0 turns on protected mode. Afterwards, a far jump initializes the cs segment register to the kernel code segment. Note that the value of cs cannot be changed with a mov instruction. It must be set implicitly through a jump.
After the jump, execution arrives at protection_mode_entry, where the data segment registers are initialized to the kernel data segment:
protection_mode_entry:
; set data segments
mov ax, SELECTOR_DATA
mov ds, ax
mov es, ax
mov ss, ax
; set video segment
mov ax, SELECTOR_VIDEO
mov gs, ax
At this point, the initialization of protected mode is complete. Next comes the setup_page function, a key part of the loader, which starts building the kernel's virtual memory. That is left for the next article.