The Intel 8088 Processor's Instruction Prefetch Circuitry: A Look Inside

  • Introduction: In 1979, Intel introduced the 8088 microprocessor. IBM's use of it in the IBM PC in 1981 was a crucial event, leading to the dominance of the x86 architecture. The 8088 increased performance through prefetching. The author has been reverse-engineering the 8088 from die photos and discusses its prefetch circuitry.
  • Die Details: The die photo shows the 8088 microprocessor under a microscope. The metal layer on top is visible, hiding the silicon and polysilicon beneath it. Bond wires connect the die's pads to external pins. The chip is partitioned into a Bus Interface Unit (BIU) at the top and an Execution Unit (EU) below. The BIU handles memory accesses and fetches instructions, which are transferred to the EU via the queue bus.
  • Architectural Differences: The 8086 and 8088 present the same 16-bit architecture to the programmer. The key difference is that the 8088 has an 8-bit data bus instead of the 16-bit bus of the 8086. This reduces performance but enables cheaper hardware and allows reuse of 8-bit I/O circuitry. Internally, they are similar except for differences in the Bus Interface Unit and a few microcode differences.
  • Prefetching and Architecture: As memory became slower than the CPU, fetching instructions became a bottleneck, and the 8086 was one of the first microprocessors to prefetch instructions. The 8088 also has a prefetch queue. Prefetching led to the division of the processor into the BIU and the EU. The BIU contains the prefetch queue and an adder for address calculation. The EU executes instructions and holds most of the registers.
  • Implementing the Queue: The 8088's prefetch queue is implemented with four 8-bit queue registers and two hardware pointers, one for reading and one for writing. The pointers alone are ambiguous: when they are equal, the queue could be either empty or full. The queue length is therefore determined from a two-bit value together with an MT (empty) flip-flop. The queue circuitry takes up a substantial part of the die.
  • The Loader: The loader provides synchronization between the prefetch queue and the instruction decoder. It uses a small state machine to fetch bytes at the right time and generate timing signals. It also avoids slowdowns in microcode execution by fetching the next instruction in advance.
  • Microcode and the Prefetch Queue: The loader fetches opcode and Mod R/M bytes. Microcode uses a separate mechanism to fetch additional instruction bytes. A jump or control flow change flushes the prefetch queue. The Instruction Pointer points to the next instruction to be fetched, not executed.
  • The Queue Registers: The 8086 and 8088 partition their registers into groups. The queue registers are physically part of the upper registers but are wired differently. Intel used simulations to determine the best queue sizes. The queue control circuitry differs between the two processors.
  • Notes and References: Various notes and references are provided about x86 domination, memory speed, caches, design decisions, queue read process, DeMorgan's laws, memory reads/writes, constant ROM, and the width of the 8088's queue registers.
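
The empty/full ambiguity described under "Implementing the Queue" can be sketched in software. The following is a minimal Python model, not the actual circuit: the class name `PrefetchQueue`, its method names, and the choice to reset both pointers to zero on a flush are assumptions made for illustration. It only shows why two 2-bit pointers need an extra "empty" flag (called `mt` here, after the MT flip-flop) to tell a length of 0 from a length of 4.

```python
class PrefetchQueue:
    """Toy model of a 4-byte circular queue (hypothetical names, not Intel's)."""

    SIZE = 4  # four 8-bit queue registers, as stated in the summary

    def __init__(self):
        self.regs = [0] * self.SIZE
        self.read_ptr = 0    # 2-bit pointer: next register to read
        self.write_ptr = 0   # 2-bit pointer: next register to write
        self.mt = True       # "empty" flag; needed when the pointers are equal

    def length(self):
        # Two-bit difference between the pointers, modulo the queue size.
        diff = (self.write_ptr - self.read_ptr) % self.SIZE
        if diff == 0:
            # Equal pointers mean either empty or full; the flag decides.
            return 0 if self.mt else self.SIZE
        return diff

    def push(self, byte):
        """Model a prefetched byte arriving from memory."""
        if self.length() == self.SIZE:
            raise OverflowError("queue full")
        self.regs[self.write_ptr] = byte & 0xFF
        self.write_ptr = (self.write_ptr + 1) % self.SIZE
        self.mt = False

    def pop(self):
        """Model the execution side consuming one instruction byte."""
        if self.mt:
            raise IndexError("queue empty")
        byte = self.regs[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % self.SIZE
        if self.read_ptr == self.write_ptr:
            self.mt = True  # the read caught up with the write: now empty
        return byte

    def flush(self):
        """Model a jump discarding prefetched bytes (reset to 0 is an assumption)."""
        self.read_ptr = self.write_ptr = 0
        self.mt = True


if __name__ == "__main__":
    q = PrefetchQueue()
    for b in (0x90, 0xEB, 0xFE, 0x00):
        q.push(b)
    # The pointers are equal again here, yet the queue is full, not empty.
    print(q.read_ptr == q.write_ptr, q.length())
    q.flush()
    print(q.length())
```

After four pushes the write pointer wraps around to meet the read pointer, which is exactly the state an empty queue starts in; only the flag distinguishes the two cases.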