Microprocessor Design/Virtual Memory
Virtual Memory is a computer concept where the main memory is broken up into a series of individual pages. Those pages can be moved in memory as a unit, or they can even be moved to secondary storage to make room in main memory for new data. In essence, virtual memory allows a computer to use more RAM than it has available.
If a simple virtual==physical address path is adequate for your CPU, you don't need virtual memory.
Most processors have a very simple address path -- address bits come from the PC or some other programmer-visible register, or directly from some instruction, and they are directly applied to the address bus.
Many general-purpose processors have a more complex address path: user-level programs run as if they have a simple address path, but the physical address applied to the address bus is significantly different than the programmer-visible address. This enables virtual memory, memory protection, and other desirable features.
If your CPU needs to do this, then you need something to translate user-visible addresses to physical address -- either design the CPU to connect to some off-chip bank register or MMU (such as the 8722 MMU or the 68851 MMU) or design in an on-chip bank register or MMU.
You may want to do this in order to:
- support various debug tools that trap on reads or writes to selected addresses.
- allow access to more RAM (wider physical address) than the user-level address seems to support (banking)
- support many different programs all in RAM at the same time at different physical RAM locations, even though they were all compiled to run at location 0x300.
- allow a program to successfully read and write a large block of data using normal LOAD and STORE instructions as if it were all in RAM, even though the machine doesn't have that much RAM (paging with virtual memory)
- support a "protected" supervisor-level system that can run buggy or malicious user-level code in an isolated sandbox at full speed without damaging other user-level programs or the supervisor system itself -- Popek and Goldberg virtualization, W xor X memory protection, etc.
- or some combination of the above.
Implementation[edit | edit source]
Virtual memory can be implemented both in hardware and (to a lesser degree) in software, although many modern implementations have both hard and soft components. We discuss virtual memory here because many modern PC and server processors have virtual memory capabilities built in.
Paging systems are designed to be transparent, that is, the (user-mode) programs running on the microprocessor do not need to be explicitly aware of the paging mechanism to operate correctly.
Many processor systems give pages certain qualifiers to specify what kinds of data can be stored in the page. For instance, many new processors specify whether a page contains instructions or data, so that data pages cannot be executed as instructions, and instructions cannot be corrupted by data writes (see W^X).
The hardware part of virtual memory is called the memory management unit (MMU). Most MMUs have a granularity of one page.
A few CPU designs use a more fine-grained access control to detect and prevent buffer overflow bugs, a common security vulnerability.
Memory Accessing[edit | edit source]
Memory addresses correspond to a particular page, and an offset within that page. If a page is 212 bytes in a 32-bit computer, then the first 20 bits of the memory address are the page address, and the lower 12 bits are the offset of the data inside that page. The top 20 bits in this case will be separated from the address, and they will be replaced with the current physical address of that page. If the page does not exist in main memory, the processor (or the paging software) will retrieve the page from secondary storage, which can cause a significant delay.
Pages[edit | edit source]
A page is a basic unit of memory, typically several kilobytes or larger. A page may be moved in memory to different locations, or if it is not being used, it can frequently be moved to secondary storage instead. The area in the secondary storage is typically known as the page file, the "scratchpad", or something similar.
Page Table[edit | edit source]
The addresses of the various pages are stored in a paging table. The paging table itself can be stored in a memory unit inside the processor, or it can reside in a dedicated area of main memory.
Page Faults[edit | edit source]
A page fault occurs when the processor cannot find a page in the page table.
Translation Look-Aside Buffer[edit | edit source]
The translation look-aside buffer (TLB) is a small structure, similar to a cache, that stores the addresses of the most recently used pages. Looking up a page in the TLB is much faster than searching for the page in the page table. When the processor cannot find a particular page in the TLB, it is known as a "TLB Miss". When the TLB misses, the processor looks for the page in the page table. If the page is not in the table either, there is a page fault.
Notice that even though the TLB can be considered a kind of cache, caching part of the page table stored in main memory, it is a physically separate structure than the instruction cache or the data cache, and has several features not found in those caches.
TLB entry[edit | edit source]
The SRAM in the TLB can be seen as entirely composed of TLB entries. The format of the TLB entries in the TLB SRAM is fixed by the TLB hardware. The paging supervisor -- part of the operating system -- typically maintains a page table in main memory which stores the page table entries in exactly the same format as TLB entries. Each TLB entry contains:
- the virtual address of a page (analogous to the "tag" in a cache)
- the physical address of a page (analogous to the "data" in a cache)
While not essential, some TLB hardware has many other optional control and protection fields and flags in the TLB, including:
- the no-execute bit (NX bit), used to implement W^X ("Write XOR Execute")
- a "dirty bit" (also called the "modified bit"), set whenever there is a STORE written into that page, and typically cleared when the modified page is written to the backing store.
- the writable bit, used to implement PaX, sometimes cleared and later set by the OS in order to implement copy-on-write (COW)
- which virtual address space a physical page belongs to (unnecessary on a single address space operating system)
- the supervisor bit
- statistics on which TLB entries were most recently or most frequently used, used to decide which TLB entry to discard when loading a new TLB entry from main memory
- statistics on which page was most recently or most frequently used, used to support LRU or more sophisticated page-replacement algorithms that decide which page currently in main memory to "page out" to the backing store when the OS needs to load some other page from the backing store into physical memory
The page table entries may include additional per-page fields that are not copied into the TLB entries, such as
- the "pinned bit" (aka "fixed flag") that indicates that a page must stay in main memory -- the paging supervisor marks as pinned pages that must stay in main memory, including the paging supervisor executable code itself, the device drivers for the secondary storage devices on which pages are swapped out; interrupt handler executable code. Some data buffers are also pinned during I/O transactions during the time that devices outside the CPU read or write those buffers (direct memory access and I/O channel hardware).
- a "present" bit (clear when that particular virtual page does not currently exist in physical main memory)
Further reading[edit | edit source]
- Thomas W. Barr, Alan L. Cox, Scott Rixner. "Translation Caching: Skip, Don’t Walk (the Page Table)". describes the Intel x86-64 MMU cache, the AMD Page Walk Cache, 3 other MMU cache arrangements, and compares their performance.
- B. Jacob, and T. Mudge. "Virtual memory in contemporary microprocessors". IEEE Micro 1998 July.
- Albert Kwon, Udit Dhawan, Jonathan M. Smith, Thomas F. Knight, Jr., and André DeHon. "Low-Fat Pointers: Compact Encoding and Efficient Gate-Level Implementation of Fat Pointers for Spatial Safety and Capability-based Security". 2013.