Microprocessor Design/Register File
Registers are temporary storage locations inside the CPU that hold data and addresses.
The register file is the component that contains all the general purpose registers of the microprocessor. A few CPUs also place special registers such as the PC and the status register in the register file. Other CPUs keep them separate.
When designing a CPU, some people distinguish between "architectural features" and "implementation details". The "architectural features" are the programmer-visible parts; if someone makes a new system where any of these parts differ from the old CPU, then suddenly all the old software won't work on the new CPU. The "implementation details" are the parts that, although we put even more time and effort into getting them to work, can be implemented differently in a new system while still keeping software compatibility. Some programs may run a little faster and other programs may run a little slower, but they all produce the same results as on the earlier machine.
The programmer-visible register set has more impact on software compatibility than any other part of the datapath, and perhaps more than any other part of the entire computer. The architectural features of the programmer-visible register set are the number of registers, the number of bits in each register, and the logical organization of the registers. Assembly language programmers like to have many registers. Early microprocessors had painfully few registers, limited by the area of the chip. Today, many chips have room for huge numbers of registers, so the number of programmer-visible registers is limited by other constraints:
- More programmer-visible registers require bigger operand fields.
- More programmer-visible registers require more time for saving and restoring registers on an interrupt or context switch.
- Software compatibility requires keeping exactly the same number, size, and organization of programmer-visible registers.
Assembly language programmers like a "flat" address space, where the full address of any location in (virtual) memory fits in a single address register. And so the amount of (virtual) memory desired by an architect sets a minimum width for each address register.
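The first two constraints can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a hypothetical fixed-width instruction with three register operand fields; the function names and the example sizes are illustrative and do not describe any particular CPU.

```python
def register_field_bits(num_registers: int) -> int:
    """Bits one operand field needs to name any one of num_registers."""
    return max(1, (num_registers - 1).bit_length())


def opcode_bits_left(instr_bits: int, num_registers: int, operands: int = 3) -> int:
    """Instruction bits remaining after reserving one field per register operand."""
    return instr_bits - operands * register_field_bits(num_registers)


def context_save_bytes(num_registers: int, register_bits: int) -> int:
    """Bytes written to memory when every register is saved on a context switch."""
    return num_registers * register_bits // 8


if __name__ == "__main__":
    # Hypothetical 32-bit instruction word and 32-bit registers.
    for count in (8, 16, 32, 64, 128):
        print(count,
              register_field_bits(count),
              opcode_bits_left(32, count),
              context_save_bytes(count, 32))
```

Each doubling of the register count costs one more bit in every operand field and doubles the amount of state to save and restore.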
The idea of "general registers" (a group of registers, any one of which can, at different times, operate as a stack pointer, index register, accumulator, program counter, etc.) was invented around 1971.
The overall performance of many single-chip CPUs is limited by the speed of the read operation of the register file.
A simple register file is a set of registers and a decoder. The register file requires an address and a data input.
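As a behavioural illustration only, the simple register file can be sketched in software. This sketch assumes that the selected register loads the data input on every clock; the class and method names are hypothetical, and real hardware would be described in an HDL rather than Python.

```python
class SimpleRegisterFile:
    """A set of registers plus a decoder, with one address and one data input."""

    def __init__(self, num_registers: int = 4, width: int = 8):
        self.mask = (1 << width) - 1
        self.regs = [0] * num_registers

    def decode(self, address: int) -> list:
        """One-hot decoder: exactly one select line is high for a valid address."""
        if not 0 <= address < len(self.regs):
            raise IndexError("register address out of range")
        return [i == address for i in range(len(self.regs))]

    def clock(self, address: int, data_in: int) -> None:
        """On each clock, the register chosen by the decoder loads data_in."""
        for i, selected in enumerate(self.decode(address)):
            if selected:
                self.regs[i] = data_in & self.mask  # truncate to register width


rf = SimpleRegisterFile()
rf.clock(2, 0x1FF)   # only the low 8 bits fit in the register
print(rf.regs)
```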
However, this simple register file isn't useful in a modern processor design, because there are some occasions when we don't want to write a new value to a register. Also, we typically want to read two values at once and write one value back in a single cycle. Consider the following equation:
$$C = A + B$$
To perform this operation, we want to read two values from the register file, A and B. We also have one result that we want to write back to the register file when the operation has completed. For cases where we do not want to write any value to the register file, we add a control signal called Read/Write. When the control signal is high, the data is written to a register, and when the control signal is low, no new values are written.
In this case, it is likely advantageous to give the register file a third address port for the write address, alongside the two read address ports.
Many people choose to use a 3-port register file for their pipelined microprocessor so it can execute such ALU instructions every cycle. Every cycle the CPU reads values from 2 registers in the register file to prepare for operating on them as directed by one instruction, and simultaneously the CPU writes the results from some previous instruction into some other register in the register file. (Superscalar processors and VLIW processors require a register file with 6 or more ports.)
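The behaviour of such a 3-port register file can be sketched as follows. This is a minimal model, assuming that both reads see the values held before the write of the same cycle takes effect; that ordering, along with the class and parameter names, is an assumption made for illustration.

```python
class ThreePortRegisterFile:
    """Two read ports and one write port, with a write-enable control signal."""

    def __init__(self, num_registers: int = 32, width: int = 32):
        self.mask = (1 << width) - 1
        self.regs = [0] * num_registers

    def cycle(self, read_addr_a, read_addr_b, write_addr, write_data, write_enable):
        """Model one clock cycle: read two registers, optionally write a third."""
        a = self.regs[read_addr_a]          # read port A
        b = self.regs[read_addr_b]          # read port B
        if write_enable:                    # control signal high: store the data
            self.regs[write_addr] = write_data & self.mask
        return a, b                         # values as they were before the write


rf = ThreePortRegisterFile(num_registers=8, width=8)
rf.cycle(0, 0, 1, 5, True)                  # r1 = 5
rf.cycle(0, 0, 2, 7, True)                  # r2 = 7
a, b = rf.cycle(1, 2, 0, 0, False)          # read r1 and r2, write nothing
rf.cycle(0, 0, 3, a + b, True)              # write the sum back to r3
print(rf.regs[3])
```

In a pipelined CPU the read and the write in one call to `cycle` would belong to different instructions: the read addresses come from the instruction being decoded, while the write address and data come from an earlier instruction that is finishing.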
(Does a microprocessor with operand forwarding ever read from a register and write to the same register during the same clock cycle?)
Full-custom register files often start with a SRAM design. Like SRAM chips, SRAM-based register files include a differential pair of bit-lines for each port -- but instead of a single SRAM read/write port, register files typically have at least one dedicated write port and several dedicated read ports. A typical register file repeats this process every instruction cycle: First, the two bit-lines of a differential pair of each read port are shorted to each other and charged to Vdd/2 during a pre-charge phase. Then the word-line connects one bit cell to those lines -- slightly imbalancing the charge on those long bit lines. Finally, the CPU enables the sense amplifier which magnifies the slight imbalance to normal digital logic levels.
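The three read phases can be illustrated with a toy numerical model. This is only a sketch: the supply voltage, the size of the bit-line swing, and the function name are assumed values chosen for illustration, and the model ignores all real circuit behaviour such as capacitance, timing, and noise.

```python
def differential_read(stored_bit: int, vdd: float = 1.0, swing: float = 0.05):
    """Toy model of one read port reading one bit cell."""
    # Phase 1, pre-charge: both bit-lines are shorted together at Vdd/2.
    bit_line = bit_line_bar = vdd / 2

    # Phase 2, word-line asserted: the cell nudges the two lines slightly apart.
    delta = swing if stored_bit else -swing
    bit_line += delta
    bit_line_bar -= delta

    # Phase 3, sense amplifier: resolve the small difference to a logic level.
    sensed = 1 if bit_line > bit_line_bar else 0
    return sensed, bit_line, bit_line_bar


for bit in (0, 1):
    print(bit, differential_read(bit))
```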
More registers than you can shake a stick at
Consider a situation where the machine word is very small, and therefore the available address space for registers is very limited. If we have a machine word that can only accommodate 2 bits of register address, we can only address 4 registers. However, register files are small to implement, so we have enough space for 32 registers. There are several solutions to this dilemma -- several ways of increasing performance by using many registers, even though we don't quite have enough bits in the instruction word to directly address all of them.
Some of those solutions include:
- special-purpose registers that are always used for some specific instruction, and so that instruction doesn't need any bits to specify that register.
- In almost every CPU, the program counter PC and the status register are treated differently than the other registers, with their own special set of instructions.
- separating registers into two groups, "address registers" and "data registers", so an instruction that uses an address needs only enough bits to select one of the address registers. If the two groups are the same size, that is 1 bit fewer than selecting one out of all the registers.
- register windowing as on SPARC
- using a "register bank".
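To compare how many instruction bits some of these approaches need per register operand, here is a small sketch. It assumes a hypothetical machine with 32 registers and an even 16 + 16 split between address and data registers; the numbers are illustrative only.

```python
def select_bits(choices: int) -> int:
    """Bits needed to pick one item out of `choices` (0 when the choice is implied)."""
    return (choices - 1).bit_length()


TOTAL_REGISTERS = 32  # hypothetical register count

bits_per_operand = {
    "unified register file": select_bits(TOTAL_REGISTERS),
    "address/data split (16 + 16)": select_bits(TOTAL_REGISTERS // 2),
    "implied special-purpose register": select_bits(1),
}

for scheme, bits in bits_per_operand.items():
    print(f"{scheme}: {bits} bits")
```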
The term "register bank" is used in two different senses. Some CPUs have several different groups (or "banks") of registers, rather than one monolithic register file.
One kind of "register banking" is similar to (main memory) bank switching. Because this kind is visible to the assembly language programmer, in this chapter we'll call it "architectural banked registers". Other kinds of "register banking" allow a CPU to run exactly the same software as a single-bank implementation (they are invisible to the assembly language programmer), so in this chapter we'll call them "microarchitectural banked registers". These implementation options have various advantages over a monolithic register file implementation.
Architectural banked registers
As in the example above, suppose the machine word can only accommodate 2 bits of register address, so only 4 registers can be addressed, yet there is enough space on the chip for 32 registers. One solution to this dilemma is to use a register bank, which consists of a series of register files combined together.
A register bank contains a number of register files or pages. Only one page can be active at a time, and there are additional instructions added to the ISA to switch between the available register pages. Data values can only be written to and read from the currently active register page, but instructions can exist to move data from one page to another.
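A behavioural sketch of such a paged register bank is shown below. The class, the method names, and the sizes (8 pages of 4 registers) are hypothetical; `select_page` and `move_between_pages` stand in for whatever page-switching and inter-page move instructions a real ISA would define.

```python
class PagedRegisterBank:
    """Several register pages, of which only one is active at a time."""

    def __init__(self, num_pages: int = 8, regs_per_page: int = 4):
        self.pages = [[0] * regs_per_page for _ in range(num_pages)]
        self.active = 0

    def select_page(self, page: int) -> None:
        """Models the added instruction that switches the active page."""
        if not 0 <= page < len(self.pages):
            raise ValueError("no such register page")
        self.active = page

    def read(self, reg: int) -> int:
        """Ordinary instructions only see the active page."""
        return self.pages[self.active][reg]

    def write(self, reg: int, value: int) -> None:
        self.pages[self.active][reg] = value

    def move_between_pages(self, src_page, src_reg, dst_page, dst_reg) -> None:
        """Models an instruction that copies a value from one page to another."""
        self.pages[dst_page][dst_reg] = self.pages[src_page][src_reg]


bank = PagedRegisterBank()
bank.write(1, 42)                    # written to page 0
bank.select_page(3)
print(bank.read(1))                  # page 3 has its own register 1
bank.move_between_pages(0, 1, 3, 2)
print(bank.read(2))                  # value copied from page 0
```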
The current page can be pictured as a window (the gray box in the figure) that can be moved up and down the register bank.
If the register bank has N registers, and a page can only show M registers (with N > M), we can address registers with two values, n and m respectively. We can define these values as:
$$n = \lceil \log_2 N \rceil, \qquad m = \lceil \log_2 M \rceil$$
In other words, n and m are the number of bits required to address N and M registers, respectively. We can break the full register address down as follows:
$$n = p + m$$
where p is the number of bits reserved to specify the current register page. The current register address is simply the concatenation of the page address and the register address.
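The concatenation of page address and register address can be sketched with a shift and an OR. This sketch assumes N and M are powers of two (so that n = p + m holds exactly) and reuses the running example of 32 registers seen 4 at a time; the function name is hypothetical.

```python
def physical_register_address(page: int, reg: int, m: int) -> int:
    """Concatenate the page number (high bits) with the in-page register number (low m bits)."""
    if not 0 <= reg < (1 << m):
        raise ValueError("register number does not fit in m bits")
    return (page << m) | reg


N, M = 32, 4                    # registers in the bank, registers visible per page
n = (N - 1).bit_length()        # 5 bits address the whole bank
m = (M - 1).bit_length()        # 2 bits address a register within a page
p = n - m                       # 3 bits select the page

print(n, m, p)
print(bin(physical_register_address(0b101, 0b11, m)))   # page 5, register 3
```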
Microarchitectural banked registers
There are several techniques that divide up the physical registers into several physical register files ("banks"), using control logic to make the CPU as a whole still software-compatible with other implementations of the same CPU architecture that use only one monolithic register file. In modern high-performance CPUs, such techniques are used to build software-compatible CPUs that run at higher speed, use less power, and require less area than implementing the same CPU architecture with a single conventional multiported register file.
Replicating the entire register file, as in the POWER2 and the Alpha 21264 and the Alpha 21464, is the simplest kind of microarchitectural register banking. These CPUs implemented their register file internally with 2 copies of the entire (architectural) register file, and connected half the functional units to each copy. This requires each register file to have the same number of write ports as a monolithic register file (since all writes are sent to *both* copies to keep them in sync), but each register file requires only half as many read ports as a monolithic register file.
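The replication scheme can be sketched behaviourally as follows. The class name, the notion of a numbered "cluster" of functional units, and the port counts in the usage lines are assumptions for illustration; they do not describe the actual POWER2 or Alpha designs.

```python
class ReplicatedRegisterFile:
    """Several identical copies of the architectural register file."""

    def __init__(self, num_registers: int = 32, copies: int = 2):
        self.copies = [[0] * num_registers for _ in range(copies)]

    def write(self, reg: int, value: int) -> None:
        """Every write is broadcast to all copies so they stay in sync."""
        for copy in self.copies:
            copy[reg] = value

    def read(self, cluster: int, reg: int) -> int:
        """Functional units in a cluster read only from their local copy."""
        return self.copies[cluster][reg]


def ports_per_copy(read_ports: int, write_ports: int, copies: int = 2):
    """Read ports are divided among the copies; write ports are not."""
    return read_ports // copies, write_ports


rf = ReplicatedRegisterFile()
rf.write(5, 123)
print(rf.read(0, 5), rf.read(1, 5))   # both clusters see the same value
print(ports_per_copy(8, 4))           # hypothetical monolithic file: 8 read, 4 write ports
```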
- "Computer architecture: fundamentals and principles of computer design" by Joseph D. Dumas 2006 page 111.
- "general registers" were invented by C. Gordon Bell and Allen Newell as they were working on their book, Computer Structures: Readings and Examples (1971). -- Frederik Nebeker. "More Treasured Texts" article. "IEEE Spectrum" 2003 July.
- Larry R. Fenstermaker. "Current mode sense amplifiers applied to dual port register files". 1998.
- Akshay Vijayashekar and Hasan Ali. "Optimized Register File Implementation". quote: "The purpose of this thesis is to implement a full custom, low power and area efficient register file for an Atmel 32-bit microcontroller. ... The size of the register file is 32 words of 32 bits each with two write ports and four read ports."
- Norman P. Jouppi and Jeffrey Y. F. Tang "A 20-MIPS Sustained 32-bit CMOS Microprocessor with High Ratio of Sustained to Peak Performance". doi:10.1.1.85.988. 1989. Section "4.2. The Register File Sense Amplifier". p. 10-12.
- Robert Reese. "Register Files". 1999.
- "What does banking mean when applied to registers?"
- Jessica H. Tseng and Krste Asanović. "Banked Multiported Register Files for High-Frequency Superscalar Microprocessors". 2003.
- Il Park, Michael D. Powell, and T. N. Vijaykumar. "Reducing Register Ports for Higher Speed and Lower Energy". 2002.