Microprocessor Design/Design Steps
When designing a new microprocessor or microcontroller unit, there are a few general steps that can be followed to make the process flow more logically. These few steps can be further sub-divided into smaller tasks that can be tackled more easily. The general steps to designing a new microprocessor are:
- Determine the capabilities the new processor should have.
- Lay out the datapath to handle the necessary capabilities.
- Define the machine code instruction format (ISA).
- Construct the necessary logic to control the datapath.
We will discuss each of these steps below:
Determine Machine Capabilities
Before you start to design a new processor element, it is important to first ask why you are designing it at all. What new thing will your processor do that existing processors cannot? Keep in mind that it is always less expensive to utilize an existing chip than to design and manufacture a new one.
Some questions to start:
- Is this chip an embedded chip, a general-purpose chip, or a different type entirely?
- What, if any, are the limitations in terms of resources, price, power, or speed?
With that in mind, we need to ask what our chip will do:
- Does it have integer, floating-point, or fixed point arithmetic, or a combination of all three?
- Does it have scalar or vector operation abilities?
- Is it self-contained, or must it interface with a number of external peripherals?
- Will it support interrupts? If so, How much interrupt latency is tolerable? How much interrupt-response jitter is tolerable?
We also need to ask ourselves whether the machine will support a wide array of instructions, or if it will have a limited set of instructions. More instructions make the design more difficult, but make programming and using the chip easier. On the other hand, having fewer instructions is easier to design, but can be harder and more costly to program.
Lay out the basic arithmetic operations you want your chip to have:
- Shifting and Rotating
- Logical Operations: AND, OR, XOR, NOR, NOT, etc.
List other capabilities that your machine has:
- Unconditional jumps
- Conditional Jumps (and what conditions?)
- Stack operations (Push, pop)
Once we know what our chip is supposed to do, it is easer to lay out the framework for our datapath
Design the Datapath
Right off the bat we need to determine what ALU architecture that our processor will use:
- A combination of the above 3
This decision, more than any other, is going to have the largest effect on your final design. Do not proceed in the design process until you have made this decision. Once you have your ALU architecture, you create your memory element (stack or register file), and you can lay out your ALU.
Once we have our basic datapath, we can start to design our ISA. There are a few things that we need to consider:
- Is this processor RISC, CISC, or VLIW?
- How long is a machine word?
- How do you deal with immediate values? What kinds of instructions can accept immediate values?
Once we have our machine code basics, we frequently need to determine whether our processor will be compatible with higher-level languages. Specifically, are there any instructions that can be used for function call and return?
Determining the length of the instruction word in a RISC is a very important matter, and one that is worth a considerable amount of thought. For additional flexibility you can utilize a variable-length instruction set instead — like most CISC machines — at the expense of additional—and more complicated—instruction decode logic. If the instruction word is too long, programmers will be able to fit fewer instructions into memory. If the instruction word is too small, there will not be enough room for all the necessary information. On a desktop PC with several megabytes or even gigabytes of RAM, large instruction words are not a big problem. On an embedded system however, with limited program ROM, the length of the instruction word will have a direct effect on the size of potential programs, and the usefulness of the chips.
Each instruction should have an associated opcode, and typically the length of the opcode field should be constant for all instructions, to reduce complexity of the decoder. The length of the opcode field will directly impact the number of distinct instructions that can be implemented. if the opcode field is too small, you won't have enough room to designate all your instructions. If your opcode is too large, you will be wasting precious bits in your instruction word.
Some instructions will need to be larger than others. For instance, instructions that deal with an immediate value, a memory location, or a jump address are typically larger than instructions that only deal with registers. Instructions that deal only with registers, therefore, will have additional space left over that can be used as an extension to the opcode field.
Example: MIPS R-Type
In the MIPS architecture, instructions that only deal with registers are called R type instructions. With 32 registers, a register address is only 5 bits wide. The MIPS opcode is 6 bits wide. With the opcode and the three register addresses (two source and 1 destination register), an R-type instruction only uses 21 out of the 32 bits available.
The additional 11 bits are broken into two additional fields: Shamt, a 5 bit immediate value that controls the amount of places shifted by a shift or rotate instruction, and Func. Func is a 6 bit field that contains additional information about R-Type instructions. Because of the availability of the Func field, all R-Type instructions share an opcode of 0.
Instruction Set Design
Picking a particular set of instructions is often more an art than a science.
Historically there have been different perspectives on what makes a "good" instruction set.
- The early CISC years focused on making instruction sets that expert assembly language programmers enjoyed programming -- " code density" was a common metric.
- the early RISC years focused on making instruction sets that ran a few benchmark programs in C, when compiled with relatively primitive compilers, really, really fast -- "cycles per instruction", and later "instructions per cycle" was recognized as an important part of achieving low "time to run the benchmark".
- The rise of multitasking operating systems (and shared-memory parallel processors) lead to the discovery of non-blocking synchronization and the instructions necessary to support it.
- CPUs dedicated to a single application (ASICs or FPGAs) led to the idea of customizing the CPU for one particular application
- The rise of viruses and other malware led to the recognition of the Popek and Goldberg virtualization requirements.
Build Control Logic
Once we have our datapath and our ISA, we can start to construct the logic of our primary control unit. These units are typically implemented as a finite state machine, and we can try to map the ISA to the control unit in a logical way.
Design the Address Path
If a simple virtual==physical address path is adequate for your CPU, you can skip this section.
Most processors have a very simple address path -- address bits come from the PC or some other programmer-visible register, or directly from some instruction, and they are directly applied to the address bus.
Many general-purpose processors have a more complex address path: user-level programs run as if they have a simple address path, but the physical address applied to the address bus is significantly different than the programmer-visible address. If your CPU needs to do this, then you need something to translate user-visible addresses to physical address -- either design the CPU to connect to some off-chip bank register or MMU (such as the 8722 MMU or the 68851 MMU) or design in an on-chip bank register or MMU.
You may want to do this in order to:
- support various debug tools that trap on reads or writes to selected addresses.
- allow access to more RAM (wider physical address) than the user-level address seems to support (banking)
- support many different programs all in different physical RAM locations, even though they were all compiled to run at location 0x300.
- allow a program to successfully read and write a large block of data using normal LOAD and STORE instructions as if it were all in RAM, even though the machine doesn't have that much RAM (paging with virtual memory)
- support a "protected" supervisor-level system that can run buggy or malicious user-level code in an isolated sandbox at full speed without damaging other user-level programs or the supervisor system itself -- Popek and Goldberg virtualization, W xor X memory protection, etc.
- or some combination of the above.
Verify the design
People who design a CPU often spend more time on functional verification than all other steps combined.
- Kong and Patterson. "Instruction set design". 1995.
- "Generating instruction sets and microarchitectures from applications" by Ing-Jer Huang, and Alvin M. Despain