SPARC Assembly/SPARC Details


RISC Computers

Registers

SPARC processors make 32 integer registers visible to a program at any one time. These registers fall into four basic categories: globals, locals, inputs, and outputs. The table below shows the general breakdown:

Number      Specific name   Purpose
%r0–%r7     %g0–%g7         Globals: accessible anywhere in a program
%r8–%r15    %o0–%o7         Outputs: used to pass values to, and obtain values from, subroutines
%r16–%r23   %l0–%l7         Locals: used within subroutines to manipulate data
%r24–%r31   %i0–%i7         Inputs: contain data passed to a subroutine

Several registers within these categories have special purposes:

Name                    Number   Pseudonym    Purpose
Stack pointer           14       %sp / %o6    Pointer to the top of the stack.
Frame pointer           30       %fp / %i6    Pointer to the current stack frame.
Return address          31       %i7          Address of the call instruction that invoked the current subroutine; the subroutine returns to %i7 + 8.
Called return address   15       %o7          Address of the most recent call instruction issued by the current subroutine; the called subroutine uses it to return.
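
As a rough illustration of how these registers appear around a subroutine call, consider the sketch below. It assumes 32-bit SPARC (V8) conventions, the usual save and restore instructions, and a minimal 96-byte stack frame; add_one is a hypothetical routine name used only for this example. The nop and restore instructions occupy delay slots, which are described in the Delayed Branch section.

```asm
        ! Caller
        mov     41, %o0             ! argument goes in an output register
        call    add_one             ! call stores its own address in %o7
        nop                         ! delay slot
        ! on return, the result is in %o0

        ! Callee (hypothetical routine)
add_one:
        save    %sp, -96, %sp       ! new frame: the caller's %o0-%o7 become %i0-%i7,
                                    ! so the caller's %sp (%o6) is now %fp (%i6)
        add     %i0, 1, %i0         ! compute the result in %i0
        ret                         ! return to %i7 + 8, past the call and its delay slot
        restore                     ! delay slot: the caller sees %i0 as its %o0 again
```

Because the caller's %o7 becomes the callee's %i7 after save, the same address appears under both "return address" names, depending on which side of the call is looking at it.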

As the tables above show, each register has at least two names, and some of the special-purpose registers have three (for example, %r14, %o6, and %sp all name the same register). Any of a register's names is acceptable in any context, and it is up to the programmer to choose which one to use. However, using the stack pointer and frame pointer registers for anything other than their intended purpose is not recommended and can cause serious bugs in a program.
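
As a small illustration of this naming equivalence, each pair of instructions below should assemble to the same machine code, since the numbered and named forms refer to the same physical registers. The particular registers chosen are arbitrary and serve only as an example.

```asm
        add     %r8, %r9, %r16      ! numbered names
        add     %o0, %o1, %l0       ! same registers, category names

        mov     %r14, %l1           ! %r14 ...
        mov     %sp, %l1            ! ... is %o6, which is also %sp
```

In practice the category names (%g, %o, %l, %i) and the %sp and %fp pseudonyms tend to be easier to read, because they show what role each register plays.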

SPARC processors also contain an array of floating-point registers and a small number of special-purpose registers. (further description needed here)

The Fetch and Execute Instruction Cycle

Delayed Branch

SPARC processors are pipelined, and branching is accomplished through a technique called delayed branch execution. A Control Transfer Instruction (CTI) is any instruction that changes the program counter. For instance, jmp and call are both CTIs.

In SPARC, when a CTI is executed, the transfer of control does not happen immediately. Instead, there is a one-instruction delay before the branch takes effect. This means that the first instruction after the CTI is executed before control reaches the branch target. Here is an example:

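add %r3, %r2, %r5      ! executes first
call SetR5ToZero       ! the transfer is delayed by one instruction
add %r4, %r5, %r2      ! delay slot: executes before SetR5ToZero begins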

Notice that the last add executes before control transfers to SetR5ToZero, not after the subroutine returns. The position immediately after a CTI is called the delay slot. A simple and common practice is to fill the delay slot with a special instruction that performs no task, called a no-operation, or nop.

Instruction:
nop

This instruction performs no action, so it does not matter that it executes before the branch takes effect. However, putting a nop after every branch instruction wastes a lot of processor cycles. Where possible, it is good practice to move a useful instruction into the delay slot instead, provided that the instruction does not depend on the branch having been taken and the branch does not depend on the instruction's result.
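
The following sketch shows one way a delay slot might be filled. The subroutine name compute and the choice of registers are hypothetical; the point is only that the add does not depend on the call, so it can safely move into the delay slot.

```asm
        ! Delay slot wasted on a nop
        add     %l0, %l1, %o0       ! prepare the argument
        call    compute             ! hypothetical subroutine
        nop                         ! delay slot does no useful work

        ! Same effect, one instruction shorter
        call    compute
        add     %l0, %l1, %o0       ! delay slot: runs before compute's first instruction
```

In both versions %o0 holds the sum by the time compute starts executing, but the second version avoids spending an instruction on a nop.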


The Stack