Software Engineers Handbook/Language Dictionary/DEC PDP-11
Note: All assembly language sample code depends on the development environment's assembler and operating system.
Digital Equipment Corporation PDP-11
Here is the Wikipedia entry.
Type
Execution Entry Point
On Unix, execution of a program would begin at the first word. The Unix "a.out" format (assembler output) had a header, and the first word of the header, 407 octal, was a branch around the rest of the header. By Fifth Edition Unix, the header did not actually get loaded in the execution image, so the 407 octal did not have to be executed; however, the "magic number" 0407 (octal) stuck as part of a.out format, even as the format moved to other computer architectures.
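To see why 0407 works as a branch around the header, the word can be decoded by hand. The sketch below is in Python rather than PDP-11 code. It assumes the usual PDP-11 branch layout (opcode in the high byte, signed word offset in the low byte) and an eight-word header; the function name is made up for illustration.

```python
def decode_branch(word, addr):
    """Split a PDP-11 branch word into (opcode byte, signed word offset, target)."""
    opcode = (word >> 8) & 0xFF      # high byte 0o001 is the unconditional br
    offset = word & 0xFF             # low byte: offset counted in words
    if offset & 0x80:                # sign-extend the 8-bit offset
        offset -= 0x100
    # The offset is relative to the PC after the fetch, that is addr + 2.
    target = (addr + 2 + 2 * offset) & 0xFFFF
    return opcode, offset, target


opcode, offset, target = decode_branch(0o407, 0)
print(oct(opcode), offset, oct(target))
```

With the header loaded at address 0, the target comes out as 020 octal (16 decimal), which is the first byte after an eight-word header.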
Registers
There were eight 16-bit registers, addressed by three bits in addressing modes in instructions. Register 7 was the program counter and register 6 was the stack pointer. Registers 0 through 5 were general-purpose. The stack grew downward.
Addressing modes included postincrement and predecrement. A popular myth assumes these to be the source of the increment and decrement operators in C, but in fact those were inherited from B, which was implemented before the PDP-11 existed.
A consequence of the fact that the PC was addressed as an ordinary register, and of the inclusion of the postincrement addressing mode, was that you could load a literal value into a register (or send it to memory, for that matter) by having the literal follow the instruction in memory.
mov (PC)+, R5
177265
would put 177265 octal in register 5. In the assembler (using Unix assembler syntax here), you could abbreviate this as
mov $177265, R5
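The literal trick can be traced step by step. The following is a minimal Python sketch, not an emulator: it handles only this one instruction form, the memory dictionary and the load address 01000 are invented for the example, and the word 012705 is the standard encoding of a move with postincrement on register 7 as source and register 5 as destination.

```python
def run_mov_immediate(memory, pc):
    """Execute one 'mov (PC)+, Rn' at address pc; return (register, value, new pc)."""
    instruction = memory[pc]
    pc += 2                               # after the fetch the PC points past the instruction
    if instruction >> 6 != 0o0127:        # mov, source mode 2 (postincrement) on register 7
        raise ValueError("not a mov (PC)+ instruction")
    if (instruction >> 3) & 0o7 != 0:     # this sketch only handles a register destination
        raise ValueError("destination is not a plain register")
    dest_reg = instruction & 0o7
    value = memory[pc]                    # (PC)+ reads the word the PC points at ...
    pc += 2                               # ... and then steps the PC over it
    return dest_reg, value, pc


# Instruction word followed by the literal, at a hypothetical address.
memory = {0o1000: 0o012705, 0o1002: 0o177265}
reg, value, pc = run_mov_immediate(memory, 0o1000)
print(reg, oct(value), oct(pc))
```

The point to notice is that the PC ends up past the literal, so the literal is never fetched as an instruction.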
General Syntax
In the Unix assembler, a colon followed each label.
foo: mov -(PC), -(PC) / copy this instruction to the previous
                      / location and branch there
     br foo           / branch to foo
Of course in the above example, the branch to foo would not have to be executed since the instruction before it is a (nasty) loop unto itself.
Instruction fields occurred on three-bit boundaries starting from the low-order side, so it was easy to remember some instructions and the addressing modes for programming in binary from the front-panel switches. For instance, machine language for
mov -(PC), -(PC)
was 014747 octal: the 01 is the opcode for move (which means copy), each 4 selects the predecrement addressing mode, and each 7 denotes the PC. These field boundaries did not hold for the branch instructions, however.
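Because every field of a two-operand instruction is one octal digit (with the opcode in the top bits), assembling one by hand is just a matter of placing digits. A small Python sketch of that packing follows; the constant and function names are invented here, while the mode numbers (0 register, 2 postincrement, 4 predecrement) are the standard PDP-11 ones.

```python
MOV = 0o01            # opcode field for a word move
REGISTER = 0          # Rn
POSTINCREMENT = 2     # (Rn)+
PREDECREMENT = 4      # -(Rn)
PC = 7


def encode_double_operand(opcode, src_mode, src_reg, dst_mode, dst_reg):
    """Pack opcode, source mode/register and destination mode/register into one word."""
    return (opcode << 12) | (src_mode << 9) | (src_reg << 6) | (dst_mode << 3) | dst_reg


# mov -(PC), -(PC)
print(format(encode_double_operand(MOV, PREDECREMENT, PC, PREDECREMENT, PC), "06o"))
# mov (PC)+, R5
print(format(encode_double_operand(MOV, POSTINCREMENT, PC, REGISTER, 5), "06o"))
```

The two lines printed are 014747 and 012705, which can be read off digit by digit against the field list.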
Comments
The DEC assemblers (MACRO-11, for example) allowed comments running from a semicolon to the end of the line. In the Unix assembler the character "/" was used instead.
Interrupts
On most models the processor status word (PSW) was addressed at a specific memory-mapped location: 177776 octal, which is -2 as a 16-bit address. The interrupt vectors were in low memory, each holding a new PC and a new PSW. An interrupt pushed the old PSW and then the return address onto the stack, so its stack frame differed from that of a subroutine call; this is why there was a return-from-interrupt instruction (rti) distinct from the normal return-from-subroutine instruction (rts).
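The difference between the two stack frames can be sketched in a few lines of Python. This is a toy model under stated assumptions: it ignores processor modes, priorities and memory management, the class and method names are invented, and the vector address and register values in the example are arbitrary.

```python
class ToyCpu:
    """Just enough state to compare an interrupt frame with a subroutine frame."""

    def __init__(self, pc, sp, psw):
        self.memory = {}
        self.pc, self.sp, self.psw = pc, sp, psw

    def push(self, word):
        self.sp -= 2                      # the stack grows downward
        self.memory[self.sp] = word

    def pop(self):
        word = self.memory[self.sp]
        self.sp += 2
        return word

    def interrupt(self, vector):
        old_pc, old_psw = self.pc, self.psw
        self.pc = self.memory[vector]         # first vector word: new PC
        self.psw = self.memory[vector + 2]    # second vector word: new PSW
        self.push(old_psw)                    # two words are saved ...
        self.push(old_pc)

    def rti(self):
        self.pc = self.pop()                  # ... so two words are restored
        self.psw = self.pop()

    def call(self, target):
        self.push(self.pc)                    # a plain call saves only the return address
        self.pc = target

    def ret(self):
        self.pc = self.pop()


cpu = ToyCpu(pc=0o2000, sp=0o1000, psw=0o4)
cpu.memory[0o60] = 0o3000                     # hypothetical handler address
cpu.memory[0o62] = 0o340                      # hypothetical handler PSW
cpu.interrupt(0o60)
print(oct(cpu.pc), oct(cpu.sp))
cpu.rti()
print(oct(cpu.pc), oct(cpu.sp), oct(cpu.psw))
```

Returning from an interrupt with the subroutine return would leave the saved PSW on the stack, which is the practical reason for the separate instruction.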
Conditional Statements
There were branches, which went to short relative addresses, and jump instructions, which could go to any address using any addressing mode. Only branches could be conditional, in which case they would depend on condition codes set by a previous compare or arithmetic instruction. There was no "branch never"; the encodings where one might have gone were used for other instructions. There was a "wait", which would wait until the next interrupt, and a "halt", which would give control to the console (if executed in kernel mode or on a machine without protection).
cmp r0, (r1) / compare the contents of register R0 to
             / what R1 points to in memory (two-byte word)
bne foo      / branch not equal, to foo
Input/Output
Devices were memory mapped in high addresses. You could use interrupts or polling to know when they were ready. There were several levels of interrupt (or bus request).
There was a graphics computer called the GT40, from before raster graphics became de rigueur. Its main processor was a PDP-11, with a coprocessor to do the vector graphics. The graphics processor had its own jump instruction and would be put in a loop to keep the screen refreshed. There were great lunar lander and space war programs.
User programs on Unix, of course, did I/O with system calls, which were trap instructions.
stdout = 1
.data
msg: <Hello, world.\n>
.text
mov $stdout, r0 / the file descriptor goes in r0
sys write; msg; 14. / 14. (decimal) is the length
/ don't bother checking for error or a write of less than
/ the full buffer.
clr r0 / exit with an OK status
sys exit
As documented for Sixth Edition Unix, the file descriptor for "sys write" and the status for "sys exit" were passed in r0; only the remaining arguments followed the trap instruction in memory.
The arguments to system calls usually followed the trap instruction in memory, yet those arguments often needed to vary at run time, and you can't write reentrant code if there are variables mixed in with your executable code. Unix therefore provided the "sys indir" call, which could point to a system call laid out in data space; Unix would interpret that call as though it had occurred inline.
Indirection
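Indirection was selected by the addressing mode of each operand, not chained at run time. The plain memory modes used a register as the address of the operand. The "deferred" modes (written with a leading * in the Unix assembler and @ in the DEC assemblers) added one further level: the register addressed a word in memory that in turn held the address of the operand. You could also index by a literal following the instruction.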
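The relationship between the plain and deferred modes is easiest to see by computing where the operand lives. The Python sketch below covers modes 1 through 5 for word operands only; the index modes 6 and 7 are left out because they also need to fetch a word through the PC. The function name and the sample memory contents are invented for illustration.

```python
def operand_address(mode, reg, regs, memory):
    """Return the address of a word operand for modes 1-5, updating the register."""
    if mode == 1:                          # (Rn): register holds the operand's address
        return regs[reg]
    if mode == 2:                          # (Rn)+: use the address, then step the register
        addr = regs[reg]
        regs[reg] = (addr + 2) & 0xFFFF
        return addr
    if mode == 3:                          # *(Rn)+: register points at a word holding the address
        pointer = regs[reg]
        regs[reg] = (pointer + 2) & 0xFFFF
        return memory[pointer]
    if mode == 4:                          # -(Rn): step the register back, then use it
        regs[reg] = (regs[reg] - 2) & 0xFFFF
        return regs[reg]
    if mode == 5:                          # *-(Rn): step back, then follow the pointer found there
        regs[reg] = (regs[reg] - 2) & 0xFFFF
        return memory[regs[reg]]
    raise ValueError("mode not covered by this sketch")


regs = [0] * 8
regs[1] = 0o100
memory = {0o100: 0o200, 0o200: 0o55}
print(oct(operand_address(1, 1, regs, memory)))   # one level: the operand is at 0100
print(oct(operand_address(3, 1, regs, memory)))   # two levels: the operand is at 0200
```

In the second call the register is also advanced by one word, just as in the non-deferred postincrement mode.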
Physical Structure
On Unix, assembly language usually used the .s extension (for "source"). The result of assembly was called a.out unless you told the assembler otherwise, in which case the convention was to use a .o suffix (for "object"). Typically the linker "ld" (for "loader") would be run on a bunch of .o files to produce the new a.out, which you could then run just by typing its name.
Useful Commands
There were, in both the Unix and DEC assemblers, nonce labels. Rather than make up a name for every label you needed, you could just use a number and reach it locally.
1:
    cmp r3, r4 / hit size limit yet?
    bge 2f
    mov (r0)+, (r1)+
    inc r3
    br 1b
2:
In the Unix assembler, 2f meant 2 forward (the next "2:") and 1b meant 1 back, and so on.
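The lookup rule for these numeric labels can be written down directly. Here is a hedged Python sketch of it; the function name and the line-number representation are invented, and it assumes each label sits on a line of its own so that "back" and "forward" are unambiguous.

```python
def resolve_local_label(reference, at_line, definitions):
    """Find the line a reference such as '1b' or '2f' refers to.

    definitions is a list of (line_number, label_number) pairs in source order.
    Returns None when no matching label exists in that direction.
    """
    number, direction = int(reference[:-1]), reference[-1]
    if direction == "b":      # nearest earlier definition of that number
        earlier = [line for line, n in definitions if n == number and line < at_line]
        return max(earlier) if earlier else None
    if direction == "f":      # nearest later definition of that number
        later = [line for line, n in definitions if n == number and line > at_line]
        return min(later) if later else None
    raise ValueError("reference must end in 'b' or 'f'")


# The same numbers may be reused any number of times in one file.
definitions = [(1, 1), (6, 2), (10, 1), (15, 2)]
print(resolve_local_label("2f", 2, definitions))
print(resolve_local_label("1b", 12, definitions))
```

Because resolution depends only on position, the same small numbers can be reused throughout a file without clashing.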
Jumps could jump farther than branches, but took up two words, so a branch was desirable if possible. The assemblers provided a way to code a branch if it could reach and a jump otherwise. You could write it conditional, in which case the assembler would output a branch around a jump if necessary (inverting the condition, of course).
jle foo / jump to foo if less or equal
jbr bar / jump or branch to bar
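The choice the assembler makes for these pseudo-instructions can be sketched as follows. This is an illustration in Python, not the actual assembler's algorithm: the function names are invented, only a few conditions are listed, and the long form is shown with a local label where a real assembler would emit the offset directly. The reach test uses the standard branch range of -128 to +127 words from the updated PC.

```python
INVERSE = {"beq": "bne", "bne": "beq", "blt": "bge",
           "bge": "blt", "ble": "bgt", "bgt": "ble"}


def branch_reaches(addr, target):
    """True if a one-word branch at addr can reach target."""
    offset_words = (target - (addr + 2)) // 2
    return -128 <= offset_words <= 127


def expand_jump(mnemonic, addr, target_name, target):
    """Expand jbr or a conditional jXX into the instructions that would be emitted."""
    branch = "br" if mnemonic == "jbr" else "b" + mnemonic[1:]
    if branch_reaches(addr, target):
        return [f"{branch} {target_name}"]          # short form: one word
    if branch == "br":
        return [f"jmp {target_name}"]               # long unconditional form
    # Long conditional form: inverted branch around a jump.
    return [f"{INVERSE[branch]} 1f", f"jmp {target_name}", "1:"]


print(expand_jump("jle", 0o1000, "foo", 0o1010))
print(expand_jump("jle", 0o1000, "foo", 0o3000))
print(expand_jump("jbr", 0o1000, "bar", 0o3000))
```

The first call yields a single ble; the second yields a bgt around a jmp, which is the inverted condition described above.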
The PDP-11 is one of the original minicomputers. It was used in a huge variety of ways, from timesharing to embedded control as well as some desktop use. PDP-11 is not a language per se, but PDP-11 assembly is a flavor of assembly. This early machine's architecture influenced later microprocessors, and because it affected the design of the machine's native code, it also affected higher-level languages.
The postincrement addressing also made it into the Motorola 68000.
The only reason you would want to learn the PDP-11 assembly language would be if you had acquired a working PDP-11 machine and wanted to write or modify programs running on it. The PDP-11, like every machine that ran its instruction set natively, is obsolete.
Sources
<Where can you get assemblers, cross assemblers and simulators for this assembly language?>
Web References
<List additional references on the web. Please include for what level reader the references are appropriate. (beginner/intermediate/advanced)
Where is the code set on the web?>
Books and Articles
- PDP-11 Programmer's Handbook, Digital Equipment Corporation
- Lions' Commentary on UNIX 6th Edition, Chapter 2 - Unix Assembler