X86 Assembly/MMX

From Wikibooks, the open-content textbooks collection

Saturation Arithmetic

In an 8-bit grayscale picture, 255 is the value for pure white, and 0 is the value for pure black. In a regular 8-bit register (AL, BL, CL ...), if we add one to white, we get black! This is because the regular registers "roll over" to 0 when a result exceeds the largest value they can hold. MMX gets around this with a technique called "saturation arithmetic". In saturation arithmetic, a value never rolls over: a result that is too large stays at the maximum, and a result that is too small stays at the minimum. This means that for unsigned 8-bit values in the MMX world, we have the following equations:

255 + 100 = 255
200 + 100 = 255
0 - 100 = 0
99 - 100 = 0

This may seem counter-intuitive at first to people who are used to their registers rolling over, but it makes good sense: if we make white brighter, it shouldn't become black.
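
The equations above can be reproduced in portable C. The following is a minimal sketch that models the behaviour in software; it does not use MMX itself, and the function names (wrap_add_u8, sat_add_u8, sat_sub_u8) are made up for this illustration.

```c
#include <stdint.h>

/* Wrap-around add: what a regular 8-bit register does (result modulo 256). */
static uint8_t wrap_add_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating add: the result is clamped at 255 instead of rolling over. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Saturating subtract: the result is clamped at 0 instead of rolling under. */
static uint8_t sat_sub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}
```

With these helpers, sat_add_u8(200, 100) gives 255 and sat_sub_u8(99, 100) gives 0, whereas wrap_add_u8(255, 1) gives 0, which is the "white becomes black" problem described above.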

Single Instruction Multiple Data (SIMD) Instructions

MMX registers are 64 bits wide, but they can be broken down as follows:

two 32-bit values
four 16-bit values
eight 8-bit values
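
To make the breakdown concrete, here is a small sketch in portable C that treats a 64-bit integer as a stand-in for an MMX register and picks out individual lanes with shifts. The helper names (lane32, lane16, lane8) are hypothetical, and lane 0 is assumed to be the least significant part of the value.

```c
#include <stdint.h>

/* Each helper returns lane i of v, where lane 0 is the least significant. */

/* View v as two 32-bit lanes (i must be 0 or 1). */
static uint32_t lane32(uint64_t v, unsigned i)
{
    return (uint32_t)(v >> (32u * i));
}

/* View v as four 16-bit lanes (i must be 0..3). */
static uint16_t lane16(uint64_t v, unsigned i)
{
    return (uint16_t)(v >> (16u * i));
}

/* View v as eight 8-bit lanes (i must be 0..7). */
static uint8_t lane8(uint64_t v, unsigned i)
{
    return (uint8_t)(v >> (8u * i));
}
```

The same 64 bits are simply interpreted in three different ways; nothing in the register itself records which view is in use. The instruction that operates on the register chooses the lane width.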

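The original MMX instruction set offers little support for arithmetic on a full 64-bit value (it has no 64-bit add, for example), so the registers are best used for packed data. Let's say that we have the bytes 10, 25, 128, and 255 loaded in an MMX register (for brevity, only four of the eight byte lanes are shown). We have them arranged as such: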

MM0: | 10 | 25 | 128 | 255 |

And we add 10 to every byte with the following pseudocode operation:

MM0 + 10

We would get the following result:

MM0: |10+10|25+10|128+10|255+10| = | 20 | 35 | 138 | 255 |

Remember that in the last box, our arithmetic "saturates", and doesn't go over 255.
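
The lane-by-lane behaviour can be modelled in portable C. This is only a software sketch of what a packed unsigned saturating byte add (the MMX instruction PADDUSB) computes, not MMX code; the function name packed_add_sat_u8 is made up for this illustration, and a 64-bit integer stands in for the register, with lane 0 in the least significant byte.

```c
#include <stdint.h>

/* Add the eight unsigned byte lanes of b to the matching lanes of a,
   clamping each lane at 255 so that no lane rolls over or carries
   into its neighbour. */
static uint64_t packed_add_sat_u8(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (unsigned i = 0; i < 8; ++i) {
        unsigned x = (unsigned)((a >> (8u * i)) & 0xFFu);
        unsigned y = (unsigned)((b >> (8u * i)) & 0xFFu);
        unsigned sum = x + y;
        if (sum > 255u)
            sum = 255u;                      /* saturate this lane */
        result |= (uint64_t)sum << (8u * i); /* put the lane back  */
    }
    return result;
}
```

To mirror the example, put 10, 25, 128, and 255 in the four low lanes (0x00000000FF80190A) and add a value with 10 in every lane (0x0A0A0A0A0A0A0A0A). The four low lanes of the result are 20, 35, 138, and 255. The loop performs the eight additions one after another, whereas the hardware performs them with one instruction.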

Using MMX, we are essentially performing four additions (eight, if all the byte lanes are used) with a single instruction, instead of one addition at a time in the regular registers. The drawbacks are that the MMX instructions may run slightly slower than the regular arithmetic instructions, that the FPU can't be used while MMX code is running, and that saturating results differ from the roll-over results that code written for the regular registers may expect.

MMX Registers

There are eight 64-bit MMX registers. These registers overlay the FPU stack registers: each MMX register occupies the low 64 bits of an FPU register, so the MMX instructions and the FPU instructions cannot be used simultaneously. MMX registers are addressed directly by name, and do not need to be accessed by pushing and popping in the same way as the FPU registers.

MM7 MM6 MM5 MM4 MM3 MM2 MM1 MM0

These registers correspond to the same-numbered FPU registers on the FPU stack.

When MMX instructions execute, the CPU marks all of the FPU registers as in use, so floating-point instructions issued afterwards will not work correctly. To make the FPU usable again, you must end all MMX code with emms, which marks the FPU registers as empty. Here is an example of a C++ routine that calls assembly language containing MMX code (note: the example uses Borland-compatible C++ syntax):

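//---------------------------------------------------
// A simple example using MMX to copy 8 bytes of data
// from source s2 to destination s1
//---------------------------------------------------
void __fastcall CopyMemory8(char *s1, const char *s2)
{
   // The block below saves edx, loads the source address into ecx
   // and the destination address into edx, moves 8 bytes through
   // mm0 with two movq instructions, restores edx, and finally
   // executes emms so that FPU code can run again afterwards.
   __asm
   {
    push edx
    mov ecx, s2
    mov edx, s1
    movq   mm0, [ecx   ]
    movq   [edx   ], mm0
    pop edx
    emms
   }
}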