Saturday, 27 August 2011

Let's Get Started (Meet the Reggies)

Welcome to Assembly Aficionado, the blog on the world's most awesome instruction set architecture, or ISA, the Intel x86 family. Here we cover the awesome subject of assembly language programming and the related ecosystem (excluding micro-architecture). Assembly hacking is useful and cool and better than playing with model railways. It is the equivalent of sudoku, designed for those with a programming mindset. It is easier to understand than really hard stuff like partial differential equations, variational calculus and automated theorem proving - hey, if you can do all that hard stuff, reading assembly language will be like watching advertising - seriously!

You might think assembly programming is becoming a dead art. Not so. Rather, it is becoming an art of deadly importance. People always want fast software. Do you want to write the world's fastest software or do you not?

Before we begin, you need to know the landscape of what you're working with. For this I give you three mantras, defining three dimensions of the mental space you need to inhabit in order to program the x86 motorcycle engine with finesse.

1. Understand the Execution Environment.
2. Understand the Execution Environment.
3. Understand the Execution Environment.

The secrets of the "exec-env" master-power lie in V1 (Basic Architecture) of the Intel Software Developer's Manual (SDM-V1), found here: Intel Manuals. In V1, read Chapter 3 on the Basic Execution Environment to grok some basic concepts that will get you rocking in no time. A very succinct summary of V1-C3 is bestowed below. If you have even a remote interest in interpreters, compilers and virtual machines in general and how they work, then this will interest you tenfold, because this is how a real machine (and its native language) work together to create wonderful, awesome software magic. Let the music of the floating point registers ring forth!!

Meet the Reggies

The Intel x86 playpen provides the following sets of registers:

1. FPU Single, Double and Double Extended Floating Point Register Goodness. This includes a rocking FPU control register, eight data registers, the FP instruction pointer register, the operand (data) pointer register - you name it, the FPU register set has got it all, man. A complete execution environment for calcing all those yummy spreadsheet formulae. Ch8 in V1 is dedicated entirely to the FPU exec-env and is really great reading - simply put, better than reading the Dragon Book. You want floating point computation, I give you Intel FPU, Executive Class.

2. SIMD Multimedia Action Registers. These are also known as the MMX registers. There are eight of these bad boys, labelled MM0 to MM7, each 64 bits wide. A single MMX instruction can operate on 8 bytes simultaneously. Just imagine animating a dog and getting it to roll over when you tell it to. Now imagine cloning that dog eight times, issuing a single "roll over, Rover" command, and all eight dogs heeding your call in unison. That is multimedia programming, man. That is true MMX. A kind of parallel processing, if you will. But before taking this wunderkind into the gladiatorial arena, remember to check for the presence of MMX technology first (the CPUID instruction will tell you).

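3. MMX "Aggressive Version". This is the same packed-data idea operating on more data at once, and the registers are called XMM. They are 128 bits wide, twice the size of an MMX register, and arrived with the SSE extensions. Again eight registers here (XMM0 to XMM7), a la MMX.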
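
To make the eight-dogs trick from the MMX entry concrete, here is a minimal sketch in plain C of what one packed byte addition does to a 64-bit register. It is only a model: a real MMX instruction such as PADDB does all eight lanes in one go, whereas this loops over them. The function name paddb_model is made up for illustration and is not part of any Intel API.

```c
#include <stdint.h>

/* Hypothetical helper (not an Intel API): models the effect of the MMX
   instruction PADDB on two 64-bit values. Each of the eight bytes is an
   independent lane, added modulo 256, and no carry leaks from one lane
   into its neighbour. */
uint64_t paddb_model(uint64_t a, uint64_t b)
{
    uint64_t result = 0;

    for (int lane = 0; lane < 8; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));   /* byte from operand a */
        uint8_t y = (uint8_t)(b >> (8 * lane));   /* byte from operand b */
        uint8_t sum = (uint8_t)(x + y);           /* wraps around at 256 */

        result |= (uint64_t)sum << (8 * lane);    /* put the lane back   */
    }
    return result;
}
```

Calling paddb_model(0x0102030405060708, 0x0101010101010101) bumps every byte by one, and adding 1 to a lane holding 0xFF wraps that lane to 0x00 without disturbing the lane next door. That lane independence is the whole point of packed arithmetic.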

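You can loosely think of FPU, MMX and XMM as one extended family of number-crunching registers for different purposes: sequential vs parallel processing, and parallel processing at different sizes (64-bit MMX vs 128-bit XMM). Strictly speaking, only the FPU and XMM registers hold floating point values; the MMX registers hold packed integers and are aliased onto the FPU data registers. We have discussed the exciting reg-sets. But what about our basic, run-of-the-mill, workhorse registers?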

1. Workhorse, ROTM ("Run-of-the-Mill") Registers. Eight workhorses (aka General Purpose registers), six segment registers (more on that later), EFLAGS and EIP (a total of 16 mules). Without instructions, these registers are just mules. To make them sing and dance we need instructions, and x86 serves up the relevant instructions to do basic integer arithmetic (these have parties on byte, word and doubleword integers), handle flow control (to direct the flow of the party), operate on bit and byte strings (party text), and address memory (to call up party-goers to see if they are in attendance). Basically there is a whole host of instructions to activate the mules (registers) and even transmogrify them (to become horses and unicorns).
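
As a rough illustration of the byte, word and doubleword parties, here is a small C sketch that treats a uint32_t as a stand-in for a general purpose register such as EAX and peeks at its smaller views (AX, AL and AH are the usual x86 names for them). It also models two of the EFLAGS bits, the carry flag (bit 0) and the zero flag (bit 6), after a 32-bit add. The function names are hypothetical, and a real ADD updates more flags than the two shown here.

```c
#include <stdint.h>

/* Hypothetical helpers: views of a 32-bit register value such as EAX. */
uint16_t low_word(uint32_t eax)  { return (uint16_t)(eax & 0xFFFFu); }     /* AX */
uint8_t  low_byte(uint32_t eax)  { return (uint8_t)(eax & 0xFFu); }        /* AL */
uint8_t  high_byte(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); } /* AH */

/* Two of the status bits in EFLAGS. */
#define FLAG_CF (1u << 0)   /* carry flag: unsigned overflow out of bit 31 */
#define FLAG_ZF (1u << 6)   /* zero flag: result was zero                  */

/* Simplified model of a 32-bit ADD. Returns the sum and updates only CF
   and ZF in *eflags (the real instruction also sets OF, SF, AF and PF). */
uint32_t add32_model(uint32_t a, uint32_t b, uint32_t *eflags)
{
    uint32_t sum = a + b;                 /* unsigned add wraps modulo 2^32 */

    *eflags &= ~(FLAG_CF | FLAG_ZF);      /* clear the two flags we model   */
    if (sum < a)
        *eflags |= FLAG_CF;               /* wrapped around: carry out      */
    if (sum == 0)
        *eflags |= FLAG_ZF;               /* result is zero                 */
    return sum;
}
```

With eax holding 0x12345678, low_word gives 0x5678, low_byte gives 0x78 and high_byte gives 0x56. Adding 1 to 0xFFFFFFFF yields 0 with both CF and ZF set, which is exactly the sort of thing the flow control instructions later inspect.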

Remember the 16 mules (14 of which are of related breeds). The EFLAGS and EIP mules are the odd ones out - call them the "E-mules". The ROTM registers are introduced this way in order to cultivate a learning approach defined by "effortless excellence" and a "natural way of understanding" or even "effortless understanding".

So to summarise, the "Reggies" are made up of the three fancy number-crunching "families" (FPU, MMX and XMM) and the 16 mules. The mules are the "ROTM" Reggies and the FPU, MMX and XMM sets are the fancy families from the other side of town. Now that you have met the Reggies, you should be confident that you are a real Register-meister. You should say ComfortLevel==High if anyone asks about your familiarity with Intel Reggies.

(function arguments) Jogging on the PlayStack

As well as Reggies, there is also a "PlayStack" (aka simply "Stack") to support calling subroutines and passing parameters. It is a "contiguous" or "gap-free" region of memory locations, contained in a segment (which can be up to 4GB in size, hence no prizes for guessing how big the stack can be if it has to fit in a segment).
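
Here is a toy C model of the PlayStack, just to show the mechanics. It assumes the usual x86 behaviour that the stack grows downwards: a 32-bit push first decrements the stack pointer (ESP) by 4 and then stores the value, and a pop loads the value and then increments ESP by 4. The tiny 64-byte size, the struct and the function names are all made up for illustration; the real thing lives in a memory segment, not in a C array.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u   /* tiny on purpose; illustrative only */

/* Hypothetical model of a stack segment plus its stack pointer. */
typedef struct {
    uint8_t  mem[STACK_BYTES];  /* contiguous, gap-free block of bytes    */
    uint32_t esp;               /* offset of the current top of the stack */
} stack_model;

/* An empty stack has ESP just past the highest address. */
void stack_init(stack_model *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES;
}

/* Models PUSH of a doubleword: decrement ESP by 4, then store.
   Returns 0 on success, -1 if the stack is full. */
int push32(stack_model *s, uint32_t value)
{
    if (s->esp < 4u)
        return -1;
    s->esp -= 4u;
    memcpy(&s->mem[s->esp], &value, sizeof value);
    return 0;
}

/* Models POP of a doubleword: load, then increment ESP by 4.
   Returns 0 on success, -1 if the stack is empty. */
int pop32(stack_model *s, uint32_t *value)
{
    if (s->esp > STACK_BYTES - 4u)
        return -1;
    memcpy(value, &s->mem[s->esp], sizeof *value);
    s->esp += 4u;
    return 0;
}
```

Push two parameters and ESP drops from 64 to 56; pop them and they come back in reverse order, last in, first out. That ordering is what lets a caller park parameters on the stack and a subroutine pick them up again.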
