Assembler Syntax
The following material describes the elements of the assembler syntax.
Statements
The syntax of an assembly statement is:
- Label: Prefix Opcode Operand1, Operand2
where Label is a label, Prefix is an assembly prefix opcode (operation code), Opcode is an assembly instruction opcode or directive, and Operand is an assembly expression. Label and Prefix are optional. Some opcodes take only one operand, and some take none.
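As an illustration of the statement form, the following minimal sketch uses a prefix (REP) with an opcode that takes no explicit operands (STOSB), alongside ordinary two-operand statements. The routine name FillBytes and the program name are hypothetical. The sketch assumes a 32-bit Windows target with the default register calling convention (parameters arrive in EAX, EDX, ECX) and assumes Count is not negative.

```delphi
program StatementDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper, not part of the RTL.
// Assumed on entry (32-bit register convention):
//   EAX = Dest, EDX = Count, CL = Value
procedure FillBytes(Dest: Pointer; Count: Integer; Value: Byte);
asm
        PUSH    EDI             // EDI must be preserved for the caller
        MOV     EDI, EAX        // Opcode Operand1, Operand2: destination pointer
        MOV     EAX, ECX        // AL = fill value
        MOV     ECX, EDX        // ECX = repeat count for REP
        CLD                     // Opcode with no operands: fill upwards
        REP     STOSB           // Prefix Opcode: store AL at [EDI], ECX times
        POP     EDI
end;

var
  Buf: array[0..3] of Byte;
begin
  FillBytes(@Buf, Length(Buf), $AA);
  Writeln(Buf[0], ' ', Buf[3]);
end.
```

Each line inside the asm block is one statement. None of them needs a label here, which is why the Label part is omitted throughout.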
Comments are allowed between assembly statements, but not within them. For example:
MOV AX,1 {Initial value} { OK }
MOV CX,100 {Count} { OK }
MOV {Initial value} AX,1; { Error! }
MOV CX, {Count} 100 { Error! }
Labels
Labels are used in built-in assembly statements as they are in the Delphi language by writing the label and a colon before a statement. There is no limit to a label's length. As in Delphi, labels must be declared in a label declaration part in the block containing the asm statement. The one exception to this rule is local labels.
Local labels are labels that start with an at-sign (@). They consist of an at-sign followed by one or more letters, digits, underscores, or at-signs. Use of local labels is restricted to asm statements, and the scope of a local label extends from the asm reserved word to the end of the asm statement that contains it. A local label doesn't have to be declared.
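The following sketch shows local labels in use. The function name SumTo is hypothetical, and the code assumes a 32-bit target with the register calling convention, where the first parameter arrives in EAX and an Integer result is returned in EAX. Neither @Loop nor @Done appears in a label declaration part.

```delphi
program LocalLabelDemo;
{$APPTYPE CONSOLE}

// Hypothetical example: returns N + (N-1) + ... + 1, or 0 when N <= 0.
function SumTo(N: Integer): Integer;
asm
        MOV     ECX, EAX        // ECX = loop counter (N arrives in EAX)
        XOR     EDX, EDX        // EDX = running total
        TEST    ECX, ECX
        JLE     @Done           // nothing to add when N <= 0
@Loop:
        ADD     EDX, ECX
        DEC     ECX
        JNZ     @Loop           // repeat until the counter reaches zero
@Done:
        MOV     EAX, EDX        // Integer results are returned in EAX
end;

begin
  Writeln(SumTo(4));            // prints 10
end.
```

Because both labels are local, another asm statement in the same unit could reuse the names @Loop and @Done without conflict.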
Instruction Opcodes
The built-in assembler supports all of the Intel-documented opcodes for general application use. Note that operating system privileged instructions may not be supported. Specifically, the following families of instructions are supported:
- IA-32
- Pentium family
- Pentium Pro and Pentium II
- Pentium III
- Pentium 4
- Intel 64
In addition, the built-in assembler supports the following instruction set extensions:
- Intel SSE (including SSE4.2)
- AMD 3DNow! (from the AMD K6 onwards)
- AMD Enhanced 3DNow! (from the AMD Athlon onwards)
For a complete description of each instruction, refer to your microprocessor documentation.
Automatic jump sizing
Unless otherwise directed, the built-in assembler optimizes jump instructions by automatically selecting the shortest, and therefore most efficient, form of a jump instruction. This automatic jump sizing applies to the unconditional jump instruction (JMP), and to all conditional jump instructions when the target is a label (not a procedure or function).
For an unconditional jump instruction (JMP), the built-in assembler generates a short jump (one-byte opcode followed by a one-byte displacement) if the distance to the target label is -128 to 127 bytes. Otherwise it generates a near jump (one-byte opcode followed by a two-byte displacement).
For a conditional jump instruction, a short jump (one-byte opcode followed by a one-byte displacement) is generated if the distance to the target label is -128 to 127 bytes. Otherwise, the built-in assembler generates a near jump to the target label.
Jumps to the entry points of procedures and functions are always near.
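The range test described above can be written out as a small function. This is only a sketch of the rule as stated in the text, not the assembler's actual implementation; FitsShortJump is a hypothetical name.

```delphi
program JumpSizeDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: True when the distance to the target label fits
// the one-byte signed displacement of a short jump (-128 to 127 bytes).
function FitsShortJump(Distance: Integer): Boolean;
begin
  Result := (Distance >= -128) and (Distance <= 127);
end;

begin
  Writeln(FitsShortJump(127));    // TRUE:  short form can be used
  Writeln(FitsShortJump(128));    // FALSE: a near jump is needed
  Writeln(FitsShortJump(-128));   // TRUE
  Writeln(FitsShortJump(-129));   // FALSE
end.
```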
Directives
The built-in assembler supports four assembly define directives: DB (define byte), DW (define word), DD (define double word), and DQ (define quad word). Each generates data corresponding to the comma-separated operands that follow the directive.
Directive | Description |
---|---|
DB | Define byte: generates a sequence of bytes. Each operand can be a constant expression with a value between -128 and 255, or a character string of any length. Constant expressions generate one byte of code, and strings generate a sequence of bytes with values corresponding to the ASCII code of each character. |
DW | Define word: generates a sequence of words. Each operand can be a constant expression with a value between -32,768 and 65,535, or an address expression. For an address expression, the built-in assembler generates a near pointer, a word that contains the offset part of the address. |
DD | Define double word: generates a sequence of double words. Each operand can be a constant expression with a value between -2,147,483,648 and 4,294,967,295, or an address expression. For an address expression, the built-in assembler generates a far pointer, a word that contains the offset part of the address, followed by a word that contains the segment part of the address. |
DQ | Define quad word: defines a quad word for Int64 values. |
The data generated by the DB, DW, and DD directives is always stored in the code segment, just like the code generated by other built-in assembly statements. To generate uninitialized or initialized data in the data segment, you should use Delphi var or const declarations.
Some examples of DB, DW, and DD directives follow:
asm
DB 0FFH { One byte }
DB 0,99 { Two bytes }
DB 'A' { Ord('A') }
DB 'Hello world...',0DH,0AH { String followed by CR/LF }
DB 6,'string' { Delphi-style string: length byte, then the characters }
DW 0FFFFH { One word }
DW 0,9999 { Two words }
DW 'A' { Same as DB 'A',0 }
DW 'BA' { Same as DB 'A','B' }
DW MyVar { Offset of MyVar }
DW MyProc { Offset of MyProc }
DD 0FFFFFFFFH { One double-word }
DD 0,999999999 { Two double-words }
DD 'A' { Same as DB 'A',0,0,0 }
DD 'DCBA' { Same as DB 'A','B','C','D' }
DD MyVar { Pointer to MyVar }
DD MyProc { Pointer to MyProc }
end;
In a standalone assembler, an identifier that precedes a DB, DW, or DD directive declares a byte-, word-, or double-word-sized variable at the location of the directive. For example, such an assembler allows the following:
ByteVar DB ?
WordVar DW ?
IntVar DD ?
// …
MOV AL,ByteVar
MOV BX,WordVar
MOV ECX,IntVar
The built-in assembler does not support such variable declarations. The only kind of symbol that can be defined in an inline assembly statement is a label. All variables must be declared using Delphi syntax; the preceding construction can be replaced by:
var
ByteVar: Byte;
WordVar: Word;
IntVar: Integer;
// …
asm
MOV AL,ByteVar
MOV BX,WordVar
MOV ECX,IntVar
end;
SMALL and LARGE can be used to determine the width of a displacement:
MOV EAX, [LARGE $1234]
This instruction generates a 'normal' move with a 32-bit displacement ($00001234).
MOV EAX, [SMALL $1234]
This second instruction generates a move with an address size override prefix and a 16-bit displacement ($1234).
SMALL can be used to save space. The following example generates an address size override and a 2-byte address (in total three bytes):
MOV EAX, [SMALL 123]
as opposed to:
MOV EAX, [123]
which will generate no address size override and a 4-byte address (in total four bytes).
Two additional directives allow assembly code to access dynamic and virtual methods: VMTOFFSET and DMTINDEX.
VMTOFFSET retrieves the offset in bytes of the virtual method pointer table entry of the virtual method argument from the beginning of the virtual method table (VMT). This directive needs a fully specified class name with a method name as a parameter (for example, TExample.VirtualMethod), or an interface name and an interface method name.
DMTINDEX retrieves the dynamic method table index of the passed dynamic method. This directive also needs a fully specified class name with a method name as a parameter, for example, TExample.DynamicMethod. To invoke the dynamic method, call System.@CallDynaInst with the (E)SI register containing the value obtained from DMTINDEX.
Note: Methods with the message directive are implemented as dynamic methods and can also be called using the DMTINDEX technique. For example:
TMyClass = class
procedure x; message MYMESSAGE;
end;
The following example uses both DMTINDEX and VMTOFFSET to access dynamic and virtual methods:
program Project2;

type
  TExample = class
    procedure DynamicMethod; dynamic;
    procedure VirtualMethod; virtual;
  end;

procedure TExample.DynamicMethod;
begin
end;

procedure TExample.VirtualMethod;
begin
end;

procedure CallDynamicMethod(e: TExample);
asm
  // Save ESI register
  PUSH ESI
  // Instance pointer needs to be in EAX
  MOV EAX, e
  // DMT entry index needs to be in (E)SI
  MOV ESI, DMTINDEX TExample.DynamicMethod
  // Now call the method
  CALL System.@CallDynaInst
  // Restore ESI register
  POP ESI
end;

procedure CallVirtualMethod(e: TExample);
asm
  // Instance pointer needs to be in EAX
  MOV EAX, e
  // Retrieve VMT table entry
  MOV EDX, [EAX]
  // Now call the method at offset VMTOFFSET
  CALL DWORD PTR [EDX + VMTOFFSET TExample.VirtualMethod]
end;

var
  e: TExample;
begin
  e := TExample.Create;
  try
    CallDynamicMethod(e);
    CallVirtualMethod(e);
  finally
    e.Free;
  end;
end.
Operands
Inline assembler operands are expressions that consist of constants, registers, symbols, and operators.
Within operands, the following reserved words have predefined meanings:
Built-in assembler reserved words
- CPU registers
Category | Identifiers |
---|---|
8-bit CPU registers | AH, AL, BH, BL, CH, CL, DH, DL (general purpose registers) |
16-bit CPU registers | AX, BX, CX, DX (general purpose registers); DI, SI, SP, BP (index registers); CS, DS, SS, ES (segment registers); IP (instruction pointer) |
32-bit CPU registers | EAX, EBX, ECX, EDX (general purpose registers); EDI, ESI, ESP, EBP (index registers); FS, GS (segment registers); EIP |
FPU | ST(0), ..., ST(7) |
MMX FPU registers | mm0, ..., mm7 |
XMM registers | xmm0, ..., xmm7 (..., xmm15 on x64) |
Intel 64 registers | RAX, RBX, ... |
- Data and operators
Category | Identifiers |
---|---|
Data | BYTE, WORD, DWORD, QWORD, TBYTE |
Operators | NOT, AND, OR, XOR; SHL, SHR, MOD; LOW, HIGH; OFFSET, PTR, TYPE; VMTOFFSET, DMTINDEX; SMALL, LARGE |
Reserved words always take precedence over user-defined identifiers. For example:
var
Ch: Char;
// …
asm
MOV CH, 1
end;
loads 1 into the CH register, not into the Ch variable. To access a user-defined symbol with the same name as a reserved word, you must use the ampersand (&) override operator:
MOV &Ch, 1
It is best to avoid user-defined identifiers with the same names as built-in assembler reserved words.
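Where such a clash cannot be avoided, the override can be used as in the following minimal sketch. It assumes a 32-bit target, where asm statements may appear inside a begin...end block; the program name is hypothetical.

```delphi
program OverrideDemo;
{$APPTYPE CONSOLE}

var
  Ch: Char;   // same name as the CH register

begin
  Ch := #0;
  asm
        MOV     &Ch, 1          // & selects the variable Ch, not the CH register
  end;
  Writeln(Ord(Ch));             // prints 1
end.
```

Without the ampersand, the same statement would load 1 into the CH register and leave the variable unchanged.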