Assembler Syntax

From RAD Studio

The following material describes the elements of the assembler syntax.

Statements

The syntax of an assembly statement is:

Label: Prefix Opcode Operand1, Operand2

where Label is a label, Prefix is an assembly prefix opcode (operation code), Opcode is an assembly instruction opcode or directive, and Operand is an assembly expression. Label and Prefix are optional. Some opcodes take only one operand, and some take none.
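
As a minimal sketch of a statement that uses every part of this form, the routine below puts a label, the LOCK prefix, an opcode, and two operands on one line. The name AtomicAdd is hypothetical, and the code assumes a Win32 target with the default register calling convention (first parameter in EAX, second in EDX, result in EAX).

function AtomicAdd(var Target: Integer; Delta: Integer): Integer;
asm
  // Assumed on entry (Win32 register convention): EAX = @Target, EDX = Delta
  // Label:  Prefix  Opcode  Operand1, Operand2
  @add:     LOCK    XADD    [EAX], EDX
  // XADD leaves the previous value of Target in EDX; return it in EAX
  MOV       EAX, EDX
end;

Called as Old := AtomicAdd(Counter, 5), this sketch would add 5 to Counter and return the value Counter held before the addition.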

Comments are allowed between assembly statements, but not within them. For example:

MOV AX,1 {Initial value}  { OK }
MOV CX,100 {Count}        { OK }

MOV {Initial value} AX,1; { Error! }
MOV CX, {Count} 100       { Error! }

Labels

Labels are used in built-in assembly statements as they are in the Delphi language by writing the label and a colon before a statement. There is no limit to a label's length. As in Delphi, labels must be declared in a label declaration part in the block containing the asm statement. The one exception to this rule is local labels.

Local labels are labels that start with an at-sign (@). They consist of an at-sign followed by one or more letters, digits, underscores, or at-signs. Use of local labels is restricted to asm statements, and the scope of a local label extends from the asm reserved word to the end of the asm statement that contains it. A local label doesn't have to be declared.
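
The following sketch shows local labels in use. SumTo is a hypothetical routine, and the code assumes a Win32 target with the register calling convention (N arrives in EAX and the result is returned in EAX). Neither @next nor @done appears in a label declaration part.

function SumTo(N: Integer): Integer;
asm
  MOV   ECX, EAX        // ECX := N (loop counter)
  XOR   EAX, EAX        // EAX := 0 (running sum)
  TEST  ECX, ECX
  JLE   @done           // nothing to add when N <= 0
@next:
  ADD   EAX, ECX        // sum := sum + counter
  DEC   ECX
  JNZ   @next           // repeat until the counter reaches zero
@done:
end;

Because both labels are local, their scope ends at the end of this asm statement, so another routine can reuse the names @next and @done without conflict.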

Instruction Opcodes

The built-in assembler supports all of the Intel-documented opcodes for general application use. Note that operating system privileged instructions may not be supported. Specifically, the following families of instructions are supported:

  • IA-32
    • Pentium family
    • Pentium Pro and Pentium II
    • Pentium III
    • Pentium 4
  • Intel 64

In addition, the built-in assembler supports the following instruction set extensions:

  • Intel SSE (including SSE4.2)
  • AMD 3DNow! (from the AMD K6 onwards)
  • AMD Enhanced 3DNow! (from the AMD Athlon onwards)

For a complete description of each instruction, refer to your microprocessor documentation.

Automatic jump sizing

Unless otherwise directed, the built-in assembler optimizes jump instructions by automatically selecting the shortest, and therefore most efficient, form of a jump instruction. This automatic jump sizing applies to the unconditional jump instruction (JMP), and to all conditional jump instructions when the target is a label (not a procedure or function).

For an unconditional jump instruction (JMP), the built-in assembler generates a short jump (one-byte opcode followed by a one-byte displacement) if the distance to the target label is -128 to 127 bytes. Otherwise it generates a near jump (one-byte opcode followed by a four-byte displacement in 32-bit and 64-bit code).

For a conditional jump instruction, a short jump (one-byte opcode followed by a one-byte displacement) is generated if the distance to the target label is -128 to 127 bytes. Otherwise, the built-in assembler generates a near jump to the target label.

Jumps to the entry points of procedures and functions are always near.
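
As an illustration of the sizing rules above, both jumps in the sketch below target labels only a few bytes away, so the assembler would be expected to pick the short form for each without any directive. IsNegative is a hypothetical routine, and the code assumes a Win32 target with the register calling convention (X in EAX, Boolean result in AL).

function IsNegative(X: Integer): Boolean;
asm
  TEST  EAX, EAX        // set the sign flag from X
  JS    @neg            // conditional jump, target within -128..127 bytes: short form
  XOR   EAX, EAX        // Result := False
  JMP   @exit           // unconditional jump, also within range: short form
@neg:
  MOV   AL, 1           // Result := True
@exit:
end;

If enough code were placed between a jump and its label to push the distance outside -128 to 127 bytes, the same source line would instead assemble to the near form.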

Directives

The built-in assembler supports four assembly define directives: DB (define byte), DW (define word), DD (define double word), and DQ (define quad word). Each generates data corresponding to the comma-separated operands that follow the directive.

Directive  Description

DB         Define byte: generates a sequence of bytes. Each operand can be a constant expression with a value between -128 and 255, or a character string of any length. Constant expressions generate one byte of code, and strings generate a sequence of bytes with values corresponding to the ASCII code of each character.

DW         Define word: generates a sequence of words. Each operand can be a constant expression with a value between -32,768 and 65,535, or an address expression. For an address expression, the built-in assembler generates a near pointer, a word that contains the offset part of the address.

DD         Define double word: generates a sequence of double words. Each operand can be a constant expression with a value between -2,147,483,648 and 4,294,967,295, or an address expression. For an address expression, the built-in assembler generates a far pointer, a word that contains the offset part of the address, followed by a word that contains the segment part of the address.

DQ         Define quad word: defines a quad word for Int64 values.

The data generated by the DB, DW, and DD directives is always stored in the code segment, just like the code generated by other built-in assembly statements. To generate uninitialized or initialized data in the data segment, you should use Delphi var or const declarations.

Some examples of DB, DW, and DD directives follow:

 asm
   DB     0FFH                          { One byte }
   DB     0,99                          { Two bytes }
   DB     'A'                           { Ord('A') }
   DB     'Hello world...',0DH,0AH      { String followed by CR/LF }
   DB     6,'string'                    { Delphi-style short string: length byte, then text }
   DW     0FFFFH                        { One word }
   DW     0,9999                        { Two words }
   DW     'A'                           { Same as DB 'A',0 }
   DW     'BA'                          { Same as DB 'A','B' }
   DW     MyVar                         { Offset of MyVar }
   DW     MyProc                        { Offset of MyProc }
   DD     0FFFFFFFFH                    { One double-word }
   DD     0,999999999                   { Two double-words }
   DD     'A'                           { Same as DB 'A',0,0,0 }
   DD     'DCBA'                        { Same as DB 'A','B','C','D' }
   DD     MyVar                         { Pointer to MyVar }
   DD     MyProc                        { Pointer to MyProc }
 end;

In a stand-alone assembler such as MASM or TASM, an identifier that precedes a DB, DW, or DD directive declares a byte-, word-, or double-word-sized variable at the location of the directive. For example, such an assembler allows the following:

ByteVar       DB  ?
WordVar       DW  ?
IntVar        DD  ?
// …
MOV     AL,ByteVar
MOV     BX,WordVar
MOV     ECX,IntVar

The built-in assembler does not support such variable declarations. The only kind of symbol that can be defined in an inline assembly statement is a label. All variables must be declared using Delphi syntax; the preceding construction can be replaced by:

var
  ByteVar: Byte;
  WordVar: Word;
  IntVar: Integer;
// …
asm
  MOV AL,ByteVar
  MOV BX,WordVar
  MOV ECX,IntVar
end;

SMALL and LARGE can be used to determine the width of a displacement:

MOV EAX, [LARGE $1234]

This instruction generates a 'normal' move with a 32-bit displacement ($00001234). By contrast:

MOV EAX, [SMALL $1234]

The second instruction will generate a move with an address size override prefix and a 16-bit displacement ($1234).

SMALL can be used to save space. The following example generates an address size override and a 2-byte address (in total three bytes):

MOV EAX, [SMALL 123]

as opposed to:

MOV EAX, [123]

which will generate no address size override and a 4-byte address (in total four bytes).

Two additional directives allow assembly code to access dynamic and virtual methods: VMTOFFSET and DMTINDEX.

VMTOFFSET retrieves the offset in bytes of the virtual method pointer table entry of the virtual method argument from the beginning of the virtual method table (VMT). This directive needs a fully specified class name with a method name as a parameter (for example, TExample.VirtualMethod), or an interface name and an interface method name.

DMTINDEX retrieves the dynamic method table index of the passed dynamic method. This directive also needs a fully specified class name with a method name as a parameter, for example, TExample.DynamicMethod. To invoke the dynamic method, call System.@CallDynaInst with the (E)SI register containing the value obtained from DMTINDEX.

Note: Methods with the message directive are implemented as dynamic methods and can also be called using the DMTINDEX technique. For example:

TMyClass = class
  procedure x; message MYMESSAGE;
end;

The following example uses both DMTINDEX and VMTOFFSET to access dynamic and virtual methods:

program Project2;
type
  TExample = class
    procedure DynamicMethod; dynamic;
    procedure VirtualMethod; virtual;
  end;

procedure TExample.DynamicMethod;
begin

end;

procedure TExample.VirtualMethod;
begin

end;

procedure CallDynamicMethod(e: TExample);
asm
  // Save ESI register
  PUSH    ESI
  // Instance pointer needs to be in EAX
  MOV     EAX, e

  // DMT entry index needs to be in (E)SI
  MOV     ESI, DMTINDEX TExample.DynamicMethod

  // Now call the method
  CALL    System.@CallDynaInst

  // Restore ESI register
  POP     ESI

end;

procedure CallVirtualMethod(e: TExample);
asm
  // Instance pointer needs to be in EAX
  MOV     EAX, e
  // Retrieve VMT table entry
  MOV     EDX, [EAX]
  // Now call the method at offset VMTOFFSET
  CALL    DWORD PTR [EDX + VMTOFFSET TExample.VirtualMethod]
end;

var
  e: TExample;
begin
  e := TExample.Create;
  try
    CallDynamicMethod(e);
    CallVirtualMethod(e);
  finally
    e.Free;
  end;
end.

Operands

Inline assembler operands are expressions that consist of constants, registers, symbols, and operators.

Within operands, the following reserved words have predefined meanings:

Built-in assembler reserved words

CPU registers

Category              Identifiers

8-bit CPU registers   AH, AL, BH, BL, CH, CL, DH, DL (general purpose registers)
16-bit CPU registers  AX, BX, CX, DX (general purpose registers); DI, SI, SP, BP (index registers); CS, DS, SS, ES (segment registers); IP (instruction pointer)
32-bit CPU registers  EAX, EBX, ECX, EDX (general purpose registers); EDI, ESI, ESP, EBP (index registers); FS, GS (segment registers); EIP
FPU                   ST(0), ..., ST(7)
MMX FPU registers     mm0, ..., mm7
XMM registers         xmm0, ..., xmm7 (..., xmm15 on x64)
Intel 64 registers    RAX, RBX, ...

Data and Operators

Category              Identifiers

Data                  BYTE, WORD, DWORD, QWORD, TBYTE
Operators             NOT, AND, OR, XOR; SHL, SHR, MOD; LOW, HIGH; OFFSET, PTR, TYPE; VMTOFFSET, DMTINDEX; SMALL, LARGE
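
As a small sketch of how the operator reserved words can appear inside an operand, the routine below builds an immediate value from a constant expression that the assembler evaluates at compile time. MaskDemo is a hypothetical name; the routine takes no parameters and returns its result in EAX.

function MaskDemo: Integer;
asm
  // (1 SHL 4) is 16, and 16 OR 3 is 19, so this assembles as MOV EAX, 19
  MOV   EAX, (1 SHL 4) OR 3
end;

The parentheses make the grouping explicit, so the example does not rely on the assembler's operator precedence.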


Reserved words always take precedence over user-defined identifiers. For example:

var
  Ch: Char;
// …
asm
  MOV  CH, 1
end;

loads 1 into the CH register, not into the Ch variable. To access a user-defined symbol with the same name as a reserved word, you must use the ampersand (&) override operator:

MOV  &Ch, 1
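
The sketch below places both forms side by side. OverrideDemo is a hypothetical routine; it declares Ch as a Byte (rather than Char) so that the variable is the same width as the CH register, and it assumes a Win32 target, where an asm statement can appear inside a begin...end block.

function OverrideDemo: Byte;
var
  Ch: Byte;        // name deliberately clashes with the CH register
begin
  Ch := 0;
  asm
    MOV  CH, 1     // reserved word takes precedence: writes the CH register
    MOV  &Ch, 2    // ampersand override: writes the Delphi variable Ch
  end;
  Result := Ch;    // reads the variable, not the register
end;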

It is best to avoid user-defined identifiers with the same names as built-in assembler reserved words.
