INTEL 80386 PROGRAMMER'S REFERENCE MANUAL 1986
Online Users astalalista main site site map free page free e-mail blog
Tell a friend
about this site
Reload page
Home
Site search
add a link to
this page
Add to favorites
Start page
Guestbook
Print page
contact AOL - ICQ
Buy from
Astalalista
insert your ADS
Add Link

powered by astalalista

get search for your site! - dating - more search - money
Random Link!
Hosted By
HostedScripts.com

INTEL 80386 PROGRAMMER'S REFERENCE MANUAL 1986

 go to Italian version

translate

Chapter 10  Initialization

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

After a signal on the RESET pin, certain registers of the 80386 are set to
predefined values. These values are adequate to enable execution of a
bootstrap program, but additional initialization must be performed by
software before all the features of the processor can be utilized.


10.1  Processor State After Reset

The contents of EAX depend upon the results of the power-up self test. The
self-test may be requested externally by assertion of BUSY# at the end of
RESET. The EAX register holds zero if the 80386 passed the test. A nonzero
value in EAX after self-test indicates that the particular 80386 unit is
faulty. If the self-test is not requested, the contents of EAX after RESET
is undefined.

DX holds a component identifier and revision number after RESET as Figure
10-1 illustrates. DH contains 3, which indicates an 80386 component. DL
contains a unique identifier of the revision level.

Control register zero (CR0) contains the values shown in Figure 10-2. The
ET bit of CR0 is set if an 80387 is present in the configuration (according
to the state of the ERROR# pin after RESET). If ET is reset, the
configuration either contains an 80287 or does not contain a coprocessor. A
software test is required to distinguish between these latter two
possibilities.

The remaining registers and flags are set as follows:

   EFLAGS             =00000002H
   IP                 =0000FFF0H
   CS selector        =000H
   DS selector        =0000H
   ES selector        =0000H
   SS selector        =0000H
   FS selector        =0000H
   GS selector        =0000H
   IDTR:
              base    =0
              limit   =03FFH

All registers not mentioned above are undefined.

These settings imply that the processor begins in real-address mode with
interrupts disabled.


Figure 10-1.  Contents of EDX after RESET

                                 EDX REGISTER

      31               23               15               7              0
     ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
     º±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±³       DH       ³      DL        º
     º±±±±±±±±±±±±UNDEFINED±±±±±±±±±±±±³   DEVICE ID    ³  STEPPING ID   º
     º±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±±³       3        ³   (UNIQUE)     º
     ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ


Figure 10-2.  Initial Contents of CR0

                               CONTROL REGISTER ZERO

   31                23                15                  7     4 3   1  0
  ÉÍÑÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÑÍÑÍÑÍÑÍÑÍ»
  ºP³                                                           ³E³T³E³M³Pº
  º ³                            UNDEFINED                      ³ ³ ³ ³ ³ º
  ºG³                                                           ³T³S³M³P³Eº
  ÈÑÏÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÏÑÏÑÏÑÏÑÏѼ
   ³                                                             ³ ³ ³ ³ ³
   ÀÄÄÄÄÄÄÄÄÄÄÄÄÄ0 - PAGING DISABLED                             ³ ³ ³ ³ ³
                 * - INDICATES PRESENCE OF 80387ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³ ³ ³ ³
                 0 - NO TASK SWITCHÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³ ³ ³
                 0 - DO NOT MONITOR COPROCESSORÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³ ³
                 0 - COPROCESSOR NOT PRESENTÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³
                 0 - PROTECTION NOT ENABLED (REAL ADDRESS MODE)ÄÄÄÄÄÄÄÄÄÄÙ


10.2  Software Initialization for Real-Address Mode

In real-address mode a few structures must be initialized before a program
can take advantage of all the features available in this mode.


10.2.1  Stack

No instructions that use the stack can be used until the stack-segment
register (SS) has been loaded. SS must point to an area in RAM.


10.2.2  Interrupt Table

The initial state of the 80386 leaves interrupts disabled; however, the
processor will still attempt to access the interrupt table if an exception
or nonmaskable interrupt (NMI) occurs. Initialization software should take
one of the following actions:

  þ  Change the limit value in the IDTR to zero. This will cause a shutdown
     if an exception or nonmaskable interrupt occurs. (Refer to the 80386
     Hardware Reference Manual to see how shutdown is signalled externally.)

  þ  Put pointers to valid interrupt handlers in all positions of the
     interrupt table that might be used by exceptions or interrupts.

  þ  Change the IDTR to point to a valid interrupt table.


10.2.3  First Instructions

After RESET, address lines A{31-20} are automatically asserted for
instruction fetches. This fact, together with the initial values of CS:IP,
causes instruction execution to begin at physical address FFFFFFF0H. Near
(intrasegment) forms of control transfer instructions may be used to pass
control to other addresses in the upper 64K bytes of the address space. The
first far (intersegment) JMP or CALL instruction causes A{31-20} to drop
low, and the 80386 continues executing instructions in the lower one
megabyte of physical memory. This automatic assertion of address lines
A{31-20} allows systems designers to use a ROM at the high end of
the address space to initialize the system.


10.3  Switching to Protected Mode

Setting the PE bit of the MSW in CR0 causes the 80386 to begin executing in
protected mode. The current privilege level (CPL) starts at zero. The
segment registers continue to point to the same linear addresses as in real
address mode (in real address mode, linear addresses are the same physical
addresses).

Immediately after setting the PE flag, the initialization code must flush
the processor's instruction prefetch queue by executing a JMP instruction.
The 80386 fetches and decodes instructions and addresses before they are
used; however, after a change into protected mode, the prefetched
instruction information (which pertains to real-address mode) is no longer
valid. A JMP forces the processor to discard the invalid information.


10.4  Software Initialization for Protected Mode

Most of the initialization needed for protected mode can be done either
before or after switching to protected mode. If done in protected mode,
however, the initialization procedures must not use protected-mode features
that are not yet initialized.


10.4.1  Interrupt Descriptor Table

The IDTR may be loaded in either real-address or protected mode. However,
the format of the interrupt table for protected mode is different than that
for real-address mode. It is not possible to change to protected mode and
change interrupt table formats at the same time; therefore, it is inevitable
that, if IDTR selects an interrupt table, it will have the wrong format at
some time. An interrupt or exception that occurs at this time will have
unpredictable results. To avoid this unpredictability, interrupts should
remain disabled until interrupt handlers are in place and a valid IDT has
been created in protected mode.


10.4.2  Stack

The SS register may be loaded in either real-address mode or protected
mode. If loaded in real-address mode, SS continues to point to the same
linear base-address after the switch to protected mode.


10.4.3  Global Descriptor Table

Before any segment register is changed in protected mode, the GDT register
must point to a valid GDT. Initialization of the GDT and GDTR may be done in
real-address mode. The GDT (as well as LDTs) should reside in RAM, because
the processor modifies the accessed bit of descriptors.


10.4.4  Page Tables

Page tables and the PDBR in CR3 can be initialized in either real-address
mode or in protected mode; however, the paging enabled (PG) bit of CR0
cannot be set until the processor is in protected mode. PG may be set
simultaneously with PE, or later. When PG is set, the PDBR in CR3 should
already be initialized with a physical address that points to a valid page
directory. The initialization procedure should adopt one of the following
strategies to ensure consistent addressing before and after paging is
enabled:

  þ  The page that is currently being executed should map to the same
     physical addresses both before and after PG is set.

  þ  A JMP instruction should immediately follow the setting of PG.


10.4.5  First Task

The initialization procedure can run awhile in protected mode without
initializing the task register; however, before the first task switch, the
following conditions must prevail:

  þ  There must be a valid task state segment (TSS) for the new task. The
     stack pointers in the TSS for privilege levels numerically less than or
     equal to the initial CPL must point to valid stack segments.

  þ  The task register must point to an area in which to save the current
     task state. After the first task switch, the information dumped in this
     area is not needed, and the area can be used for other purposes.


10.5  Initialization Example

$TITLE ('Initial Task')

    NAME    INIT

init_stack  SEGMENT RW
            DW  20  DUP(?)
tos         LABEL   WORD
init_stack  ENDS

init_data   SEGMENT RW PUBLIC
            DW  20  DUP(?)
init_data   ENDS

init_code   SEGMENT ER PUBLIC

ASSUME      DS:init_data

    nop
    nop
    nop
init_start:
                                    ; set up stack
    mov ax, init_stack
    mov ss, ax
    mov esp, offset tos

    mov a1,1
blink:
    xor a1,1
    out 0e4h,a1
    mov cx,3FFFh
here:
    dec cx
    jnz here

    jmp SHORT blink

    hlt
init_code   ends

    END init_start, SS:init_stack, DS:init_data

$TITLE('Protected Mode Transition -- 386 initialization')
NAME  RESET

;*****************************************************************
; Upon reset the 386 starts executing at address 0FFFFFFF0H.  The
; upper 12 address bits remain high until a FAR call or jump is
; executed.
;
; Assume the following:
;
;
; -  a short jump at address 0FFFFFFF0H (placed there by the
;    system builder) causes execution to begin at START in segment
;    RESET_CODE.
;
;
; -  segment RESET_CODE is based at physical address 0FFFF0000H,
;    i.e.   at the start of the last  64K in the 4G address space.
;    Note that  this is the base of the CS register at reset.  If
;    you locate ROMcode above  this  address,  you  will  need  to
;    figure out an adjustment factor to address things within this
;    segment.
;
;*****************************************************************
$EJECT ;

; Define addresses to locate GDT and IDT in RAM.
; These addresses are also used in the BLD386 file that defines
; the GDT and IDT. If you change these addresses, make sure you
; change the base addresses specified in the build file.

GDTbase         EQU    00001000H   ; physical address for GDT base
IDTbase         EQU    00000400H   ; physical address for IDT base

PUBLIC     GDT_EPROM
PUBLIC     IDT_EPROM
PUBLIC     START

DUMMY      segment rw      ; ONLY for ASM386 main module stack init
           DW 0
DUMMY   ends

;*****************************************************************
;
; Note: RESET CODE must be USEl6 because the 386 initally executes
;       in real mode.
;

RESET_CODE segment er PUBLIC    USE16

ASSUME DS:nothing, ES:nothing

;
; 386 Descriptor template

DESC       STRUC
    lim_0_15    DW  0              ; limit bits (0..15)
    bas_0_15    DW  0              ; base bits (0..15)
    bas_16_23   DB  0              ; base bits (16..23)
    access      DB  0              ; access byte
    gran        DB  0              ; granularity byte
    bas_24_31   DB  0              ; base bits (24..31)
DESC       ENDS

; The following is the layout of the real GDT created by BLD386.
; It is located in EPROM and will be copied to RAM.
;
; GDT[O]      ...  NULL
; GDT[1]      ...  Alias for RAM GDT
; GDT[2]      ...  Alias for RAM IDT
; GDT[2]      ...  initial task TSS
; GDT[3]      ...  initial task TSS alias
; GDT[4]      ...  initial task LDT
; GDT[5]      ...  initial task LDT alias

;
; define entries in GDT and IDT.

GDT_ENTRIES    EQU    8
IDT_ENTRIES    EQU    32

; define some constants to index into the real GDT

GDT_ALIAS      EQU    1*SIZE DESC
IDT_ALIAS      EQU    2*SIZE DESC
INIT_TSS       EQU    3*SIZE DESC
INIT_TSS_A     EQU    4*SIZE DESC
INIT_LDT       EQU    5*SIZE DESC
INIT_LDT_A     EQU    6*SIZE DESC

;
; location of alias in INIT_LDT

INIT_LDT_ALIAS    EQU    1*SIZE DESC

;
; access rights byte for DATA and TSS descriptors

DS_ACCESS   EQU   010010010B
TSS_ACCESS  EQU   010001001B


;
; This temporary GDT will be used to set up the real GDT in RAM.

Temp_GDT    LABEL   BYTE        ; tag for begin of scratch GDT

NULL_DES    DESC <>             ; NULL descriptor

                                ; 32-Gigabyte data segment based at 0
FLAT_DES    DESC <0FFFFH,0,0,92h,0CFh,0>

GDT_eprom     DP    ?           ; Builder places GDT address and limit
                                ; in this 6 byte area.

IDT_eprom     DP    ?           ; Builder places IDT address and limit
                                ; in this 6 byte area.

;
; Prepare operand for loadings GDTR and LDTR.


TGDT_pword     LABEL  PWORD                 ; for temp GDT
        DW     end_Temp_GDT_Temp_GDT -1
        DD     0

GDT_pword      LABEL  PWORD                 ; for GDT in RAM
        DW     GDT_ENTRIES * SIZE DESC -1
        DD     GDTbase

IDT_pword      LABEL   PWORD                ; for IDT in RAM
        DW     IDT_ENTRIES * SIZE DESC -1
        DD     IDTbase


end_Temp_GDT   LABEL   BYTE

;
; Define equates for addressing convenience.

GDT_DES_FLAT        EQU DS:GDT_ALIAS +GDTbase
IDT_DES_FLAT        EQU DS:IDT_ALIAS +GDTbase

INIT_TSS_A_OFFSET   EQU DS:INIT_TSS_A
INIT_TSS_OFFSET     EQU DS:INIT_TSS

INIT_LDT_A_OFFSET   EQU DS:INIT_LDT_A
INIT_LDT_OFFSET     EQU DS:INIT_LDT


; define pointer for first task switch

ENTRY POINTER LABEL DWORD
             DW 0, INIT_TSS

;******************************************************************
;
;   Jump from reset vector to here.

START:

    CLI                ;disable interrupts
    CLD                ;clear direction flag

    LIDT    NULL_des   ;force shutdown on errors

;
;   move scratch GDT to RAM at physical 0

    XOR DI,DI
    MOV ES,DI           ;point ES:DI to physical location 0


    MOV SI,OFFSET Temp_GDT
    MOV CX,end_Temp_GDT-Temp_GDT        ;set byte count
    INC CX
;
;   move table

    REP MOVS BYTE PTR ES:[DI],BYTE PTR CS:[SI]

    LGDT    tGDT_pword                ;load GDTR for Temp. GDT
                                      ;(located at 0)

;   switch to protected mode

    MOV EAX,CR0                       ;get current CRO
    MOV EAX,1                         ;set PE bit
    MOV CRO,EAX                       ;begin protected mode
;
;   clear prefetch queue

    JMP SHORT flush
flush:

; set DS,ES,SS to address flat linear space (0 ... 4GB)

    MOV BX,FLAT_DES-Temp_GDT
    MOV US,BX
    MOV ES,BX
    MOV SS,BX
;
; initialize stack pointer to some (arbitrary) RAM location

    MOV ESP, OFFSET end_Temp_GDT

;
; copy eprom GDT to RAM

    MOV ESI,DWORD PTR GDT_eprom +2 ; get base of eprom GDT
                                   ; (put here by builder).

    MOV EDI,GDTbase                ; point ES:EDI to GDT base in RAM.

    MOV CX,WORD PTR gdt_eprom +0   ; limit of eprom GDT
    INC CX
    SHR CX,1                       ; easier to move words
    CLD
    REP MOVS   WORD PTR ES:[EDI],WORD PTR DS:[ESI]

;
; copy eprom IDT to RAM
;
    MOV ESI,DWORD PTR IDT_eprom +2 ; get base of eprom IDT
                                   ; (put here by builder)

    MOV EDI,IDTbase                ; point ES:EDI to IDT base in RAM.

    MOV CX,WORD PTR idt_eprom +0   ; limit of eprom IDT
    INC CX
    SHR CX,1
    CLD
    REP MOVS   WORD PTR ES:[EDI],WORD PTR DS:[ESI]

; switch to RAM GDT and IDT
;
    LIDT IDT_pword
    LGDT GDT_pword

;
    MOV BX,GDT_ALIAS               ; point DS to GDT alias
    MOV DS,BX
;
; copy eprom TSS to RAM
;
    MOV BX,INIT_TSS_A              ; INIT TSS A descriptor base
                                   ; has RAM location of INIT TSS.

    MOV ES,BX                      ; ES points to TSS in RAM

    MOV BX,INIT_TSS                ; get inital task selector
    LAR DX,BX                      ; save access byte
    MOV [BX].access,DS_ACCESS      ; set access as data segment
    MOV FS,BX                      ; FS points to eprom TSS

    XOR si,si                      ; FS:si points to eprom TSS
    XOR di,di                      ; ES:di points to RAM TSS

    MOV CX,[BX].lim_0_15           ; get count to move
    INC CX

;
; move INIT_TSS to RAM.

    REP MOVS BYTE PTR ES:[di],BYTE PTR FS:[si]

    MOV [BX].access,DH             ; restore access byte

;
; change base of INIT TSS descriptor to point to RAM.

    MOV AX,INIT_TSS_A_OFFSET.bas_0_15
    MOV INIT_TSS_OFFSET.bas_0_15,AX
    MOV AL,INIT_TSS_A_OFFSET.bas_16_23
    MOV INIT_TSS_OFFSET.bas_16_23,AL
    MOV AL,INIT_TSS_A_OFFSET.bas_24_31
    MOV INIT_TSS_OFFSET.bas_24_31,AL

;
; change INIT TSS A to form a save area for TSS on first task
; switch. Use RAM at location 0.

    MOV BX,INIT_TSS_A
    MOV WORD PTR [BX].bas_0_15,0
    MOV [BX].bas_16_23,0
    MOV [BX].bas_24_31,0
    MOV [BX].access,TSS_ACCESS
    MOV [BX].gran,O
    LTR BX                         ; defines save area for TSS

;
; copy eprom LDT to RAM

    MOV BX,INIT_LDT_A              ; INIT_LDT_A descriptor has
                                   ; base address in RAM for INIT_LDT.

    MOV ES,BX                      ; ES points LDT location in RAM.

    MOV AH,[BX].bas_24_31
    MOV AL,[BX].bas_16_23
    SHL EAX,16
    MOV AX,[BX].bas_0_15           ; save INIT_LDT base (ram) in EAX

    MOV BX,INIT_LDT                ; get inital LDT selector
    LAR DX,BX                      ; save access rights
    MOV [BX].access,DS_ACCESS      ; set access as data segment
    MOV FS,BX                      ; FS points to eprom LDT

    XOR si,si                      ; FS:SI points to eprom LDT
    XOR di,di                      ; ES:DI points to RAM LDT

    MOV CX,[BX].lim_0_15           ; get count to move
    INC CX
;
; move initial LDT to RAM

    REP MOVS BYTE PTR ES:[di],BYTE PTR FS:[si]

    MOV [BX].access,DH             ; restore access rights in
                                   ; INIT_LDT descriptor

;
; change base of alias (of INIT_LDT) to point to location in RAM.

    MOV ES:[INIT_LDT_ALIAS].bas_0_15,AX
    SHR EAX,16
    MOV ES:[INIT_LDT_ALIAS].bas_16_23,AL
    MOV ES:[INIT_LDT_ALIAS].bas_24_31,AH

;
; now set the base value in INIT_LDT descriptor

    MOV AX,INIT_LDT_A_OFFSET.bas_0_15
    MOV INIT_LDT_OFFSET.bas_0_15,AX
    MOV AL,INIT_LDT_A_OFFSET.bas_16_23
    MOV INIT_LDT_OFFSET.bas_16_23,AL
    MOV AL,INIT_LDT_A_OFFSET.bas_24_31
    MOV INIT_LDT_OFFSET.bas_24_31,AL

;
; Now GDT, IDT, initial TSS and initial LDT are all set up.
;
; Start the first task!
'
   JMP ENTRY_POINTER

RESET_CODE ends
   END START, SS:DUMMY,DS:DUMMY


10.6  TLB Testing

The 80386 provides a mechanism for testing the Translation Lookaside Buffer
(TLB), the cache used for translating linear addresses to physical
addresses. Although failure of the TLB hardware is extremely unlikely, users
may wish to include TLB confidence tests among other power-up confidence
tests for the 80386.

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
NOTE
  This TLB testing mechanism is unique to the 80386 and may not be
  continued in the same way in future processors. Sortware that uses
  this mechanism may be incompatible with future processors.
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

When testing the TLB it is recommended that paging be turned off (PG=0 in
CR0) to avoid interference with the test data being written to the TLB.


10.6.1  Structure of the TLB

The TLB is a four-way set-associative memory. Figure 10-3 illustrates the
structure of the TLB. There are four sets of eight entries each. Each entry
consists of a tag and data. Tags are 24-bits wide. They contain the
high-order 20 bits of the linear address, the valid bit, and three attribute
bits. The data portion of each entry contains the high-order 20 bits of the
physical address.


10.6.2  Test Registers

Two test registers, shown in Figure 10-4, are provided for the purpose of
testing. TR6 is the test command register, and TR7 is the test data
register. These registers are accessed by variants of the MOV
instruction. A test register may be either the source operand or destination
operand. The MOV instructions are defined in both real-address mode and
protected mode. The test registers are privileged resources; in protected
mode, the MOV instructions that access them can only be executed at
privilege level 0. An attempt to read or write the test registers when
executing at any other privilege level causes a general
protection exception.

The test command register (TR6) contains a command and an address tag to
use in performing the command:

C         This is the command bit. There are two TLB testing commands:
          write entries into the TLB, and perform TLB lookups. To cause an
          immediate write into the TLB entry, move a doubleword into TR6
          that contains a 0 in this bit. To cause an immediate TLB lookup,
          move a doubleword into TR6 that contains a 1 in this bit.

Linear    On a TLB write, a TLB entry is allocated to this linear address;
Address   the rest of that TLB entry is set per the value of TR7 and the
          value just written into TR6. On a TLB lookup, the TLB is
          interrogated per this value; if one and only one TLB entry
          matches, the rest of the fields of TR6 and TR7 are set from the
          matching TLB entry.

V         The valid bit for this TLB entry. The TLB uses the valid bit to
          identify entries that contain valid data. Entries of the TLB
          that have not been assigned values have zero in the valid bit.
          All valid bits can be cleared by writing to CR3.

D, D#     The dirty bit (and its complement) for/from the TLB entry.

U, U#     The U/S bit (and its complement) for/from the TLB entry.

W, W#     The R/W bit (and its complement) for/from the TLB entry.

          The meaning of these pairs of bits is given by Table 10-1,
          where X represents D, U, or W.

The test data register (TR7) holds data read from or data to be written to
the TLB.

Physical  This is the data field of the TLB. On a write to the TLB, the
Address   TLB entry allocated to the linear address in TR6 is set to this
          value. On a TLB lookup, if HT is set, the data field (physical
          address) from the TLB is read out to this field. If HT is not
          set, this field is undefined.

HT        For a TLB lookup, the HT bit indicates whether the lookup was a
          hit (HT  1) or a miss (HT  0). For a TLB write, HT must be set
          to 1.

REP       For a TLB write, selects which of four associative blocks of the
          TLB is to be written. For a TLB read, if HT is set, REP reports
          in which of the four associative blocks the tag was found; if HT
          is not set, REP is undefined.


Table 10-1. Meaning of D, U, and W Bit Pairs

X     X#      Effect during        Value of bit X
              TLB Lookup           after TLB Write

0     0       (undefined)          (undefined)
0     1       Match if X=0         Bit X becomes 0
1     0       Match if X=1         Bit X becomes 1
1     1       (undefined)          (undefined)


Figure 10-3.  TLB Structure

                                   ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
                                  7º       TAG       º      DATA      º
                                   ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                                                                    
                 ÚÄÄÄÄÄÄÄ                                           
                 ³       SET 11                                     
                 ³    ÚÄÄ          ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                 ³    ³           1º       TAG       º      DATA      º
                 ³    ³            ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                 ³    ³           0º       TAG       º      DATA      º
                 ³    ³            ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ
                 ³    ³
                 ³    ³            ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
                 ³    ³           7º       TAG       º      DATA      º
                 ³    ³            ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                 ³    ³                                             
                 ³    ÀÄÄ                                           
                 ³       SET 10                                     
                 ³    ÚÄÄ          ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                 ³    ³           1º       TAG       º      DATA      º
      ³ D ³      ³    ³            ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
      ³ A ³      ³    ³           0º       TAG       º      DATA      º
      ³ T ÀÄÄÄÄÄÄÙ    ³            ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ
      ³ A             ³
      ³   ÚÄÄÄÄÄÄ¿    ³            ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
      ³ B ³      ³    ³           7º       TAG       º      DATA      º
      ³ U ³      ³    ³            ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
      ³ S ³      ³    ³                                             
                 ³    ÀÄÄ                                           
                 ³       SET 01                                     
                 ³    ÚÄÄ          ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                 ³    ³           1º       TAG       º      DATA      º
                 ³    ³            ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                 ³    ³           0º       TAG       º      DATA      º
                 ³    ³            ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ
                 ³    ³
                 ³    ³            ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
                 ³    ³           7º       TAG       º      DATA      º
                 ³    ³            ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                 ³    ³                                             
                 ³    ÀÄÄ                                           
                 ³       SET 00                                     
                 ÀÄÄÄÄÄÄÄ          ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                                  1º       TAG       º      DATA      º
                                   ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÎÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
                                  0º       TAG       º      DATA      º
                                   ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ


Figure 10-4.  Test Registers

      31                23              15   11      7             0
     ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍØÍÍÍÍÍÍÍØÍÍÍÍÍÑÍÑÍÍÍÑÍÍÍ»
     º                                      ³             ³H³   ³   º
     º           PHYSICAL ADDRESS           ³0 0 0 0 0 0 0³ ³REP³0 0º TR7
     º                                      ³             ³T³   ³   º
     ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÂÄÂÄÂÄÂÄÂÄÂÄÅÄÁÄÄÄÁÄÂĶ
     º                                      ³ ³ ³D³ ³U³ ³W³       ³ º
     º            LINEAR ADDRESS            ³V³D³ ³U³ ³ ³ ³0 0 0 0³Cº TR8
     º                                      ³ ³ ³#³ ³#³ ³#³       ³ º
     ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍØÍÏÍÏÍÏÍØÍÏÍÏÍÏÍÍÍÍÍÍÍÏͼ

     NOTE: 0 INDICATES INTEL RESERVED. NO NOT DEFINE


10.6.3  Test Operations

To write a TLB entry:

  1.  Move a doubleword to TR7 that contains the desired physical address,
      HT, and REP values. HT must contain 1. REP must point to the
      associative block in which to place the entry.

  2.  Move a doubleword to TR6 that contains the appropriate linear
      address, and values for V, D, U, and W. Be sure C=0 for "write"
      command.

Be careful not to write duplicate tags; the results of doing so are
undefined.

To look up (read) a TLB entry:

  1.  Move a doubleword to TR6 that contains the appropriate linear address
      and attributes. Be sure C=1 for "lookup" command.

  2.  Store TR7. If the HT bit in TR7 indicates a hit, then the other
      values reveal the TLB contents. If HT indicates a miss, then the other
      values in TR7 are indeterminate.

For the purposes of testing, the V bit functions as another bit of
addresss.  The V bit for a lookup request should usually be set, so that
uninitialized tags do not match. Lookups with V=0 are unpredictable if any
tags are uninitialized.


Chapter 11  Coprocessing and Multiprocessing

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

The 80386 has two levels of support for multiple parallel processing units:

  þ  A highly specialized interface for very closely coupled processors of
     a type known as coprocessors.

  þ  A more general interface for more loosely coupled processors of
     unspecified type.


11.1  Coprocessing

The components of the coprocessor interface include:

  þ  ET bit of control register zero (CR0)
  þ  The EM, and MP bits of CR0
  þ  The ESC instructions
  þ  The WAIT instruction
  þ  The TS bit of CR0
  þ  Exceptions


11.1.1  Coprocessor Identification

The 80386 is designed to operate with either an 80287 or 80387 math
coprocessor. The ET bit of CR0 indicates which type of coprocessor is
present. ET is set automatically by the 80386 after RESET according to the
level detected on the ERROR# input. If desired, ET may also be set or reset
by loading CR0 with a MOV instruction. If ET is set, the 80386 uses the
32-bit protocol of the 80387; if reset, the 80386 uses the 16-bit protocol
of the 80287.


11.1.2  ESC and WAIT Instructions

The 80386 interprets the pattern 11011B in the first five bits of an
instruction as an opcode intended for a coprocessor. Instructions thus
marked are called ESCAPE or ESC instructions. The CPU performs the following
functions upon encountering an ESC instruction before sending the
instruction to the coprocessor:

  þ  Tests the emulation mode (EM) flag to determine whether coprocessor
     functions are being emulated by software.

  þ  Tests the TS flag to determine whether there has been a context change
     since the last ESC instruction.

  þ  For some ESC instructions, tests the ERROR# pin to determine whether
     the coprocessor detected an error in the previous ESC instruction.

The WAIT instruction is not an ESC instruction, but WAIT causes the CPU to
perform some of the same tests that it performs upon encountering an ESC
instruction. The processor performs the following actions for a WAIT
instruction:

  þ  Waits until the coprocessor no longer asserts the BUSY# pin.

  þ  Tests the ERROR# pin (after BUSY# goes inactive). If ERROR# is active,
     the 80386 signals exception 16, which indicates that the coprocessor
     encountered an error in the previous ESC instruction.

  þ  WAIT can therefore be used to cause exception 16 if an error is
     pending from a previous ESC instruction. Note that, if no coprocessor
     is present, the ERROR# and BUSY# pins should be tied inactive to
     prevent WAIT from waiting forever or causing spurious exceptions.


11.1.3  EM and MP Flags

The EM and MP flags of CR0 control how the processor reacts to coprocessor
instructions.

The EM bit indicates whether coprocessor functions are to be emulated. If
the processor finds EM set when executing an ESC instruction, it signals
exception 7, giving the exception handler an opportunity to emulate the ESC
instruction.

The MP (monitor coprocessor) bit indicates whether a coprocessor is
actually attached. The MP flag controls the function of the WAIT
instruction. If, when executing a WAIT instruction, the CPU finds MP set,
then it tests the TS flag; it does not otherwise test TS during a WAIT
instruction. If it finds TS set under these conditions, the CPU signals
exception 7.

The EM and MP flags can be changed with the aid of a MOV instruction using
CR0 as the destination operand and read with the aid of a MOV instruction
with CR0 as the source operand. These forms of the MOV instruction can be
executed only at privilege level zero.


11.1.4  The Task-Switched Flag

The TS bit of CR0 helps to determine when the context of the coprocessor
does not match that of the task being executed by the 80386 CPU. The 80386
sets TS each time it performs a task switch (whether triggered by software
or by hardware interrupt). If, when interpreting one of the ESC
instructions, the CPU finds TS already set, it causes exception 7. The WAIT
instruction also causes exception 7 if both TS and MP are set. Operating
systems can use this exception to switch the context of the coprocessor to
correspond to the current task. Refer to the 80386 System Software Writer's
Guide for an example.

The CLTS instruction (legal only at privilege level zero) resets the TS
flag.


11.1.5  Coprocessor Exceptions

Three exceptions aid in interfacing to a coprocessor: interrupt 7
(coprocessor not available), interrupt 9 (coprocessor segment overrun), and
interrupt 16 (coprocessor error).


11.1.5.1  Interrupt 7 ÄÄ Coprocessor Not Available

This exception occurs in either of two conditions:

  1.  The CPU encounters an ESC instruction and EM is set. In this case,
      the exception handler should emulate the instruction that caused the
      exception. TS may also be set.

  2.  The CPU encounters either the WAIT instruction or an ESC instruction
      when both MP and TS are set. In this case, the exception handler
      should update the state of the coprocessor, if necessary.


11.1.5.2  Interrupt 9 ÄÄ Coprocessor Segment Overrun

This exception occurs in protected mode under the following conditions:

  þ  An operand of a coprocessor instruction wraps around an addressing
     limit (0FFFFH for small segments, 0FFFFFFFFH for big segments, zero for
     expand-down segments). An operand may wrap around an addressing limit
     when the segment limit is near an addressing limit and the operand is
     near the largest valid address in the segment. Because of the
     wrap-around, the beginning and ending addresses of such an operand
     will be near opposite ends of the segment.

  þ  Both the first byte and the last byte of the operand (considering
     wrap-around) are at addresses located in the segment and in present and
     accessible pages.

  þ  The operand spans inaccessible addresses. There are two ways that such
     an operand may also span inaccessible addresses:

     1.  The segment limit is not equal to the addressing limit (e.g.,
         addressing limit is FFFFH and segment limit is FFFDH); therefore,
         the operand will span addresses that are not within the segment
         (e.g., an 8-byte operand that starts at valid offset FFFC will span
         addresses FFFC-FFFF and 0000-0003; however, addresses FFFE and FFFF
         are not valid, because they exceed the limit);

     2.  The operand begins and ends in present and accessible pages but
         intermediate bytes of the operand fall either in a not-present page
         or in a page to which the current procedure does not have access
         rights.

The address of the failing numerics instruction and data operand may be
lost; an FSTENV does not return reliable addresses. As with the 80286/80287,
the segment overrun exception should be handled by executing an FNINIT
instruction (i.e., an FINIT without a preceding WAIT). The return address on
the stack does not necessarily point to the failing instruction nor to the
following instruction. The failing numerics instruction is not restartable.

Case 2 can be avoided by either aligning all segments on page boundaries or
by not starting them within 108 bytes of the start or end of a page. (The
maximum size of a coprocessor operand is 108 bytes.) Case 1 can be avoided
by making sure that the gap between the last valid offset and the first
valid offset of a segment is either no less than 108 bytes or is zero (i.e.,
the segment is of full size). If neither software system design constraint
is acceptable, the exception handler should execute FNINIT and should
probably terminate the task.


11.1.5.3  Interrupt 16 ÄÄ Coprocessor Error

The numerics coprocessors can detect six different exception conditions
during instruction execution. If the detected exception is not masked by a
bit in the control word, the coprocessor communicates the fact that an error
occurred to the CPU by a signal at the ERROR# pin. The CPU causes interrupt
16 the next time it checks the ERROR# pin, which is only at the beginning of
a subsequent WAIT or certain ESC instructions. If the exception is masked,
the numerics coprocessor handles the exception according to on-board logic;
it does not assert the ERROR# pin in this case.


11.2  General Multiprocessing

The components of the general multiprocessing interface include:

  þ  The LOCK# signal

  þ  The LOCK instruction prefix, which gives programmed control of the
     LOCK# signal.

  þ  Automatic assertion of the LOCK# signal with implicit memory updates
     by the processor


11.2.1  LOCK and the LOCK# Signal

The LOCK instruction prefix and its corresponding output signal LOCK# can
be used to prevent other bus masters from interrupting a data movement
operation. LOCK may only be used with the following 80386 instructions when
they modify memory. An undefined-opcode exception results from using LOCK
before any instruction other than:

  þ  Bit test and change: BTS, BTR, BTC.
  þ  Exchange: XCHG.
  þ  Two-operand arithmetic and logical: ADD, ADC, SUB, SBB, AND, OR, XOR.
  þ  One-operand arithmetic and logical: INC, DEC, NOT, and NEG.

A locked instruction is only guaranteed to lock the area of memory defined
by the destination operand, but it may lock a larger memory area. For
example, typical 8086 and 80286 configurations lock the entire physical
memory space. The area of memory defined by the destination operand is
guaranteed to be locked against access by a processor executing a locked
instruction on exactly the same memory area, i.e., an operand with
identical starting address and identical length.

The integrity of the lock is not affected by the alignment of the memory
field. The LOCK signal is asserted for as many bus cycles as necessary to
update the entire operand.


11.2.2  Automatic Locking

In several instances, the processor itself initiates activity on the data
bus. To help ensure that such activities function correctly in
multiprocessor configurations, the processor automatically asserts the LOCK#
signal. These instances include:

  þ  Acknowledging interrupts.

     After an interrupt request, the interrupt controller uses the data bus
     to send the interrupt ID of the interrupt source to the CPU. The CPU
     asserts LOCK# to ensure that no other data appears on the data bus
     during this time.

  þ  Setting busy bit of TSS descriptor.

     The processor tests and sets the busy-bit in the type field of the TSS
     descriptor when switching to a task. To ensure that two different
     processors cannot simultaneously switch to the same task, the processor
     asserts LOCK# while testing and setting this bit.

  þ  Loading of descriptors.

     While copying the contents of a descriptor from a descriptor table into
     a segment register, the processor asserts LOCK# so that the descriptor
     cannot be modified by another processor while it is being loaded. For
     this action to be effective, operating-system procedures that update
     descriptors should adhere to the following steps:

     ÄÄ  Use a locked update to the access-rights byte to mark the
         descriptor not-present.

     ÄÄ  Update the fields of the descriptor.  (This may require several
         memory accesses; therefore, LOCK cannot be used.)

     ÄÄ  Use a locked update to the access-rights byte to mark the
         descriptor present again.

  þ  Updating page-table A and D bits.

     The processor exerts LOCK# while updating the A (accessed) and D 
     (dirty) bits of page-table entries.  Also the processor bypasses the
     page-table cache and directly updates these bits in memory.

  þ  Executing XCHG instruction.

     The 80386 always asserts LOCK during an XCHG instruction that
     references memory (even if the LOCK prefix is not used).


11.2.3  Cache Considerations

Systems programmers must take care when updating shared data that may also
be stored in on-chip registers and caches.  With the 80386, such  shared
data includes:

  þ  Descriptors, which may be held in segment registers.

     A change to a descriptor that is shared among processors should be
     broadcast to all processors.  Segment registers are effectively
     "descriptor caches".  A change to a descriptor will not be utilized by
     another processor if that processor already has a copy of the old
     version of the descriptor in a segment register.

  þ  Page tables, which may be held in the page-table cache.

     A change to a page table that is shared among processors should be
     broadcast to all processors, so that others can flush their page-table
     caches and reload them with up-to-date page tables from memory.

Systems designers can employ an interprocessor interrupt to handle the
above cases. When one processor changes data that may be cached by other
processors, it can send an interrupt signal to all other processors that may
be affected by the change. If the interrupt is serviced by an interrupt
task, the task switch automatically flushes the segment registers. The task
switch also flushes the page-table cache if the PDBR (the contents of CR3)
of the interrupt task is different from the PDBR of every other task.

In multiprocessor systems that need a cacheability signal from the CPU, it
is recommended that physical address pin A31 be used to indicate
cacheability. Such a system can then possess up to 2 Gbytes of physical
memory. The virtual address range available to the programmer is not
affected by this convention.


Chapter 12  Debugging

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

The 80386 brings to Intel's line of microprocessors significant advances in
debugging power. The single-step exception and breakpoint exception of
previous processors are still available in the 80386, but the principal
debugging support takes the form of debug registers. The debug registers
support both instruction breakpoints and data breakpoints. Data breakpoints
are an important innovation that can save hours of debugging time by
pinpointing, for example, exactly when a data structure is being
overwritten. The breakpoint registers also eliminate the complexities
associated with writing a breakpoint instruction into a code segment
(requires a data-segment alias for the code segment) or a code segment
shared by multiple tasks (the breakpoint exception can occur in the context
of any of the tasks). Breakpoints can even be set in code contained in ROM.


12.1  Debugging Features of the Architecture

The features of the 80386 architecture that support debugging include:

Reserved debug interrupt vector

Permits processor to automatically invoke a debugger task or procedure when
an event occurs that is of interest to the debugger.

Four debug address registers

Permit programmers to specify up to four addresses that the CPU will
automatically monitor.

Debug control register

Allows programmers to selectively enable various debug conditions
associated with the four debug addresses.

Debug status register

Helps debugger identify condition that caused debug exception.

Trap bit of TSS (T-bit)

Permits monitoring of task switches.

Resume flag (RF) of flags register

Allows an instruction to be restarted after a debug exception without
immediately causing another debug exception due to the same condition.

Single-step flag (TF)

Allows complete monitoring of program flow by specifying whether the CPU
should cause a debug exception with the execution of every instruction.

Breakpoint instruction

Permits debugger intervention at any point in program execution and aids
debugging of debugger programs.

Reserved interrupt vector for breakpoint exception

Permits processor to automatically invoke a handler task or procedure upon
encountering a breakpoint instruction.

These features make it possible to invoke a debugger that is either a
separate task or a procedure in the context of the current task. The
debugger can be invoked under any of the following kinds of conditions:

  þ  Task switch to a specific task.
  þ  Execution of the breakpoint instruction.
  þ  Execution of every instruction.
  þ  Execution of any instruction at a given address.
  þ  Read or write of a byte, word, or doubleword at any specified address.
  þ  Write to a byte, word, or doubleword at any specified address.
  þ  Attempt to change a debug register.


12.2  Debug Registers

Six 80386 registers are used to control debug features. These registers are
accessed by variants of the MOV instruction. A debug register may be either
the source operand or destination operand. The debug registers are
privileged resources; the MOV instructions that access them can only be
executed at privilege level zero. An attempt to read or write the debug
registers when executing at any other privilege level causes a general
protection exception. Figure 12-1 shows the format of the debug registers.


Figure 12-1.  Debug Registers

      31              23              15              7             0
     ÉÍÍÍÑÍÍÍÑÍÍÍÑÍÍÍØÍÍÍÑÍÍÍÑÍÍÍÑÍÍÍØÍÍÍÑÍÑÍÍÍÍÍÑÍÑÍØÍÑÍÑÍÑÍÑÍÑÍÑÍÑÍ»
     ºLEN³R/W³LEN³R/W³LEN³R/W³LEN³R/W³   ³ ³     ³G³L³G³L³G³L³G³L³G³Lº
     º   ³   ³   ³   ³   ³   ³   ³   ³0 0³0³0 0 0³ ³ ³ ³ ³ ³ ³ ³ ³ ³ º DR7
     º 3 ³ 3 ³ 2 ³ 2 ³ 1 ³ 1 ³ 0 ³ 0 ³   ³ ³     ³E³E³3³3³2³2³1³1³0³0º
     ÇÄÄÄÁÄÄÄÁÄÄÄÁÄÄÄÁÄÄÄÁÄÄÄÁÄÄÄÁÄÄÄÅÄÂÄÅÄÅÄÄÄÄÄÁÄÁÄÁÄÁÄÁÄÁÄÅÄÅÄÅÄÅĶ
     º                               ³B³B³B³                 ³B³B³B³Bº
     º0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0³ ³ ³ ³0 0 0 0 0 0 0 0 0³ ³ ³ ³ º DR6
     º                               ³T³S³D³                 ³3³2³1³0º
     ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÁÄÁÄÁÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÁÄÁÄÁÄÁĶ
     º                                                               º
     º                            RESERVED                           º DR5
     º                                                               º
     ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ
     º                                                               º
     º                            RESERVED                           º DR4
     º                                                               º
     ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ
     º                                                               º
     º                 BREAKPOINT 3 LINEAR ADDRESS                   º DR3
     º                                                               º
     ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ
     º                                                               º
     º                 BREAKPOINT 2 LINEAR ADDRESS                   º DR2
     º                                                               º
     ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ
     º                                                               º
     º                 BREAKPOINT 1 LINEAR ADDRESS                   º DR1
     º                                                               º
     ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÅÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ
     º                                                               º
     º                 BREAKPOINT 0 LINEAR ADDRESS                   º DR0
     º                                                               º
     ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍͼ

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
NOTE
      0 MEANS INTEL RESERVED. DO NOT DEFINE.
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ


12.2.1  Debug Address Registers (DR0-DR3)

Each of these registers contains the linear address associated with one of
four breakpoint conditions. Each breakpoint condition is further defined by
bits in DR7.

The debug address registers are effective whether or not paging is enabled.
The addresses in these registers are linear addresses. If paging is enabled,
the linear addresses are translated into physical addresses by the
processor's paging mechanism (as explained in Chapter 5). If paging is not
enabled, these linear addresses are the same as physical addresses.

Note that when paging is enabled, different tasks may have different
linear-to-physical address mappings. When this is the case, an address in a
debug address register may be relevant to one task but not to another. For
this reason the 80386  has both global and local enable bits in DR7. These
bits indicate whether a given debug address has a global (all tasks) or
local (current task only) relevance.


12.2.2  Debug Control Register (DR7)

The debug control register shown in Figure 12-1 both helps to define the
debug conditions and selectively enables and disables those conditions.

For each address in registers DR0-DR3, the corresponding fields R/W0
through R/W3 specify the type of action that should cause a breakpoint. The
processor interprets these bits as follows:

   00 ÄÄ Break on instruction execution only
   01 ÄÄ Break on data writes only
   10 ÄÄ undefined
   11 ÄÄ Break on data reads or writes but not instruction fetches

Fields LEN0 through LEN3 specify the length of data item to be monitored. A
length of 1, 2, or 4 bytes may be specified. The values of the length fields
are interpreted as follows:

   00 ÄÄ one-byte length
   01 ÄÄ two-byte length
   10 ÄÄ undefined
   11 ÄÄ four-byte length

If RWn is 00 (instruction execution), then LENn should also be 00. Any other
length is undefined.

The low-order eight bits of DR7 (L0 through L3 and G0 through G3)
selectively enable the four address breakpoint conditions. There are two
levels of enabling: the local (L0 through L3) and global (G0 through G3)
levels. The local enable bits are automatically reset by the processor at
every task switch to avoid unwanted breakpoint conditions in the new task.
The global enable bits are not reset by a task switch; therefore, they can
be used for conditions that are global to all tasks.

The LE and GE bits control the "exact data breakpoint match" feature of the
processor. If either LE or GE is set, the processor slows execution so that
data breakpoints are reported on the instruction that causes them. It is
recommended that one of these bits be set whenever data breakpoints are
armed. The processor clears LE at a task switch but does not clear GE.


12.2.3  Debug Status Register (DR6)

The debug status register shown in Figure 12-1 permits the debugger to
determine which debug conditions have occurred.

When the processor detects an enabled debug exception, it sets the
low-order bits of this register (B0 thru B3) before entering the debug
exception handler. Bn is set if the condition described by DRn, LENn, and
R/Wn occurs. (Note that the processor sets Bn regardless of whether Gn or
Ln is set. If more than one breakpoint condition occurs at one time and if
the breakpoint trap occurs due to an enabled condition other than n, Bn may
be set, even though neither Gn nor Ln is set.)

The BT bit is associated with the T-bit (debug trap bit) of the TSS (refer
to 7 for the location of the T-bit). The processor sets the BT bit before
entering the debug handler if a task switch has occurred and the T-bit of
the new TSS is set. There is no corresponding bit in DR7 that enables and
disables this trap; the T-bit of the TSS is the sole enabling bit.

The BS bit is associated with the TF (trap flag) bit of the EFLAGS
register. The BS bit is set if the debug handler is entered due to the
occurrence of a single-step exception. The single-step trap is the
highest-priority debug exception; therefore, when BS is set, any of the
other debug status bits may also be set.

The BD bit is set if the next instruction will read or write one of the
eight debug registers and ICE-386 is also using the debug registers at the
same time.

Note that the bits of DR6 are never cleared by the processor. To avoid any
confusion in identifying the next debug exception, the debug handler should
move zeros to DR6 immediately before returning.


12.2.4  Breakpoint Field Recognition

The linear address and LEN field for each of the four breakpoint conditions
define a range of sequential byte addresses for a data breakpoint. The LEN
field permits specification of a one-, two-, or four-byte field. Two-byte
fields must be aligned on word boundaries (addresses that are multiples of
two) and four-byte fields must be aligned on doubleword boundaries
(addresses that are multiples of four). These requirements are enforced by
the processor; it uses the LEN bits to mask the low-order bits of the
addresses in the debug address registers. Improperly aligned code or data
breakpoint addresses will not yield the expected results.

A data read or write breakpoint is triggered if any of the bytes
participating in a memory access is within the field defined by a breakpoint
address register and the corresponding LEN field. Table 12-1 gives some
examples of breakpoint fields with memory references that both do and do not
cause traps.

To set a data breakpoint for a misaligned field longer than one byte, it
may be desirable to put two sets of entries in the breakpoint register such
that each entry is properly aligned and the two entries together span the
length of the field.

Instruction breakpoint addresses must have a length specification of one
byte (LEN = 00); other values are undefined. The processor recognizes an
instruction breakpoint address only when it points to the first byte of an
instruction. If the instruction has any prefixes, the breakpoint address
must point to the first prefix.


Table 12-1. Breakpoint Field Recognition Examples

                                    Address (hex)          Length

                       DR0             0A0001          1 (LEN0 = 00)
Register Contents      DR1             0A0002          1 (LEN1 = 00)
                       DR2             0B0002          2 (LEN2 = 01)
                       DR3             0C0000          4 (LEN3 = 11)

Some Examples of Memory                0A0001          1
References That Cause Traps            0A0002          1
                                       0A0001          2
                                       0A0002          2
                                       0B0002          2
                                       0B0001          4
                                       0C0000          4
                                       0C0001          2
                                       0C0003          1

Some Examples of Memory                0A0000          1
References That Don't Cause Traps      0A0003          4
                                       0B0000          2
                                       0C0004          4


12.3  Debug Exceptions

Two of the interrupt vectors of the 80386 are reserved for exceptions that
relate to debugging. Interrupt 1 is the primary means of invoking debuggers
designed expressly for the 80386; interrupt 3 is intended for debugging
debuggers and for compatibility with prior processors in Intel's 8086
processor family.


12.3.1  Interrupt 1 ÄÄ Debug Exceptions

The handler for this exception is usually a debugger or part of a debugging
system. The processor causes interrupt 1 for any of several conditions. The
debugger can check flags in DR6 and DR7 to determine what condition caused
the exception and what other conditions might be in effect at the same time.
Table 12-2 associates with each breakpoint condition the combination of
bits that indicate when that condition has caused the debug exception.

Instruction address breakpoint conditions are faults, while other debug
conditions are traps. The debug exception may report either or both at one
time. The following paragraphs present details for each class of debug
exception.


Table 12-2. Debug Exception Conditions

Flags to Test              Condition

BS=1                       Single-step trap
B0=1 AND (GE0=1 OR LE0=1)  Breakpoint DR0, LEN0, R/W0
B1=1 AND (GE1=1 OR LE1=1)  Breakpoint DR1, LEN1, R/W1
B2=1 AND (GE2=1 OR LE2=1)  Breakpoint DR2, LEN2, R/W2
B3=1 AND (GE3=1 OR LE3=1)  Breakpoint DR3, LEN3, R/W3
BD=1                       Debug registers not available; in use by ICE-386.
BT=1                       Task switch


12.3.1.1  Instruction Addrees Breakpoint

The processor reports an instruction-address breakpoint before it executes
the instruction that begins at the given address; i.e., an instruction-
address breakpoint exception is a fault.

The RF (restart flag) permits the debug handler to retry instructions that
cause other kinds of faults in addition to debug faults. When it detects a
fault, the processor automatically sets RF in the flags image that it pushes
onto the stack. (It does not, however, set RF for traps and aborts.)

When RF is set, it causes any debug fault to be ignored during the next
instruction. (Note, however, that RF does not cause breakpoint traps to be
ignored, nor other kinds of faults.)

The processor automatically clears RF at the successful completion of every
instruction except after the IRET instruction, after the POPF instruction,
and after a JMP, CALL, or INT instruction that causes a task switch. These
instructions set RF to the value specified by the memory image of the EFLAGS
register.

The processor automatically sets RF in the EFLAGS image on the stack before
entry into any fault handler. Upon entry into the fault handler for
instruction address breakpoints, for example, RF is set in the EFLAGS image
on the stack; therefore, the IRET instruction at the end of the handler will
set RF in the EFLAGS register, and execution will resume at the breakpoint
address without generating another breakpoint fault at the same address.

If, after a debug fault, RF is set and the debug handler retries the
faulting instruction, it is possible that retrying the instruction will
raise other faults. The retry of the instruction after these faults will
also be done with RF=1, with the result that debug faults continue to be
ignored. The processor clears RF only after successful completion of the
instruction.

Real-mode debuggers can control the RF flag by using a 32-bit IRET. A
16-bit IRET instruction does not affect the RF bit (which is in the
high-order 16 bits of EFLAGS). To use a 32-bit IRET, the debugger must
rearrange the stack so that it holds appropriate values for the 32-bit EIP,
CS, and EFLAGS (with RF set in the EFLAGS image). Then executing an IRET
with an operand size prefix causes a 32-bit return, popping the RF flag
into EFLAGS.


12.3.1.2  Data Address Breakpoint

A data-address breakpoint exception is a trap; i.e., the processor reports
a data-address breakpoint after executing the instruction that accesses the
given memory item.

When using data breakpoints it is recommended that either the LE or GE bit
of DR7 be set also. If either LE or GE is set, any data breakpoint trap is
reported exactly after completion of the instruction that accessed the
specified memory item. This exact reporting is accomplished by forcing the
80386 execution unit to wait for completion of data operand transfers before
beginning execution of the next instruction. If neither GE nor LE is set,
data breakpoints may not be reported until one instruction after the data is
accessed or may not be reported at all. This is due to the fact that,
normally, instruction execution is overlapped with memory transfers to such
a degree that execution of the next instruction may begin before memory
transfers for the prior instruction are completed.

If a debugger needs to preserve the contents of a write breakpoint
location, it should save the original contents before setting a write
breakpoint. Because data breakpoints are traps, a write into a breakpoint
location will complete before the trap condition is reported. The handler
can report the saved value after the breakpoint is triggered. The data in
the debug registers can be used to address the new value stored by the
instruction that triggered the breakpoint.


12.3.1.3  General Detect Fault

This exception occurs when an attempt is made to use the debug registers at
the same time that ICE-386 is using them. This additional protection feature
is provided to guarantee that ICE-386 can have full control over the
debug-register resources when required. ICE-386 uses the debug-registers;
therefore, a software debugger that also uses these registers cannot run
while ICE-386 is in use. The exception handler can detect this condition by
examining the BD bit of DR6.


12.3.1.4  Single-Step Trap

This debug condition occurs at the end of an instruction if the trap flag
(TF) of the flags register held the value one at the beginning of that
instruction.  Note that the exception does not occur at the end of an
instruction that sets TF. For example, if POPF is used to set TF, a
single-step trap does not occur until after the instruction that follows
POPF.

The processor clears the TF bit before invoking the handler.  If TF=1 in
the flags image of a TSS at the time of a task switch, the exception occurs
after the first instruction is executed in the new task.

The single-step flag is normally not cleared by privilege changes inside a
task.  INT instructions, however, do clear TF.  Therefore, software
debuggers that single-step code must recognize and emulate INT n or INTO
rather than executing them directly.

To maintain protection, system software should check the current execution
privilege level after any single step interrupt to see whether single
stepping should continue at the current privilege level.

The interrupt priorities in hardware guarantee that if an external
interrupt occurs, single stepping stops. When both an external interrupt and
a single step interrupt occur together, the single step interrupt is
processed first. This clears the TF bit. After saving the return address or
switching tasks, the external interrupt input is examined before the first
instruction of the single step handler executes.  If the external interrupt
is still pending, it is then serviced. The external interrupt handler is not
single-stepped. To single step an interrupt handler, just single step an INT
n instruction that refers to the interrupt handler.


12.3.1.5  Task Switch Breakpoint

The debug exception also occurs after a switch to an 80386 task if the
T-bit of the new TSS is set.  The exception occurs after control has passed
to the new task, but before the first instruction of that task is executed.
The exception handler can detect this condition by examining the BT bit of
the debug status register DR6.

Note that if the debug exception handler is a task, the T-bit of its TSS
should not be set. Failure to observe this rule will cause the processor to
enter an infinite loop.


12.3.2  Interrupt 3 ÄÄ Breakpoint Exception

This exception is caused by execution of the breakpoint instruction INT 3.
Typically, a debugger prepares a breakpoint by substituting the opcode of
the one-byte breakpoint instruction in place of the first opcode byte of the
instruction to be trapped. When execution of the INT 3 instruction causes
the exception handler to be invoked, the saved value of ES:EIP points to the
byte following the INT 3 instruction.

With prior generations of processors, this feature is used extensively for
trapping execution of specific instructions. With the 80386, the needs
formerly filled by this feature are more conveniently solved via the debug
registers and interrupt 1.  However, the breakpoint exception is still
useful for debugging debuggers, because the breakpoint exception can vector
to a different exception handler than that used by the debugger. The
breakpoint exception can also be useful when it is necessary to set a
greater number of breakpoints than permitted by the debug registers.


                          PART III  COMPATIBILITY                          


Chapter 13  Executing 80286 Protected-Mode Code

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

13.1  80286 Code Executes as a Subset of the 80386

In general, programs designed for execution in protected mode on an 80286
execute without modification on the 80386, because the features of the 80286
are a subset of those of the 80386.

All the descriptors used by the 80286 are supported by the 80386 as long as
the Intel-reserved word (last word) of the 80286 descriptor is zero.

The descriptors for data segments, executable segments, local descriptor
tables, and task gates are common to both the 80286 and the 80386. Other
80286 descriptorsÄÄTSS segment, call gate, interrupt gate, and trap
gateÄÄare supported by the 80386. The 80386 also has new versions of
descriptors for TSS segment, call gate, interrupt gate, and trap gate that
support the 32-bit nature of the 80386. Both sets of descriptors can be
used simultaneously in the same system.

For those descriptors that are common to both the 80286 and the 80386, the
presence of zeros in the final word causes the 80386 to interpret these
descriptors exactly as 80286 does; for example:

Base Address      The high-order eight bits of the 32-bit base address are
                  zero, limiting base addresses to 24 bits.

Limit             The high-order four bits of the limit field are zero,
                  restricting the value of the limit field to 64K.

Granularity bit   The granularity bit is zero, which implies that the value
                  of the 16-bit limit is interpreted in units of one byte.

B-bit             In a data-segment descriptor, the B-bit is zero, implying
                  that the segment is no larger than 64 Kbytes.

D-bit             In an executable-segment descriptor, the D-bit is zero,
                  implying that 16-bit addressing and operands are the
                  default.

For formats of these descriptors and documentation of their use refer to
the iAPX 286 Programmer's Reference Manual.


13.2  Two ways to Execute 80286 Tasks

When porting 80286 programs to the 80386, there are two cases to consider:

  1.  Porting an entire 80286 system to the 80386, complete with 80286
      operating system, loader, and system builder.

      In this case, all tasks will have 80286 TSSs. The 80386 is being used
      as a faster 286.

  2.  Porting selected 80286 applications to run in an 80386 environment
      with an 80386 operating system, loader, and system builder.

      In this case, the TSSs used to represent 80286 tasks should be
      changed to 80386 TSSs. It is theoretically possible to mix 80286 and
      80386 TSSs, but the benefits are slight and the problems are great. It
      is recommended that all tasks in a 80386 software system have 80386
      TSSs. It is not necessary to change the 80286 object modules
      themselves; TSSs are usually constructed by the operating system, by
      the loader, or by the system builder. Refer to Chapter 16 for further
      discussion of the interface between 16-bit and 32-bit code.


13.3  Differences From 80286

The few differences that do exist primarily affect operating system code.


13.3.1  Wraparound of 80286 24-Bit Physical Address Space

With the 80286, any base and offset combination that addresses beyond 16M
bytes wraps around to the first megabyte of the 80286 address space. With
the 80386, since it has a greater physical address space, any such address
falls into the 17th megabyte. In the unlikely event that any software
depends on this anomaly, the same effect can be simulated on the 80386 by
using paging to map the first 64K bytes of the 17th megabyte of logical
addresses to physical addresses in the first megabyte.


13.3.2  Reserved Word of Descriptor

Because the 80386 uses the contents of the reserved word (last word) of
every descriptor, 80286 programs that place values in this word may not
execute correctly on the 80386.


13.3.3  New Descriptor Type Codes

Operating-system code that manages space in descriptor tables often uses an
invalid value in the access-rights field of descriptor-table entries to
identify unused entries. Access rights values of 80H and 00H remain invalid
for both the 80286 and 80386. Other values that were invalid on for the
80286 may be valid for the 80386 because of the additional descriptor types
defined by the 80386.


13.3.4  Restricted Semantics of LOCK

The 80286 processor implements the bus lock function differently than the
80386. Programs that use forms of memory locking specific to the 80286 may
not execute properly when transported to a specific application of the
80386.

The LOCK prefix and its corresponding output signal should only be used to
prevent other bus masters from interrupting a data movement operation.  LOCK
may only be used with the following 80386 instructions when they modify
memory. An undefined-opcode exception results from using LOCK before any
other instruction.

  þ  Bit test and change:  BTS, BTR, BTC.
  þ  Exchange: XCHG.
  þ  One-operand arithmetic and logical: INC, DEC, NOT, and NEG.
  þ  Two-operand arithmetic and logical:  ADD, ADC, SUB, SBB, AND, OR, XOR.

A locked instruction is guaranteed to lock only the area of memory defined
by the destination operand, but may lock a larger memory area.  For example,
typical 8086 and 80286 configurations lock the entire physical memory space.
With the 80386, the defined area of memory is guaranteed to be locked
against access by a processor executing a locked instruction on exactly the
same memory area, i.e., an operand with identical starting address and
identical length.


13.3.5  Additional Exceptions

The 80386 defines new exceptions that can occur even in systems designed
for the 80286.

  þ  Exception #6 ÄÄ invalid opcode

     This exception can result from improper use of the LOCK instruction.

  þ  Exception #14 ÄÄ page fault

     This exception may occur in an 80286 program if the operating system
     enables paging. Paging can be used in a system with 80286 tasks as long
     as all tasks use the same page directory. Because there is no place in
     an 80286 TSS to store the PDBR, switching to an 80286 task does not
     change the value of PDBR. Tasks ported from the 80286 should be given
     80386 TSSs so they can take full advantage of paging.


Chapter 14  80386 Real-Address Mode

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

The real-address mode of the 80386 executes object code designed for
execution on 8086, 8088, 80186, or 80188 processors, or for execution in the
real-address mode of an 80286:

In effect, the architecture of the 80386 in this mode is almost identical
to that of the 8086, 8088, 80186, and 80188. To a programmer, an 80386 in
real-address mode appears as a high-speed 8086 with extensions to the
instruction set and registers. The principal features of this architecture
are defined in Chapters 2 and 3.

This chapter discusses certain additional topics that complete the system
programmer's view of the 80386 in real-address mode:

  þ  Address formation.
  þ  Extensions to registers and instructions.
  þ  Interrupt and exception handling.
  þ  Entering and leaving real-address mode.
  þ  Real-address-mode exceptions.
  þ  Differences from 8086.
  þ  Differences from 80286 real-address mode.


14.1  Physical Address Formation

The 80386 provides a one Mbyte + 64 Kbyte memory space for an 8086 program.
Segment relocation is performed as in the 8086: the 16-bit value in a
segment selector is shifted left by four bits to form the base address of a
segment. The effective address is extended with four high order zeros and
added to the base to form a linear address as Figure 14-1 illustrates. (The
linear address is equivalent to the physical address, because paging is not
used in real-address mode.) Unlike the 8086, the resulting linear address
may have up to 21 significant bits. There is a possibility of a carry when
the base address is added to the effective address. On the 8086, the carried
bit is truncated, whereas on the 80386 the carried bit is stored in bit
position 20 of the linear address.

Unlike the 8086 and 80286, 32-bit effective addresses can be generated (via
the address-size prefix); however, the value of a 32-bit address may not
exceed 65535 without causing an exception. For full compatibility with 80286
real-address mode, pseudo-protection faults (interrupt 12 or 13 with no
error code) occur if an effective address is generated outside the range 0
through 65535.


Figure 14-1.  Real-Address Mode Address Formation

                      19                                3       0
                     ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍ»
         BASE        º     16-BIT SEGMENT SELECTOR     ³ 0 0 0 0 º
                     ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍͼ

         +
                      19        15                              0
                     ÉÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
         OFFSET      º 0 0 0 0 ³    16-BIT EFFECTIVE ADDRESS     º
                     ÈÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ

         =
                    20                                          0
         LINEAR    ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
         ADDRESS   º X X X X X X X X X X X X X X X X X X X X X X º
                   ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ


14.2  Registers and Instructions

The register set available in real-address mode includes all the registers
defined for the 8086 plus the new registers introduced by the 80386: FS, GS,
debug registers, control registers, and test registers. New instructions
that explicitly operate on the segment registers FS and GS are available,
and the new segment-override prefixes can be used to cause instructions to
utilize FS and GS for address calculations. Instructions can utilize 32-bit
operands through the use of the operand size prefix.

The instruction codes that cause undefined opcode traps (interrupt 6)
include instructions of the protected mode that manipulate or interrogate
80386 selectors and descriptors; namely, VERR, VERW, LAR, LSL, LTR, STR,
LLDT, and SLDT. Programs executing in real-address mode are able to take
advantage of the new applications-oriented instructions added to the
architecture by the introduction of the 80186/80188, 80286 and 80386:

þ New instructions introduced by 80186/80188 and 80286.

   ÄÄ PUSH immediate data
   ÄÄ Push all and pop all (PUSHA and POPA)
   ÄÄ Multiply immediate data
   ÄÄ Shift and rotate by immediate count
   ÄÄ String I/O
   ÄÄ ENTER and LEAVE
   ÄÄ BOUND

þ New instructions introduced by 80386.

   ÄÄ LSS, LFS, LGS instructions
   ÄÄ Long-displacement conditional jumps
   ÄÄ Single-bit instructions
   ÄÄ Bit scan
   ÄÄ Double-shift instructions
   ÄÄ Byte set on condition
   ÄÄ Move with sign/zero extension
   ÄÄ Generalized multiply
   ÄÄ MOV to and from control registers
   ÄÄ MOV to and from test registers
   ÄÄ MOV to and from debug registers


14.3  Interrupt and Exception Handling

Interrupts and exceptions in 80386 real-address mode work as much as they
do on an 8086. Interrupts and exceptions vector to interrupt procedures via
an interrupt table. The processor multiplies the interrupt or exception
identifier by four to obtain an index into the interrupt table. The entries
of the interrupt table are far pointers to the entry points of interrupt or
exception handler procedures. When an interrupt occurs, the processor
pushes the current values of CS:IP onto the stack, disables interrupts,
clears TF (the single-step flag), then transfers control to the location
specified in the interrupt table. An IRET instruction at the end of the
handler procedure reverses these steps before returning control to the
interrupted procedure.

The primary difference in the interrupt handling of the 80386 compared to
the 8086 is that the location and size of the interrupt table depend on the
contents of the IDTR (IDT register). Ordinarily, this fact is not apparent
to programmers, because, after RESET, the IDTR contains a base address of 0
and a limit of 3FFH, which is compatible with the 8086. However, the LIDT
instruction can be used in real-address mode to change the base and limit
values in the IDTR. Refer to Chapter 9 for details on the IDTR, and the
LIDT and SIDT instructions. If an interrupt occurs and the corresponding
entry of the interrupt table is beyond the limit stored in the IDTR, the
processor raises exception 8.


14.4  Entering and Leaving Real-Address Mode

Real-address mode is in effect after a signal on the RESET pin. Even if the
system is going to be used in protected mode, the start-up program will
execute in real-address mode temporarily while initializing for protected
mode.


14.4.1  Switching to Protected Mode

The only way to leave real-address mode is to switch to protected mode. The
processor enters protected mode when a MOV to CR0 instruction sets the PE
(protection enable) bit in CR0. (For compatibility with the 80286, the LMSW
instruction may also be used to set the PE bit.)

Refer to Chapter 10 "Initialization" for other aspects of switching to
protected mode.


14.5  Switching Back to Real-Address Mode

The processor reenters real-address mode if software clears the PE bit in
CR0 with a MOV to CR0 instruction. A procedure that attempts to do this,
however, should proceed as follows:

  1.  If paging is enabled, perform the following sequence:

      þ  Transfer control to linear addresses that have an identity mapping;
         i.e., linear addresses equal physical addresses.

      þ  Clear the PG bit in CR0.

      þ  Move zeros to CR3 to clear out the paging cache.

  2.  Transfer control to a segment that has a limit of 64K (FFFFH). This
      loads the CS register with the limit it needs to have in real mode.

  3.  Load segment registers SS, DS, ES, FS, and GS with a selector that
      points to a descriptor containing the following values, which are
      appropriate to real mode:

      þ  Limit = 64K   (FFFFH)
      þ  Byte granular (G = 0)
      þ  Expand up     (E = 0)
      þ  Writable      (W = 1)
      þ  Present       (P = 1)
      þ  Base = any value

  4.  Disable interrupts. A CLI instruction disables INTR interrupts. NMIs
      can be disabled with external circuitry.

  5.  Clear the PE bit.

  6.  Jump to the real mode code to be executed using a far JMP. This
      action flushes the instruction queue and puts appropriate values in
      the access rights of the CS register.

  7.  Use the LIDT instruction to load the base and limit of the real-mode
      interrupt vector table.

  8.  Enable interrupts.

  9.  Load the segment registers as needed by the real-mode code.


14.6  Real-Address Mode Exceptions

The 80386 reports some exceptions differently when executing in
real-address mode than when executing in protected mode. Table 14-1 details
the real-address-mode exceptions.


14.7  Differences From 8086

In general, the 80386 in real-address mode will correctly execute ROM-based
software designed for the 8086, 8088, 80186, and 80188. Following is a list
of the minor differences between 8086 execution on the 80386 and on an 8086.

  1.  Instruction clock counts.

      The 80386 takes fewer clocks for most instructions than the 8086/8088.
      The areas most likely to be affected are:

      þ  Delays required by I/O devices between I/O operations.

      þ  Assumed delays with 8086/8088 operating in parallel with an 8087.

  2.  Divide Exceptions Point to the DIV instruction.

      Divide exceptions on the 80386 always leave the saved CS:IP value
      pointing to the instruction that failed. On the 8086/8088, the CS:IP
      value points to the next instruction.

  3.  Undefined 8086/8088 opcodes.

      Opcodes that were not defined for the 8086/8088 will cause exception
      6 or will execute one of the new instructions defined for the 80386.

  4.  Value written by PUSH SP.

      The 80386 pushes a different value on the stack for PUSH SP than the
      8086/8088. The 80386 pushes the value of SP before SP is incremented
      as part of the push operation; the 8086/8088 pushes the value of SP
      after it is incremented. If the value pushed is important, replace
      PUSH SP instructions with the following three instructions:

      PUSH  BP
      MOV   BP, SP
      XCHG  BP, [BP]

      This code functions as the 8086/8088 PUSH SP instruction on the 80386.

  5.  Shift or rotate by more than 31 bits.

      The 80386 masks all shift and rotate counts to the low-order five
      bits. This MOD 32 operation limits the count to a maximum of 31 bits,
      thereby limiting the time that interrupt response is delayed while
      the instruction is executing.

  6.  Redundant prefixes.

      The 80386 sets a limit of 15 bytes on instruction length. The only
      way to violate this limit is by putting redundant prefixes before an
      instruction. Exception 13 occurs if the limit on instruction length
      is violated. The 8086/8088 has no instruction length limit.

  7.  Operand crossing offset 0 or 65,535.

      On the 8086, an attempt to access a memory operand that crosses
      offset 65,535 (e.g., MOV a word to offset 65,535) or offset 0 (e.g.,
      PUSH a word when SP = 1) causes the offset to wrap around modulo
      65,536. The 80386 raises an exception in these casesÄÄexception 13 if
      the segment is a data segment (i.e., if CS, DS, ES, FS, or GS is being
      used to address the segment), exception 12 if the segment is a stack
      segment (i.e., if SS is being used).

  8.  Sequential execution across offset 65,535.

      On the 8086, if sequential execution of instructions proceeds past
      offset 65,535, the processor fetches the next instruction byte from
      offset 0 of the same segment. On the 80386, the processor raises
      exception 13 in such a case.

  9.  LOCK is restricted to certain instructions.

      The LOCK prefix and its corresponding output signal should only be
      used to prevent other bus masters from interrupting a data movement
      operation. The 80386 always asserts the LOCK signal during an XCHG
      instruction with memory (even if the LOCK prefix is not used). LOCK
      may only be used with the following 80386 instructions when they
      update memory: BTS, BTR, BTC, XCHG, ADD, ADC, SUB, SBB, INC, DEC,
      AND, OR, XOR, NOT, and NEG. An undefined-opcode exception
      (interrupt 6) results from using LOCK before any other instruction.

 10.  Single-stepping external interrupt handlers.

      The priority of the 80386 single-step exception is different from that
      of the 8086/8088. The change prevents an external interrupt handler
      from being single-stepped if the interrupt occurs while a program is
      being single-stepped. The 80386 single-step exception has higher
      priority that any external interrupt. The 80386 will still single-step
      through an interrupt handler invoked by the INT instructions or by an
      exception.

 11.  IDIV exceptions for quotients of 80H or 8000H.

      The 80386 can generate the largest negative number as a quotient for
      the IDIV instruction. The 8086/8088 causes exception zero instead.

 12.  Flags in stack.

      The setting of the flags stored by PUSHF, by interrupts, and by
      exceptions is different from that stored by the 8086 in bit positions
      12 through 15. On the 8086 these bits are stored as ones, but in
      80386 real-address mode bit 15 is always zero, and bits 14 through 12
      reflect the last value loaded into them.

 13.  NMI interrupting NMI handlers.

      After an NMI is recognized on the 80386, the NMI interrupt is masked
      until an IRET instruction is executed.

 14.  Coprocessor errors vector to interrupt 16.

      Any 80386 system with a coprocessor must use interrupt vector 16 for
      the coprocessor error exception. If an 8086/8088 system uses another
      vector for the 8087 interrupt, both vectors should point to the
      coprocessor-error exception handler.

 15.  Numeric exception handlers should allow prefixes.

      On the 80386, the value of CS:IP saved for coprocessor exceptions
      points at any prefixes before an ESC instruction. On 8086/8088
      systems, the saved CS:IP points to the ESC instruction.

 16.  Coprocessor does not use interrupt controller.

      The coprocessor error signal to the 80386 does not pass through an
      interrupt controller (an 8087 INT signal does). Some instructions in
      a coprocessor error handler may need to be deleted if they deal with
      the interrupt controller.

 17.  Six new interrupt vectors.

      The 80386 adds six exceptions that arise only if the 8086 program has
      a hidden bug. It is recommended that exception handlers be added that
      treat these exceptions as invalid operations. This additional
      software does not significantly affect the existing 8086 software
      because the interrupts do not normally occur. These interrupt
      identifiers should not already have been used by the 8086 software,
      because they are in the range reserved by Intel. Table 14-2 describes
      the new 80386 exceptions.

 18.  One megabyte wraparound.

      The 80386 does not wrap addresses at 1 megabyte in real-address mode.
      On members of the 8086 family, it possible to specify addresses
      greater than one megabyte.  For example, with a selector value 0FFFFH
      and an offset of 0FFFFH, the effective address would be 10FFEFH (1
      Mbyte + 65519).  The 8086, which can form adresses only up to 20 bits
      long, truncates the high-order bit, thereby "wrapping" this address
      to 0FFEFH.  However, the 80386, which can form addresses up to 32
      bits long does not truncate such an address.


Table 14-1. 80386 Real-Address Mode Exceptions


Description                      Interrupt  Function that Can                   Return Address
                                 Number     Generate the Exception              Points to Faulting
                                                                                Instruction
Divide error                     0          DIV, IDIV                           YES
Debug exceptions                 1          All                                 
Some debug exceptions point to the faulting instruction, others to the
next instruction. The exception handler can determine which has occurred by
examining DR6.





Breakpoint                       3          INT                                 NO
Overflow                         4          INTO                                NO
Bounds check                     5          BOUND                               YES
Invalid opcode                   6          Any undefined opcode or LOCK        YES
                                            used with wrong instruction
Coprocessor not available        7          ESC or WAIT                         YES
Interrupt table limit too small  8          INT vector is not within IDTR       YES
                                            limit
Reserved                         9-12
Stack fault                      12         Memory operand crosses offset       YES
                                            0 or 0FFFFH
Pseudo-protection exception      13         Memory operand crosses offset       YES
                                            0FFFFH or attempt to execute
                                            past offset 0FFFFH or
                                            instruction longer than 15
                                            bytes
Reserved                         14,15
Coprocessor error                16         ESC or WAIT                         YES
Coprocessor errors are reported on the first ESC or WAIT instruction
after the ESC instruction that caused the error.





Two-byte SW interrupt            0-255      INT n                               NO


Table 14-2. New 80386 Exceptions

Interrupt   Function
Identifier

    5       A BOUND instruction was executed with a register value outside
            the limit values.

    6       An undefined opcode was encountered or LOCK was used improperly
            before an instruction to which it does not apply.

    7       The EM bit in the MSW is set when an ESC instruction was
            encountered. This exception also occurs on a WAIT instruction
            if TS is set.

    8       An exception or interrupt has vectored to an interrupt table
            entry beyond the interrupt table limit in IDTR. This can occur
            only if the LIDT instruction has changed the limit from the
            default value of 3FFH, which is enough for all 256 interrupt
            IDs.

   12       Operand crosses extremes of stack segment, e.g., MOV operation
            at offset 0FFFFH or push with SP=1 during PUSH, CALL, or INT.

   13       Operand crosses extremes of a segment other than a stack
            segment; or sequential instruction execution attempts to
            proceed beyond offset 0FFFFH; or an instruction is longer than
            15 bytes (including prefixes).


14.8  Differences From 80286 Real-Address Mode

The few differences that exist between 80386 real-address mode and 80286
real-address mode are not likely to affect any existing 80286 programs
except possibly the system initialization procedures.


14.8.1  Bus Lock

The 80286 processor implements the bus lock function differently than the
80386. Programs that use forms of memory locking specific to the 80286 may
not execute properly if transported to a specific application of the 80386.

The LOCK prefix and its corresponding output signal should only be used to
prevent other bus masters from interrupting a data movement operation.  LOCK
may only be used with the following 80386 instructions when they modify
memory.  An undefined-opcode exception results from using LOCK before any
other instruction.

  þ  Bit test and change:  BTS, BTR, BTC.
  þ  Exchange: XCHG.
  þ  One-operand arithmetic and logical: INC, DEC, NOT, and NEG.
  þ  Two-operand arithmetic and logical: ADD, ADC, SUB, SBB, AND, OR, XOR.

A locked instruction is guaranteed to lock only the area of memory defined
by the destination operand, but may lock a larger memory area.  For example,
typical 8086 and 80286 configurations lock the entire physical memory space.
With the 80386, the defined area of memory is guranteed to be locked against
access by a processor executing a locked instruction on exactly the same
memory area, i.e., an operand with identical starting address and identical
length.


14.8.2  Location of First Instruction

The starting location is 0FFFFFFF0H (sixteen bytes from end of 32-bit
address space) on the 80386 rather than 0FFFFF0H (sixteen bytes from end of
24-bit address space) as on the 80286.  Many 80286 ROM initialization
programs will work correctly in this new environment.  Others can be made to
work correctly with external hardware that redefines the signals on
A{31-20}.


14.8.3  Initial Values of General Registers

On the 80386, certain general registers may contain different values after
RESET than on the 80286. This should not cause compatibility problems,
because the content of 8086 registers after RESET is undefined.  If
self-test is requested during the reset sequence and errors are detected in
the 80386 unit, EAX will contain a nonzero value. EDX contains the component
and revision identifier. Refer to Chapter 10 for more information.


14.8.4  MSW Initialization

The 80286 initializes the MSW register to FFF0H, but the 80386 initializes
this register to 0000H. This difference should have no effect, because the
bits that are different are undefined on the 80286.  Programs that read the
value of the MSW will behave differently on the 80386 only if they depend on
the setting of the undefined, high-order bits.


Chapter 15  Virtual 8086 Mode

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

The 80386 supports execution of one or more 8086, 8088, 80186, or 80188
programs in an 80386 protected-mode environment. An 8086 program runs in
this environment as part of a V86 (virtual 8086) task. V86 tasks take
advantage of the hardware support of multitasking offered by the protected
mode. Not only can there be multiple V86 tasks, each one executing an 8086
program, but V86 tasks can be multiprogrammed with other 80386 tasks.

The purpose of a V86 task is to form a "virtual machine" with which to
execute an 8086 program. A complete virtual machine consists not only of
80386 hardware but also of systems software. Thus, the emulation of an 8086
is the result of cooperation between hardware and software:

  þ  The hardware provides a virtual set of registers (via the TSS), a
     virtual memory space (the first megabyte of the linear address space of
     the task), and directly executes all instructions that deal with these
     registers and with this address space.

  þ  The software controls the external interfaces of the virtual machine
     (I/O, interrupts, and exceptions) in a manner consistent with the
     larger environment in which it executes. In the case of I/O, software
     can choose either to emulate I/O instructions or to let the hardware
     execute them directly without software intervention.

Software that helps implement virtual 8086 machines is called a V86
monitor.


15.1  Executing 8086 Code

The processor executes in V86 mode when the VM (virtual machine) bit in the
EFLAGS register is set. The processor tests this flag under two general
conditions:

  1.  When loading segment registers to know whether to use 8086-style
      address formation.

  2.  When decoding instructions to determine which instructions are
      sensitive to IOPL.

Except for these two modifications to its normal operations, the 80386 in
V86 mode operated much as in protected mode.


15.1.1  Registers and Instructions

The register set available in V86 mode includes all the registers defined
for the 8086 plus the new registers introduced by the 80386: FS, GS, debug
registers, control registers, and test registers. New instructions that
explicitly operate on the segment registers FS and GS are available, and the
new segment-override prefixes can be used to cause instructions to utilize
FS and GS for address calculations. Instructions can utilize 32-bit
operands through the use of the operand size prefix.

8086 programs running as V86 tasks are able to take advantage of the new
applications-oriented instructions added to the architecture by the
introduction of the 80186/80188, 80286 and 80386:

  þ  New instructions introduced by 80186/80188 and 80286.
     ÄÄ PUSH immediate data
     ÄÄ Push all and pop all (PUSHA and POPA)
     ÄÄ Multiply immediate data
     ÄÄ Shift and rotate by immediate count
     ÄÄ String I/O
     ÄÄ ENTER and LEAVE
     ÄÄ BOUND

  þ  New instructions introduced by 80386.
     ÄÄ LSS, LFS, LGS instructions
     ÄÄ Long-displacement conditional jumps
     ÄÄ Single-bit instructions
     ÄÄ Bit scan
     ÄÄ Double-shift instructions
     ÄÄ Byte set on condition
     ÄÄ Move with sign/zero extension
     ÄÄ Generalized multiply


15.1.2  Linear Address Formation

In V86 mode, the 80386 processor does not interpret 8086 selectors by
referring to descriptors; instead, it forms linear addresses as an 8086
would. It shifts the selector left by four bits to form a 20-bit base
address. The effective address is extended with four high-order zeros and
added to the base address to create a linear address as Figure 15-1
illustrates.

Because of the possibility of a carry, the resulting linear address may
contain up to 21 significant bits. An 8086 program may generate linear
addresses anywhere in the range 0 to 10FFEFH (one megabyte plus
approximately 64 Kbytes) of the task's linear address space.

V86 tasks generate 32-bit linear addresses. While an 8086 program can only
utilize the low-order 21 bits of a linear address, the linear address can be
mapped via page tables to any 32-bit physical address.

Unlike the 8086 and 80286, 32-bit effective addresses can be generated (via
the address-size prefix); however, the value of a 32-bit address may not
exceed 65,535 without causing an exception. For full compatibility with
80286 real-address mode, pseudo-protection faults (interrupt 12 or 13 with
no error code) occur if an address is generated outside the range 0 through
65,535.


Figure 15-1.  V86 Mode Address Formation

                      19                                3       0
                     ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍ»
         BASE        º     16-BIT SEGMENT SELECTOR     ³ 0 0 0 0 º
                     ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍͼ

         +
                      19        15                              0
                     ÉÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
         OFFSET      º 0 0 0 0 ³    16-BIT EFFECTIVE ADDRESS     º
                     ÈÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ

         =
                    20                                          0
         LINEAR    ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
         ADDRESS   º X X X X X X X X X X X X X X X X X X X X X X º
                   ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ


15.2  Structure of a V86 Task

A V86 task consists partly of the 8086 program to be executed and partly of
80386 "native mode" code that serves as the virtual-machine monitor. The
task must be represented by an 80386 TSS (not an 80286 TSS). The processor
enters V86 mode to execute the 8086 program and returns to protected mode to
execute the monitor or other 80386 tasks.

To run successfully in V86 mode, an existing 8086 program needs the
following:

  þ  A V86 monitor.
  þ  Operating-system services.

The V86 monitor is 80386 protected-mode code that executes at
privilege-level zero. The monitor consists primarily of initialization and
exception-handling procedures. As for any other 80386 program,
executable-segment descriptors for the monitor must exist in the GDT or in
the task's LDT. The linear addresses above 10FFEFH are available for the
V86 monitor, the operating system, and other systems software. The monitor
may also need data-segment descriptors so that it can examine the interrupt
vector table or other parts of the 8086 program in the first megabyte of the
address space.

In general, there are two options for implementing the 8086 operating
system:

  1.  The 8086 operating system may run as part of the 8086 code. This
      approach is desirable for any of the following reasons:

      þ  The 8086 applications code modifies the operating system.

      þ  There is not sufficient development time to reimplement the 8086
         operating system as 80386 code.

  2.  The 8086 operating system may be implemented or emulated in the V86
      monitor. This approach is desirable for any of the following reasons:

      þ  Operating system functions can be more easily coordinated among
         several V86 tasks.

      þ  The functions of the 8086 operating system can be easily emulated
         by calls to the 80386 operating system.

Note that, regardless of the approach chosen for implementing the 8086
operating system, different V86 tasks may use different 8086 operating
systems.


15.2.1  Using Paging for V86 Tasks

Paging is not necessary for a single V86 task, but paging is useful or
necessary for any of the following reasons:

  þ  To create multiple V86 tasks. Each task must map the lower megabyte of
     linear addresses to different physical locations.

  þ  To emulate the megabyte wrap. On members of the 8086 family, it is
     possible to specify addresses larger than one megabyte. For example,
     with a selector value of 0FFFFH and an offset of 0FFFFH, the effective
     address would be 10FFEFH (one megabyte + 65519). The 8086, which can
     form addresses only up to 20 bits long, truncates the high-order bit,
     thereby "wrapping" this address to 0FFEFH. The 80386, however, which
     can form addresses up to 32 bits long does not truncate such an
     address. If any 8086 programs depend on this addressing anomaly, the
     same effect can be achieved in a V86 task by mapping linear addresses
     between 100000H and 110000H and linear addresses between 0 and 10000H
     to the same physical addresses.

  þ  To create a virtual address space larger than the physical address
     space.

  þ  To share 8086 OS code or ROM code that is common to several 8086
     programs that are executing simultaneously.

  þ  To redirect or trap references to memory-mapped I/O devices.


15.2.2  Protection within a V86 Task

Because it does not refer to descriptors while executing 8086 programs, the
processor also does not utilize the protection mechanisms offered by
descriptors. To protect the systems software that runs in a V86 task from
the 8086 program, software designers may follow either of these approaches:

  þ  Reserve the first megabyte (plus 64 kilobytes) of each task's linear
     address space for the 8086 program. An 8086 task cannot generate
     addresses outside this range.

  þ  Use the U/S bit of page-table entries to protect the virtual-machine
     monitor and other systems software in each virtual 8086 task's space.
     When the processor is in V86 mode, CPL is 3. Therefore, an 8086 program
     has only user privileges. If the pages of the virtual-machine monitor
     have supervisor privilege, they cannot be accessed by the 8086 program.


15.3  Entering and Leaving V86 Mode

Figure 15-2 summarizes the ways that the processor can enter and leave an
8086 program. The processor can enter V86 by either of two means:

  1.  A task switch to an 80386 task loads the image of EFLAGS from the new
      TSS. The TSS of the new task must be an 80386 TSS, not an 80286 TSS,
      because the 80286 TSS does not store the high-order word of EFLAGS,
      which contains the VM flag. A value of one in the VM bit of the new
      EFLAGS indicates that the new task is executing 8086 instructions;
      therefore, while loading the segment registers from the TSS, the
      processor forms base addresses as the 8086 would.

  2.  An IRET from a procedure of an 80386 task loads the image of EFLAGS
      from the stack. A value of one in VM in this case indicates that the
      procedure to which control is being returned is an 8086 procedure. The
      CPL at the time the IRET is executed must be zero, else the processor
      does not change VM.

The processor leaves V86 mode when an interrupt or exception occurs. There
are two cases:

  1.  The interrupt or exception causes a task switch. A task switch from a
      V86 task to any other task loads EFLAGS from the TSS of the new task.
      If the new TSS is an 80386 TSS and the VM bit in the EFLAGS image is
      zero or if the new TSS is an 80286 TSS, then the processor clears the
      VM bit of EFLAGS, loads the segment registers from the new TSS using
      80386-style address formation, and begins executing the instructions
      of the new task according to 80386 protected-mode semantics.

  2.  The interrupt or exception vectors to a privilege-level zero
      procedure. The processor stores the current setting of EFLAGS on the
      stack, then clears the VM bit. The interrupt or exception handler,
      therefore, executes as "native" 80386 protected-mode code. If an
      interrupt or exception vectors to a conforming segment or to a
      privilege level other than three, the processor causes a
      general-protection exception; the error code is the selector of the
      executable segment to which transfer was attempted.

Systems software does not manipulate the VM flag directly, but rather
manipulates the image of the EFLAGS register that is stored on the stack or
in the TSS. The V86 monitor sets the VM flag in the EFLAGS image on the
stack or in the TSS when first creating a V86 task. Exception and interrupt
handlers can examine the VM flag on the stack. If the interrupted procedure
was executing in V86 mode, the handler may need to invoke the V86 monitor.


Figure 15-2.  Entering and Leaving the 8086 Program

                            MODE TRANSITION DIAGRAM

                                 ÉÍÍÍÍÍÍÍÍÍÍÍ»
                  TASK SWITCH    º  INITIAL  º
                ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ   ENTRY   º
                ³   OR IRET      ÈÍÍÍÍÍÍÍÍÍÍͼ
                ³
                
        ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»    INTERRUPT, EXCEPTION      ÉÍÍÍÍÍÍÍÍÍÍÍÍÍ»
        º 8086 PROGRAM ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄĺ V86 MONITOR º
        º  (V86 MODE)  ºÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ (PROTECTED  º
        ÈÍÍÍÍÍÍÍÑÍÍÍÍÍͼ            IRET              º    MODE)    º
               ³                                     ÈÍÍÍÍÍÑÍÍÍÍÍÍͼ
              ³ ³                                           ³  
              ³ ³                                           ³  ³
              ³ ³                                           ³  ³
              ³ ³TASK SWITCH ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ» TASK SWITCH ³
              ³ ÀÄÄÄÄÄÄÄÄÄÄĺ OTHER 80386 TASKS ºÄÄÄÄÄÄÄÄÄÙ  ³
              ÀÄÄÄÄÄÄÄÄÄÄÄÄÄĶ (PROTECTED MODE)  ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
                 TASK SWITCH ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ TASK SWITCH


15.3.1  Transitions Through Task Switches

A task switch to or from a V86 task may be due to any of three causes:

  1.  An interrupt that vectors to a task gate.
  2.  An action of the scheduler of the 80386 operating system.
  3.  An IRET when the NT flag is set.

In any of these cases, the processor changes the VM bit in EFLAGS according
to the image of EFLAGS in the new TSS. If the new TSS is an 80286 TSS, the
high-order word of EFLAGS is not in the TSS; the processor clears VM in this
case. The processor updates VM prior to loading the segment registers from
the images in the new TSS. The new setting of VM determines whether the
processor interprets the new segment-register images as 8086 selectors or
80386/80286 selectors.


15.3.2  Transitions Through Trap Gates and Interrupt Gates

The processor leaves V86 mode as the result of an exception or interrupt
that vectors via a trap or interrupt gate to a privilege-level zero
procedure. The exception or interrupt handler returns to the 8086 code by
executing an IRET.

Because it was designed for execution by an 8086 processor, an 8086 program
in a V86 task will have an 8086-style interrupt table starting at linear
address zero. However, the 80386 does not use this table directly. For all
exceptions and interrupts that occur in V86 mode, the processor vectors
through the IDT. The IDT entry for an interrupt or exception that occurs in
a V86 task must contain either:

  þ  A task gate.

  þ  An 80386 trap gate (type 14) or an 80386 interrupt gate (type 15),
     which must point to a nonconforming, privilege-level zero, code
     segment.

Interrupts and exceptions that have 80386 trap or interrupt gates in the
IDT vector to the appropriate handler procedure at privilege-level zero. The
contents of all the 8086 segment registers are stored on the PL 0 stack.
Figure 15-3 shows the format of the PL 0 stack after an exception or
interrupt that occurs while a V86 task is executing an 8086 program.

After the processor stores all the 8086 segment registers on the PL 0
stack, it loads all the segment registers with zeros before starting to
execute the handler procedure. This permits the interrupt handler to safely
save and restore the DS, ES, FS, and GS registers as 80386 selectors.
Interrupt handlers that may be invoked in the context of either a regular
task or a V86 task, can use the same prolog and epilog code for register
saving regardless of the kind of task. Restoring zeros to these registers
before execution of the IRET does not cause a trap in the interrupt handler.
Interrupt procedures that expect values in the segment registers or that
return values via segment registers have to use the register images stored
on the PL 0 stack. Interrupt handlers that need to know whether the
interrupt occurred in V86 mode can examine the VM bit in the stored EFLAGS
image.

An interrupt handler passes control to the V86 monitor if the VM bit is set
in the EFLAGS image stored on the stack and the interrupt or exception is
one that the monitor needs to handle. The V86 monitor may either:

  þ  Handle the interrupt completely within the V86 monitor.
  þ  Invoke the 8086 program's interrupt handler.

Reflecting an interrupt or exception back to the 8086 code involves the
following steps:

  1.  Refer to the 8086 interrupt vector to locate the appropriate handler
      procedure.

  2.  Store the state of the 8086 program on the privilege-level three
      stack.

  3.  Change the return link on the privilege-level zero stack to point to
      the privilege-level three handler procedure.

  4.  Execute an IRET so as to pass control to the handler.

  5.  When the IRET by the privilege-level three handler again traps to the
      V86 monitor, restore the return link on the privilege-level zero stack
      to point to the originally interrupted, privilege-level three
      procedure.

  6.  Execute an IRET so as to pass control back to the interrupted
      procedure.


Figure 15-3. PL 0 Stack after Interrupt in V86 Task


                WITHOUT ERROR CODE            WITH ERROR CODE
                 31            0               31            0
                ÉÍÍÍÍÍÍËÍÍÍÍÍÍÍ»ÄÄÄÄ¿        ÉÍÍÍÍÍÍËÍÍÍÍÍÍÍ»ÄÄÄÄ¿
                º±±±±±±ºOLD GS º     ³        º±±±±±±ºOLD GS º     ³
                ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹   SS:ESP     ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹   SS:ESP
      D  O      º±±±±±±ºOLD FS º  FROM TSS    º±±±±±±ºOLD FS º  FROM TSS
      I  F      ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹              ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹
      R         º±±±±±±ºOLD DS º              º±±±±±±ºOLD DS º
      E  E      ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹              ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹
      C  X      º±±±±±±ºOLD ES º              º±±±±±±ºOLD ES º
      T  P      ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹              ÌÍÍÍÍÍÍÎÍÍÍÍÍÍ͹
      I  A      º±±±±±±ºOLD SS º              º±±±±±±ºOLD SS º
      O  N      ÌÍÍÍÍÍÍÊÍÍÍÍÍÍ͹              ÌÍÍÍÍÍÍÊÍÍÍÍÍÍ͹
      N  S      º    OLD ESP   º              º    OLD ESP   º
         I      ÌÍÍÍÍÍÍÍÍÍÍÍÍÍ͹              ÌÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
       ³ O      º  OLD EFLAGS  º              º  OLD EFLAGS  º
       ³ N      ÌÍÍÍÍÍÍËÍÍÍÍÍÍ͹              ÌÍÍÍÍÍÍËÍÍÍÍÍÍ͹
       ³        º±±±±±±ºOLD CS º   NEW        º±±±±±±ºOLD CS º
               ÌÍÍÍÍÍÍÊÍÍÍÍÍÍ͹  SS:EIP      ÌÍÍÍÍÍÍÊÍÍÍÍÍÍ͹
                º    OLD EIP   º    ³         º    OLD EIP   º   NEW
                ÌÍÍÍÍÍÍÍÍÍÍÍÍÍ͹ÄÄÄÙ         ÌÍÍÍÍÍÍÍÍÍÍÍÍÍ͹  SS:EIP
                º              º              º  ERROR CODE  º    ³
                                            ÌÍÍÍÍÍÍÍÍÍÍÍÍÍ͹ÄÄÄÙ
                                            º              º
                                                          


15.4  Additional Sensitive Instructions

When the 80386 is executing in V86 mode, the instructions PUSHF, POPF,
INT n, and IRET are sensitive to IOPL. The instructions IN, INS, OUT, and
OUTS, which are ordinarily sensitive in protected mode, are not sensitive
in V86 mode. Following is a complete list of instructions that are sensitive
in V86 mode:

   CLI     ÄÄ Clear Interrupt-Enable Flag
   STI     ÄÄ Set Interrupt-Enable Flag
   LOCK    ÄÄ Assert Bus-Lock Signal
   PUSHF   ÄÄ Push Flags
   POPF    ÄÄ Pop Flags
   INT n   ÄÄ Software Interrupt
   RET     ÄÄ Interrupt Return

CPL is always three in V86 mode; therefore, if IOPL < 3, these instructions
will trigger a general-protection exceptions. These instructions are made
sensitive so that their functions can be simulated by the V86 monitor.


15.4.1  Emulating 8086 Operating System Calls

INT n is sensitive so that the V86 monitor can intercept calls to the
8086 OS. Many 8086 operating systems are called by pushing parameters onto
the stack, then executing an INT n instruction. If IOPL < 3, INT n
instructions will be intercepted by the V86 monitor. The V86 monitor can
then emulate the function of the 8086 operating system or reflect the
interrupt back to the 8086 operating system in V86 mode.


15.4.2  Virtualizing the Interrupt-Enable Flag

When the processor is executing 8086 code in a V86 task, the instructions
PUSHF, POPF, and IRET are sensitive to IOPL so that the V86 monitor can
control changes to the interrupt-enable flag (IF). Other instructions that
affect IF (STI and CLI) are IOPL sensitive both in 8086 code and in
80386/80386 code.

Many 8086 programs that were designed to execute on single-task systems set
and clear IF to control interrupts. However, when these same programs are
executed in a multitasking environment, such control of IF can be
disruptive. If IOPL is less than three, all instructions that change or
interrogate IF will trap to the V86 monitor. The V86 monitor can then
control IF in a manner that both suits the needs of the larger environment
and is transparent to the 8086 program.


15.5  Virtual I/O

Many 8086 programs that were designed to execute on single-task systems use
I/O devices directly. However, when these same programs are executed in a
multitasking environment, such use of devices can be disruptive. The 80386
provides sufficient flexibility to control I/O in a manner that both suits
the needs of the new environment and is transparent to the 8086 program.
Designers may take any of several possible approaches to controlling I/O:

  þ  Implement or emulate the 8086 operating system as an 80386 program and
     require the 8086 application to do I/O via software interrupts to the
     operating system, trapping all attempts to do I/O directly.

  þ  Let the 8086 program take complete control of all I/O.

  þ  Selectively trap and emulate references that a task makes to specific
     I/O ports.

  þ  Trap or redirect references to memory-mapped I/O addresses.

The method of controlling I/O depends upon whether I/O ports are I/O mapped
or memory mapped.


15.5.1  I/O-Mapped I/O

I/O-mapped I/O in V86 mode differs from protected mode only in that the
protection mechanism does not consult IOPL when executing the I/O
instructions IN, INS, OUT, OUTS. Only the I/O permission bit map controls
the right for V86 tasks to execute these I/O instructions.

The I/O permission map traps I/O instructions selectively depending on the
I/O addresses to which they refer. The I/O permission bit map of each V86
task determines which I/O addresses are trapped for that task. Because each
task may have a different I/O permission bit map, the addresses trapped for
one task may be different from those trapped for others. Refer to Chapter 8
for more information about the I/O permission map.


15.5.2  Memory-Mapped I/O

In hardware designs that utilize memory-mapped I/O, the paging facilities
of the 80386 can be used to trap or redirect I/O operations. Each task that
executes memory-mapped I/O must have a page (or pages) for the memory-mapped
address space. The V86 monitor may control memory-mapped I/O by any of
these means:

  þ  Assign the memory-mapped page to appropriate physical addresses.
     Different tasks may have different physical addresses, thereby
     preventing the tasks from interfering with each other.

  þ  Cause a trap to the monitor by forcing a page fault on the
     memory-mapped page. Read-only pages trap writes. Not-present pages trap
     both reads and writes.

Intervention for every I/O might be excessive for some kinds of I/O
devices. A page fault can still be used in this case to cause intervention
on the first I/O operation. The monitor can then at least make sure that the
task has exclusive access to the device. Then the monitor can change the
page status to present and read/write, allowing subsequent I/O to proceed at
full speed.


15.5.3  Special I/O Buffers

Buffers of intelligent controllers (for example, a bit-mapped graphics
buffer) can also be virtualized via page mapping. The linear space for the
buffer can be mapped to a different physical space for each virtual 8086
task. The V86 monitor can then assume responsibility for spooling the data
or assigning the virtual buffer to the real buffer at appropriate times.


15.6  Differences From 8086

In general, V86 mode will correctly execute software designed for the 8086,
8088, 80186, and 80188. Following is a list of the minor differences between
8086 execution on the 80386 and on an 8086.

  1.  Instruction clock counts.

      The 80386 takes fewer clocks for most instructions than the 
      8086/8088. The areas most likely to be affected are:

      þ  Delays required by I/O devices between I/O operations.

      þ  Assumed delays with 8086/8088 operating in parallel with an 8087.

  2.  Divide exceptions point to the DIV instruction.

      Divide exceptions on the 80386 always leave the saved CS:IP value
      pointing to the instruction that failed. On the 8086/8088, the CS:IP
      value points to the next instruction.

  3.  Undefined 8086/8088 opcodes.

      Opcodes that were not defined for the 8086/8088 will cause exception
      6 or will execute one of the new instructions defined for the 80386.

  4.  Value written by PUSH SP.

      The 80386 pushes a different value on the stack for PUSH SP than the
      8086/8088. The 80386 pushes the value of SP before SP is incremented
      as part of the push operation; the 8086/8088 pushes the value of SP
      after it is incremented. If the value pushed is important, replace
      PUSH SP instructions with the following three instructions:

      PUSH  BP
      MOV   BP, SP
      XCHG  BP, [BP]

      This code functions as the 8086/8088 PUSH SP instruction on the 
      80386.

  5.  Shift or rotate by more than 31 bits.

      The 80386 masks all shift and rotate counts to the low-order five
      bits. This MOD 32 operation limits the count to a maximum of 31 bits,
      thereby limiting the time that interrupt response is delayed while
      the instruction is executing.

  6.  Redundant prefixes.

      The 80386 sets a limit of 15 bytes on instruction length. The only
      way to violate this limit is by putting redundant prefixes before an
      instruction. Exception 13 occurs if the limit on instruction length
      is violated. The 8086/8088 has no instruction length limit.

  7.  Operand crossing offset 0 or 65,535.

      On the 8086, an attempt to access a memory operand that crosses
      offset 65,535 (e.g., MOV a word to offset 65,535) or offset 0 (e.g.,
      PUSH a word when SP = 1) causes the offset to wrap around modulo
      65,536. The 80386 raises an exception in these casesÄÄexception 13 if
      the segment is a data segment (i.e., if CS, DS, ES, FS, or GS is
      being used to address the segment), exception 12 if the segment is a
      stack segment (i.e., if SS is being used).

  8.  Sequential execution across offset 65,535.

      On the 8086, if sequential execution of instructions proceeds past
      offset 65,535, the processor fetches the next instruction byte from
      offset 0 of the same segment. On the 80386, the processor raises
      exception 13 in such a case.

  9.  LOCK is restricted to certain instructions.

      The LOCK prefix and its corresponding output signal should only be
      used to prevent other bus masters from interrupting a data movement
      operation. The 80386 always asserts the LOCK signal during an XCHG
      instruction with memory (even if the LOCK prefix is not used). LOCK
      may only be used with the following 80386 instructions when they
      update memory: BTS, BTR, BTC, XCHG, ADD, ADC, SUB, SBB, INC, DEC,
      AND, OR, XOR, NOT, and NEG. An undefined-opcode exception (interrupt
      6) results from using LOCK before any other instruction.

 10.  Single-stepping external interrupt handlers.

      The priority of the 80386 single-step exception is different from
      that of the 8086/8088. The change prevents an external interrupt
      handler from being single-stepped if the interrupt occurs while a
      program is being single-stepped. The 80386 single-step exception has
      higher priority that any external interrupt. The 80386 will still
      single-step through an interrupt handler invoked by the INT
      instructions or by an exception.

  11.  IDIV exceptions for quotients of 80H or 8000H.

      The 80386 can generate the largest negative number as a quotient for
      the IDIV instruction. The 8086/8088 causes exception zero instead.

 12.  Flags in stack.

      The setting of the flags stored by PUSHF, by interrupts, and by
      exceptions is different from that stored by the 8086 in bit positions
      12 through 15. On the 8086 these bits are stored as ones, but in V86
      mode bit 15 is always zero, and bits 14 through 12 reflect the last
      value loaded into them.

 13.  NMI interrupting NMI handlers.

      After an NMI is recognized on the 80386, the NMI interrupt is masked
      until an IRET instruction is executed.

 14.  Coprocessor errors vector to interrupt 16.

      Any 80386 system with a coprocessor must use interrupt vector 16 for
      the coprocessor error exception. If an 8086/8088 system uses another
      vector for the 8087 interrupt, both vectors should point to the
      coprocessor-error exception handler.

 15.  Numeric exception handlers should allow prefixes.

      On the 80386, the value of CS:IP saved for coprocessor exceptions
      points at any prefixes before an ESC instruction. On 8086/8088
      systems, the saved CS:IP points to the ESC instruction itself.

 16.  Coprocessor does not use interrupt controller.

      The coprocessor error signal to the 80386 does not pass through an
      interrupt controller (an 8087 INT signal does). Some instructions in
      a coprocessor error handler may need to be deleted if they deal with
      the interrupt controller.


15.7  Differences From 80286 Real-Address Mode

The 80286 processor implements the bus lock function differently than the
80386. This fact may or may not be apparent to 8086 programs, depending on
how the V86 monitor handles the LOCK prefix. LOCKed instructions are
sensitive to IOPL; therefore, software designers can choose to emulate its
function. If, however, 8086 programs are allowed to execute LOCK directly,
programs that use forms of memory locking specific to the 8086 may not
execute properly when transported to a specific application of the 80386.

The LOCK prefix and its corresponding output signal should only be used to
prevent other bus masters from interrupting a data movement operation. LOCK
may only be used with the following 80386 instructions when they modify
memory. An undefined-opcode exception results from using LOCK before any
other instruction.

  þ  Bit test and change: BTS, BTR, BTC.
  þ  Exchange: XCHG.
  þ  One-operand arithmetic and logical: INC, DEC, NOT, and NEG.
  þ  Two-operand arithmetic and logical: ADD, ADC, SUB, SBB, AND, OR, XOR.

A locked instruction is guaranteed to lock only the area of memory defined
by the destination operand, but may lock a larger memory area. For example,
typical 8086 and 80286 configurations lock the entire physical memory space.
With the 80386, the defined area of memory is guaranteed to be locked
against access by a processor executing a locked instruction on exactly the
same memory area, i.e., an operand with identical starting address and
identical length.


Chapter 16  Mixing 16-Bit and 32 Bit Code

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

The 80386 running in protected mode is a 32-bit microprocessor, but it is
designed to support 16-bit processing at three levels:

  1.  Executing 8086/80286 16-bit programs efficiently with complete 
      compatibility.

  2.  Mixing 16-bit modules with 32-bit modules.

  3.  Mixing 16-bit and 32-bit addresses and operands within one module.

The first level of support for 16-bit programs has already been discussed
in Chapter 13, Chapter 14, and Chapter 15. This chapter shows how 16-bit
and 32-bit modules can cooperate with one another, and how one module can
utilize both 16-bit and 32-bit operands and addressing.

The 80386 functions most efficiently when it is possible to distinguish
between pure 16-bit modules and pure 32-bit modules. A pure 16-bit module
has these characteristics:

  þ  All segments occupy 64 Kilobytes or less.
  þ  Data items are either 8 bits or 16 bits wide.
  þ  Pointers to code and data have 16-bit offsets.
  þ  Control is transferred only among 16-bit segments.

A pure 32-bit module has these characteristics:

  þ  Segments may occupy more than 64 Kilobytes (zero bytes to 4 
     gigabytes).

  þ  Data items are either 8 bits or 32 bits wide.

  þ  Pointers to code and data have 32-bit offsets.

  þ  Control is transferred only among 32-bit segments.

Pure 16-bit modules do exist; they are the modules designed for 16-bit
microprocessors. Pure 32-bit modules may exist in new programs designed
explicitly for the 80386. However, as systems designers move applications
from 16-bit processors to the 32-bit 80386, it will not always be possible
to maintain these ideals of pure 16-bit or 32-bit modules. It may be
expedient to execute old 16-bit modules in a new 32-bit environment without
making source-code changes to the old modules if any of the following
conditions is true:

  þ  Modules will be converted one-by-one from 16-bit environments to
     32-bit environments.

  þ  Older, 16-bit compilers and software-development tools will be
     utilized in the new32-bit operating environment until new 32-bit
     versions can be created.

  þ  The source code of 16-bit modules is not available for modification.

  þ  The specific data structures used by a given module inherently utilize
     16-bit words.

  þ  The native word size of the source language is 16 bits.

On the 80386, 16-bit modules can be mixed with 32-bit modules. To design a
system that mixes 16- and 32-bit code requires an understanding of the
mechanisms that the 80386 uses to invoke and control its 32-bit and 16-bit
features.


16.1  How the 80386 Implements 16-Bit and 32-Bit Features

The features of the architecture that permit the 80386 to work equally well
with 32-bit and 16-bit address and operand sizes include:

  þ  The D-bit (default bit) of code-segment descriptors, which determines
     the default choice of operand-size and address-size for the
     instructions of a code segment. (In real-address mode and V86 mode,
     which do not use descriptors, the default is 16 bits.) A code segment
     whose D-bit is set is known as a USE32 segment; a code segment whose
     D-bit is zero is a USE16 segment. The D-bit eliminates the need to
     encode the operand size and address size in instructions when all
     instructions use operands and effective addresses of the same size.

  þ  Instruction prefixes that explicitly override the default choice of
     operand size and address size (available in protected mode as well as
     in real-address mode and V86 mode).

  þ  Separate 32-bit and 16-bit gates for intersegment control transfers
     (including call gates, interrupt gates, and trap gates). The operand
     size for the control transfer is determined by the type of gate, not by
     the D-bit or prefix of the transfer instruction.

  þ  Registers that can be used both for 32-bit and 16-bit operands and
     effective-address calculations.

  þ  The B-bit (big bit) of data-segment descriptors, which determines the
     size of stack pointer (32-bit ESP or 16-bit SP) used by the CPU for
     implicit stack references.


16.2  Mixing 32-Bit and 16-Bit Operations

The 80386 has two instruction prefixes that allow mixing of 32-bit and
16-bit operations within one segment:

  þ  The operand-size prefix (66H)
  þ  The address-size prefix (67H)

These prefixes reverse the default size selected by the D-bit. For example,
the processor can interpret the word-move instruction MOV mem, reg in any of
four ways:

  þ  In a USE32 segment:

     1.  Normally moves 32 bits from a 32-bit register to a 32-bit
         effective address in memory.

     2.  If preceded by an operand-size prefix, moves 16 bits from a 16-bit
         register to 32-bit effective address in memory.

     3.  If preceded by an address-size prefix, moves 32 bits from a 32-bit
         register to a16-bit effective address in memory.

     4.  If preceded by both an address-size prefix and an operand-size
         prefix, moves 16 bits from a 16-bit register to a 16-bit effective
         address in memory.

  þ  In a USE16 segment:

     1.  Normally moves 16 bits from a 16-bit register to a 16-bit
         effective address in memory.

     2.  If preceded by an operand-size prefix, moves 32 bits from a 32-bit
         register to 16-bit effective address in memory.

     3.  If preceded by an address-size prefix, moves 16 bits from a 16-bit
         register to a32-bit effective address in memory.

     4.  If preceded by both an address-size prefix and an operand-size
         prefix, moves 32 bits from a 32-bit register to a 32-bit effective
         address in memory.

These examples illustrate that any instruction can generate any combination
of operand size and address size regardless of whether the instruction is in
a USE16 or USE32 segment. The choice of the USE16 or USE32 attribute for a
code segment is based upon these criteria:

  1.  The need to address instructions or data in segments that are larger
      than 64 Kilobytes.

  2.  The predominant size of operands.

  3.  The addressing modes desired. (Refer to Chapter 17 for an explanation
      of the additional addressing modes that are available when 32-bit
      addressing is used.)

Choosing a setting of the D-bit that is contrary to the predominant size of
operands requires the generation of an excessive number of operand-size
prefixes.


16.3  Sharing Data Segments Among Mixed Code Segments

Because the choice of operand size and address size is defined in code
segments and their descriptors, data segments can be shared freely among
both USE16 and USE32 code segments. The only limitation is the one imposed
by pointers with 16-bit offsets, which can only point to the first 64
Kilobytes of a segment. When a data segment that contains more than 64
Kilobytes is to be shared among USE32 and USE16 segments, the data that is
to be accessed by the USE16 segments must be located within the first 64
Kilobytes.

A stack that spans addresses less than 64K can be shared by both USE16 and
USE32 code segments. This class of stacks includes:

  þ  Stacks in expand-up segments with G=0 and B=0.

  þ  Stacks in expand-down segments with G=0 and B=0.

  þ  Stacks in expand-up segments with G=1 and B=0, in which the stack is
     contained completely within the lower 64 Kilobytes. (Offsets greater
     than 64K can be used for data, other than the stack, that is not
     shared.)

The B-bit of a stack segment cannot, in general, be used to change the size
of stack used by a USE16 code segment. The size of stack pointer used by the
processor for implicit stack references is controlled by the B-bit of the
data-segment descriptor for the stack. Implicit references are those caused
by interrupts, exceptions, and instructions such as PUSH, POP, CALL, and
RET. One might be tempted, therefore, to try to increase beyond 64K the
size of the stack used by 16-bit code simply by supplying a larger stack
segment with the B-bit set. However, the B-bit does not control explicit
stack references, such as accesses to parameters or local variables. A USE16
code segment can utilize a "big" stack only if the code is modified so that
all explicit references to the stack are preceded by the address-size
prefix, causing those references to use 32-bit addressing.

In big, expand-down segments (B=1, G=1, and E=1), all offsets are greater
than 64K, therefore USE16 code cannot utilize such a stack segment unless
the code segment is modified to employ 32-bit addressing. (Refer to Chapter
6 for a review of the B, G, and E bits.)


16.4  Transferring Control Among Mixed Code Segments

When transferring control among procedures in USE16 and USE32 code
segments, programmers must be aware of three points:

  þ  Addressing limitations imposed by pointers with 16-bit offsets.

  þ  Matching of operand-size attribute in effect for the CALL/RET pair and
     theInterrupt/IRET pair so as to manage the stack correctly.

  þ  Translation of parameters, especially pointer parameters.

Clearly, 16-bit effective addresses cannot be used to address data or code
located beyond 64K in a 32-bit segment, nor can large 32-bit parameters be
squeezed into a 16-bit word; however, except for these obvious limits, most
interfacing problems between 16-bit and 32-bit modules can be solved. Some
solutions involve inserting interface procedures between the procedures in
question.


16.4.1  Size of Code-Segment Pointer

For control-transfer instructions that use a pointer to identify the next
instruction (i.e., those that do not use gates), the size of the offset
portion of the pointer is determined by the operand-size attribute. The
implications of the use of two different sizes of code-segment pointer are:

  þ  JMP, CALL, or RET from 32-bit segment to 16-bit segment is always
     possible using a 32-bit operand size.

  þ  JMP, CALL, or RET from 16-bit segment using a 16-bit operand size
     cannot address the target in a 32-bit segment if the address of the
     target is greater than 64K.

An interface procedure can enable transfers from USE16 segments to 32-bit
addresses beyond 64K without requiring modifications any more extensive than
relinking or rebinding the old programs. The requirements for such an
interface procedure are discussed later in this chapter.


16.4.2  Stack Management for Control Transfers

Because stack management is different for 16-bit CALL/RET than for 32-bit
CALL/RET, the operand size of RET must match that of CALL. (Refer to Figure
16-1.) A 16-bit CALL pushes the 16-bit IP and (for calls between privilege
levels) the 16-bit SP register. The corresponding RET must also use a 16-bit
operand size to POP these 16-bit values from the stack into the 16-bit
registers. A 32-bit CALL pushes the 32-bit EIP and (for interlevel calls)
the 32-bit ESP register. The corresponding RET must also use a 32-bit
operand size to POP these 32-bit values from the stack into the 32-bit
registers. If the two halves of a CALL/RET pair do not have matching operand
sizes, the stack will not be managed correctly and the values of the
instruction pointer and stack pointer will not be restored to correct
values.

When the CALL and its corresponding RET are in segments that have D-bits
with the same values (i.e., both have 32-bit defaults or both have 16-bit
defaults), there is no problem. When the CALL and its corresponding RET are
in segments that have different D-bit values, however, programmers (or
program development software) must ensure that the CALL and RET match.

There are three ways to cause a 16-bit procedure to execute a 32-bit call:

  1.  Use a 16-bit call to a 32-bit interface procedure that then uses a
      32-bit call to invoke the intended target.

  2.  Bind the 16-bit call to a 32-bit call gate.

  3.  Modify the 16-bit procedure, inserting an operand-size prefix before
      the call, thereby changing it to a 32-bit call.

Likewise, there are three ways to cause a 32-bit procedure to execute a
16-bit call:

  1.  Use a 32-bit call to a 32-bit interface procedure that then uses a
      16-bit call to invoke the intended target.

  2.  Bind the 32-bit call to a 16-bit call gate.

  3.  Modify the 32-bit procedure, inserting an operand-size prefix before
      the call, thereby changing it to a 16-bit call. (Be certain that the
      return offset does not exceed 64K.)

Programmers can utilize any of the preceding methods to make a CALL in a
USE16 segment match the corresponding RET in a USE32 segment, or to make a
CALL in a USE32 segment match the corresponding RET in a USE16 segment.


Figure 16-1.  Stack after Far 16-Bit and 32-Bit Calls

                           WITHOUT PRIVILEGE TRANSITION

               AFTER 16-BIT CALL                AFTER 32-BIT CALL

               31             0               31             0
       D  O    º               º                º               º
       I  F    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
       R       º±±±±±±±±±±±±±±±º                º±±±±±±±±±±±±±±±º
       E  E    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
       C  X    º PARM2 ³ PARM1 º                º     PARM2     º
       T  P    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
       I  A    º  CS   ³  IP   ºÄÄSP           º     PARM1     º
       O  N    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
       N  S    º               º                º±±±±±±±³  CS   º
          I    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
        ³ O    º               º                º      EIP      ºÄÄESP
        ³ N    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
        ³      º               º                º               º
                                                            

                           WITH PRIVILEGE TRANSITION

               AFTER 16-BIT CALL                AFTER 32-BIT CALL

       D  O     31            0                  31            0
       I  F    ÉÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ»                ÉÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ»
       R       º   SS  ³  SP   º                º±±±±±±±³  SS   º
       E  E    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
       C  X    º PARM2 ³ PARM1 º                º      ESP      º
       T  P    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
       I  A    º  CS   ³  IP   ºÄÄSP           º     PARM2     º
       O  N    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
       N  S    º               º                º     PARM1     º
          I    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
        ³ O    º               º                º±±±±±±±³  CS   º
        ³ N    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
        ³      º               º                º      EIP      ºÄÄESP
              ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹                ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍ͹
               º               º                º               º
                                                             


16.4.2.1  Controlling the Operand-Size for a Call

When the selector of the pointer referenced by a CALL instruction selects a
segment descriptor, the operand-size attribute in effect for the CALL
instruction is determined by the D-bit in the segment descriptor and by any
operand-size instruction prefix.

When the selector of the pointer referenced by a CALL instruction selects a
gate descriptor, the type of call is determined by the type of call gate. A
call via an 80286 call gate (descriptor type 4)  always has a 16-bit
operand-size attribute; a call via an 80386 call gate (descriptor type 12)
always has a 32-bit operand-size attribute. The offset of the target
procedure is taken from the gate descriptor; therefore, even a 16-bit
procedure can call a procedure that is located more than 64 kilobytes from
the base of a 32-bit segment, because a 32-bit call gate contains a 32-bit
target offset.

An unmodified 16-bit code segment that has run successfully on an 8086 or
real-mode 80286 will always have a D-bit of zero and will not use
operand-size override prefixes; therefore, it will always execute 16-bit
versions of CALL. The only modification needed to make a16-bit procedure
effect a 32-bit call is to relink the call to an 80386 call gate.


16.4.2.2  Changing Size of Call

When adding 32-bit gates to 16-bit procedures, it is important to consider
the number of parameters. The count field of the gate descriptor specifies
the size of the parameter string to copy from the current stack to the stack
of the more privileged procedure. The count field of a 16-bit gate specifies
the number of words to be copied, whereas the count field of a 32-bit gate
specifies the number of doublewords to be copied; therefore, the 16-bit
procedure must use an even number of words as parameters.


16.4.3  Interrupt Control Transfers

With a control transfer due to an interrupt or exception, a gate is always
involved. The operand-size attribute for the interrupt is determined by the
type of IDT gate.

A 386 interrupt or trap gate (descriptor type 14 or 15) to a 32-bit
interrupt procedure can be used to interrupt either 32-bit or 16-bit
procedures. However, it is not generally feasible to permit an interrupt or
exception to invoke a 16-bit handler procedure when 32-bit code is
executing, because a 16-bit interrupt procedure has a return offset of only
16-bits on its stack. If the 32-bit procedure is executing at an address
greater than 64K, the 16-bit interrupt procedure cannot return correctly.


16.4.4  Parameter Translation

When segment offsets or pointers (which contain segment offsets) are passed
as parameters between 16-bit and 32-bit procedures, some translation is
required. Clearly, if a 32-bit procedure passes a pointer to data located
beyond 64K to a 16-bit procedure, the 16-bit procedure cannot utilize it.
Beyond this natural limitation, an interface procedure can perform any
format conversion between 32-bit and 16-bit pointers that may be needed.

Parameters passed by value between 32-bit and 16-bit code may also require
translation between 32-bit and 16-bit formats. Such translation requirements
are application dependent. Systems designers should take care to limit the
range of values passed so that such translations are possible.


16.4.5  The Interface Procedure

Interposing an interface procedure between 32-bit and 16-bit procedures can
be the solution to any of several interface requirements:

  þ  Allowing procedures in 16-bit segments to transfer control to
     instructions located beyond 64K in 32-bit segments.

  þ  Matching of operand size for CALL/RET.

  þ  Parameter translation.

Interface procedures between USE32 and USE16 segments can be constructed
with these properties:

  þ  The procedures reside in a code segment whose D-bit is set, indicating
     a default operand size of 32-bits.

  þ  All entry points that may be called by 16-bit procedures have offsets
     that are actually less than 64K.

  þ  All points to which called 16-bit procedures may return also lie
     within 64K.

The interface procedures do little more than call corresponding procedures
in other segments. There may be two kinds of procedures:

  þ  Those that are called by 16-bit procedures and call 32-bit procedures.
     These interface procedures are called by 16-bit CALLs and use the
     operand-size prefix before RET instructions to cause a 16-bit RET.
     CALLs to 32-bit segments are 32-bit calls (by default, because the
     D-bit is set), and the 32-bit code returns with 32-bit RET
     instructions.

  þ  Those that are called by 32-bit procedures and call 16-bit procedures.
     These interface procedures are called by 32-bit CALL instructions, and
     return with 32-bit RET instructions (by default, because the D-bit is
     set).  CALLs to 16-bit procedures use the operand-size prefix;
     procedures in the 16-bit code return with 16-bit RET instructions.


                         PART IV  INSTRUCTION SET                          


Chapter 17  80386 Instruction Set

ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

This chapter presents instructions for the 80386 in alphabetical order. For
each instruction, the forms are given for each operand combination,
including object code produced, operands required, execution time, and a
description. For each instruction, there is an operational description and a
summary of exceptions generated.


17.1  Operand-Size and Address-Size Attributes

When executing an instruction, the 80386 can address memory using either 16
or 32-bit addresses. Consequently, each instruction that uses memory
addresses has associated with it an address-size attribute of either 16 or
32 bits. 16-bit addresses imply both the use of a 16-bit displacement in
the instruction and the generation of a 16-bit address offset (segment
relative address) as the result of the effective address calculation.
32-bit addresses imply the use of a 32-bit displacement and the generation
of a 32-bit address offset. Similarly, an instruction that accesses words
(16 bits) or doublewords (32 bits) has an operand-size attribute of either
16 or 32 bits.

The attributes are determined by a combination of defaults, instruction
prefixes, and (for programs executing in protected mode) size-specification
bits in segment descriptors.


17.1.1  Default Segment Attribute

For programs executed in protected mode, the D-bit in executable-segment
descriptors determines the default attribute for both address size and
operand size. These default attributes apply to the execution of all
instructions in the segment. A value of zero in the D-bit sets the default
address size and operand size to 16 bits; a value of one, to 32 bits.

Programs that execute in real mode or virtual-8086 mode have 16-bit
addresses and operands by default.


17.1.2  Operand-Size and Address-Size Instruction Prefixes

The internal encoding of an instruction can include two byte-long prefixes:
the address-size prefix, 67H, and the operand-size prefix, 66H. (A later
section, "Instruction Format," shows the position of the prefixes in an
instruction's encoding.) These prefixes override the default segment
attributes for the instruction that follows. Table 17-1 shows the effect of
each possible combination of defaults and overrides.


17.1.3  Address-Size Attribute for Stack

Instructions that use the stack implicitly (for example: POP EAX also have
a stack address-size attribute of either 16 or 32 bits. Instructions with a
stack address-size attribute of 16 use the 16-bit SP stack pointer register;
instructions with a stack address-size attribute of 32 bits use the 32-bit
ESP register to form the address of the top of the stack.

The stack address-size attribute is controlled by the B-bit of the
data-segment descriptor in the SS register. A value of zero in the B-bit
selects a stack address-size attribute of 16; a value of one selects a stack
address-size attribute of 32.


Table 17-1. Effective Size Attributes

Segment Default D = ...      0    0    0    0    1    1    1    1
Operand-Size Prefix 66H      N    N    Y    Y    N    N    Y    Y
Address-Size Prefix 67H      N    Y    N    Y    N    Y    N    Y

Effective Operand Size      16   16   32   32   32   32   16   16
Effective Address Size      16   32   16   32   32   16   32   16

Y = Yes, this instruction prefix is present
N = No, this instruction prefix is not present


17.2  Instruction Format

All instruction encodings are subsets of the general instruction format
shown in Figure 17-1. Instructions consist of optional instruction
prefixes, one or two primary opcode bytes, possibly an address specifier
consisting of the ModR/M byte and the SIB (Scale Index Base) byte, a
displacement, if required, and an immediate data field, if required.

Smaller encoding fields can be defined within the primary opcode or
opcodes. These fields define the direction of the operation, the size of the
displacements, the register encoding, or sign extension; encoding fields
vary depending on the class of operation.

Most instructions that can refer to an operand in memory have an addressing
form byte following the primary opcode byte(s). This byte, called the ModR/M
byte, specifies the address form to be used. Certain encodings of the ModR/M
byte indicate a second addressing byte, the SIB (Scale Index Base) byte,
which follows the ModR/M byte and is required to fully specify the
addressing form.

Addressing forms can include a displacement immediately following either
the ModR/M or SIB byte. If a displacement is present, it can be 8-, 16- or
32-bits.

If the instruction specifies an immediate operand, the immediate operand
always follows any displacement bytes. The immediate operand, if specified,
is always the last field of the instruction.

The following are the allowable instruction prefix codes:

   F3H    REP prefix (used only with string instructions)
   F3H    REPE/REPZ prefix (used only with string instructions
   F2H    REPNE/REPNZ prefix (used only with string instructions)
   F0H    LOCK prefix

The following are the segment override prefixes:

   2EH    CS segment override prefix
   36H    SS segment override prefix
   3EH    DS segment override prefix
   26H    ES segment override prefix
   64H    FS segment override prefix
   65H    GS segment override prefix
   66H    Operand-size override
   67H    Address-size override


Figure 17-1.  80386 Instruction Format

      ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
      º  INSTRUCTION  º   ADDRESS-    º    OPERAND-   º   SEGMENT     º
      º    PREFIX     º  SIZE PREFIX  º  SIZE PREFIX  º   OVERRIDE    º
      ÌÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍ͹
      º     0 OR 1         0 OR 1           0 OR 1         0 OR 1     º
      ÇÄ Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä¶
      º                        NUMBER OF BYTES                        º
      ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ

      ÉÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍ»
      º  OPCODE  º  MODR/M   º  SIB  º   DISPLACEMENT   º  IMMEDIATE  º
      º          º           º       º                  º             º
      ÌÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍ͹
      º  1 OR 2     0 OR 1    0 OR 1      0,1,2 OR 4       0,1,2 OR 4 º
      ÇÄ Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä Ä¶
      º                        NUMBER OF BYTES                        º
      ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ


17.2.1  ModR/M and SIB Bytes

The ModR/M and SIB bytes follow the opcode byte(s) in many of the 80386
instructions. They contain the following information:

  þ  The indexing type or register number to be used in the instruction
  þ  The register to be used, or more information to select the instruction
  þ  The base, index, and scale information

The ModR/M byte contains three fields of information:

  þ  The mod field, which occupies the two most significant bits of the 
     byte, combines with the r/m field to form 32 possible values: eight
     registers and 24 indexing modes

  þ  The reg field, which occupies the next three bits following the mod
     field, specifies either a register number or three more bits of opcode
     information. The meaning of the reg field is determined by the first
     (opcode) byte of the instruction.

  þ  The r/m field, which occupies the three least significant bits of the
     byte, can specify a register as the location of an operand, or can form
     part of the addressing-mode encoding in combination with the field as
     described above

The based indexed and scaled indexed forms of 32-bit addressing require the
SIB byte. The presence of the SIB byte is indicated by certain encodings of
the ModR/M byte. The SIB byte then includes the following fields:

  þ  The ss field, which occupies the two most significant bits of the
     byte, specifies the scale factor

  þ  The index field, which occupies the next three bits following the ss
     field and specifies the register number of the index register

  þ  The base field, which occupies the three least significant bits of the
     byte, specifies the register number of the base register

Figure 17-2 shows the formats of the ModR/M and SIB bytes.

The values and the corresponding addressing forms of the ModR/M and SIB
bytes are shown in Tables 17-2, 17-3, and 17-4. The 16-bit addressing
forms specified by the ModR/M byte are in Table 17-2. The 32-bit addressing
forms specified by ModR/M are in Table 17-3. Table 17-4 shows the 32-bit
addressing forms specified by the SIB byte


Figure 17-2.  ModR/M and SIB Byte Formats

                                 MODR/M BYTE

                     7    6    5    4    3    2    1    0
                    ÉÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍ»
                    º  MOD   º REG/OPCODE  º     R/M     º
                    ÈÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍͼ

                          SIB (SCALE INDEX BASE) BYTE

                     7    6    5    4    3    2    1    0
                    ÉÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍËÍÍÍÍÍÍÍÍÍÍÍÍÍ»
                    º   SS   º    INDEX    º    BASE     º
                    ÈÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÍÍÍÍÍÍÍÍÍÍÍͼ


Table 17-2. 16-Bit Addressing Forms with the ModR/M Byte


r8(/r)                     AL    CL    DL    BL    AH    CH    DH    BH
r16(/r)                    AX    CX    DX    BX    SP    BP    SI    DI
r32(/r)                    EAX   ECX   EDX   EBX   ESP   EBP   ESI   EDI
/digit (Opcode)            0     1     2     3     4     5     6     7
REG =                      000   001   010   011   100   101   110   111

   Effective
ÚÄÄÄAddress
disp8 denotes an 8-bit displacement following the ModR/M byte, to be
sign-extended and added to the index. disp16 denotes a 16-bit displacement
following the ModR/M byte, to be added to the index. Default segment
register is SS for the effective addresses containing a BP index, DS for
other effective addresses.ÄÄ¿ ÚMod R/M¿ ÚÄÄÄÄÄÄÄÄModR/M Values in HexadecimalÄÄÄÄÄÄÄÄ¿

[BX + SI]            000   00    08    10    18    20    28    30    38
[BX + DI]            001   01    09    11    19    21    29    31    39
[BP + SI]            010   02    0A    12    1A    22    2A    32    3A
[BP + DI]            011   03    0B    13    1B    23    2B    33    3B
[SI]             00  100   04    0C    14    1C    24    2C    34    3C
[DI]                 101   05    0D    15    1D    25    2D    35    3D
disp16               110   06    0E    16    1E    26    2E    36    3E
[BX]                 111   07    0F    17    1F    27    2F    37    3F

[BX+SI]+disp8        000   40    48    50    58    60    68    70    78
[BX+DI]+disp8        001   41    49    51    59    61    69    71    79
[BP+SI]+disp8        010   42    4A    52    5A    62    6A    72    7A
[BP+DI]+disp8        011   43    4B    53    5B    63    6B    73    7B
[SI]+disp8       01  100   44    4C    54    5C    64    6C    74    7C
[DI]+disp8           101   45    4D    55    5D    65    6D    75    7D
[BP]+disp8           110   46    4E    56    5E    66    6E    76    7E
[BX]+disp8           111   47    4F    57    5F    67    6F    77    7F

[BX+SI]+disp16       000   80    88    90    98    A0    A8    B0    B8
[BX+DI]+disp16       001   81    89    91    99    A1    A9    B1    B9
[BX+SI]+disp16       010   82    8A    92    9A    A2    AA    B2    BA
[BX+DI]+disp16       011   83    8B    93    9B    A3    AB    B3    BB
[SI]+disp16      10  100   84    8C    94    9C    A4    AC    B4    BC
[DI]+disp16          101   85    8D    95    9D    A5    AD    B5    BD
[BP]+disp16          110   86    8E    96    9E    A6    AE    B6    BE
[BX]+disp16          111   87    8F    97    9F    A7    AF    B7    BF

EAX/AX/AL            000   C0    C8    D0    D8    E0    E8    F0    F8
ECX/CX/CL            001   C1    C9    D1    D9    E1    E9    F1    F9
EDX/DX/DL            010   C2    CA    D2    DA    E2    EA    F2    FA
EBX/BX/BL            011   C3    CB    D3    DB    E3    EB    F3    FB
ESP/SP/AH        11  100   C4    CC    D4    DC    E4    EC    F4    FC
EBP/BP/CH            101   C5    CD    D5    DD    E5    ED    F5    FD
ESI/SI/DH            110   C6    CE    D6    DE    E6    EE    F6    FE
EDI/DI/BH            111   C7    CF    D7    DF    E7    EF    F7    FF


ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
NOTES:
  disp8 denotes an 8-bit displacement following the ModR/M byte, to be
  sign-extended and added to the index. disp16 denotes a 16-bit displacement
  following the ModR/M byte, to be added to the index. Default segment
  register is SS for the effective addresses containing a BP index, DS for
  other effective addresses.
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ


Table 17-3. 32-Bit Addressing Forms with the ModR/M Byte


r8(/r)                     AL    CL    DL    BL    AH    CH    DH    BH
r16(/r)                    AX    CX    DX    BX    SP    BP    SI    DI
r32(/r)                    EAX   ECX   EDX   EBX   ESP   EBP   ESI   EDI
/digit (Opcode)            0     1     2     3     4     5     6     7
REG =                      000   001   010   011   100   101   110   111

   Effective
ÚÄÄÄAddress
[--] [--] means a SIB follows the ModR/M byte. disp8 denotes an 8-bit
displacement following the SIB byte, to be sign-extended and added to the
index. disp32 denotes a 32-bit displacement following the ModR/M byte, to
be added to the index.ÄÄ¿ ÚMod R/M¿ ÚÄÄÄÄÄÄÄÄÄModR/M Values in HexadecimalÄÄÄÄÄÄÄ¿

[EAX]                000   00    08    10    18    20    28    30    38
[ECX]                001   01    09    11    19    21    29    31    39
[EDX]                010   02    0A    12    1A    22    2A    32    3A
[EBX]                011   03    0B    13    1B    23    2B    33    3B
[--] [--]        00  100   04    0C    14    1C    24    2C    34    3C
disp32               101   05    0D    15    1D    25    2D    35    3D
[ESI]                110   06    0E    16    1E    26    2E    36    3E
[EDI]                111   07    0F    17    1F    27    2F    37    3F

disp8[EAX]           000   40    48    50    58    60    68    70    78
disp8[ECX]           001   41    49    51    59    61    69    71    79
disp8[EDX]           010   42    4A    52    5A    62    6A    72    7A
disp8[EPX];          011   43    4B    53    5B    63    6B    73    7B
disp8[--] [--]   01  100   44    4C    54    5C    64    6C    74    7C
disp8[ebp]           101   45    4D    55    5D    65    6D    75    7D
disp8[ESI]           110   46    4E    56    5E    66    6E    76    7E
disp8[EDI]           111   47    4F    57    5F    67    6F    77    7F

disp32[EAX]          000   80    88    90    98    A0    A8    B0    B8
disp32[ECX]          001   81    89    91    99    A1    A9    B1    B9
disp32[EDX]          010   82    8A    92    9A    A2    AA    B2    BA
disp32[EBX]          011   83    8B    93    9B    A3    AB    B3    BB
disp32[--] [--]  10  100   84    8C    94    9C    A4    AC    B4    BC
disp32[EBP]          101   85    8D    95    9D    A5    AD    B5    BD
disp32[ESI]          110   86    8E    96    9E    A6    AE    B6    BE
disp32[EDI]          111   87    8F    97    9F    A7    AF    B7    BF

EAX/AX/AL            000   C0    C8    D0    D8    E0    E8    F0    F8
ECX/CX/CL            001   C1    C9    D1    D9    E1    E9    F1    F9
EDX/DX/DL            010   C2    CA    D2    DA    E2    EA    F2    FA
EBX/BX/BL            011   C3    CB    D3    DB    E3    EB    F3    FB
ESP/SP/AH        11  100   C4    CC    D4    DC    E4    EC    F4    FC
EBP/BP/CH            101   C5    CD    D5    DD    E5    ED    F5    FD
ESI/SI/DH            110   C6    CE    D6    DE    E6    EE    F6    FE
EDI/DI/BH            111   C7    CF    D7    DF    E7    EF    F7    FF


ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
NOTES:
  [--] [--] means a SIB follows the ModR/M byte. disp8 denotes an 8-bit
  displacement following the SIB byte, to be sign-extended and added to the
  index. disp32 denotes a 32-bit displacement following the ModR/M byte, to
  be added to the index.
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ


Table 17-4. 32-Bit Addressing Forms with the SIB Byte


   r32                      EAX   ECX   EDX   EBX   ESP   [*]
[*] means a disp32 with no base if MOD is 00, [ESP] otherwise. This provides
the following addressing modes:
      disp32[index]        (MOD=00)
      disp8[EBP][index]    (MOD=01)
      disp32[EBP][index]   (MOD=10)  ESI   EDI
   Base =                   0     1     2     3     4     5     6     7
   Base =                   000   001   010   011   100   101   110   111

ÚScaled Index
[*] means a disp32 with no base if MOD is 00, [ESP] otherwise. This provides
the following addressing modes:
      disp32[index]        (MOD=00)
      disp8[EBP][index]    (MOD=01)
      disp32[EBP][index]   (MOD=10)¿ÚSS Index¿  ÚÄÄÄÄÄÄÄÄModR/M Values in HexadecimalÄÄÄÄÄÄÄÄ¿

[EAX]                000    00    01    02    03    04    05    06    07
[ECX]                001    08    09    0A    0B    0C    0D    0E    0F
[EDX]                010    10    11    12    13    14    15    16    17
[EBX]                011    18    19    1A    1B    1C    1D    1E    1F
none             00  100    20    21    22    23    24    25    26    27
[EBP]                101    28    29    2A    2B    2C    2D    2E    2F
[ESI]                110    30    31    32    33    34    35    36    37
[EDI]                111    38    39    3A    3B    3C    3D    3E    3F

[EAX*2]              000    40    41    42    43    44    45    46    47
[ECX*2]              001    48    49    4A    4B    4C    4D    4E    4F
[ECX*2]              010    50    51    52    53    54    55    56    57
[EBX*2]              011    58    59    5A    5B    5C    5D    5E    5F
none             01  100    60    61    62    63    64    65    66    67
[EBP*2]              101    68    69    6A    6B    6C    6D    6E    6F
[ESI*2]              110    70    71    72    73    74    75    76    77
[EDI*2]              111    78    79    7A    7B    7C    7D    7E    7F

[EAX*4]              000    80    81    82    83    84    85    86    87
[ECX*4]              001    88    89    8A    8B    8C    8D    8E    8F
[EDX*4]              010    90    91    92    93    94    95    96    97
[EBX*4]              011    98    89    9A    9B    9C    9D    9E    9F
none             10  100    A0    A1    A2    A3    A4    A5    A6    A7
[EBP*4]              101    A8    A9    AA    AB    AC    AD    AE    AF
[ESI*4]              110    B0    B1    B2    B3    B4    B5    B6    B7
[EDI*4]              111    B8    B9    BA    BB    BC    BD    BE    BF

[EAX*8]              000    C0    C1    C2    C3    C4    C5    C6    C7
[ECX*8]              001    C8    C9    CA    CB    CC    CD    CE    CF
[EDX*8]              010    D0    D1    D2    D3    D4    D5    D6    D7
[EBX*8]              011    D8    D9    DA    DB    DC    DD    DE    DF
none             11  100    E0    E1    E2    E3    E4    E5    E6    E7
[EBP*8]              101    E8    E9    EA    EB    EC    ED    EE    EF
[ESI*8]              110    F0    F1    F2    F3    F4    F5    F6    F7
[EDI*8]              111    F8    F9    FA    FB    FC    FD    FE    FF


ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
NOTES:
  [*] means a disp32 with no base if MOD is 00, [ESP] otherwise. This
  provides the following addressing modes:
      disp32[index]        (MOD=00)
      disp8[EBP][index]    (MOD=01)
      disp32[EBP][index]   (MOD=10)
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ


17.2.2  How to Read the Instruction Set Pages

The following is an example of the format used for each 80386 instruction
description in this chapter:

CMC ÄÄ Complement Carry Flag

Opcode   Instruction         Clocks      Description

F5        CMC                  2            Complement carry flag

The above table is followed by paragraphs labelled "Operation,"
"Description," "Flags Affected," "Protected Mode Exceptions," "Real
Address Mode Exceptions," and, optionally, "Notes." The following sections
explain the notational conventions and abbreviations used in these
paragraphs of the instruction descriptions.


17.2.2.1  Opcode

The "Opcode" column gives the complete object code produced for each form
of the instruction. When possible, the codes are given as hexadecimal bytes,
in the same order in which they appear in memory. Definitions of entries
other than hexadecimal bytes are as follows:

/digit: (digit is between 0 and 7) indicates that the ModR/M byte of the
instruction uses only the r/m (register or memory) operand. The reg field
contains the digit that provides an extension to the instruction's opcode.

/r: indicates that the ModR/M byte of the instruction contains both a
register operand and an r/m operand.

cb, cw, cd, cp: a 1-byte (cb), 2-byte (cw), 4-byte (cd) or 6-byte (cp)
value following the opcode that is used to specify a code offset and
possibly a new value for the code segment register.

ib, iw, id: a 1-byte (ib), 2-byte (iw), or 4-byte (id) immediate operand to
the instruction that follows the opcode, ModR/M bytes or scale-indexing
bytes. The opcode determines if the operand is a signed value. All words and
doublewords are given with the low-order byte first.

+rb, +rw, +rd: a register code, from 0 through 7, added to the hexadecimal
byte given at the left of the plus sign to form a single opcode byte. The
codes areÄÄ

      rb         rw         rd
    AL = 0     AX = 0     EAX = 0
    CL = 1     CX = 1     ECX = 1
    DL = 2     DX = 2     EDX = 2
    BL = 3     BX = 3     EBX = 3
    AH = 4     SP = 4     ESP = 4
    CH = 5     BP = 5     EBP = 5
    DH = 6     SI = 6     ESI = 6
    BH = 7     DI = 7     EDI = 7


17.2.2.2  Instruction

The "Instruction" column gives the syntax of the instruction statement as
it would appear in an ASM386 program. The following is a list of the symbols
used to represent operands in the instruction statements:

rel8: a relative address in the range from 128 bytes before the end of the
instruction to 127 bytes after the end of the instruction.

rel16, rel32: a relative address within the same code segment as the
instruction assembled. rel16 applies to instructions with an operand-size
attribute of 16 bits; rel32 applies to instructions with an operand-size
attribute of 32 bits.

ptr16:16, ptr16:32: a FAR pointer, typically in a code segment different
from that of the instruction. The notation 16:16 indicates that the value of
the pointer has two parts. The value to the right of the colon is a 16-bit
selector or value destined for the code segment register. The value to the
left corresponds to the offset within the destination segment. ptr16:16 is
used when the instruction's operand-size attribute is 16 bits; ptr16:32 is
used with the 32-bit attribute.

r8: one of the byte registers AL, CL, DL, BL, AH, CH, DH, or BH.

r16: one of the word registers AX, CX, DX, BX, SP, BP, SI, or DI.

r32: one of the doubleword registers EAX, ECX, EDX, EBX, ESP, EBP, ESI, or
EDI.

imm8: an immediate byte value. imm8 is a signed number between -128 and
+127 inclusive. For instructions in which imm8 is combined with a word or
doubleword operand, the immediate value is sign-extended to form a word or
doubleword. The upper byte of the word is filled with the topmost bit of the
immediate value.

imm16: an immediate word value used for instructions whose operand-size
attribute is 16 bits. This is a number between -32768 and +32767 inclusive.

imm32: an immediate doubleword value used for instructions whose
operand-size attribute is 32-bits. It allows the use of a number between
+2147483647 and -2147483648.

r/m8: a one-byte operand that is either the contents of a byte register
(AL, BL, CL, DL, AH, BH, CH, DH), or a byte from memory.

r/m16: a word register or memory operand used for instructions whose
operand-size attribute is 16 bits. The word registers are: AX, BX, CX, DX,
SP, BP, SI, DI. The contents of memory are found at the address provided by
the effective address computation.

r/m32: a doubleword register or memory operand used for instructions whose
operand-size attribute is 32-bits. The doubleword registers are: EAX, EBX,
ECX, EDX, ESP, EBP, ESI, EDI. The contents of memory are found at the
address provided by the effective address computation.

m8: a memory byte addressed by DS:SI or ES:DI (used only by string
instructions).

m16: a memory word addressed by DS:SI or ES:DI (used only by string
instructions).

m32: a memory doubleword addressed by DS:SI or ES:DI (used only by string
instructions).

m16:16, M16:32: a memory operand containing a far pointer composed of two
numbers. The number to the left of the colon corresponds to the pointer's
segment selector. The number to the right corresponds to its offset.

m16 & 32, m16 & 16, m32 & 32: a memory operand consisting of data item pairs
whose sizes are indicated on the left and the right side of the ampersand.
All memory addressing modes are allowed. m16 & 16 and m32 & 32 operands are
used by the BOUND instruction to provide an operand containing an upper and
lower bounds for array indices. m16 & 32 is used by LIDT and LGDT to
provide a word with which to load the limit field, and a doubleword with
which to load the base field of the corresponding Global and Interrupt
Descriptor Table Registers.

moffs8, moffs16, moffs32: (memory offset) a simple memory variable of type
BYTE, WORD, or DWORD used by some variants of the MOV instruction. The
actual address is given by a simple offset relative to the segment base. No
ModR/M byte is used in the instruction. The number shown with moffs
indicates its size, which is determined by the address-size attribute of the
instruction.

Sreg: a segment register. The segment register bit assignments are ES=0,
CS=1, SS=2, DS=3, FS=4, and GS=5.


17.2.2.3  Clocks

The "Clocks" column gives the number of clock cycles the instruction takes
to execute. The clock count calculations makes the following assumptions:

  þ  The instruction has been prefetched and decoded and is ready for
     execution.

  þ  Bus cycles do not require wait states.

  þ  There are no local bus HOLD requests delaying processor access to the
     bus.

  þ  No exceptions are detected during instruction execution.

  þ  Memory operands are aligned.

Clock counts for instructions that have an r/m (register or memory) operand
are separated by a slash. The count to the left is used for a register
operand; the count to the right is used for a memory operand.

The following symbols are used in the clock count specifications:

  þ  n, which represents a number of repetitions.

  þ  m, which represents the number of components in the next instruction
     executed, where the entire displacement (if any) counts as one
     component, the entire immediate data (if any) counts as one component,
     and every other byte of the instruction and prefix(es) each counts as
     one component.

  þ  pm=, a clock count that applies when the instruction executes in
     Protected Mode. pm= is not given when the clock counts are the same for
     Protected and Real Address Modes.

When an exception occurs during the execution of an instruction and the
exception handler is in another task, the instruction execution time is
increased by the number of clocks to effect a task switch. This parameter
depends on several factors:

  þ  The type of TSS used to represent the current task (386 TSS or 286
     TSS).

  þ  The type of TSS used to represent the new task.

  þ  Whether the current task is in V86 mode.

  þ  Whether the new task is in V86 mode.

Table 17-5 summarizes the task switch times for exceptions.


Table 17-5. Task Switch Times for Exceptions

                       New Task

Old              386 TSS     286 TSS
Task             VM = 0

386   VM = 0       309        282
TSS

386   VM = 1       314        231
TSS

286                307        282
TSS


17.2.2.4  Description

The "Description" column following the "Clocks" column briefly explains the
various forms of the instruction. The "Operation" and "Description" sections
contain more details of the instruction's operation.


17.2.2.5  Operation

The "Operation" section contains an algorithmic description of the
instruction which uses a notation similar to the Algol or Pascal language.
The algorithms are composed of the following elements:

Comments are enclosed within the symbol pairs "(*" and "*)".

Compound statements are enclosed between the keywords of the "if" statement
(IF, THEN, ELSE, FI) or of the "do" statement (DO, OD), or of the "case"
statement (CASE ... OF, ESAC).

A register name implies the contents of the register. A register name
enclosed in brackets implies the contents of the location whose address is
contained in that register. For example, ES:[DI] indicates the contents of
the location whose ES segment relative address is in register DI. [SI]
indicates the contents of the address contained in register SI relative to
SI's default segment (DS) or overridden segment.

Brackets also used for memory operands, where they mean that the contents
of the memory location is a segment-relative offset. For example, [SRC]
indicates that the contents of the source operand is a segment-relative
offset.

A  B; indicates that the value of B is assigned to A.

The symbols =, <>, ò, and ó are relational operators used to compare two
values, meaning equal, not equal, greater or equal, less or equal,
respectively. A relational expression such as A = B is TRUE if the value of
A is equal to B; otherwise it is FALSE.

The following identifiers are used in the algorithmic descriptions:

  þ  OperandSize represents the operand-size attribute of the instruction,
     which is either 16 or 32 bits. AddressSize represents the address-size
     attribute, which is either 16 or 32 bits. For example,

   IF instruction = CMPSW
   THEN OperandSize  16;
   ELSE
      IF instruction = CMPSD
      THEN OperandSize  32;
      FI;
   FI;

indicates that the operand-size attribute depends on the form of the CMPS
instruction used. Refer to the explanation of address-size and operand-size
attributes at the beginning of this chapter for general guidelines on how
these attributes are determined.

  þ  StackAddrSize represents the stack address-size attribute associated
     with the instruction, which has a value of 16 or 32 bits, as explained
     earlier in the chapter.

  þ  SRC represents the source operand. When there are two operands, SRC is
     the one on the right.

  þ  DEST represents the destination operand. When there are two operands,
     DEST is the one on the left.

  þ  LeftSRC, RightSRC distinguishes between two operands when both are
     source operands.

  þ  eSP represents either the SP register or the ESP register depending on
     the setting of the B-bit for the current stack segment.

The following functions are used in the algorithmic descriptions:

  þ  Truncate to 16 bits(value) reduces the size of the value to fit in 16
     bits by discarding the uppermost bits as needed.

  þ  Addr(operand) returns the effective address of the operand (the result
     of the effective address calculation prior to adding the segment base).

  þ  ZeroExtend(value) returns a value zero-extended to the operand-size
     attribute of the instruction. For example, if OperandSize = 32,
     ZeroExtend of a byte value of -10 converts the byte from F6H to
     doubleword with hexadecimal value 000000F6H. If the value passed to
     ZeroExtend and the operand-size attribute are the same size,
     ZeroExtend returns the value unaltered.

  þ  SignExtend(value) returns a value sign-extended to the operand-size
     attribute of the instruction. For example, if OperandSize = 32,
     SignExtend of a byte containing the value -10 converts the byte from
     F6H to a doubleword with hexadecimal value FFFFFFF6H. If the value
     passed to SignExtend and the operand-size attribute are the same size,
     SignExtend returns the value unaltered.

  þ  Push(value) pushes a value onto the stack. The number of bytes pushed
     is determined by the operand-size attribute of the instruction. The
     action of Push is as follows:

   IF StackAddrSize = 16
   THEN
      IF OperandSize = 16
      THEN
         SP  SP - 2;
         SS:[SP]  value; (* 2 bytes assigned starting at
                             byte address in SP *)
      ELSE (* OperandSize = 32 *)
         SP  SP - 4;
         SS:[SP]  value; (* 4 bytes assigned starting at
                             byte address in SP *)
      FI;
   ELSE (* StackAddrSize = 32 *)
      IF OperandSize = 16
      THEN
         ESP  ESP - 2;
         SS:[ESP]  value; (* 2 bytes assigned starting at
                              byte address in ESP*)
      ELSE (* OperandSize = 32 *)
         ESP  ESP - 4;
         SS:[ESP]  value; (* 4 bytes assigned starting at
                              byte address in ESP*)
      FI;
   FI;

  þ  Pop(value) removes the value from the top of the stack and returns it.
     The statement EAX  Pop( ); assigns to EAX the 32-bit value that Pop
     took from the top of the stack. Pop will return either a word or a
     doubleword depending on the operand-size attribute. The action of Pop
     is as follows:

   IF StackAddrSize = 16
   THEN
      IF OperandSize = 16
      THEN
         ret val  SS:[SP]; (* 2-byte value *)
         SP  SP + 2;
      ELSE (* OperandSize = 32 *)
         ret val  SS:[SP]; (* 4-byte value *)
         SP  SP + 4;
      FI;
   ELSE (* StackAddrSize = 32 *)
      IF OperandSize = 16
      THEN
         ret val  SS:[ESP]; (* 2 bytes value *)
         ESP  ESP + 2;
      ELSE (* OperandSize = 32 *)
         ret val  SS:[ESP]; (* 4 bytes value *)
         ESP  ESP + 4;
      FI;
   FI;
   RETURN(ret val); (*returns a word or doubleword*)

  þ  Bit[BitBase, BitOffset] returns the address of a bit within a bit
     string, which is a sequence of bits in memory or a register. Bits are
     numbered from low-order to high-order within registers and within
     memory bytes. In memory, the two bytes of a word are stored with the
     low-order byte at the lower address.

   If the base operand is a register, the offset can be in the range 0..31.
   This offset addresses a bit within the indicated register. An example,
   "BIT[EAX, 21]," is illustrated in Figure 17-3.

   If BitBase is a memory address, BitOffset can range from -2 gigabits to 2
   gigabits. The addressed bit is numbered (Offset MOD 8) within the byte at
   address (BitBase + (BitOffset DIV 8)), where DIV is signed division with
   rounding towards negative infinity, and MOD returns a positive number.
   This is illustrated in Figure 17-4.

  þ  I-O-Permission(I-O-Address, width) returns TRUE or FALSE depending on
   the I/O permission bitmap and other factors. This function is defined as
   follows:

   IF TSS type is 286 THEN RETURN FALSE; FI;
   Ptr  [TSS + 66]; (* fetch bitmap pointer *)
   BitStringAddr  SHR (I-O-Address, 3) + Ptr;
   MaskShift  I-O-Address AND 7;
   CASE width OF:
         BYTE: nBitMask  1;
         WORD: nBitMask  3;
         DWORD: nBitMask  15;
   ESAC;
   mask  SHL (nBitMask, MaskShift);
   CheckString  [BitStringAddr] AND mask;
   IF CheckString = 0
   THEN RETURN (TRUE);
   ELSE RETURN (FALSE);
   FI;

  þ  Switch-Tasks is the task switching function described in Chapter 7.


17.2.2.6  Description

The "Description" section contains further explanation of the instruction's
operation.


Figure 17-3.  Bit Offset for BIT[EAX, 21]

   31                    21                                               0
  ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍËÍËÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
  º                     º º                                               º
  ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÊÍÊÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ
                                                                         
                         ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄBITOFFSET = 21ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ


Figure 17-4.  Memory Bit Indexing

                         BIT INDEXING (POSITIVE OFFSET)

               7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
             ÉÍÍÍÍÑÍÑÍÍÍÍÍÍÍÍÍÑÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÑÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
             º    ³ ³         ³               ³                º
             ÈÍÍÍÍÏÍÏÍÍÍÍÍÍÍÍÍÏÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ
             ³  BITBASE + 1   ³    BITBASE    ³  BITBASE - 1   ³
                                             ³
                   ÀÄÄÄÄÄÄÄÄOFFSET = 13ÄÄÄÄÄÄÄÙ

                         BIT INDEXING (NEGATIVE OFFSET)

               7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
             ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÑÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÑÍÍÍÑÍÑÍÍÍÍÍÍÍÍÍÍ»
             º                ³               ³   ³ ³          º
             ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÏÍÍÍÏÍÏÍÍÍÍÍÍÍÍÍͼ
             ³    BITBASE     ³  BITBASE - 1  ³  BITBASE - 2   ³
                              ³                    
                              ÀÄÄÄÄÄOFFSET = -11ÄÄÄÙ


17.2.2.7  Flags Affected

The "Flags Affected" section lists the flags that are affected by the
instruction, as follows:

  þ  If a flag is always cleared or always set by the instruction, the
     value is given (0 or 1) after the flag name. Arithmetic and logical
     instructions usually assign values to the status flags in the uniform
     manner described in Appendix C. Nonconventional assignments are
     described in the "Operation" section.

  þ  The values of flags listed as "undefined" may be changed by the
     instruction in an indeterminate manner.

All flags not listed are unchanged by the instruction.


17.2.2.8  Protected Mode Exceptions

This section lists the exceptions that can occur when the instruction is
executed in 80386 Protected Mode. The exception names are a pound sign (#)
followed by two letters and an optional error code in parentheses. For
example, #GP(0) denotes a general protection exception with an error code of
0. Table 17-6 associates each two-letter name with the corresponding
interrupt number.

Chapter 9 describes the exceptions and the 80386 state upon entry to the
exception.

Application programmers should consult the documentation provided with
their operating systems to determine the actions taken when exceptions
occur.


Table 17-6. 80386 Exceptions

Mnemonic     Interrupt    Description



17.2.2.9  Real Address Mode Exceptions

Because less error checking is performed by the 80386 in Real Address Mode,
this mode has fewer exception conditions. Refer to Chapter 14 for further
information on these exceptions.

continue

Created by: Astalalista - Last modification: 2004/Jan/26 at 02:57
Labelled with ICRA This page is powered by Copyright Button(TM).
Click here to read how this page is protected by copyright laws.
Please send any
comments to Webmaster


KingsClick Sponsor - Click Here
KingsClick-Your Website Deserves to be King