Intel 253668-032US - Manuals
Summary
ii Vol. 3A INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUME...
Vol. 3A iii CONTENTS
CHAPTER 1 ABOUT THIS MANUAL
1.1 PROCESSORS COVERED IN THIS MANUAL, page 1-1
1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE, page 1-3
1...
CONTENTS iv Vol. 3A
2.7.5 Controlling the Processor, page 2-31
2.7.6 Reading Performance-Monitoring and Time-Stamp Counters, page 2-32
2.7.6.1 Reading Coun...
Vol. 3A v CONTENTS
4.9.3 Caching Paging-Related Information about Memory Typing, page 4-38
4.10 CACHING TRANSLATION INFORMATION, page 4-38
4.10.1 ...
CONTENTS vi Vol. 3A
5.8.7.1 SYSENTER and SYSEXIT Instructions in IA-32e Mode, page 5-31
5.8.8 Fast System Calls in 64-bit Mode, page 5-32
5.9 PRIVILEGED INSTRUCT...
Vol. 3A vii CONTENTS
6.14 EXCEPTION AND INTERRUPT HANDLING IN 64-BIT MODE, page 6-22
6.14.1 64-Bit Mode IDT, page 6-23
6.14.2 ...
CONTENTS viii Vol. 3A
CHAPTER 8 MULTIPLE-PROCESSOR MANAGEMENT
8.1 LOCKED ATOMIC OPERATIONS, page 8-2
8.1.1 Guaranteed Atomic Operations ...
Vol. 3A ix CONTENTS
8.7.9 Memory Ordering, page 8-42
8.7.10 Serializing Instructions ...
CONTENTS x Vol. 3A
9.5 MEMORY TYPE RANGE REGISTERS (MTRRS), page 9-9
9.6 INITIALIZING SSE/SSE2/SSE3/SSSE3 EXTENSIONS, page 9-10
9.7 SOFTWARE INITIALIZATION F...
Vol. 3A xi CONTENTS
CHAPTER 10 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC)
10.1 LOCAL AND I/O APIC OVERVIEW, page 10-1
10.2 SYSTEM BUS VS. APIC BUS ...
CONTENTS xii Vol. 3A
10.7.2.4 Deriving Logical x2APIC ID from the Local x2APIC ID, page 10-50
10.7.2.5 Broadcast/Self Delivery Mode, page 10-51
10.7.2.6 Lowest Prior...
Vol. 3A xiii CONTENTS
11.11 MEMORY TYPE RANGE REGISTERS (MTRRS), page 11-30
11.11.1 MTRR Feature Identification, page 11-32
11.11....
CONTENTS xiv Vol. 3A
13.1.6.1 Numeric Error flag and IGNNE#, page 13-8
13.2 EMULATION OF SSE/SSE2/SSE3/SSSE3/SSE4 EXTENSIONS, page 13-8
13.3 SAVING AND RESTORING TH...
Vol. 3A xv CONTENTS
15.3 MACHINE-CHECK MSRS, page 15-2
15.3.1 Machine-Check Global Control MSRs ...
CONTENTS xvi Vol. 3A
CHAPTER 16 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER
16.1 OVERVIEW OF DEBUG SUPPORT FACILITIES, page 16-1
16.2 DEBUG REGISTERS ...
Vol. 3A xvii CONTENTS
16.9 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS), page 16-43
16.10 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) ...
CONTENTS xviii Vol. 3A
CHAPTER 18 MIXING 16-BIT AND 32-BIT CODE
18.1 DEFINING 16-BIT AND 32-BIT PROGRAM MODULES, page 18-2
18.2 MIXING 16-BIT AND 32-BIT OPERATIONS WITHIN A CODE SEGMENT, page 18-2
18.3 SHARI...
Vol. 3A xix CONTENTS
19.18.6.3 Numeric Underflow Exception (#U), page 19-14
19.18.6.4 Exception Precedence, page 19-14
...
CONTENTS xx Vol. 3A
19.25 EXCEPTIONS AND/OR EXCEPTION CONDITIONS, page 19-28
19.25.1 Machine-Check Architecture, page 19-30
19.25.2 Pri...
Vol. 3A xxi CONTENTS
20.5 VIRTUAL-MACHINE CONTROL STRUCTURE, page 20-3
20.6 DISCOVERING SUPPORT FOR VMX, page 20-3
20.7 ENABLIN...
CONTENTS xxii Vol. 3A
CHAPTER 22 VMX NON-ROOT OPERATION
22.1 INSTRUCTIONS THAT CAUSE VM EXITS, page 22-1
22.1.1 Relative Priority of Faults and VM Exits ...
Vol. 3A xxiii CONTENTS
23.3.1.3 Checks on Guest Descriptor-Table Registers, page 23-15
23.3.1.4 Checks on Guest RIP and RFLAGS, page 23-15
23.3.1.5 Checks on G...
CONTENTS xxiv Vol. 3A
24.5.6 Clearing Address-Range Monitoring, page 24-37
24.6 LOADING MSRS ...
Vol. 3A xxv CONTENTS
26.11 SMBASE RELOCATION, page 26-19
26.11.1 Relocating SMRAM to an Address Above 1 MByte, page 26-20
26.12 I...
CONTENTS xxvi Vol. 3A
27.7.1 Handling VM Exits Due to Exceptions, page 27-11
27.7.1.1 Reflecting Exceptions to Guest Software, page 27-11
27.7.1.2 Resum...
Vol. 3A xxvii CONTENTS
CHAPTER 29 HANDLING BOUNDARY CONDITIONS IN A VIRTUAL MACHINE MONITOR
29.1 OVERVIEW, page 29-1
29.2 INTERRUPT HANDLING IN VMX OPERATION ...
Vol. 3A xxix CONTENTS
30.10.3 Incrementing the Time-Stamp Counter, page 30-77
30.10.4 Non-Halted Reference Clockticks, page 30-77
30.1...
Vol. 3A xxxi CONTENTS
E.4.3 Processor Model Specific Error Code Field, page E-21
E.4.3.1 MCA Error Type A: L3 Error, page E-21
E...
CONTENTS xxxii Vol. 3A
H.4.2 Natural-Width Read-Only Data Fields, page H-10
H.4.3 Natural-Width Guest-State Fields, page H-10
...
Vol. 3A xxxiii CONTENTS
FIGURES
Figure 1-1. Bit and Byte Order, page 1-7
Figure 1-2. Syntax for CPUID, CR, and MSR Data Presentation ...
CONTENTS xxxiv Vol. 3A
Figure 6-2. IDT Gate Descriptors, page 6-15
Figure 6-3. Interrupt Procedure Call ...
Vol. 3A xxxv CONTENTS
Figure 10-14. Error Status Register (ESR) in x2APIC Mode, page 10-36
Figure 10-15. Divide Configuration Register, page 10-37
Fi...
Vol. 3A xxxvii CONTENTS
Figure 29-1. Host External Interrupts and Guest Virtual Interrupts, page 29-5
Figure 30-1. Layout of IA32_PERFEVTSELx MSRs, page 30-4
Figure 30-2. Layout...
CONTENTS xxxviii Vol. 3A
TABLES
Table 2-1. Action Taken By x87 FPU Instructions for Different Combinations of EM, MP, and TS, page 2-21
Table 2-2. Summary of System Instructions, page 2-27
Table 3-1. Code- and Data-Segme...
Vol. 3A xli CONTENTS
Table 21-4. Format of Pending-Debug-Exceptions, page 21-8
Table 21-5. Definitions of Pin-Based VM-Execution Controls, page 21-11
Table 21-6. Definiti...
CONTENTS xliv Vol. 3A
Table F-2. Short Message (21 Cycles), page F-2
Table F-3. Non-Focused Lowest Priority Message (34 Cycles), page F-3
Table F-...
Vol. 3 1-1 CHAPTER 1 ABOUT THIS MANUAL The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 (order number 253668) and the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2 (order number...
Vol. 3 1-3 ABOUT THIS MANUAL The Intel® Core™ i7 processor and the Intel® Core™ i5 processor are based on the Intel® microarchitecture (Nehalem) and support Intel 64 architecture. Processors based on the Next Generation Intel Processor, codenamed Westmere, support Intel 64 architecture. P6 fam...
1-4 Vol. 3 ABOUT THIS MANUAL Chapter 6 — Interrupt and Exception Handling. Describes the basic interrupt mechanisms defined in the Intel 64 and IA-32 architectures, shows how interrupts and exceptions relate to protection, and describes how the architecture handles each exception type. Reference inf...
1-6 Vol. 3 ABOUT THIS MANUAL Chapter 30 — Performance Monitoring. Describes the Intel 64 and IA-32 architectures’ facilities for monitoring performance. Appendix A — Performance-Monitoring Events. Lists architectural performance events. Non-architectural performance events (i.e. model-specific event...
Vol. 3 1-7 ABOUT THIS MANUAL means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions. 1.3.2 Reserved Bits and Software Compatibility In many register and memory layout descriptions, certain bits are marked as reserved. When bits are m...
1-8 Vol. 3 ABOUT THIS MANUAL 1.3.3 Instruction Operands When instructions are represented symbolically, a subset of assembly language is used. In this subset, an instruction has the following format: label: mnemonic argument1, argument2, argument3 where: • A label is an identifier which is followed ...
Vol. 3 1-11 ABOUT THIS MANUAL This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions which produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown ...
2-2 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW initiates the switch from real-address mode to protected mode. If IA-32e mode operation is desired, software also initiates a switch from protected mode to IA-32e mode. 2.1 OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE System-level architecture consists of a set ...
Vol. 3 2-5 SYSTEM ARCHITECTURE OVERVIEW 2.1.1 Global and Local Descriptor Tables When operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT) as shown in Figure 2-1. These tables contain entries called segment...
2-6 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The architecture also defines a set of special descriptors called gates (call gates, interrupt gates, trap gates, and task gates). These provide protected gateways to system procedures and handlers that may operate at a different privilege level than applicati...
Vol. 3 2-7 SYSTEM ARCHITECTURE OVERVIEW
2. Loads the task register with the segment selector for the new task.
3. Accesses the new TSS through a segment descriptor in the GDT.
4. Loads the state of the new task from the new TSS into the general-purpose registers, the segment registers, the LDTR, contr...
2-8 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The IDTR register is expanded to hold a 64-bit base address. Task gates are not supported. 2.1.5 Memory Management System architecture supports either direct physical addressing of memory or virtual memory (through paging). When physical addressing is used, a ...
Vol. 3 2-9 SYSTEM ARCHITECTURE OVERVIEW 2.1.6 System Registers To assist in initializing the processor and controlling system operations, the system architecture provides system flags in the EFLAGS register and several system registers: • The system flags and IOPL field in the EFLAGS register contro...
2-10 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW On systems that support IA-32e mode, the extended feature enable register (IA32_EFER) is available. This model-specific register controls activation of IA-32e mode and other IA-32e mode operations. In addition, there are several model-specific registers that ...
Vol. 3 2-11 SYSTEM ARCHITECTURE OVERVIEW running program or task. SMM-specific code may then be executed transparently. Upon returning from SMM, the processor is placed back into its state prior to the SMI. • Virtual-8086 mode — In protected mode, the processor supports a quasi-operating mode known ...
2-12 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The VM flag in the EFLAGS register determines whether the processor is operating in protected mode or virtual-8086 mode. Transitions between protected mode and virtual-8086 mode are generally carried out as part of a task switch or a return from an interrupt ...
Vol. 3 2-13 SYSTEM ARCHITECTURE OVERVIEW IF Interrupt enable (bit 9) — Controls the response of the processor to maskable hardware interrupt requests (see also: Section 6.3.2, “Maskable Hardware Interrupts”). The flag is set to respond to maskable hardware interrupts; cleared to inhibit maskable har...
Vol. 3 2-15 SYSTEM ARCHITECTURE OVERVIEW VIP Virtual interrupt pending (bit 20) — Set by software to indicate that an interrupt is pending; cleared to indicate that no interrupt is pending. This flag is used in conjunction with the VIF flag. The processor reads this flag but never modifies it. The p...
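The flag descriptions above give bit positions within EFLAGS (IF is bit 9, VIP is bit 20). A minimal sketch of testing those bits in a saved EFLAGS value follows; the constant and function names are illustrative assumptions, not identifiers from the manual.

```python
# Bit positions taken from the text: IF is bit 9, VIP is bit 20 of EFLAGS.
# The names below are hypothetical, chosen only for this illustration.
EFLAGS_IF = 1 << 9     # interrupt enable flag
EFLAGS_VIP = 1 << 20   # virtual interrupt pending flag


def flag_is_set(eflags: int, mask: int) -> bool:
    """Return True if the flag selected by mask is set in a saved EFLAGS image."""
    return (eflags & mask) != 0
```

For example, `flag_is_set(0x200, EFLAGS_IF)` is true because 0x200 has only bit 9 set.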
Vol. 3 2-17 SYSTEM ARCHITECTURE OVERVIEW 2.4.3 IDTR Interrupt Descriptor Table Register The IDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and 16-bit table limit for the IDT. The base address specifies the linear address of byte 0 of the IDT; the table limit...
Vol. 3 2-21 SYSTEM ARCHITECTURE OVERVIEW delayed until an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed by the new task. The processor sets this flag on every task switch and tests it when executing x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instructions. • If the TS flag is set an...
Vol. 3 2-25 SYSTEM ARCHITECTURE OVERVIEW processor will generate an invalid opcode exception (#UD) if it attempts to execute any SSE/SSE2/SSE3/SSSE3/SSE4 instruction, with the exception of PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, CLFLUSH, CRC32, and POPCNT. The operating system or executive must ex...
2-26 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW all interrupts are enabled. This field is available in 64-bit mode. A value of 15 means all interrupts will be disabled. 2.5.1 CPUID Qualification of Control Register Flags The VME, PVI, TSD, DE, PSE, PAE, MCE, PGE, PCE, OSFXSR, and OSXMMEXCPT flags in contro...
Vol. 3 2-27 SYSTEM ARCHITECTURE OVERVIEW state, SSE state, or a future processor extended state) is represented by a bit in XCR0. The OS can enable future processor extended states in a forward manner by specifying the appropriate bit mask value using the XSETBV instruction according to the results ...
Vol. 3 2-29 SYSTEM ARCHITECTURE OVERVIEW 2.7.1 Loading and Storing System Registers The GDTR, LDTR, IDTR, and TR registers each have a load and store instruction for loading data into and storing data from the register: • LGDT (Load GDTR Register) — Loads the GDT base address and limit from memory i...
2-30 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The LMSW (load machine status word) and SMSW (store machine status word) instructions operate on bits 0 through 15 of control register CR0. These instructions are provided for compatibility with the 16-bit Intel 286 processor. Programs written to run on 32-bi...
Vol. 3 2-31 SYSTEM ARCHITECTURE OVERVIEW Instructions),” for a detailed explanation of the function and use of this instruction. 2.7.3 Loading and Storing Debug Registers Internal debugging facilities in the processor are controlled by a set of 8 debug registers (DR0-DR7). The MOV instruction allow...
2-32 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW introduced with the Pentium Pro processor). If any non-wake events are pending during shutdown, they will be handled after the wake event from shutdown is processed (for example, A20M# interrupts). The LOCK prefix invokes a locked (atomic) read-modify-write op...
Vol. 3 2-33 SYSTEM ARCHITECTURE OVERVIEW Fixed-function performance counters record only specific events that are defined in Chapter 30, “Performance Monitoring”, and the width/number of fixed-function counters are enumerated by CPUID leaf 0AH. The time-stamp counter is a model-sp...
2-34 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW 2.7.7.1 Reading and Writing Model-Specific Registers in 64-Bit Mode RDMSR and WRMSR require an index to specify the address of an MSR. In 64-bit mode, the index is 32 bits; it is specified using ECX. 2.7.8 Enabling Processor Extended States The XSETBV instruc...
Vol. 3 3-1 CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT This chapter describes the Intel 64 and IA-32 architecture’s protected-mode memory management facilities, including the physical memory requirements, segmentation mechanism, and paging mechanism. See also: Chapter 5, “Protection” (for a descriptio...
3-2 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT segment, the segment type, and the location of the first byte of the segment in the linear address space (called the base address of the segment). The offset part of the logical address is added to the base address for the segment to locate a byte within t...
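The translation described above (offset added to the segment base to form a linear address) can be sketched as follows. This is an illustration only: the function name is hypothetical, and the 32-bit wraparound mask is an assumption that applies to protected mode outside 64-bit mode.

```python
LINEAR_MASK_32 = 0xFFFFFFFF  # assumption: 32-bit linear-address space


def linear_address(segment_base: int, offset: int) -> int:
    """Add the offset of a logical address to the segment base address.

    The sum is truncated to 32 bits (assumed wraparound behavior).
    """
    return (segment_base + offset) & LINEAR_MASK_32
```

With a base of 0x10000 and an offset of 0x234, the sketch yields 0x10234. Limit and access-rights checks, which the processor performs before using the address, are deliberately left out.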
Vol. 3 3-3 PROTECTED-MODE MEMORY MANAGEMENT storage. When using paging, each segment is divided into pages (typically 4 KBytes each in size), which are stored either in physical memory or on the disk. The operating system or executive maintains a page directory and a set of page tables to keep trac...
3-4 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT FFFF_FFF0H. RAM (DRAM) is placed at the bottom of the address space because the initial base address for the DS data segment after reset initialization is 0. 3.2.2 Protected Flat Model The protected flat model is similar to the basic flat model, except the...
Vol. 3 3-5 PROTECTED-MODE MEMORY MANAGEMENT More complexity can be added to this protected flat model to provide more protection. For example, for the paging mechanism to provide isolation between user and supervisor code and data, four segments need to be defined: code and data segments at privile...
3-6 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT Access checks can be used to protect not only against referencing an address outside the limit of a segment, but also against performing disallowed operations in certain segments. For example, since code segments are designated as read-only segments, hardw...
Vol. 3 3-7 PROTECTED-MODE MEMORY MANAGEMENT In 64-bit mode, segmentation is generally (but not completely) disabled, creating a flat 64-bit linear-address space. The processor treats the segment base of CS, DS, ES, SS as zero, creating a linear address that is equal to the effective address. The FS ...
3-8 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.3.1 Intel ® 64 Processors and Physical Address Space On processors that support Intel 64 architecture (CPUID.80000001:EDX[29] = 1), the size of the physical address range is implementation-specific and indicated by CPUID.80000008H:EAX[bits 7-0]. For the ...
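As a hedged illustration of the enumeration above, the physical-address width can be taken from bits 7-0 of the EAX value returned by CPUID leaf 80000008H. The sketch takes that EAX value as an argument (Python cannot execute CPUID directly), and the function names are assumptions.

```python
def physical_address_width(cpuid_80000008_eax: int) -> int:
    """Extract the physical-address width from EAX[7:0] of CPUID.80000008H."""
    return cpuid_80000008_eax & 0xFF


def physical_address_space_bytes(cpuid_80000008_eax: int) -> int:
    """Size of the physical address range implied by that width, in bytes."""
    return 1 << physical_address_width(cpuid_80000008_eax)
```

An EAX value whose low byte is 24H corresponds to a 36-bit physical address range, that is, 64 GBytes.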
Vol. 3 3-9 PROTECTED-MODE MEMORY MANAGEMENT If paging is not used, the processor maps the linear address directly to a physical address (that is, the linear address goes out on the processor’s address bus). If the linear address space is paged, a second level of address translation is used to trans-...
3-12 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.4.4 Segment Loading Instructions in IA-32e Mode Because ES, DS, and SS segment registers are not used in 64-bit mode, their fields (base, limit, and attribute) in segment descriptor registers are ignored. Some forms of segment load instructions are also...
Vol. 3 3-13 PROTECTED-MODE MEMORY MANAGEMENT 3.4.5 Segment Descriptors A segment descriptor is a data structure in a GDT or LDT that provides the processor with the size and location of a segment, as well as access control and status information. Segment descriptors are typically created by compile...
3-14 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT to the segment limit. Offsets greater than the segment limit generate general-protection exceptions (#GP). For expand-down segments, the segment limit has the reverse function; the offset can range from the segment limit to FFFFFFFFH or FFFFH, depending o...
Vol. 3 3-15 PROTECTED-MODE MEMORY MANAGEMENT store its own data, such as information regarding the whereabouts of the missing segment. D/B (default operation size/default stack pointer size and/or upper bound) flag Performs different functions depending on whether the segment descriptor is an execut...
3-16 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT G (granularity) flag Determines the scaling of the segment limit field. When the granularity flag is clear, the segment limit is interpreted in byte units; when flag is set, the segment limit is interpreted in 4-KByte units. (This flag does not affect the...
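The scaling performed by the G flag can be sketched as below. The sketch assumes a 20-bit limit field and assumes that, with G set, the low 12 bits of the effective limit are all ones (the usual reading of a limit expressed in 4-KByte units); the function name is hypothetical.

```python
def effective_limit(limit_field: int, granularity: bool) -> int:
    """Scale a 20-bit segment limit field according to the G flag.

    G clear: the limit is in byte units.
    G set:   the limit is in 4-KByte units; the low 12 bits are assumed
             to be all ones, so the last byte of the last page is included.
    """
    limit_field &= 0xFFFFF  # assumption: the limit field is 20 bits wide
    if granularity:
        return (limit_field << 12) | 0xFFF
    return limit_field
```

Under these assumptions a limit field of FFFFFH gives an effective limit of FFFFFH (1 MByte segment) with G clear and FFFFFFFFH (4 GByte segment) with G set.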
3-18 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT For code segments, the three low-order bits of the type field are interpreted as accessed (A), read enable (R), and conforming (C). Code segments can be execute-only or execute/read, depending on the setting of the read-enable bit. An execute/read segment...
3-20 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT See also: Section 3.5.1, “Segment Descriptor Tables”, and Section 7.2.2, “TSS Descriptor” (for more information on the system-segment descriptors); see Section 5.8.3, “Call Gates”, Section 6.11, “IDT Descriptors”, and Section 7.2.5, “Task-Gate Descriptor”...
3-22 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.5.2 Segment Descriptor Tables in IA-32e Mode In IA-32e mode, a segment descriptor table can contain up to 8192 (2^13) 8-byte descriptors. An entry in the segment descriptor table can be 8 bytes. System descriptors are expanded to 16 bytes (occupying t...
Vol. 3 4-1 CHAPTER 4 PAGING Chapter 3 explains how segmentation converts logical addresses to linear addresses. Paging (or linear-address translation) is the process of translating linear addresses so that they can be used to access memory or I/O devices. Paging translates each linear address to a p...
4-2 Vol. 3 PAGING paging modes. Section 4.1.3 discusses how CR0.WP, CR4.PSE, CR4.PGE, and IA32_EFER.NXE modify the operation of the different paging modes. 4.1.1 Three Paging Modes If CR0.PG = 0, paging is not used. The logical processor treats all linear addresses as if they were physical addresses...
Vol. 3 4-3 PAGING linear addresses larger than 32 bits, 32-bit paging and PAE paging translate 32-bit linear addresses. Because it is used only if IA32_EFER.LME = 1, IA-32e paging is used only in IA-32e mode. (In fact, it is the use of IA-32e paging that defines IA-32e mode.) IA-32e mode has two sub-...
4-4 Vol. 3 PAGING enable these modes and make transitions between them. The following items identify certain limitations and other details: • IA32_EFER.LME cannot be modified while paging is enabled (CR0.PG = 1). Attempts to do so using WRMSR cause a general-protection exception (#GP(0)). • Paging c...
Vol. 3 4-5 PAGING • Software can always disable paging by clearing CR0.PG with MOV to CR0. • Software can make transitions between 32-bit paging and PAE paging by changing the value of CR4.PAE with MOV to CR4. • Software cannot make transitions directly between IA-32e paging and either of the other ...
4-6 Vol. 3 PAGING 4.1.4 Enumeration of Paging Features by CPUID Software can discover support for different paging features using the CPUID instruction: • PSE: page-size extensions for 32-bit paging. If CPUID.01H:EDX.PSE [bit 3] = 1, CR4.PSE may be set to 1, enabling support for 4-MByte pages with 3...
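A small sketch of the PSE check described above: it tests bit 3 of the EDX value returned by CPUID leaf 01H. The EDX value is passed in as an argument, since Python has no direct CPUID access, and the names are illustrative assumptions.

```python
CPUID_01_EDX_PSE = 1 << 3  # bit 3 of CPUID.01H:EDX, as stated in the text


def pse_supported(cpuid_01_edx: int) -> bool:
    """Return True if CPUID.01H:EDX reports page-size extensions (PSE)."""
    return (cpuid_01_edx & CPUID_01_EDX_PSE) != 0
```

Only when this returns true may software set CR4.PSE to 1, according to the text above.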
Vol. 3 4-7 PAGING 4.2 HIERARCHICAL PAGING STRUCTURES: AN OVERVIEW All three paging modes translate linear addresses using hierarchical paging structures. This section provides an overview of their operation. Section 4.3, Section 4.4, and Section 4.5 provide details for the three paging modes. Every pa...
4-8 Vol. 3 PAGING and bits 20:12 identify a fourth. Again, the last identifies the page frame. (See Figure 4-8 for an illustration.) The translation process in each of the examples above completes by identifying a page frame. However, the paging structures may be configured so that translation termi...
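The index selection sketched in the overview can be illustrated for IA-32e paging with 4-KByte pages. Only the 20:12 range is visible in the excerpt above; the other ranges used here (47:39, 38:30, 29:21, and 11:0 for the page offset) are stated as assumptions based on the standard IA-32e layout, and the function and key names are hypothetical.

```python
def split_ia32e_linear_address(linear_address: int) -> dict:
    """Split a 48-bit linear address into four 9-bit indices and a page offset.

    Assumed layout for 4-KByte pages: 47:39, 38:30, 29:21, 20:12, 11:0.
    """
    return {
        "pml4_index": (linear_address >> 39) & 0x1FF,
        "pdpt_index": (linear_address >> 30) & 0x1FF,
        "pd_index": (linear_address >> 21) & 0x1FF,
        "pt_index": (linear_address >> 12) & 0x1FF,
        "page_offset": linear_address & 0xFFF,
    }
```

Each 9-bit index selects one of 512 entries in the corresponding paging structure; the last entry selected identifies the page frame.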
Vol. 3 4-9 PAGING corresponds to 1 TByte, linear addresses are limited to 32 bits; at most 4 GBytes of linear-address space may be accessed at any given time. 32-bit paging uses a hierarchy of paging structures to produce a translation for a linear address. CR3 is used to locate the first paging-stru...
4-16 Vol. 3 PAGING ters. (This is different from the other paging modes, in which there is one hierarchy referenced by CR3.) Section 4.4.1 discusses the PDPTE registers. Section 4.4.2 describes linear-address translation with PAE paging. 4.4.1 PDPTE Registers When PAE paging is used, CR3 references t...
Vol. 3 4-17 PAGING Table 4-8 gives the format of a PDPTE. If any of the PDPTEs sets both the P flag (bit 0) and any reserved bit, the MOV to CR instruction causes a general-protection exception (#GP(0)) and the PDPTEs are not loaded. As shown in Table 4-8, bits 2:1, 8:5, and 63:MAXPHYADDR are reser...
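The reserved-bit rule above can be sketched as a predicate over a 64-bit PDPTE value. The function names are hypothetical, and MAXPHYADDR is passed in as a parameter (the text reports it through CPUID); this is an illustration of the rule as quoted, not of the processor's full checking.

```python
def pdpte_reserved_mask(maxphyaddr: int) -> int:
    """Build a mask of the reserved bits named in the text:
    bits 2:1, bits 8:5, and bits 63:MAXPHYADDR."""
    low = (0b11 << 1) | (0b1111 << 5)
    high = ((1 << 64) - 1) & ~((1 << maxphyaddr) - 1)
    return low | high


def pdpte_load_faults(pdpte: int, maxphyaddr: int) -> bool:
    """Return True if loading this PDPTE would cause #GP(0) under the rule
    quoted above: P flag (bit 0) set together with any reserved bit."""
    present = (pdpte & 1) != 0
    return present and (pdpte & pdpte_reserved_mask(maxphyaddr)) != 0
```

Note that an entry with P clear never trips the check in this sketch, whatever its other bits contain.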
Vol. 3 4-27 PAGING
Table 4-13. Format of an IA-32e PML4 Entry (PML4E) that References a Page-Directory-Pointer Table
Bit Position(s) | Contents
0 (P) | Present; must be 1 to reference a page-directory-pointer table
1 (R/W) | Read/write; if 0, writes may not be allowed to the 512-GByte region controlled b...
4-28 Vol. 3 PAGING
• If the PDE’s PS flag is 1, the PDE maps a 2-MByte page (see Table 4-15). The final physical address is computed as follows:
Table 4-14. Format of an IA-32e Page-Directory-Pointer-Table Entry (PDPTE) that References a Page Directory
Bit Position(s) | Contents
0 (P) | Present; must be...
4-32 Vol. 3 PAGING • If the P flag of a PML4E or a PDPTE is 1, the PS flag is reserved. • If the P flag and the PS flag of a PDE are both 1, bits 20:13 are reserved. • If IA32_EFER.NXE = 0 and the P flag of a paging-structure entry is 1, the XD flag (bit 63) is reserved. A reference using a linear a...
4-34 Vol. 3 PAGING both the R/W flag and the U/S flag are 1 in every paging-structure entry controlling the translation. — Instruction fetches. • For 32-bit paging or if IA32_EFER.NXE = 0, instructions may be fetched from any linear address with a valid translation for which the U/S flag is 1 in eve...
4-36 Vol. 3 PAGING Page-fault exceptions occur only due to an attempt to use a linear address. Failures to load the PDPTE registers with PAE paging (see Section 4.4.1) cause general-protection exceptions (#GP(0)) and not page-fault exceptions. 4.8 ACCESSED AND DIRTY FLAGS For any paging-structure en...
Vol. 3 4-37 PAGING 4.9 PAGING AND MEMORY TYPING The memory type of a memory access refers to the type of caching used for that access. Chapter 11, “Memory Cache Control” provides many details regarding memory typing in the Intel 64 and IA-32 architectures. This section describes how paging contribut...
4-38 Vol. 3 PAGING The PAT is a 64-bit MSR (IA32_PAT; MSR index 277H) comprising eight (8) 8-bit entries (entry i comprises bits 8i+7:8i of the MSR). For any access to a physical address, the table combines the memory type specified for that physical address by the MTRRs with a memory type selected f...
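The entry layout above (entry i occupies bits 8i+7:8i of the 64-bit MSR) can be illustrated with a short sketch. The function name is an assumption, and the MSR value is passed in as a plain integer rather than read with RDMSR.

```python
def pat_entry(ia32_pat: int, i: int) -> int:
    """Return the 8-bit PAT entry i, that is, bits 8i+7:8i of the MSR value."""
    if not 0 <= i <= 7:
        raise ValueError("the PAT has eight entries, numbered 0 through 7")
    return (ia32_pat >> (8 * i)) & 0xFF
```

With an arbitrary test value of 0706050403020100H, entry 0 is 00H and entry 7 is 07H.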
Vol. 3 4-41 PAGING entries in memory. See Section 4.10.3.2 for how software can ensure that the processor uses the modified paging-structure entries. If the paging structures specify a translation using a page larger than 4 KBytes, some processors may choose to cache multiple smaller-page TLB entries...
4-46 Vol. 3 PAGING 4.10.3 Invalidation of TLBs and Paging-Structure Caches As noted in Section 4.10.1 and Section 4.10.2, the processor may create entries in the TLBs and the paging-structure caches when linear addresses are translated, and it may retain these entries even after the paging structure...
Vol. 3 4-49 PAGING in response to an attempted user-mode access) but no other adverse behavior. Such an exception will occur at most once for each affected linear address (see Section 4.10.3.1). • If a paging-structure entry is modified to change the XD flag from 1 to 0, failure to perform an invali...
Vol. 3 4-51 PAGING 4.11 INTERACTIONS WITH VIRTUAL-MACHINE EXTENSIONS (VMX) The architecture for virtual-machine extensions (VMX) includes features that interact with paging. Section 4.11.1 discusses ways in which VMX-specific control transfers, called VMX transitions, specially affect paging. Section...
4-52 Vol. 3 PAGING concurrently information for multiple address spaces in its TLBs and paging-structure caches. See Section 25.1 for details. When EPT is in use, the addresses in the paging-structures are not used as physical addresses to access memory and memory-mapped I/O. Instead, they are treate...
Vol. 3 4-53 PAGING segments can be mapped to pages in several ways. To implement a flat (unseg-mented) addressing environment, for example, all the code, data, and stack modules can be mapped to one or more large segments (up to 4-GBytes) that share same range of linear addresses (see Figure 3-2 in ...
Vol. 3 5-1 CHAPTER 5 PROTECTION In protected mode, the Intel 64 and IA-32 architectures provide a protection mecha-nism that operates at both the segment level and the page level. This protection mechanism provides the ability to limit access to certain segments or pages based on privilege levels (f...
Vol. 3 5-3 PROTECTION procedure. The term current privilege level (CPL) refers to the setting of this field. • User/supervisor (U/S) flag — (Bit 2 of paging-structure entries.) Determines the type of page: user or supervisor. • Read/write (R/W) flag — (Bit 1 of paging-structure entries.) Determines ...
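As a minimal sketch of the two flag positions named above, the following decodes bit 1 (R/W) and bit 2 (U/S) from a paging-structure entry. The function name `page_flags` and the dictionary keys are hypothetical; a set R/W flag is taken to mean read/write and a set U/S flag to mean a user page.

```python
RW_BIT = 1 << 1  # read/write (R/W) flag, bit 1
US_BIT = 1 << 2  # user/supervisor (U/S) flag, bit 2


def page_flags(entry: int) -> dict:
    """Decode the R/W and U/S flags of a paging-structure entry."""
    return {
        "writable": bool(entry & RW_BIT),  # 1 = read/write, 0 = read-only
        "user": bool(entry & US_BIT),      # 1 = user page, 0 = supervisor page
    }


print(page_flags(0b110))
```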
5-4 Vol. 3 PROTECTION Many different styles of protection schemes can be implemented with these fields and flags. When the operating system creates a descriptor, it places values in these fields and flags in keeping with the particular protection style chosen for an operating system or executive. Ap...
Vol. 3 5-5 PROTECTION The following sections describe how the processor uses these fields and flags to perform the various categories of checks described in the introduction to this chapter. 5.2.1 Code Segment Descriptor in 64-bit Mode Code segments continue to exist in 64-bit mode even though, for ...
5-6 Vol. 3 PROTECTION 5.3 LIMIT CHECKING The limit field of a segment descriptor prevents programs or procedures from addressing memory locations outside the segment. The effective value of the limit depends on the setting of the G (granularity) flag (see Figure 5-1). For data segments, the limit al...
Vol. 3 5-7 PROTECTION • A doubleword at an offset greater than the (effective-limit – 3) • A quadword at an offset greater than the (effective-limit – 7) For expand-down data segments, the segment limit has the same function but is interpreted differently. Here, the effective limit specifies the las...
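The per-size limit checks listed above for ordinary (expand-up) segments follow one pattern: an access of n bytes faults when its offset exceeds the effective limit minus (n - 1). The sketch below assumes the usual granularity rule (G clear: limit in bytes; G set: limit in 4-KByte units with the low 12 bits treated as ones). The function names are hypothetical, and expand-down segments are not modeled.

```python
def effective_limit(limit_field: int, g_flag: bool) -> int:
    """Effective limit in bytes for a 20-bit limit field and the G flag."""
    if g_flag:
        return (limit_field << 12) | 0xFFF  # 4-KByte units
    return limit_field                      # byte units


def violates_limit(offset: int, size: int, eff_limit: int) -> bool:
    """True if a size-byte access at offset lies beyond an expand-up limit."""
    return offset > eff_limit - (size - 1)


limit = effective_limit(0x0FFFF, g_flag=False)
print(violates_limit(0xFFFD, 4, limit))  # doubleword crossing the limit
```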
Vol. 3 5-9 PROTECTION instruction. If the descriptor type is for a code segment or call gate, a call or jump to another code segment is indicated; if the descriptor type is for a TSS or task gate, a task switch is indicated. — On a call or jump through a call gate (or on an interrupt- or exception-h...
Vol. 3 5-11 PROTECTION example, if the DPL of a data segment is 1, only programs running at a CPL of 0 or 1 can access the segment. — Nonconforming code segment (without using a call gate) — The DPL indicates the privilege level that a program or task must be at to access the segment. For example, i...
5-12 Vol. 3 PROTECTION loads the segment selector into the segment register if the DPL is numerically greater than or equal to both the CPL and the RPL. Otherwise, a general-protection fault is generated and the segment register is not loaded. Figure 5-5 shows four procedures (located in code segme...
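The data-segment rule stated above (the DPL must be numerically greater than or equal to both the CPL and the RPL) reduces to a single comparison. This is a sketch with a hypothetical function name; it models only the privilege comparison, not the type checks the processor also performs.

```python
def data_segment_load_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """True if a data-segment selector may be loaded under the privilege check."""
    return dpl >= max(cpl, rpl)


# A DPL-1 data segment is reachable from CPL 0 or 1, but not from CPL 2.
print([data_segment_load_allowed(cpl, cpl, 1) for cpl in range(4)])
```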
Vol. 3 5-13 PROTECTION As demonstrated in the previous examples, the addressable domain of a program or task varies as its CPL changes. When the CPL is 0, data segments at all privilege levels are accessible; when the CPL is 1, only data segments at privilege levels 1 through 3 are accessible; when ...
5-14 Vol. 3 PROTECTION • Load a data-segment register with a segment selector for a nonconforming, readable, code segment. • Load a data-segment register with a segment selector for a conforming, readable, code segment. • Use a code-segment override prefix (CS) to read a readable, code segment whose...
Vol. 3 5-15 PROTECTION • The target operand points to a TSS, which contains the segment selector for the target code segment. • The target operand points to a task gate, which points to a TSS, which in turn contains the segment selector for the target code segment. The following sections describe fi...
5-16 Vol. 3 PROTECTION • The RPL of the segment selector of the destination code segment. • The conforming (C) flag in the segment descriptor for the destination code segment, which determines whether the segment is a conforming (C flag is set) or nonconforming (C flag is clear) code segment. See Se...
Vol. 3 5-17 PROTECTION The RPL of the segment selector that points to a nonconforming code segment has a limited effect on the privilege check. The RPL must be numerically less than or equal to the CPL of the calling procedure for a successful control transfer to occur. So, in the example in Figure ...
5-18 Vol. 3 PROTECTION In the example in Figure 5-7, code segment D is a conforming code segment. Therefore, calling procedures in both code segment A and B can access code segment D (using either segment selector D1 or D2, respectively), because they both have CPLs that are greater than or equal t...
Vol. 3 5-19 PROTECTION 5.8.3 Call Gates Call gates facilitate controlled transfers of program control between different privilege levels. They are typically used only in operating systems or executives that use the privilege-level protection mechanism. Call gates are also useful for transferring pr...
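The two direct-transfer cases discussed above can be sketched side by side. For a nonconforming target the sketch assumes the CPL must equal the DPL and the RPL must not exceed the CPL; for a conforming target it assumes only CPL >= DPL is required and the RPL is not checked. The function name is hypothetical, and gates are not involved here.

```python
def direct_transfer_allowed(cpl: int, rpl: int, dpl: int, conforming: bool) -> bool:
    """Privilege check for a CALL/JMP that names a code segment directly."""
    if conforming:
        return cpl >= dpl             # RPL is not checked for conforming targets
    return cpl == dpl and rpl <= cpl  # nonconforming: same level only


print(direct_transfer_allowed(cpl=3, rpl=3, dpl=2, conforming=True))
print(direct_transfer_allowed(cpl=3, rpl=3, dpl=2, conforming=False))
```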
5-20 Vol. 3 PROTECTION Note that the P flag in a gate descriptor is normally always set to 1. If it is set to 0, a not present (#NP) exception is generated when a program attempts to access the descriptor. The operating system can use the P flag for special purposes. For example, it could be used to...
5-22 Vol. 3 PROTECTION 5.8.4 Accessing a Code Segment Through a Call Gate To access a call gate, a far pointer to the gate is provided as a target operand in a CALL or JMP instruction. The segment selector from this pointer identifies the call gate (see Figure 5-10); the offset from the pointer is r...
Vol. 3 5-23 PROTECTION The privilege checking rules are different depending on whether the control transfer was initiated with a CALL or a JMP instruction, as shown in Table 5-1. The DPL field of the call-gate descriptor specifies the numerically highest privilege level from which a calling procedur...
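The difference between CALL and JMP through a call gate can be expressed as a small predicate. It assumes the commonly documented rules: both the CPL and the RPL must be numerically less than or equal to the gate DPL; the destination DPL must be less than or equal to the CPL, except that a JMP to a nonconforming segment requires the DPL to equal the CPL. All names are hypothetical.

```python
def call_gate_transfer_allowed(instr: str, cpl: int, rpl: int, gate_dpl: int,
                               target_dpl: int, target_conforming: bool) -> bool:
    """Privilege check for a CALL or JMP made through a call gate."""
    if instr not in ("CALL", "JMP"):
        raise ValueError("instr must be 'CALL' or 'JMP'")
    if cpl > gate_dpl or rpl > gate_dpl:
        return False                  # caller is not privileged enough for the gate
    if instr == "JMP" and not target_conforming:
        return target_dpl == cpl      # JMP cannot change privilege level
    return target_dpl <= cpl          # CALL (or conforming target)


print(call_gate_transfer_allowed("CALL", 3, 3, 3, 0, False))
print(call_gate_transfer_allowed("JMP", 3, 3, 3, 0, False))
```

Only the CALL form can reach a more privileged nonconforming segment, which is why operating-system entry points behind call gates are reached with CALL rather than JMP.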
Vol. 3 5-25 PROTECTION Call gates allow a single code segment to have procedures that can be accessed at different privilege levels. For example, an operating system located in a code segment may have some services which are intended to be used by both the operating system and application software ...
Vol. 3 5-27 PROTECTION 3. Checks the stack-segment descriptor for the proper privileges and type and generates an invalid TSS (#TS) exception if violations are detected. 4. Temporarily saves the current values of the SS and ESP registers. 5. Loads the segment selector and stack pointer for the new st...
5-28 Vol. 3 PROTECTION dure, one of the parameters can be a pointer to a data structure, or the saved contents of the SS and ESP registers may be used to access parameters in the old stack space. The size of the data items passed to the called procedure depends on the call gate size, as described in...
5-30 Vol. 3 PROTECTION 5. (If the RET instruction includes a parameter count operand.) Adds the parameter count (in bytes obtained from the RET instruction) to the current ESP register value, to step past the parameters on the calling procedure’s stack. The resulting ESP value is not checked against...
Vol. 3 5-31 PROTECTION • Stack segment — Computed by adding 24 to the value in IA32_SYSENTER_CS. • Stack pointer — Reads this from ECX. The SYSENTER and SYSEXIT instructions perform “fast” calls and returns because they force the processor into a predefined privilege level 0 state when SYSENTER is e...
5-32 Vol. 3 PROTECTION When SYSEXIT transfers control to compatibility mode user code when the operand size attribute is 32 bits, the following fields are generated and bits set: • Target code segment — Computed by adding 16 to the value in IA32_SYSENTER_CS. • New CS attributes — L-bit = 0 (go to co...
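The selector arithmetic quoted above for a SYSEXIT return with a 32-bit operand size (code segment = IA32_SYSENTER_CS + 16, stack segment = IA32_SYSENTER_CS + 24) can be sketched as follows. The function name is hypothetical, truncating to 16 bits is an assumption about selector width, and the sketch leaves out RPL handling and the 64-bit return case.

```python
def sysexit_selectors_32bit(ia32_sysenter_cs: int) -> tuple:
    """Return (code_selector, stack_selector) for a 32-bit SYSEXIT return."""
    base = ia32_sysenter_cs & 0xFFFF  # assumption: only the low 16 bits are used
    code_selector = (base + 16) & 0xFFFF
    stack_selector = (base + 24) & 0xFFFF
    return code_selector, stack_selector


cs, ss = sysexit_selectors_32bit(0x0010)  # 0x0010 is an arbitrary example value
print(hex(cs), hex(ss))
```

Because every selector is derived from one MSR, the descriptors for these segments have to sit at fixed distances from one another in the GDT.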
Vol. 3 5-33 PROTECTION When SYSRET transfers control to 32-bit mode user code using a 32-bit operand size, the processor gets the privilege level 3 target instruction and stack pointer from: • Target code segment — Reads a non-NULL selector from IA32_STAR[63:48]. • Target instruction — Copies the va...
5-34 Vol. 3 PROTECTION general-protection exception (#GP) is generated. The following system instructions are privileged instructions: • LGDT — Load GDT register. • LLDT — Load LDT register. • LTR — Load task register. • LIDT — Load IDT register. • MOV (control registers) — Load and store control re...
5-36 Vol. 3 PROTECTION 5.10.2 Checking Read/Write Rights (VERR and VERW Instructions) When the processor accesses any code or data segment it checks the read/write privileges assigned to the segment to verify that the intended read or write operation is allowed. Software can check read/write rights...
Vol. 3 5-37 PROTECTION destination register and sets the ZF flag in the EFLAGS register. If the segment selector is not visible at the current privilege level or is an invalid type for the LSL instruction, the instruction does not modify the destination register and clears the ZF flag. Once loaded i...
Vol. 3 5-39 PROTECTION The example in Figure 5-15 demonstrates how the ARPL instruction is intended to be used. When the operating system receives segment selector D2 from the application program, it uses the ARPL instruction to compare the RPL of the segment selector with the privilege level of the...
5-40 Vol. 3 PROTECTION page-fault exception mechanism. This chapter describes the protection violations which lead to page-fault exceptions. 5.11.1 Page-Protection Flags Protection information for pages is contained in two flags in a paging-structure entry (see Chapter 4): the read/write flag (bit 1...
Vol. 3 5-41 PROTECTION When the processor is in supervisor mode and the WP flag in register CR0 is clear (its state following reset initialization), all pages are both readable and writable (write-protection is ignored). When the processor is in user mode, it can write only to user-mode pages that a...
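The write rules described above can be condensed into one predicate. It assumes that user-mode code may write only to user pages marked read/write, and that supervisor-mode code may write to any page unless CR0.WP is set and the page is read-only. The names are hypothetical, and read accesses and later protection features are not modeled.

```python
def write_allowed(user_mode: bool, page_user: bool, page_writable: bool,
                  cr0_wp: bool) -> bool:
    """True if a write to the page passes the U/S, R/W, and CR0.WP checks."""
    if user_mode:
        return page_user and page_writable
    # Supervisor mode: write protection applies only when CR0.WP is set.
    return page_writable or not cr0_wp


# Supervisor write to a read-only page, with WP clear and then with WP set.
print(write_allowed(False, True, False, cr0_wp=False))
print(write_allowed(False, True, False, cr0_wp=True))
```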
Vol. 3 5-43 PROTECTION 5.13 PAGE-LEVEL PROTECTION AND EXECUTE-DISABLE BIT In addition to page-level protection offered by the U/S and R/W flags, paging structures used with PAE paging and IA-32e paging (see Chapter 4) provide the execute-disable bit. This bit offers additional protection for data p...
5-44 Vol. 3 PROTECTION 5.13.2 Execute-Disable Page Protection The execute-disable bit in the paging structures enhances page protection for data pages. Instructions cannot be fetched from a memory page if IA32_EFER.NXE =1 and the execute-disable bit is set in any of the paging-structure entries used...
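The fetch rule stated above (no instruction fetch when IA32_EFER.NXE = 1 and the execute-disable bit is set in any paging-structure entry used for the translation) is sketched below. The function name and the list-of-bits representation of the page walk are hypothetical.

```python
def fetch_allowed(nxe: bool, xd_bits: list) -> bool:
    """True if an instruction fetch passes the execute-disable check.

    xd_bits holds the execute-disable bit of each paging-structure entry
    used to translate the address, in any order.
    """
    return not (nxe and any(xd_bits))


print(fetch_allowed(True, [0, 0, 1]))   # one entry in the walk disables execution
print(fetch_allowed(False, [0, 0, 1]))  # with NXE clear the check does not apply
```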
Vol. 3 5-45 PROTECTION 5.13.3 Reserved Bit Checking The processor enforces reserved bit checking in paging data structure entries. The bits checked vary with the paging mode and may vary with the size of the physical address space. Table 5-8 shows the reserved bits that are checked when the execute ...
5-46 Vol. 3 PROTECTION If execute disable bit capability is not enabled or not available, reserved bit checking in 64-bit mode includes bit 63 and additional bits. This and reserved bit checking for legacy 32-bit paging modes are shown in Table 5-10. Table 5-8. IA-32e Mode Page Level Protection Matr...
Vol. 3 5-47 PROTECTION 5.13.4 Exception Handling When execute disable bit capability is enabled (IA32_EFER.NXE = 1), conditions for a page fault to occur include the same conditions that apply to an Intel 64 or IA-32 processor without execute disable bit capability plus the following new condition: ...
Vol. 3 6-1 CHAPTER 6 INTERRUPT AND EXCEPTION HANDLING This chapter describes the interrupt and exception-handling mechanism when operating in protected mode on an Intel 64 or IA-32 processor. Most of the information provided here also applies to interrupt and exception mechanisms used in real-addre...
6-2 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.2 EXCEPTION AND INTERRUPT VECTORS To aid in handling exceptions and interrupts, each architecturally defined exception and each interrupt condition requiring special handling by the processor is assigned a unique identification number, called a vector. T...
6-4 Vol. 3 INTERRUPT AND EXCEPTION HANDLING The processor’s local APIC is normally connected to a system-based I/O APIC. Here, external interrupts received at the I/O APIC’s pins can be directed to the local APIC through the system bus (Pentium 4, Intel Core Duo, Intel Core 2, Intel Atom, and Intel ...
Vol. 3 6-5 INTERRUPT AND EXCEPTION HANDLING defined interrupt vectors from 0 through 255; those that can be delivered through the local APIC include interrupt vectors 16 through 255. The IF flag in the EFLAGS register permits all maskable hardware interrupts to be masked as a group (see Section 6.8....
6-6 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.4.2 Software-Generated Exceptions The INTO, INT 3, and BOUND instructions permit exceptions to be generated in software. These instructions allow checks for exception conditions to be performed at points in the instruction stream. For example, INT 3 cau...
Vol. 3 6-7 INTERRUPT AND EXCEPTION HANDLING • Aborts — An abort is an exception that does not always report the precise location of the instruction causing the exception and does not allow a restart of the program or task that caused the exception. Aborts are used to report severe errors, such as ha...
6-8 Vol. 3 INTERRUPT AND EXCEPTION HANDLING EFLAGS.OF (overflow) flag. The trap handler for this exception resolves the overflow condition. Upon return from the trap handler, program or task execution continues at the instruction following the INTO instruction. The abort-class exceptions do not suppo...
Vol. 3 6-9 INTERRUPT AND EXCEPTION HANDLING It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI interrupt that activates the processor’s NMI-handling hardw...
6-10 Vol. 3 INTERRUPT AND EXCEPTION HANDLING is an interrupt. As with the INT n instruction (see Section 6.4.2, “Software-Generated Exceptions”), when an interrupt is generated through the INTR pin to an exception vector, the processor does not push an error code on the stack, so the exception handl...
Vol. 3 6-11 INTERRUPT AND EXCEPTION HANDLING 6.8.3 Masking Exceptions and Interrupts When Switching Stacks To switch to a different stack segment, software often uses a pair of instructions, for example: MOV SS, AX and MOV ESP, StackTop. If an interrupt or exception occurs after the segment selector has b...
6-14 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.11 IDT DESCRIPTORS The IDT may contain any of three kinds of gate descriptors: • Task-gate descriptor • Interrupt-gate descriptor • Trap-gate descriptor Figure 6-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The forma...
Vol. 3 6-15 INTERRUPT AND EXCEPTION HANDLING 6.12 EXCEPTION AND INTERRUPT HANDLING The processor handles calls to exception- and interrupt-handlers similar to the way it handles calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt, the processor uses t...
6-16 Vol. 3 INTERRUPT AND EXCEPTION HANDLING “Returning from a Called Procedure”). If index points to a task gate, the processor executes a task switch to the exception- or interrupt-handler task in a manner similar to a CALL to a task gate (see Section 7.3, “Task Switching”). 6.12.1 Exception- or I...
6-20 Vol. 3 INTERRUPT AND EXCEPTION HANDLING of the EFLAGS register on the stack. Accessing a handler procedure through a trap gate does not affect the IF flag. 6.12.2 Interrupt Tasks When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch results. Handling a...
Vol. 3 6-21 INTERRUPT AND EXCEPTION HANDLING 6.13 ERROR CODE When an exception condition is related to a specific segment, the processor pushes an error code onto the stack of the exception handler (whether it is a procedure or task). The error code has the format shown in Figure 6-6. The error code...
6-22 Vol. 3 INTERRUPT AND EXCEPTION HANDLING clear, indicates that the index refers to a descriptor in the GDT or the current LDT. TI GDT/LDT (bit 2) — Only used when the IDT flag is clear. When set, the TI flag indicates that the index portion of the error code refers to a segment or gate descripto...
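The error-code fields described above can be unpacked with a few masks. The sketch assumes the usual layout: EXT in bit 0, IDT in bit 1, TI in bit 2, and the segment selector index in bits 15:3. The function and key names are hypothetical, and the page-fault error code, which uses a different format, is not covered.

```python
def decode_error_code(code: int) -> dict:
    """Split a segment-related exception error code into its fields."""
    idt = bool(code & 0b010)
    return {
        "ext": bool(code & 0b001),                  # event external to the program
        "idt": idt,                                 # index refers to a gate in the IDT
        "ti_ldt": bool(code & 0b100) and not idt,   # TI is used only when IDT is clear
        "index": (code >> 3) & 0x1FFF,              # selector index, bits 15:3
    }


print(decode_error_code((5 << 3) | 0b100))  # index 5 in the LDT
```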
6-24 Vol. 3 INTERRUPT AND EXCEPTION HANDLING ware attempts to reference an interrupt gate with a target RIP that is not in canonical form. The target code segment referenced by the interrupt gate must be a 64-bit code segment (CS.L = 1, CS.D = 0). If the target is not a 64-bit code segment, a general...
Vol. 3 6-25 INTERRUPT AND EXCEPTION HANDLING 6.14.3 IRET in IA-32e Mode In IA-32e mode, IRET executes with an 8-byte operand size. There is nothing that forces this requirement. The stack is formatted in such a way that for actions where IRET is required, the 8-byte IRET operand size works correctly...
6-26 Vol. 3 INTERRUPT AND EXCEPTION HANDLING In summary, a stack switch in IA-32e mode works like the legacy stack switch, except that a new SS selector is not loaded from the TSS. Instead, the new SS is forced to NULL. 6.14.5 Interrupt Stack Table In IA-32e mode, a new interrupt stack table (IST) m...
Vol. 3 6-27 INTERRUPT AND EXCEPTION HANDLING 6.15 EXCEPTION AND INTERRUPT REFERENCE The following sections describe conditions which generate exceptions and interrupts. They are arranged in the order of vector numbers. The information contained in these sections is as follows: • Exception Class — I...
6-28 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 0—Divide Error Exception (#DE) Exception Class Fault. Description Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented in the number of bits specified for the destination operand. Exception Er...
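Both conditions named in the description (a zero divisor, or a result too large for the destination) can be checked before dividing. The sketch covers unsigned DIV only and assumes the quotient must fit in the operand size, for example a 64-bit dividend divided by a 32-bit divisor producing a 32-bit quotient. The function name is hypothetical, and signed IDIV is not modeled.

```python
def div_raises_de(dividend: int, divisor: int, operand_bits: int = 32) -> bool:
    """True if an unsigned DIV with these operands would raise #DE."""
    if divisor == 0:
        return True
    quotient = dividend // divisor
    return quotient > (1 << operand_bits) - 1  # quotient does not fit


print(div_raises_de(1 << 32, 1))     # quotient needs 33 bits
print(div_raises_de(0xFFFFFFFF, 1))  # quotient fits exactly
```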
Vol. 3 6-29 INTERRUPT AND EXCEPTION HANDLING Interrupt 1—Debug Exception (#DB) Exception Class Trap or Fault. The exception handler can distinguish between traps or faults by examining the contents of DR6 and the other debug registers. Description Indicates that one or more of several debug-exceptio...
6-30 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 2—NMI Interrupt Exception Class Not applicable. Description The nonmaskable interrupt (NMI) is generated externally by asserting the processor’s NMI pin or through an NMI request set by the I/O APIC to the local APIC. This interrupt causes the N...
6-34 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 6—Invalid Opcode Exception (#UD) Exception Class Fault. Description Indicates that the processor did one of the following things: • Attempted to execute an invalid or reserved opcode. • Attempted to execute an instruction with an operand type th...
Vol. 3 6-35 INTERRUPT AND EXCEPTION HANDLING processor and earlier IA-32 processors, this exception is not generated as the result of prefetching and preliminary decoding of an invalid instruction. (See Section 6.5, “Exception Classifications,” for general rules for taking of interrupts and exceptio...
Vol. 3 6-37 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the floating-point instruction or the WAIT/FWAIT instruction that generated the exception. Program State Change A program-state change does not accompany a device-not-available ...
6-38 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 8—Double Fault Exception (#DF) Exception Class Abort. Description Indicates that the processor detected a second exception while calling an exception handler for a prior exception. Normally, when the processor detects another exception while tr...
Vol. 3 6-39 INTERRUPT AND EXCEPTION HANDLING A segment or page fault may be encountered while prefetching instructions; however, this behavior is outside the domain of Table 6-5. Any further faults generated while the processor is attempting to transfer control to the appropriate fault handler coul...
Vol. 3 6-41 INTERRUPT AND EXCEPTION HANDLING Interrupt 9—Coprocessor Segment Overrun Exception Class Abort. (Intel reserved; do not use. Recent IA-32 processors do not generate this exception.) Description Indicates that an Intel386 CPU-based system with an Intel 387 math coprocessor detected a pag...
6-42 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 10—Invalid TSS Exception (#TS) Exception Class Fault. Description Indicates that there was an error related to a TSS. Such an error might be detected during a task switch or during the execution of instructions that use information from a TSS. T...
6-44 Vol. 3 INTERRUPT AND EXCEPTION HANDLING This exception can be generated either in the context of the original task or in the context of the new task (see Section 7.3, “Task Switching”). Until the processor has completely verified the presence of the new TSS, the exception is generated in the conte...
6-50 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 13—General Protection Exception (#GP) Exception Class Fault. Description Indicates that the processor detected one of a class of protection violations called “general-protection violations.” The conditions that cause this exception to be gener-a...
6-52 Vol. 3 INTERRUPT AND EXCEPTION HANDLING • A selector from a TSS involved in a task switch. • IDT vector number. Saved Instruction Pointer The saved contents of CS and EIP registers point to the instruction that generated the exception. Program State Change In general, a program-state change doe...
6-56 Vol. 3 INTERRUPT AND EXCEPTION HANDLING second page fault can occur. 1 If a page fault is caused by a page-level protection violation, the access flag in the page-directory entry is set when the fault occurs. The behavior of IA-32 processors regarding the access flag in the corresponding page-t...
Vol. 3 6-57 INTERRUPT AND EXCEPTION HANDLING description for “Interrupt 10—Invalid TSS Exception (#TS)” in this chapter for additional information on how to handle this situation.) Additional Exception-Handling Information Special care should be taken to ensure that an exception that occurs during ...
6-58 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 16—x87 FPU Floating-Point Error (#MF) Exception Class Fault. Description Indicates that the x87 FPU has detected a floating-point error. The NE flag in the register CR0 must be set for an interrupt 16 (floating-point error exception) to be gener...
Vol. 3 6-59 INTERRUPT AND EXCEPTION HANDLING Prior to executing a waiting x87 FPU instruction or the WAIT/FWAIT instruction, the x87 FPU checks for pending x87 FPU floating-point exceptions (as described in step 2 above). Pending x87 FPU floating-point exceptions are ignored for “non-waiting” x87 FP...
6-60 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 17—Alignment Check Exception (#AC) Exception Class Fault. Description Indicates that the processor detected an unaligned memory operand when alignment checking was enabled. Alignment checks are only carried out in data (or stack) accesses (not i...
Vol. 3 6-67 INTERRUPT AND EXCEPTION HANDLING Interrupts 32 to 255—User Defined Interrupts Exception Class Not applicable. Description Indicates that the processor did one of the following things: • Executed an INT n instruction where the instruction operand is one of the vector numbers from 32 throu...
Vol. 3 7-1 CHAPTER 7 TASK MANAGEMENT This chapter describes the IA-32 architecture’s task management facilities. These facilities are only available when the processor is running in protected mode. This chapter focuses on 32-bit tasks and the 32-bit TSS structure. For information on 16-bit tasks and ...
7-2 Vol. 3 TASK MANAGEMENT 7.1.2 Task State The following items define the state of the currently executing task: • The task’s current execution space, defined by the segment selectors in the segment registers (CS, DS, SS, ES, FS, and GS). • The state of the general-purpose registers. • The state of...
Vol. 3 7-3 TASK MANAGEMENT 7.1.3 Executing a Task Software or the processor can dispatch a task for execution in one of the following ways: • An explicit call to a task with the CALL instruction. • An explicit jump to a task with the JMP instruction. • An implicit call (by the processor) to an interru...
7-4 Vol. 3 TASK MANAGEMENT page tables as other privilege-level-3 tasks can access code and corrupt data and the stack of other tasks. Use of task management facilities for handling multitasking applications is optional. Multitasking can be handled in software, with each software defined task execute...
Vol. 3 7-7 TASK MANAGEMENT • Task switches are carried out faster if the pages containing these structures are present in memory before the task switch is initiated. 7.2.2 TSS Descriptor The TSS, like all other segments, is defined by a segment descriptor. Figure 7-3 shows the format of a TSS descri...
7-8 Vol. 3 TASK MANAGEMENT of a TSS. Attempting to switch to a task whose TSS descriptor has a limit less than 67H generates an invalid-TSS exception (#TS). A larger limit is required if an I/O permission bit map is included or if the operating system stores additional data. The processor does not c...
Vol. 3 7-9 TASK MANAGEMENT 7.2.4 Task Register The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (see Figure 2-5). This information is copied from the TSS descri...
7-12 Vol. 3 TASK MANAGEMENT to be handled by handler tasks. When an interrupt or exception vector points to a task gate, the processor switches to the specified task. Figure 7-7 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT can all point to the same task. ...
7-14 Vol. 3 TASK MANAGEMENT 10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or an interrupt, the processor sets the busy (B) flag in the new task’s TSS descriptor; if initiated with an IRET instruction, the busy (B) flag is left set. 11. Loads the task re...
Vol. 3 7-15 TASK MANAGEMENT rules control access to a TSS, software does not need to perform explicit privilege checks on a task switch. Table 7-1 shows the exception conditions that the processor checks for when switching tasks. It also shows the exception that is generated for each check if an erro...
7-16 Vol. 3 TASK MANAGEMENT The TS (task switched) flag in the control register CR0 is set every time a task switch occurs. System software uses the TS flag to coordinate the actions of the floating-point unit when generating floating-point exceptions with the rest of the processor. The TS flag indicate...
Vol. 3 7-17 TASK MANAGEMENT Table 7-2 shows the busy flag (in the TSS segment descriptor), the NT flag, the previous task link field, and TS flag (in control register CR0) during a task switch. The NT flag may be modified by software executing at any privilege level. It is possible for a program to s...
7-18 Vol. 3 TASK MANAGEMENT 7.4.1 Use of Busy Flag To Prevent Recursive Task Switching A TSS allows only one context to be saved for a task; therefore, once a task is called (dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task to be lost. The busy flag...
Vol. 3 7-19 TASK MANAGEMENT In a multiprocessing system, additional synchronization and serialization operations must be added to this procedure to ensure that the TSS and its segment descriptor are both locked when the previous task link field is changed and the busy flag is cleared. 7.5 TASK ADDRE...
7-20 Vol. 3 TASK MANAGEMENT and the page tables point to different pages of physical memory, then the tasks do not share physical addresses. With either method of mapping task linear address spaces, the TSSs for all tasks must lie in a shared area of the physical space, which is accessible to all tas...
7-22 Vol. 3 TASK MANAGEMENT 7.7 TASK MANAGEMENT IN 64-BIT MODE In 64-bit mode, task structure and task state are similar to those in protected mode. However, the task switching mechanism available in protected mode is not supported in 64-bit mode. Task management and switching must be performed by s...
Vol. 3 8-1 CHAPTER 8 MULTIPLE-PROCESSOR MANAGEMENT The Intel 64 and IA-32 architectures provide mechanisms for managing and improving the performance of multiple processors connected to the same system bus. These include: • Bus locking and/or cache coherency management for performing atomic operatio...
8-2 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT • To distribute interrupt handling among a group of processors — When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing. ...
Vol. 3 8-3 MULTIPLE-PROCESSOR MANAGEMENT software to manage the fairness of semaphores and exclusive locking functions. The mechanisms for handling locked atomic operations have evolved with the complexity of IA-32 processors. More recent IA-32 processors (such as the Pentium 4, Intel Xeon, and P6 f...
8-4 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT the hardware designer to make the LOCK# signal available in system hardware to control memory accesses among processors. For the P6 and more recent processor families, if the memory area being accessed is cached internally in the processor, the LOCK# signal is...
Vol. 3 8-5 MULTIPLE-PROCESSOR MANAGEMENT 8.1.2.2 Software Controlled Bus Locking To explicitly force the LOCK semantics, software can use the LOCK prefix with the following instructions when they are used to modify a memory location. An invalid-opcode exception (#UD) is generated when the LOCK prefi...
Vol. 3 8-7 MULTIPLE-PROCESSOR MANAGEMENT The act of one processor writing data into the currently executing code segment of a second processor with the intent of having the second processor execute that data as code is called cross-modifying code. As with self-modifying code, IA-32 processors exhibi...
8-8 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT have cached the same area of memory from simultaneously modifying data in that area. 8.2 MEMORY ORDERING The term memory ordering refers to the order in which the processor issues reads (loads) and writes (stores) through the system bus to system memory. The ...
Vol. 3 8-9 MULTIPLE-PROCESSOR MANAGEMENT among processors are explicitly required to obey program ordering through the use of appropriate locking or serializing operations (see Section 8.2.5, “Strengthening or Weakening the Memory-Ordering Model”). 8.2.2 Memory Ordering in P6 and More Recent Process...
Vol. 3 8-11 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3 Examples Illustrating the Memory-Ordering Principles This section provides a set of examples that illustrate the behavior of the memory-ordering principles introduced in Section 8.2.2. They are designed to give software writers an understanding of how ...
8-12 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Section 8.2.3.2 through Section 8.2.3.7 give examples using the MOV instruction. The principles that underlie these examples apply to load and store accesses in general and to other instructions that load from or store to memory. Section 8.2.3.8 and Section ...
Vol. 3 8-13 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3.3 Stores Are Not Reordered With Earlier Loads The Intel-64 memory-ordering model ensures that a store by a processor may not occur before a previous load by the same processor. This is illustrated by the following example: Assume r1 == 1. • Because r1 ...
8-14 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT has the two loads occurring before the two stores. This would result in each load returning value 0. The fact that a load may not be reordered with an earlier store to the same location is illustrated by the following example: The Intel-64 memory-ordering mod...
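The outcome in which both loads return 0 can be reproduced with a toy model. The sketch assumes the classic two-processor test (processor 0 stores 1 to x and then loads y; processor 1 stores 1 to y and then loads x; both locations start at 0) and models each processor with a private store buffer that may drain to memory after its later load. The event names and the buffer model are illustrative devices, not a description of any real implementation.

```python
from itertools import permutations

# Per processor p: ("S", p) store enters p's buffer, ("F", p) buffer drains
# to memory, ("L", p) load of the other processor's location.
EVENTS = [(kind, p) for p in (0, 1) for kind in "SFL"]


def valid(order, allow_store_buffering):
    pos = {event: i for i, event in enumerate(order)}
    for p in (0, 1):
        s, f, l = pos[("S", p)], pos[("F", p)], pos[("L", p)]
        if not (s < f and s < l):            # program order at issue
            return False
        if not allow_store_buffering and not f < l:
            return False                     # store must be visible before the load
    return True


def outcomes(allow_store_buffering):
    results = set()
    for order in permutations(EVENTS):
        if not valid(order, allow_store_buffering):
            continue
        mem, regs = {"x": 0, "y": 0}, {}
        for kind, p in order:
            if kind == "F":
                mem["x" if p == 0 else "y"] = 1
            elif kind == "L":
                regs[p] = mem["y" if p == 0 else "x"]
        results.add((regs[0], regs[1]))
    return results


print(sorted(outcomes(True)))
print(sorted(outcomes(False)))
```

With buffering allowed, the pair (0, 0) appears among the outcomes; when every store must reach memory before the next load, it does not. A load from the same location as a buffered store would need store forwarding, which this sketch leaves out because the two loads here target the other processor's location.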
Vol. 3 8-15 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3.6 Stores Are Transitively Visible The memory-ordering model ensures transitive visibility of stores; stores that are causally related appear to all processors to occur in an order consistent with the causality relation. This is illustrated by the follo...
8-16 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT By the principles discussed in Section 8.2.3.2, • processor 2’s first and second load cannot be reordered, • processor 3’s first and second load cannot be reordered. • If r1 == 1 and r2 == 0, processor 0’s store appears to precede processor 1’s store with re...
8-18 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.2.4 Out-of-Order Stores For String Operations The Intel Core 2 Duo, Intel Core, Pentium 4, and P6 family processors modify the processor's operation during string store operations (initiated with the MOVS and STOS instructions) to maximize performance. ...
Vol. 3 8-19 MULTIPLE-PROCESSOR MANAGEMENT 2. Stores from separate string operations (for example, stores from consecutive string operations) do not execute out of order. All the stores from an earlier string operation will complete before any store from a later string operation. 3. String operations...
8-22 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.2.5 Strengthening or Weakening the Memory-Ordering Model The Intel 64 and IA-32 architectures provide several mechanisms for strengthening or weakening the memory-ordering model to handle special programming situations. These mechanisms include: • The I/O ...
Vol. 3 8-27 MULTIPLE-PROCESSOR MANAGEMENT 8.4.1 BSP and AP Processors The MP initialization protocol defines two classes of processors: the bootstrap processor (BSP) and the application processors (APs). Following a power-up or RESET of an MP system, system hardware dynamically selects one of the pr...
8-28 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.4.3 MP Initialization Protocol Algorithm for Intel Xeon Processors Following a power-up or RESET of an MP system, the processors in the system execute the MP initialization protocol algorithm to initialize each of the logical processors on the system bus ...
Vol. 3 8-29 MULTIPLE-PROCESSOR MANAGEMENT • The newly established BSP broadcasts an FIPI message to “all including self,” which the BSP and APs treat as an end of MP initialization signal. Only the processor with its BSP flag set responds to the FIPI message. It responds by fetching and executing th...
8-30 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT SVR EQU 0FEE000F0H APIC_ID EQU 0FEE00020H LVT3 EQU 0FEE00370H APIC_ENABLED EQU 0100H BOOT_ID DD ? COUNT EQU 00H VACANT EQU 00H 8.4.4.1 Typical BSP Initialization Sequence After the BSP and APs have been selected (by means of a hardware protocol, see Section ...
8-32 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT MOV EAX, 000C46XXH ; Load ICR encoding for broadcast SIPI IPI ; to all APs into EAX, where XX is the vector computed in step 8. 16. Waits for the timer interrupt. 17. Reads and evaluates the COUNT variable and establishes a processor count. 18. If necessary, reco...
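The ICR value 000C46XXH in the snippet above can be read as a combination of fields. The following is a minimal sketch, assuming the standard ICR low-dword layout (delivery mode in bits 10:8, level in bit 14, destination shorthand in bits 19:18); the macro and function names are hypothetical and are not taken from the manual's listing.

```c
#include <stdint.h>

/* Hypothetical names for the ICR low-dword fields used by a broadcast SIPI. */
#define ICR_DELIVERY_STARTUP   (0x6u << 8)   /* delivery mode 110B: Start-Up */
#define ICR_LEVEL_ASSERT       (1u << 14)    /* level = assert */
#define ICR_SHORTHAND_ALL_EXCL (0x3u << 18)  /* 11B: all excluding self */

/* Builds the ICR low dword for a SIPI broadcast to all APs. */
uint32_t sipi_broadcast_icr(uint8_t vector)
{
    return ICR_SHORTHAND_ALL_EXCL | ICR_LEVEL_ASSERT |
           ICR_DELIVERY_STARTUP | vector;
}

/* A SIPI vector VV points the AP at physical address 000VV000H. */
uint32_t sipi_start_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}
```

With a vector of 9FH, for instance, the helper yields 000C469FH and the APs would begin executing at 0009F000H.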
Vol. 3 8-33 MULTIPLE-PROCESSOR MANAGEMENT 8.4.5 Identifying Logical Processors in an MP System After the BIOS has completed the MP initialization protocol, each logical processor can be uniquely identified by its local APIC ID. Software can access these APIC IDs in either of the following ways: • Re...
8-34 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT during power-up and initialization is 8 bits. Bits 2:1 form a 2-bit physical package identifier (which can also be thought of as a socket identifier). In systems that configure physical processors in clusters, bits 4:3 form a 2-bit cluster ID. Bit 0 is used ...
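The field positions named in the snippet above (package ID in bits 2:1, cluster ID in bits 4:3 of the 8-bit initial APIC ID) can be sketched as plain bit extraction. The function names below are hypothetical, and the sketch covers only the two fields whose description is visible in the text.

```c
#include <stdint.h>

/* Bits 2:1 of the 8-bit initial APIC ID: physical package (socket) ID. */
unsigned initial_apic_package_id(uint8_t apic_id)
{
    return (apic_id >> 1) & 0x3u;
}

/* Bits 4:3: cluster ID, meaningful only on clustered systems. */
unsigned initial_apic_cluster_id(uint8_t apic_id)
{
    return (apic_id >> 3) & 0x3u;
}
```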
Vol. 3 8-35 MULTIPLE-PROCESSOR MANAGEMENT 8.5 INTEL ® HYPER-THREADING TECHNOLOGY AND INTEL ® MULTI-CORE TECHNOLOGY Intel Hyper-Threading Technology and Intel multi-core technology are extensions to Intel 64 and IA-32 architectures that enable a single physical processor to execute two or more separa...
8-36 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT number of addressable IDs attributable to processor cores (Y) in the physical package. • Extended Processor Topology Enumeration parameters for 32-bit APIC ID: Intel 64 processors supporting CPUID leaf 0BH will assign unique APIC IDs to each logical processo...
Vol. 3 8-37 MULTIPLE-PROCESSOR MANAGEMENT During initialization, each logical processor is assigned an APIC ID that is stored in the local APIC ID register for each logical processor. If two or more processors supporting Intel Hyper-Threading Technology are present, each logical processor on the sys...
8-38 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.7 INTEL ® HYPER-THREADING TECHNOLOGY ARCHITECTURE Figure 8-4 shows a generalized view of an Intel processor supporting Intel Hyper-Threading Technology, using the original Intel Xeon processor MP as an example. This implementation of the Intel Hyper-Thread...
Vol. 3 8-39 MULTIPLE-PROCESSOR MANAGEMENT 8.7.1 State of the Logical Processors The following features are part of the architectural state of logical processors within Intel 64 or IA-32 processors supporting Intel Hyper-Threading Technology. The features can be subdivided into three groups: • Duplic...
8-40 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT • Debug registers (DR0, DR1, DR2, DR3, DR6, DR7) and the debug control MSRs • Machine check global status (IA32_MCG_STATUS) and machine check capability (IA32_MCG_CAP) MSRs • Thermal clock modulation and ACPI Power management control MSRs • Time stamp counte...
Vol. 3 8-41 MULTIPLE-PROCESSOR MANAGEMENT gives software a consistent view of memory, independent of the processor on which it is running. See Section 11.11, “Memory Type Range Registers (MTRRs),” for information on setting up MTRRs. 8.7.4 Page Attribute Table (PAT) Each logical processor has its o...
8-42 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.7.7 Performance Monitoring Counters Performance counters and their companion control MSRs are shared between the logical processors within a processor core for processors based on Intel NetBurst microarchitecture. As a result, software must manage the use ...
Vol. 3 8-43 MULTIPLE-PROCESSOR MANAGEMENT 8.7.11 MICROCODE UPDATE Resources In an Intel processor supporting Intel Hyper-Threading Technology, the microcode update facilities are shared between the logical processors; either logical processor can initiate an update. Each logical processor has its ow...
8-46 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.8 MULTI-CORE ARCHITECTURE This section describes the architecture of Intel 64 and IA-32 processors supporting dual-core and quad-core technology. The discussion is applicable to the Intel Pentium processor Extreme Edition, Pentium D, Intel Core Duo, Intel ...
Vol. 3 8-47 MULTIPLE-PROCESSOR MANAGEMENT 8.8.3 Performance Monitoring Counters Performance counters and their companion control MSRs are shared between two logical processors sharing a processor core if the processor core supports Intel Hyper-Threading Technology and is based on Intel NetBurst micr...
8-48 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT provided for each logical processor (see Section 8.7, “Intel ® Hyper-Threading Technology Architecture,” and Section 8.8, “Multi-Core Architecture”). From a software programming perspective, control transfer of processor operation is managed at the granul...
Vol. 3 8-49 MULTIPLE-PROCESSOR MANAGEMENT If the processor supports CPUID leaf 0BH, the 32-bit APIC ID can represent cluster plus several levels of topology within the physical processor package. The exact number of hierarchical levels within a physical processor package must be enumerated through ...
8-50 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.9.2 Hierarchical Mapping of CPUID Extended Topology Leaf CPUID leaf 0BH provides enumeration parameters for software to identify each hierarchy of the processor topology in a deterministic manner. Each hierarchical level of the topology starting from the ...
Vol. 3 8-51 MULTIPLE-PROCESSOR MANAGEMENT
For m = 0, m < N, m++; { cumulative_width[m] = CPUID.(EAX=0BH, ECX=m):EAX[4:0]; }
BitWidth[0] = cumulative_width[0];
For m = 1, m < N, m++; BitWidth[m] = cumulative_width[m] - cumulative_width[m-1];
Currently, only the following encoding of hierarchic...
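The BitWidth pseudocode on page 8-51 can be sketched in C as follows. The cumulative widths are passed in as an array (in real code they would come from CPUID leaf 0BH, EAX[4:0] per level); the function name and parameter layout are assumptions made for illustration.

```c
#include <stddef.h>

/* Converts cumulative shift widths, one per topology level, into the
 * width of each level's own ID field. Level 0 is the innermost level. */
void level_bit_widths(const unsigned cumulative_width[],
                      unsigned bit_width[], size_t n)
{
    for (size_t m = 0; m < n; m++) {
        bit_width[m] = (m == 0)
            ? cumulative_width[0]
            : cumulative_width[m] - cumulative_width[m - 1];
    }
}
```

For cumulative widths {1, 4}, the SMT field would be 1 bit wide and the core field 3 bits wide.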
8-52 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Table 8-2 shows the initial APIC IDs for a hypothetical dual-processor system in which each physical package provides two processor cores and each processor core also supports Intel Hyper-Threading Technology. Figure 8-7. Topological Relationshi...
Vol. 3 8-53 MULTIPLE-PROCESSOR MANAGEMENT 8.9.3.1 Hierarchical ID of Logical Processors with x2APIC ID Table 8-3 shows an example of possible x2APIC ID assignments for a dual-processor system that supports x2APIC. Each physical package provides four processor cores, and each processor core also supp...
8-54 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.9.4 Algorithm for Three-Level Mappings of APIC_ID Software can gather the initial APIC_IDs for each logical processor supported by the operating system at runtime and extract identifiers corresponding to the three levels of sharing topology (package, cor...
Vol. 3 8-57 MULTIPLE-PROCESSOR MANAGEMENT int DeriveCore_Mask_Offsets (void){ if (!HWMTSupported()) return -1; execute cpuid with eax = 11, ECX = 0; while( ECX[15:8] ) { // level type encoding is valid If (returned level type encoding in ECX[15:8] matches CORE) { Mask_Core_shift = EAX[4:0]; // neede...
8-58 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT unsigned char MaxLPIDsPerPackage(void){ if (!HWMTSupported()) return 1; execute cpuid with eax = 1; store returned value of ebx; return (unsigned char) ((reg_ebx & NUM_LOGICAL_BITS) >> 16); } b. Find the size of address space for processor cores in a ...
8-60 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Software must not assume local APIC_ID values in an MP system are consecutive. Non-consecutive local APIC_IDs may be the result of hardware configurations or debug features implemented in the BIOS or OS. An identifier for each hierarchical level can be extrac...
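Extracting a per-level identifier from an APIC ID amounts to masking and shifting. The sketch below assumes the two shift counts have already been derived (for example from CPUID leaf 0BH) and are both smaller than 32. Unlike the manual's listing, which keeps each ID in place under a mask, this version shifts every field down to bit 0; the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t smt;      /* logical processor within a core */
    uint32_t core;     /* core within a package */
    uint32_t package;  /* physical package */
} topo_ids;

/* smt_shift:  number of APIC ID bits used by the SMT level.
 * core_shift: number of bits used by SMT and core levels together.
 * Both are assumed to be < 32 and smt_shift <= core_shift. */
topo_ids split_apic_id(uint32_t apic_id, unsigned smt_shift,
                       unsigned core_shift)
{
    topo_ids t;
    uint32_t smt_mask = (1u << smt_shift) - 1u;
    uint32_t below_package_mask = (1u << core_shift) - 1u;

    t.smt = apic_id & smt_mask;
    t.core = (apic_id & below_package_mask) >> smt_shift;
    t.package = apic_id >> core_shift;
    return t;
}
```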
8-64 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT
}
if (i == CoreNum) {
// Did not match any bucket, start new bucket
CoreIDBucket[i] = PackageID[ProcessorNum] | CoreID[ProcessorNum];
CoreProcessorMask[i] = ProcessorMask;
CoreNum++;
}
}
// CoreNum has the number of cores started in the OS
// CoreProcessorMask[] ar...
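The bucket loop above groups logical processors that share the same package and core IDs. A compact sketch of that idea follows, assuming each logical processor is described by a precomputed key (PackageID | CoreID, both left in place under their masks) and that at most 64 logical processors are examined; the function name and array layout are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* key[p]      : PackageID | CoreID for logical processor p.
 * bucket[]    : receives one distinct key per core found.
 * core_mask[] : receives, per core, a bit mask of its logical processors.
 * Returns the number of distinct cores. Assumes n <= 64. */
size_t group_by_core(const uint32_t key[], size_t n,
                     uint32_t bucket[], uint64_t core_mask[])
{
    size_t core_num = 0;

    for (size_t p = 0; p < n; p++) {
        uint64_t processor_mask = (uint64_t)1 << p;
        size_t i;

        for (i = 0; i < core_num; i++) {
            if (bucket[i] == key[p]) {      /* same core as an earlier one */
                core_mask[i] |= processor_mask;
                break;
            }
        }
        if (i == core_num) {                /* no match: start a new bucket */
            bucket[i] = key[p];
            core_mask[i] = processor_mask;
            core_num++;
        }
    }
    return core_num;
}
```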
Vol. 3 8-67 MULTIPLE-PROCESSOR MANAGEMENT Power management related events (such as Thermal Monitor 2 or chipset driven STPCLK# assertion) will not cause the monitor event pending flag to be cleared. Faults will not cause the monitor event pending flag to be cleared. Software should not allow for volu...
8-68 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT the two parameters should default to be the same (the size of the monitor triggering area is the same as the system coherence line size). Based on the monitor line sizes returned by the CPUID, the OS should dynamically allocate structures with appropriate pad...
8-72 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT { MONITOR WorkQueue // Setup of eax with WorkQueue LinearAddress, // ECX, EDX = 0 IF (WorkQueue != 0) THEN { STI MWAIT // EAX, ECX = 0 } } 8.10.6.5 Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources Because the logical process...
Vol. 3 8-73 MULTIPLE-PROCESSOR MANAGEMENT • A high resolution timer within the processor (such as the local APIC timer or the time-stamp counter). For additional information, see the Intel® 64 and IA-32 Architectures Optimization Reference Manual. 8.10.6.7 Place Locks and Semaphores in Aligned, 128...
Vol. 3 9-1 CHAPTER 9 PROCESSOR MANAGEMENT AND INITIALIZATION This chapter describes the facilities provided for managing processor wide functions and for initializing the processor. The subjects covered include: processor initialization, x87 FPU initialization, processor configuration, feature dete...
9-2 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION The software-initialization code performs all system-specific initialization of the BSP or primary processor and the system logic. At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or secondary) processor to enab...
Vol. 3 9-5 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.3 Model and Stepping Information Following a hardware reset, the EDX register contains component identification and revision information (see Figure 9-2). For example, the model, family, and processor type returned for the first processor in the...
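The component identification value held in EDX after reset can be split into its fields with a few shifts. The following is a sketch only, assuming the conventional signature layout (stepping in bits 3:0, model in bits 7:4, family in bits 11:8, processor type in bits 13:12); the struct and function names are hypothetical, and extended family/model fields are deliberately left out.

```c
#include <stdint.h>

typedef struct {
    unsigned stepping;  /* bits 3:0 */
    unsigned model;     /* bits 7:4 */
    unsigned family;    /* bits 11:8 */
    unsigned type;      /* bits 13:12 */
} cpu_signature;

/* Decodes the base fields of the reset-time EDX signature value. */
cpu_signature decode_signature(uint32_t edx)
{
    cpu_signature s;
    s.stepping = edx & 0xFu;
    s.model = (edx >> 4) & 0xFu;
    s.family = (edx >> 8) & 0xFu;
    s.type = (edx >> 12) & 0x3u;
    return s;
}
```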
9-6 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.4 First Instruction Executed The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing ...
Vol. 3 9-7 PROCESSOR MANAGEMENT AND INITIALIZATION The EM flag determines whether floating-point instructions are executed by the x87 FPU (EM is cleared) or a device-not-available exception (#NM) is generated for all floating-point instructions so that an exception handler can emulate the floating-p...
9-8 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION • It allows x87 FPU code to run on an IA-32 processor that has neither an integrated x87 FPU nor is connected to an external math coprocessor, by using a floating-point emulator. • It allows floating-point code to be executed using a special or nons...
9-10 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION all the MTRRs must be cleared to 0, which selects the uncached (UC) memory type. See Section 11.11, “Memory Type Range Registers (MTRRs),” for detailed information on the MTRRs. 9.6 INITIALIZING SSE/SSE2/SSE3/SSSE3 EXTENSIONS For processors that c...
Vol. 3 9-11 PROCESSOR MANAGEMENT AND INITIALIZATION mode. The protected-mode data structures that must be loaded are described in Section 9.8, “Software Initialization for Protected-Mode Operation.” 9.7.1 Real-Address Mode IDT In real-address mode, the only system data structure that must be loaded ...
9-12 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION modules into memory to support reliable operation of the processor in protected mode. These data structures include the following: • An IDT. • A GDT. • A TSS. • (Optional) An LDT. • If paging is to be used, at least one page directory and one page t...
Vol. 3 9-13 PROCESSOR MANAGEMENT AND INITIALIZATION descriptors in the GDT. Some operating systems allocate new segments and LDTs as they are needed. This provides maximum flexibility for handling a dynamic programming environment. However, many operating systems use a single LDT for all tasks, all...
9-14 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.4 Initializing Multitasking If the multitasking mechanism is not going to be used and changes between privilege levels are not allowed, it is not necessary to load a TSS into memory or to initialize the task register. If the multitasking mechanism ...
Vol. 3 9-15 PROCESSOR MANAGEMENT AND INITIALIZATION following instructions must be located in an identity-mapped page (until such time that a branch to non-identity mapped pages can be effected). 64-bit mode paging tables must be located in the first 4 GBytes of physical-address space prior to activ...
9-16 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.5.3 64-bit Mode and Compatibility Mode Operation IA-32e mode uses two code segment-descriptor bits (CS.L and CS.D, see Figure 3-8) to control the operating modes after IA-32e mode is initialized. If CS.L = 1 and CS.D = 0, the processor is runni...
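The way CS.L and CS.D select a sub-mode once IA-32e mode is active can be summarized as a small decision function. This is a sketch under the usual reading of those two bits (L = 1, D = 0 selects 64-bit mode; L = 0 selects compatibility mode with D choosing the default operand size; L = 1, D = 1 is reserved); the enum and function names are hypothetical.

```c
typedef enum {
    SUBMODE_64BIT,         /* CS.L = 1, CS.D = 0 */
    SUBMODE_COMPAT_32,     /* CS.L = 0, CS.D = 1 */
    SUBMODE_COMPAT_16,     /* CS.L = 0, CS.D = 0 */
    SUBMODE_RESERVED       /* CS.L = 1, CS.D = 1 */
} ia32e_submode;

/* Classifies the code-segment descriptor bits while IA-32e mode is active. */
ia32e_submode classify_code_segment(int cs_l, int cs_d)
{
    if (cs_l)
        return cs_d ? SUBMODE_RESERVED : SUBMODE_64BIT;
    return cs_d ? SUBMODE_COMPAT_32 : SUBMODE_COMPAT_16;
}
```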
Vol. 3 9-17 PROCESSOR MANAGEMENT AND INITIALIZATION from 64-bit mode through compatibility mode to legacy or real mode and then back through compatibility mode to 64-bit mode. 9.9 MODE SWITCHING To use the processor in protected mode after hardware or software reset, a mode switch must be performed ...
9-18 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 7. If a local descriptor table is going to be used, execute the LLDT instruction to load the segment selector for the LDT in the LDTR register. 8. Execute the LTR instruction to load the task register with a segment selector to the initial protecte...
Vol. 3 9-19 PROCESSOR MANAGEMENT AND INITIALIZATION 4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing the following values, which are appropriate for real-address mode: — Limit = 64 KBytes (0FFFFH) — Byte granular (G = 0) — Expand up (E = 0) — Writable (W = 1) —...
Vol. 3 9-21 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-3. Processor State After Reset Table 9-4. Main Initialization Steps in STARTUP.ASM Source Listing STARTUP.ASM Line Numbers Description From To 157 157 Jump (short) to the entry code in the EPROM 162 169 Construct a temporary GDT in RAM wit...
9-22 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.1 Assembler Usage In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and build the initialization code module. The following assumptions are used when using the Intel ASM386 and BLD386 tools. • The ASM386 w...
Vol. 3 9-23 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.2 STARTUP.ASM Listing Example 9-1 provides high-level sample code designed to move the processor into protected mode. This listing does not include any opcode and offset information. Example 9-1. STARTUP.ASM MS-DOS* 5.0(045-N) 386(TM) MACRO AS...
9-34 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION CODE SEGMENT ER use32 PUBLIC main_start: nop nop nop CODE ENDS END main_start, ds:data, ss:stack 9.10.4 Supporting Files The batch file shown in Example 9-3 can be used to assemble the source code files STARTUP.ASM and MAIN.ASM and build the final ...
Vol. 3 9-35 PROCESSOR MANAGEMENT AND INITIALIZATION TABLE GDT ( LOCATION = GDT_EPROM , ENTRY = ( 10: PROTECTED_MODE_TASK , startup.startup_code , startup.startup_data , main_module.data , main_module.code , main_module.stack ) ), IDT ( LOCATION = IDT_EPROM ); MEMORY ( RESERVE = (0..3FFFH -- Area for...
9-36 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11 MICROCODE UPDATE FACILITIES The Pentium 4, Intel Xeon, and P6 family processors have the capability to correct errata by loading an Intel-supplied data block into the processor. The data block is called a microcode update. This section describ...
Vol. 3 9-37 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.1 Microcode Update A microcode update consists of an Intel-supplied binary that contains a descriptive header and data. No executable code resides within the update. Each microcode update is tailored for a specific list of processor signatures...
9-38 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION NOTE The optional extended signature table is supported starting with processor family 0FH, model 03H. Table 9-6. Microcode Update Field Definitions Field Name Offset (bytes) Length (bytes) Description Header Version 0 4 Version number of the upd...
Vol. 3 9-41 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.2 Optional Extended Signature Table The extended signature table is a structure that may be appended to the end of the encrypted data when the encrypted data only supports a single processor signature (optional case). The extended signature ta...
9-42 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION a processor signature embedded in the microcode update with the processor signature returned by CPUID will cause the BIOS to reject the update. Example 9-5 shows how to check for a valid processor signature match between the processor and microcode...
9-44 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION } Else { // // Assume the Data Size has been used to calculate the // location of Update.ProcessorSignature[N] and a match // on Update.ProcessorSignature[N] has already succeeded // If (Update.ProcessorFlags[n] & Flag) { Load Update } } } 9.11...
Vol. 3 9-47 PROCESSOR MANAGEMENT AND INITIALIZATION If a processor core supports Intel Hyper-Threading Technology, the guideline described in Section 9.11.6.3 also applies. 9.11.6.5 Update Loader Enhancements The update loader presented in Section 9.11.6, “Microcode Update Loader,” is a minimal implem...
9-50 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION There are no optional functions. BIOS must load the appropriate update for each processor during system initialization. A Header Version of an update block containing the value 0FFFFFFFFH indicates that the update block is unused and available for s...
9-52 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION } } NOTES The platform Id bits in IA32_PLATFORM_ID are encoded as a three-bit binary coded decimal field. The platform bits in the microcode update header are individually bit encoded. The algorithm must do a translation from one format to the othe...
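The translation mentioned in the note above, from the three-bit platform ID field to the one-bit-per-platform flags in the update header, can be sketched as a shift. The sketch assumes the platform ID occupies bits 52:50 of IA32_PLATFORM_ID, and it takes the MSR value as a parameter rather than reading it; the function names are hypothetical.

```c
#include <stdint.h>

/* Converts the 3-bit platform ID (assumed to sit in IA32_PLATFORM_ID[52:50])
 * into the bit-encoded form used by the update's Processor Flags field. */
uint32_t platform_flag(uint64_t ia32_platform_id)
{
    unsigned id = (unsigned)((ia32_platform_id >> 50) & 0x7u);
    return 1u << id;
}

/* An update applies when the signature matches and the update's
 * processor flags include this platform's bit. */
int update_matches(uint32_t cpu_signature, uint32_t cpu_flag,
                   uint32_t update_signature, uint32_t update_flags)
{
    return cpu_signature == update_signature &&
           (update_flags & cpu_flag) != 0;
}
```

A platform ID of 5, for example, corresponds to flag value 20H, so an update whose flags field is 24H would cover platforms 2 and 5.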
Vol. 3 9-55 PROCESSOR MANAGEMENT AND INITIALIZATION } // // Compare the Update read to that written // If (Update read != Update written) { Display Diagnostic exit } I ← I + (size of microcode update / 2048) } // // Enable Update Loading, and inform user // Issue the Update Control function with Tas...
9-56 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION In general, each function returns with CF cleared and AH contains the returned status. The general return codes and other constant definitions are listed in Section 9.11.8.9, “Return Codes.” The OEM error field (AL) is provided for the OEM to return...
Vol. 3 9-57 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8.6 Function 01H—Write Microcode Update Data This function integrates a new microcode update into the BIOS storage device. Table 9-14 lists the parameters and return codes for the function. Table 9-14. Parameters for the Write Update Data Func...
9-58 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION Description The BIOS is responsible for selecting an appropriate update block in the non-volatile storage for storing the new update. The BIOS is also responsible for ensuring the integrity of the information provided by the caller, including auth...
Vol. 3 9-63 PROCESSOR MANAGEMENT AND INITIALIZATION The READ_FAILURE error code returned by this function has meaning only if the control function is implemented in the BIOS NVRAM. The state of this feature (enabled/disabled) can also be implemented using CMOS RAM bits where READ failure errors cann...
Vol. 3 10-1 CHAPTER 10 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The Advanced Programmable Interrupt Controller (APIC), referred to in the following sections as the local APIC, was introduced into the IA-32 processors with the Pentium processor (see Section 19.27, “Advanced Programmable Inte...
10-2 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) interrupt pins (LINT0 and LINT1). The I/O devices may also be connected to an 8259-type interrupt controller that is in turn connected to the processor through one of the local interrupt pins. • Externally connected I/O devices — These in...
10-4 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) also be delivered to the individual processors through the local interrupt pins; however, this mechanism is commonly not used in MP systems. Figure 10-2. Local APICs and I/O APIC When Intel Xeon Processors Are Used in Multiple-Processor S...
Vol. 3 10-5 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The IPI mechanism is typically used in MP systems to send fixed interrupts (interrupts for a specific vector number) and special-purpose interrupts to processors on the system bus. For example, a local APIC can use an IPI to forward a fi...
10-6 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) forward extendability for future Intel platform innovations. These extensions and modifications are noted in the following sections. 10.4 LOCAL APIC The following sections describe the architecture of the local APIC and how to detect it, ...
10-8 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-1 shows how the APIC registers are mapped into the 4-KByte APIC register space. Registers are 32 bits, 64 bits, or 256 bits in width; all are aligned on 128-bit boundaries. All 32-bit registers should be accessed using 128-bit al...
10-10 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.4.2 Presence of the Local APIC Beginning with the P6 family processors, the presence or absence of an on-chip local APIC can be detected using the CPUID instruction. When the CPUID instruction is executed with a source operand of 1 in...
Vol. 3 10-11 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 1. Using the APIC global enable/disable flag in the IA32_APIC_BASE MSR (MSR address 1BH; see Figure 10-5 ): — When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent to an IA-32 processor without an on-chip APIC. The CPUID...
10-12 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • APIC Global Enable flag, bit 11 ⎯ Enables or disables the local APIC (see Section 10.4.3, “Enabling or Disabling the Local APIC” ). This flag is available in the Pentium 4, Intel Xeon, and P6 family processors. It is not guaranteed to ...
Vol. 3 10-13 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) this, operating system software should avoid writing to the local APIC ID register. The value returned by bits 31-24 of the EBX register (when the CPUID instruction is executed with a source operand value of 1 in the EAX register) is alw...
10-14 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) x2APIC introduces a 32-bit ID; see Section 10.5. 10.4.7.1 Local APIC State After Power-Up or Reset Following a power-up or RESET of the processor, the state of the local APIC and its registers is as follows: • The following registers ar...
Vol. 3 10-15 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • The mask bits for all the LVT entries are set. Attempts to reset these bits will be ignored. • (For Pentium and P6 family processors) The local APIC continues to listen to all bus messages in order to keep its arbitration ID synchroniz...
10-16 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.5 EXTENDED XAPIC (X2APIC) The x2APIC architecture extends the xAPIC architecture (described in Section 10.4) in a backward compatible manner and provides forward extendability for future Intel platform innovations. Specifically, x2APIC...
Vol. 3 10-17 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-2, “x2APIC operating mode configurations,” describes the possible combinations of the enable bit (EN - bit 11) and the extended mode bit (EXTD - bit 10) in the IA32_APIC_BASE MSR. Once the local APIC has been switched to x2APIC...
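The four EN/EXTD combinations can be decoded from an IA32_APIC_BASE value as sketched below. The sketch assumes the usual interpretation of the two bits (EN = 0, EXTD = 1 is an invalid combination) and takes the MSR value as a parameter instead of executing RDMSR; the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    APIC_MODE_DISABLED,  /* EN = 0, EXTD = 0 */
    APIC_MODE_INVALID,   /* EN = 0, EXTD = 1 */
    APIC_MODE_XAPIC,     /* EN = 1, EXTD = 0 */
    APIC_MODE_X2APIC     /* EN = 1, EXTD = 1 */
} apic_mode;

/* Classifies the operating mode from IA32_APIC_BASE bits 11 (EN) and 10 (EXTD). */
apic_mode decode_apic_mode(uint64_t ia32_apic_base)
{
    int en = (int)((ia32_apic_base >> 11) & 1u);
    int extd = (int)((ia32_apic_base >> 10) & 1u);

    if (en)
        return extd ? APIC_MODE_X2APIC : APIC_MODE_XAPIC;
    return extd ? APIC_MODE_INVALID : APIC_MODE_DISABLED;
}
```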
10-18 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 32-bit register. Similarly, executing the WRMSR instruction with the APIC register address in ECX writes bits 0 to 31 of register EAX to bits 0 to 31 of the specified APIC register. If the register is a 64-bit register then bits 0 to 31 ...
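In x2APIC mode the value loaded into ECX is an MSR address rather than a memory-mapped offset. A minimal sketch of the commonly documented mapping, MSR address = 800H + (xAPIC register offset / 16), is shown below; the function name is hypothetical and the sketch does not cover registers that have no x2APIC equivalent.

```c
#include <stdint.h>

#define X2APIC_MSR_BASE 0x800u

/* Maps an xAPIC register offset within the 4-KByte APIC page
 * (a multiple of 16) to the corresponding x2APIC MSR address. */
uint32_t xapic_offset_to_x2apic_msr(uint32_t xapic_offset)
{
    return X2APIC_MSR_BASE + (xapic_offset >> 4);
}
```

The APIC ID register at offset 020H, for instance, maps to MSR 802H, and the spurious-interrupt vector register at offset 0F0H maps to MSR 80FH.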
10-22 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) to enable BIOS and/or platform firmware to re-configure the x2APIC IDs in some clusters to provide for unique and non-overlapping system wide IDs before configuring the disconnected components into a single system. 10.5.2 x2APIC Registe...
Vol. 3 10-23 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) field, VM-exit MSR-load address field, and VM-entry MSR-load address field in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B). The x2APIC MSRs cannot be loaded and stored on VMX transitions. A VMX transition ...
10-24 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The default value for SVR[bit 12] is clear, indicating that an EOI broadcast will be performed. Support for the Directed EOI capability can be detected by means of bit 24 in the Local APIC Version Register. This feature is supported in bo...
10-26 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) x2APIC After RESET The valid transitions from the xAPIC mode state are: • to the x2APIC mode by setting EXTD to 1 (resulting in EN=1, EXTD=1). The physical x2APIC ID (see Figure 10-6) is preserved across this transition and the logical x2A...
Vol. 3 10-27 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) x2APIC Transitions From x2APIC Mode From the x2APIC mode, the only valid x2APIC transition using IA32_APIC_BASE is to the state where the x2APIC is disabled by setting EN to 0 and EXTD to 0. The x2APIC ID (32 bits) and the legacy local x...
10-28 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Support for the x2APIC architecture can be implemented in the local APIC unit. All existing PCI/MSI capable devices and IOxAPIC unit should work with the x2APIC extensions defined in this document. The x2APIC architecture also provides f...
10-30 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.6 HANDLING LOCAL INTERRUPTS The following sections describe facilities that are provided in the local APIC for handling local interrupts. These include: the processor’s LINT0 and LINT1 pins, the APIC timer, the performance-monitoring ...
10-32 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The setup information that can be specified in the registers of the LVT table is as follows: Vector Interrupt vector number. Delivery Mode Specifies the type of interrupt to be sent to the processor. Some delivery modes will only operate ...
Vol. 3 10-33 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Interrupt Input Pin Polarity Specifies the polarity of the corresponding interrupt pin: (0) active high or (1) active low. Remote IRR Flag (Read Only) For fixed mode, level-triggered interrupts; this flag is set when the local APIC accep...
10-36 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) If the ICR is programmed with lowest priority delivery mode then the "Re-directible IPI" bit will be set in x2APIC modes (same as legacy xAPIC behavior) and the interrupt will not be processed. Write to the ICR with both lowest p...
Vol. 3 10-37 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The time base for the timer is derived from the processor’s bus clock, divided by the value specified in the divide configuration register. The timer can be configured through the timer LVT entry for one-shot or periodic operation. In one...
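The relationship between the bus clock, the divide configuration register, and the timer count can be sketched as follows. The sketch assumes the usual divide-value encoding held in bits 3, 1, and 0 of the register (000B selects divide by 2, up through 110B for divide by 128, with 111B selecting divide by 1); the function names are hypothetical.

```c
#include <stdint.h>

/* Decodes the divide value from the divide configuration register.
 * The 3-bit code is formed from register bits 3, 1, and 0. */
unsigned timer_divisor(uint32_t divide_config)
{
    unsigned code = ((divide_config >> 1) & 0x4u) | (divide_config & 0x3u);
    return (code == 7u) ? 1u : (2u << code);
}

/* Number of bus clocks that elapse before a timer loaded with
 * initial_count reaches zero. */
uint64_t timer_bus_clocks(uint32_t initial_count, uint32_t divide_config)
{
    return (uint64_t)initial_count * timer_divisor(divide_config);
}
```

Dividing the result by the bus clock frequency gives the interval in seconds; that frequency is platform specific and is not supplied by this sketch.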
10-38 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.6.5 Local Interrupt Acceptance When a local interrupt is sent to the processor core, it is subject to the acceptance criteria specified in the interrupt acceptance flow chart in Figure 10-25. If the interrupt is accepted, it is log...
Vol. 3 10-39 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The ICR consists of the following fields. Vector The vector number of the interrupt being sent. Delivery Mode Specifies the type of IPI to be sent. This field is also known as the IPI message type field. 000 (Fixed) Delivers the interrupt...
Vol. 3 10-41 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Destination Mode Selects either physical (0) or logical (1) destination mode (see Section 10.7.2, “Determining IPI Destination” ). Delivery Status (Read Only) Indicates the IPI delivery status, as follows: 0 (Idle) There is currently no ...
10-42 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) destination field set to FH for Pentium and P6 family processors and to FFH for Pentium 4 and Intel Xeon processors. 11: (All Excluding Self) The IPI is sent to all processors in a system with the exception of the processor sending the I...
Vol. 3 10-43 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Self Invalid X Lowest Priority, NMI, INIT, SMI, Start- Up X All Including Self Valid Edge Fixed X All Including Self Invalid 2 Level Fixed X All Including Self Invalid X Lowest Priority, NMI, INIT, SMI, Start- Up X All Excluding Self Val...
10-46 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.7.2 Determining IPI Destination The destination of an IPI can be one, all, or a subset (group) of the processors on the system bus. The sender of the IPI specifies the destination of an IPI with the following APIC registers and fields...
Vol. 3 10-49 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) lowest priority delivery mode is not supported in cluster mode and must not be configured by software. The hierarchical cluster destination model can be used with Pentium 4, Intel Xeon, P6 family, or Pentium processors. With this model, a...
10-50 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) mode is not supported in the x2APIC mode. Hence the Destination Format Register (DFR) is eliminated in x2APIC mode. The 32-bit logical x2APIC ID field of LDR is partitioned into two sub-fields: • Cluster ID (LDR[31:16]): is the address o...
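The two LDR sub-fields described above lend themselves to a short sketch. The matching rule used here (same cluster ID and at least one common bit in the 16-bit logical ID mask) is the customary reading of x2APIC cluster-mode logical addressing and is stated as an assumption; the function names are hypothetical.

```c
#include <stdint.h>

/* LDR[31:16]: address of the destination cluster. */
uint16_t ldr_cluster_id(uint32_t ldr)
{
    return (uint16_t)(ldr >> 16);
}

/* LDR[15:0]: bit mask identifying a logical processor within the cluster. */
uint16_t ldr_logical_id(uint32_t ldr)
{
    return (uint16_t)(ldr & 0xFFFFu);
}

/* Assumed acceptance rule: the destination names this APIC's cluster
 * and its logical-ID mask overlaps this APIC's logical ID. */
int logical_destination_matches(uint32_t own_ldr, uint32_t destination)
{
    return ldr_cluster_id(own_ldr) == ldr_cluster_id(destination) &&
           (ldr_logical_id(own_ldr) & ldr_logical_id(destination)) != 0;
}
```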
10-52 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Here, the TPR value is the task priority value in the TPR (see Figure 10-26 ), the IRRV value is the vector number for the highest priority bit that is set in the IRR (see Figure 10-28 ) or 00H (if no IRR bit is set), and the ISRV value ...
Vol. 3 10-53 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The SELF IPI register is a write-only register. A RDMSR instruction with the address of the SELF IPI register raises a general-protection exception (#GP). The handling and prioritization of a self-IPI sent via the SELF IPI register is architecturally identical t...
10-54 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) priorities of the local APICs by resetting the Arb ID register of each agent to its current APIC ID value. (The Pentium 4 and Intel Xeon processors do not implement the Arb ID register.) Section 10.11, “APIC Bus Message Passing Mechanism and...
Vol. 3 10-55 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupts given in step 2, the local APIC sets the appropriate bit in the IRR. 4. When interrupts are pe...
Vol. 3 10-59 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Its value in the PPR is computed as follows: IF TPR[7:4] ≥ ISRV[7:4] THEN PPR[7:0] ← TPR[7:0] ELSE PPR[7:4] ← ISRV[7:4] PPR[3:0] ← 0 Here, the ISRV value is the vector number of the highest priority ISR bit that is set, or 00H if no ISR ...
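The PPR rule quoted above can be modeled in a few lines. This is a minimal sketch, not processor code: the function name `compute_ppr` is hypothetical, and `isrv` is taken to be 00H when no ISR bit is set, as the text states.

```python
def compute_ppr(tpr: int, isrv: int) -> int:
    """Model the PPR computation from the TPR and the highest in-service vector.

    tpr  : 8-bit task-priority register value
    isrv : vector of the highest-priority ISR bit that is set, or 0x00 if none
    """
    tpr &= 0xFF
    isrv &= 0xFF
    # Compare priority classes (bits 7:4) only.
    if (tpr >> 4) >= (isrv >> 4):
        return tpr              # PPR[7:0] <- TPR[7:0]
    return isrv & 0xF0          # PPR[7:4] <- ISRV[7:4], PPR[3:0] <- 0
```

For example, with TPR = 15H and vector 42H in service, the in-service class (4) wins and the sub-class is cleared, so the model returns 40H.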
Vol. 3 10-61 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) bit is cleared for edge-triggered interrupts and set for level-triggered interrupts. If a TMR bit is set when an EOI cycle for its corresponding interrupt vector is generated, an EOI message is sent to all I/O APICs. 10.9.5 Signaling Int...
10-62 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • Loading the TPR with a value of 8 (01000B) blocks all interrupts with a priority of 8 or less while allowing all interrupts with a priority of 9 or more to be recognized. • Loading the TPR with zero enables all external interrupts. ...
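The blocking rule for priority classes can be illustrated with a small model. It is a sketch under two assumptions: the priority class of an interrupt is bits 7:4 of its vector, and CR8[3:0] is reflected into APIC TPR[7:4] with TPR[3:0] cleared. The function names are hypothetical.

```python
def priority_class(vector: int) -> int:
    """Priority class of an interrupt vector (bits 7:4)."""
    return (vector & 0xFF) >> 4


def cr8_to_tpr(cr8: int) -> int:
    """TPR value corresponding to a CR8 write: class in bits 7:4, sub-class 0."""
    return (cr8 & 0xF) << 4


def is_blocked(vector: int, tpr_class: int) -> bool:
    """True if the vector's class is at or below the task-priority class."""
    return priority_class(vector) <= (tpr_class & 0xF)
```

With a task-priority class of 8, vector 8FH (class 8) is held pending while vector 90H (class 9) is recognized. With a class of 0, every valid external vector (16 through 255) has class 1 or higher and is therefore enabled.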
Vol. 3 10-63 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) There are no ordering mechanisms between direct updates of the APIC.TPR and CR8. Operating software should implement either direct APIC TPR updates or CR8 style TPR updates but not mix them. Software can use a serializing instruction (fo...
10-64 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.11 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS) The Pentium 4 and Intel Xeon processors pass messages among the local and I/O APICs on the system bus, using the system bus message passing mechanism a...
Vol. 3 10-65 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) the bus regardless of its sender’s arbitration priority, unless more than one APIC issues an EOI message simultaneously. In the latter case, the APICs sending the EOI messages arbitrate using their arbitration priorities. If the APICs are...
10-66 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.12.1 Message Address Register Format The format of the Message Address Register (lower 32-bits) is shown in Figure 10-32. Fields in the Message Address Register are as follows:1. Bits 31-20 — These bits contain a fixed value for inter...
Vol. 3 10-67 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) destination mode and only the processor in the system that has the matching APIC ID is considered for delivery of that interrupt (this means no re-direction). If RH is 1 and DM is 1, the Destination ID Field is interpreted as in logical ...
10-68 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Reserved fields are not assumed to be any value. Software must preserve their contents on writes. Other fields in the Message Data Register are described below.1. Vector — This 8-bit field contains the interrupt vector associated with th...
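To make the two message registers concrete, the following sketch packs their fields into 32-bit values. The field positions used here are taken from the architectural MSI format as commonly documented (address: fixed 0FEEH in bits 31:20, Destination ID in bits 19:12, RH in bit 3, DM in bit 2; data: trigger mode in bit 15, level in bit 14, delivery mode in bits 10:8, vector in bits 7:0). The function names are hypothetical, and reserved bits are simply left at zero, which a real driver must not assume when updating an existing register.

```python
def msi_address(dest_id: int, redirection_hint: int = 0, dest_mode: int = 0) -> int:
    """Build the lower 32 bits of a Message Address Register value."""
    return ((0xFEE << 20)
            | ((dest_id & 0xFF) << 12)
            | ((redirection_hint & 1) << 3)
            | ((dest_mode & 1) << 2))


def msi_data(vector: int, delivery_mode: int = 0, level: int = 1,
             trigger_mode: int = 0) -> int:
    """Build a Message Data Register value (delivery_mode 0 is Fixed)."""
    return (((trigger_mode & 1) << 15)
            | ((level & 1) << 14)
            | ((delivery_mode & 0x7) << 8)
            | (vector & 0xFF))
```

A fixed, edge-triggered interrupt with vector 41H aimed at APIC ID 3 in physical mode would use address FEE03000H and data 4041H under this model.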
11-2 Vol. 3 MEMORY CACHE CONTROL Figure 11-2 shows the cache arrangement of Intel Core i7 processor. Figure 11-2. Cache Structure of the Intel Core i7 Processors Table 11-1. Characteristics of the Caches, TLBs, Store Buffer, and Write Combining Buffer in Intel 64 and IA-32 Processors Cache or Buffer...
Vol. 3 11-7 MEMORY CACHE CONTROL Processors based on Intel Core microarchitectures implement one level of instruction TLB and two levels of data TLB. Intel Core i7 processor provides a second-level unified TLB. The store buffer is associated with the processor's instruction execution units. It allows...
11-8 Vol. 3 MEMORY CACHE CONTROL (depending on the write policy currently in force) can also write it out to memory. If the operand is to be written out to memory, it is written first into the store buffer, and then written from the store buffer to memory when the system bus is available. (Note that...
Vol. 3 11-9 MEMORY CACHE CONTROL registers to access UC memory that may have read or write side effects. • Uncacheable (UC-) — Has same characteristics as the strong uncacheable (UC) memory type, except that this memory type can be overridden by programming the MTRRs for the WC memory type. This mem...
Vol. 3 11-11 MEMORY CACHE CONTROL 11.3.1 Buffering of Write Combining Memory Locations Writes to the WC memory type are not cached in the typical sense of the word cached. They are retained in an internal write combining buffer (WC buffer) that is separate from the internal L1, L2, and L3 caches and...
11-12 Vol. 3 MEMORY CACHE CONTROL The WC memory type is weakly ordered by definition. Once the eviction of a WC buffer has started, the data is subject to the weak ordering semantics of its definition. Ordering is not maintained between the successive allocation/deallocation of WC buffers (for exam...
Vol. 3 11-13 MEMORY CACHE CONTROL large data structure should be marked as uncacheable, or reading it will evict cached lines that the processor will be referencing again. A similar example would be a write-only data structure that is written to (to export the data to another agent), but never read ...
11-14 Vol. 3 MEMORY CACHE CONTROL The L1 instruction cache in P6 family processors implements only the “SI” part of the MESI protocol, because the instruction cache is not writable. The instruction cache monitors changes in the data cache to maintain consistency between the caches when instructions ...
Vol. 3 11-15 MEMORY CACHE CONTROL 11.5.1 Cache Control Registers and Bits Figure 11-3 depicts cache-control mechanisms in IA-32 processors. Other than for the matter of memory address space, these work the same in Intel 64 processors. The Intel 64 and IA-32 architectures provide the following cache-c...
11-18 Vol. 3 MEMORY CACHE CONTROL • NW flag, bit 29 of control register CR0 — Controls the write policy for system memory locations (see Section 2.5, “Control Registers”). If the NW and CD flags are clear, write-back is enabled for the whole of system memory, but may be restricted for individual pag...
11-20 Vol. 3 MEMORY CACHE CONTROL page-table entries) permit caching in an external L2 cache to be controlled on a page-by-page basis, consistent with the control exercised on the L1 cache of these processors. The P6 and more recent processor families do not provide these pins because the L2 cache i...
Vol. 3 11-21 MEMORY CACHE CONTROL When normal caching is in effect, the effective memory type shown in Table 11-6 is determined using the following rules: 1. If the PCD and PWT attributes for the page are both 0, then the effective memory type is identical to the MTRR-defined memory type. 2. If the P...
11-22 Vol. 3 MEMORY CACHE CONTROL 11.5.2.2 Selecting Memory Types for Pentium III and More Recent Processor Families The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Intel Core Solo, Pentium M, Pentium 4, Intel Xeon, and Pentium III processors use the PAT to select effective page-level memory types...
Vol. 3 11-23 MEMORY CACHE CONTROL 11.5.2.3 Writing Values Across Pages with Different Memory Types If two adjoining pages in memory have different memory types, and a word or longer operand is written to a memory location that crosses the page boundary between those two pages, the operand might be w...
11-24 Vol. 3 MEMORY CACHE CONTROL 11.5.3 Preventing Caching To disable the L1, L2, and L3 caches after they have been enabled and have received cache fills, perform the following steps: 1. Enter the no-fill cache mode. (Set the CD flag in control register CR0 to 1 and the NW flag to 0.) 2. Flush all c...
Vol. 3 11-25 MEMORY CACHE CONTROL 11.5.4 Disabling and Enabling the L3 Cache On processors based on Intel NetBurst microarchitecture, the third-level cache can be disabled by bit 6 of the IA32_MISC_ENABLE MSR. The third-level cache disable flag (bit 6 of the IA32_MISC_ENABLE MSR) allows the L3 cache...
11-26 Vol. 3 MEMORY CACHE CONTROL The CLFLUSH instruction allows selected cache lines to be flushed from memory. This instruction gives a program the ability to explicitly free up cache space when it is known that a cached section of system memory will not be accessed in the near future. The non-tempora...
11-28 Vol. 3 MEMORY CACHE CONTROL To avoid problems related to implicit caching, the operating system must explicitly invalidate the cache when changes are made to cacheable data that the cache coherency mechanism does not automatically handle. This includes writes to dual-ported or physically alia...
Vol. 3 11-29 MEMORY CACHE CONTROL 11.9 INVALIDATING THE TRANSLATION LOOKASIDE BUFFERS (TLBS) The processor updates its address translation caches (TLBs) transparently to software. Several mechanisms are available, however, that allow software and hardware to invalidate the TLBs either explicitly or...
11-30 Vol. 3 MEMORY CACHE CONTROL The discussion of write ordering in Section 8.2, “Memory Ordering,” gives a detailed description of the operation of the store buffer. 11.11 MEMORY TYPE RANGE REGISTERS (MTRRS) The following section pertains only to the P6 and more recent processor families. The memo...
Vol. 3 11-31 MEMORY CACHE CONTROL Reserved* 03H Write-through (WT) 04H Write-protected (WP) 05H Writeback (WB) 06H Reserved* 7H through FFH NOTE: * Use of these encodings results in a general-protection exception (#GP). Figure 11-4. Mapping Physical Memory With MTRRs Table 11-8. Memory Types That Ca...
Vol. 3 11-33 MEMORY CACHE CONTROL 11.11.2 Setting Memory Ranges with MTRRs The memory ranges and the types of memory specified in each range are set by three groups of registers: the IA32_MTRR_DEF_TYPE MSR, the fixed-range MTRRs, and the variable range MTRRs. These registers can be read and written ...
11-36 Vol. 3 MEMORY CACHE CONTROL — The width of the PhysMask field depends on the maximum physical address size supported by the processor. CPUID.80000008H reports the maximum physical address size supported by the processor. If CPUID.80000008H is not available, software may assume that the process...
11-38 Vol. 3 MEMORY CACHE CONTROL Before attempting to access these SMRR registers, software must test bit 11 in the IA32_MTRRCAP register. If SMRR is not supported, reads from or writes to registers cause general-protection exceptions. When the valid flag in the IA32_SMRR_PHYSMASK MSR is 1, accesses...
11-40 Vol. 3 MEMORY CACHE CONTROL IA32_MTRR_PHYSBASE5 = 0000 0000 A000 0001H IA32_MTRR_PHYSMASK5 = 0000 000F FF80 0800H Caches A0000000-A0800000 as WC type. This MTRR setup uses the ability to overlap any two memory ranges (as long as the ranges are mapped to WB and UC memory types) to minimize the n...
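The PHYSBASE5/PHYSMASK5 pair above can be checked with a small model of the variable-range match. This sketch assumes the usual register layout (memory type in PHYSBASE bits 7:0, valid flag in PHYSMASK bit 11, address fields starting at bit 12) and the rule that an address falls in the range when address AND mask equals base AND mask. The function name is hypothetical.

```python
from typing import Optional


def variable_mtrr_lookup(physbase: int, physmask: int, address: int) -> Optional[int]:
    """Return the memory-type encoding if `address` hits this MTRR pair, else None."""
    if not physmask & (1 << 11):       # valid flag clear: the pair is unused
        return None
    mask = physmask & ~0xFFF           # drop the valid flag and reserved low bits
    base = physbase & ~0xFFF           # drop the type field
    if (address & mask) == (base & mask):
        return physbase & 0xFF         # type encoding, e.g. 01H = WC
    return None
```

With the register values above, addresses A0000000H through A07FFFFFH return 01H (WC) and A0800000H does not match, so the upper bound printed in the example is exclusive.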
Vol. 3 11-41 MEMORY CACHE CONTROL 11.11.4 Range Size and Alignment Requirement A range that is to be mapped to a variable-range MTRR must meet the following “power of 2” size and alignment rules: 1. The minimum range size is 4 KBytes and the base address of the range must be on at least a 4-KByte bou...
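The size and alignment rules translate directly into a validity check plus a mask computation. The sketch below is illustrative only: `encode_variable_mtrr` is a hypothetical helper, the 36-bit default for the physical address width is an assumption (software should take the real width from CPUID.80000008H), and the second rule (a range larger than 4 KBytes must have a power-of-2 length and a base aligned to that length) is stated from the general MTRR definition rather than from the truncated text above.

```python
def encode_variable_mtrr(base: int, size: int, mem_type: int,
                         max_phys_addr_bits: int = 36):
    """Return (PHYSBASE, PHYSMASK) values for one variable-range MTRR."""
    if size < 4096 or size & (size - 1):
        raise ValueError("size must be a power of 2 and at least 4 KBytes")
    if base % size:
        raise ValueError("base must be aligned on a boundary equal to size")
    addr_mask = (1 << max_phys_addr_bits) - 1
    physbase = (base & addr_mask & ~0xFFF) | (mem_type & 0xFF)
    physmask = (~(size - 1) & addr_mask) | (1 << 11)   # bit 11 = valid
    return physbase, physmask
```

An 8-MByte WC range at A0000000H yields PHYSBASE = 0000 0000 A000 0001H and PHYSMASK = 0000 000F FF80 0800H with a 36-bit physical address.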
11-42 Vol. 3 MEMORY CACHE CONTROL the MTRRs according to known types of memory, including memory on devices that it auto-configures. Initialization is expected to occur prior to booting the operating system. See Section 11.11.8, “MTRR Considerations in MP Systems,” for information on initializing MTR...
11-46 Vol. 3 MEMORY CACHE CONTROL END The physical address to variable range mapping algorithm in the MemTypeSet function detects conflicts with current variable range registers by cycling through them and determining whether the physical address in question matches any of the current ranges. Durin...
11-48 Vol. 3 MEMORY CACHE CONTROL The requirement that all 4-KByte ranges in a large page are of the same memory type implies that large pages with different memory types may suffer a performance penalty, since they must be marked with the lowest common denominator memory type. The Pentium 4, Intel X...
Vol. 3 11-49 MEMORY CACHE CONTROL 11.12.2 IA32_PAT MSR The IA32_PAT MSR is located at MSR address 277H (see Appendix B, “Model-Specific Registers (MSRs)”), and this MSR will remain at the same address on future IA-32 processors that support the PAT feature. Figure 11-9 shows the format of the...
11-50 Vol. 3 MEMORY CACHE CONTROL 11.12.3 Selecting a Memory Type from the PAT To select a memory type for a page from the PAT, a 3-bit index made up of the PAT, PCD, and PWT bits must be encoded in the page-table or page-directory entry for the page. Table 11-11 shows the possible encodings of the ...
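The index construction can be sketched as follows. The assumptions are that the index is formed as PAT:PCD:PWT (PAT as the most significant bit), that entry PAi occupies the low three bits of byte i of IA32_PAT, and that the power-up value of the MSR is 0007040600070406H. The names `pat_entry`, `DEFAULT_IA32_PAT`, and `PAT_TYPE_NAMES` are hypothetical.

```python
# Memory-type encodings that may appear in a PAT entry.
PAT_TYPE_NAMES = {0x00: "UC", 0x01: "WC", 0x04: "WT",
                  0x05: "WP", 0x06: "WB", 0x07: "UC-"}

# Assumed power-up contents of IA32_PAT: PA0..PA3 = WB, WT, UC-, UC, repeated.
DEFAULT_IA32_PAT = 0x0007040600070406


def pat_entry(ia32_pat: int, pat: int, pcd: int, pwt: int) -> int:
    """Return the memory-type encoding selected by the page's PAT, PCD, PWT bits."""
    index = ((pat & 1) << 2) | ((pcd & 1) << 1) | (pwt & 1)
    return (ia32_pat >> (8 * index)) & 0x7
```

Under the assumed default, a page with PAT = 0, PCD = 1, PWT = 0 selects entry PA2, which `PAT_TYPE_NAMES` reports as UC-.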
Vol. 3 12-1 CHAPTER 12 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING This chapter describes those features of the Intel ® MMX™ technology that must be considered when designing or enhancing an operating system to support MMX technology. It covers MMX instruction set emulation, the MMX state, aliasing...
12-2 Vol. 3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING result, the MMX register mapping is fixed and is not affected by the value in the Top Of Stack (TOS) field in the floating-point status word (bits 11 through 13). When a value is written into an MMX register using an MMX instruction, the value also...
Vol. 3 12-3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING • When the EMMS instruction is executed, each tag field in the x87 FPU tag word is set to 11B (empty). • Each time an MMX instruction is executed, the TOS value is set to 000B. Execution of MMX instructions does not affect the other bits in the...
12-4 Vol. 3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING 12.3 SAVING AND RESTORING THE MMX STATE AND REGISTERS Because the MMX registers are aliased to the x87 FPU data registers, the MMX state can be saved to memory and restored from memory as follows: • Execute an FSAVE, FNSAVE, or FXSAVE instructi...
Vol. 3 12-5 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING • Execute eight MOVQ instructions to save the contents of the MMX0 through MMX7 registers to memory. An EMMS instruction may then (optionally) be executed to clear the MMX state in the x87 FPU. • Execute eight MOVQ instructions to read the save...
12-6 Vol. 3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING • System exceptions:— Invalid Opcode (#UD), if the EM flag in control register CR0 is set when an MMX instruction is executed (see Section 12.1, “Emulation of the MMX Instruction Set”). — Device not available (#NM), if an MMX instruction is exe...
Vol. 3 12-7 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING When the TOS equals 2 (case B in Figure 12-2), ST0 points to the physical location R2. MM0 maps to ST6, MM1 maps to ST7, MM2 maps to ST0, and so on. Figure 12-2. Mapping of MMX Registers to x87 FPU Data Register Stack MM0 MM1 MM2 MM3 MM4 MM5 MM...
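The TOS-dependent aliasing follows from two facts: MMn is fixed to physical register Rn, and ST(i) refers to physical register R((TOS + i) mod 8). A minimal sketch with hypothetical function names:

```python
def physical_register_for_st(st_index: int, tos: int) -> int:
    """Physical x87 data register Rn addressed by ST(st_index) for a given TOS."""
    return (st_index + tos) % 8


def st_index_for_mmx(mm_index: int, tos: int) -> int:
    """Stack-relative index i such that ST(i) aliases MM<mm_index>."""
    return (mm_index - tos) % 8
```

With TOS = 2 the model gives MM0 = ST6, MM1 = ST7, and MM2 = ST0, matching case B described above. With TOS = 0 the mapping is the identity.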
Vol. 3 13-1 CHAPTER 13 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR EXTENDED STATES This chapter describes system programming features for instruction set extensions operating on the processor state extension known as the SSE state (XMM registers, MXCSR) and for processor extended...
Vol. 3 13-3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND To use the POPCNT instruction, software must check CPUID.1:ECX.POPCNT[bit 23] = 1 13.1.3 Checking for Support for the FXSAVE and FXRSTOR Instructions A separate check must be made to ensure that the processor supports FXSAVE and FXRSTOR. ...
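Feature checks of this kind reduce to testing one bit of a CPUID output register. The sketch below works on register values that the caller has already obtained from CPUID leaf 1, since Python cannot execute CPUID itself. The bit tables and the helper name are hypothetical conveniences. The POPCNT position comes from the text above; FXSR (EDX bit 24) and XSAVE/OSXSAVE (ECX bits 26 and 27) are the standard positions.

```python
# Selected feature-flag bit positions in CPUID.1 output registers.
CPUID1_ECX_BITS = {"POPCNT": 23, "XSAVE": 26, "OSXSAVE": 27}
CPUID1_EDX_BITS = {"FXSR": 24}


def has_feature(register_value: int, bit_table: dict, name: str) -> bool:
    """True if the named feature flag is set in a CPUID output register value."""
    return bool((register_value >> bit_table[name]) & 1)
```

A caller would pass the ECX value for POPCNT and the EDX value for FXSR, for example `has_feature(ecx, CPUID1_ECX_BITS, "POPCNT")`.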
Vol. 3 13-5 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND The SIMD floating-point exception mask bits (bits 7 through 12), the flush-to-zero flag (bit 15), the denormals-are-zero flag (bit 6), and the rounding control field (bits 13 and 14) in the MXCSR register should be left in their defau...
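The MXCSR fields named above can be pulled apart with shifts and masks. This sketch assumes the power-up value of MXCSR is 1F80H (all six exception masks set, everything else clear). `decode_mxcsr` is a hypothetical helper, not an architectural interface.

```python
def decode_mxcsr(mxcsr: int) -> dict:
    """Split an MXCSR value into the control fields discussed in the text."""
    return {
        "daz": (mxcsr >> 6) & 0x1,               # denormals-are-zero, bit 6
        "exception_masks": (mxcsr >> 7) & 0x3F,  # bits 7 through 12
        "rounding_control": (mxcsr >> 13) & 0x3, # bits 13 and 14
        "flush_to_zero": (mxcsr >> 15) & 0x1,    # bit 15
    }
```

Decoding 1F80H gives all exception masks set (3FH), round-to-nearest (00B), and both DAZ and flush-to-zero clear.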
Vol. 3 13-7 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND — Device not available (#NM). This exception is generated by executing a SSE/SSE2/SSE3/SSSE3/SSE4 instruction when the TS flag (bit 3) of CR0 is set to 1. Other exceptions can occur indirectly due to faulty execution of the above exce...
Vol. 3 13-9 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND • Execute a LDMXCSR instruction to restore the state of the MXCSR register from memory. 13.4 SAVING THE SSE/SSE2/SSE3/SSSE3/SSE4 STATE ON TASK OR CONTEXT SWITCHES When switching from one task or context to another, it is often necessa...
13-10 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR when a suspended task is resumed (using an FXRSTOR instruction). Here, the x87 FPU/MMX/SSE/SSE2/SSE3/SSE4 state must be saved as part of the task state. This approach is appropriate for preemptive multitasking operating sys...
Vol. 3 13-11 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND The TS flag can be set either explicitly (by executing a MOV instruction to control register CR0) or implicitly (using the IA-32 architecture’s native task switching mechanism). When the native task switching mechanism is used, the ...
13-12 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR If a new task attempts to access an x87 FPU, MMX, XMM, or MXCSR register while the TS flag is set to 1, a device-not-available exception (#NM) is generated. The device-not-available exception handler executes the following ...
Vol. 3 13-13 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND — CPUID leaf function 0DH enumerates the list of processor states (including legacy x87 FPU, SSE states and processor extended states), the offset and size of individual save area for each processor extended state. • Control register...
13-14 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR The XSAVE header is 64 bytes in length and must be aligned on a 64-byte boundary. Therefore, the XSAVE/XRSTOR region must be aligned on a 64-byte boundary. The format of the header is as follows (see Table 13-3): The value of e...
Vol. 3 13-15 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND enabled), a value of "1" in the corresponding bit of HEADER.XSTATE_BV causes the processor state to be updated with contents of the save area read from the memory image. A value of "0" in HEADER.XSTATE_BV causes the p...
Vol. 3 13-17 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND 13.8 DETECTION, ENUMERATION, ENABLING PROCESSOR EXTENDED STATE SUPPORT An OS can determine if the XSAVE/XRSTOR/XGETBV/XSETBV instructions and the XFEATURE_ENABLED_MASK register (XCR0) are available in the processor by checking the va...
Vol. 3 13-19 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND If all three requirements are met, applications can use the target new instruction set extensions. If any of the above requirements are not met, an attempt to execute an instruction operating on a processor extended state correspondi...
Vol. 3 14-1 CHAPTER 14 POWER AND THERMAL MANAGEMENT This chapter describes facilities of Intel 64 and IA-32 architecture used for power management and thermal monitoring. 14.1 ENHANCED INTEL SPEEDSTEP ® TECHNOLOGY Enhanced Intel SpeedStep ® Technology was introduced in the Pentium M processor; it is...
14-2 Vol. 3 POWER AND THERMAL MANAGEMENT tools can access model-specific events and report the occurrences of state transitions. 14.2 P-STATE HARDWARE COORDINATION The Advanced Configuration and Power Interface (ACPI) defines performance states (P-states) that are used to facilitate system software’s ab...
14-4 Vol. 3 POWER AND THERMAL MANAGEMENT
// This example does not cover the additional logic or algorithms
// necessary to coordinate multiple logical processors to a target P-state.
TargetPstate = FindPstate(PercentPerformance);
if (TargetPstate != currentPstate) {
    SetPState(TargetPstate);
}
// WRMS...
Vol. 3 14-5 POWER AND THERMAL MANAGEMENT corresponding enable mechanism is activated, the headroom is available and certain criteria are met. • The opportunistic processor performance operation is generally transparent to most application software. • System software (BIOS and Operating system) must ...
14-8 Vol. 3 POWER AND THERMAL MANAGEMENT • When the OS timer service transfers control, the application can use RDPMC (with ECX = 4000_0001H) to read IA32_PERF_FIXED_CTR1 (MSR address 30AH) to record the unhalted core clocktick (UCC) value; followed by RDPMC (ECX=4000_0002H) to read IA32_PERF_FIXED_...
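Once two samples of each counter have been taken, one common use is the ratio of unhalted core clockticks to unhalted reference clockticks over the interval. The following is a sketch under stated assumptions: the counter values are supplied by the caller (Python cannot execute RDPMC), the 48-bit counter width is an assumed default that real software should take from CPUID leaf 0AH, and both function names are hypothetical.

```python
def counter_delta(start: int, end: int, width_bits: int = 48) -> int:
    """Difference between two samples of a free-running counter, allowing one wrap."""
    return (end - start) & ((1 << width_bits) - 1)


def unhalted_ratio(ucc_start: int, ucc_end: int,
                   urc_start: int, urc_end: int, width_bits: int = 48) -> float:
    """Ratio of core clockticks (UCC) to reference clockticks (URC) over an interval."""
    urc_delta = counter_delta(urc_start, urc_end, width_bits)
    if urc_delta == 0:
        raise ValueError("no reference clockticks elapsed in the interval")
    return counter_delta(ucc_start, ucc_end, width_bits) / urc_delta
```

A ratio above 1.0 suggests the core ran faster than the reference clock on average during the interval, and a ratio below 1.0 suggests it ran slower.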
Vol. 3 14-9 POWER AND THERMAL MANAGEMENT Software can program the lowest four bits of IA32_ENERGY_PERF_BIAS MSR with a value from 0 - 15. The values represent a sliding scale, where a value of 0 (the default reset value) corresponds to a hint preference for highest performance and a value of 15 corr...
14-10 Vol. 3 POWER AND THERMAL MANAGEMENT Reference, A-M,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). If CPUID.05H.ECX[Bit 1] = 1, the target processor supports using interrupts as break-events for MWAIT, even when interrupts are disabled. Use this feature to measur...
Vol. 3 14-11 POWER AND THERMAL MANAGEMENT consumption; this is in addition to the reduction offered by automatic thermal monitoring mechanisms. 4. On-die digital thermal sensor and interrupt mechanisms permit the OS to manage thermal conditions natively without relying on BIOS or other system board ...
14-12 Vol. 3 POWER AND THERMAL MANAGEMENT 14.5.1 Catastrophic Shutdown Detector P6 family processors introduced a thermal sensor that acts as a catastrophic shutdown detector. This catastrophic shutdown detector was also implemented in Pentium 4, Intel Xeon and Pentium M processors. It is always en...
Vol. 3 14-13 POWER AND THERMAL MANAGEMENT Support for TM2 is indicated by CPUID.1:ECX.TM2[bit 8] = 1. 14.5.2.3 Two Methods for Enabling TM2 On processors with CPUID family/model/stepping signature encoded as 0x69n or 0x6Dn (early Pentium M processors), TM2 is enabled if the TM_SELECT flag (bit 16) o...
14-14 Vol. 3 POWER AND THERMAL MANAGEMENT 14.5.2.4 Performance State Transitions and Thermal Monitoring If the thermal control circuitry (TCC) for thermal monitor (TM1/TM2) is active, writes to the IA32_PERF_CTL will effect a new target operating point as follows: • If TM1 is enabled and the TCC is ...
14-16 Vol. 3 POWER AND THERMAL MANAGEMENT interrupt enable flags in the IA32_THERM_INTERRUPT MSR are cleared (interrupts are disabled) and the thermal LVT entry is set to mask interrupts. This interrupt should be handled either by the operating system or system management mode (SMM) code. Note that t...
Vol. 3 14-17 POWER AND THERMAL MANAGEMENT The IA32_CLOCK_MODULATION MSR contains the following flag and field used to enable software-controlled clock modulation and to select the clock modulation duty cycle: • On-Demand Clock Modulation Enable, bit 4 — Enables on-demand software controlled clock mo...
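The flag and field of IA32_CLOCK_MODULATION can be packed as shown in this sketch. It assumes the classic layout in which the duty-cycle field occupies bits 3:1 in steps of 12.5% (001B = 12.5% through 111B = 87.5%, with 000B reserved). The function names are hypothetical, and the result covers only bits 4:0, so real code would merge it into the value read from the MSR.

```python
def encode_clock_modulation(duty_eighths: int, enable: bool = True) -> int:
    """Return bits 4:0 of IA32_CLOCK_MODULATION for a duty cycle of duty_eighths/8."""
    if not 1 <= duty_eighths <= 7:
        raise ValueError("duty cycle field must be 1..7 (000B is reserved)")
    return (int(enable) << 4) | (duty_eighths << 1)


def duty_cycle_percent(msr_value: int) -> float:
    """Decode the duty-cycle field (bits 3:1) as a percentage."""
    return ((msr_value >> 1) & 0x7) * 12.5
```

A 50% duty cycle with on-demand modulation enabled encodes as 18H.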
14-18 Vol. 3 POWER AND THERMAL MANAGEMENT clock modulation at the duty cycle specified by TM1 takes precedence, regardless of the setting of the on-demand clock modulation duty cycle. For Hyper-Threading Technology enabled processors, the IA32_CLOCK_MODULATION register is duplicated for each logical ...
Vol. 3 15-1 CHAPTER 15 MACHINE-CHECK ARCHITECTURE This chapter describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, and P6 family processors. See Chapter 6, “Interrupt 18—Machine-Check Exception (#MC),” for more information on machine-che...
15-2 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.2 COMPATIBILITY WITH PENTIUM PROCESSOR The Pentium 4, Intel Xeon, and P6 family processors support and extend the machine-check exception mechanism introduced in the Pentium processor. The Pentium processor reports the following machine-check errors: • data ...
Vol. 3 15-3 MACHINE-CHECK ARCHITECTURE Each error-reporting bank is associated with a specific hardware unit (or group of hardware units) in the processor. Use RDMSR and WRMSR to read and to write these registers. 15.3.1 Machine-Check Global Control MSRs The machine-check global control MSRs include...
15-6 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.3.1.3 IA32_MCG_CTL MSR The IA32_MCG_CTL MSR is present if the capability flag MCG_CTL_P is set in the IA32_MCG_CAP MSR. IA32_MCG_CTL controls the reporting of machine-check exceptions. If present, writing 1s to this register enables machine-check features an...
Vol. 3 15-7 MACHINE-CHECK ARCHITECTURE encoding of 06H_1AH and onward): the operating system or executive software must not modify the contents of the IA32_MC0_CTL MSR. This MSR is internally aliased to the EBL_CR_POWERON MSR and controls platform-specific error handling features. System specific f...
Vol. 3 15-11 MACHINE-CHECK ARCHITECTURE In Table 15-2, the values in the two left-most columns are IA32_MCi_STATUS[54:53]. If a second event overwrites a previously posted event, the information (as guarded by individual valid bits) in the MCi bank is entirely from the second event. Similarly, if a ...
Vol. 3 15-13 MACHINE-CHECK ARCHITECTURE • Recoverable Address LSB (bits 5:0): The lowest valid recoverable address bit. Indicates the position of the least significant bit (LSB) of the recoverable error address. For example, if the processor logs bits [43:9] of the address, the LSB sub-field in IA32...
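The Recoverable Address LSB sub-field tells software how many low-order address bits were not logged. A handler might use it as in this sketch, where the function names are hypothetical, the register values are passed in by the caller, and the address mode is assumed to sit in bits 8:6 of IA32_MCi_MISC.

```python
def recoverable_address_lsb(mci_misc: int) -> int:
    """Position of the least significant valid address bit (IA32_MCi_MISC bits 5:0)."""
    return mci_misc & 0x3F


def address_mode(mci_misc: int) -> int:
    """Address mode encoding (assumed to be IA32_MCi_MISC bits 8:6)."""
    return (mci_misc >> 6) & 0x7


def valid_error_address(mci_addr: int, mci_misc: int) -> int:
    """Clear the address bits below the reported LSB, which carry no information."""
    lsb = recoverable_address_lsb(mci_misc)
    return mci_addr & ~((1 << lsb) - 1)
```

In the example from the text, a processor that logs bits [43:9] reports an LSB of 9, so the low nine bits of the logged address are cleared and the error is known only to 512-byte granularity.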
15-14 Vol. 3 MACHINE-CHECK ARCHITECTURE When IA32_MCG_CAP[10] = 1, the IA32_MCi_CTL2 MSR for each bank exists, i.e., reads and writes to these MSRs are supported. However, the signaling interface for corrected MC errors may not be supported in all banks. The layout of IA32_MCi_CTL2 is shown in Figure 15-8...
Vol. 3 15-15 MACHINE-CHECK ARCHITECTURE 15.3.2.6 IA32_MCG Extended Machine Check State MSRs The Pentium 4 and Intel Xeon processors implement a variable number of extended machine-check state MSRs. The MCG_EXT_P flag in the IA32_MCG_CAP MSR indicates the presence of these extended registers, and the...
15-16 Vol. 3 MACHINE-CHECK ARCHITECTURE Table 15-5. Extended Machine Check State MSRs In Processors With Support For Intel 64 Architecture MSR Address Description IA32_MCG_RAX 180H Contains state of the RAX register at the time of the machine- check error. IA32_MCG_RBX 181H Contains state of the RBX...
Vol. 3 15-17 MACHINE-CHECK ARCHITECTURE When a machine-check error is detected on a Pentium 4 or Intel Xeon processor, the processor saves the state of the general-purpose registers, the R/EFLAGS register, and the R/EIP in these extended machine-check state MSRs. This information can be used by a de...
15-18 Vol. 3 MACHINE-CHECK ARCHITECTURE processor; the handler must be written to interpret P5_MC_TYPE encodings correctly. 15.4 ENHANCED CACHE ERROR REPORTING Starting with Intel Core Duo processors, cache error reporting was enhanced. In earlier Intel processors, cache status was based on the numb...
Vol. 3 15-19 MACHINE-CHECK ARCHITECTURE beyond those of threshold-based error reporting (Section 15.4). With threshold-based error reporting, software is limited to periodic polling to query the status of hardware-corrected MC errors. CMCI provides a signaling mechanism to deliver a local interr...
15-20 Vol. 3 MACHINE-CHECK ARCHITECTURE CMCI interrupt delivery is configured by writing to the LVT CMCI register entry in the local APIC register space at the default address of APIC_BASE + 2F0H. A CMCI interrupt can be delivered to more than one logical processor if multiple logical processors are af...
Vol. 3 15-21 MACHINE-CHECK ARCHITECTURE • Delivery status, bit 12 — It is a read-only bit that, when set, indicates that an interrupt from this source has been delivered to the processor core, but has not yet been accepted. • Mask, bit 16 — When set, inhibits reception of the interrupt. (Unlike th...
15-22 Vol. 3 MACHINE-CHECK ARCHITECTURE b. Each thread examines the IA32_MCi_CTL2[30] indicator for each bank to determine if another thread has already claimed ownership of that bank. • If IA32_MCi_CTL2[30] has been set by another thread, this thread cannot own bank i and should proceed to step b. and...
Vol. 3 15-23 MACHINE-CHECK ARCHITECTURE • Write 7FFFH to IA32_MCi_CTL2[15:0]. • Read back IA32_MCi_CTL2[15:0]; the lower 15 bits (14:0) are the maximum threshold supported by the processor. b. Increase the threshold to a value below the maximum value discovered using step a. 15.5.2.3 CMCI Interrupt H...
15-24 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.6.1 Detection of Software Error Recovery Support Software must use bit 24 of IA32_MCG_CAP (MCG_SER_P) to detect the presence of software error recovery support (see Figure 15-2). When IA32_MCG_CAP[24] is set, this indicates that the processor supports soft-...
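Detection of MCG_SER_P is one of several capability checks made against IA32_MCG_CAP. The sketch below decodes the commonly documented fields of that MSR from a value the caller has already read with RDMSR. `decode_mcg_cap` is a hypothetical helper. Apart from bit 24, which the text above names, the field positions are stated from the general register layout.

```python
def decode_mcg_cap(mcg_cap: int) -> dict:
    """Split an IA32_MCG_CAP value into its capability fields."""
    return {
        "count": mcg_cap & 0xFF,                 # number of error-reporting banks
        "mcg_ctl_p": (mcg_cap >> 8) & 1,         # IA32_MCG_CTL present
        "mcg_ext_p": (mcg_cap >> 9) & 1,         # extended state MSRs present
        "mcg_cmci_p": (mcg_cap >> 10) & 1,       # CMCI supported
        "mcg_tes_p": (mcg_cap >> 11) & 1,        # threshold-based error status
        "mcg_ext_cnt": (mcg_cap >> 16) & 0xFF,   # number of extended state MSRs
        "mcg_ser_p": (mcg_cap >> 24) & 1,        # software error recovery
    }
```

Software that intends to use the S and AR flags of IA32_MCi_STATUS would first confirm that `mcg_ser_p` is 1.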
Vol. 3 15-25 MACHINE-CHECK ARCHITECTURE • S (Signaling) flag, bit 56 - Indicates (when set) that a machine check exception was generated for the UCR error reported in this MC bank and system software needs to check the AR flag and the MCA error code fields in the IA32_MCi_STATUS register to identify...
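The S and AR flags, together with UC and PCC, sort a logged error into one of a few classes. The following sketch is a simplified model that assumes IA32_MCG_CAP[24] = 1 and the standard IA32_MCi_STATUS bit positions (VAL 63, UC 61, PCC 57, S 56, AR 55). It ignores the EN and OVER flags, which a real handler must also examine, and the function name and returned strings are hypothetical.

```python
VAL, UC, PCC, S, AR = 1 << 63, 1 << 61, 1 << 57, 1 << 56, 1 << 55


def classify_mc_status(status: int) -> str:
    """Classify an IA32_MCi_STATUS value on a processor with MCG_SER_P = 1."""
    if not status & VAL:
        return "no error logged"
    if not status & UC:
        return "corrected error"
    if status & PCC:
        return "uncorrected error, processor context corrupt"
    signaled, action_required = bool(status & S), bool(status & AR)
    if signaled and action_required:
        return "SRAR"      # software recoverable, action required
    if signaled:
        return "SRAO"      # software recoverable, action optional
    if not action_required:
        return "UCNA"      # uncorrected, no action required
    return "reserved combination"
```

A bank reporting VAL = 1, UC = 1, PCC = 0, S = 1, AR = 1 is classified as SRAR, which means recovery action is required before the interrupted stream can continue.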
Vol. 3 15-27 MACHINE-CHECK ARCHITECTURE 15.6.4 UCR Error Overwrite Rules In general, the overwrite rules are as follows: • UCR errors will overwrite corrected errors. • Uncorrected (PCC=1) errors overwrite UCR (PCC=0) errors. • UCR errors are not written over previous UCR errors. • Corrected errors ...
15-28 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.7 MACHINE-CHECK AVAILABILITY The machine-check architecture and machine-check exception (#MC) are model-specific features. Software can execute the CPUID instruction to determine whether a processor implements these features. Following the execution of the ...
Vol. 3 15-29 MACHINE-CHECK ARCHITECTURE (* enables all MCA features *) FI (* Determine number of error-reporting banks supported *) COUNT ← IA32_MCG_CAP.Count; MAX_BANK_NUMBER ← COUNT - 1; IF (Processor Family is 6H and Processor EXTMODEL:MODEL is less than 1AH) THEN (* Enable logging of all errors e...
15-30 Vol. 3 MACHINE-CHECK ARCHITECTURE also write a 16-bit model-specific error code in the IA32_MCi_STATUS register depending on the implementation of the machine-check architecture of the processor. The MCA error codes are architecturally defined for Intel 64 and IA-32 processors. To determine t...
Vol. 3 15-31 MACHINE-CHECK ARCHITECTURE 15.9.2 Compound Error Codes Compound error codes describe errors related to the TLBs, memory, caches, bus and interconnect logic, and internal timer. A set of sub-fields is common to all of compound errors. These sub-fields describe the type of access, level i...
15-32 Vol. 3 MACHINE-CHECK ARCHITECTURE The behavior of error filtering after crossing the yellow threshold is model-specific. 15.9.2.2 Transaction Type (TT) Sub-Field The 2-bit TT sub-field (Table 15-10) indicates the type of transaction (data, instruction, or generic). The sub-field applies to th...
Vol. 3 15-33 MACHINE-CHECK ARCHITECTURE caused the error. Eviction and snoop requests apply only to the caches. All of the other requests apply to TLBs, caches and interconnects. 15.9.2.5 Bus and Interconnect Errors The bus and interconnect errors are defined with the 2-bit PP (participation), 1-bit...
15-34 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.9.2.6 Memory Controller Errors The memory controller errors are defined with the 3-bit MMM (memory transaction type), and 4-bit CCCC (channel) sub-fields. The encodings for MMM and CCCC are defined in Table 15-14. 15.9.3 Architecturally Defined UCR Errors S...
Vol. 3 15-35 MACHINE-CHECK ARCHITECTURE 15-9). Their values and compound encoding format are given in Table 15-15. Table 15-16 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAO errors. For both the memory scrubbing and L3 explicit writeback errors, the ADDRV a...
15-36 Vol. 3 MACHINE-CHECK ARCHITECTURE IA32_MCG_STATUS register for the memory scrubbing and L3 explicit writeback errors on both the reporting and non-reporting logical processors. 15.9.3.2 Architecturally Defined SRAR Errors The following two SRAR errors are architecturally defined. • UCR Error...
Vol. 3 15-37 MACHINE-CHECK ARCHITECTURE Table 15-19 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAR errors. For both the data load and instruction fetch errors, the ADDRV and MISCV flags in the IA32_MCi_STATUS register are set to indicate that the offending ...
15-38 Vol. 3 MACHINE-CHECK ARCHITECTURE For Instruction Fetch recoverable error, the affected logical processor should find that the RIPV flag and the EIPV Flag in the IA32_MCG_STATUS register are cleared, indicating that the error is detected at the instruction pointer saved on the stack may not be...
Vol. 3 15-39 MACHINE-CHECK ARCHITECTURE • When multiple recoverable errors are reported and no other fatal condition (e.g., an overflow condition for an SRAR error) is found for the reported recoverable errors, it is possible for system software to recover from the multiple recoverable errors by taking ...
15-40 Vol. 3 MACHINE-CHECK ARCHITECTURE Guidelines for writing a machine-check exception handler or a machine-error logging utility are given in the following sections. 15.10.1 Machine-Check Exception Handler The machine-check exception (#MC) corresponds to vector 18. To service machine-check excep...
Vol. 3 15-41 MACHINE-CHECK ARCHITECTURE generated). If this flag is clear, the processor may still be able to be restarted (for debugging purposes) but not without loss of program continuity. • For unrecoverable errors, the EIPV flag in the IA32_MCG_STATUS register indicates whether the instruction ...
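The flags a handler inspects in IA32_MCG_STATUS can be decoded as in the sketch below. It assumes the usual layout with RIPV in bit 0, EIPV in bit 1, and MCIP in bit 2, and takes a raw register value that was read elsewhere. The type and function names are hypothetical.

```python
from typing import NamedTuple


class McgStatus(NamedTuple):
    ripv: bool  # restart IP valid: execution can resume at the saved instruction pointer
    eipv: bool  # error IP valid: the saved instruction pointer is tied to the error
    mcip: bool  # machine check in progress


def decode_mcg_status(value):
    """Split a raw IA32_MCG_STATUS value into its three low flags."""
    return McgStatus(
        ripv=bool(value & 0b001),
        eipv=bool(value & 0b010),
        mcip=bool(value & 0b100),
    )
```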
15-42 Vol. 3 MACHINE-CHECK ARCHITECTURE When machine-check exceptions are enabled for the Pentium processor (MCE flag is set in control register CR4), the machine-check exception handler uses the RDMSR instruction to read the error type from the P5_MC_TYPE register and the machine check address from...
Vol. 3 15-43 MACHINE-CHECK ARCHITECTURE AND PCC flag in IA32_MCi_STATUS = 1 OR RIPV flag in IA32_MCG_STATUS = 0 (* execution is not restartable *) THEN RESTARTABILITY = FALSE; return RESTARTABILITY to calling procedure; FI; Save time-stamp counter and processor ID; Set IA32_MCi_STATUS to all 0s; Ex...
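The restartability test in the pseudocode above can be approximated as follows. This is a simplified sketch, not the manual's full algorithm: the start of the condition is cut off on this page, so the code only covers the visible part (PCC set in a valid bank, or RIPV clear). It assumes VAL is bit 63 and PCC is bit 57 of IA32_MCi_STATUS, and RIPV is bit 0 of IA32_MCG_STATUS. The function name is hypothetical.

```python
VAL = 1 << 63   # IA32_MCi_STATUS: register contents are valid (assumed bit position)
PCC = 1 << 57   # IA32_MCi_STATUS: processor context corrupt (assumed bit position)
RIPV = 1 << 0   # IA32_MCG_STATUS: restart instruction pointer valid


def is_restartable(mcg_status, bank_statuses):
    """Return False if RIPV is clear or any valid bank reports corrupted context."""
    if not mcg_status & RIPV:
        return False
    return not any((status & VAL) and (status & PCC) for status in bank_statuses)
```

A bank whose VAL flag is clear is ignored, since its other bits carry no meaning.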
15-44 Vol. 3 MACHINE-CHECK ARCHITECTURE mechanism to indicate the frequency of exceptions. A multiprocessing operating system stores the identity of the processor node incurring the exception using a unique identifier, such as the processor’s APIC ID (see Section 10.9, “Handling Interrupts”). Th...
Vol. 3 15-51 MACHINE-CHECK ARCHITECTURE before these errors are actually handled and processed by the MCE handler for attempted software error recovery. Example 15-5 gives pseudocode for a CMCI handler with UCR support. Example 15-5. Corrected Error Handler Pseudocode with UCR Support Corrected Erro...
Vol. 3 16-1 CHAPTER 16 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Intel 64 and IA-32 architectures provide debug facilities for use in debugging code and monitoring performance. These facilities are valuable for debugging application software, system software, and multitasking operating s...
16-4 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.2.1 Debug Address Registers (DR0-DR3) Each of the debug-address registers (DR0 through DR3) holds the 32-bit linear address of a breakpoint (see Figure 16-1). Breakpoint comparisons are made before physical address translation occur...
16-6 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 10 — Break on I/O reads or writes. 11 — Break on data reads or writes but not instruction fetches. When the DE flag is clear, the processor interprets the R/Wn bits the same as for the Intel386™ and Intel486™ processors, which is as fol...
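The R/Wn encodings listed above slot into DR7 alongside the LENn and enable bits. The sketch below builds a DR7 value for one breakpoint, assuming the standard layout: Ln and Gn in bits 2n and 2n+1, R/Wn in bits 16+4n and 17+4n, and LENn in bits 18+4n and 19+4n. The constant and function names are hypothetical, and writing the result to DR7 would still require a privileged MOV.

```python
RW_EXECUTE, RW_WRITE, RW_IO, RW_READ_WRITE = 0b00, 0b01, 0b10, 0b11  # RW_IO needs CR4.DE = 1
LEN_1, LEN_2, LEN_8, LEN_4 = 0b00, 0b01, 0b10, 0b11


def dr7_enable_breakpoint(dr7, n, rw, length, local=True):
    """Return dr7 with breakpoint n (0-3) configured and enabled."""
    if not 0 <= n <= 3:
        raise ValueError("breakpoint number must be 0-3")
    if rw == RW_EXECUTE and length != LEN_1:
        raise ValueError("instruction breakpoints must use a 1-byte length")
    shift = 16 + 4 * n
    dr7 &= ~(0xF << shift)                        # clear old R/Wn and LENn
    dr7 |= (rw | (length << 2)) << shift          # R/Wn in low 2 bits, LENn above
    dr7 |= 1 << (2 * n + (0 if local else 1))     # Ln or Gn enable bit
    return dr7
```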
16-8 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.2.6 Debug Registers and Intel® 64 Processors For Intel 64 architecture processors, debug registers DR0–DR7 are 64 bits. In 16-bit or 32-bit modes (protected mode and compatibility mode), writes to a debug register fill the upper 32...
16-14 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING OVERVIEW P6 family processors introduced the ability to set breakpoints on taken branches, interrupts, and exceptions, and to single-step from one branch to the next. This capabilit...
Vol. 3 16-15 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER in the last branch record (LBR) stack. For more information, see Section 16.5.1, “LBR Stack”. • BTF (single-step on branches) flag (bit 1) — When set, the processor treats the TF flag in the EFLAGS register as a “single-step on br...
16-16 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • FREEZE_LBRS_ON_PMI flag (bit 11) — When set, the LBR stack is frozen on a hardware PMI request (e.g. when a counter overflows and is configured to trigger PMI). • FREEZE_PERFMON_ON_PMI flag (bit 12) — When set, a PMI request clears ...
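The IA32_DEBUGCTL flags described above can be modeled as a bit mask. In the sketch below, the positions of BTF (bit 1), FREEZE_LBRS_ON_PMI (bit 11) and FREEZE_PERFMON_ON_PMI (bit 12) come from the text. The remaining positions (LBR 0, TR 6, BTS 7, BTINT 8, BTS_OFF_OS 9, BTS_OFF_USR 10) are assumptions based on the architectural layout. The helper name is hypothetical.

```python
from enum import IntFlag


class DebugCtl(IntFlag):
    LBR = 1 << 0
    BTF = 1 << 1
    TR = 1 << 6
    BTS = 1 << 7
    BTINT = 1 << 8
    BTS_OFF_OS = 1 << 9
    BTS_OFF_USR = 1 << 10
    FREEZE_LBRS_ON_PMI = 1 << 11
    FREEZE_PERFMON_ON_PMI = 1 << 12


def describe_debugctl(value):
    """List the names of the known flags set in a raw IA32_DEBUGCTL value."""
    return [flag.name for flag in DebugCtl if value & flag]
```

Setting TR and BTS together, as branch trace storing requires, gives `int(DebugCtl.TR | DebugCtl.BTS)`, which is 0xC0 under these assumptions.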
Vol. 3 16-17 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER a bug to a particular block of code before instruction single-stepping further narrows the search. If the BTF flag is set when the processor generates a debug exception, the processor clears the BTF flag along with the TF flag. The de...
16-18 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.6 CPL-Qualified Branch Trace Mechanism CPL-qualified branch trace mechanism is available to a subset of Intel 64 and IA-32 processors that support the branch trace storing mechanism. The processor supports the CPL-qualified branc...
Vol. 3 16-19 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8 LBR Stack The last branch record stack and top-of-stack (TOS) pointer MSRs are supported across Intel 64 and IA-32 processor families. However, the number of MSRs in the LBR stack and the valid range of TOS pointer value can va...
16-20 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8.1 LBR Stack and Intel® 64 Processors LBR MSRs are 64 bits. If IA-32e mode is disabled, only the lower 32 bits of the address are recorded. If IA-32e mode is enabled, the processor writes 64-bit values into the MSR. In 64-bit mo...
Vol. 3 16-21 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8.3 Last Exception Records and Intel 64 Architecture Intel 64 and IA-32 processors also provide MSRs that store the branch record for the last branch taken prior to an exception or an interrupt. The location of the last excep-tio...
16-30 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 2. Set the TR and BTS flags in the IA32_DEBUGCTL for Intel Core Solo and Intel Core Duo processors or later processors (or MSR_DEBUGCTLA MSR for processors based on Intel NetBurst Microarchitecture; or MSR_DEBUGCTLB for Pentium M proc...
16-32 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • The ISR must clear the mask bit in the performance counter LVT entry. • The ISR must re-enable the counters to count via IA32_PERF_GLOBAL_CTRL/IA32_PERF_GLOBAL_OVF_CTRL if it is servicing an overflow PMI due to PEBS (or via CCCR's E...
Vol. 3 16-35 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Processors based on Intel microarchitecture (Nehalem) have an LBR MSR Stack as shown in Table 16-8. Table 16-8. LBR Stack Size and TOS Pointer Range 16.6.2 Filtering of Last Branch Records MSR_LBR_SELECT is cleared to zero at RESET, a...
16-36 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.7 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PROCESSORS BASED ON INTEL NETBURST® MICROARCHITECTURE) Pentium 4 and Intel Xeon processors based on Intel NetBurst microarchitecture provide the following methods for recording ta...
16-38 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • BTS (branch trace store) flag (bit 3) — When set, enables the BTS facilities to log BTMs to a memory-resident BTS buffer that is part of the DS save area. See Section 16.4.9, “BTS and DS Save Area.” • BTINT (branch trace interrupt) ...
Vol. 3 16-39 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER LBR MSR pair) that contains the most recent (last) branch record placed on the stack. Prior to placing a new branch record on the stack, the TOS is incremented by 1. When the TOS pointer reaches its maximum value, it wraps around to 0....
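The TOS behavior described above is that of a circular buffer: increment first, wrap to 0 at the maximum, then write. The sketch below models it in software, assuming a four-entry stack for illustration. The class and method names are hypothetical, and real code would read the LBR and TOS MSRs instead.

```python
class LbrStack:
    """Software model of an LBR stack with a wrapping top-of-stack pointer."""

    def __init__(self, size=4):
        self.size = size
        self.records = [None] * size
        self.tos = 0

    def push(self, from_ip, to_ip):
        # TOS is incremented before the new record is placed, wrapping to 0.
        self.tos = (self.tos + 1) % self.size
        self.records[self.tos] = (from_ip, to_ip)

    def newest_first(self):
        """Walk backwards from TOS, skipping slots that were never written."""
        ordered = []
        for i in range(self.size):
            record = self.records[(self.tos - i) % self.size]
            if record is not None:
                ordered.append(record)
        return ordered
```

Once more branches have been recorded than the stack holds, the oldest records are overwritten.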
16-40 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Additional information is saved if an exception or interrupt occurs in conjunction with a branch instruction. If a branch instruction generates a trap type exception, two branch records are stored in the LBR stack: a branch record for...
16-42 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • Debug store (DS) feature flag (bit 21), returned by the CPUID instruction — Indicates that the processor provides the debug store (DS) mechanism, which allows BTMs to be stored in a memory-resident BTS buffer. See Section 16.4.5, “B...
Vol. 3 16-43 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.9 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS) Like the Pentium 4 and Intel Xeon processor family, Pentium M processors provide last branch interrupt and exception recording. The capability operates almost...
Vol. 3 16-45 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER For more detail on these capabilities, see Section 16.7.3, “Last Exception Records,” and Appendix B.7, “MSRs In the Pentium M Processor.” 16.10 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) The P6 family proce...
16-46 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • BTF (single-step on branches) flag (bit 1) — When set, the processor treats the TF flag in the EFLAGS register as a “single-step on branches” flag. See Section 16.4.3, “Single-Stepping on Branches, Exceptions, and Interrupts.” • PBi...
16-48 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.11 TIME-STAMP COUNTER The Intel 64 and IA-32 architectures (beginning with the Pentium processor) define a time-stamp counter mechanism that can be used to monitor and identify the relative time occurrence of processor events. The ...
Vol. 3 16-49 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER NOTE To determine average processor clock frequency, Intel recommends the use of EMON logic to count processor core clocks over the period of time for which the average is required. See Section 30.10, “Counting Clocks,” and Appendix A...
Vol. 3 17-1 CHAPTER 17 8086 EMULATION IA-32 processors (beginning with the Intel386 processor) provide two ways to execute new or legacy programs that are assembled and/or compiled to run on an Intel 8086 processor: • Real-address mode. • Virtual-8086 mode. Figure 2-3 shows the relationship of these...
Vol. 3 17-3 8086 EMULATION • A single interrupt table, called the “interrupt vector table” or “interrupt table,” is provided for handling interrupts and exceptions (see Figure 17-2). The interrupt table (which has 4-byte entries) takes the place of the interrupt descriptor table (IDT, with 8-byte en...
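With 4-byte entries, locating a real-address-mode handler is simple arithmetic. The sketch below assumes the conventional entry layout (16-bit offset in the low word, 16-bit segment in the high word, little-endian) and a table that starts at the given base, 0 by default. The function names and the `memory` buffer are hypothetical stand-ins for physical memory.

```python
import struct


def ivt_entry_address(vector, ivt_base=0):
    """Address of the 4-byte interrupt-table entry for a vector (0-255)."""
    return ivt_base + 4 * vector


def read_handler(memory, vector, ivt_base=0):
    """Return (segment, offset) of the handler stored for a vector."""
    offset, segment = struct.unpack_from("<HH", memory, ivt_entry_address(vector, ivt_base))
    return segment, offset


def linear_address(segment, offset):
    """Real-address-mode translation: segment shifted left 4 bits plus offset."""
    return (segment << 4) + offset
```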
17-8 Vol. 3 8086 EMULATION 17.2 VIRTUAL-8086 MODE Virtual-8086 mode is actually a special type of a task that runs in protected mode. When the operating-system or executive switches to a virtual-8086-mode task, the processor emulates an Intel 8086 processor. The execution environment of the processo...
Vol. 3 17-9 8086 EMULATION 17.2.1 Enabling Virtual-8086 Mode The processor runs in virtual-8086 mode when the VM (virtual machine) flag in the EFLAGS register is set. This flag can only be set when the processor switches to a new protected-mode task or resumes virtual-8086 mode via an IRET instructi...
17-10 Vol. 3 8086 EMULATION The processor enters virtual-8086 mode to run the 8086 program and returns to protected mode to run the virtual-8086 monitor. The virtual-8086 monitor is a 32-bit protected-mode code module that runs at a CPL of 0. The monitor consists of initialization, interrupt- and exc...
Vol. 3 17-11 8086 EMULATION Paging is not necessary for a single virtual-8086-mode task, but paging is useful or necessary in the following situations: • When running multiple virtual-8086-mode tasks. Here, paging allows the lower 1 MByte of the linear address space for each virtual-8086-mode task t...
Vol. 3 17-15 8086 EMULATION execution sequence after verifying that it was entered as a result of a HLT execution. See Section 17.3, “Interrupt and Exception Handling in Virtual-8086 Mode”, for information on leaving virtual-8086 mode to handle an interrupt or exception generated in virtual-8086 mo...
Vol. 3 17-17 8086 EMULATION In virtual-8086 mode, the interrupts and exceptions are divided into three classes for the purposes of handling: • Class 1 — All processor-generated exceptions and all hardware interrupts, including the NMI interrupt and the hardware interrupts sent to the processor’s ext...
17-18 Vol. 3 8086 EMULATION in the previous paragraphs. These sections describe three possible types of interrupt and exception handlers: • Protected-mode interrupt and exceptions handlers — These are the standard handlers that the processor calls through the protected-mode IDT. • Virtual-8086 monit...
17-20 Vol. 3 8086 EMULATION Interrupt and exception handlers can examine the VM flag on the stack to determine if the interrupted procedure was running in virtual-8086 mode. If so, the interrupt or exception can be handled in one of three ways: • The protected-mode interrupt or exception handler tha...
Vol. 3 17-21 8086 EMULATION 2. Store the EFLAGS (low-order 16 bits only), CS and EIP values of the 8086 program on the privilege-level 3 stack. This is the stack that the virtual-8086-mode task is using. (The 8086 handler may use or modify this information.) 3. Change the return link on the privileg...
17-24 Vol. 3 8086 EMULATION 5. Upon returning to virtual-8086 mode, the processor continues execution of the 8086 program. When the 8086 program is ready to receive maskable hardware interrupts, it executes the STI instruction to set the VIF flag (enabling maskable hardware interrupts). Prior to set...
Vol. 3 17-27 8086 EMULATION Redirecting software interrupts back to the 8086 program potentially speeds up interrupt handling because a switch back and forth between virtual-8086 mode and protected mode is not required. This latter interrupt-handling technique is particularly useful for 8086 operat...
18-4 Vol. 3 MIXING 16-BIT AND 32-BIT CODE 18.3 SHARING DATA AMONG MIXED-SIZE CODE SEGMENTS Data segments can be accessed from both 16-bit and 32-bit code segments. When a data segment that is larger than 64 KBytes is to be shared among 16- and 32-bit code segments, the data that is to be accessed fr...
Vol. 3 18-9 MIXING 16-BIT AND 32-BIT CODE 18.4.5 Writing Interface Procedures Placing interface code between 32-bit and 16-bit procedures can be the solution to the following interface problems: • Allowing procedures in 16-bit code segments to call procedures with offsets greater than FFFFH in 32-bi...
Vol. 3 19-1 CHAPTER 19 ARCHITECTURE COMPATIBILITY Intel 64 and IA-32 processors are binary compatible. Compatibility means that, within limited constraints, programs that execute on previous generations of processors will produce identical results when executed on later processors. The compati-bili...
19-2 Vol. 3 ARCHITECTURE COMPATIBILITY • Pentium D Processors — A family of dual-core Intel 64 processors that provides two processor cores in a physical package. Each core is based on the Intel NetBurst microarchitecture. • Pentium Processor Extreme Editions — A family of dual-core Intel 64 process...
Vol. 3 19-3 ARCHITECTURE COMPATIBILITY original value results in a general-protection exception (#GP). So, programs that execute on the P6 family and Pentium processors cannot erroneously enable functions that may be implemented in future IA-32 processors. The P6 family and Pentium processors do no...
19-4 Vol. 3 ARCHITECTURE COMPATIBILITY control and status register. These instructions and registers are designed to allow SIMD computations to be made on single-precision floating-point numbers. Several of these new instructions also operate in the MMX registers. SSE instructions and registers are ...
Vol. 3 19-5 ARCHITECTURE COMPATIBILITY 19.10 INTEL HYPER-THREADING TECHNOLOGY Intel Hyper-Threading Technology provides two logical processors that can execute two separate code streams (called threads) concurrently by using shared resources in a single processor core or in a physical package. This ...
19-6 Vol. 3 ARCHITECTURE COMPATIBILITY 19.13.1 Instructions Added Prior to the Pentium Processor The following instructions were added in the Intel486 processor: • BSWAP (byte swap) instruction. • XADD (exchange and add) instruction. • CMPXCHG (compare and exchange) instruction. • INVD (invalidate ...
19-8 Vol. 3 ARCHITECTURE COMPATIBILITY The following flags were added to the EFLAGS register in the Pentium processor: • VIF (virtual interrupt flag), bit 19. • VIP (virtual interrupt pending), bit 20. • ID (identification flag), bit 21. The AC flag (bit 18) was added to the EFLAGS register in the I...
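The EFLAGS bit positions listed above lend themselves to a small lookup. The sketch below uses the positions given in the text (AC 18, VIF 19, VIP 20, ID 21) and adds VM at bit 17, which is an assumption taken from the architectural EFLAGS layout rather than from this page. The function name is hypothetical.

```python
# Bit positions of the upper EFLAGS flags; VM (bit 17) is an assumed addition.
EFLAGS_BITS = {"VM": 17, "AC": 18, "VIF": 19, "VIP": 20, "ID": 21}


def eflags_set(eflags):
    """Return the names of the listed flags that are set, in ascending bit order."""
    return [name for name, bit in EFLAGS_BITS.items() if (eflags >> bit) & 1]
```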
Vol. 3 19-9 ARCHITECTURE COMPATIBILITY XCHG BP, [BP] This code functions as the 8086 processor PUSH SP instruction on the P6 family, Pentium, Intel486, Intel386, and Intel 286 processors. 19.17.2 EFLAGS Pushed on the Stack The setting of the stored values of bits 12 through 15 (which includes the IO...
19-12 Vol. 3 ARCHITECTURE COMPATIBILITY Software written to run on a 16-bit IA-32 math coprocessor may not operate correctly on a 16-bit x87 FPU, if it uses the FLDENV, FRSTOR, or FXRSTOR instructions to change tags to values (other than to empty) that are different from actual register contents. Th...
Vol. 3 19-13 ARCHITECTURE COMPATIBILITY ters. The only effect may be in how software handles the tags in the tag word (see also: Section 19.18.4, “x87 FPU Tag Word”). 19.18.6 Floating-Point Exceptions This section identifies the implementation differences in exception handling for floating-point ins...
19-22 Vol. 3 ARCHITECTURE COMPATIBILITY 19.20 FPU AND MATH COPROCESSOR INITIALIZATION Table 9-1 shows the states of the FPUs in the P6 family, Pentium, Intel486 processors and of the Intel 387 math coprocessor and Intel 287 coprocessor following a power-up, reset, or INIT, or following the execution...
Vol. 3 19-23 ARCHITECTURE COMPATIBILITY Following is an example code sequence to initialize the system and check for the presence of Intel486 SX processor/Intel 487 SX math coprocessor.
    fninit
    fstcw mem_loc
    mov ax, mem_loc
    cmp ax, 037fh
    jz Intel487_SX_Math_CoProcessor_present ;ax=037fh
    jmp Intel486_SX_m...
Vol. 3 19-25 ARCHITECTURE COMPATIBILITY • NE — Numeric error. Enables the normal mechanism for reporting floating-point numeric errors. • WP — Write protect. Write-protects read-only pages against supervisor-mode accesses. • AM — Alignment mask. Controls whether alignment checking is performed. Oper...
Vol. 3 19-27 ARCHITECTURE COMPATIBILITY 19.22.4 Changes in Segment Descriptor Loads On the Intel386 processor, loading a segment descriptor always causes a locked read and write to set the accessed bit of the descriptor. On the P6 family, Pentium, and Intel486 processors, the locked read and write o...
19-28 Vol. 3 ARCHITECTURE COMPATIBILITY are enabled (the DE flag is set), attempts to reference registers DR4 or DR5 will result in an invalid-opcode exception (#UD). 19.24 RECOGNITION OF BREAKPOINTS For the Pentium processor, it is recommended that debuggers execute the LGDT instruction before retu...
19-30 Vol. 3 ARCHITECTURE COMPATIBILITY 19.25.1 Machine-Check Architecture The Pentium Pro processor introduced a new architecture to the IA-32 for handling and reporting on machine-check exceptions. This machine-check architecture (described in detail in Chapter 15, “Machine-Check Architecture”) gr...
Vol. 3 19-31 ARCHITECTURE COMPATIBILITY 19.26.3 IDT Limit The LIDT instruction can be used to set a limit on the size of the IDT. A double-fault exception (#DF) is generated if an interrupt or exception attempts to read a vector beyond the limit. Shutdown then occurs on the 32-bit IA-32 processors i...
19-32 Vol. 3 ARCHITECTURE COMPATIBILITY • The remote read delivery mode provided in the 82489DX and local APIC for Pentium processors is not supported in the local APIC in the Pentium 4, Intel Xeon, and P6 family processors. • For the 82489DX, in the lowest priority delivery mode, all the target loc...
Vol. 3 19-33 ARCHITECTURE COMPATIBILITY 19.28.1 P6 Family and Pentium Processor TSS When the virtual mode extensions are enabled (by setting the VME flag in control register CR4), the TSS in the P6 family and Pentium processors contain an interrupt redirection bit map, which is used in virtual-8086 ...
19-36 Vol. 3 ARCHITECTURE COMPATIBILITY 19.29.2 Disabling the L3 Cache A unified third-level (L3) cache in processors based on Intel NetBurst microarchitecture (see Section 11.1, “Internal Caches, TLBs, and Buffers”) provides the third-level cache disable flag, bit 6 of the IA32_MISC_ENABLE MSR. Th...
19-38 Vol. 3 ARCHITECTURE COMPATIBILITY • The initial stack pointer is FFFCH (32-bit operand) or FFFEH (16-bit operand) and will wrap around to 0H as a result of the POP operation. The result of the memory write is implementation-specific. For example, in P6 family processors, the result of the memo...
Vol. 3 19-39 ARCHITECTURE COMPATIBILITY 19.32 MIXING 16- AND 32-BIT SEGMENTS The features of the 16-bit Intel 286 processor are an object-code compatible subset of those of the 32-bit IA-32 processors. The D (default operation size) flag in segment descriptors indicates whether the processor treats ...
19-40 Vol. 3 ARCHITECTURE COMPATIBILITY 19.33.1 Segment Wraparound On the 8086 processor, an attempt to access a memory operand that crosses offset 65,535 or 0FFFFH or offset 0 (for example, moving a word to offset 65,535 or pushing a word when the stack pointer is set to 1) causes the offset to wra...
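The two 8086 wraparound cases mentioned above (a word access at offset 65,535 and a push with the stack pointer at 1) both reduce to 16-bit modular arithmetic. The sketch below models only the 8086 behavior. The function names are hypothetical.

```python
def word_access_offsets_8086(offset):
    """Offsets of the two bytes touched by a word access, wrapping within 64 KBytes."""
    return [offset & 0xFFFF, (offset + 1) & 0xFFFF]


def push_word_sp_8086(sp):
    """Stack pointer after pushing one word, wrapping modulo 65,536."""
    return (sp - 2) & 0xFFFF
```

A word moved to offset 0FFFFH therefore lands in bytes 0FFFFH and 0 of the same segment in this model.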
19-44 Vol. 3 ARCHITECTURE COMPATIBILITY Earlier IA-32 processors (such as the Intel486 and Pentium processors) used the KEN# (cache enable) pin and external logic to maintain an external memory map and signal cacheable accesses to the processor. The MTRR mechanism simplifies hardware designs by eli...
Vol. 3 19-45 ARCHITECTURE COMPATIBILITY The performance-monitoring counters are useful for debugging programs, optimizing code, diagnosing system failures, or refining hardware designs. See Chapter 30, “Performance Monitoring,” for more information on these counters. 19.38 TWO WAYS TO RUN INTEL 286 ...