Page 2 - ii
ii Vol. 3A INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUME...
Page 3 - iii; CONTENTS; CHAPTER 1
Vol. 3A iii CONTENTS PAGE CHAPTER 1 ABOUT THIS MANUAL; 1.1 PROCESSORS COVERED IN THIS MANUAL, p. 1-1; 1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE, p. 1-3; 1...
Page 4 - iv; CHAPTER 3
CONTENTS iv Vol. 3A PAGE 2.7.5 Controlling the Processor, p. 2-31; 2.7.6 Reading Performance-Monitoring and Time-Stamp Counters, p. 2-32; 2.7.6.1 Reading Coun...
Page 5 - CHAPTER 5; FIELDS AND FLAGS USED FOR SEGMENT-LEVEL AND
Vol. 3A v CONTENTS PAGE 4.9.3 Caching Paging-Related Information about Memory Typing, p. 4-38; 4.10 CACHING TRANSLATION INFORMATION, p. 4-38; 4.10.1 ...
Page 6 - CHAPTER 6
CONTENTS vi Vol. 3A PAGE 5.8.7.1 SYSENTER and SYSEXIT Instructions in IA-32e Mode, p. 5-31; 5.8.8 Fast System Calls in 64-bit Mode, p. 5-32; 5.9 PRIVILEGED INSTRUCT...
Page 7 - vii; CHAPTER 7
Vol. 3A vii CONTENTS PAGE 6.14 EXCEPTION AND INTERRUPT HANDLING IN 64-BIT MODE, p. 6-22; 6.14.1 64-Bit Mode IDT, p. 6-23; 6.14.2 ...
Page 8 - CHAPTER 8
CONTENTS viii Vol. 3A PAGE CHAPTER 8 MULTIPLE-PROCESSOR MANAGEMENT; 8.1 LOCKED ATOMIC OPERATIONS, p. 8-2; 8.1.1 Guaranteed Atomic Operations ...
Page 9 - ix; CHAPTER 9
Vol. 3A ix CONTENTS PAGE 8.7.9 Memory Ordering, p. 8-42; 8.7.10 Serializing Instructions ...
Page 10 - Pentium 4, Intel Xeon, and P6 Family Processor
CONTENTS x Vol. 3A PAGE 9.5 MEMORY TYPE RANGE REGISTERS (MTRRS), p. 9-9; 9.6 INITIALIZING SSE/SSE2/SSE3/SSSE3 EXTENSIONS, p. 9-10; 9.7 SOFTWARE INITIALIZATION F...
Page 11 - xi; ADVANCED PROGRAMMABLE; THE INTEL
Vol. 3A xi CONTENTS PAGE CHAPTER 10 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC); 10.1 LOCAL AND I/O APIC OVERVIEW, p. 10-1; 10.2 SYSTEM BUS VS. APIC BUS ...
Page 12 - xii; APIC BUS MESSAGE PASSING MECHANISM AND; MEMORY CACHE CONTROL
CONTENTS xii Vol. 3A PAGE 10.7.2.4 Deriving Logical x2APIC ID from the Local x2APIC ID, p. 10-50; 10.7.2.5 Broadcast/Self Delivery Mode, p. 10-51; 10.7.2.6 Lowest Prior...
Page 13 - xiii; INTEL; PROVIDING OPERATING SYSTEM SUPPORT FOR
Vol. 3A xiii CONTENTS PAGE 11.11 MEMORY TYPE RANGE REGISTERS (MTRRS), p. 11-30; 11.11.1 MTRR Feature Identification, p. 11-32; 11.11....
Page 14 - POWER AND THERMAL MANAGEMENT
CONTENTS xiv Vol. 3A PAGE 13.1.6.1 Numeric Error flag and IGNNE#, p. 13-8; 13.2 EMULATION OF SSE/SSE2/SSE3/SSSE3/SSE4 EXTENSIONS, p. 13-8; 13.3 SAVING AND RESTORING TH...
Page 15 - xv; Mapping of the Pentium
Vol. 3A xv CONTENTS PAGE 15.3 MACHINE-CHECK MSRS, p. 15-2; 15.3.1 Machine-Check Global Control MSRs ...
Page 16 - DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER
CONTENTS xvi Vol. 3A PAGE CHAPTER 16 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER; 16.1 OVERVIEW OF DEBUG SUPPORT FACILITIES, p. 16-1; 16.2 DEBUG REGISTERS ...
Page 17 - xvii; INTERRUPT AND EXCEPTION HANDLING
Vol. 3A xvii CONTENTS PAGE 16.9 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS), p. 16-43; 16.10 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) ...
Page 18 - xviii; ARCHITECTURE COMPATIBILITY
CONTENTS xviii Vol. 3A PAGE CHAPTER 18 MIXING 16-BIT AND 32-BIT CODE; 18.1 DEFINING 16-BIT AND 32-BIT PROGRAM MODULES, p. 18-2; 18.2 MIXING 16-BIT AND 32-BIT OPERATIONS WITHIN A CODE SEGMENT, p. 18-2; 18.3 SHARI...
Page 19 - xix; Intel
Vol. 3A xix CONTENTS PAGE 19.18.6.3 Numeric Underflow Exception (#U), p. 19-14; 19.18.6.4 Exception Precedence, p. 19-14 ...
Page 20 - xx; New Features Incorporated in the Local APIC for the P6 Family; INTRODUCTION TO VIRTUAL-MACHINE EXTENSIONS
CONTENTS xx Vol. 3A PAGE 19.25 EXCEPTIONS AND/OR EXCEPTION CONDITIONS, p. 19-28; 19.25.1 Machine-Check Architecture, p. 19-30; 19.25.2 Pri...
Page 21 - xxi; VIRTUAL-MACHINE CONTROL STRUCTURES
Vol. 3A xxi CONTENTS PAGE 20.5 VIRTUAL-MACHINE CONTROL STRUCTURE, p. 20-3; 20.6 DISCOVERING SUPPORT FOR VMX, p. 20-3; 20.7 ENABLIN...
Page 22 - xxii; VMX NON-ROOT OPERATION
CONTENTS xxii Vol. 3A PAGE CHAPTER 22 VMX NON-ROOT OPERATION; 22.1 INSTRUCTIONS THAT CAUSE VM EXITS, p. 22-1; 22.1.1 Relative Priority of Faults and VM Exits ...
Page 23 - xxiii; VM EXITS
Vol. 3A xxiii CONTENTS PAGE 23.3.1.3 Checks on Guest Descriptor-Table Registers, p. 23-15; 23.3.1.4 Checks on Guest RIP and RFLAGS, p. 23-15; 23.3.1.5 Checks on G...
Page 24 - xxiv; VMX SUPPORT FOR ADDRESS TRANSLATION; SWITCHING BETWEEN SMM AND THE OTHER
CONTENTS xxiv Vol. 3A PAGE 24.5.6 Clearing Address-Range Monitoring, p. 24-37; 24.6 LOADING MSRS ...
Page 25 - xxv; VIRTUAL-MACHINE MONITOR PROGRAMMING CONSIDERATIONS
Vol. 3A xxv CONTENTS PAGE 26.11 SMBASE RELOCATION, p. 26-19; 26.11.1 Relocating SMRAM to an Address Above 1 MByte, p. 26-20; 26.12 I...
Page 26 - xxvi; VIRTUALIZATION OF SYSTEM RESOURCES
CONTENTS xxvi Vol. 3A PAGE 27.7.1 Handling VM Exits Due to Exceptions, p. 27-11; 27.7.1.1 Reflecting Exceptions to Guest Software, p. 27-11; 27.7.1.2 Resum...
Page 27 - HANDLING BOUNDARY CONDITIONS IN A VIRTUAL MACHINE MONITOR
Vol. 3A xxvii CONTENTS PAGE CHAPTER 29 HANDLING BOUNDARY CONDITIONS IN A VIRTUAL MACHINE MONITOR; 29.1 OVERVIEW, p. 29-1; 29.2 INTERRUPT HANDLING IN VMX OPERATION ...
Page 29 - APPENDIX A
Vol. 3A xxix CONTENTS PAGE 30.10.3 Incrementing the Time-Stamp Counter, p. 30-77; 30.10.4 Non-Halted Reference Clockticks, p. 30-77; 30.1...
Page 31 - xxxi; Processor Model Specific Error Code Field
Vol. 3A xxxi CONTENTS PAGE E.4.3 Processor Model Specific Error Code Field, p. E-21; E.4.3.1 MCA Error Type A: L3 Error, p. E-21; E...
Page 32 - xxxii; APPENDIX I
CONTENTS xxxii Vol. 3A PAGE H.4.2 Natural-Width Read-Only Data Fields, p. H-10; H.4.3 Natural-Width Guest-State Fields, p. H-10 ...
Page 33 - xxxiii; FIGURES; Memory Management Convention That Assigns a Page Table
Vol. 3A xxxiii CONTENTS PAGE FIGURES; Figure 1-1. Bit and Byte Order, p. 1-7; Figure 1-2. Syntax for CPUID, CR, and MSR Data Presentation ...
Page 34 - xxxiv
CONTENTS xxxiv Vol. 3A PAGE Figure 6-2. IDT Gate Descriptors, p. 6-15; Figure 6-3. Interrupt Procedure Call ...
Page 35 - xxxv
Vol. 3A xxxv CONTENTS PAGE Figure 10-14. Error Status Register (ESR) in x2APIC Mode, p. 10-36; Figure 10-15. Divide Configuration Register, p. 10-37; Fi...
Page 37 - xxxvii
Vol. 3A xxxvii CONTENTS PAGE Figure 29-1. Host External Interrupts and Guest Virtual Interrupts, p. 29-5; Figure 30-1. Layout of IA32_PERFEVTSELx MSRs, p. 30-4; Figure 30-2. Layout...
Page 38 - TABLES
CONTENTS xxxviii Vol. 3A PAGE TABLES; Table 2-1. Action Taken By x87 FPU Instructions for Different Combinations of EM, MP, and TS, p. 2-21; Table 2-2. Summary of System Instructions, p. 2-27; Table 3-1. Code- and Data-Segme...
Page 41 - xli; EPT Page Directory25-6
Vol. 3A xli CONTENTS PAGE Table 21-4. Format of Pending-Debug-Exceptions, p. 21-8; Table 21-5. Definitions of Pin-Based VM-Execution Controls, p. 21-11; Table 21-6. Definiti...
Page 44 - xliv
CONTENTS xliv Vol. 3A PAGE Table F-2. Short Message (21 Cycles), p. F-2; Table F-3. Non-Focused Lowest Priority Message (34 Cycles), p. F-3; Table F-...
Page 45 - PROCESSORS COVERED IN THIS MANUAL
Vol. 3 1-1 CHAPTER 1 ABOUT THIS MANUAL The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 (order number 253668) and the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B: System Programming Guide, Part 2 (order number...
Page 47 - ABOUT THIS MANUAL; OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE
Vol. 3 1-3 ABOUT THIS MANUAL The Intel® Core™ i7 processor and the Intel® Core™ i5 processor are based on the Intel® microarchitecture (Nehalem) and support Intel 64 architecture. Processors based on the Next Generation Intel Processor, codenamed Westmere, support Intel 64 architecture. P6 fam...
Page 48 - MMXTM Technology System Programming. Describes; those aspects of the Intel; Describes the machine-check
1-4 Vol. 3 ABOUT THIS MANUAL Chapter 6 — Interrupt and Exception Handling. Describes the basic interrupt mechanisms defined in the Intel 64 and IA-32 architectures, shows how interrupts and exceptions relate to protection, and describes how the architecture handles each exception type. Reference inf...
Page 50 - CONVENTIONS; Bit and Byte Order
1-6 Vol. 3 ABOUT THIS MANUAL Chapter 30 — Performance Monitoring. Describes the Intel 64 and IA-32 architectures’ facilities for monitoring performance. Appendix A — Performance-Monitoring Events. Lists architectural performance events. Non-architectural performance events (i.e. model-specific event...
Page 51 - Reserved Bits and Software Compatibility; NOTE
Vol. 3 1-7 ABOUT THIS MANUAL means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions. 1.3.2 Reserved Bits and Software Compatibility In many register and memory layout descriptions, certain bits are marked as reserved. When bits are m...
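The byte-numbering convention above (byte 0 is the least significant byte) can be illustrated with a short sketch. The value 0x12345678 and the helper names `byte_at` and `bit_at` are assumptions chosen for illustration; they do not come from the manual.

```python
# Arbitrary example value (an assumption for illustration).
VALUE = 0x12345678


def byte_at(value: int, n: int) -> int:
    """Return byte n of value, where byte 0 is the least significant byte."""
    return (value >> (8 * n)) & 0xFF


def bit_at(value: int, n: int) -> int:
    """Return bit n of value, where bit 0 is the least significant bit."""
    return (value >> n) & 1


# Little-endian memory image: byte 0 is stored at the lowest address.
MEMORY_IMAGE = VALUE.to_bytes(4, byteorder="little")
```

Indexing `MEMORY_IMAGE` by address then yields the same bytes as `byte_at` for the same index.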
Page 52 - Operands; A label is an identifier which is followed by a colon.; Hexadecimal and Binary Numbers
1-8 Vol. 3 ABOUT THIS MANUAL 1.3.3 Instruction Operands When instructions are represented symbolically, a subset of assembly language is used. In this subset, an instruction has the following format: label: mnemonic argument1, argument2, argument3 where: • A label is an identifier which is followed ...
Page 55 - LITERATURE
Vol. 3 1-11 ABOUT THIS MANUAL This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions which produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown ...
Page 58 - SYSTEM ARCHITECTURE OVERVIEW; OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE
2-2 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW initiates the switch from real-address mode to protected mode. If IA-32e mode operation is desired, software also initiates a switch from protected mode to IA-32e mode. 2.1 OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE System-level architecture consists of a set ...
Page 61 - Global and Local Descriptor Tables; Global and Local Descriptor Tables in IA-32e Mode; System Segments, Segment Descriptors, and Gates
Vol. 3 2-5 SYSTEM ARCHITECTURE OVERVIEW 2.1.1 Global and Local Descriptor Tables When operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT) as shown in Figure 2-1. These tables contain entries called segment...
Page 62 - Task-State Segments and Task Gates
2-6 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The architecture also defines a set of special descriptors called gates (call gates, interrupt gates, trap gates, and task gates). These provide protected gateways to system procedures and handlers that may operate at a different privilege level than applicati...
Page 63 - Interrupt and Exception Handling; Interrupt and Exception Handling IA-32e Mode
Vol. 3 2-7 SYSTEM ARCHITECTURE OVERVIEW 2. Loads the task register with the segment selector for the new task. 3. Accesses the new TSS through a segment descriptor in the GDT. 4. Loads the state of the new task from the new TSS into the general-purpose registers, the segment registers, the LDTR, contr...
Page 64 - Management; Memory Management in IA-32e Mode
2-8 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The IDTR register is expanded to hold a 64-bit base address. Task gates are not supported. 2.1.5 Memory Management System architecture supports either direct physical addressing of memory or virtual memory (through paging). When physical addressing is used, a ...
Page 65 - Registers; System Registers in IA-32e Mode
Vol. 3 2-9 SYSTEM ARCHITECTURE OVERVIEW 2.1.6 System Registers To assist in initializing the processor and controlling system operations, the system architecture provides system flags in the EFLAGS register and several system registers: • The system flags and IOPL field in the EFLAGS register contro...
Page 66 - IA32_KernelGSbase — Used by SWAPGS instruction.; Other System Resources; MODES OF OPERATION
2-10 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW On systems that support IA-32e mode, the extended feature enable register (IA32_EFER) is available. This model-specific register controls activation of IA-32e mode and other IA-32e mode operations. In addition, there are several model-specific registers that ...
Page 67 - Figure 2-3. Transitions Among the Processor’s Operating Modes; System
Vol. 3 2-11 SYSTEM ARCHITECTURE OVERVIEW running program or task. SMM-specific code may then be executed transparently. Upon returning from SMM, the processor is placed back into its state prior to the SMI. • Virtual-8086 mode — In protected mode, the processor supports a quasi-operating mode known ...
Page 68 - SYSTEM FLAGS AND FIELDS IN THE EFLAGS
2-12 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The VM flag in the EFLAGS register determines whether the processor is operating in protected mode or virtual-8086 mode. Transitions between protected mode and virtual-8086 mode are generally carried out as part of a task switch or a return from an interrupt ...
Page 69 - Figure 2-4. System Flags in the EFLAGS Register
Vol. 3 2-13 SYSTEM ARCHITECTURE OVERVIEW IF Interrupt enable (bit 9) — Controls the response of the processor to maskable hardware interrupt requests (see also: Section 6.3.2, “Maskable Hardware Interrupts”). The flag is set to respond to maskable hardware interrupts; cleared to inhibit maskable har...
Page 71 - System Flags and Fields in IA-32e Mode; REGISTERS
Vol. 3 2-15 SYSTEM ARCHITECTURE OVERVIEW VIP Virtual interrupt pending (bit 20) — Set by software to indicate that an interrupt is pending; cleared to indicate that no interrupt is pending. This flag is used in conjunction with the VIF flag. The processor reads this flag but never modifies it. The p...
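As a minimal sketch, the system flags described above can be extracted from a raw EFLAGS value with simple shifts and masks. The function name `decode_system_flags` is hypothetical. The bit positions for IF (bit 9) and VIP (bit 20) are quoted in the text above; the remaining positions are the standard EFLAGS layout and are included as an assumption about what Figure 2-4 shows.

```python
# Single-bit system flags and their bit positions in EFLAGS.
SYSTEM_FLAG_BITS = {
    "TF": 8,    # trap
    "IF": 9,    # interrupt enable
    "NT": 14,   # nested task
    "RF": 16,   # resume
    "VM": 17,   # virtual-8086 mode
    "AC": 18,   # alignment check
    "VIF": 19,  # virtual interrupt
    "VIP": 20,  # virtual interrupt pending
    "ID": 21,   # identification
}
IOPL_SHIFT = 12  # the two-bit IOPL field occupies bits 13:12


def decode_system_flags(eflags: int) -> dict:
    """Return the system flags and the IOPL field of an EFLAGS value."""
    decoded = {name: bool((eflags >> bit) & 1)
               for name, bit in SYSTEM_FLAG_BITS.items()}
    decoded["IOPL"] = (eflags >> IOPL_SHIFT) & 0b11
    return decoded
```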
Page 73 - IDTR Interrupt Descriptor Table Register; All 64 bits of CR2 are writable by software.
Vol. 3 2-17 SYSTEM ARCHITECTURE OVERVIEW 2.4.3 IDTR Interrupt Descriptor Table Register The IDTR register holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and 16-bit table limit for the IDT. The base address specifies the linear address of byte 0 of the IDT; the table limit...
Page 77 - Table 2-1. Action Taken By x87 FPU Instructions for Different; CR0 Flags
Vol. 3 2-21 SYSTEM ARCHITECTURE OVERVIEW delayed until an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed by the new task. The processor sets this flag on every task switch and tests it when executing x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instructions. • If the TS flag is set an...
Page 81 - OSXMMEXCPT
Vol. 3 2-25 SYSTEM ARCHITECTURE OVERVIEW processor will generate an invalid opcode exception (#UD) if it attempts to execute any SSE/SSE2/SSE3/SSSE3/SSE4 instruction, with the exception of PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, CLFLUSH, CRC32, and POPCNT. The operating system or executive must ex...
Page 82 - CPUID Qualification of Control Register Flags; EXTENDED CONTROL REGISTERS (INCLUDING THE
2-26 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW all interrupts are enabled. This field is available in 64-bit mode. A value of 15 means all interrupts will be disabled. 2.5.1 CPUID Qualification of Control Register Flags The VME, PVI, TSD, DE, PSE, PAE, MCE, PGE, PCE, OSFXSR, and OSXMMEXCPT flags in contro...
Page 83 - INSTRUCTION; Table 2-2. Summary of System Instructions; Instruction
Vol. 3 2-27 SYSTEM ARCHITECTURE OVERVIEW state, SSE state, or a future processor extended state) is represented by a bit in XCR0. The OS can enable future processor extended states in a forward manner by specifying the appropriate bit mask value using the XSETBV instruction according to the results ...
Page 85 - Loading and Storing System Registers
Vol. 3 2-29 SYSTEM ARCHITECTURE OVERVIEW 2.7.1 Loading and Storing System Registers The GDTR, LDTR, IDTR, and TR registers each have a load and store instruction for loading data into and storing data from the register: • LGDT (Load GDTR Register) — Loads the GDT base address and limit from memory i...
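The memory operand that LGDT reads can be sketched as a packed structure. This example assumes the 32-bit protected-mode form (a 16-bit limit followed by a 32-bit base, six bytes in total) and assumes the limit is expressed as the table size in bytes minus one. The helper names are hypothetical.

```python
import struct

DESCRIPTOR_SIZE = 8  # bytes per GDT entry outside IA-32e system descriptors


def make_gdt_pseudo_descriptor(base: int, num_entries: int) -> bytes:
    """Pack the 6-byte limit/base operand for a GDT of num_entries entries."""
    limit = DESCRIPTOR_SIZE * num_entries - 1  # offset of the last valid byte
    return struct.pack("<HI", limit, base)     # little-endian: limit, base


def parse_gdt_pseudo_descriptor(raw: bytes) -> tuple:
    """Unpack a 6-byte operand into (limit, base)."""
    limit, base = struct.unpack("<HI", raw)
    return limit, base
```

For a three-entry table at linear address 0x00100000, the packed operand has a limit of 23 (0x17).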
Page 86 - Verifying of Access Privileges; The
2-30 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW The LMSW (load machine status word) and SMSW (store machine status word) instructions operate on bits 0 through 15 of control register CR0. These instructions are provided for compatibility with the 16-bit Intel 286 processor. Programs written to run on 32-bi...
Page 87 - Loading and Storing Debug Registers
Vol. 3 2-31 SYSTEM ARCHITECTURE OVERVIEW Instructions),” for a detailed explanation of the function and use of this instruction. 2.7.3 Loading and Storing Debug Registers Internal debugging facilities in the processor are controlled by a set of 8 debug registers (DR0-DR7). The MOV instruction allow...
Page 88 - Reading Performance-Monitoring and Time-Stamp Counters
2-32 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW introduced with the Pentium Pro processor). If any non-wake events are pending during shutdown, they will be handled after the wake event from shutdown is processed (for example, A20M# interrupts). The LOCK prefix invokes a locked (atomic) read-modify-write op...
Page 89 - Reading Counters in 64-Bit Mode; Reading and Writing Model-Specific Registers
Vol. 3 2-33 SYSTEM ARCHITECTURE OVERVIEW Fixed-function performance counters record only specific events that are defined in Chapter 30, “Performance Monitoring”, and the width/number of fixed-function counters are enumerated by CPUID leaf 0AH. The time-stamp counter is a model-sp...
Page 90 - Reading and Writing Model-Specific Registers in 64-Bit Mode; Enabling Processor Extended States
2-34 Vol. 3 SYSTEM ARCHITECTURE OVERVIEW 2.7.7.1 Reading and Writing Model-Specific Registers in 64-Bit Mode RDMSR and WRMSR require an index to specify the address of an MSR. In 64-bit mode, the index is 32 bits; it is specified using ECX. 2.7.8 Enabling Processor Extended States The XSETBV instruc...
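RDMSR and WRMSR transfer the 64-bit MSR contents through the EDX:EAX register pair, with the MSR index in ECX. The sketch below models only the value packing; it does not execute either instruction. The function names are hypothetical, and the masking to 32 bits reflects the assumption that only the low halves of the registers carry data.

```python
MASK32 = 0xFFFFFFFF


def msr_value_from_registers(edx: int, eax: int) -> int:
    """Combine EDX (high half) and EAX (low half) into a 64-bit MSR value."""
    return ((edx & MASK32) << 32) | (eax & MASK32)


def registers_from_msr_value(value: int) -> tuple:
    """Split a 64-bit MSR value into the (EDX, EAX) pair used by WRMSR."""
    return (value >> 32) & MASK32, value & MASK32
```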
Page 91 - MANAGEMENT
Vol. 3 3-1 CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT This chapter describes the Intel 64 and IA-32 architecture’s protected-mode memory management facilities, including the physical memory requirements, segmentation mechanism, and paging mechanism. See also: Chapter 5, “Protection” (for a descriptio...
Page 92 - PROTECTED-MODE MEMORY MANAGEMENT
3-2 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT segment, the segment type, and the location of the first byte of the segment in the linear address space (called the base address of the segment). The offset part of the logical address is added to the base address for the segment to locate a byte within t...
Page 93 - SEGMENTS; Basic Flat Model
Vol. 3 3-3 PROTECTED-MODE MEMORY MANAGEMENT storage. When using paging, each segment is divided into pages (typically 4 KBytes each in size), which are stored either in physical memory or on the disk. The operating system or executive maintains a page directory and a set of page tables to keep trac...
Page 94 - Protected Flat Model
3-4 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT FFFF_FFF0H. RAM (DRAM) is placed at the bottom of the address space because the initial base address for the DS data segment after reset initialization is 0. 3.2.2 Protected Flat Model The protected flat model is similar to the basic flat model, except the...
Page 95 - Model
Vol. 3 3-5 PROTECTED-MODE MEMORY MANAGEMENT More complexity can be added to this protected flat model to provide more protection. For example, for the paging mechanism to provide isolation between user and supervisor code and data, four segments need to be defined: code and data segments at privile...
Page 96 - Segmentation in IA-32e Mode
3-6 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT Access checks can be used to protect not only against referencing an address outside the limit of a segment, but also against performing disallowed operations in certain segments. For example, since code segments are designated as read-only segments, hardw...
Page 97 - Paging and Segmentation; ADDRESS; bytes). This is the address space that the processor can address on
Vol. 3 3-7 PROTECTED-MODE MEMORY MANAGEMENT In 64-bit mode, segmentation is generally (but not completely) disabled, creating a flat 64-bit linear-address space. The processor treats the segment base of CS, DS, ES, SS as zero, creating a linear address that is equal to the effective address. The FS ...
Page 98 - Intel 64 Processors and Physical Address Space; LOGICAL AND LINEAR ADDRESSES; to form a linear address.
3-8 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.3.1 Intel® 64 Processors and Physical Address Space On processors that support Intel 64 architecture (CPUID.80000001H:EDX[29] = 1), the size of the physical address range is implementation-specific and indicated by CPUID.80000008H:EAX[bits 7-0]. For the ...
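A minimal sketch of how software might interpret the value returned in EAX by CPUID leaf 80000008H. It takes the raw EAX value as input rather than executing CPUID. The function names and the sample value 0x00003028 are assumptions for illustration.

```python
def physical_address_bits(eax: int) -> int:
    """Return the physical-address width reported in EAX[7:0]."""
    return eax & 0xFF


def highest_physical_address(eax: int) -> int:
    """Return the largest physical address implied by the reported width."""
    return (1 << physical_address_bits(eax)) - 1


# Hypothetical EAX value whose low byte is 0x28, i.e. 40 physical address bits.
SAMPLE_EAX = 0x00003028
```

With the sample value, the physical address range spans 2^40 bytes (1 TByte).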
Page 99 - Logical Address Translation in IA-32e Mode; Figure 3-5. Logical Address to Linear Address Translation
Vol. 3 3-9 PROTECTED-MODE MEMORY MANAGEMENT If paging is not used, the processor maps the linear address directly to a physical address (that is, the linear address goes out on the processor’s address bus). If the linear address space is paged, a second level of address translation is used to trans-...
Page 102 - Segment Loading Instructions in IA-32e Mode
3-12 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.4.4 Segment Loading Instructions in IA-32e Mode Because ES, DS, and SS segment registers are not used in 64-bit mode, their fields (base, limit, and attribute) in segment descriptor registers are ignored. Some forms of segment load instructions are also...
Page 103 - Descriptors
Vol. 3 3-13 PROTECTED-MODE MEMORY MANAGEMENT 3.4.5 Segment Descriptors A segment descriptor is a data structure in a GDT or LDT that provides the processor with the size and location of a segment, as well as access control and status information. Segment descriptors are typically created by compile...
Page 104 - Base address fields
3-14 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT to the segment limit. Offsets greater than the segment limit generate general-protection exceptions (#GP). For expand-down segments, the segment limit has the reverse function; the offset can range from the segment limit to FFFFFFFFH or FFFFH, depending o...
Page 105 - flag
Vol. 3 3-15 PROTECTED-MODE MEMORY MANAGEMENT store its own data, such as information regarding the whereabouts of the missing segment. D/B (default operation size/default stack pointer size and/or upper bound) flag Performs different functions depending on whether the segment descriptor is an execut...
Page 106 - Available and reserved bits
3-16 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT G (granularity) flag Determines the scaling of the segment limit field. When the granularity flag is clear, the segment limit is interpreted in byte units; when the flag is set, the segment limit is interpreted in 4-KByte units. (This flag does not affect the...
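The descriptor fields discussed in this section can be unpacked from an 8-byte descriptor as sketched below. The function name and the sample descriptor 0x00CF9A000000FFFF (a common flat 4-GByte code segment) are assumptions for illustration. The field positions follow the legacy descriptor layout, and the scaled limit assumes that the low 12 bits of the offset are not checked when the G flag is set.

```python
def decode_segment_descriptor(desc: int) -> dict:
    """Decode a 64-bit legacy segment descriptor into its fields."""
    limit = (desc & 0xFFFF) | (((desc >> 48) & 0xF) << 16)   # 20-bit limit
    base = ((desc >> 16) & 0xFFFFFF) | (((desc >> 56) & 0xFF) << 24)
    granular = bool((desc >> 55) & 1)                         # G flag
    # With G set, the limit counts 4-KByte units; the low 12 bits are all ones.
    effective_limit = ((limit << 12) | 0xFFF) if granular else limit
    return {
        "base": base,
        "limit": limit,
        "effective_limit": effective_limit,
        "type": (desc >> 40) & 0xF,
        "S": (desc >> 44) & 1,
        "DPL": (desc >> 45) & 0b11,
        "P": (desc >> 47) & 1,
        "AVL": (desc >> 52) & 1,
        "L": (desc >> 53) & 1,
        "D/B": (desc >> 54) & 1,
        "G": int(granular),
    }


# Hypothetical flat code-segment descriptor used only as sample input.
FLAT_CODE_DESCRIPTOR = 0x00CF9A000000FFFF
```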
Page 108 - DESCRIPTOR
3-18 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT For code segments, the three low-order bits of the type field are interpreted as accessed (A), read enable (R), and conforming (C). Code segments can be execute-only or execute/read, depending on the setting of the read-enable bit. An execute/read segment...
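The three low-order type bits for code segments can be decoded as in the following sketch. The function name is hypothetical, and the check on bit 3 of the type field (set for code segments, clear for data segments) is an assumption drawn from the standard descriptor layout rather than from the text shown here.

```python
def decode_code_segment_type(type_field: int) -> dict:
    """Interpret a 4-bit type field as accessed/read-enable/conforming bits."""
    if not (type_field >> 3) & 1:
        raise ValueError("type field does not describe a code segment")
    return {
        "accessed": bool(type_field & 0b001),    # A
        "readable": bool(type_field & 0b010),    # R (execute/read if set)
        "conforming": bool(type_field & 0b100),  # C
    }
```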
Page 110 - Segment Descriptor Tables; tors. There are two kinds of descriptor tables:; Figure 3-10. Global and Local Descriptor Tables
3-20 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT See also: Section 3.5.1, “Segment Descriptor Tables”, and Section 7.2.2, “TSS Descriptor” (for more information on the system-segment descriptors); see Section 5.8.3, “Call Gates”, Section 6.11, “IDT Descriptors”, and Section 7.2.5, “Task-Gate Descriptor”...
Page 112 - Segment Descriptor Tables in IA-32e Mode; In IA-32e mode, a segment descriptor table can contain up to 8192 (2
3-22 Vol. 3 PROTECTED-MODE MEMORY MANAGEMENT 3.5.2 Segment Descriptor Tables in IA-32e Mode In IA-32e mode, a segment descriptor table can contain up to 8192 (2^13) 8-byte descriptors. An entry in the segment descriptor table can be 8 bytes. System descriptors are expanded to 16 bytes (occupying t...
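The figure of 8192 descriptors follows from the 13-bit index field of a segment selector. A minimal sketch of that arithmetic, with a hypothetical function name and the standard selector layout (index in bits 15:3, table indicator in bit 2, RPL in bits 1:0) taken as an assumption:

```python
DESCRIPTOR_SIZE = 8          # bytes per descriptor-table slot
MAX_DESCRIPTORS = 1 << 13    # 13-bit index field gives 8192 slots


def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector and locate its descriptor slot."""
    index = (selector >> 3) & 0x1FFF
    return {
        "index": index,
        "table": "LDT" if (selector >> 2) & 1 else "GDT",
        "rpl": selector & 0b11,
        "byte_offset": index * DESCRIPTOR_SIZE,  # offset from the table base
    }
```

A 16-byte system descriptor occupies two consecutive 8-byte slots, so the byte offset computed here points at its first half.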
Page 113 - PAGING MODES AND CONTROL BITS; Paging behavior is controlled by the following control bits:
Vol. 3 4-1 CHAPTER 4 PAGING Chapter 3 explains how segmentation converts logical addresses to linear addresses. Paging (or linear-address translation) is the process of translating linear addresses so that they can be used to access memory or I/O devices. Paging translates each linear address to a p...
Page 114 - PAGING; Three Paging Modes
4-2 Vol. 3 PAGING paging modes. Section 4.1.3 discusses how CR0.WP, CR4.PSE, CR4.PGE, and IA32_EFER.NXE modify the operation of the different paging modes. 4.1.1 Three Paging Modes If CR0.PG = 0, paging is not used. The logical processor treats all linear addresses as if they were physical addresses...
Page 115 - Enabling; Table 4-1. Properties of Different Paging Modes; None
Vol. 3 4-3 PAGING linear addresses larger than 32 bits, 32-bit paging and PAE paging translate 32-bit linear addresses. Because it is used only if IA32_EFER.LME = 1, IA-32e paging is used only in IA-32e mode. (In fact, it is the use of IA-32e paging that defines IA-32e mode.) IA-32e mode has two sub-...
Page 116 - Figure 4-1. Enabling and Changing Paging Modes
4-4 Vol. 3 PAGING enable these modes and make transitions between them. The following items identify certain limitations and other details: • IA32_EFER.LME cannot be modified while paging is enabled (CR0.PG = 1). Attempts to do so using WRMSR cause a general-protection exception (#GP(0)). • Paging c...
Page 117 - Modifiers
Vol. 3 4-5 PAGING • Software can always disable paging by clearing CR0.PG with MOV to CR0. • Software can make transitions between 32-bit paging and PAE paging by changing the value of CR4.PAE with MOV to CR4. • Software cannot make transitions directly between IA-32e paging and either of the other ...
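The way CR0.PG, CR4.PAE, and IA32_EFER.LME select a paging mode can be summarized in a small decision function. This is a sketch under the assumption that the three bits are supplied as booleans; the function name and the returned labels are hypothetical. The combination PG = 1, PAE = 0, LME = 1 is treated as unreachable because paging cannot be enabled in that state.

```python
def paging_mode(cr0_pg: bool, cr4_pae: bool, efer_lme: bool) -> str:
    """Return the paging mode selected by the three control bits."""
    if not cr0_pg:
        return "none"      # linear addresses are used as physical addresses
    if not cr4_pae:
        if efer_lme:
            # Assumed invalid: enabling paging here raises #GP(0) on hardware.
            raise ValueError("CR0.PG = 1 requires CR4.PAE = 1 when LME = 1")
        return "32-bit"
    return "IA-32e" if efer_lme else "PAE"
```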
Page 118 - Enumeration of Paging Features by CPUID
4-6 Vol. 3 PAGING 4.1.4 Enumeration of Paging Features by CPUID Software can discover support for different paging features using the CPUID instruction: • PSE: page-size extensions for 32-bit paging. If CPUID.01H:EDX.PSE [bit 3] = 1, CR4.PSE may be set to 1, enabling support for 4-MByte pages with 3...
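A sketch of testing paging-related feature bits in the EDX value returned by CPUID leaf 01H. It takes the raw EDX value as input rather than executing CPUID. Only the PSE bit position (bit 3) appears in the text above; the other positions are the standard CPUID.01H:EDX assignments and are included as an assumption. The function name is hypothetical.

```python
# Bit positions in CPUID.01H:EDX for paging-related features.
PAGING_FEATURE_BITS = {
    "PSE": 3,      # page-size extensions for 32-bit paging
    "PAE": 6,      # physical-address extension
    "PGE": 13,     # global-page support
    "PAT": 16,     # page-attribute table
    "PSE-36": 17,  # 36-bit page-size extension
}


def paging_features(edx: int) -> set:
    """Return the names of the paging features whose EDX bit is set."""
    return {name for name, bit in PAGING_FEATURE_BITS.items()
            if (edx >> bit) & 1}
```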
Page 119 - HIERARCHICAL PAGING STRUCTURES: AN OVERVIEW; With PAE paging, the first paging structure comprises only 4 = 2
Vol. 3 4-7 PAGING 4.2 HIERARCHICAL PAGING STRUCTURES: AN OVERVIEW All three paging modes translate linear addresses using hierarchical paging structures. This section provides an overview of their operation. Section 4.3, Section 4.4, and Section 4.5 provide details for the three paging modes. Every pa...
Page 120 - Although 40 bits
4-8 Vol. 3 PAGING and bits 20:12 identify a fourth. Again, the last identifies the page frame. (See Figure 4-8 for an illustration.) The translation process in each of the examples above completes by identifying a page frame. However, the paging structures may be configured so that translation termi...
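The way a linear address is carved into table indices can be sketched as follows. The example assumes IA-32e paging with a 4-KByte page, where bits 47:39, 38:30, 29:21, and 20:12 each select one of 512 entries and bits 11:0 are the offset within the page frame; the function name and the dictionary keys are hypothetical.

```python
def split_ia32e_linear(addr: int) -> dict:
    """Split a linear address into IA-32e paging-structure indices (4-KByte page)."""
    return {
        "pml4": (addr >> 39) & 0x1FF,   # bits 47:39 select a PML4 entry
        "pdpt": (addr >> 30) & 0x1FF,   # bits 38:30 select a PDPTE
        "pd": (addr >> 21) & 0x1FF,     # bits 29:21 select a PDE
        "pt": (addr >> 12) & 0x1FF,     # bits 20:12 select a PTE
        "offset": addr & 0xFFF,         # bits 11:0 are the page offset
    }
```

When translation terminates early at a larger page, the lower index fields would instead be treated as part of the page offset; that case is not modeled here.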
Page 121 - Table 4-2. Paging Structures in the Different Paging Modes
Vol. 3 4-9 PAGING corresponds to 1 TByte, linear addresses are limited to 32 bits; at most 4 GBytes of linear-address space may be accessed at any given time. 32-bit paging uses a hierarchy of paging structures to produce a translation for a linear address. CR3 is used to locate the first paging-stru...
Page 128 - Bit
4-16 Vol. 3 PAGING ters. (This is different from the other paging modes, in which there is one hierarchy referenced by CR3.) Section 4.4.1 discusses the PDPTE registers. Section 4.4.2 describes linear-address translation with PAE paging. 4.4.1 PDPTE Registers When PAE paging is used, CR3 references t...
Page 129 - Linear-Address Translation with PAE Paging; Because a PDPTE register is
Vol. 3 4-17 PAGING Table 4-8 gives the format of a PDPTE. If any of the PDPTEs sets both the P flag (bit 0) and any reserved bit, the MOV to CR instruction causes a general-protection exception (#GP(0)) and the PDPTEs are not loaded. 1 As shown in Table 4-8, bits 2:1, 8:5, and 63:MAXPHYADDR are reser...
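The reserved-bit rule for loading the PDPTE registers can be illustrated with a small Python model. It assumes the reserved fields are exactly the ones named above (bits 2:1, 8:5, and 63:MAXPHYADDR) and that MAXPHYADDR is passed in by the caller; both function names are hypothetical.

```python
def pdpte_reserved_mask(maxphyaddr: int) -> int:
    """Mask of reserved PDPTE bits: 2:1, 8:5, and 63:MAXPHYADDR."""
    low = (0b11 << 1) | (0b1111 << 5)
    high = ((1 << 64) - 1) & ~((1 << maxphyaddr) - 1)
    return low | high


def pdpte_load_faults(pdptes, maxphyaddr: int) -> bool:
    """True if any PDPTE has P = 1 together with a reserved bit set.

    In that case the MOV to CR would raise #GP(0) and none of the PDPTEs
    would be loaded.
    """
    mask = pdpte_reserved_mask(maxphyaddr)
    return any((entry & 1) and (entry & mask) for entry in pdptes)
```

Note that an entry with P = 0 never triggers the fault in this model, regardless of its other bits.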
Page 139 - Directory-Pointer Table
Vol. 3 4-27 PAGING
Table 4-13. Format of an IA-32e PML4 Entry (PML4E) that References a Page-Directory-Pointer Table
Bit Position(s)    Contents
0 (P)              Present; must be 1 to reference a page-directory-pointer table
1 (R/W)            Read/write; if 0, writes may not be allowed to the 512-GByte region controlled b...
Page 140 - References a Page Directory
4-28 Vol. 3 PAGING • If the PDE’s PS flag is 1, the PDE maps a 2-MByte page (see Table 4-15). The final physical address is computed as follows:
Table 4-14. Format of an IA-32e Page-Directory-Pointer-Table Entry (PDPTE) that References a Page Directory
Bit Position(s)    Contents
0 (P)              Present; must be...
Page 144 - RIGHTS; With PAE paging, the PDPTEs do not determine access rights.
4-32 Vol. 3 PAGING • If the P flag of a PML4E or a PDPTE is 1, the PS flag is reserved. • If the P flag and the PS flag of a PDE are both 1, bits 20:13 are reserved. • If IA32_EFER.NXE = 0 and the P flag of a paging-structure entry is 1, the XD flag (bit 63) is reserved. A reference using a linear a...
Page 146 - EXCEPTIONS
4-34 Vol. 3 PAGING both the R/W flag and the U/S flag are 1 in every paging-structure entry controlling the translation. — Instruction fetches. • For 32-bit paging or if IA32_EFER.NXE = 0, instructions may be fetched from any linear address with a valid translation for which the U/S flag is 1 in eve...
Page 148 - ACCESSED AND DIRTY FLAGS; For paging-structure entries that map a page (as opposed to; son, the PDPTEs do not contain accessed flags with PAE paging.
4-36 Vol. 3 PAGING Page-fault exceptions occur only due to an attempt to use a linear address. Failures to load the PDPTE registers with PAE paging (see Section 4.4.1) cause general-protection exceptions (#GP(0)) and not page-fault exceptions. 4.8 ACCESSED AND DIRTY FLAGS For any paging-structure en...
Page 149 - PAGING AND MEMORY TYPING; Section; Paging and Memory Typing When the PAT is Not Supported; Paging and Memory Typing When the PAT is Supported; how to determine whether the PAT is supported.
Vol. 3 4-37 PAGING 4.9 PAGING AND MEMORY TYPING The memory type of a memory access refers to the type of caching used for that access. Chapter 11, “Memory Cache Control” provides many details regarding memory typing in the Intel 64 and IA-32 architectures. This section describes how paging contribut...
Page 150 - Caching Paging-Related Information about Memory Typing; CACHING TRANSLATION INFORMATION
4-38 Vol. 3 PAGING The PAT is a 64-bit MSR (IA32_PAT; MSR index 277H) comprising eight (8) 8-bit entries (entry i comprises bits 8i+7:8i of the MSR). For any access to a physical address, the table combines the memory type specified for that physical address by the MTRRs with a memory type selected f...
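The entry layout of the PAT (entry i occupying bits 8i+7:8i) maps directly onto a shift and a mask. The Python sketch below shows that extraction on an MSR value assumed to have been read elsewhere; `pat_entry` is a hypothetical helper name.

```python
def pat_entry(ia32_pat: int, i: int) -> int:
    """Return entry i (bits 8i+7:8i) of a 64-bit IA32_PAT value."""
    if not 0 <= i <= 7:
        raise ValueError("the PAT has exactly eight entries, numbered 0 through 7")
    return (ia32_pat >> (8 * i)) & 0xFF
```

The returned byte is the raw entry; interpreting it as a memory type is left to the caller.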
Page 153 - Caches
Vol. 3 4-41 PAGING entries in memory. See Section 4.10.3.2 for how software can ensure that the processor uses the modified paging-structure entries. If the paging structures specify a translation using a page larger than 4 KBytes, some processors may choose to cache multiple smaller-page TLB entries...
Page 158 - Invalidation of TLBs and Paging-Structure Caches; INVLPG
4-46 Vol. 3 PAGING 4.10.3 Invalidation of TLBs and Paging-Structure Caches As noted in Section 4.10.1 and Section 4.10.2, the processor may create entries in the TLBs and the paging-structure caches when linear addresses are translated, and it may retain these entries even after the paging structure...
Page 161 - Propagation of Paging-Structure Changes to Multiple
Vol. 3 4-49 PAGING in response to an attempted user-mode access) but no other adverse behavior. Such an exception will occur at most once for each affected linear address (see Section 4.10.3.1). • If a paging-structure entry is modified to change the XD flag from 1 to 0, failure to perform an invali...
Page 163 - INTERACTIONS WITH VIRTUAL-MACHINE; Transitions; paging or PAE paging.; VMX Support for Address Translation
Vol. 3 4-51 PAGING 4.11 INTERACTIONS WITH VIRTUAL-MACHINE EXTENSIONS (VMX) The architecture for virtual-machine extensions (VMX) includes features that interact with paging. Section 4.11.1 discusses ways in which VMX-specific control transfers, called VMX transitions, specially affect paging. Section...
Page 164 - USING PAGING FOR VIRTUAL MEMORY
4-52 Vol. 3 PAGING concurrently information for multiple address spaces in its TLBs and paging-structure caches. See Section 25.1 for details. When EPT is in use, the addresses in the paging structures are not used as physical addresses to access memory and memory-mapped I/O. Instead, they are treate...
Page 165 - to Each Segment
Vol. 3 4-53 PAGING segments can be mapped to pages in several ways. To implement a flat (unsegmented) addressing environment, for example, all the code, data, and stack modules can be mapped to one or more large segments (up to 4 GBytes) that share the same range of linear addresses (see Figure 3-2 in ...
Page 167 - Privilege level checks.; ENABLING AND DISABLING SEGMENT AND PAGE
Vol. 3 5-1 CHAPTER 5 PROTECTION In protected mode, the Intel 64 and IA-32 architectures provide a protection mechanism that operates at both the segment level and the page level. This protection mechanism provides the ability to limit access to certain segments or pages based on privilege levels (f...
Page 169 - PROTECTION
Vol. 3 5-3 PROTECTION procedure. The term current privilege level (CPL) refers to the setting of this field. • User/supervisor (U/S) flag — (Bit 2 of paging-structure entries.) Determines the type of page: user or supervisor. • Read/write (R/W) flag — (Bit 1 of paging-structure entries.) Determines ...
Page 170 - Figure 5-1. Descriptor Fields Used for Protection
5-4 Vol. 3 PROTECTION Many different styles of protection schemes can be implemented with these fields and flags. When the operating system creates a descriptor, it places values in these fields and flags in keeping with the particular protection style chosen for an operating system or executive. Ap...
Page 171 - Code Segment Descriptor in 64-bit Mode
Vol. 3 5-5 PROTECTION The following sections describe how the processor uses these fields and flags to perform the various categories of checks described in the introduction to this chapter. 5.2.1 Code Segment Descriptor in 64-bit Mode Code segments continue to exist in 64-bit mode even though, for ...
Page 172 - CHECKING; A byte at an offset greater than the effective limit; Figure 5-2. Descriptor Fields with Flags used in IA-32e Mode
5-6 Vol. 3 PROTECTION 5.3 LIMIT CHECKING The limit field of a segment descriptor prevents programs or procedures from addressing memory locations outside the segment. The effective value of the limit depends on the setting of the G (granularity) flag (see Figure 5-1). For data segments, the limit al...
Page 173 - Limit Checking in 64-bit Mode; Segment descriptors contain type information in two places:
Vol. 3 5-7 PROTECTION • A doubleword at an offset greater than the (effective-limit – 3) • A quadword at an offset greater than the (effective-limit – 7) For expand-down data segments, the segment limit has the same function but is interpreted differently. Here, the effective limit specifies the las...
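The limit rules for ordinary (expand-up) segments reduce to one comparison per access size. The Python sketch below is an illustration under two assumptions: that a set G flag scales the 20-bit limit to 4-KByte units with the low 12 bits treated as ones, and that only expand-up segments are modeled. The function names are hypothetical.

```python
def effective_limit(limit: int, g: int) -> int:
    """Effective limit in bytes for a 20-bit limit field and the G flag."""
    # Assumption: with G = 1 the limit counts 4-KByte units and the low
    # 12 offset bits are not checked, so they are filled with ones.
    return ((limit << 12) | 0xFFF) if g else limit


def violates_limit(offset: int, size: int, eff_limit: int) -> bool:
    """True if a size-byte access at offset would exceed an expand-up limit.

    Matches the rules above: byte > limit, word > limit - 1,
    doubleword > limit - 3, quadword > limit - 7.
    """
    return offset > eff_limit - (size - 1)
```

An expand-down data segment interprets the limit differently and would need a separate check.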
Page 175 - Null Segment Selector Checking; NULL Segment Checking in 64-bit Mode; LEVELS
Vol. 3 5-9 PROTECTION instruction. If the descriptor type is for a code segment or call gate, a call or jump to another code segment is indicated; if the descriptor type is for a TSS or task gate, a task switch is indicated. — On a call or jump through a call gate (or on an interrupt- or exception-h...
Page 177 - — Nonconforming code segment (without using a call gate) — The DPL; PRIVILEGE LEVEL CHECKING WHEN ACCESSING DATA
Vol. 3 5-11 PROTECTION example, if the DPL of a data segment is 1, only programs running at a CPL of 0 or 1 can access the segment. — Nonconforming code segment (without using a call gate) — The DPL indicates the privilege level that a program or task must be at to access the segment. For example, i...
Page 178 - Figure 5-4. Privilege Check for Data Access
5-12 Vol. 3 PROTECTION loads the segment selector into the segment register if the DPL is numerically greater than or equal to both the CPL and the RPL. Otherwise, a general-protection fault is generated and the segment register is not loaded. Figure 5-5 shows four procedures (located in code segme...
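The data-segment rule stated above (the DPL must be numerically greater than or equal to both the CPL and the RPL) fits in one comparison. The Python sketch below is a minimal model of that check only; the function name is hypothetical, and the other checks performed on a segment-register load are not modeled.

```python
def data_segment_load_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """True if a data-segment selector may be loaded into a segment register.

    Privilege levels are numbers 0 (most privileged) through 3.
    """
    return dpl >= max(cpl, rpl)
```

Because lower numbers mean more privilege, raising the RPL of a selector can only make this check stricter, never looser.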
Page 179 - Accessing Data in Code Segments
Vol. 3 5-13 PROTECTION As demonstrated in the previous examples, the addressable domain of a program or task varies as its CPL changes. When the CPL is 0, data segments at all privilege levels are accessible; when the CPL is 1, only data segments at privilege levels 1 through 3 are accessible; when ...
Page 180 - PRIVILEGE LEVEL CHECKING WHEN LOADING THE SS
5-14 Vol. 3 PROTECTION • Load a data-segment register with a segment selector for a nonconforming, readable, code segment. • Load a data-segment register with a segment selector for a conforming, readable, code segment. • Use a code-segment override prefix (CS) to read a readable, code segment whose...
Page 181 - Direct Calls or Jumps to Code Segments
Vol. 3 5-15 PROTECTION • The target operand points to a TSS, which contains the segment selector for the target code segment. • The target operand points to a task gate, which points to a TSS, which in turn contains the segment selector for the target code segment. The following sections describe fi...
Page 182 - Accessing Nonconforming Code Segments
5-16 Vol. 3 PROTECTION • The RPL of the segment selector of the destination code segment. • The conforming (C) flag in the segment descriptor for the destination code segment, which determines whether the segment is a conforming (C flag is set) or nonconforming (C flag is clear) code segment. See Se...
Page 183 - Accessing Conforming Code Segments; From Various Privilege Levels
Vol. 3 5-17 PROTECTION The RPL of the segment selector that points to a nonconforming code segment has a limited effect on the privilege check. The RPL must be numerically less than or equal to the CPL of the calling procedure for a successful control transfer to occur. So, in the example in Figure ...
Page 184 - Call gates
5-18 Vol. 3 PROTECTION In the example in Figure 5-7, code segment D is a conforming code segment. Therefore, calling procedures in both code segments A and B can access code segment D (using either segment selector D1 or D2, respectively), because they both have CPLs that are greater than or equal t...
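The difference between the two kinds of code segment on a direct CALL or JMP (no call gate) can be summarized in a short Python sketch. It assumes the usual rules: a nonconforming target requires CPL equal to DPL and RPL no greater than CPL, while a conforming target requires only that CPL be numerically greater than or equal to DPL, with the RPL not checked. The function name is hypothetical.

```python
def direct_transfer_allowed(cpl: int, rpl: int, dpl: int, conforming: bool) -> bool:
    """Privilege check for a direct far CALL/JMP to a code segment."""
    if conforming:
        # Conforming segment: reachable from equal or less privileged code;
        # the selector's RPL plays no part in the check.
        return cpl >= dpl
    # Nonconforming segment: the privilege levels must match exactly.
    return cpl == dpl and rpl <= cpl
```

For instance, code at CPL 3 may call a conforming segment with DPL 0, but the same call to a nonconforming segment with DPL 0 fails the check.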
Page 185 - Gates; It specifies the code segment to be accessed.
Vol. 3 5-19 PROTECTION 5.8.3 Call Gates Call gates facilitate controlled transfers of program control between different privilege levels. They are typically used only in operating systems or executives that use the privilege-level protection mechanism. Call gates are also useful for transferring pr...
Page 186 - IA-32e Mode Call Gates
5-20 Vol. 3 PROTECTION Note that the P flag in a gate descriptor is normally always set to 1. If it is set to 0, a not present (#NP) exception is generated when a program attempts to access the descriptor. The operating system can use the P flag for special purposes. For example, it could be used to...
Page 188 - Accessing a Code Segment Through a Call Gate; The DPL (descriptor privilege level) of the call gate descriptor.
5-22 Vol. 3 PROTECTION 5.8.4 Accessing a Code Segment Through a Call Gate To access a call gate, a far pointer to the gate is provided as a target operand in a CALL or JMP instruction. The segment selector from this pointer identifies the call gate (see Figure 5-10); the offset from the pointer is r...
Page 189 - Figure 5-11. Privilege Check for Control Transfer with Call Gate
Vol. 3 5-23 PROTECTION The privilege checking rules are different depending on whether the control transfer was initiated with a CALL or a JMP instruction, as shown in Table 5-1. The DPL field of the call-gate descriptor specifies the numerically highest privilege level from which a calling procedur...
Page 191 - Switching
Vol. 3 5-25 PROTECTION Call gates allow a single code segment to have procedures that can be accessed at different privilege levels. For example, an operating system located in a code segment may have some services that are intended to be used by both the operating system and application software ...
Page 193 - Figure 5-13. Stack Switching During an Interprivilege-Level Call
Vol. 3 5-27 PROTECTION 3. Checks the stack-segment descriptor for the proper privileges and type and generates an invalid TSS (#TS) exception if violations are detected. 4. Temporarily saves the current values of the SS and ESP registers. 5. Loads the segment selector and stack pointer for the new st...
Page 194 - Stack Switching in 64-bit Mode; Returning from a Called Procedure; ESP
5-28 Vol. 3 PROTECTION dure, one of the parameters can be a pointer to a data structure, or the saved contents of the SS and ESP registers may be used to access parameters in the old stack space. The size of the data items passed to the called procedure depends on the call gate size, as described in...
Page 196 - Performing Fast Calls to System Procedures with the; Stack segment — Computed by adding 8 to the value in IA32_SYSENTER_CS.
5-30 Vol. 3 PROTECTION 5. (If the RET instruction includes a parameter count operand.) Adds the parameter count (in bytes obtained from the RET instruction) to the current ESP register value, to step past the parameters on the calling procedure’s stack. The resulting ESP value is not checked against...
Page 197 - Stack pointer — Reads this from ECX.; SYSENTER and SYSEXIT Instructions in IA-32e Mode; Target instruction — Reads 64-bit canonical address in RDX.
Vol. 3 5-31 PROTECTION • Stack segment — Computed by adding 24 to the value in IA32_SYSENTER_CS. • Stack pointer — Reads this from ECX. The SYSENTER and SYSEXIT instructions perform “fast” calls and returns because they force the processor into a predefined privilege level 0 state when SYSENTER is e...
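The fixed offsets from IA32_SYSENTER_CS can be made concrete with a short Python sketch. It assumes the legacy (non-64-bit) case, in which SYSENTER uses the MSR value itself for the code segment and the value plus 8 for the stack segment, and SYSEXIT uses the value plus 16 and plus 24. The function names are hypothetical, and any adjustment of the selectors' RPL bits is deliberately left out.

```python
def sysenter_selectors(sysenter_cs: int) -> dict:
    """Privilege-level-0 selectors derived from IA32_SYSENTER_CS by SYSENTER."""
    return {"cs": sysenter_cs, "ss": sysenter_cs + 8}


def sysexit_selectors(sysenter_cs: int) -> dict:
    """Privilege-level-3 selectors derived from IA32_SYSENTER_CS by SYSEXIT.

    Only the additive part is modeled; RPL handling is omitted.
    """
    return {"cs": sysenter_cs + 16, "ss": sysenter_cs + 24}
```

One consequence of this scheme is that the four descriptors must sit in consecutive GDT slots, since each selector is 8 bytes past the previous one.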
Page 198 - Stack pointer — Update ESP from 32-bit address in ECX.; Fast System Calls in 64-bit Mode; Target instruction — Copies the value in RCX into RIP.
5-32 Vol. 3 PROTECTION When SYSEXIT transfers control to compatibility mode user code when the operand size attribute is 32 bits, the following fields are generated and bits set: • Target code segment — Computed by adding 16 to the value in IA32_SYSENTER_CS. • New CS attributes — L-bit = 0 (go to co...
Page 199 - Target instruction — Copies the value in ECX into EIP.; INSTRUCTIONS; Figure 5-14. MSRs Used by SYSCALL and SYSRET
Vol. 3 5-33 PROTECTION When SYSRET transfers control to 32-bit mode user code using a 32-bit operand size, the processor gets the privilege level 3 target instruction and stack pointer from: • Target code segment — Reads a non-NULL selector from IA32_STAR[63:48]. • Target instruction — Copies the va...
Page 200 - VALIDATION
5-34 Vol. 3 PROTECTION general-protection exception (#GP) is generated. The following system instructions are privileged instructions: • LGDT — Load GDT register. • LLDT — Load LDT register. • LTR — Load task register. • LIDT — Load IDT register. • MOV (control registers) — Load and store control re...
Page 202 - Checking That the Pointer Offset Is Within Limits (LSL
5-36 Vol. 3 PROTECTION 5.10.2 Checking Read/Write Rights (VERR and VERW Instructions) When the processor accesses any code or data segment it checks the read/write privileges assigned to the segment to verify that the intended read or write operation is allowed. Software can check read/write rights...
Page 203 - Checking Caller Access Privileges (ARPL Instruction)
Vol. 3 5-37 PROTECTION destination register and sets the ZF flag in the EFLAGS register. If the segment selector is not visible at the current privilege level or is an invalid type for the LSL instruction, the instruction does not modify the destination register and clears the ZF flag. Once loaded i...
Page 205 - Alignment; Restriction of addressable domain (supervisor and user modes).
Vol. 3 5-39 PROTECTION The example in Figure 5-15 demonstrates how the ARPL instruction is intended to be used. When the operating system receives segment selector D2 from the application program, it uses the ARPL instruction to compare the RPL of the segment selector with the privilege level of the...
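The effect of ARPL on a selector can be modeled in a few lines of Python. The sketch assumes the instruction's usual behavior: if the RPL (bits 1:0) of the destination selector is numerically less than that of the source, the destination RPL is raised to match and ZF is set; otherwise the destination is unchanged and ZF is cleared. The function name `arpl` and the tuple return value are illustrative choices.

```python
def arpl(dest_sel: int, src_sel: int) -> tuple:
    """Model ARPL: return (new destination selector, ZF)."""
    dest_rpl = dest_sel & 3
    src_rpl = src_sel & 3
    if dest_rpl < src_rpl:
        # Weaken the selector so it carries the caller's privilege level.
        return (dest_sel & ~3) | src_rpl, 1
    return dest_sel, 0
```

In the scenario described above, the source operand would be the caller's saved CS selector, so a selector passed in with RPL 0 by privilege-level-3 code comes back with RPL 3.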
Page 206 - Flags; The page-level protection mechanism recognizes two page types:
5-40 Vol. 3 PROTECTION page-fault exception mechanism. This chapter describes the protection violations which lead to page-fault exceptions. 5.11.1 Page-Protection Flags Protection information for pages is contained in two flags in a paging-structure entry (see Chapter 4): the read/write flag (bit 1...
Page 207 - Combining Protection of Both Levels of Page Tables; COMBINING PAGE AND SEGMENT PROTECTION
Vol. 3 5-41 PROTECTION When the processor is in supervisor mode and the WP flag in register CR0 is clear (its state following reset initialization), all pages are both readable and writable (write-protection is ignored). When the processor is in user mode, it can write only to user-mode pages that a...
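The write-permission rules can be condensed into a small decision function. The Python sketch below takes the effective U/S and R/W flags for a translation as inputs; the supervisor case with CR0.WP set (writes honor the R/W flag) is an assumption drawn from the usual definition of the WP flag rather than from the sentences above, and the function name is hypothetical.

```python
def write_allowed(user_mode: bool, cr0_wp: int, us: int, rw: int) -> bool:
    """True if a write to a present page passes page-level protection.

    us and rw are the effective U/S and R/W flags for the translation.
    """
    if user_mode:
        # User mode may write only to user-mode pages that are read/write.
        return bool(us and rw)
    if not cr0_wp:
        # Supervisor mode with WP clear: write protection is ignored.
        return True
    # Assumption: with WP set, supervisor writes honor the R/W flag.
    return bool(rw)
```

Setting CR0.WP is what lets an operating system rely on read-only mappings even for its own writes, for example to implement copy-on-write.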
Page 209 - PAGE-LEVEL PROTECTION AND EXECUTE-DISABLE; Detecting and Enabling the Execute-Disable Capability; disable bit
Vol. 3 5-43 PROTECTION 5.13 PAGE-LEVEL PROTECTION AND EXECUTE-DISABLE BIT In addition to page-level protection offered by the U/S and R/W flags, paging structures used with PAE paging and IA-32e paging (see Chapter 4) provide the execute-disable bit. This bit offers additional protection for data p...
Page 210 - Execute-Disable Page Protection; with Execute-Disable Bit Capability; Valid Usage
5-44 Vol. 3 PROTECTION 5.13.2 Execute-Disable Page Protection The execute-disable bit in the paging structures enhances page protection for data pages. Instructions cannot be fetched from a memory page if IA32_EFER.NXE = 1 and the execute-disable bit is set in any of the paging-structure entries used...
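The "set in any entry" nature of the execute-disable check is easy to show in code. The Python sketch below looks only at bit 63 of each paging-structure entry used by a translation; it does not model the U/S check, the present flag, or the reserved-bit fault that a set bit 63 would cause when NXE is 0. The names `XD_BIT` and `fetch_allowed` are hypothetical.

```python
XD_BIT = 1 << 63  # execute-disable flag, bit 63 of a paging-structure entry


def fetch_allowed(nxe: int, entries) -> bool:
    """True if the execute-disable check permits an instruction fetch.

    entries holds every paging-structure entry used to translate the address.
    """
    if not nxe:
        # With IA32_EFER.NXE = 0 the bit is not an execute-disable control.
        return True
    return not any(entry & XD_BIT for entry in entries)
```

A single entry with the bit set, at any level of the hierarchy, is enough to block fetches from every page beneath it.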
Page 211 - Reserved Bit Checking; Execute Disable Bit Value (Bit 63) Valid Usage
Vol. 3 5-45 PROTECTION 5.13.3 Reserved Bit Checking The processor enforces reserved bit checking in paging data structure entries. The bits being checked vary with paging mode and may vary with the size of physical address space. Table 5-8 shows the reserved bits that are checked when the execute ...
Page 212 - Capability Enabled; Mode
5-46 Vol. 3 PROTECTION If execute disable bit capability is not enabled or not available, reserved bit checking in 64-bit mode includes bit 63 and additional bits. This and reserved bit checking for legacy 32-bit paging modes are shown in Table 5-10. Table 5-8. IA-32e Mode Page Level Protection Matr...
Page 213 - Handling
Vol. 3 5-47 PROTECTION 5.13.4 Exception Handling When execute disable bit capability is enabled (IA32_EFER.NXE = 1), conditions for a page fault to occur include the same conditions that apply to an Intel 64 or IA-32 processor without execute disable bit capability plus the following new condition: ...
Page 215 - INTERRUPT AND EXCEPTION OVERVIEW
Vol. 3 6-1 CHAPTER 6 INTERRUPT AND EXCEPTION HANDLING This chapter describes the interrupt and exception-handling mechanism when operating in protected mode on an Intel 64 or IA-32 processor. Most of the information provided here also applies to interrupt and exception mechanisms used in real-addre...
Page 216 - EXCEPTION AND INTERRUPT VECTORS; The processor receives interrupts from two sources:; Interrupts
6-2 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.2 EXCEPTION AND INTERRUPT VECTORS To aid in handling exceptions and interrupts, each architecturally defined exception and each interrupt condition requiring special handling by the processor is assigned a unique identification number, called a vector. T...
Page 218 - Maskable Hardware Interrupts
6-4 Vol. 3 INTERRUPT AND EXCEPTION HANDLING The processor’s local APIC is normally connected to a system-based I/O APIC. Here, external interrupts received at the I/O APIC’s pins can be directed to the local APIC through the system bus (Pentium 4, Intel Core Duo, Intel Core 2, Intel Atom, and Intel ...
Page 219 - SOURCES OF EXCEPTIONS; The processor receives exceptions from three sources:; Exceptions
Vol. 3 6-5 INTERRUPT AND EXCEPTION HANDLING defined interrupt vectors from 0 through 255; those that can be delivered through the local APIC include interrupt vectors 16 through 255. The IF flag in the EFLAGS register permits all maskable hardware interrupts to be masked as a group (see Section 6.8....
Page 220 - CLASSIFICATIONS
6-6 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.4.2 Software-Generated Exceptions The INTO, INT 3, and BOUND instructions permit exceptions to be generated in software. These instructions allow checks for exception conditions to be performed at points in the instruction stream. For example, INT 3 cau...
Page 221 - PROGRAM OR TASK RESTART
Vol. 3 6-7 INTERRUPT AND EXCEPTION HANDLING • Aborts — An abort is an exception that does not always report the precise location of the instruction causing the exception and does not allow a restart of the program or task that caused the exception. Aborts are used to report severe errors, such as ha...
Page 222 - INTERRUPT; External hardware asserts the NMI pin.
6-8 Vol. 3 INTERRUPT AND EXCEPTION HANDLING EFLAGS.OF (overflow) flag. The trap handler for this exception resolves the overflow condition. Upon return from the trap handler, program or task execution continues at the instruction following the INTO instruction. The abort-class exceptions do not suppo...
Page 223 - Handling Multiple NMIs; ENABLING AND DISABLING INTERRUPTS; Masking Maskable Hardware Interrupts
Vol. 3 6-9 INTERRUPT AND EXCEPTION HANDLING It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI interrupt that activates the processor’s NMI-handling hardw...
Page 224 - Masking Instruction Breakpoints
6-10 Vol. 3 INTERRUPT AND EXCEPTION HANDLING is an interrupt. As with the INT n instruction (see Section 6.4.2, “Software-Generated Exceptions”), when an interrupt is generated through the INTR pin to an exception vector, the processor does not push an error code on the stack, so the exception handl...
Page 225 - Masking Exceptions and Interrupts When Switching Stacks; PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND; Table 6-2. Priority Among Simultaneous Exceptions and Interrupts; Priority
Vol. 3 6-11 INTERRUPT AND EXCEPTION HANDLING 6.8.3 Masking Exceptions and Interrupts When Switching Stacks To switch to a different stack segment, software often uses a pair of instructions, for example:
MOV SS, AX
MOV ESP, StackTop
If an interrupt or exception occurs after the segment selector has b...
Page 228 - DESCRIPTORS; The IDT may contain any of three kinds of gate descriptors:; Figure 6-1. Relationship of the IDTR and IDT
6-14 Vol. 3 INTERRUPT AND EXCEPTION HANDLING 6.11 IDT DESCRIPTORS The IDT may contain any of three kinds of gate descriptors: • Task-gate descriptor • Interrupt-gate descriptor • Trap-gate descriptor Figure 6-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The forma...
Page 229 - EXCEPTION AND INTERRUPT HANDLING
Vol. 3 6-15 INTERRUPT AND EXCEPTION HANDLING 6.12 EXCEPTION AND INTERRUPT HANDLING The processor handles calls to exception- and interrupt-handlers similar to the way it handles calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt, the processor uses t...
Page 230 - Exception- or Interrupt-Handler Procedures
6-16 Vol. 3 INTERRUPT AND EXCEPTION HANDLING “Returning from a Called Procedure”). If index points to a task gate, the processor executes a task switch to the exception- or interrupt-handler task in a manner similar to a CALL to a task gate (see Section 7.3, “Task Switching”). 6.12.1 Exception- or I...
Page 234 - Tasks
6-20 Vol. 3 INTERRUPT AND EXCEPTION HANDLING of the EFLAGS register on the stack. Accessing a handler procedure through a trap gate does not affect the IF flag. 6.12.2 Interrupt Tasks When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch results. Handling a...
Page 235 - CODE; IDT
Vol. 3 6-21 INTERRUPT AND EXCEPTION HANDLING 6.13 ERROR CODE When an exception condition is related to a specific segment, the processor pushes an error code onto the stack of the exception handler (whether it is a procedure or task). The error code has the format shown in Figure 6-6. The error code...
Page 236 - TI; EXCEPTION AND INTERRUPT HANDLING IN 64-BIT
6-22 Vol. 3 INTERRUPT AND EXCEPTION HANDLING clear, indicates that the index refers to a descriptor in the GDT or the current LDT. TI GDT/LDT (bit 2) — Only used when the IDT flag is clear. When set, the TI flag indicates that the index portion of the error code refers to a segment or gate descripto...
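The fields of a segment-related error code can be pulled apart with shifts and masks. The Python sketch below assumes the layout described in this section: EXT in bit 0, the IDT flag in bit 1, TI in bit 2, and the selector index in bits 15:3. The function names and dictionary keys are hypothetical.

```python
def decode_error_code(code: int) -> dict:
    """Split a segment-related exception error code into its fields."""
    return {
        "ext": code & 1,                # bit 0: event external to the program
        "idt": (code >> 1) & 1,         # bit 1: index refers to the IDT
        "ti": (code >> 2) & 1,          # bit 2: GDT/LDT, used only if idt == 0
        "index": (code >> 3) & 0x1FFF,  # bits 15:3: selector index
    }


def referenced_table(code: int) -> str:
    """Name the descriptor table that the index field refers to."""
    fields = decode_error_code(code)
    if fields["idt"]:
        return "IDT"
    return "LDT" if fields["ti"] else "GDT"
```

A handler might log `referenced_table(code)` together with the index to identify the offending descriptor.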
Page 238 - 64-Bit Mode Stack Frame
6-24 Vol. 3 INTERRUPT AND EXCEPTION HANDLING ware attempts to reference an interrupt gate with a target RIP that is not in canonical form. The target code segment referenced by the interrupt gate must be a 64-bit code segment (CS.L = 1, CS.D = 0). If the target is not a 64-bit code segment, a general...
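What "canonical form" means for a target RIP can be illustrated with a short check. The Python sketch below assumes a 48-bit linear-address width, so an address is canonical when bits 63:47 are all zeros or all ones; the width is a parameter because it is an assumption about the implementation, and the function name is hypothetical.

```python
def is_canonical(addr: int, bits: int = 48) -> bool:
    """True if a 64-bit address is canonical for the given linear-address width."""
    # Keep the top implemented bit and everything above it.
    top = addr >> (bits - 1)
    all_ones = (1 << (64 - bits + 1)) - 1
    return top == 0 or top == all_ones
```

Equivalently, a canonical address is one produced by sign-extending its low `bits` bits to 64 bits.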
Page 239 - Stack Switching in IA-32e Mode
Vol. 3 6-25 INTERRUPT AND EXCEPTION HANDLING 6.14.3 IRET in IA-32e Mode In IA-32e mode, IRET executes with an 8-byte operand size. There is nothing that forces this requirement. The stack is formatted in such a way that for actions where IRET is required, the 8-byte IRET operand size works correctly...
Page 240 - Interrupt Stack Table; Figure 6-8. IA-32e Mode Stack Usage After Privilege Level Change
6-26 Vol. 3 INTERRUPT AND EXCEPTION HANDLING In summary, a stack switch in IA-32e mode works like the legacy stack switch, except that a new SS selector is not loaded from the TSS. Instead, the new SS is forced to NULL. 6.14.5 Interrupt Stack Table In IA-32e mode, a new interrupt stack table (IST) m...
Page 241 - EXCEPTION AND INTERRUPT REFERENCE
Vol. 3 6-27 INTERRUPT AND EXCEPTION HANDLING 6.15 EXCEPTION AND INTERRUPT REFERENCE The following sections describe conditions which generate exceptions and interrupts. They are arranged in the order of vector numbers. The information contained in these sections is as follows: • Exception Class — I...
Page 242 - Exception Class; Description
6-28 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 0—Divide Error Exception (#DE) Exception Class Fault. Description Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented in the number of bits specified for the destination operand. Exception Er...
Page 243 - Trap or Fault. The exception handler can distinguish; Exception Condition
Vol. 3 6-29 INTERRUPT AND EXCEPTION HANDLING Interrupt 1—Debug Exception (#DB) Exception Class Trap or Fault. The exception handler can distinguish between traps or faults by examining the contents of DR6 and the other debug registers. Description Indicates that one or more of several debug-exceptio...
Page 244 - Interrupt 2—NMI Interrupt
6-30 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 2—NMI Interrupt Exception Class Not applicable. Description The nonmaskable interrupt (NMI) is generated externally by asserting the processor’s NMI pin or through an NMI request set by the I/O APIC to the local APIC. This interrupt causes the N...
Page 248 - Indicates that the processor did one of the following things:
6-34 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 6—Invalid Opcode Exception (#UD) Exception Class Fault. Description Indicates that the processor did one of the following things: • Attempted to execute an invalid or reserved opcode. • Attempted to execute an instruction with an operand type th...
Page 249 - Exception Error Code
Vol. 3 6-35 INTERRUPT AND EXCEPTION HANDLING processor and earlier IA-32 processors, this exception is not generated as the result of prefetching and preliminary decoding of an invalid instruction. (See Section 6.5, “Exception Classifications,” for general rules for taking of interrupts and exceptio...
Page 251 - Saved Instruction Pointer
Vol. 3 6-37 INTERRUPT AND EXCEPTION HANDLING Saved Instruction Pointer The saved contents of CS and EIP registers point to the floating-point instruction or the WAIT/FWAIT instruction that generated the exception. Program State Change A program-state change does not accompany a device-not-available ...
Page 252 - Class
6-38 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 8—Double Fault Exception (#DF) Exception Class Abort. Description Indicates that the processor detected a second exception while calling an exception handler for a prior exception. Normally, when the processor detects another exception while tr...
Page 253 - The saved contents of CS and EIP registers are undefined.; Program State Change; Second Exception
Vol. 3 6-39 INTERRUPT AND EXCEPTION HANDLING A segment or page fault may be encountered while prefetching instructions; however, this behavior is outside the domain of Table 6-5. Any further faults generated while the processor is attempting to transfer control to the appropriate fault handler coul...
Page 255 - Interrupt 9—Coprocessor Segment Overrun; A program-state following; a coprocessor segment-overrun ex
Vol. 3 6-41 INTERRUPT AND EXCEPTION HANDLING Interrupt 9—Coprocessor Segment Overrun Exception Class Abort. (Intel reserved; do not use. Recent IA-32 processors do not generate this exception.) Description Indicates that an Intel386 CPU-based system with an Intel 387 math coprocessor detected a pag...
Page 256 - Error Code Index
6-42 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 10—Invalid TSS Exception (#TS) Exception Class Fault. Description Indicates that there was an error related to a TSS. Such an error might be detected during a task switch or during the execution of instructions that use information from a TSS. T...
Page 258 - from a TSS on a call or exception which changes privilege levels in
6-44 Vol. 3 INTERRUPT AND EXCEPTION HANDLING This exception can be generated either in the context of the original task or in the context of the new task (see Section 7.3, “Task Switching”). Until the processor has completely verified the presence of the new TSS, the exception is generated in the conte...
Page 264 - Transferring execution to a segment that is not executable.
6-50 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 13—General Protection Exception (#GP) Exception Class Fault. Description Indicates that the processor detected one of a class of protection violations called “general-protection violations.” The conditions that cause this exception to be gener-a...
Page 266 - If the memory address is in a non-canonical form.
6-52 Vol. 3 INTERRUPT AND EXCEPTION HANDLING • A selector from a TSS involved in a task switch. • IDT vector number. Saved Instruction Pointer The saved contents of CS and EIP registers point to the instruction that generated the exception. Program State Change In general, a program-state change doe...
Page 270 - While reading the GDT to locate the TSS descriptor of the new task.
6-56 Vol. 3 INTERRUPT AND EXCEPTION HANDLING second page fault can occur. 1 If a page fault is caused by a page-level protection violation, the access flag in the page-directory entry is set when the fault occurs. The behavior of IA-32 processors regarding the access flag in the corresponding page-t...
Page 271 - Additional Exception-Handling Information
Vol. 3 6-57 INTERRUPT AND EXCEPTION HANDLING description for “Interrupt 10—Invalid TSS Exception (#TS)” in this chapter for additional information on how to handle this situation.) Additional Exception-Handling Information Special care should be taken to ensure that an exception that occurs during ...
Page 272 - encountered in the program’s instruction stream.
6-58 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 16—x87 FPU Floating-Point Error (#MF) Exception Class Fault. Description Indicates that the x87 FPU has detected a floating-point error. The NE flag in the register CR0 must be set for an interrupt 16 (floating-point error exception) to be gener...
Page 273 - None. The x87 FPU provides its own error information.
Vol. 3 6-59 INTERRUPT AND EXCEPTION HANDLING Prior to executing a waiting x87 FPU instruction or the WAIT/FWAIT instruction, the x87 FPU checks for pending x87 FPU floating-point exceptions (as described in step 2 above). Pending x87 FPU floating-point exceptions are ignored for “non-waiting” x87 FP...
Page 274 - AM flag in CR0 register is set.; Table 6-7. Alignment Requirements by Data Type; Data Type
6-60 Vol. 3 INTERRUPT AND EXCEPTION HANDLING Interrupt 17—Alignment Check Exception (#AC) Exception Class Fault. Description Indicates that the processor detected an unaligned memory operand when alignment checking was enabled. Alignment checks are only carried out in data (or stack) accesses (not i...
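The enabling conditions and the alignment test can be sketched in C. This is a minimal illustration, not processor logic. The function names are hypothetical; the bit positions (CR0.AM and EFLAGS.AC, both bit 18) and the CPL 3 requirement come from the architecture definition rather than from the excerpt above; and the required alignment for each data type is assumed to be supplied by the caller from Table 6-7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR0_AM    (1u << 18)  /* alignment mask flag in CR0 */
#define EFLAGS_AC (1u << 18)  /* alignment check flag in EFLAGS */

/* Alignment checking is active only when both flags are set and CPL is 3. */
bool alignment_checking_enabled(uint32_t cr0, uint32_t eflags, unsigned cpl)
{
    return (cr0 & CR0_AM) != 0 && (eflags & EFLAGS_AC) != 0 && cpl == 3;
}

/* required_alignment must be a power of two (2, 4, 8, ...). */
bool is_aligned(uint64_t address, uint64_t required_alignment)
{
    return (address & (required_alignment - 1)) == 0;
}

/* Hypothetical helper: would this data access be reported as #AC? */
bool access_raises_ac(uint32_t cr0, uint32_t eflags, unsigned cpl,
                      uint64_t address, uint64_t required_alignment)
{
    return alignment_checking_enabled(cr0, eflags, cpl) &&
           !is_aligned(address, required_alignment);
}
```

For example, a doubleword access at address 1001H would be flagged when checking is enabled at CPL 3, while the same access at CPL 0 would not.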
Page 281 - Interrupts 32 to 255—User Defined Interrupts
Vol. 3 6-67 INTERRUPT AND EXCEPTION HANDLING Interrupts 32 to 255—User Defined Interrupts Exception Class Not applicable. Description Indicates that the processor did one of the following things: • Executed an INT n instruction where the instruction operand is one of the vector numbers from 32 throu...
Page 283 - TASK MANAGEMENT OVERVIEW; Structure
Vol. 3 7-1 CHAPTER 7 TASK MANAGEMENT This chapter describes the IA-32 architecture’s task management facilities. These facilities are only available when the processor is running in protected mode. This chapter focuses on 32-bit tasks and the 32-bit TSS structure. For information on 16-bit tasks and ...
Page 284 - TASK MANAGEMENT; State
7-2 Vol. 3 TASK MANAGEMENT 7.1.2 Task State The following items define the state of the currently executing task: • The task’s current execution space, defined by the segment selectors in the segment registers (CS, DS, SS, ES, FS, and GS). • The state of the general-purpose registers. • The state of...
Page 285 - Executing a Task; An explicit call to a task with the CALL instruction.
Vol. 3 7-3 TASK MANAGEMENT 7.1.3 Executing a Task Software or the processor can dispatch a task for execution in one of the following ways: • An explicit call to a task with the CALL instruction. • An explicit jump to a task with the JMP instruction. • An implicit call (by the processor) to an interru...
Page 286 - TASK MANAGEMENT DATA STRUCTURES; NT flag in the EFLAGS register.
7-4 Vol. 3 TASK MANAGEMENT page tables as other privilege-level-3 tasks can access code and corrupt data and the stack of other tasks. Use of task management facilities for handling multitasking applications is optional. Multitasking can be handled in software, with each software defined task execute...
Page 289 - Descriptor
Vol. 3 7-7 TASK MANAGEMENT • Task switches are carried out faster if the pages containing these structures are present in memory before the task switch is initiated. 7.2.2 TSS Descriptor The TSS, like all other segments, is defined by a segment descriptor. Figure 7-3 shows the format of a TSS descri...
Page 290 - TSS Descriptor in 64-bit mode
7-8 Vol. 3 TASK MANAGEMENT of a TSS. Attempting to switch to a task whose TSS descriptor has a limit less than 67H generates an invalid-TSS exception (#TS). A larger limit is required if an I/O permission bit map is included or if the operating system stores additional data. The processor does not c...
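The limit rule above can be expressed as a small check. This is only a sketch: the function names are hypothetical, and the 104-byte minimum size of a 32-bit TSS is stated here on the assumption that the limit field holds the segment size minus one (67H + 1 = 68H = 104).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TSS32_MIN_LIMIT 0x67u  /* smallest limit accepted for a 32-bit TSS */

/* A limit below 67H would cause #TS on a switch to the task. */
bool tss_limit_is_valid(uint32_t limit)
{
    return limit >= TSS32_MIN_LIMIT;
}

/* Limit value for a TSS of size_in_bytes bytes (size must be non-zero).
 * Larger sizes apply when an I/O permission bit map or OS data follows. */
uint32_t tss_limit_for_size(uint32_t size_in_bytes)
{
    return size_in_bytes - 1u;
}
```

An operating system that appends its own per-task data to the TSS would pass the enlarged size to `tss_limit_for_size` and still satisfy `tss_limit_is_valid`.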
Page 291 - Register; Figure 7-4. Format of TSS and LDT Descriptors in 64-bit Mode
Vol. 3 7-9 TASK MANAGEMENT 7.2.4 Task Register The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (see Figure 2-5). This information is copied from the TSS descri...
Page 294 - SWITCHING; Figure 7-7. Task Gates Referencing the Same Task
7-12 Vol. 3 TASK MANAGEMENT to be handled by handler tasks. When an interrupt or exception vector points to a task gate, the processor switches to the specified task. Figure 7-7 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT can all point to the same task. ...
Page 296 - NOTES; the new task appears not to have been executed.)
7-14 Vol. 3 TASK MANAGEMENT 10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or an interrupt, the processor sets the busy (B) flag in the new task’s TSS descriptor; if initiated with an IRET instruction, the busy (B) flag is left set. 11. Loads the task re...
Page 297 - Table 7-1. Exception Conditions Checked During a Task Switch; Condition Checked
Vol. 3 7-15 TASK MANAGEMENT rules control access to a TSS, software does not need to perform explicit privilege checks on a task switch. Table 7-1 shows the exception conditions that the processor checks for when switching tasks. It also shows the exception that is generated for each check if an erro...
Page 298 - LINKING; New Data Segment
7-16 Vol. 3 TASK MANAGEMENT The TS (task switched) flag in the control register CR0 is set every time a task switch occurs. System software uses the TS flag to coordinate the actions of the floating-point unit when generating floating-point exceptions with the rest of the processor. The TS flag indicate...
Page 299 - Previous Task Link Field, and TS Flag
Vol. 3 7-17 TASK MANAGEMENT Table 7-2 shows the busy flag (in the TSS segment descriptor), the NT flag, the previous task link field, and TS flag (in control register CR0) during a task switch. The NT flag may be modified by software executing at any privilege level. It is possible for a program to s...
Page 300 - Use of Busy Flag To Prevent Recursive Task Switching
7-18 Vol. 3 TASK MANAGEMENT 7.4.1 Use of Busy Flag To Prevent Recursive Task Switching A TSS allows only one context to be saved for a task; therefore, once a task is called (dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task to be lost. The busy flag...
Page 301 - TASK ADDRESS SPACE; Mapping Tasks to the Linear and Physical Address Spaces
Vol. 3 7-19 TASK MANAGEMENT In a multiprocessing system, additional synchronization and serialization operations must be added to this procedure to ensure that the TSS and its segment descriptor are both locked when the previous task link field is changed and the busy flag is cleared. 7.5 TASK ADDRE...
Page 302 - Task Logical Address Space
7-20 Vol. 3 TASK MANAGEMENT and the page tables point to different pages of physical memory, then the tasks do not share physical addresses. With either method of mapping task linear address spaces, the TSSs for all tasks must lie in a shared area of the physical space, which is accessible to all tas...
Page 304 - TASK MANAGEMENT IN 64-BIT MODE
7-22 Vol. 3 TASK MANAGEMENT 7.7 TASK MANAGEMENT IN 64-BIT MODE In 64-bit mode, task structure and task state are similar to those in protected mode. However, the task switching mechanism available in protected mode is not supported in 64-bit mode. Task management and switching must be performed by s...
Page 307 - Hyper-Threading Technology and Intel
Vol. 3 8-1 CHAPTER 8 MULTIPLE-PROCESSOR MANAGEMENT The Intel 64 and IA-32 architectures provide mechanisms for managing and improving the performance of multiple processors connected to the same system bus. These include: • Bus locking and/or cache coherency management for performing atomic operatio...
Page 308 - MULTIPLE-PROCESSOR MANAGEMENT; ATOMIC; Guaranteed atomic operations
8-2 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT • To distribute interrupt handling among a group of processors — When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing. ...
Page 309 - Guaranteed Atomic Operations; Reading or writing a byte; Locking
Vol. 3 8-3 MULTIPLE-PROCESSOR MANAGEMENT software to manage the fairness of semaphores and exclusive locking functions. The mechanisms for handling locked atomic operations have evolved with the complexity of IA-32 processors. More recent IA-32 processors (such as the Pentium 4, Intel Xeon, and P6 f...
Page 310 - Automatic Locking; When executing an XCHG instruction that references memory.
8-4 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT the hardware designer to make the LOCK# signal available in system hardware to control memory accesses among processors. For the P6 and more recent processor families, if the memory area being accessed is cached internally in the processor, the LOCK# signal is...
Page 311 - Software Controlled Bus Locking; The LOCK prefix is automatically assumed for XCHG instruction.; 16-bit boundary for locked word accesses.
Vol. 3 8-5 MULTIPLE-PROCESSOR MANAGEMENT 8.1.2.2 Software Controlled Bus Locking To explicitly force the LOCK semantics, software can use the LOCK prefix with the following instructions when they are used to modify a memory location. An invalid-opcode exception (#UD) is generated when the LOCK prefi...
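From C, locked read-modify-write operations are usually reached through the C11 `<stdatomic.h>` interface rather than by writing the LOCK prefix by hand. The sketch below is illustrative only and the function names are hypothetical. On x86, compilers commonly lower these calls to LOCK-prefixed instructions or to XCHG, but that is a property of the compiler, not something the text above specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically add 1 to *counter and return the new value. */
unsigned locked_increment(atomic_uint *counter)
{
    return atomic_fetch_add(counter, 1u) + 1u;
}

/* Try to take a simple test-and-set lock; true means the caller now owns it. */
bool lock_try_acquire(atomic_flag *lock)
{
    return !atomic_flag_test_and_set(lock);
}

/* Release a lock previously taken with lock_try_acquire(). */
void lock_release(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}
```

A caller would declare the lock as `atomic_flag lock = ATOMIC_FLAG_INIT;` and retry `lock_try_acquire(&lock)` until it returns true.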
Page 313 - Effects of a LOCK Operation on Internal Processor Caches
Vol. 3 8-7 MULTIPLE-PROCESSOR MANAGEMENT The act of one processor writing data into the currently executing code segment of a second processor with the intent of having the second processor execute that data as code is called cross-modifying code. As with self-modifying code, IA-32 processors exhibi...
Page 314 - ORDERING; Memory Ordering in the Intel
8-8 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT have cached the same area of memory from simultaneously modifying data in that area. 8.2 MEMORY ORDERING The term memory ordering refers to the order in which the processor issues reads (loads) and writes (stores) through the system bus to system memory. The ...
Page 315 - Memory Ordering in P6 and More Recent Processor Families; Reads are not reordered with other reads.
Vol. 3 8-9 MULTIPLE-PROCESSOR MANAGEMENT among processors are explicitly required to obey program ordering through the use of appropriate locking or serializing operations (see Section 8.2.5, “Strengthening or Weakening the Memory-Ordering Model”). 8.2.2 Memory Ordering in P6 and More Recent Process...
Page 317 - Examples Illustrating the Memory-Ordering Principles; Instructions that read or write a single byte.
Vol. 3 8-11 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3 Examples Illustrating the Memory-Ordering Principles This section provides a set of examples that illustrate the behavior of the memory-ordering principles introduced in Section 8.2.2. They are designed to give software writers an understanding of how ...
Page 318 - Neither Loads Nor Stores Are Reordered with Like Operations; Example 8-1. Stores Are Not Reordered with Other Stores
8-12 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Section 8.2.3.2 through Section 8.2.3.7 give examples using the MOV instruction. The principles that underlie these examples apply to load and store accesses in general and to other instructions that load from or store to memory. Section 8.2.3.8 and Section ...
Page 319 - Stores Are Not Reordered With Earlier Loads; Similarly, processor 0’s load from x occurs before its store to y.; Loads May Be Reordered with Earlier Stores to Different; Example 8-2. Stores Are Not Reordered with Older Loads
Vol. 3 8-13 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3.3 Stores Are Not Reordered With Earlier Loads The Intel-64 memory-ordering model ensures that a store by a processor may not occur before a previous load by the same processor. This is illustrated by the following example: Assume r1 == 1. • Because r1 ...
Page 320 - Intra-Processor Forwarding Is Allowed; Processor 0
8-14 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT has the two loads occurring before the two stores. This would result in each load returning value 0. The fact that a load may not be reordered with an earlier store to the same location is illustrated by the following example: The Intel-64 memory-ordering mod...
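A short model can show why the "both loads return 0" outcome requires a load to pass an earlier store. The sketch below assumes the usual two-processor litmus test for this section (processor 0 stores 1 to x and then loads y into r1; processor 1 stores 1 to y and then loads x into r2; x and y start at 0). That test is reconstructed here, not quoted from the excerpt. The code enumerates every interleaving in which each processor's operations stay in program order and reports whether r1 == 0 and r2 == 0 is ever reached. It models strict in-order execution only and is not a model of the processor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if some program-order interleaving yields r1 == 0 && r2 == 0.
 * If interleavings is non-NULL, it receives the number of interleavings tried. */
bool in_order_allows_both_zero(int *interleavings)
{
    bool found = false;
    int count = 0;

    /* Bit s of mask selects which processor executes at step s (0 or 1). */
    for (unsigned mask = 0; mask < 16; ++mask) {
        int steps_for_p1 = 0;
        for (int s = 0; s < 4; ++s)
            steps_for_p1 += (mask >> s) & 1u;
        if (steps_for_p1 != 2)
            continue;               /* each processor runs exactly two operations */
        ++count;

        int x = 0, y = 0, r1 = -1, r2 = -1;
        int pc0 = 0, pc1 = 0;       /* next operation index per processor */
        for (int s = 0; s < 4; ++s) {
            if (((mask >> s) & 1u) == 0) {
                if (pc0++ == 0) x = 1; else r1 = y;   /* processor 0 */
            } else {
                if (pc1++ == 0) y = 1; else r2 = x;   /* processor 1 */
            }
        }
        if (r1 == 0 && r2 == 0)
            found = true;
    }
    if (interleavings != NULL)
        *interleavings = count;
    return found;
}
```

Because no in-order interleaving produces the outcome, observing r1 == 0 and r2 == 0 on hardware implies that at least one load was performed ahead of the earlier store to a different location, which is the reordering this section permits.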
Page 321 - Stores Are Transitively Visible; Example 8-6. Stores Are Transitively Visible
Vol. 3 8-15 MULTIPLE-PROCESSOR MANAGEMENT 8.2.3.6 Stores Are Transitively Visible The memory-ordering model ensures transitive visibility of stores; stores that are causally related appear to all processors to occur in an order consistent with the causality relation. This is illustrated by the follo...
Page 322 - Locked Instructions Have a Total Order; Example 8-8. Locked Instructions Have a Total Order
8-16 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT By the principles discussed in Section 8.2.3.2, • processor 2’s first and second load cannot be reordered, • processor 3’s first and second load cannot be reordered. • If r1 == 1 and r2 == 0, processor 0’s store appears to precede processor 1’s store with re...
Page 324 - Out-of-Order Stores For String Operations; EDI and ESI must be 8-byte aligned for the Pentium
8-18 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.2.4 Out-of-Order Stores For String Operations The Intel Core 2 Duo, Intel Core, Pentium 4, and P6 family processors modify the processor’s operation during the string store operations (initiated with the MOVS and STOS instructions) to maximize performance. ...
Page 325 - Examples Illustrating Memory-Ordering Principles for String; Example 8-11. Stores Within a String Operation May be Reordered
Vol. 3 8-19 MULTIPLE-PROCESSOR MANAGEMENT 2. Stores from separate string operations (for example, stores from consecutive string operations) do not execute out of order. All the stores from an earlier string operation will complete before any store from a later string operation. 3. String operations...
Page 328 - Strengthening or Weakening the Memory-Ordering Model; III
8-22 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.2.5 Strengthening or Weakening the Memory-Ordering Model The Intel 64 and IA-32 architectures provide several mechanisms for strengthening or weakening the memory-ordering model to handle special programming situations. These mechanisms include: • The I/O ...
Page 333 - BSP and AP Processors
Vol. 3 8-27 MULTIPLE-PROCESSOR MANAGEMENT 8.4.1 BSP and AP Processors The MP initialization protocol defines two classes of processors: the bootstrap processor (BSP) and the application processors (APs). Following a power-up or RESET of an MP system, system hardware dynamically selects one of the pr...
Page 334 - MP Initialization Protocol Algorithm for; logical processors on the system bus.
8-28 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.4.3 MP Initialization Protocol Algorithm for Intel Xeon Processors Following a power-up or RESET of an MP system, the processors in the system execute the MP initialization protocol algorithm to initialize each of the logical processors on the system bus ...
Page 335 - MP Initialization Example
Vol. 3 8-29 MULTIPLE-PROCESSOR MANAGEMENT • The newly established BSP broadcasts an FIPI message to “all including self,” which the BSP and APs treat as an end of MP initialization signal. Only the processor with its BSP flag set responds to the FIPI message. It responds by fetching and executing th...
Page 336 - Typical BSP Initialization Sequence
8-30 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT SVR EQU 0FEE000F0H APIC_ID EQU 0FEE00020H LVT3 EQU 0FEE00370H APIC_ENABLED EQU 0100H BOOT_ID DD ? COUNT EQU 00H VACANT EQU 00H 8.4.4.1 Typical BSP Initialization Sequence After the BSP and APs have been selected (by means of a hardware protocol, see Section ...
Page 338 - Typical AP Initialization Sequence
8-32 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT MOV EAX, 000C46XXH; Load ICR encoding from broadcast SIPI IP; to all APs into EAX where xx is the vector computed in step 8. 16. Waits for the timer interrupt. 17. Reads and evaluates the COUNT variable and establishes a processor count. 18. If necessary, reco...
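The value 000C46XXH loaded into EAX above can be composed from named fields. The following is a sketch with hypothetical macro and function names. The field breakdown (delivery mode, level, destination shorthand) is my reading of the constant and should be checked against the local APIC ICR description. The vector is taken as a parameter because the step that computes it is not shown in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

#define ICR_DELIVERY_STARTUP   (6u << 8)   /* bits 10:8 = 110b, Start-Up IPI */
#define ICR_LEVEL_ASSERT       (1u << 14)  /* bit 14, level = assert */
#define ICR_DEST_ALL_EXCL_SELF (3u << 18)  /* bits 19:18 = 11b, all excluding self */

/* Low doubleword of the ICR for a broadcast SIPI carrying the given vector. */
uint32_t broadcast_sipi_icr_low(uint8_t vector)
{
    return ICR_DEST_ALL_EXCL_SELF | ICR_LEVEL_ASSERT |
           ICR_DELIVERY_STARTUP | (uint32_t)vector;
}
```

With a vector of 10H, the function yields 000C4610H, which matches the 000C46XXH pattern with XX replaced by the vector.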
Page 339 - Identifying Logical Processors in an MP System
Vol. 3 8-33 MULTIPLE-PROCESSOR MANAGEMENT 8.4.5 Identifying Logical Processors in an MP System After the BIOS has completed the MP initialization protocol, each logical processor can be uniquely identified by its local APIC ID. Software can access these APIC IDs in either of the following ways: • Re...
Page 340 - Figure 8-2. Interpretation of APIC ID in Early MP Systems
8-34 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT during power-up and initialization is 8 bits. Bits 2:1 form a 2-bit physical package identifier (which can also be thought of as a socket identifier). In systems that configure physical processors in clusters, bits 4:3 form a 2-bit cluster ID. Bit 0 is used ...
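The bit layout described above can be decoded with shifts and masks. This is a minimal sketch with hypothetical type and function names. Treating bit 0 as the logical processor identifier within a package is an assumption based on Figure 8-2, since the sentence describing bit 0 is cut off in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned cluster_id;  /* bits 4:3 */
    unsigned package_id;  /* bits 2:1 */
    unsigned logical_id;  /* bit 0 (assumed: logical processor within a package) */
} early_apic_id;

/* Split an 8-bit initial APIC ID using the early MP system interpretation. */
early_apic_id decode_early_apic_id(uint8_t apic_id)
{
    early_apic_id id;
    id.logical_id = apic_id & 0x1u;
    id.package_id = (apic_id >> 1) & 0x3u;
    id.cluster_id = (apic_id >> 3) & 0x3u;
    return id;
}
```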
Page 341 - HYPER-THREADING TECHNOLOGY AND; Hyper-Threading Technology; DETECTING HARDWARE MULTI-THREADING; Addressable IDs for processor cores in the same Package
Vol. 3 8-35 MULTIPLE-PROCESSOR MANAGEMENT 8.5 INTEL ® HYPER-THREADING TECHNOLOGY AND INTEL ® MULTI-CORE TECHNOLOGY Intel Hyper-Threading Technology and Intel multi-core technology are extensions to Intel 64 and IA-32 architectures that enable a single physical processor to execute two or more separa...
Page 342 - Processors; one core per package.
8-36 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT number of addressable IDs attributable to processor cores (Y) in the physical package. • Extended Processor Topology Enumeration parameters for 32-bit APIC ID: Intel 64 processors supporting CPUID leaf 0BH will assign unique APIC IDs to each logical processo...
Page 343 - Initializing Multi-Core Processors
Vol. 3 8-37 MULTIPLE-PROCESSOR MANAGEMENT During initialization, each logical processor is assigned an APIC ID that is stored in the local APIC ID register for each logical processor. If two or more processors supporting Intel Hyper-Threading Technology are present, each logical processor on the sys...
Page 344 - HYPER-THREADING TECHNOLOGY
8-38 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.7 INTEL ® HYPER-THREADING TECHNOLOGY ARCHITECTURE Figure 8-4 shows a generalized view of an Intel processor supporting Intel Hyper-Threading Technology, using the original Intel Xeon processor MP as an example. This implementation of the Intel Hyper-Thread...
Page 345 - State of the Logical Processors; Duplicated for each logical processor; Technology
Vol. 3 8-39 MULTIPLE-PROCESSOR MANAGEMENT 8.7.1 State of the Logical Processors The following features are part of the architectural state of logical processors within Intel 64 or IA-32 processors supporting Intel Hyper-Threading Technology. The features can be subdivided into three groups: • Duplic...
Page 346 - Functionality
8-40 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT • Debug registers (DR0, DR1, DR2, DR3, DR6, DR7) and the debug control MSRs • Machine check global status (IA32_MCG_STATUS) and machine check capability (IA32_MCG_CAP) MSRs • Thermal clock modulation and ACPI Power management control MSRs • Time stamp counte...
Page 347 - Machine Check Architecture
Vol. 3 8-41 MULTIPLE-PROCESSOR MANAGEMENT gives software a consistent view of memory, independent of the processor on which it is running. See Section 11.11, “Memory Type Range Registers (MTRRs),” for information on setting up MTRRs. 8.7.4 Page Attribute Table (PAT) Each logical processor has its o...
Page 348 - Performance Monitoring Counters
8-42 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.7.7 Performance Monitoring Counters Performance counters and their companion control MSRs are shared between the logical processors within a processor core for processors based on Intel NetBurst microarchitecture. As a result, software must manage the use ...
Page 349 - MICROCODE UPDATE Resources
Vol. 3 8-43 MULTIPLE-PROCESSOR MANAGEMENT 8.7.11 MICROCODE UPDATE Resources In an Intel processor supporting Intel Hyper-Threading Technology, the microcode update facilities are shared between the logical processors; either logical processor can initiate an update. Each logical processor has its ow...
Page 352 - ARCHITECTURE; Logical Processor Support
8-46 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.8 MULTI-CORE ARCHITECTURE This section describes the architecture of Intel 64 and IA-32 processors supporting dual-core and quad-core technology. The discussion is applicable to the Intel Pentium processor Extreme Edition, Pentium D, Intel Core Duo, Intel ...
Page 353 - PROGRAMMING CONSIDERATIONS FOR HARDWARE
Vol. 3 8-47 MULTIPLE-PROCESSOR MANAGEMENT 8.8.3 Performance Monitoring Counters Performance counters and their companion control MSRs are shared between two logical processors sharing a processor core if the processor core supports Intel Hyper-Threading Technology and is based on Intel NetBurst micr...
Page 354 - Hierarchical Mapping of Shared Resources
8-48 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT provided for each logical processor (see Section 8.7, “Intel ® Hyper-Threading Technology Architecture,” and Section 8.8, “Multi-Core Architecture”). From a software programming perspective, control transfer of processor operation is managed at the granul...
Page 355 - Figure 8-5. Generalized Four level Interpretation of the APIC ID
Vol. 3 8-49 MULTIPLE-PROCESSOR MANAGEMENT If the processor supports CPUID leaf 0BH, the 32-bit APIC ID can represent cluster plus several levels of topology within the physical processor package. The exact number of hierarchical levels within a physical processor package must be enumerated through ...
Page 356 - Hierarchical Mapping of CPUID Extended Topology Leaf; Example 8-17. BitWidth Determination of x2APIC ID Subfields
8-50 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.9.2 Hierarchical Mapping of CPUID Extended Topology Leaf CPUID leaf 0BH provides enumeration parameters for software to identify each hierarchy of the processor topology in a deterministic manner. Each hierarchical level of the topology starting from the ...
Page 357 - Hierarchical ID of Logical Processors in an MP System
Vol. 3 8-51 MULTIPLE-PROCESSOR MANAGEMENT
For m = 0, m < N, m ++;
{ cumulative_width[m] = CPUID.(EAX=0BH, ECX= m): EAX[4:0]; }
BitWidth[0] = cumulative_width[0];
For m = 1, m < N, m ++;
BitWidth[m] = cumulative_width[m] - cumulative_width[m-1];
Currently, only the following encoding of hierarchic...
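The BitWidth pseudocode above translates directly into C. In this sketch the cumulative widths are passed in as an array instead of being read with CPUID, so the function can be exercised on any machine. The function name is hypothetical, and the caller is assumed to have filled `cumulative_width[m]` from CPUID.(EAX=0BH, ECX=m):EAX[4:0] for each of the N levels.

```c
#include <assert.h>
#include <stddef.h>

/* Convert cumulative shift widths per topology level into per-level bit widths.
 * cumulative_width and bit_width both hold n entries; n may be 0. */
void derive_bit_widths(const unsigned *cumulative_width,
                       unsigned *bit_width, size_t n)
{
    if (n == 0)
        return;
    bit_width[0] = cumulative_width[0];
    for (size_t m = 1; m < n; ++m)
        bit_width[m] = cumulative_width[m] - cumulative_width[m - 1];
}
```

For instance, cumulative widths of 1 (first level) and 4 (second level) give per-level widths of 1 and 3 bits.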
Page 358 - Platform; Initial APIC ID; Package 0
8-52 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Table 8-2 shows the initial APIC IDs for a hypothetical situation with a dual processor system. Each physical package provides two processor cores, and each processor core also supports Intel Hyper-Threading Technology. Figure 8-7. Topological Relationshi...
Page 359 - Hierarchical ID of Logical Processors with x2APIC ID
Vol. 3 8-53 MULTIPLE-PROCESSOR MANAGEMENT 8.9.3.1 Hierarchical ID of Logical Processors with x2APIC ID Table 8-3 shows an example of possible x2APIC ID assignments for a dual processor system that supports x2APIC. Each physical package provides four processor cores, and each processor core also supp...
Page 360 - Algorithm for Three-Level Mappings of APIC_ID; and extract identifiers corresponding to the three
8-54 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT 8.9.4 Algorithm for Three-Level Mappings of APIC_ID Software can gather the initial APIC_IDs for each logical processor supported by the operating system at runtime 5 and extract identifiers corresponding to the three levels of sharing topology (package, cor...
Page 363 - Query the x2APIC ID of a logical processor.; bit Initial APIC ID
Vol. 3 8-57 MULTIPLE-PROCESSOR MANAGEMENT int DeriveCore_Mask_Offsets (void){ if (!HWMTSupported()) return -1; execute cpuid with eax = 11, ECX = 0; while( ECX[15:8] ) { // level type encoding is valid If (returned level type encoding in ECX[15:8] matches CORE) { Mask_Core_shift = EAX[4:0]; // neede...
Page 364 - Query the initial APIC ID of a logical processor.
8-58 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT
unsigned char MaxLPIDsPerPackage(void)
{
    if (!HWMTSupported()) return 1;
    execute cpuid with eax = 1
    store returned value of ebx
    return (unsigned char) ((reg_ebx & NUM_LOGICAL_BITS) >> 16);
}
b. Find the size of address space for processor cores in a ...
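The field extraction in MaxLPIDsPerPackage can be isolated into a pure function. The sketch below takes the EBX value and the hardware multi-threading capability as parameters instead of executing CPUID, so it runs anywhere. The name `max_lp_ids_per_package` is hypothetical, and the value 00FF0000H for NUM_LOGICAL_BITS is an assumption inferred from the 16-bit right shift in the listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_LOGICAL_BITS 0x00FF0000u  /* assumed mask for EBX[23:16] */

/* reg_ebx is the EBX value returned by CPUID with EAX = 1.
 * Returns 1 when hardware multi-threading is not supported. */
unsigned char max_lp_ids_per_package(bool hwmt_supported, uint32_t reg_ebx)
{
    if (!hwmt_supported)
        return 1;
    return (unsigned char)((reg_ebx & NUM_LOGICAL_BITS) >> 16);
}
```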
Page 366 - Identifying Topological Relationships in a MP System; To extract the next bit-field, the shift value of the working mask is
8-60 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT Software must not assume local APIC_ID values in an MP system are consecutive. Non-consecutive local APIC_IDs may be the result of hardware configurations or debug features implemented in the BIOS or OS. An identifier for each hierarchical level can be extrac...
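Extracting an identifier for each level comes down to masking and shifting the APIC ID by the bit widths of the lower levels. The sketch below assumes a three-level layout (SMT, core, package) with the SMT field in the least significant bits. The type and function names are hypothetical, and the two widths are assumed to have been derived beforehand from CPUID. Unlike the listing in this section, which keeps sub-fields in place so they can be OR-ed together, this version shifts each field down to a small integer.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t smt_id;
    uint32_t core_id;
    uint32_t package_id;
} topology_ids;

/* Split apic_id into SMT, core, and package identifiers.
 * Requires smt_width + core_width < 32 so every shift is well defined. */
topology_ids split_apic_id(uint32_t apic_id, unsigned smt_width, unsigned core_width)
{
    topology_ids ids;
    uint32_t smt_mask  = (1u << smt_width) - 1u;
    uint32_t core_mask = (1u << core_width) - 1u;

    ids.smt_id     = apic_id & smt_mask;
    ids.core_id    = (apic_id >> smt_width) & core_mask;
    ids.package_id = apic_id >> (smt_width + core_width);
    return ids;
}
```

Because APIC IDs need not be consecutive, software would apply this to the ID actually read on each logical processor instead of generating IDs by counting.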
Page 370 - MANAGEMENT OF IDLE AND BLOCKED CONDITIONS
8-64 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT
}
if (i == CoreNum) {
    //Did not match any bucket, start new bucket
    CoreIDBucket[i] = PackageID[ProcessorNum] | CoreID[ProcessorNum];
    CoreProcessorMask[i] = ProcessorMask;
    CoreNum++;
}
}
// CoreNum has the number of cores started in the OS
// CoreProcessorMask[] ar...
Page 373 - Monitor/Mwait Address Range Determination
Vol. 3 8-67 MULTIPLE-PROCESSOR MANAGEMENT Power management related events (such as Thermal Monitor 2 or chipset driven STPCLK# assertion) will not cause the monitor event pending flag to be cleared. Faults will not cause the monitor event pending flag to be cleared. Software should not allow for volu...
Page 374 - Required Operating System Support; Check if lock is free
8-68 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT the two parameters should default to be the same (the size of the monitor triggering area is the same as the system coherence line size). Based on the monitor line sizes returned by the CPUID, the OS should dynamically allocate structures with appropriate pad...
Page 378 - Execution Resources
8-72 Vol. 3 MULTIPLE-PROCESSOR MANAGEMENT
{
    MONITOR WorkQueue // Setup of eax with WorkQueue LinearAddress,
                      // ECX, EDX = 0
    IF (WorkQueue != 0) THEN {
        STI
        MWAIT // EAX, ECX = 0
    }
}
8.10.6.5 Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources Because the logical process...
Page 379 - Memory
Vol. 3 8-73 MULTIPLE-PROCESSOR MANAGEMENT • A high resolution timer within the processor (such as, the local APIC timer or the time-stamp counter). For additional information, see the Intel® 64 and IA-32 Architectures Optimization Reference Manual. 8.10.6.7 Place Locks and Semaphores in Aligned, 128...
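One way to follow the placement guideline in C11 is to give each lock its own aligned block. This is a sketch only. The type name `padded_lock` is hypothetical, and the 128-byte figure is taken from the section heading, which is cut off in this excerpt. The `alignas` specifier on the first member makes the whole structure 128-byte aligned, and the compiler pads its size to a multiple of that alignment, so adjacent locks in an array never share a block.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_BLOCK_SIZE 128  /* bytes; assumed from the guideline heading */

typedef struct {
    alignas(LOCK_BLOCK_SIZE) atomic_flag flag;  /* the lock word itself */
} padded_lock;

/* Compile-time checks: one lock occupies exactly one aligned block. */
_Static_assert(alignof(padded_lock) == LOCK_BLOCK_SIZE, "lock must be block aligned");
_Static_assert(sizeof(padded_lock) == LOCK_BLOCK_SIZE, "lock must fill one block");

/* Byte distance between two locks, used to confirm they sit in separate blocks. */
size_t lock_distance(const padded_lock *a, const padded_lock *b)
{
    return (size_t)((uintptr_t)b - (uintptr_t)a);
}
```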
Page 381 - OVERVIEW
Vol. 3 9-1 CHAPTER 9 PROCESSOR MANAGEMENT AND INITIALIZATION This chapter describes the facilities provided for managing processor wide functions and for initializing the processor. The subjects covered include: processor initialization, x87 FPU initialization, processor configuration, feature dete...
Page 382 - PROCESSOR MANAGEMENT AND INITIALIZATION; Processor State After Reset
9-2 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION The software-initialization code performs all system-specific initialization of the BSP or primary processor and the system logic. At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or secondary) processor to enab...
Page 385 - Model and Stepping Information; Figure 9-1. Contents of CR0 Register after Reset
Vol. 3 9-5 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.3 Model and Stepping Information Following a hardware reset, the EDX register contains component identification and revision information (see Figure 9-2). For example, the model, family, and processor type returned for the first processor in the...
Page 386 - First Instruction Executed; X87 FPU INITIALIZATION; Configuring the x87 FPU Environment
9-6 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.1.4 First Instruction Executed The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing ...
Page 387 - Setting the Processor for x87 FPU Software Emulation; EM
Vol. 3 9-7 PROCESSOR MANAGEMENT AND INITIALIZATION The EM flag determines whether floating-point instructions are executed by the x87 FPU (EM is cleared) or a device-not-available exception (#NM) is generated for all floating-point instructions so that an exception handler can emulate the floating-p...
Page 388 - ENABLING; CR0 Bit
9-8 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION • It allows x87 FPU code to run on an IA-32 processor that has neither an integrated x87 FPU nor is connected to an external math coprocessor, by using a floating-point emulator. • It allows floating-point code to be executed using a special or nons...
Page 390 - EXTENSIONS
9-10 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION all the MTRRs must be cleared to 0, which selects the uncached (UC) memory type. See Section 11.11, “Memory Type Range Registers (MTRRs),” for detailed information on the MTRRs. 9.6 INITIALIZING SSE/SSE2/SSE3/SSSE3 EXTENSIONS For processors that c...
Page 391 - Real-Address Mode IDT; SOFTWARE INITIALIZATION FOR PROTECTED-MODE
Vol. 3 9-11 PROCESSOR MANAGEMENT AND INITIALIZATION mode. The protected-mode data structures that must be loaded are described in Section 9.8, “Software Initialization for Protected-Mode Operation.” 9.7.1 Real-Address Mode IDT In real-address mode, the only system data structure that must be loaded ...
Page 392 - System Data Structures
9-12 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION modules into memory to support reliable operation of the processor in protected mode. These data structures include the following: • An IDT. • A GDT. • A TSS. • (Optional) An LDT. • If paging is to be used, at least one page directory and one page t...
Page 393 - Initializing Protected-Mode Exceptions and Interrupts
Vol. 3 9-13 PROCESSOR MANAGEMENT AND INITIALIZATION descriptors in the GDT. Some operating systems allocate new segments and LDTs as they are needed. This provides maximum flexibility for handling a dynamic programming environment. However, many operating systems use a single LDT for all tasks, all...
Page 394 - Multitasking
9-14 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.4 Initializing Multitasking If the multitasking mechanism is not going to be used and changes between privilege levels are not allowed, it is not necessary to load a TSS into memory or to initialize the task register. If the multitasking mechanism ...
Page 395 - IA-32e Mode System Data Structures
Vol. 3 9-15 PROCESSOR MANAGEMENT AND INITIALIZATION following instructions must be located in an identity-mapped page (until such time that a branch to non-identity mapped pages can be effected). 64-bit mode paging tables must be located in the first 4 GBytes of physical-address space prior to activ...
Page 396 - 64-bit Mode and Compatibility Mode Operation
9-16 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.8.5.3 64-bit Mode and Compatibility Mode Operation IA-32e mode uses two code segment-descriptor bits (CS.L and CS.D, see Figure 3-8) to control the operating modes after IA-32e mode is initialized. If CS.L = 1 and CS.D = 0, the processor is runni...
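The way the two code-segment descriptor bits select an operating mode can be written as a small decision function. The enum and function names below are hypothetical. Only the CS.L = 1, CS.D = 0 case is stated in the excerpt above. The other three cases (compatibility mode with a 32-bit or 16-bit default size when CS.L = 0, and the reserved combination CS.L = 1, CS.D = 1) are filled in from the architecture definition and should be checked against the full section.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SUBMODE_64BIT,          /* CS.L = 1, CS.D = 0 */
    SUBMODE_COMPAT_32,      /* CS.L = 0, CS.D = 1: 32-bit default sizes */
    SUBMODE_COMPAT_16,      /* CS.L = 0, CS.D = 0: 16-bit default sizes */
    SUBMODE_RESERVED        /* CS.L = 1, CS.D = 1 */
} ia32e_submode;

/* Classify the sub-mode of IA-32e mode from the CS descriptor's L and D bits. */
ia32e_submode classify_code_segment(bool cs_l, bool cs_d)
{
    if (cs_l)
        return cs_d ? SUBMODE_RESERVED : SUBMODE_64BIT;
    return cs_d ? SUBMODE_COMPAT_32 : SUBMODE_COMPAT_16;
}
```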
Page 397 - Switching to Protected Mode; in control register CR0.
Vol. 3 9-17 PROCESSOR MANAGEMENT AND INITIALIZATION from 64-bit mode through compatibility mode to legacy or real mode and then back through compatibility mode to 64-bit mode. 9.9 MODE SWITCHING To use the processor in protected mode after hardware or software reset, a mode switch must be performed ...
Page 398 - Switching Back to Real-Address Mode; interrupts can be disabled with external circuitry.
9-18 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 7. If a local descriptor table is going to be used, execute the LLDT instruction to load the segment selector for the LDT in the LDTR register. 8. Execute the LTR instruction to load the task register with a segment selector to the initial protecte...
Page 399 - INITIALIZATION AND MODE SWITCHING EXAMPLE; Establish a basic real-address mode operating environment.
Vol. 3 9-19 PROCESSOR MANAGEMENT AND INITIALIZATION 4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing the following values, which are appropriate for real-address mode: — Limit = 64 KBytes (0FFFFH) — Byte granular (G = 0) — Expand up (E = 0) — Writable (W = 1) —...
Page 401 - Table 9-4. Main Initialization Steps in STARTUP.ASM Source Listing; Numbers
Vol. 3 9-21 PROCESSOR MANAGEMENT AND INITIALIZATION Figure 9-3. Processor State After Reset Table 9-4. Main Initialization Steps in STARTUP.ASM Source Listing STARTUP.ASM Line Numbers Description From To 157 157 Jump (short) to the entry code in the EPROM 162 169 Construct a temporary GDT in RAM wit...
Page 402 - Usage
9-22 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.1 Assembler Usage In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and build the initialization code module. The following assumptions are used when using the Intel ASM386 and BLD386 tools. • The ASM386 w...
Page 403 - Listing
Vol. 3 9-23 PROCESSOR MANAGEMENT AND INITIALIZATION 9.10.2 STARTUP.ASM Listing Example 9-1 provides high-level sample code designed to move the processor into protected mode. This listing does not include any opcode and offset information. Example 9-1. STARTUP.ASM MS-DOS* 5.0(045-N) 386(TM) MACRO AS...
Page 414 - Files; Example 9-3. Batch File to Assemble and Build the Application
9-34 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION CODE SEGMENT ER use32 PUBLIC main_start: nop nop nop CODE ENDS END main_start, ds:data, ss:stack 9.10.4 Supporting Files The batch file shown in Example 9-3 can be used to assemble the source code files STARTUP.ASM and MAIN.ASM and build the final ...
Page 415 - Table 9-5. Relationship Between BLD Item and ASM Source File; Item
Vol. 3 9-35 PROCESSOR MANAGEMENT AND INITIALIZATION TABLE GDT ( LOCATION = GDT_EPROM , ENTRY = ( 10: PROTECTED_MODE_TASK , startup.startup_code , startup.startup_data , main_module.data , main_module.code , main_module.stack ) ), IDT ( LOCATION = IDT_EPROM ); MEMORY ( RESERVE = (0..3FFFH -- Area for...
Page 416 - UPDATE
9-36 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11 MICROCODE UPDATE FACILITIES The Pentium 4, Intel Xeon, and P6 family processors have the capability to correct errata by loading an Intel-supplied data block into the processor. The data block is called a microcode update. This section describ...
Page 417 - Update
Vol. 3 9-37 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.1 Microcode Update A microcode update consists of an Intel-supplied binary that contains a descriptive header and data. No executable code resides within the update. Each microcode update is tailored for a specific list of processor signatures...
Page 418 - Table 9-6. Microcode Update Field Definitions; Field Name
9-38 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION NOTE The optional extended signature table is supported starting with processor family 0FH, model 03H. Table 9-6. Microcode Update Field Definitions Field Name Offset (bytes) Length (bytes) Description Header Version 0 4 Version number of the upd...
Page 421 - Optional Extended Signature Table; Table 9-8. Extended Processor Signature Table Header Structure; Extended Signature Count ‘n’; Table 9-9. Processor Signature Structure
Vol. 3 9-41 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.2 Optional Extended Signature Table The extended signature table is a structure that may be appended to the end of the encrypted data when the encrypted data only supports a single processor signature (optional case). The extended signature ta...
Page 422 - Identification
9-42 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION a processor signature embedded in the microcode update with the processor signature returned by CPUID will cause the BIOS to reject the update. Example 9-5 shows how to check for a valid processor signature match between the processor and microcode...
Page 424 - Microcode Update Checksum; Example 9-7. Pseudo Code Example of Checksum Test
9-44 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION } Else { // // Assume the Data Size has been used to calculate the // location of Update.ProcessorSignature[N] and a match // on Update.ProcessorSignature[N] has already succeeded // If (Update.ProcessorFlags[n] & Flag) { Load Update } } } 9.11...
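The checksum test can be illustrated in C. The sketch below assumes the usual rule that an update is valid when the 32-bit sum of all of its DWORDs is zero; the function name and the way the DWORD count is passed in are hypothetical, and locating the total size in the update header is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the 32-bit sum of every DWORD in the update is zero. */
int update_checksum_ok(const uint32_t *update, size_t dword_count)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < dword_count; i++)
        sum += update[i];   /* unsigned arithmetic wraps modulo 2^32 */
    return sum == 0;
}
```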
Page 427 - Update Signature and Verification
Vol. 3 9-47 PROCESSOR MANAGEMENT AND INITIALIZATION If processor core supports Intel Hyper-Threading Technology, the guideline described in Section 9.11.6.3 also applies. 9.11.6.5 Update Loader Enhancements The update loader presented in Section 9.11.6, “Microcode Update Loader,” is a minimal implem...
Page 430 - The update contains a correct checksum.
9-50 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION There are no optional functions. BIOS must load the appropriate update for each processor during system initialization. A Header Version of an update block containing the value 0FFFFFFFFH indicates that the update block is unused and available for s...
Page 432 - Example 9-12 represents a calling program.
9-52 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION } } NOTES The platform Id bits in IA32_PLATFORM_ID are encoded as a three-bit binary coded decimal field. The platform bits in the microcode update header are individually bit encoded. The algorithm must do a translation from one format to the othe...
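The format translation mentioned in the note can be sketched as follows. The sketch assumes the platform ID occupies bits 52:50 of IA32_PLATFORM_ID and that each platform value n corresponds to bit n of the Processor Flags field in the update header; the function names are hypothetical.

```c
#include <stdint.h>

/* Convert IA32_PLATFORM_ID[52:50] (a 3-bit number, 0..7) into the
 * one-bit-per-platform mask used by the update header's Processor Flags. */
uint32_t platform_flag(uint64_t ia32_platform_id)
{
    unsigned id = (unsigned)((ia32_platform_id >> 50) & 0x7u);
    return 1u << id;
}

/* An update applies to this platform when its flags include that bit. */
int platform_matches(uint32_t processor_flags, uint64_t ia32_platform_id)
{
    return (processor_flags & platform_flag(ia32_platform_id)) != 0;
}
```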
Page 435 - Table 9-12. Microcode Update Functions; Microcode Update
Vol. 3 9-55 PROCESSOR MANAGEMENT AND INITIALIZATION } // // Compare the Update read to that written // If (Update read != Update written) { Display Diagnostic exit } I ← I + (size of microcode update / 2048) } // // Enable Update Loading, and inform user // Issue the Update Control function with Tas...
Page 436 - Table 9-13. Parameters for the Presence Test; Input
9-56 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION In general, each function returns with CF cleared and AH contains the returned status. The general return codes and other constant definitions are listed in Section 9.11.8.9, “Return Codes.” The OEM error field (AL) is provided for the OEM to return...
Page 437 - Table 9-14. Parameters for the Write Update Data Function
Vol. 3 9-57 PROCESSOR MANAGEMENT AND INITIALIZATION 9.11.8.6 Function 01H—Write Microcode Update Data This function integrates a new microcode update into the BIOS storage device. Table 9-14 lists the parameters and return codes for the function. Table 9-14. Parameters for the Write Update Data Func...
Page 438 - recognized by the BIOS.
9-58 Vol. 3 PROCESSOR MANAGEMENT AND INITIALIZATION Description The BIOS is responsible for selecting an appropriate update block in the non-volatile storage for storing the new update. The BIOS is also responsible for ensuring the integrity of the information provided by the caller, including auth...
Page 443 - Table 9-17. Parameters for the Read Microcode Update Data Function
Vol. 3 9-63 PROCESSOR MANAGEMENT AND INITIALIZATION The READ_FAILURE error code returned by this function has meaning only if the control function is implemented in the BIOS NVRAM. The state of this feature (enabled/disabled) can also be implemented using CMOS RAM bits where READ failure errors cann...
Page 447 - LOCAL AND I/O APIC OVERVIEW
Vol. 3 10-1 CHAPTER 10 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The Advanced Programmable Interrupt Controller (APIC), referred to in the following sections as the local APIC, was introduced into the IA-32 processors with the Pentium processor (see Section 19.27, “Advanced Programmable Inte...
Page 448 - ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC)
10-2 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) interrupt pins (LINT0 and LINT1). The I/O devices may also be connected to an 8259-type interrupt controller that is in turn connected to the processor through one of the local interrupt pins. • Externally connected I/O devices — These in...
Page 450 - Multiple-Processor Systems
10-4 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) also be delivered to the individual processors through the local interrupt pins; however, this mechanism is commonly not used in MP systems. Figure 10-2. Local APICs and I/O APIC When Intel Xeon Processors Are Used in Multiple-Processor S...
Page 451 - SYSTEM BUS VS. APIC BUS
Vol. 3 10-5 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The IPI mechanism is typically used in MP systems to send fixed interrupts (interrupts for a specific vector number) and special-purpose interrupts to processors on the system bus. For example, a local APIC can use an IPI to forward a fi...
Page 452 - APIC; The Local APIC Block Diagram
10-6 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) forward extendability for future Intel platform innovations. These extensions and modifications are noted in the following sections. 10.4 LOCAL APIC The following sections describe the architecture of the local APIC and how to detect it, ...
Page 454 - Table 10-1 Local APIC Register Address Map; Address
10-8 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-1 shows how the APIC registers are mapped into the 4-KByte APIC register space. Registers are 32 bits, 64 bits, or 256 bits in width; all are aligned on 128-bit boundaries. All 32-bit registers should be accessed using 128-bit al...
Page 456 - Presence of the Local APIC; The local APIC can be enabled or disabled in either of two ways:
10-10 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.4.2 Presence of the Local APIC Beginning with the P6 family processors, the presence or absence of an on-chip local APIC can be detected using the CPUID instruction. When the CPUID instruction is executed with a source operand of 1 in...
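The CPUID-based presence check can be sketched in C. The sketch assumes the feature flag in question is bit 9 of the EDX value returned for CPUID leaf 1; it works on a value the caller has already obtained, and the macro and function names are hypothetical.

```c
#include <stdint.h>

#define CPUID_1_EDX_APIC (1u << 9)   /* assumed position: CPUID.01H:EDX[9] */

/* edx is the value CPUID returned in EDX for leaf 1. */
int has_local_apic(uint32_t edx)
{
    return (edx & CPUID_1_EDX_APIC) != 0;
}
```

On GCC or Clang, the EDX value could be obtained with `__get_cpuid` from `<cpuid.h>`; that part is omitted so the sketch stays portable.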
Page 457 - Local APIC Status and Location; Indicates if the processor is the bootstrap processor (BSP).
Vol. 3 10-11 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 1. Using the APIC global enable/disable flag in the IA32_APIC_BASE MSR (MSR address 1BH; see Figure 10-5 ): — When IA32_APIC_BASE[11] is 0, the processor is functionally equivalent to an IA-32 processor without an on-chip APIC. The CPUID...
Page 458 - APIC Global Enable flag, bit 11; Enables or disables the local APIC (see; APIC Base field, bits 12 through 35; Specifies the base address of the APIC; Relocating the Local APIC Registers
10-12 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • APIC Global Enable flag, bit 11 ⎯ Enables or disables the local APIC (see Section 10.4.3, “Enabling or Disabling the Local APIC” ). This flag is available in the Pentium 4, Intel Xeon, and P6 family processors. It is not guaranteed to ...
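The IA32_APIC_BASE fields listed above can be decoded as in the following sketch. It assumes the BSP flag is bit 8, alongside the global enable flag (bit 11) and base field (bits 12 through 35) named in the text; the structure and function names are hypothetical, and reading the MSR itself (RDMSR at privilege level 0) is left to the caller.

```c
#include <stdint.h>

typedef struct {
    int      is_bsp;          /* bit 8 (assumed position of the BSP flag) */
    int      global_enable;   /* bit 11 */
    uint64_t base_address;    /* bits 12-35, a 4-KByte aligned address */
} apic_base_info;

/* Split a raw IA32_APIC_BASE value into its documented fields. */
apic_base_info decode_apic_base(uint64_t msr)
{
    apic_base_info info;
    info.is_bsp        = (int)((msr >> 8) & 1);
    info.global_enable = (int)((msr >> 11) & 1);
    info.base_address  = msr & 0x0000000FFFFFF000ull;
    return info;
}
```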
Page 459 - Local APIC State
Vol. 3 10-13 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) this, operating system software should avoid writing to the local APIC ID register. The value returned by bits 31-24 of the EBX register (when the CPUID instruction is executed with a source operand value of 1 in the EAX register) is alw...
Page 460 - Local APIC State After It Has Been Software Disabled; Disabling the Local APIC”
10-14 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) x2APIC will introduce 32-bit ID; see Section 10.5. 10.4.7.1 Local APIC State After Power-Up or Reset Following a power-up or RESET of the processor, the state of the local APIC and its registers is as follows: • The following registers ar...
Page 461 - Local APIC Version Register; The version numbers of the local APIC:; Max LVT Entry; Shows the number of LVT entries minus 1. For the Pentium 4 and
Vol. 3 10-15 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • The mask bits for all the LVT entries are set. Attempts to reset these bits will be ignored. • (For Pentium and P6 family processors) The local APIC continues to listen to all bus messages in order to keep its arbitration ID synchroniz...
Page 462 - XAPIC; Adds new features to enhance performance of interrupt delivery; DETECTING AND ENABLING x2APIC; Figure 10-7. Local APIC Version Register
10-16 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.5 EXTENDED XAPIC (X2APIC) The x2APIC architecture extends the xAPIC architecture (described in Section 10.4) in a backward compatible manner and provides forward extendability for future Intel platform innovations. Specifically, x2APIC...
Page 463 - Table 10-2. x2APIC Operating Mode Configurations; xAPIC global enable
Vol. 3 10-17 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Table 10-2, “x2APIC Operating Mode Configurations,” describes the possible combinations of the enable bit (EN - bit 11) and the extended mode bit (EXTD - bit 10) in the IA32_APIC_BASE MSR. Once the local APIC has been switched to x2APIC...
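The EN/EXTD combinations can be summarized with a small classifier. This sketch assumes the four rows of the table are: both bits clear (local APIC disabled), EXTD set alone (invalid), EN set alone (xAPIC mode), and both set (x2APIC mode). The enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    APIC_DISABLED,   /* EN = 0, EXTD = 0 */
    APIC_INVALID,    /* EN = 0, EXTD = 1 */
    APIC_XAPIC,      /* EN = 1, EXTD = 0 */
    APIC_X2APIC      /* EN = 1, EXTD = 1 */
} apic_mode;

/* Classify a raw IA32_APIC_BASE value by its EN (bit 11) and EXTD (bit 10) bits. */
apic_mode classify_apic_mode(uint64_t ia32_apic_base)
{
    int en   = (int)((ia32_apic_base >> 11) & 1);
    int extd = (int)((ia32_apic_base >> 10) & 1);
    if (en)
        return extd ? APIC_X2APIC : APIC_XAPIC;
    return extd ? APIC_INVALID : APIC_DISABLED;
}
```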
Page 464 - Table 10-3. Local APIC Register Address Map Supported by x2APIC; MMIO Offset
10-18 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 32-bit register. Similarly executing the WRMSR instruction with the APIC register address in ECX, writes bits 0 to 31 of register EAX to bits 0 to 31 of the specified APIC register. If the register is a 64-bit register then bits 0 to 31 ...
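The APIC register address placed in ECX can be derived from the legacy MMIO offset. The sketch below assumes the x2APIC MSR range starts at 800H and that each 16-byte MMIO slot maps to one MSR; the names are hypothetical. It does not cover registers that exist in only one of the two interfaces, nor the ICR, which is assumed to be accessed as a single 64-bit MSR in x2APIC mode.

```c
#include <stdint.h>

#define X2APIC_MSR_BASE 0x800u   /* assumed start of the x2APIC MSR range */

/* mmio_offset is the xAPIC register offset, for example 0x080 for the TPR. */
uint32_t x2apic_msr_for(uint32_t mmio_offset)
{
    return X2APIC_MSR_BASE + (mmio_offset >> 4);
}
```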
Page 468 - x2APIC Register Availability; provides the interactions between the; MSR Access in x2APIC Mode; MMIO Interface
10-22 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) to enable BIOS and/or platform firmware to re-configure the x2APIC IDs in some clusters to provide for unique and non-overlapping system wide IDs before configuring the disconnected components into a single system. 10.5.2 x2APIC Registe...
Page 469 - Directed EOI with x2APIC Mode
Vol. 3 10-23 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) field, VM-exit MSR-load address field, and VM-entry MSR-load address field in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B). The x2APIC MSRs cannot be loaded and stored on VMX transitions. A VMX transition ...
Page 470 - x2APIC State Transitions; The valid states for a local x2APIC unit is listed in; Figure 10-10. Local APIC Version Register of x2APIC
10-24 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The default value for SVR[bit 12] is clear, indicating that an EOI broadcast will be performed.The support for Directed EOI capability can be detected by means of bit 24 in the Local APIC Version Register. This feature is supported in bo...
Page 472 - x2APIC After RESET
10-26 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) x2APIC After RESET The valid transitions from the xAPIC mode state are: • to the x2APIC mode by setting EXTD to 1 (resulting EN=1, EXTD=1). The physical x2APIC ID (see Figure 10-6) is preserved across this transition and the logical x2A...
Page 473 - x2APIC Transitions From x2APIC Mode; The Logical Destination Register is not preserved.; System Software Transitions
Vol. 3 10-27 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) x2APIC Transitions From x2APIC Mode From the x2APIC mode, the only valid x2APIC transition using IA32_APIC_BASE is to the state where the x2APIC is disabled by setting EN to 0 and EXTD to 0. The x2APIC ID (32 bits) and the legacy local x...
Page 474 - CPUID Extensions And Topology Enumeration
10-28 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Support for the x2APIC architecture can be implemented in the local APIC unit. All existing PCI/MSI capable devices and IOxAPIC unit should work with the x2APIC extensions defined in this document. The x2APIC architecture also provides f...
Page 476 - HANDLING LOCAL INTERRUPTS; Local Vector Table; “CMCI Local APIC Interface”
10-30 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.6 HANDLING LOCAL INTERRUPTS The following sections describe facilities that are provided in the local APIC for handling local interrupts. These include: the processor’s LINT0 and LINT1 pins, the APIC timer, the performance-monitoring ...
Page 478 - Delivery Mode; Specifies the type of interrupt to be sent to the processor. Some
10-32 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The setup information that can be specified in the registers of the LVT table is as follows: Vector Interrupt vector number. Delivery Mode Specifies the type of interrupt to be sent to the processor. Some delivery modes will only operate ...
Page 479 - Interrupt Input Pin Polarity; Selects the trigger mode for the local LINT0 and LINT1 pins: (0); Mask; Valid Interrupt Vectors; Section 6.2, “Exception and Interrupt Vectors”
Vol. 3 10-33 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Interrupt Input Pin Polarity Specifies the polarity of the corresponding interrupt pin: (0) active high or (1) active low. Remote IRR Flag (Read Only) For fixed mode, level-triggered interrupts; this flag is set when the local APIC accep...
Page 482 - Timer
10-36 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) If the ICR is programmed with lowest priority delivery mode then the "Re-directible IPI" bit will be set in x2APIC modes (same as legacy xAPIC behavior) and the interrupt will not be processed. Write to the ICR with both lowest p...
Page 483 - Figure 10-15. Divide Configuration Register
Vol. 3 10-37 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The time base for the timer is derived from the processor’s bus clock, divided by the value specified in the divide configuration register. The timer can be configured through the timer LVT entry for one-shot or periodic operation. In one...
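The divide value can be decoded from the divide configuration register as sketched below. The sketch assumes the encoding uses bits 3, 1, and 0 of the register (bit 2 reserved), with 000B through 110B selecting divide by 2 through 128 and 111B selecting divide by 1; the function name is hypothetical.

```c
#include <stdint.h>

/* Return the bus-clock divisor selected by a divide configuration value. */
unsigned timer_divisor(uint32_t dcr)
{
    /* Gather bits 3, 1, 0 into a 3-bit code; bit 2 is skipped. */
    unsigned code = ((dcr >> 1) & 0x4u) | (dcr & 0x3u);
    return code == 7u ? 1u : (2u << code);
}
```

The timer's count rate is then the bus clock frequency divided by the returned value.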
Page 484 - Local Interrupt Acceptance; rupts”; INTERPROCESSOR; To send an interrupt to another processor.
10-38 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.6.5 Local Interrupt Acceptance When a local interrupt is sent to the processor core, it is subject to the acceptance criteria specified in the interrupt acceptance flow chart in Figure 10-25. If the interrupt is accepted, it is log...
Page 485 - Specifies the type of IPI to be sent. This field is also known as the
Vol. 3 10-39 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The ICR consists of the following fields. Vector The vector number of the interrupt being sent. Delivery Mode Specifies the type of IPI to be sent. This field is also known as the IPI message type field. 000 (Fixed) Delivers the interrupt...
Page 487 - Level; For the INIT level de-assert delivery mode this flag must be set; Trigger Mode; Selects the trigger mode when using the INIT level de-assert; Destination Shorthand
Vol. 3 10-41 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Destination Mode Selects either physical (0) or logical (1) destination mode (see Section 10.7.2, “Determining IPI Destination” ). Delivery Status (Read Only) Indicates the IPI delivery status, as follows: 0 (Idle) There is currently no ...
Page 488 - Destination; Specifies the target processor or processors. This field is only; Local xAPIC Interrupt Command Register
10-42 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) destination field set to FH for Pentium and P6 family processors and to FFH for Pentium 4 and Intel Xeon processors. 11: (All Excluding Self) The IPI is sent to all processors in a system with the exception of the processor sending the I...
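The ICR fields described above can be packed into the low doubleword as in this sketch. It assumes the usual bit positions (vector in bits 7:0, delivery mode in bits 10:8, destination mode in bit 11, level in bit 14, trigger mode in bit 15, destination shorthand in bits 19:18); the enum and function names are hypothetical, and the destination field in the high doubleword is not handled.

```c
#include <stdint.h>

enum { ICR_FIXED = 0, ICR_LOWEST = 1, ICR_SMI = 2,
       ICR_NMI = 4, ICR_INIT = 5, ICR_STARTUP = 6 };
enum { ICR_NO_SHORTHAND = 0, ICR_SELF = 1,
       ICR_ALL_INCL_SELF = 2, ICR_ALL_EXCL_SELF = 3 };

/* Build the low 32 bits of the ICR from its individual fields. */
uint32_t icr_low(uint8_t vector, unsigned delivery_mode,
                 unsigned logical_dest, unsigned level_assert,
                 unsigned level_trigger, unsigned shorthand)
{
    return (uint32_t)vector
         | ((delivery_mode & 0x7u) << 8)
         | ((logical_dest  & 0x1u) << 11)
         | ((level_assert  & 0x1u) << 14)
         | ((level_trigger & 0x1u) << 15)
         | ((shorthand     & 0x3u) << 18);
}
```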
Page 489 - Table 10-7 Valid Combinations for the P6 Family Processors’
Vol. 3 10-43 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Self Invalid X Lowest Priority, NMI, INIT, SMI, Start- Up X All Including Self Valid Edge Fixed X All Including Self Invalid 2 Level Fixed X All Including Self Invalid X Lowest Priority, NMI, INIT, SMI, Start- Up X All Excluding Self Val...
Page 492 - Determining IPI Destination; excluding self, or self as the destination.
10-46 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.7.2 Determining IPI Destination The destination of an IPI can be one, all, or a subset (group) of the processors on the system bus. The sender of the IPI specifies the destination of an IPI with the following APIC registers and fields...
Page 495 - provides the layout of the; Figure 10-21. Logical Destination Register in x2APIC Mode
Vol. 3 10-49 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) lowest priority delivery mode is not supported in cluster mode and must not be configured by software. The hierarchical cluster destination model can be used with Pentium 4, Intel Xeon, P6 family, or Pentium processors. With this model, a...
Page 496 - Deriving Logical x2APIC ID from the Local x2APIC ID
10-50 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) mode is not supported in the x2APIC mode. Hence the Destination Format Register (DFR) is eliminated in x2APIC mode. The 32-bit logical x2APIC ID field of LDR is partitioned into two sub-fields: • Cluster ID (LDR[31:16]): is the address o...
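The derivation of the logical x2APIC ID from the local x2APIC ID can be sketched as follows. It assumes the cluster ID (LDR[31:16]) is taken from x2APIC ID bits 19:4 and the 16-bit logical ID (LDR[15:0]) is a single bit selected by x2APIC ID bits 3:0; the function name is hypothetical.

```c
#include <stdint.h>

/* Compute the 32-bit logical x2APIC ID (LDR value) for a local x2APIC ID. */
uint32_t logical_x2apic_id(uint32_t x2apic_id)
{
    uint32_t cluster = (x2apic_id >> 4) & 0xFFFFu;   /* x2APIC ID[19:4] */
    uint32_t bit     = 1u << (x2apic_id & 0xFu);     /* x2APIC ID[3:0]  */
    return (cluster << 16) | bit;
}
```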
Page 498 - IPI Delivery and Acceptance
10-52 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Here, the TPR value is the task priority value in the TPR (see Figure 10-26 ), the IRRV value is the vector number for the highest priority bit that is set in the IRR (see Figure 10-28 ) or 00H (if no IRR bit is set), and the ISRV value ...
Page 499 - SYSTEM AND APIC BUS ARBITRATION
Vol. 3 10-53 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) The SELF IPI register is a write-only register. A RDMSR instruction with address of the SELF IPI register will raise a GP fault. The handling and prioritization of a self-IPI sent via the SELF IPI register is architecturally identical t...
Page 500 - INTERRUPTS; Interrupt Handling with the Pentium 4 and Intel Xeon; Intel Xeon Processors)
10-54 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) priorities of the local APICs by resetting Arb ID register of each agent to its current APIC ID value. (The Pentium 4 and Intel Xeon processors do not implement the Arb ID register.) Section 10.11, “APIC Bus Message Passing Mechanism and...
Page 501 - Interrupt Handling with the P6 Family and Pentium
Vol. 3 10-55 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 3. If the local APIC determines that it is the designated destination for the interrupt but the interrupt request is not one of the interrupts given in step 2, the local APIC sets the appropriate bit in the IRR. 4. When interrupts are pe...
Page 505 - Interrupt Acceptance for Fixed Interrupts
Vol. 3 10-59 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Its value in the PPR is computed as follows: IF TPR[7:4] ≥ ISRV[7:4] THEN PPR[7:0] ← TPR[7:0] ELSE PPR[7:4] ← ISRV[7:4] PPR[3:0] ← 0 Here, the ISRV value is the vector number of the highest priority ISR bit that is set, or 00H if no ISR ...
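The PPR computation given in the pseudocode above translates directly to C. The function name is hypothetical; the inputs are the 8-bit TPR value and the ISRV value as defined in the text.

```c
#include <stdint.h>

/* PPR = TPR when TPR's priority class is at least ISRV's class;
 * otherwise PPR takes ISRV's class with the low four bits cleared. */
uint8_t compute_ppr(uint8_t tpr, uint8_t isrv)
{
    if ((tpr >> 4) >= (isrv >> 4))
        return tpr;
    return (uint8_t)(isrv & 0xF0u);
}
```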
Page 507 - Signaling Interrupt Servicing Completion; Signaling Interrupt Servicing Completion in x2APIC Mode; Task Priority in IA-32e Mode
Vol. 3 10-61 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) bit is cleared for edge-triggered interrupts and set for level-triggered interrupts. If a TMR bit is set when an EOI cycle for its corresponding interrupt vector is generated, an EOI message is sent to all I/O APICs. 10.9.5 Signaling Int...
Page 508 - Interaction of Task Priorities between CR8 and APIC; The processor powers up with the local APIC enabled.
10-62 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) • Loading the TPR with a value of 8 (01000B) blocks all interrupts with a priority of 8 or less while allowing all interrupts with a priority of nine or more to be recognized. • Loading the TPR with zero enables all external interrupts. ...
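The TPR examples above can be expressed as a predicate. This is a simplified sketch: it assumes an interrupt's priority is its priority class (vector bits 7:4), that the value written is the 4-bit class held in CR8, and it ignores the contribution of in-service interrupts to the processor priority. The function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 when an external interrupt with this vector is held off
 * by the given 4-bit task priority class. */
int cr8_blocks(unsigned cr8, uint8_t vector)
{
    unsigned priority_class = (unsigned)vector >> 4;
    return priority_class <= (cr8 & 0xFu);
}
```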
Page 509 - APIC Software; APIC”; Checking
Vol. 3 10-63 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) There are no ordering mechanisms between direct updates of the APIC.TPR and CR8. Operating software should implement either direct APIC TPR updates or CR8 style TPR updates but not mix them. Software can use a serializing instruction (fo...
Page 510 - APIC BUS MESSAGE PASSING MECHANISM AND
10-64 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.11 APIC BUS MESSAGE PASSING MECHANISM AND PROTOCOL (P6 FAMILY, PENTIUM PROCESSORS) The Pentium 4 and Intel Xeon processors pass messages among the local and I/O APICs on the system bus, using the system bus message passing mechanism a...
Page 511 - MESSAGE SIGNALLED INTERRUPTS; and
Vol. 3 10-65 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) the bus regardless of its sender’s arbitration priority, unless more than one APIC issues an EOI message simultaneously. In the latter case, the APICs sending the EOI messages arbitrate using their arbitration priorities. If the APICs are...
Page 512 - Figure 10-32. Layout of the MSI Message Address Register
10-66 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) 10.12.1 Message Address Register Format The format of the Message Address Register (lower 32 bits) is shown in Figure 10-32. Fields in the Message Address Register are as follows: 1. Bits 31-20 — These bits contain a fixed value for inter...
Page 513 - Figure 10-33. Layout of the MSI Message Data Register
Vol. 3 10-67 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) destination mode and only the processor in the system that has the matching APIC ID is considered for delivery of that interrupt (this means no re-direction). If RH is 1 and DM is 1, the Destination ID Field is interpreted as in logical ...
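A Message Address Register value can be composed as in the sketch below. It assumes the fixed value in bits 31-20 is 0FEEH, the Destination ID occupies bits 19-12, RH is bit 3, and DM is bit 2; the function name is hypothetical, and reserved bits are simply left at zero here rather than preserved from a prior read.

```c
#include <stdint.h>

/* Compose the lower 32 bits of an MSI message address. */
uint32_t msi_address(uint8_t dest_id, unsigned rh, unsigned dm)
{
    return 0xFEE00000u
         | ((uint32_t)dest_id << 12)   /* Destination ID, bits 19-12 */
         | ((rh & 1u) << 3)            /* Redirection Hint */
         | ((dm & 1u) << 2);           /* Destination Mode */
}
```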
Page 514 - c. 010B (System Management Interrupt or SMI) — The delivery mode is
10-68 Vol. 3 ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC) Reserved fields are not assumed to be any value. Software must preserve their contents on writes. Other fields in the Message Data Register are described below. 1. Vector — This 8-bit field contains the interrupt vector associated with th...
Page 516 - Figure 11-2. Cache Structure of the Intel Core i7 Processors; Cache or Buffer
11-2 Vol. 3 MEMORY CACHE CONTROL Figure 11-2 shows the cache arrangement of Intel Core i7 processor. Figure 11-2. Cache Structure of the Intel Core i7 Processors Table 11-1. Characteristics of the Caches, TLBs, Store Buffer, and Write Combining Buffer in Intel 64 and IA-32 Processors Cache or Buffer...
Page 521 - TERMINOLOGY
Vol. 3 11-7 MEMORY CACHE CONTROL Processors based on Intel Core microarchitectures implement one level of instruction TLB and two levels of data TLB. Intel Core i7 processor provides a second-level unified TLB. The store buffer is associated with the processor’s instruction execution units. It allows...
Page 522 - METHODS OF CACHING AVAILABLE
11-8 Vol. 3 MEMORY CACHE CONTROL (depending on the write policy currently in force) can also write it out to memory. If the operand is to be written out to memory, it is written first into the store buffer, and then written from the store buffer to memory when the system bus is available. (Note that...
Page 523 - Table 11-2. Memory Types and Their Properties
Vol. 3 11-9 MEMORY CACHE CONTROL registers to access UC memory that may have read or write side effects. • Uncacheable (UC-) — Has same characteristics as the strong uncacheable (UC) memory type, except that this memory type can be overridden by programming the MTRRs for the WC memory type. This mem...
Page 525 - Buffering of Write Combining Memory Locations
Vol. 3 11-11 MEMORY CACHE CONTROL 11.3.1 Buffering of Write Combining Memory Locations Writes to the WC memory type are not cached in the typical sense of the word cached. They are retained in an internal write combining buffer (WC buffer) that is separate from the internal L1, L2, and L3 caches and...
Page 526 - Choosing a Memory Type
11-12 Vol. 3 MEMORY CACHE CONTROL The WC memory type is weakly ordered by definition. Once the eviction of a WC buffer has started, the data is subject to the weak ordering semantics of its definition. Ordering is not maintained between the successive allocation/deallocation of WC buffers (for exam...
Page 527 - Code Fetches in Uncacheable Memory; CACHE CONTROL PROTOCOL
Vol. 3 11-13 MEMORY CACHE CONTROL large data structure should be marked as uncacheable, or reading it will evict cached lines that the processor will be referencing again. A similar example would be a write-only data structure that is written to (to export the data to another agent), but never read ...
Page 528 - CONTROL; Cache Line State
11-14 Vol. 3 MEMORY CACHE CONTROL The L1 instruction cache in P6 family processors implements only the “SI” part of the MESI protocol, because the instruction cache is not writable. The instruction cache monitors changes in the data cache to maintain consistency between the caches when instructions ...
Page 529 - Cache Control Registers and Bits
Vol. 3 11-15 MEMORY CACHE CONTROL 11.5.1 Cache Control Registers and Bits Figure 11-3 depicts cache-control mechanisms in IA-32 processors. Other than for the matter of memory address space, these work the same in Intel 64 processors. The Intel 64 and IA-32 architectures provide the following cache-c...
Page 532 - CD NW
11-18 Vol. 3 MEMORY CACHE CONTROL • NW flag, bit 29 of control register CR0 — Controls the write policy for system memory locations (see Section 2.5, “Control Registers”). If the NW and CD flags are clear, write-back is enabled for the whole of system memory, but may be restricted for individual pag...
Page 534 - Precedence of Cache Controls; Selecting Memory Types for Pentium Pro and Pentium II
11-20 Vol. 3 MEMORY CACHE CONTROL page-table entries) permit caching in an external L2 cache to be controlled on a page-by-page basis, consistent with the control exercised on the L1 cache of these processors. The P6 and more recent processor families do not provide these pins because the L2 cache i...
Page 535 - Table 11-6. Effective Page-Level Memory Type for Pentium Pro and; MTRR Memory Type
Vol. 3 11-21 MEMORY CACHE CONTROL When normal caching is in effect, the effective memory type shown in Table 11-6 is determined using the following rules: 1. If the PCD and PWT attributes for the page are both 0, then the effective memory type is identical to the MTRR-defined memory type. 2. If the P...
Page 536 - Selecting Memory Types for Pentium III and More Recent; Processor Families
11-22 Vol. 3 MEMORY CACHE CONTROL 11.5.2.2 Selecting Memory Types for Pentium III and More Recent Processor Families The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Intel Core Solo, Pentium M, Pentium 4, Intel Xeon, and Pentium III processors use the PAT to select effective page-level memory types...
Page 537 - Writing Values Across Pages with Different Memory Types; Architectures Software Developer’s Manual
Vol. 3 11-23 MEMORY CACHE CONTROL 11.5.2.3 Writing Values Across Pages with Different Memory Types If two adjoining pages in memory have different memory types, and a word or longer operand is written to a memory location that crosses the page boundary between those two pages, the operand might be w...
Page 538 - Caching
11-24 Vol. 3 MEMORY CACHE CONTROL 11.5.3 Preventing Caching To disable the L1, L2, and L3 caches after they have been enabled and have received cache fills, perform the following steps: 1. Enter the no-fill cache mode. (Set the CD flag in control register CR0 to 1 and the NW flag to 0.) 2. Flush all c...
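Step 1 above amounts to a bit manipulation on the CR0 value. The sketch below computes the new value only; it assumes CD is bit 30 and NW is bit 29 of CR0, the names are hypothetical, and the privileged operations themselves (writing CR0 and the cache flush in the later steps) are not shown.

```c
#include <stdint.h>

#define CR0_NW (UINT64_C(1) << 29)   /* Not Write-through */
#define CR0_CD (UINT64_C(1) << 30)   /* Cache Disable (assumed bit 30) */

/* Return the CR0 value for no-fill cache mode: CD = 1, NW = 0. */
uint64_t cr0_no_fill(uint64_t cr0)
{
    return (cr0 | CR0_CD) & ~CR0_NW;
}
```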
Page 539 - Disabling and Enabling the L3 Cache
Vol. 3 11-25 MEMORY CACHE CONTROL 11.5.4 Disabling and Enabling the L3 Cache On processors based on Intel NetBurst microarchitecture, the third-level cache can be disabled by bit 6 of the IA32_MISC_ENABLE MSR. The third-level cache disable flag (bit 6 of the IA32_MISC_ENABLE MSR) allows the L3 cache...
Page 540 - L1 Data Cache Context Mode
11-26 Vol. 3 MEMORY CACHE CONTROL The CLFLUSH instruction allows selected cache lines to be flushed from memory. This instruction gives a program the ability to explicitly free up cache space when it is known that a cached section of system memory will not be accessed in the near future. The non-tempora...
Page 542 - CACHING; The Pentium
11-28 Vol. 3 MEMORY CACHE CONTROL To avoid problems related to implicit caching, the operating system must explicitly invalidate the cache when changes are made to cacheable data that the cache coherency mechanism does not automatically handle. This includes writes to dual-ported or physically alia...
Page 543 - INVALIDATING THE TRANSLATION LOOKASIDE; Writing to control register CR0 to modify the PG or PE flag.; BUFFER; When an exception or interrupt is generated.
Vol. 3 11-29 MEMORY CACHE CONTROL 11.9 INVALIDATING THE TRANSLATION LOOKASIDE BUFFERS (TLBS) The processor updates its address translation caches (TLBs) transparently to software. Several mechanisms are available, however, that allow software and hardware to invalidate the TLBs either explicitly or...
Page 544 - Table 11-8. Memory Types That Can Be Encoded in MTRRs; Memory Type and Mnemonic
11-30 Vol. 3 MEMORY CACHE CONTROL The discussion of write ordering in Section 8.2, “Memory Ordering,” gives a detailed description of the operation of the store buffer. 11.11 MEMORY TYPE RANGE REGISTERS (MTRRS) The following section pertains only to the P6 and more recent processor families. The memo...
Page 545 - Figure 11-4. Mapping Physical Memory With MTRRs
Vol. 3 11-31 MEMORY CACHE CONTROL Reserved* 03H Write-through (WT) 04H Write-protected (WP) 05H Writeback (WB) 06H Reserved* 7H through FFH NOTE: * Use of these encodings results in a general-protection exception (#GP). Figure 11-4. Mapping Physical Memory With MTRRs Table 11-8. Memory Types That Ca...
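Because the reserved encodings in Table 11-8 raise #GP, software that builds MTRR values may want to validate a type byte first. The sketch below assumes the usual encodings for the two entries not visible in this excerpt (UC = 00H, WC = 01H); the enum and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory type encodings for MTRRs (Table 11-8). */
enum mtrr_type {
    MTRR_UC = 0x00, /* Uncacheable (assumed; row not shown in excerpt) */
    MTRR_WC = 0x01, /* Write combining (assumed; row not shown in excerpt) */
    MTRR_WT = 0x04, /* Write-through */
    MTRR_WP = 0x05, /* Write-protected */
    MTRR_WB = 0x06  /* Writeback */
};

/* True if the encoding is a defined memory type; 02H, 03H, and
 * 07H through FFH are reserved. */
bool mtrr_type_is_valid(uint8_t encoding)
{
    switch (encoding) {
    case MTRR_UC:
    case MTRR_WC:
    case MTRR_WT:
    case MTRR_WP:
    case MTRR_WB:
        return true;
    default:
        return false;
    }
}
```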
Page 547 - Setting Memory Ranges with MTRRs
Vol. 3 11-33 MEMORY CACHE CONTROL 11.11.2 Setting Memory Ranges with MTRRs The memory ranges and the types of memory specified in each range are set by three groups of registers: the IA32_MTRR_DEF_TYPE MSR, the fixed-range MTRRs, and the variable range MTRRs. These registers can be read and written ...
Page 550 - Register Pair
11-36 Vol. 3 MEMORY CACHE CONTROL — The width of the PhysMask field depends on the maximum physical address size supported by the processor. CPUID.80000008H reports the maximum physical address size supported by the processor. If CPUID.80000008H is not available, software may assume that the process...
Page 552 - Example Base and Mask Calculations
11-38 Vol. 3 MEMORY CACHE CONTROL Before attempting to access these SMRR registers, software must test bit 11 in the IA32_MTRRCAP register. If SMRR is not supported, reads from or writes to registers cause general-protection exceptions. When the valid flag in the IA32_SMRR_PHYSMASK MSR is 1, accesses...
Page 554 - Address Support
11-40 Vol. 3 MEMORY CACHE CONTROL IA32_MTRR_PHYSBASE5 = 0000 0000 A000 0001H IA32_MTRR_PHYSMASK5 = 0000 000F FF80 0800H Caches A0000000-A0800000 as WC type. This MTRR setup uses the ability to overlap any two memory ranges (as long as the ranges are mapped to WB and UC memory types) to minimize the n...
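The PHYSBASE5/PHYSMASK5 pair above can be reproduced arithmetically. The following sketch assumes a 36-bit physical address width (which matches the mask shown), the valid flag at bit 11 of PHYSMASK, and the type in the low byte of PHYSBASE; the function names are hypothetical, and writing the MSRs is out of scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MTRR_PHYSMASK_VALID (1ULL << 11) /* V flag in IA32_MTRR_PHYSMASKn */

/* Compose a PHYSBASE value from a 4-KByte-aligned base and a type byte. */
uint64_t mtrr_physbase(uint64_t base, uint8_t type)
{
    assert((base & 0xFFFULL) == 0);
    return base | type;
}

/* Compose a PHYSMASK value for a power-of-2 size and a given
 * physical address width (as reported by CPUID.80000008H). */
uint64_t mtrr_physmask(uint64_t size, unsigned maxphyaddr)
{
    assert(size >= 4096 && (size & (size - 1)) == 0);
    assert(maxphyaddr >= 32 && maxphyaddr <= 52);
    uint64_t addr_bits = (1ULL << maxphyaddr) - 1;
    return (~(size - 1) & addr_bits) | MTRR_PHYSMASK_VALID;
}

/* An address is in the range when (addr AND mask) == (base AND mask). */
bool mtrr_matches(uint64_t addr, uint64_t physbase, uint64_t physmask)
{
    uint64_t mask = physmask & ~0xFFFULL;
    return (addr & mask) == (physbase & mask);
}
```

With a base of A0000000H, type WC (01H), and a size of 8 MBytes, these helpers yield the two register values listed above.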
Page 555 - Range Size and Alignment Requirement; For ranges greater than 4 KBytes, each range must be of length 2; Initialization
Vol. 3 11-41 MEMORY CACHE CONTROL 11.11.4 Range Size and Alignment Requirement A range that is to be mapped to a variable-range MTRR must meet the following “power of 2” size and alignment rules: 1. The minimum range size is 4 KBytes and the base address of the range must be on at least a 4-KByte bou...
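The "power of 2" rules can be checked before a range is handed to a variable-range MTRR. This is a minimal sketch with a hypothetical function name: the size must be at least 4 KBytes and a power of 2, and the base must be aligned to the size.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [base, base + size) satisfies the variable-range MTRR rules:
 * size >= 4 KBytes, size is 2^n, and base is aligned on a 2^n boundary. */
bool mtrr_range_is_valid(uint64_t base, uint64_t size)
{
    if (size < 4096)
        return false;
    if ((size & (size - 1)) != 0) /* not a power of 2 */
        return false;
    return (base & (size - 1)) == 0;
}
```

A range that fails the check (for example, 12 KBytes) has to be split into several conforming ranges, each consuming its own register pair.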
Page 556 - MTRR Maintenance Programming Interface
11-42 Vol. 3 MEMORY CACHE CONTROL the MTRRs according to known types of memory, including memory on devices that it auto-configures. Initialization is expected to occur prior to booting the operating system. See Section 11.11.8, “MTRR Considerations in MP Systems,” for information on initializing MTR...
Page 560 - MTRR Considerations in MP Systems
11-46 Vol. 3 MEMORY CACHE CONTROL END The physical address to variable range mapping algorithm in the MemTypeSet function detects conflicts with current variable range registers by cycling through them and determining whether the physical address in question matches any of the current ranges. Durin...
Page 562 - Detecting Support for the PAT Feature
11-48 Vol. 3 MEMORY CACHE CONTROL The requirement that all 4-KByte ranges in a large page are of the same memory type implies that large pages with different memory types may suffer a performance penalty, since they must be marked with the lowest common denominator memory type. The Pentium 4, Intel X...
Page 563 - MSR; Table 11-10. Memory Types That Can Be Encoded With PAT; Encoding
Vol. 3 11-49 MEMORY CACHE CONTROL 11.12.2 IA32_PAT MSR The IA32_PAT MSR is located at MSR address 277H (see Appendix B, “Model-Specific Registers (MSRs)”), and this MSR will remain at the same address on future IA-32 processors that support the PAT feature. Figure 11-9 shows the format of the...
Page 564 - Selecting a Memory Type from the PAT
11-50 Vol. 3 MEMORY CACHE CONTROL 11.12.3 Selecting a Memory Type from the PAT To select a memory type for a page from the PAT, a 3-bit index made up of the PAT, PCD, and PWT bits must be encoded in the page-table or page-directory entry for the page. Table 11-11 shows the possible encodings of the ...
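The index construction can be sketched as follows. The code assumes the IA32_PAT layout of eight entries, one per byte, with the memory type in the low 3 bits of each byte; the function names are hypothetical, and the value 0007040600070406H used in the usage note is the documented reset value as recalled here, to be confirmed against the reset-value table in this chapter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build the 3-bit PAT index from the PAT, PCD, and PWT bits of a
 * page-table or page-directory entry (PAT is the most significant bit). */
unsigned pat_index(bool pat, bool pcd, bool pwt)
{
    return ((unsigned)pat << 2) | ((unsigned)pcd << 1) | (unsigned)pwt;
}

/* Extract the memory type encoding of entry PA<index> from IA32_PAT. */
uint8_t pat_entry(uint64_t ia32_pat, unsigned index)
{
    assert(index < 8);
    return (uint8_t)((ia32_pat >> (index * 8)) & 0x7);
}
```

For example, with IA32_PAT = 0007040600070406H, a page with PAT = 0, PCD = 0, PWT = 1 selects entry PA1, whose encoding is 04H (WT).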
Page 567 - This chapter describes those features of the Intel; EMULATION OF THE MMX INSTRUCTION SET; Table 12-1. Action Taken By MMX Instructions
Vol. 3 12-1 CHAPTER 12 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING This chapter describes those features of the Intel ® MMX™ technology that must be considered when designing or enhancing an operating system to support MMX technology. It covers MMX instruction set emulation, the MMX state, aliasing...
Page 568 - Figure 12-1. Mapping of MMX Registers to Floating-Point Registers
12-2 Vol. 3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING result, the MMX register mapping is fixed and is not affected by value in the Top Of Stack (TOS) field in the floating-point status word (bits 11 through 13). When a value is written into an MMX register using an MMX instruction, the value also...
Page 569 - Instructions on the x87 FPU Tag Word; Table 12-2. Effects of MMX Instructions on x87 FPU State
Vol. 3 12-3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING • When the EMMS instruction is executed, each tag field in the x87 FPU tag word is set to 11B (empty). • Each time an MMX instruction is executed, the TOS value is set to 000B. Execution of MMX instructions does not affect the other bits in the...
Page 570 - SAVING AND RESTORING THE MMX STATE AND; x87 FPU, and FXSAVE/FXRSTOR Instructions on the
12-4 Vol. 3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING 12.3 SAVING AND RESTORING THE MMX STATE AND REGISTERS Because the MMX registers are aliased to the x87 FPU data registers, the MMX state can be saved to memory and restored from memory as follows: • Execute an FSAVE, FNSAVE, or FXSAVE instructi...
Page 571 - SAVING MMX STATE ON TASK OR CONTEXT
Vol. 3 12-5 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING • Execute eight MOVQ instructions to save the contents of the MMX0 through MMX7 registers to memory. An EMMS instruction may then (optionally) be executed to clear the MMX state in the x87 FPU. • Execute eight MOVQ instructions to read the save...
Page 572 - Effect of MMX Instructions on Pending x87 Floating-Point; MMX
12-6 Vol. 3 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING • System exceptions: — Invalid Opcode (#UD), if the EM flag in control register CR0 is set when an MMX instruction is executed (see Section 12.1, “Emulation of the MMX Instruction Set”). — Device not available (#NM), if an MMX instruction is exe...
Page 573 - Figure 12-2. Mapping of MMX Registers to x87 FPU Data Register Stack
Vol. 3 12-7 INTEL ® MMX ™ TECHNOLOGY SYSTEM PROGRAMMING When the TOS equals 2 (case B in Figure 12-2), ST0 points to the physical location R2. MM0 maps to ST6, MM1 maps to ST7, MM2 maps to ST0, and so on. Figure 12-2. Mapping of MMX Registers to x87 FPU Data Register Stack MM0 MM1 MM2 MM3 MM4 MM5 MM...
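The TOS = 2 case can be generalized. Since MMn is fixed to physical register Rn and ST(i) refers to physical register (TOS + i) mod 8, the stack-relative name of an MMX register follows from a modular subtraction. A small sketch, with a hypothetical function name:

```c
#include <assert.h>

/* Return i such that MM<mm> aliases ST(i) for the given TOS value. */
unsigned mmx_to_st(unsigned mm, unsigned tos)
{
    assert(mm < 8 && tos < 8);
    return (mm + 8 - tos) % 8;
}
```

With TOS = 2 this gives MM0 to ST6, MM1 to ST7, and MM2 to ST0, matching case B in Figure 12-2.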
Page 575 - maintaining various system programming resources.
Vol. 3 13-1 CHAPTER 13 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR EXTENDED STATES This chapter describes system programming features for instruction set extensions operating on the processor state extension known as the SSE state (XMM registers, MXCSR) and for processor extended...
Page 577 - SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND; Checking for Support for the FXSAVE and FXRSTOR
Vol. 3 13-3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND To use the POPCNT instruction, software must check CPUID.1:ECX.POPCNT[bit 23] = 1. 13.1.3 Checking for Support for the FXSAVE and FXRSTOR Instructions A separate check must be made to ensure that the processor supports FXSAVE and FXRSTOR. ...
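These feature checks reduce to testing single bits of the CPUID leaf 1 output. The sketch below takes the ECX and EDX values as parameters so that it stays portable; it assumes the FXSR flag is CPUID.1:EDX bit 24 (not visible in this excerpt), and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_POPCNT (1u << 23) /* POPCNT instruction */
#define CPUID1_EDX_FXSR   (1u << 24) /* FXSAVE/FXRSTOR (assumed bit) */

/* ecx: value returned in ECX by CPUID with EAX = 1. */
bool cpuid1_has_popcnt(uint32_t ecx)
{
    return (ecx & CPUID1_ECX_POPCNT) != 0;
}

/* edx: value returned in EDX by CPUID with EAX = 1. */
bool cpuid1_has_fxsr(uint32_t edx)
{
    return (edx & CPUID1_EDX_FXSR) != 0;
}
```

On GCC or Clang, the register values could be obtained with `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`; other toolchains provide their own intrinsics.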
Page 579 - Providing Non-Numeric Exception Handlers for Exceptions
Vol. 3 13-5 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND The SIMD floating-point exception mask bits (bits 7 through 12), the flush-to-zero flag (bit 15), the denormals-are-zero flag (bit 6), and the rounding control field (bits 13 and 14) in the MXCSR register should be left in their defau...
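The MXCSR fields named above can be pulled apart with shifts and masks. This is an illustrative decoder with hypothetical names; the usage note relies on 1F80H as the MXCSR default, which is an assumption not stated in this excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

struct mxcsr_fields {
    unsigned exception_masks;  /* bits 7 through 12 */
    unsigned rounding_control; /* bits 13 and 14 */
    bool flush_to_zero;        /* bit 15 */
    bool denormals_are_zero;   /* bit 6 */
};

/* Decode the control fields of an MXCSR value. */
struct mxcsr_fields mxcsr_decode(uint32_t mxcsr)
{
    struct mxcsr_fields f;
    f.exception_masks = (mxcsr >> 7) & 0x3Fu;
    f.rounding_control = (mxcsr >> 13) & 0x3u;
    f.flush_to_zero = ((mxcsr >> 15) & 1u) != 0;
    f.denormals_are_zero = ((mxcsr >> 6) & 1u) != 0;
    return f;
}
```

Decoding 1F80H yields all six exception masks set, round-to-nearest (00B), and both flush-to-zero and denormals-are-zero clear.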
Page 581 - Providing an Handler for the SIMD Floating-Point Exception
Vol. 3 13-7 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND — Device not available (#NM). This exception is generated by executing a SSE/SSE2/SSE3/SSSE3/SSE4 instruction when the TS flag (bit 3) of CR0 is set to 1. Other exceptions can occur indirectly due to faulty execution of the above exce...
Page 583 - TASK OR CONTEXT SWITCHES
Vol. 3 13-9 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND • Execute a LDMXCSR instruction to restore the state of the MXCSR register from memory. 13.4 SAVING THE SSE/SSE2/SSE3/SSSE3/SSE4 STATE ON TASK OR CONTEXT SWITCHES When switching from one task or context to another, it is often necessa...
Page 584 - Using the TS Flag to Control the Saving of the
13-10 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR when a suspended task is resumed (using an FXRSTOR instruction). Here, the x87 FPU/MMX/SSE/SSE2/SSE3/SSE4 state must be saved as part of the task state. This approach is appropriate for preemptive multitasking operating sys...
Page 585 - State During an Operating-System Controlled Task Switch
Vol. 3 13-11 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND The TS flag can be set either explicitly (by executing a MOV instruction to control register CR0) or implicitly (using the IA-32 architecture’s native task switching mechanism). When the native task switching mechanism is used, the ...
Page 586 - This exception handler code performs the following tasks:; XSAVE/XRSTOR AND PROCESSOR EXTENDED STATE; support of XSAVE/XRSTOR architecture extensions
13-12 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR If a new task attempts to access an x87 FPU, MMX, XMM, or MXCSR register while the TS flag is set to 1, a device-not-available exception (#NM) is generated. The device-not-available exception handler executes the following ...
Page 587 - Header
Vol. 3 13-13 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND — CPUID leaf function 0DH enumerates the list of processor states (including legacy x87 FPU, SSE states and processor extended states), the offset and size of individual save area for each processor extended state. • Control register...
Page 588 - of Processor State Extensions; Byte Offset
13-14 Vol. 3 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND PROCESSOR The XSAVE header is 64 bytes in length and must be aligned on a 64-byte boundary. Therefore, the XSAVE/XRSTOR region must be aligned on a 64-byte boundary. The format of the header is as follows (see Table 13-3): The value of e...
Page 589 - INTEROPERABILITY OF XSAVE/XRSTOR AND
Vol. 3 13-15 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND enabled), a value of "1" in the corresponding bit of HEADER.XSTATE_BV causes the processor state to be updated with contents of the save area read from the memory image. A value of "0" in HEADER.XSTATE_BV causes the p...
Page 591 - DETECTION, ENUMERATION, ENABLING PROCESSOR; Figure 13-3. OS Enabling of Processor Extended State Support
Vol. 3 13-17 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND 13.8 DETECTION, ENUMERATION, ENABLING PROCESSOR EXTENDED STATE SUPPORT An OS can determine if the XSAVE/XRSTOR/XGETBV/XSETBV instructions and the XFEATURE_ENABLED_MASK register (XCR0) are available in the processor by checking the va...
Page 593 - Extended State
Vol. 3 13-19 SYSTEM PROGRAMMING FOR INSTRUCTION SET EXTENSIONS AND If all three requirements are met, applications can use the target new instruction set extensions. If any of the above requirements are not met, an attempt to execute an instruction operating on a processor extended state correspondi...
Page 595 - ENHANCED INTEL SPEEDSTEP; Enhanced Intel SpeedStep; Software Interface For Initiating Performance State
Vol. 3 14-1 CHAPTER 14 POWER AND THERMAL MANAGEMENT This chapter describes facilities of Intel 64 and IA-32 architecture used for power management and thermal monitoring. 14.1 ENHANCED INTEL SPEEDSTEP ® TECHNOLOGY Enhanced Intel SpeedStep ® Technology was introduced in the Pentium M processor; it is...
Page 596 - P-STATE HARDWARE COORDINATION
14-2 Vol. 3 POWER AND THERMAL MANAGEMENT tools can access model-specific events and report the occurrences of state transitions. 14.2 P-STATE HARDWARE COORDINATION The Advanced Configuration and Power Interface (ACPI) defines performance states (P-states) that are used to facilitate system software’s ab...
Page 598 - SYSTEM SOFTWARE CONSIDERATIONS AND; Dynamic
14-4 Vol. 3 POWER AND THERMAL MANAGEMENT // This example does not cover the additional logic or algorithms// necessary to coordinate multiple logical processors to a target P-state. TargetPstate = FindPstate(PercentPerformance); if (TargetPstate != currentPstate) { SetPState(TargetPstate); } // WRMS...
Page 599 - Discover Hardware Support and Enabling of Opportunistic
Vol. 3 14-5 POWER AND THERMAL MANAGEMENT corresponding enable mechanism is activated, the headroom is available and certain criteria are met. • The opportunistic processor performance operation is generally transparent to most application software. • System software (BIOS and Operating system) must ...
Page 602 - Intel Turbo Boost Technology
14-8 Vol. 3 POWER AND THERMAL MANAGEMENT • When the OS timer service transfers control, the application can use RDPMC (with ECX = 4000_0001H) to read IA32_PERF_FIXED_CTR1 (MSR address 30AH) to record the unhalted core clocktick (UCC) value; followed by RDPMC (ECX=4000_0002H) to read IA32_PERF_FIXED_...
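The RDPMC selectors quoted above follow a simple pattern: bit 30 of ECX selects the fixed-function counters and the low bits give the counter number. The sketch below also shows one way to turn two samples of the unhalted core clockticks (UCC) and unhalted reference clockticks (URC) into a ratio. The function names are hypothetical, the MSR address formula assumes the fixed counters are contiguous from 309H, and the ratio is an illustration rather than the manual's exact procedure, which is truncated in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* ECX value for RDPMC to read fixed-function counter n. */
uint32_t rdpmc_fixed_index(unsigned n)
{
    assert(n < 3);
    return 0x40000000u | n;
}

/* MSR address of IA32_PERF_FIXED_CTRn (assumes contiguous addresses). */
uint32_t fixed_ctr_msr(unsigned n)
{
    assert(n < 3);
    return 0x309u + n;
}

/* Ratio of unhalted core clockticks to unhalted reference clockticks
 * between two samples; counter wraparound is not handled here. */
double ucc_urc_ratio(uint64_t ucc_prev, uint64_t ucc_now,
                     uint64_t urc_prev, uint64_t urc_now)
{
    assert(urc_now != urc_prev);
    return (double)(ucc_now - ucc_prev) / (double)(urc_now - urc_prev);
}
```

A ratio above 1.0 over an interval suggests the core ran faster than the reference clock on average during that interval.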
Page 603 - MWAIT EXTENSIONS FOR ADVANCED POWER; IA-32 processors may support a number of C-states
Vol. 3 14-9 POWER AND THERMAL MANAGEMENT Software can program the lowest four bits of IA32_ENERGY_PERF_BIAS MSR with a value from 0 - 15. The values represent a sliding scale, where a value of 0 (the default reset value) corresponds to a hint preference for highest performance and a value of 15 corr...
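Updating the hint is a read-modify-write of the low four bits. The helper below is a sketch with a hypothetical name: it operates on a value already read from IA32_ENERGY_PERF_BIAS and leaves the remaining bits untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Return the MSR value with bits 3:0 replaced by the hint (0 through 15).
 * 0 hints at highest performance; 15 is the opposite end of the scale. */
uint64_t energy_perf_bias_set(uint64_t msr_value, unsigned hint)
{
    assert(hint <= 15);
    return (msr_value & ~0xFULL) | hint;
}
```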
Page 604 - THERMAL MONITORING AND PROTECTION; processor’s core temperature rises above a preset limit.
14-10 Vol. 3 POWER AND THERMAL MANAGEMENT Reference, A-M,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). If CPUID.05H.ECX[Bit 1] = 1, the target processor supports using interrupts as break-events for MWAIT, even when interrupts are disabled. Use this feature to measur...
Page 605 - Figure 14-5. Processor Modulation Through Stop-Clock Mechanism
Vol. 3 14-11 POWER AND THERMAL MANAGEMENT consumption; this is in addition to the reduction offered by automatic thermal monitoring mechanisms. 4. On-die digital thermal sensor and interrupt mechanisms permit the OS to manage thermal conditions natively without relying on BIOS or other system board ...
Page 606 - Catastrophic Shutdown Detector
14-12 Vol. 3 POWER AND THERMAL MANAGEMENT 14.5.1 Catastrophic Shutdown Detector P6 family processors introduced a thermal sensor that acts as a catastrophic shutdown detector. This catastrophic shutdown detector was also implemented in Pentium 4, Intel Xeon and Pentium M processors. It is always en...
Page 607 - Family/Model/Stepping Signature Encoded as 0x69n or 0x6Dn
Vol. 3 14-13 POWER AND THERMAL MANAGEMENT Support for TM2 is indicated by CPUID.1:ECX.TM2[bit 8] = 1. 14.5.2.3 Two Methods for Enabling TM2 On processors with CPUID family/model/stepping signature encoded as 0x69n or 0x6Dn (early Pentium M processors), TM2 is enabled if the TM_SELECT flag (bit 16) o...
Page 608 - Performance State Transitions and Thermal Monitoring
14-14 Vol. 3 POWER AND THERMAL MANAGEMENT 14.5.2.4 Performance State Transitions and Thermal Monitoring If the thermal control circuitry (TCC) for thermal monitor (TM1/TM2) is active, writes to the IA32_PERF_CTL will effect a new target operating point as follows: • If TM1 is enabled and the TCC is ...
Page 610 - Controlled Clock Modulation
14-16 Vol. 3 POWER AND THERMAL MANAGEMENT interrupt enable flags in the IA32_THERM_INTERRUPT MSR are cleared (interrupts are disabled) and the thermal LVT entry is set to mask interrupts. This interrupt should be handled either by the operating system or system management mode (SMM) code. Note that t...
Page 611 - Table 14-1. On-Demand Clock Modulation Duty Cycle Field Encoding; Duty Cycle Field Encoding
Vol. 3 14-17 POWER AND THERMAL MANAGEMENT The IA32_CLOCK_MODULATION MSR contains the following flag and field used to enable software-controlled clock modulation and to select the clock modulation duty cycle: • On-Demand Clock Modulation Enable, bit 4 — Enables on-demand software controlled clock mo...
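The enable flag and duty-cycle field can be combined into an MSR value as sketched below. The code assumes the duty-cycle field occupies bits 3:1, with 001B through 111B meaning 12.5% through 87.5% in 12.5% steps and 000B reserved, as in Table 14-1; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compose an IA32_CLOCK_MODULATION value: enable flag in bit 4,
 * duty-cycle field (1 through 7) in bits 3:1. */
uint64_t clock_mod_encode(bool enable, unsigned duty_field)
{
    assert(duty_field >= 1 && duty_field <= 7);
    return ((uint64_t)enable << 4) | ((uint64_t)duty_field << 1);
}

/* Duty cycle in tenths of a percent (125 means 12.5%). */
unsigned clock_mod_duty_permille(unsigned duty_field)
{
    assert(duty_field >= 1 && duty_field <= 7);
    return duty_field * 125;
}
```

For instance, enabling modulation at a 50% duty cycle uses field value 100B, which encodes to 18H.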
Page 612 - Detection of Thermal Monitor and Software Controlled
14-18 Vol. 3 POWER AND THERMAL MANAGEMENT clock modulation at the duty cycle specified by TM1 takes precedence, regardless of the setting of the on-demand clock modulation duty cycle. For Hyper-Threading Technology enabled processors, the IA32_CLOCK_MODULATION register is duplicated for each logical ...
Page 617 - MACHINE-CHECK ARCHITECTURE; processors. See Chapter 6, “Interrupt 18—Machine-Check Exception
Vol. 3 15-1 CHAPTER 15 MACHINE-CHECK ARCHITECTURE This chapter describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, and P6 family processors. See Chapter 6, “Interrupt 18—Machine-Check Exception (#MC),” for more information on machine-che...
Page 618 - WITH; data parity errors during read cycles; Processor Machine-Check; MSRS
15-2 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.2 COMPATIBILITY WITH PENTIUM PROCESSOR The Pentium 4, Intel Xeon, and P6 family processors support and extend the machine-check exception mechanism introduced in the Pentium processor. The Pentium processor reports the following machine-check errors: • data ...
Page 619 - and to write these registers.; Machine-Check Global Control MSRs
Vol. 3 15-3 MACHINE-CHECK ARCHITECTURE Each error-reporting bank is associated with a specific hardware unit (or group of hardware units) in the processor. Use RDMSR and WRMSR to read and to write these registers. 15.3.1 Machine-Check Global Control MSRs The machine-check global control MSRs include...
Page 622 - Error-Reporting Register Banks; addresses of the error-reporting registers P6 family processors.; DisplayFamily_DisplayModel
15-6 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.3.1.3 IA32_MCG_CTL MSR The IA32_MCG_CTL MSR is present if the capability flag MCG_CTL_P is set in the IA32_MCG_CAP MSR. IA32_MCG_CTL controls the reporting of machine-check exceptions. If present, writing 1s to this register enables machine-check features an...
Page 623 - encoding of 06H_1AH and onward
Vol. 3 15-7 MACHINE-CHECK ARCHITECTURE encoding of 06H_1AH and onward): the operating system or executive software must not modify the contents of the IA32_MC0_CTL MSR. This MSR is internally aliased to the EBL_CR_POWERON MSR and controls platform-specific error handling features. System specific f...
Page 627 - posted for that event is retained. In either case, the OVER bit; First Event
Vol. 3 15-11 MACHINE-CHECK ARCHITECTURE In Table 15-2, the values in the two left-most columns are IA32_MCi_STATUS[54:53]. If a second event overwrites a previously posted event, the information (as guarded by individual valid bits) in the MCi bank is entirely from the second event. Similarly, if a ...
Page 629 - The IA32_MCi_CTL2 MSR provides the programming interface to use; Definition
Vol. 3 15-13 MACHINE-CHECK ARCHITECTURE • Recoverable Address LSB (bits 5:0): The lowest valid recoverable address bit. Indicates the position of the least significant bit (LSB) of the recoverable error address. For example, if the processor logs bits [43:9] of the address, the LSB sub-field in IA32...
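The Recoverable Address LSB sub-field tells software which low-order bits of the logged address carry no information. A minimal sketch with hypothetical names: it reads bits 5:0 of an IA32_MCi_MISC value and clears the address bits below that position.

```c
#include <stdint.h>

/* Recoverable Address LSB sub-field: bits 5:0 of IA32_MCi_MISC. */
unsigned mci_misc_addr_lsb(uint64_t misc)
{
    return (unsigned)(misc & 0x3F);
}

/* Clear the address bits below the reported LSB position. */
uint64_t mci_valid_addr(uint64_t addr, uint64_t misc)
{
    unsigned lsb = mci_misc_addr_lsb(misc);
    return addr & ~((1ULL << lsb) - 1);
}
```

With an LSB value of 9 (the processor logs address bits from bit 9 upward), an address of 12345FFFH reduces to 12345E00H.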
Page 630 - shared by more than one logical processors. For example, the
15-14 Vol. 3 MACHINE-CHECK ARCHITECTURE When IA32_MCG_CAP[10] = 1, the IA32_MCi_CTL2 MSR for each bank exists, i.e., reads and writes to these MSRs are supported. However, the signaling interface for corrected MC errors may not be supported in all banks. The layout of IA32_MCi_CTL2 is shown in Figure 15-8...
Page 631 - Table 15-4. Extended Machine Check State MSRs
Vol. 3 15-15 MACHINE-CHECK ARCHITECTURE 15.3.2.6 IA32_MCG Extended Machine Check State MSRs The Pentium 4 and Intel Xeon processors implement a variable number of extended machine-check state MSRs. The MCG_EXT_P flag in the IA32_MCG_CAP MSR indicates the presence of these extended registers, and the...
Page 632 - Table 15-5. Extended Machine Check State MSRs
15-16 Vol. 3 MACHINE-CHECK ARCHITECTURE Table 15-5. Extended Machine Check State MSRs In Processors With Support For Intel 64 Architecture MSR Address Description IA32_MCG_RAX 180H Contains state of the RAX register at the time of the machine-check error. IA32_MCG_RBX 181H Contains state of the RBX...
Page 633 - When a machine-check error is detected on a Pentium 4 or Intel Xeon; processors map these registers to the IA32_MCi_STATUS and; Contains state of the R14 register at the time of the machine-
Vol. 3 15-17 MACHINE-CHECK ARCHITECTURE When a machine-check error is detected on a Pentium 4 or Intel Xeon processor, the processor saves the state of the general-purpose registers, the R/EFLAGS register, and the R/EIP in these extended machine-check state MSRs. This information can be used by a de...
Page 634 - ENHANCED CACHE ERROR REPORTING; of lines (ECC blocks) in a cache that incur repeated corrections. The; CORRECTED MACHINE CHECK ERROR INTERRUPT
15-18 Vol. 3 MACHINE-CHECK ARCHITECTURE processor; the handler must be written to interpret P5_MC_TYPE encodings correctly. 15.4 ENHANCED CACHE ERROR REPORTING Starting with Intel Core Duo processors, cache error reporting was enhanced. In earlier Intel processors, cache status was based on the numb...
Page 635 - CMCI is not affected by the CR4.MCE bit, and it is not affected by the; CMCI Local APIC Interface; The interaction of CMCI is depicted in Figure 15-9.
Vol. 3 15-19 MACHINE-CHECK ARCHITECTURE beyond those of threshold-based error reporting (Section 15.4). With threshold-based error reporting, software is limited to use periodic polling to query the status of hardware corrected MC errors. CMCI provides a signaling mechanism to deliver a local interr...
Page 636 - allows the 4 delivery modes, an 8 bit interrupt vector, and masking.; error. The vector information is ignored.; Figure 15-10. Local APIC CMCI LVT Register
15-20 Vol. 3 MACHINE-CHECK ARCHITECTURE CMCI interrupt delivery is configured by writing to the LVT CMCI register entry in the local APIC register space at the default address of APIC_BASE + 2F0H. A CMCI interrupt can be delivered to more than one logical processor if multiple logical processors are af...
Page 637 - System Software Recommendation for Managing CMCI and; minimize contentions to access shared MSR resources.
Vol. 3 15-21 MACHINE-CHECK ARCHITECTURE • Delivery status, bit 12 — A read-only bit that, when set, indicates that an interrupt from this source has been delivered to the processor core, but has not yet been accepted. • Mask, bit 16 — When set, inhibits reception of the interrupt. (Unlike th...
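The LVT CMCI fields can be assembled into a register value as sketched below. The code assumes the vector in bits 7:0 and the delivery mode in bits 10:8 (both fields are described just before this excerpt), along with the delivery status and mask bits listed above; the function names are hypothetical, and the write to APIC_BASE + 2F0H itself is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LVT_DELIVERY_STATUS (1u << 12) /* read-only: delivered, not yet accepted */
#define LVT_MASK            (1u << 16) /* set: interrupt reception inhibited */

/* Compose an LVT CMCI register value. */
uint32_t lvt_cmci_encode(uint8_t vector, unsigned delivery_mode, bool masked)
{
    assert(delivery_mode <= 7);
    uint32_t value = (uint32_t)vector | ((uint32_t)delivery_mode << 8);
    if (masked)
        value |= LVT_MASK;
    return value;
}

/* True if the delivery status bit reports a pending interrupt. */
bool lvt_cmci_pending(uint32_t lvt)
{
    return (lvt & LVT_DELIVERY_STATUS) != 0;
}
```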
Page 638 - bits as threshold for the overflow comparison with; Software can set the initial threshold value to 1 by writing 1 to
15-22 Vol. 3 MACHINE-CHECK ARCHITECTURE b. Each thread examines the IA32_MCi_CTL2[30] indicator for each bank to determine if another thread has already claimed ownership of that bank. • If IA32_MCi_CTL2[30] has been set by another thread, this thread cannot own bank i and should proceed to step b. and...
Page 639 - maximum threshold supported by the processor.; Clear the MSRs of this MC bank.; RECOVERY OF UNCORRECTED RECOVERABLE (UCR); feature is 45nm Intel 64 processor with CPUID signature
Vol. 3 15-23 MACHINE-CHECK ARCHITECTURE • Write 7FFFH to IA32_MCi_CTL2[15:0], • Read back IA32_MCi_CTL2[15:0]; the lower 15 bits (14:0) are the maximum threshold supported by the processor. b. Increase the threshold to a value below the maximum value discovered using step a. 15.5.2.3 CMCI Interrupt H...
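The ownership check and threshold discovery above come down to a few field operations on IA32_MCi_CTL2. The helpers below are a sketch with hypothetical names: they work on values already read from the MSR (bit 30 as the CMCI enable indicator, bits 14:0 as the threshold) and leave the RDMSR/WRMSR sequence to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MCI_CTL2_CMCI_EN        (1ULL << 30)
#define MCI_CTL2_THRESHOLD_MASK 0x7FFFULL /* bits 14:0 */

/* True if bit 30 is set, i.e. some thread already owns the bank. */
bool mci_ctl2_cmci_enabled(uint64_t ctl2)
{
    return (ctl2 & MCI_CTL2_CMCI_EN) != 0;
}

/* Replace the threshold field, leaving the other bits unchanged. */
uint64_t mci_ctl2_set_threshold(uint64_t ctl2, unsigned threshold)
{
    assert(threshold <= MCI_CTL2_THRESHOLD_MASK);
    return (ctl2 & ~MCI_CTL2_THRESHOLD_MASK) | threshold;
}

/* Maximum threshold: bits 14:0 read back after writing 7FFFH. */
unsigned mci_ctl2_max_threshold(uint64_t readback)
{
    return (unsigned)(readback & MCI_CTL2_THRESHOLD_MASK);
}
```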
Page 640 - Detection of Software Error Recovery Support; IA32_MCi_STATUS MSR is used for reporting UCR errors and existing
15-24 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.6.1 Detection of Software Error Recovery Support Software must use bit 24 of IA32_MCG_CAP (MCG_SER_P) to detect the presence of software error recovery support (see Figure 15-2). When IA32_MCG_CAP[24] is set, this indicates that the processor supports soft-...
Page 641 - UCR Error Classification; With the S and AR flag encoding in the IA32_MCi_STATUS register, UCR
Vol. 3 15-25 MACHINE-CHECK ARCHITECTURE • S (Signaling) flag, bit 56 - Indicates (when set) that a machine check exception was generated for the UCR error reported in this MC bank and system software needs to check the AR flag and the MCA error code fields in the IA32_MCi_STATUS register to identify...
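The S and AR flags, together with UC and PCC, are enough to sort a logged error into its class. The sketch below assumes the standard IA32_MCi_STATUS flag positions (VAL bit 63, UC bit 61, PCC bit 57, S bit 56, AR bit 55) and that software error recovery is supported (IA32_MCG_CAP[24] = 1); the enum and function names are hypothetical, and the EN and OVER flags are ignored for brevity.

```c
#include <stdint.h>

#define MCI_STATUS_VAL (1ULL << 63)
#define MCI_STATUS_UC  (1ULL << 61)
#define MCI_STATUS_PCC (1ULL << 57)
#define MCI_STATUS_S   (1ULL << 56)
#define MCI_STATUS_AR  (1ULL << 55)

enum mc_error_class {
    MC_NOT_VALID,   /* VAL = 0: nothing logged */
    MC_CORRECTED,   /* UC = 0 */
    MC_UNCORRECTED, /* UC = 1, PCC = 1: processor context corrupt */
    MC_UCNA,        /* uncorrected, no action required */
    MC_SRAO,        /* software recoverable, action optional */
    MC_SRAR,        /* software recoverable, action required */
    MC_UNDEFINED    /* S = 0 with AR = 1: not a defined combination */
};

enum mc_error_class mc_classify(uint64_t status)
{
    if (!(status & MCI_STATUS_VAL))
        return MC_NOT_VALID;
    if (!(status & MCI_STATUS_UC))
        return MC_CORRECTED;
    if (status & MCI_STATUS_PCC)
        return MC_UNCORRECTED;
    if (status & MCI_STATUS_S)
        return (status & MCI_STATUS_AR) ? MC_SRAR : MC_SRAO;
    return (status & MCI_STATUS_AR) ? MC_UNDEFINED : MC_UCNA;
}
```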
Page 643 - UCR Error Overwrite Rules; In general, the overwrite rules are as follows:; UCR errors will overwrite corrected errors.; IA32_MCi_STATUS register. As UCNA and SRA0 errors do not require; First Event Second Event UC PCC S
Vol. 3 15-27 MACHINE-CHECK ARCHITECTURE 15.6.4 UCR Error Overwrite Rules In general, the overwrite rules are as follows: • UCR errors will overwrite corrected errors. • Uncorrected (PCC=1) errors overwrite UCR (PCC=0) errors. • UCR errors are not written over previous UCR errors. • Corrected errors ...
Page 644 - AVAILABILITY; registers to zero when doing a soft-reset.
15-28 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.7 MACHINE-CHECK AVAILABILITY The machine-check architecture and machine-check exception (#MC) are model-specific features. Software can execute the CPUID instruction to determine whether a processor implements these features. Following the execution of the ...
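As a sketch of this availability check, the two feature flags can be tested on the EDX value returned by CPUID leaf 1. The bit positions used here (MCE at bit 7, MCA at bit 14) are assumptions, since the sentence naming them is truncated in this excerpt; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_EDX_MCE (1u << 7)  /* machine-check exception (assumed bit) */
#define CPUID1_EDX_MCA (1u << 14) /* machine-check architecture (assumed bit) */

/* edx: value returned in EDX by CPUID with EAX = 1. */
bool cpuid1_has_mce(uint32_t edx)
{
    return (edx & CPUID1_EDX_MCE) != 0;
}

bool cpuid1_has_mca(uint32_t edx)
{
    return (edx & CPUID1_EDX_MCA) != 0;
}
```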
Page 645 - INTERPRETING THE MCA ERROR CODES
Vol. 3 15-29 MACHINE-CHECK ARCHITECTURE (* enables all MCA features *) FI (* Determine number of error-reporting banks supported *) COUNT ← IA32_MCG_CAP.Count; MAX_BANK_NUMBER ← COUNT - 1; IF (Processor Family is 6H and Processor EXTMODEL:MODEL is less than 1AH)THEN (* Enable logging of all errors e...
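The bank-count logic in the pseudocode above could look like this in C. The sketch decodes IA32_MCG_CAP (Count in bits 7:0, MCG_CTL_P bit 8, MCG_EXT_P bit 9, CMCI support bit 10, MCG_SER_P bit 24), picks the first bank whose control register may be written, and computes per-bank MSR addresses assuming the usual layout of four MSRs per bank starting at 400H. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct mcg_cap {
    unsigned bank_count; /* IA32_MCG_CAP.Count */
    bool ctl_p;          /* IA32_MCG_CTL present */
    bool ext_p;          /* extended state MSRs present */
    bool cmci_p;         /* corrected MC error interrupt supported */
    bool ser_p;          /* software error recovery supported */
};

struct mcg_cap mcg_cap_decode(uint64_t cap)
{
    struct mcg_cap c;
    c.bank_count = (unsigned)(cap & 0xFF);
    c.ctl_p = ((cap >> 8) & 1) != 0;
    c.ext_p = ((cap >> 9) & 1) != 0;
    c.cmci_p = ((cap >> 10) & 1) != 0;
    c.ser_p = ((cap >> 24) & 1) != 0;
    return c;
}

/* Family 6 with EXTMODEL:MODEL below 1AH must leave IA32_MC0_CTL alone. */
unsigned mca_first_ctl_bank(unsigned family, unsigned ext_model)
{
    return (family == 0x6 && ext_model < 0x1A) ? 1u : 0u;
}

uint32_t mci_ctl_msr(unsigned bank)    { return 0x400u + 4u * bank; }
uint32_t mci_status_msr(unsigned bank) { return 0x401u + 4u * bank; }
```

An initialization loop would then run from `mca_first_ctl_bank(...)` to `bank_count - 1` when enabling reporting, and over all banks when clearing status registers.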
Page 646 - machine-check exception handler must read the VAL flag for each; Simple Error Codes; Error Code
15-30 Vol. 3 MACHINE-CHECK ARCHITECTURE also write a 16-bit model-specific error code in the IA32_MCi_STATUS register depending on the implementation of the machine-check architecture of the processor. The MCA error codes are architecturally defined for Intel 64 and IA-32 processors. To determine t...
Page 647 - Compound Error Codes; form of the compound error codes.
Vol. 3 15-31 MACHINE-CHECK ARCHITECTURE 15.9.2 Compound Error Codes Compound error codes describe errors related to the TLBs, memory, caches, bus and interconnect logic, and internal timer. A set of sub-fields is common to all of compound errors. These sub-fields describe the type of access, level i...
Page 648 - external APIC bus separate from the system bus. The generic type is
15-32 Vol. 3 MACHINE-CHECK ARCHITECTURE The behavior of error filtering after crossing the yellow threshold is model-specific. 15.9.2.2 Transaction Type (TT) Sub-Field The 2-bit TT sub-field (Table 15-10) indicates the type of transaction (data, instruction, or generic). The sub-field applies to th...
Page 649 - the other requests apply to TLBs, caches and interconnects.
Vol. 3 15-33 MACHINE-CHECK ARCHITECTURE caused the error. Eviction and snoop requests apply only to the caches. All of the other requests apply to TLBs, caches and interconnects. 15.9.2.5 Bus and Interconnect Errors The bus and interconnect errors are defined with the 2-bit PP (participation), 1-bit...
Page 650 - The memory controller errors are defined with the 3-bit MMM (memory; Architecturally Defined UCR Errors; The following two SRAO errors are architecturally defined.; UCR Errors detected by memory controller scrubbing and
15-34 Vol. 3 MACHINE-CHECK ARCHITECTURE 15.9.2.6 Memory Controller Errors The memory controller errors are defined with the 3-bit MMM (memory transaction type), and 4-bit CCCC (channel) sub-fields. The encodings for MMM and CCCC are defined in Table 15-14. 15.9.3 Architecturally Defined UCR Errors S...
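A decoder for the compound error code sub-fields might look like the sketch below. It assumes the binary layouts 0000 0000 0001 TTLL (TLB), 0000 0001 RRRR TTLL (cache hierarchy), 0000 1PPT RRRR IILL (bus and interconnect), and 0000 0000 1MMM CCCC (memory controller), with bit 12 treated as the filtering bit and ignored. The enum and function names are hypothetical, and the layouts should be checked against the compound error code table in this section.

```c
#include <stdint.h>

enum mca_compound_kind {
    MCA_OTHER, /* simple code or a form not handled here */
    MCA_TLB,
    MCA_MEM_CTRL,
    MCA_CACHE,
    MCA_BUS
};

/* Classify the 16-bit MCA error code from IA32_MCi_STATUS[15:0]. */
enum mca_compound_kind mca_classify(uint16_t code)
{
    uint16_t c = code & 0xEFFF; /* ignore bit 12 (filtering) */
    if ((c & 0xF800) == 0x0800) return MCA_BUS;
    if ((c & 0xFF00) == 0x0100) return MCA_CACHE;
    if ((c & 0xFF80) == 0x0080) return MCA_MEM_CTRL;
    if ((c & 0xFFF0) == 0x0010) return MCA_TLB;
    return MCA_OTHER;
}

/* Sub-field extractors; each is meaningful only for the matching form. */
unsigned mca_ll(uint16_t code)   { return code & 0x3; }        /* level */
unsigned mca_tt(uint16_t code)   { return (code >> 2) & 0x3; } /* transaction type */
unsigned mca_rrrr(uint16_t code) { return (code >> 4) & 0xF; } /* request */
unsigned mca_mmm(uint16_t code)  { return (code >> 4) & 0x7; } /* memory transaction type */
unsigned mca_cccc(uint16_t code) { return code & 0xF; }        /* channel */
```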
Page 651 - 5-9). Their values and compound encoding format are given in Table
Vol. 3 15-35 MACHINE-CHECK ARCHITECTURE 15-9). Their values and compound encoding format are given in Table 15-15. Table 15-16 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAO errors. For both the memory scrubbing and L3 explicit writeback errors, the ADDRV a...
Page 652 - The following two SRAR errors are architecturally defined.; UCR Errors detected on data load and; SRAO Type; Table 15-18. MCA Compound Error Code Encoding for SRAR Errors; Data Load
15-36 Vol. 3 MACHINE-CHECK ARCHITECTURE IA32_MCG_STATUS register for the memory scrubbing and L3 explicit writeback errors on both the reporting and non-reporting logical processors. 15.9.3.2 Architecturally Defined SRAR Errors The following two SRAR errors are architecturally defined. • UCR Error...
Page 653 - tecturally defined SRAR errors.
Vol. 3 15-37 MACHINE-CHECK ARCHITECTURE Table 15-19 lists values of relevant bit fields of IA32_MCi_STATUS for architecturally defined SRAR errors. For both the data load and instruction fetch errors, the ADDRV and MISCV flags in the IA32_MCi_STATUS register are set to indicate that the offending ...
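A handler that has already established the SRAO or SRAR class may then want to recognize the four architecturally defined errors by MCA error code. The values used below (C0H through CFH for memory scrubbing, 17AH for L3 explicit writeback, 134H for data load, 150H for instruction fetch) are recalled from the encoding tables referenced in this section, which are not reproduced in this excerpt, so treat them as assumptions to verify. The names are hypothetical.

```c
#include <stdint.h>

enum ucr_arch_error {
    UCR_ARCH_NONE,
    UCR_SRAO_MEM_SCRUB,    /* 0000 0000 1100 CCCC */
    UCR_SRAO_L3_WRITEBACK, /* 17AH */
    UCR_SRAR_DATA_LOAD,    /* 134H */
    UCR_SRAR_INSTR_FETCH   /* 150H */
};

/* Map an MCA error code (IA32_MCi_STATUS[15:0]) to a defined UCR error. */
enum ucr_arch_error ucr_arch_lookup(uint16_t mca_code)
{
    if ((mca_code & 0xFFF0) == 0x00C0)
        return UCR_SRAO_MEM_SCRUB;
    switch (mca_code) {
    case 0x017A: return UCR_SRAO_L3_WRITEBACK;
    case 0x0134: return UCR_SRAR_DATA_LOAD;
    case 0x0150: return UCR_SRAR_INSTR_FETCH;
    default:     return UCR_ARCH_NONE;
    }
}
```

The lookup is only meaningful after the S and AR flags have been checked, because the same code values can also appear for errors of other classes.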
Page 654 - consumption errors from that affected page.; Multiple MCA Errors
15-38 Vol. 3 MACHINE-CHECK ARCHITECTURE For Instruction Fetch recoverable error, the affected logical processor should find that the RIPV flag and the EIPV Flag in the IA32_MCG_STATUS register are cleared, indicating that the error is detected at the instruction pointer saved on the stack may not be...
Page 655 - Machine-Check Error Codes Interpretation; GUIDELINES FOR WRITING MACHINE-CHECK; To periodically check and log machine errors.
Vol. 3 15-39 MACHINE-CHECK ARCHITECTURE • When multiple recoverable errors are reported and no other fatal condition (e.g., an overflowed condition for an SRAR error) is found for the reported recoverable errors, it is possible for system software to recover from the multiple recoverable errors by taking ...
Page 656 - error logging utility are given in the following sections.; a debugger or shut down the system.; following when writing a machine-check exception handler:
15-40 Vol. 3 MACHINE-CHECK ARCHITECTURE Guidelines for writing a machine-check exception handler or a machine-error logging utility are given in the following sections. 15.10.1 Machine-Check Exception Handler The machine-check exception (#MC) corresponds to vector 18. To service machine-check excep...
Page 657 - Processor Machine-Check Exception Handling; that check the processor’s support of MCA.
Vol. 3 15-41 MACHINE-CHECK ARCHITECTURE generated). If this flag is clear, the processor may still be able to be restarted (for debugging purposes) but not without loss of program continuity. • For unrecoverable errors, the EIPV flag in the IA32_MCG_STATUS register indicates whether the instruction ...
Page 658 - handler uses the RDMSR instruction to read the error type from the; Assume that execution is restartable
15-42 Vol. 3 MACHINE-CHECK ARCHITECTURE When machine-check exceptions are enabled for the Pentium processor (MCE flag is set in control register CR4), the machine-check exception handler uses the RDMSR instruction to read the error type from the P5_MC_TYPE register and the machine check address from...
Page 659 - possible when damage has not occurred (The PCC flag is clear, in the
Vol. 3 15-43 MACHINE-CHECK ARCHITECTURE AND PCC flag in IA32_MCi_STATUS = 1 OR RIPV flag in IA32_MCG_STATUS = 0 (* execution is not restartable *) THEN RESTARTABILITY = FALSE; return RESTARTABILITY to calling procedure; FI; Save time-stamp counter and processor ID; Set IA32_MCi_STATUS to all 0s; Ex...
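The restartability test in the pseudocode fragment above can be sketched roughly as follows. This is an illustration under stated assumptions, not the manual's handler: the start of the condition is cut off in the excerpt, so the sketch assumes execution is not restartable when RIPV (bit 0 of IA32_MCG_STATUS) is clear or when any bank with VAL (bit 63) set also has PCC (bit 57) set. The function name is hypothetical and the register values are assumed to have been read already.

```python
from typing import Iterable

VAL_BIT = 63   # IA32_MCi_STATUS: register contents are valid
PCC_BIT = 57   # IA32_MCi_STATUS: processor context corrupt
RIPV_BIT = 0   # IA32_MCG_STATUS: restart instruction pointer valid


def is_restartable(mcg_status: int, bank_statuses: Iterable[int]) -> bool:
    """Assumed rule: restartable only if RIPV is set and no valid bank reports PCC."""
    if not (mcg_status >> RIPV_BIT) & 1:
        return False  # execution is not restartable
    for status in bank_statuses:
        valid = (status >> VAL_BIT) & 1
        pcc = (status >> PCC_BIT) & 1
        if valid and pcc:
            return False  # processor context may be corrupted
    return True
```

The sketch ignores the CR4.MCE check and the logging steps (time-stamp counter, processor ID, clearing IA32_MCi_STATUS) that the pseudocode also lists.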
Page 660 - determine the appropriate recovery strategy.; Software; recovery from Uncorrected Recoverable (UCR) errors, consider the
15-44 Vol. 3 MACHINE-CHECK ARCHITECTURE mechanism to indicate the frequency of exceptions. A multiprocessing operating system stores the identity of the processor node incurring the exception using a unique identifier, such as the processor’s APIC ID (see Section 10.9, “Handling Interrupts”). Th...
Page 667 - Example 15-5 gives pseudocode for a CMCI handler with UCR support.; Example 15-5. Corrected Error Handler Pseudocode with UCR Support
Vol. 3 15-51 MACHINE-CHECK ARCHITECTURE before these errors are actually handled and processed by the MCE handler for attempted software error recovery. Example 15-5 gives pseudocode for a CMCI handler with UCR support. Example 15-5. Corrected Error Handler Pseudocode with UCR Support Corrected Erro...
Page 669 - STAMP COUNTER; OVERVIEW OF DEBUG SUPPORT FACILITIES
Vol. 3 16-1 CHAPTER 16 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Intel 64 and IA-32 architectures provide debug facilities for use in debugging code and monitoring performance. These facilities are valuable for debugging application software, system software, and multitasking operating s...
Page 672 - Debug Registers DR4 and DR5
16-4 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.2.1 Debug Address Registers (DR0-DR3) Each of the debug-address registers (DR0 through DR3) holds the 32-bit linear address of a breakpoint (see Figure 16-1). Breakpoint comparisons are made before physical address translation occur...
Page 674 - For Pentium; Breakpoint Field Recognition
16-6 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 10 — Break on I/O reads or writes. 11 — Break on data reads or writes but not instruction fetches. When the DE flag is clear, the processor interprets the R/Wn bits the same as for the Intel386™ and Intel486™ processors, which is as fol...
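As an illustration of the R/Wn encodings, the sketch below extracts the two-bit R/W field for breakpoint n from a DR7 value and maps it to a description for the case where the DE flag is set. The field positions (R/W0 at bits 16-17, with each later breakpoint 4 bits higher) follow the DR7 layout; the 00 and 01 encodings (instruction execution only, data writes only) fall outside the excerpt shown here and are included for completeness. The function and table names are hypothetical.

```python
# Meaning of the R/Wn field when the DE flag in CR4 is set.
RW_MEANING_DE_SET = {
    0b00: "instruction execution only",
    0b01: "data writes only",
    0b10: "I/O reads or writes",
    0b11: "data reads or writes, not instruction fetches",
}


def rw_field(dr7: int, n: int) -> int:
    """Return the 2-bit R/Wn field of DR7 for breakpoint n (0 through 3)."""
    if not 0 <= n <= 3:
        raise ValueError("breakpoint number must be 0..3")
    return (dr7 >> (16 + 4 * n)) & 0b11


# Hypothetical DR7 value with R/W1 = 11.
dr7 = 0b11 << 20
print(RW_MEANING_DE_SET[rw_field(dr7, 1)])
```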
Page 676 - Debug Registers and Intel; Data operations that do not trap
16-8 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.2.6 Debug Registers and Intel® 64 Processors For Intel 64 architecture processors, debug registers DR0–DR7 are 64 bits. In 16-bit or 32-bit modes (protected mode and compatibility mode), writes to a debug register fill the upper 32...
Page 682 - RECORDING OVERVIEW
16-14 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING OVERVIEW P6 family processors introduced the ability to set breakpoints on taken branches, interrupts, and exceptions, and to single-step from one branch to the next. This capabilit...
Page 683 - on Intel Core
Vol. 3 16-15 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER in the last branch record (LBR) stack. For more information, see Section 16.4.8, “LBR Stack”. • BTF (single-step on branches) flag (bit 1) — When set, the processor treats the TF flag in the EFLAGS register as a “single-step on br...
Page 684 - Monitoring Branches, Exceptions, and Interrupts
16-16 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • FREEZE_LBRS_ON_PMI flag (bit 11) — When set, the LBR stack is frozen on a hardware PMI request (e.g., when a counter overflows and is configured to trigger PMI). • FREEZE_PERFMON_ON_PMI flag (bit 12) — When set, a PMI request clears ...
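A small sketch of how a value for the debug control MSR could be composed from the flag positions given in the text (BTF at bit 1, FREEZE_LBRS_ON_PMI at bit 11, FREEZE_PERFMON_ON_PMI at bit 12). The helper name is hypothetical, and writing the value to the MSR (with WRMSR, at privilege level 0) is outside the scope of the example.

```python
# Flag masks taken from the bit numbers stated in the text.
BTF = 1 << 1                     # single-step on branches
FREEZE_LBRS_ON_PMI = 1 << 11     # freeze the LBR stack on a PMI request
FREEZE_PERFMON_ON_PMI = 1 << 12  # freeze performance counters on a PMI request


def debugctl_value(*flags: int) -> int:
    """OR the given flag masks together into one register value."""
    value = 0
    for flag in flags:
        value |= flag
    return value


value = debugctl_value(FREEZE_LBRS_ON_PMI, FREEZE_PERFMON_ON_PMI)
print(hex(value))
```

Real code would usually read the current MSR contents first and modify only the bits of interest, so that unrelated flags are preserved.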
Page 685 - Branch Trace Messages
Vol. 3 16-17 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER a bug to a particular block of code before instruction single-stepping further narrows the search. If the BTF flag is set when the processor generates a debug exception, the processor clears the BTF flag along with the TF flag. The de...
Page 686 - CPL-Qualified Branch Trace Mechanism; Freezing LBRs and PMCs on PMIs occur when:
16-18 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.6 CPL-Qualified Branch Trace Mechanism The CPL-qualified branch trace mechanism is available on a subset of Intel 64 and IA-32 processors that support the branch trace storing mechanism. The processor supports the CPL-qualified branc...
Page 687 - Stack; Table 16-3. LBR Stack Size and TOS Pointer Range; DisplayFamily_DisplayModel Size of LBR Stack
Vol. 3 16-19 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8 LBR Stack The last branch record stack and top-of-stack (TOS) pointer MSRs are supported across Intel 64 and IA-32 processor families. However, the number of MSRs in the LBR stack and the valid range of TOS pointer value can va...
Page 688 - 4 Processors
16-20 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8.1 LBR Stack and Intel® 64 Processors LBR MSRs are 64 bits wide. If IA-32e mode is disabled, only the lower 32 bits of the address are recorded. If IA-32e mode is enabled, the processor writes 64-bit values into the MSR. In 64-bit mo...
Page 689 - BTS and DS Save Area; The IA32_DS_AREA MSR can be programmed to point to the DS save area.
Vol. 3 16-21 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.4.8.3 Last Exception Records and Intel 64 Architecture Intel 64 and IA-32 processors also provide MSRs that store the branch record for the last branch taken prior to an exception or an interrupt. The location of the last exceptio...
Page 698 - BTS
16-30 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 2. Set the TR and BTS flags in the IA32_DEBUGCTL for Intel Core Solo and Intel Core Duo processors or later processors (or MSR_DEBUGCTLA MSR for processors based on Intel NetBurst Microarchitecture; or MSR_DEBUGCTLB for Pentium M proc...
Page 700 - CORE; LBR stack on a PMI request is available.
16-32 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • The ISR must clear the mask bit in the performance counter LVT entry. • The ISR must re-enable the counters to count via IA32_PERF_GLOBAL_CTRL/IA32_PERF_GLOBAL_OVF_CTRL if it is servicing an overflow PMI due to PEBS (or via CCCR's E...
Page 703 - Table 16-8. LBR Stack Size and TOS Pointer Range; Filtering of Last Branch Records
Vol. 3 16-35 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Processors based on Intel microarchitecture (Nehalem) have an LBR MSR Stack as shown in Table 16-8. Table 16-8. LBR Stack Size and TOS Pointer Range 16.6.2 Filtering of Last Branch Records MSR_LBR_SELECT is cleared to zero at RESET, a...
Page 704 - RECORDING (PROCESSORS BASED ON INTEL
16-36 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.7 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PROCESSORS BASED ON INTEL NETBURST® MICROARCHITECTURE) Pentium 4 and Intel Xeon processors based on Intel NetBurst microarchitecture provide the following methods for recording ta...
Page 706 - LBR Stack for Processors Based on Intel NetBurst
16-38 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • BTS (branch trace store) flag (bit 3) — When set, enables the BTS facilities to log BTMs to a memory-resident BTS buffer that is part of the DS save area. See Section 16.4.9, “BTS and DS Save Area.” • BTINT (branch trace interrupt) ...
Page 707 - Table 16-10. LBR MSR Stack Size and TOS Pointer Range for the Pentium
Vol. 3 16-39 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER LBR MSR pair) that contains the most recent (last) branch record placed on the stack. Prior to placing a new branch record on the stack, the TOS is incremented by 1. When the TOS pointer reaches its maximum value, it wraps around to 0....
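The TOS behavior described above (increment before each store, wrap to 0 after the maximum value) amounts to a circular buffer. The sketch below models it in software only; the class name, the record format, and the initial TOS value of 0 are assumptions made for the example, and the stack size is a parameter because it varies by processor.

```python
class LbrStackModel:
    """Software model of an LBR stack with a wrapping top-of-stack pointer."""

    def __init__(self, size: int) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        self.records = [None] * size
        self.tos = 0  # assumed initial value

    def push(self, from_ip: int, to_ip: int) -> None:
        # The TOS is incremented first, wrapping to 0 past the last entry,
        # and the new record is then stored at the slot it points to.
        self.tos = (self.tos + 1) % len(self.records)
        self.records[self.tos] = (from_ip, to_ip)

    def last(self):
        """Return the most recent branch record (the one TOS points to)."""
        return self.records[self.tos]


stack = LbrStackModel(4)
for i in range(5):
    stack.push(0x1000 + i, 0x2000 + i)
print(stack.tos, stack.last())
```

With four entries, the fifth push overwrites the slot written by the first, which is why only the most recent branches are retained.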
Page 708 - Last Exception Records; CoreTM i7 and Intel; Figure 16-13. LBR MSR Branch Record Layout for the Pentium 4
16-40 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER Additional information is saved if an exception or interrupt occurs in conjunction with a branch instruction. If a branch instruction generates a trap type exception, two branch records are stored in the LBR stack: a branch record for...
Page 710 - and Intel Core
16-42 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • Debug store (DS) feature flag (bit 21), returned by the CPUID instruction — Indicates that the processor provides the debug store (DS) mechanism, which allows BTMs to be stored in a memory-resident BTS buffer. See Section 16.4.5, “B...
Page 711 - Figure 16-15. LBR Branch Record Layout for the Intel Core Solo
Vol. 3 16-43 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.9 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS) Like the Pentium 4 and Intel Xeon processor family, Pentium M processors provide last branch, interrupt, and exception recording. The capability operates almost...
Page 713 - Figure 16-17. LBR Branch Record Layout for the Pentium M Processor
Vol. 3 16-45 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER For more detail on these capabilities, see Section 16.7.3, “Last Exception Records,” and Appendix B.7, “MSRs In the Pentium M Processor.” 16.10 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) The P6 family proce...
Page 714 - Last Branch and Last Exception MSRs
16-46 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER • BTF (single-step on branches) flag (bit 1) — When set, the processor treats the TF flag in the EFLAGS register as a “single-step on branches” flag. See Section 16.4.3, “Single-Stepping on Branches, Exceptions, and Interrupts.” • PBi...
Page 716 - COUNTER
16-48 Vol. 3 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER 16.11 TIME-STAMP COUNTER The Intel 64 and IA-32 architectures (beginning with the Pentium processor) define a time-stamp counter mechanism that can be used to monitor and identify the relative time of occurrence of processor events. The ...
Page 717 - TSC
Vol. 3 16-49 DEBUGGING, PROFILING BRANCHES AND TIME-STAMP COUNTER NOTE To determine average processor clock frequency, Intel recommends the use of EMON logic to count processor core clocks over the period of time for which the average is required. See Section 30.10, “Counting Clocks,” and Appendix A...
Page 719 - MODE
Vol. 3 17-1 CHAPTER 17 8086 EMULATION IA-32 processors (beginning with the Intel386 processor) provide two ways to execute new or legacy programs that are assembled and/or compiled to run on an Intel 8086 processor: • Real-address mode. • Virtual-8086 mode. Figure 2-3 shows the relationship of these...
Page 721 - Translation in Real-Address Mode
Vol. 3 17-3 8086 EMULATION • A single interrupt table, called the “interrupt vector table” or “interrupt table,” is provided for handling interrupts and exceptions (see Figure 17-2). The interrupt table (which has 4-byte entries) takes the place of the interrupt descriptor table (IDT, with 8-byte en...
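To make the 4-byte interrupt table entries concrete, the sketch below computes where the entry for a given vector lives and how a segment:offset pair translates to a linear address in real-address mode (segment shifted left by 4 bits, plus the offset). The function names are hypothetical, and the default table base of 0 is an assumption reflecting the usual state after reset.

```python
def ivt_entry_address(vector: int, table_base: int = 0) -> int:
    """Linear address of the 4-byte interrupt table entry for a vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be 0..255")
    return table_base + vector * 4


def real_mode_linear(segment: int, offset: int) -> int:
    """Real-address-mode translation: (segment << 4) + offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


print(hex(ivt_entry_address(0x21)))
print(hex(real_mode_linear(0xF000, 0xFFF0)))
```

Each entry holds a 16-bit offset followed by a 16-bit segment selector, so the handler's entry point is obtained by passing those two values to the second function.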
Page 726 - Vector
17-8 Vol. 3 8086 EMULATION 17.2 VIRTUAL-8086 MODE Virtual-8086 mode is actually a special type of task that runs in protected mode. When the operating system or executive switches to a virtual-8086-mode task, the processor emulates an Intel 8086 processor. The execution environment of the processo...
Page 727 - Structure of a Virtual-8086 Task
Vol. 3 17-9 8086 EMULATION 17.2.1 Enabling Virtual-8086 Mode The processor runs in virtual-8086 mode when the VM (virtual machine) flag in the EFLAGS register is set. This flag can only be set when the processor switches to a new protected-mode task or resumes virtual-8086 mode via an IRET instructi...
Page 728 - Paging of Virtual-8086 Tasks
17-10 Vol. 3 8086 EMULATION The processor enters virtual-8086 mode to run the 8086 program and returns to protected mode to run the virtual-8086 monitor. The virtual-8086 monitor is a 32-bit protected-mode code module that runs at a CPL of 0. The monitor consists of initialization, interrupt- and exc...
Page 729 - Protection within a Virtual-8086 Task
Vol. 3 17-11 8086 EMULATION Paging is not necessary for a single virtual-8086-mode task, but paging is useful or necessary in the following situations: • When running multiple virtual-8086-mode tasks. Here, paging allows the lower 1 MByte of the linear address space for each virtual-8086-mode task t...
Page 733 - Instructions; Let the 8086 program perform I/O directly.
Vol. 3 17-15 8086 EMULATION execution sequence after verifying that it was entered as a result of a HLT execution. See Section 17.3, “Interrupt and Exception Handling in Virtual-8086 Mode”, for information on leaving virtual-8086 mode to handle an interrupt or exception generated in virtual-8086 mo...
Page 735 - instruction in Chapter 3,
Vol. 3 17-17 8086 EMULATION In virtual-8086 mode, the interrupts and exceptions are divided into three classes for the purposes of handling: • Class 1 — All processor-generated exceptions and all hardware interrupts, including the NMI interrupt and the hardware interrupts sent to the processor’s ext...
Page 736 - Class 1—Hardware Interrupt and Exception Handling in; Trap or Interrupt Gate
17-18 Vol. 3 8086 EMULATION in the previous paragraphs. These sections describe three possible types of interrupt and exception handlers: • Protected-mode interrupt and exception handlers — These are the standard handlers that the processor calls through the protected-mode IDT. • Virtual-8086 monit...
Page 738 - 8086 program interrupt table.
17-20 Vol. 3 8086 EMULATION Interrupt and exception handlers can examine the VM flag on the stack to determine if the interrupted procedure was running in virtual-8086 mode. If so, the interrupt or exception can be handled in one of three ways: • The protected-mode interrupt or exception handler tha...
Page 739 - Handling an Interrupt or Exception Through a Task Gate; VM flag and causes the processor to switch to protected mode.
Vol. 3 17-21 8086 EMULATION 2. Store the EFLAGS (low-order 16 bits only), CS and EIP values of the 8086 program on the privilege-level 3 stack. This is the stack that the virtual-8086-mode task is using. (The 8086 handler may use or modify this information.) 3. Change the return link on the privileg...
Page 742 - Class 3—Software Interrupt Handling in Virtual-8086 Mode
17-24 Vol. 3 8086 EMULATION 5. Upon returning to virtual-8086 mode, the processor continues execution of the 8086 program. When the 8086 program is ready to receive maskable hardware interrupts, it executes the STI instruction to set the VIF flag (enabling maskable hardware interrupts). Prior to set...
Page 745 - Figure 17-5. Software Interrupt Redirection Bit Map in TSS
Vol. 3 17-27 8086 EMULATION Redirecting software interrupts back to the 8086 program potentially speeds up interrupt handling because a switch back and forth between virtual-8086 mode and protected mode is not required. This latter interrupt-handling technique is particularly useful for 8086 operat...
Page 754 - Stacks in expand-down segments with the G and B flags clear.; TRANSFERRING CONTROL AMONG MIXED-SIZE CODE; Make the call through a 32-bit call gate.
18-4 Vol. 3 MIXING 16-BIT AND 32-BIT CODE 18.3 SHARING DATA AMONG MIXED-SIZE CODE SEGMENTS Data segments can be accessed from both 16-bit and 32-bit code segments. When a data segment that is larger than 64 KBytes is to be shared among 16- and 32-bit code segments, the data that is to be accessed fr...
Page 759 - Writing Interface Procedures; The possible invalidation of the upper bits of the ESP register.
Vol. 3 18-9 MIXING 16-BIT AND 32-BIT CODE 18.4.5 Writing Interface Procedures Placing interface code between 32-bit and 16-bit procedures can be the solution to the following interface problems: • Allowing procedures in 16-bit code segments to call procedures with offsets greater than FFFFH in 32-bi...
Page 761 - PROCESSOR FAMILIES AND CATEGORIES
Vol. 3 19-1 CHAPTER 19 ARCHITECTURE COMPATIBILITY Intel 64 and IA-32 processors are binary compatible. Compatibility means that, within limited constraints, programs that execute on previous generations of processors will produce identical results when executed on later processors. The compatibili...
Page 762 - BITS
19-2 Vol. 3 ARCHITECTURE COMPATIBILITY • Pentium D Processors — A family of dual-core Intel 64 processors that provides two processor cores in a physical package. Each core is based on the Intel NetBurst microarchitecture. • Pentium Processor Extreme Editions — A family of dual-core Intel 64 process...
Page 763 - DETECTING THE PRESENCE OF NEW FEATURES
Vol. 3 19-3 ARCHITECTURE COMPATIBILITY original value results in a general-protection exception (#GP). So, programs that execute on the P6 family and Pentium processors cannot erroneously enable functions that may be implemented in future IA-32 processors. The P6 family and Pentium processors do no...
Page 764 - ADDITIONAL STREAMING SIMD EXTENSIONS
19-4 Vol. 3 ARCHITECTURE COMPATIBILITY control and status register. These instructions and registers are designed to allow SIMD computations to be made on single-precision floating-point numbers. Several of these new instructions also operate on the MMX registers. SSE instructions and registers are ...
Page 765 - Hyper-Threading Technology Architecture.”; TECHNOLOGY; SPECIFIC FEATURES OF DUAL-CORE PROCESSOR
Vol. 3 19-5 ARCHITECTURE COMPATIBILITY 19.10 INTEL HYPER-THREADING TECHNOLOGY Intel Hyper-Threading Technology provides two logical processors that can execute two separate code streams (called threads) concurrently by using shared resources in a single processor core or in a physical package. This ...
Page 766 - Instructions Added Prior to the Pentium Processor; The following instructions were added in the Intel486 processor:; Table 19-1. New Instruction in the Pentium Processor and
19-6 Vol. 3 ARCHITECTURE COMPATIBILITY 19.13.1 Instructions Added Prior to the Pentium Processor The following instructions were added in the Intel486 processor: • BSWAP (byte swap) instruction. • XADD (exchange and add) instruction. • CMPXCHG (compare and exchange) instruction. • INVD (invalidate ...
Page 768 - OPERATIONS; SP
19-8 Vol. 3 ARCHITECTURE COMPATIBILITY The following flags were added to the EFLAGS register in the Pentium processor: • VIF (virtual interrupt flag), bit 19. • VIP (virtual interrupt pending), bit 20. • ID (identification flag), bit 21. The AC flag (bit 18) was added to the EFLAGS register in the I...
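The EFLAGS bit positions listed above (AC at bit 18, VIF at bit 19, VIP at bit 20, ID at bit 21) can be checked in a saved EFLAGS image with a few shifts and masks. The sketch below is illustrative only; the table and function names are hypothetical, and the EFLAGS value is assumed to have been captured elsewhere (for example, from a stack frame).

```python
# Flag name -> bit position, as listed in the text.
EFLAGS_BITS = {"AC": 18, "VIF": 19, "VIP": 20, "ID": 21}


def flags_set(eflags: int) -> list:
    """Return the names of the listed flags that are set in an EFLAGS value."""
    return [name for name, bit in EFLAGS_BITS.items() if (eflags >> bit) & 1]


# Hypothetical EFLAGS image with AC and ID set.
image = (1 << 18) | (1 << 21)
print(flags_set(image))
```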
Page 769 - FPU
Vol. 3 19-9 ARCHITECTURE COMPATIBILITY XCHG BP, [BP] This code functions as the 8086 processor PUSH SP instruction on the P6 family, Pentium, Intel486, Intel386, and Intel 286 processors. 19.17.2 EFLAGS Pushed on the Stack The setting of the stored values of bits 12 through 15 (which includes the IO...
Page 772 - Types; Formats
19-12 Vol. 3 ARCHITECTURE COMPATIBILITY Software written to run on a 16-bit IA-32 math coprocessor may not operate correctly on a 16-bit x87 FPU, if it uses the FLDENV, FRSTOR, or FXRSTOR instructions to change tags to values (other than to empty) that are different from actual register contents. Th...
Page 773 - . Under the most common rounding modes, this
Vol. 3 19-13 ARCHITECTURE COMPATIBILITY ters. The only effect may be in how software handles the tags in the tag word (see also: Section 19.18.4, “x87 FPU Tag Word”). 19.18.6 Floating-Point Exceptions This section identifies the implementation differences in exception handling for floating-point ins...
Page 782 - FPU AND MATH COPROCESSOR INITIALIZATION; 87 Math Coprocessor Initialization; 87 DX math coprocessor) by sampling its ERROR# input some; Microprocessor/Intel 487 SX Math Coprocessor System
19-22 Vol. 3 ARCHITECTURE COMPATIBILITY 19.20 FPU AND MATH COPROCESSOR INITIALIZATION Table 9-1 shows the states of the FPUs in the P6 family, Pentium, Intel486 processors and of the Intel 387 math coprocessor and Intel 287 coprocessor following a power-up, reset, or INIT, or following the execution...
Page 783 - Table 19-3. EM and MP Flag Interpretation
Vol. 3 19-23 ARCHITECTURE COMPATIBILITY Following is an example code sequence to initialize the system and check for the presence of Intel486 SX processor/Intel 487 SX math coprocessor.
fninit
fstcw mem_loc
mov ax, mem_loc
cmp ax, 037fh
jz Intel487_SX_Math_CoProcessor_present ;ax=037fh
jmp Intel486_SX_m...
Page 785 - New Memory Management Control Flags
Vol. 3 19-25 ARCHITECTURE COMPATIBILITY • NE — Numeric error. Enables the normal mechanism for reporting floating-point numeric errors. • WP — Write protect. Write-protects read-only pages against supervisor-mode accesses. • AM — Alignment mask. Controls whether alignment checking is performed. Oper...
Page 787 - Changes in Segment Descriptor Loads; FACILITIES; Differences in Debug Register DR6; Break on instruction execution only.
Vol. 3 19-27 ARCHITECTURE COMPATIBILITY 19.22.4 Changes in Segment Descriptor Loads On the Intel386 processor, loading a segment descriptor always causes a locked read and write to set the accessed bit of the descriptor. On the P6 family, Pentium, and Intel486 processors, the locked read and write o...
Page 788 - RECOGNITION OF BREAKPOINTS
19-28 Vol. 3 ARCHITECTURE COMPATIBILITY are enabled (the DE flag is set), attempts to reference registers DR4 or DR5 will result in an invalid-opcode exception (#UD). 19.24 RECOGNITION OF BREAKPOINTS For the Pentium processor, it is recommended that debuggers execute the LGDT instruction before retu...
Page 790 - Architecture
19-30 Vol. 3 ARCHITECTURE COMPATIBILITY 19.25.1 Machine-Check Architecture The Pentium Pro processor introduced a new architecture to the IA-32 for handling and reporting on machine-check exceptions. This machine-check architecture (described in detail in Chapter 15, “Machine-Check Architecture”) gr...
Page 791 - ADVANCED PROGRAMMABLE INTERRUPT; Software Visible Differences Between the Local APIC and
Vol. 3 19-31 ARCHITECTURE COMPATIBILITY 19.26.3 IDT Limit The LIDT instruction can be used to set a limit on the size of the IDT. A double-fault exception (#DF) is generated if an interrupt or exception attempts to read a vector beyond the limit. Shutdown then occurs on the 32-bit IA-32 processors i...
Page 792 - TASK SWITCHING AND TSS
19-32 Vol. 3 ARCHITECTURE COMPATIBILITY • The remote read delivery mode provided in the 82489DX and local APIC for Pentium processors is not supported in the local APIC in the Pentium 4, Intel Xeon, and P6 family processors. • For the 82489DX, in the lowest priority delivery mode, all the target loc...
Page 793 - P6 Family and Pentium Processor TSS
Vol. 3 19-33 ARCHITECTURE COMPATIBILITY 19.28.1 P6 Family and Pentium Processor TSS When the virtual mode extensions are enabled (by setting the VME flag in control register CR4), the TSS in the P6 family and Pentium processors contains an interrupt redirection bit map, which is used in virtual-8086 ...
Page 796 - Pages
19-36 Vol. 3 ARCHITECTURE COMPATIBILITY 19.29.2 Disabling the L3 Cache A unified third-level (L3) cache in processors based on Intel NetBurst microarchitecture (see Section 11.1, “Internal Caches, TLBs, and Buffers”) provides the third-level cache disable flag, bit 6 of the IA32_MISC_ENABLE MSR. Th...
Page 798 - Fault Handling Effects on the Stack
19-38 Vol. 3 ARCHITECTURE COMPATIBILITY • The initial stack pointer is FFFCH (32-bit operand) or FFFEH (16-bit operand) and will wrap around to 0H as a result of the POP operation. The result of the memory write is implementation-specific. For example, in P6 family processors, the result of the memo...
Page 799 - SEGMENT AND ADDRESS WRAPAROUND
Vol. 3 19-39 ARCHITECTURE COMPATIBILITY 19.32 MIXING 16- AND 32-BIT SEGMENTS The features of the 16-bit Intel 286 processor are an object-code compatible subset of those of the 32-bit IA-32 processors. The D (default operation size) flag in segment descriptors indicates whether the processor treats ...
Page 800 - Wraparound; STORE BUFFERS AND MEMORY ORDERING
19-40 Vol. 3 ARCHITECTURE COMPATIBILITY 19.33.1 Segment Wraparound On the 8086 processor, an attempt to access a memory operand that crosses offset 65,535 or 0FFFFH or offset 0 (for example, moving a word to offset 65,535 or pushing a word when the stack pointer is set to 1) causes the offset to wra...
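The two wraparound cases mentioned for the 8086 (a word access at offset 65,535, and a word push with the stack pointer at 1) can be modeled with 16-bit modular arithmetic. The sketch below only models the wrapping behavior the text attributes to the 8086; the function names are hypothetical, and the excerpt is truncated before it describes how later processors treat the same accesses.

```python
def wrap16(offset: int) -> int:
    """Reduce an offset to 16 bits, as 8086 offset arithmetic does."""
    return offset & 0xFFFF


def word_byte_offsets(offset: int) -> tuple:
    """Offsets of the low and high bytes of a word accessed at `offset`."""
    return (wrap16(offset), wrap16(offset + 1))


def push_word(sp: int) -> tuple:
    """Return (new SP, offsets of the two bytes written) for a word push."""
    new_sp = wrap16(sp - 2)
    return new_sp, word_byte_offsets(new_sp)


print(word_byte_offsets(0xFFFF))
print(push_word(1))
```

In both cases the high byte of the word lands at offset 0 of the same segment.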
Page 804 - Counters
19-44 Vol. 3 ARCHITECTURE COMPATIBILITY Earlier IA-32 processors (such as the Intel486 and Pentium processors) used the KEN# (cache enable) pin and external logic to maintain an external memory map and signal cacheable accesses to the processor. The MTRR mechanism simplifies hardware designs by eli...
Page 805 - TWO WAYS TO RUN INTEL 286 PROCESSOR TASKS
Vol. 3 19-45 ARCHITECTURE COMPATIBILITY The performance-monitoring counters are useful for debugging programs, optimizing code, diagnosing system failures, or refining hardware designs. See Chapter 30, “Performance Monitoring,” for more information on these counters. 19.38 TWO WAYS TO RUN INTEL 286 ...