Table of Contents for Pentium Pro and Pentium II System Architecture
Chapter/Section Title
Page #
Page Count
About This Book
1
10
The MindShare Architecture Series
1
1
Cautionary Note
2
1
What This Book Covers
2
1
What this Book Does not Cover
2
1
Organization of This Book
2
2
Who this Book is For
4
1
Prerequisite Knowledge
4
1
Documentation Conventions
4
1
Hexadecimal Notation
4
1
Binary Notation
4
1
Decimal Notation
5
1
Signal Name Representation
5
1
Warning
5
1
Identification of Bit Fields (logical groups of bits or signals)
6
1
Register Field References
6
1
Resources
6
1
Visit Our Web Site
6
1
We Want Your Feedback
7
4
Part 1: System Overview
11
14
Chapter 1: System Overview
11
14
Introduction
11
2
What is a Cluster?
13
1
What Is a Quad or 4-Way System?
13
1
Bootstrap Processor
13
1
Starting Up Other Processors
13
1
Relationship of Processors to Main Memory
14
1
Processors' Relationships to Each Other
14
3
Host/PCI Bridges
17
8
Bridges' Relationship to Processors
17
2
Bridges' Relationship to PCI Masters and Main Memory
19
1
Bridges' Relationship to PCI Targets
19
1
Bridges' Relationship to EISA or ISA Targets
19
1
Bridges' Relationship to Each Other
20
1
Bridges' Relationship to EISA and ISA Masters and DMA
20
5
Part 2: Processor's Hardware Characteristics
25
354
Hardware Section 1: The Processor
25
174
Chapter 2: Processor Overview
25
10
Two Bus Interfaces
25
1
External Bus
26
3
Bus on Earlier Processors Inefficient for Multiprocessing
26
1
Pentium Bus has Limited Transaction Pipelining Capability
27
1
Pentium Pro Bus Tuned for Multiprocessing
28
1
IA = Legacy
29
1
Instruction Set
29
2
IA Instructions Vary in Length and are Complex
29
1
Pentium Pro Translates IA Instructions into RISC Instructions
29
1
In-Order Front End
30
1
Out-of-Order Middle
30
1
In-Order Rear End
30
1
Register Set
31
2
IA Register Set is Small
31
1
Pentium Pro has 40 General-Purpose Registers
32
1
Elimination of False Register Dependencies
32
1
Introduction to the Internal Architecture
33
2
Chapter 3: Processor Power-On Configuration
35
16
Automatically Configured Features
35
2
Example of Captured Configuration Information
37
8
Setup and Hold Time Requirements
39
1
Run BIST Option
39
1
Error Observation Options
39
1
In-Order Queue Depth Selection
40
1
Power-On Restart Address Selection
40
1
FRC Mode Enable/Disable
40
1
APIC ID Selection
40
1
Selecting Tri-State Mode
41
1
Processor Core Speed Selection
41
1
Processor's Agent and APIC ID Assignment
42
2
FRC Mode
44
1
Program-Accessible Startup Features
45
6
Chapter 4: Processor Startup
51
24
Selection of Processor's Agent and APIC IDs
51
1
Processor's State After Reset
52
3
EDX Contains Processor Identification Info
55
1
State of Caches and Processor's Ability to Cache
55
1
Selection of Bootstrap Processor (BSP)
56
6
Introduction
56
2
BSP Selection Process
58
4
APIC Arbitration Background
58
1
Startup APIC Arbitration ID Assignment
58
1
BSP Selection Process
58
1
Example of APIC Bus Traffic Captured during BSP Selection
59
3
Initial BSP Memory Accesses
62
8
General
62
1
When Caching Disabled, Prefetcher Always Does 32-byte Code Reads
63
1
State 2: 1st Line Read (and jump at FFFFFFF0h executed)
63
1
State 10: Branch Trace Message for Jump of FFFFFFF0h
64
1
State 16: Branch Target Line Read (and 2nd jump executed)
64
1
State 26: Branch Trace Message for 2nd Jump
64
1
State 32: CLI Fetched and Executed
65
1
State 42: CLD Fetched and Executed
65
1
State 50: JMP Rel/Fwd Fetched and Executed
65
1
State 58: Branch Trace Message for JMP Rel/Fwd
65
1
State 64: SHL EDX,10h Fetched and Executed
66
1
State 74: MOV DX,048Bh Fetched and Executed
66
1
State 82: AND Fetched and Executed
66
1
State 90: OUT Fetched and Executed
66
1
State 98: IO Write to 48Bh
66
1
State 106: OR Fetched and Executed
67
1
State 114: MOV BX,CS Fetched and Executed
67
1
State 122: MOV SS,BX Fetched and Executed
67
1
State 130: JE Fetched and Executed
67
1
State 138: Branch Trace Message for JE
67
3
How APs are Started
70
5
AP Detection by the POST/BIOS
70
3
Introduction
70
1
The POST/BIOS Code
70
2
The FindAndInitAllCPUs Routine
72
1
The OS is Loaded and Control is Passed to It
73
1
Uni-Processor OS
73
1
MP OS
73
2
Chapter 5: The Fetch, Decode, Execute Engine
75
58
Please Note
75
1
Introduction
76
2
Enabling the Caches
78
1
Prefetcher
78
2
Issues Sequential Read Requests to Code Cache
78
1
Introduction to Prefetcher Operation
79
1
Brief Description of Pentium Pro Processor
80
2
Beginning, Middle and End
81
1
In-Order Front End
81
1
Out-of-Order (OOO) Middle
81
1
In-Order Rear End
81
1
Intro to the Instruction Pipeline
82
2
In-Order Front End
84
10
Instruction Fetch Stages
84
4
IFU1 Stage: 32-Byte Line Fetched from Code Cache
84
1
IFU2 Stage: Marking Boundaries and Dynamic Branch Prediction
85
1
IFU3 Stage: Align Instructions for Delivery to Decoders
85
3
Decode Stages
88
2
DEC1 Stage: Translate IA Instructions into Micro-Ops
88
1
Micro Instruction Sequencer (MIS)
88
1
DEC2 Stage: Move Micro-Ops to ID Queue
89
1
Queue Micro-Ops for Placement in Pool
89
1
Second Chance for Branch Prediction
90
1
RAT Stage: Overcoming the Small IA Register Set
90
4
ReOrder Buffer (ROB) Stage
91
1
Instruction Pool (ROB) is a Circular Buffer
91
3
Out-of-Order (OOO) Middle
94
1
In-Order Rear End (RET1 and RET2 Stages)
94
1
Three Scenarios
95
23
Scenario One: Reset Just Removed
95
9
Starvation!
95
1
First Instruction Fetch
96
1
First Memory Read Bus Transaction
96
1
Eight Bytes Placed in Prefetch Streaming Buffer
96
2
Instruction Boundaries Marked and BTB Checked
98
1
Between One and Three Instructions Decoded into Micro-Ops
98
2
Source Operand Location Selected (RAT)
100
1
Micro-Ops Advanced to ROB and RS
101
1
Micro-Ops Dispatched for Execution
102
1
Micro-Ops Executed
102
1
Result to ROB Entry (and other micro-ops if necessary)
102
1
Micro-op Ready for Retirement?
102
1
Micro-Op Retired
103
1
Scenario Two: Processor's Caches Just Enabled
104
1
Scenario Three: Caches Enabled for Some Time
105
13
Memory Data Accesses--Loads and Stores
118
4
Handling Loads
118
2
Handling Stores
120
2
Description of Branch Prediction
122
6
486 Branch Handling
122
1
Pentium Branch Prediction
122
1
Pentium Pro Branch Prediction
122
6
Mispredicted Branches are VERY Costly!
122
3
Dynamic Branch Prediction
125
1
General
125
1
Yeh's Prediction Algorithm
125
1
Return Stack Buffer (RSB)
125
1
Static Branch Prediction
126
2
Code Optimization
128
5
General
128
1
Reduce Number of Branches
128
1
Follow Static Branch Prediction Algorithm
129
1
Identify and Improve Unpredictable Branches
129
1
Don't Intermingle Code and Data
129
1
Align Data
129
1
Avoid Serializing Instructions
129
1
Where Possible, Do Context Switches in Software
130
1
Eliminate Partial Stalls: Small Write Followed by Full-Register Read
130
1
Data Segment Register Changes Serialize Execution
130
3
Chapter 6: Rules of Conduct
133
16
The Problem
133
1
General
133
1
A Memory-Mapped IO Example
134
1
Pentium Solution
134
1
Pentium Pro Solution
135
2
State of the MTRRs after Reset
137
1
Memory Types
138
3
Uncacheable (UC) Memory
138
1
Write-Combining (WC) Memory
138
1
Write-Through (WT) Memory
139
1
Write-Protect (WP) Memory
140
1
Write-Back (WB) Memory
141
1
Rules as Defined by MTRRs
141
2
Rules of Conduct Provided in Bus Transaction
143
1
MTRRs and Paging: When Worlds Collide
143
4
Detailed Description of the MTRRs
147
2
Chapter 7: The Processor Caches
149
50
Cache Overview
149
4
Introduction to Data Cache Features
150
1
Introduction to Code Cache Features
151
1
Introduction to L2 Cache Features
151
1
Introduction to Snooping
152
1
Determining Processor's Cache Sizes and Structures
153
1
L1 Code Cache
154
6
Code Cache Uses MESI Subset: S and I
154
1
Code Cache Contains Only Raw Code
154
2
Code Cache View of Memory Space
156
1
Code TLB (ITLB)
156
1
Code Cache Lookup
157
1
Code Cache Hit
157
1
Code Cache Miss
157
2
Code Cache LRU Algorithm: Make Room for the New Guy
157
1
Code Cache Castout
158
1
Code Cache Snoop Ports
159
1
L1 Data Cache
160
19
Data Cache Uses MESI Cache Protocol
160
2
Data Cache View of Memory Space
162
1
Data TLB (DTLB)
162
1
Data Cache Lookup
163
1
Data Cache Hit
163
1
Relationship of L2 and L1 Caches
163
8
Relationship of L2 to L1 Code Cache
164
1
Relationship of L2 and L1 Data Cache
165
1
Read Miss on L1 and L2
165
1
Read Miss On All Other Caches
166
1
Read Hit On E or S Line in One or More Other Caches
166
1
Read Hit On Modified Line in One Other Cache
166
1
Write Hit On L1 Data Cache
166
1
Write Hit on S Line in Data Cache
166
1
Write Hit On E Line in Data Cache
167
1
Write Hit On M Line in Data Cache
167
1
Write Miss On L1 Data Cache
167
1
L1 Data Cache Castout
168
1
Data Cache LRU Algorithm: Make Room for the New Guy
168
3
Data Cache Pipeline
171
2
Data Cache is Non-Blocking
173
1
Earlier Processor Caches Blocked, but Who Cares?
173
1
Pentium Pro Data Cache is Non-Blocking, and That's Important!
173
1
Data Cache has Two Service Ports
174
4
Two Address and Two Data Buses
174
2
Simultaneous Load/Store Constraint
176
2
Data Cache Snoop Ports
178
1
Unified L2 Cache
179
10
L2 Cache Uses MESI Protocol
179
3
L2 Cache View of Memory Space
182
1
Request Received
182
1
L2 Cache Lookup
182
1
L2 Cache Hit
183
1
L2 Cache Miss
184
1
L2 Cache LRU Algorithm: Make Room for the New Guy
184
1
L2 Cache Pipeline
185
3
L2 Cache Snoop Ports
188
1
Toggle Mode Transfer Order
189
2
Self-Modifying Code and Self-Snooping
191
4
Description
191
3
Don't Let Your Data and Code Get Too Friendly!
194
1
ECC Error Handling
195
1
Procedure to Disable All Caching
195
4
Hardware Section 2: Bus Intro and Arbitration
199
62
Chapter 8: Bus Electrical Characteristics
199
8
Introduction
199
1
Everything's Relative
200
1
All Signals Active Low
201
1
Powerful Pullups Snap Lines High Fast
202
1
The Layout
202
1
Synchronous Bus
203
1
Setup and Hold Specs
203
1
Setup Time
203
1
Hold Time
203
1
How High is High and How Low is Low?
204
1
After You See Something, You have One Clock to Do Something About It
205
2
Chapter 9: Bus Basics
207
14
Agents
207
2
Agent Types
207
1
Multiple Personalities
208
1
Uniprocessor vs. Multiprocessor Bus
209
2
Request Agents
211
1
Request Agent Types
211
1
Agent ID
211
1
What Agent ID Used For
211
1
How Agent ID Assigned
212
1
Transaction Phases
212
1
Pentium Transaction Phases
212
1
Pentium Pro Transaction Phases
212
1
Transaction Pipelining
213
6
Bus Divided into Signal Groups
213
1
Step One: Gain Ownership of Request Signal Group
213
1
Step Two: Issue Transaction Request
214
1
Step Three: Yield Request Signal Group, Proceed to Next Signal Group
214
1
Phases Proceed in Predefined Order
214
2
Request Phase
215
1
Error Phase
215
1
Snoop Phase
215
1
Response Phase
215
1
Data Phase
216
1
Next Agent Can't Use Signal Group Until Current Agent Done With It
216
3
Transaction Tracking
219
2
Request Agent Transaction Tracking
219
1
Snoop Agent Transaction Tracking
219
1
Response Agent Transaction Tracking
220
1
The IOQ
220
1
Chapter 10: Obtaining Bus Ownership
221
40
Request Phase
221
1
Symmetric Agent Arbitration--Democracy at Work
222
9
No External Arbiter Required
222
2
Agent ID Assignment
224
1
Arbitration Algorithm
224
1
Rotating ID
224
1
Busy/Idle State
224
1
Bus Parking
224
1
Be Fair!
225
1
What Signal Group are You Arbitrating For?
225
1
Requesting Ownership
225
4
Example of One Symmetric Agent Requesting Ownership
226
1
Example of Two Symmetric Agents Requesting Ownership
227
2
Definition of an Arbitration Event
229
1
Once BREQn# Asserted, Keep Asserted Until Ownership Attained
230
1
Example Case Where Transaction Cancelled Before Started
230
1
Bus Parking Revisited
230
1
Priority Agent Arbitration--Despotism
231
10
Example Priority Agents
231
2
Priority Agent Beats Symmetric Agents, Unless
233
1
Using Simple Approach, Priority Agent Suffers Penalty
234
3
Smarter Priority Agent Gets Ownership Faster
237
4
Ownership Attained in 2 BCLKs
237
2
Ownership Attained in 3 BCLKs
239
2
Be Fair to the Common People
241
1
Priority Agent Parking
241
1
Locking--Shared Resource Acquisition
241
9
Shared Resource Concept
241
1
Testing Availability and Gaining Ownership of Shared Resources
242
1
Race Condition Can Present Problem
242
1
Guaranteeing Atomicity of Read/Modify/Write
243
4
LOCK Instruction Prefix
245
1
Processor Automatically Asserts LOCK# for Some Operations
245
1
Use Locked RMW to Obtain and Give Up Semaphore Ownership
245
1
Duration of Locked Transaction Series
246
1
Back-to-Back RMW Operations
247
1
Locking a Cache Line
247
3
Advantage of Cache Line Locking
248
1
New Directory Bit--Cache Line Locked
248
1
Read and Invalidate Transaction (RWITM, or Kill)
248
1
Line in E or M State
248
1
Semaphore Not in Processor's L1 or L2 Cache
249
1
Semaphore in Cache in E State
250
1
Semaphore in Cache in S State
250
1
Semaphore in Cache in M State
250
1
Blocking New Requests--Stop! I'm Full!
250
11
BNR# is Shared Signal
251
1
Stalled/Throttled/Free Indicator
251
1
Open Gate, Let One Out, Close Gate
252
1
Open Gate, Leave It Open, Let Them All Out
253
1
Gate Wide Open and then Slammed Shut
253
1
BNR# Behavior at Powerup
253
1
BNR# and the Built-In Self-Test (BIST)
254
2
BNR# Behavior During Runtime
256
5
Hardware Section 3: The Transaction Phases
261
66
Chapter 11: The Request and Error Phases
261
16
Caution
261
1
Request Phase
262
11
Introduction to the Request Phase
262
1
Request Signal Group is Multiplexed
263
1
Introduction to the Transaction Types
264
2
Contents of Request Packet A
266
2
32-bit vs. 36-bit Addresses
268
1
Contents of Request Packet B
269
4
Error Phase
273
4
In-Flight Corruption
273
2
Who Samples AERR#?
275
1
Request Agent
275
1
Other Bus Agents
275
1
Who Drives AERR#?
275
1
Request Agent's Response to AERR# Assertion
275
1
Other Guys are Very Polite
276
1
Chapter 12: The Snoop Phase
277
20
Agents Involved in Snoop Phase
277
3
Snoop Phase Has Two Purposes
280
1
Snoop Result Signals are Shared, DEFER# Isn't
280
1
Snoop Phase Duration Is Variable
280
3
Is There a Snoop Stall Duration Limit?
283
1
Memory Transaction Snooping
284
4
Snoop's Effects on Caches
284
3
After Snoop Stall, How Soon Can Next Snoop Result be Presented?
287
1
Self-Snooping
288
1
Non-Memory Transactions Have a Snoop Phase
288
1
Transaction Retry and Deferral
288
6
Permission to Defer Transaction Completion
288
1
DEFER# Assertion Delays Transaction Completion
289
1
Transaction Retry
289
1
Transaction Deferral
290
1
Mail Delivery Analogy
290
1
Example System Operation Overview
290
3
The Wrong Way
290
2
The Right Way
292
1
Bridge Should be a Faithful Messenger
293
1
Detailed Deferred Transaction Description
294
1
What if HITM# and DEFER# both Asserted?
294
2
How Does Locking Change Things?
296
1
Chapter 13: The Response and Data Phases
297
30
Note on Deferred Transactions
297
1
Purpose of Response Phase
297
1
Response Phase Signal Group
298
1
Response Phase Start Point
299
1
Response Phase End Point
299
1
List of Responses
299
2
Response Phase May Complete Transaction
301
1
Data Phase Signal Group
302
1
Five Example Scenarios
302
16
Transaction that Doesn't Transfer Data
302
2
Read that Doesn't Hit a Modified Line and is Not Deferred
304
3
Basics
304
2
Detailed Description
306
1
How Does Response Agent Know Transfer Length?
307
1
What's the Earliest that DBSY# Can be Deasserted?
307
1
Relaxed DBSY# Deassertion
307
1
Write that Doesn't Hit a Modified Line and Isn't Deferred
307
4
Basics
309
1
Previous Transaction May Involve a Write
309
1
Earliest TRDY# Assertion is 1 Clock After Previous Response Issued
309
1
When Does Request Agent First Sample TRDY#?
309
1
When Does Request Agent Start Using Data Bus?
310
1
When Can TRDY# Be Deasserted?
310
1
When Does Request Agent Take Ownership of Data Bus?
311
1
Deliver the Data
311
1
On AERR# or Hard Failure Response
311
1
Snoop Agents Change State of Line from E->I or S->I
311
1
Read that Hits a Modified Line
311
4
Basics
312
2
Transaction Starts as a Read from Memory
314
1
From Memory Standpoint, Changes from Read to Write
314
1
Memory Asserts TRDY# to Accept Data
314
1
Memory Must Drive Response At Right Time
314
1
Snoop Agent Asserts DBSY# and Memory Drives Response
315
1
Snoop Agent Supplies Line to Memory and to Request Agent
315
1
Snoop Agent Changes State of Line from M->S
315
1
Write that Hits a Modified Line
315
3
Data Phase Wait States
318
3
Special Case--Single Quadword, 0-Wait State Transfer
321
2
Response Phase Parity
323
4
Hardware Section 4: Other Bus Topics
327
52
Chapter 14: Transaction Deferral
327
22
Introduction to Transaction Deferral
327
1
Example System Model
327
2
Typical PC Server Model
329
19
The Problem
329
1
Possible Solutions
330
1
An Example Read
331
9
Read Transaction Memorized and Deferred Response Issued
331
2
Bridge Performs PCI Read Transaction
333
1
Deferred Reply Transaction Issued
333
4
Original Request Agent Selected
337
1
Bridge Provides Snoop Result
337
1
Response Phase--Role Reversal
337
1
Data Phase
338
1
Trackers Retire Transaction
338
1
Other Possible Responses
338
2
An Example Write
340
8
Transaction and Write Data Memorized, Deferred Response Issued
340
3
PCI Transaction Performed and Data Delivered to Target
343
1
Deferred Reply Transaction Issued
343
2
Original Request Agent Selected
345
1
Bridge Provides Snoop Result
345
1
Response Phase--Role Reversal
345
1
There is No Data Phase
346
1
Trackers Retire Transaction
346
1
Other Possible Responses
346
2
Pentium Pro Support for Transaction Deferral
348
1
Chapter 15: IO Transactions
349
4
Introduction
349
1
IO Address Range
350
1
Data Transfer Length
350
3
Behavior Permitted by Specification
350
1
How Pentium Pro Processor Operates
351
2
Chapter 16: Central Agent Transactions
353
12
Point-to-Point vs. Broadcast
353
1
Interrupt Acknowledge Transaction
354
4
Background
354
3
How Pentium Pro is Different
357
1
Host/PCI Bridge is Response Agent
357
1
Special Transaction
358
2
General
358
1
Message Types
358
2
Branch Trace Message Transaction Used for Program Debug
360
5
What's the Problem?
360
1
What's the Solution?
361
1
Enabling Branch Trace Message Capability
361
1
Branch Trace Message Transaction
362
3
Packet A Composition
362
1
Packet B Composition
362
1
Proper Response
363
1
Data Composition
363
2
Chapter 17: Other Signals
365
14
Error Reporting Signals
365
3
Bus Initialize (BINIT#)
365
1
Description
365
1
Assertion/Deassertion Protocol
366
1
Bus Error (BERR#)
366
1
Description
366
1
BERR#/BINIT# Assertion/Deassertion Protocol
367
1
Internal Error (IERR#)
367
1
Functional Redundancy Check Error (FRCERR)
367
1
PC-Compatibility Signals
368
1
A20 Mask (A20M#)
368
1
FERR# and IGNNE#
368
1
Diagnostic Support Signals
369
1
Interrupt-Related Signals
370
2
Processor Present Signals
372
1
Power Supply Pins
372
2
Miscellaneous Signals
374
5
Part 3: Pentium II Processor
379
30
Chapter 18: Pentium II Processor
379
30
Introduction
379
1
Single-Edge Cartridge
380
6
Pentium and Pentium Pro Sockets
380
1
Pentium II Processor Cartridge
380
2
Processor Side of SEC Substrate
382
1
General
383
1
Processor Core
383
1
Non-Processor Side of SEC Substrate
383
3
Cartridge Block Diagram
386
1
Dual-Independent Bus Architecture (DIBA)
387
1
Caches
387
1
L1 Code and Data Caches
387
1
L2 Cache
388
1
Cache Error Protection
388
1
Processor Signature
388
1
CPUID Cache Geometry Information
389
1
Fast System Call Instructions
389
1
Frequency of the Processor Core and Buses
390
1
Signal Differences Between Pentium II and Pentium Pro
391
1
MMX
392
1
16-bit Code Optimization
393
1
Pentium Pro Not Optimized
393
1
Pentium II Shadows the Data Segment Registers
393
1
Multiprocessor Capability
394
1
Pentium Pro Processor Bus Arbitration
394
1
Pentium II Processor Bus Arbitration
394
1
Power-Conservation Modes
395
6
Introduction
395
3
Normal State
398
1
AutoHalt Power Down State
398
1
Stop Grant State
399
1
Halt/Grant Snoop State
400
1
Sleep State
400
1
Deep Sleep State
401
1
Voltage Identification
401
2
Treatment of Unused Bus Pins
403
1
Unused Reserved Pins
403
1
TESTHI Pins
403
1
When APIC Signals Are Unused
404
1
Unused GTL+ Inputs
404
1
Unused Active Low CMOS Inputs
404
1
Unused Active High Inputs
404
1
Unused Outputs
404
1
Test Access Port (TAP)
404
1
Deschutes Version of the Pentium II Processor
405
1
Slot 2
405
1
Pentium II Chip Sets
406
1
Boxed Processor
406
3
Part 4: Processor's Software Characteristics
409
114
Chapter 19: Instruction Set Enhancements
409
14
Introduction
409
1
CPUID Instruction Enhanced
409
8
Before Executing CPUID, Determine if Supported
409
2
Basic Description
411
1
Vendor ID and Max Input Value Request
411
1
Request for Vendor ID String and Max EAX Value
412
1
Request for Version and Supported Features
412
2
Request for Cache and TLB Information
414
2
CPUID is a Serializing Instruction
416
1
Serializing Instructions Impact Performance
417
1
Conditional Move (CMOV) Eliminates Branches
417
1
Conditional FP Move (FCMOV) Eliminates Branches
418
1
FCOMI, FCOMIP, FUCOMI, and FUCOMIP
418
1
Read Performance Monitoring Counter (RDPMC)
418
2
What's RDPMC Used For?
419
1
Who Can Execute RDPMC?
419
1
RDPMC Not Serializing Instruction
420
1
RDPMC Description
420
1
Read Time Stamp Counter (RDTSC)
420
1
What's RDTSC Used For?
420
1
Who Can Execute RDTSC?
421
1
RDTSC Doesn't Serialize
421
1
RDTSC Description
421
1
My Favorite--UD2
421
1
Accessing MSRs
421
2
Testing for Processor MSR Support
422
1
Causes GP Exception If
422
1
Input Parameters
422
1
Chapter 20: Register Set Enhancements
423
8
New Registers
423
3
Introduction
423
1
DebugCTL, LastBranch and LastException MSRs
424
2
Introduction
424
1
Last Branch, Interrupt or Exception Recording
425
1
Single-Step Exception on Branch, Exception or Interrupt
426
1
MSR not Defined in Earlier Pentium Pro Documentation
426
1
Disable Instruction Streaming Buffer
426
1
Disable Cache Line Boundary Lock
426
1
New Bits in Pre-Existent Registers
427
2
CR4 Enhancements
427
1
CR3 Enhancements
428
1
Local APIC Base Address Relocation
429
2
Chapter 21: BIOS Update Feature
431
8
The Problem
431
1
The Solution
432
1
The BIOS Update Image
432
3
Introduction
432
2
BIOS Update Header Data Structure
434
1
The BIOS Update Loader
435
1
CPUID Instruction Enhanced
436
1
Determining if New Update Supersedes Previously-Loaded Update
437
1
Effect of RESET# on Previously-Loaded Update
437
1
When Must Update Load Take Place?
437
1
Updates in a Multiprocessor System
438
1
Chapter 22: Paging Enhancements
439
25
Background on Paging
439
1
Page Size Extension (PSE) Feature
440
3
The Problem
440
1
The Solution--Big Pages
441
1
How It Works
441
2
Physical Address Extension (PAE) Feature
443
13
How Paging Normally Works
443
2
What Is the PAE?
445
1
How Is the PAE Enabled?
445
1
Changes to the Paging-Related Data Structures
445
3
Programmer Still Restricted to 32-bit Addresses and 2^20 Pages
448
1
Pages Can be Distributed Throughout Lower 64GB
448
1
CR3 Contains Base Address of PDPT
448
1
Format of PDPT Entry
449
2
TLB Flush Necessary after PDPT Entry Change
451
1
Format of Page Directory Entry
451
3
Format of Page Table Entry
454
2
The PAE and the Page Size Extension (PSE)
456
4
Global Page Feature
460
1
The Problem
460
1
The Solution
460
1
Propagation of Page Table Entry Changes to Multiple Processors
461
3
Chapter 23: Interrupt Enhancements
464
13
New Exceptions
464
1
Added APIC Functionality
464
1
VM86 Mode Extensions
465
10
VM86 Mode Background
465
1
Interrupt-Related Problems and VM86 Tasks
465
4
Software Overhead Associated with CLI/STI Execution
466
2
Attempted Execution of CLI by VM86 Task
466
1
Attempted Execution of STI Instruction
467
1
Servicing of Software Interrupts by DOS or OS
468
1
Solution--VM86 Mode Extensions
469
6
Introduction
469
2
CLI/STI Solution
471
1
EFLAGS[VIF] = 1, EFLAGS[IF] = 1, Interrupt Occurs
471
1
EFLAGS[VIF] = 0, EFLAGS[IF] = 1, Interrupt Occurs
471
1
Software Interrupt Redirection Solution
472
3
Virtual Interrupt Handling in Protected Mode
475
2
Chapter 24: Machine Check Architecture
477
23
Purpose of Machine Check Architecture
477
1
Machine Check Architecture in the Pentium Processor
478
1
Testing for Machine Check Support
479
2
Machine Check Exception
481
1
Machine Check Architecture Register Set
481
9
Composition of Global Register Set
483
2
MCG_CAP Register
483
1
MCG_STATUS Register
483
1
MCG_CTL Register
484
1
Composition of Each Register Bank
485
3
General
485
1
MCi_STATUS Register
486
2
MSR Addresses of the Machine Check Registers
488
2
Initialization of Register Set
490
1
Machine Check Architecture Error Format
491
4
Simple Error Codes
491
1
Compound Error Codes
492
3
External Bus Error Interpretation
495
5
Chapter 25: Performance Monitoring and Timestamp
500
7
Time Stamp Counter Facility
500
1
Time Stamp Counter (TSC) Definition
500
1
Detecting Presence of the TSC
500
1
Accessing the Time Stamp Counter
500
1
Reading the TSC Using RDTSC Instruction
500
1
Reading the TSC Using RDMSR Instruction
501
1
Writing to the TSC
501
1
Performance Monitoring Facility
501
6
Purpose of the Performance Monitoring Facility
501
1
Performance Monitoring Registers
501
3
PerfEvtSel0 and PerfEvtSel1 MSRs
502
1
PerfCtr0 and PerfCtr1
503
1
Accessing the Performance Monitoring Registers
504
1
Accessing the PerfEvtSel MSRs
504
1
Accessing the PerfCtr MSRs
504
1
Accessing Using RDPMC Instruction
504
1
Accessing Using RDMSR/WRMSR Instructions
504
1
Event Types
505
1
Starting and Stopping the Counters
505
1
Starting the Counters
505
1
Stopping the Counters
505
1
Performance Monitoring Interrupt on Overflow
505
2
Chapter 26: MMX: Matrix Math Extensions
507
16
Please Note
507
1
Problems Addressed by MMX
508
5
Problem: Math on Packed Bytes/Words/Dwords
508
1
Solution: MMX Matrix Math/Logical Operations
508
1
Problem: Data not Packed
509
1
Solution: MMX Pack and Unpack Instructions
510
1
Problem: Math Overflows/Underflows
510
1
Solution: Saturating Math
510
1
Problem: Comparisons and Branches
510
1
Solution: MMX Parallel Comparisons
511
2
Single Instruction, Multiple Data (SIMD)
513
1
Detecting Presence of MMX
514
1
Changes to Programming Environment
514
2
General
514
1
Handling a Task Switch
515
1
When Exiting MMX Routine, Execute EMMS
516
1
MMX Instruction Set
516
3
Instruction Groups
516
1
Instruction Syntax
517
1
Instruction Set
517
2
Pentium II MMX Execution Units
519
4
Part 5: Overview of Intel Pentium Pro Chipsets
523
44
Chapter 27: 450GX and KX Chipsets
523
36
Processor Bus Operation
523
1
PCI Bus Operation
523
1
450GX Chipset
523
34
Overview
523
1
Major Features
524
4
Overview of Compatibility PB
528
1
Compatibility PB is the Target
528
1
Transaction Initiated on Processor Bus
528
1
Transaction Initiated on PCI Bus
528
1
Target on Other Side of Compatibility PB
529
1
Transaction Initiated on Processor Bus
529
1
Transaction Initiated on PCI Bus
529
1
Neither Compatibility PB nor Device on Other Side Is Target
529
1
Overview of Aux PB
529
2
Aux PB is the Target
530
1
Transaction Initiated on Processor Bus
530
1
Transaction Initiated on PCI Bus
530
1
Target on Other Side of Aux PB
530
1
Transaction Initiated on Processor Bus
530
1
Transaction Initiated on PCI Bus
531
1
Neither Aux PB nor Device on Other Side Is Target
531
1
Overview of Memory Controller
531
1
Startup Autoconfiguration
531
4
Introduction
531
1
Autoconfiguration of PBs
532
2
Autoconfiguration of Memory Controller
534
1
Processor Bus Agent Configuration
535
3
Transaction Deferral Not Implemented--Retry Used Instead
538
1
How Chipset Members are Configured by Software
539
18
Chipset Configuration Mechanism
539
2
PB Configuration Registers
541
11
Memory Controller Configuration Registers
552
5
450KX Chipset
557
2
Overview
557
1
Major Features
557
2
Chapter 28: 440FX Chipset
559
8
Processor Bus Operation
559
1
PCI Bus Operation
559
1
Chipset Overview
559
1
Major Features
560
1
PMC Configuration Registers
561
6
Appendix A: The MTRR Registers
567
8
Introduction
567
1
Feature Determination
567
1
MTRRdefType Register
568
1
Fixed-Range MTRRs
569
3
Enabling the Fixed-Range MTRRs
569
1
Define Rules Within 1st MB
570
2
Variable-Range MTRRs
572
3
Enabling the Variable-Range MTRRs
572
1
Number of Variable-Range MTRRs
572
1
Format of Variable-Range MTRR Register Pairs
572
1
MTRRphysBasen Register
573
1
MTRRphysMaskn Register
573
1
Examples
573
2
Index
575