AlphaServer ES40 Service Guide
Warning! This is a Class A product. In a domestic environment this product may cause
radio interference in which case the user may be required to take adequate measures.
Notice (translated from German): This is a Class A device with respect to radio
interference limits. In residential areas, operating this device may cause broadcast
interference, in which case the user is responsible for taking appropriate countermeasures.
FCC Notice: This equipment generates, uses, and may emit radio frequency energy.
The equipment has been type tested and found to comply with the limits for a Class A
digital device pursuant to Part 15 of FCC rules, which are designed to provide reasonable
protection against such radio frequency interference.
Operation of this equipment in a residential area may cause interference in which case
the user at his own expense will be required to take whatever measures may be required
to correct the interference.
Any modifications to this device, unless expressly approved by the manufacturer, can
void the user's authority to operate this equipment under Part 15 of the FCC rules.
Contents
Preface
Chapter 1 System Overview
Chapter 2 Troubleshooting
Chapter 3 Power-Up Diagnostics and Display
Chapter 4
hd
info
kill and kill_diags
memexer
memtest
net
nettest
set sys_serial_num
show error
show fru
show_status
sys_exer
test
Chapter 5 Error Logs
Chapter 6
Chapter 7
Chapter 8
Appendix A
Appendix B
Appendix C DPR Address Layout
Appendix D Registers
DPR Registers for 680 Correctable Machine Check Logout Frames
DPR Power Supply Status Registers
DPR 680 Fatal Registers
CPU and System Uncorrectable Machine Check Logout Frame
Console Data Log Event Environmental Error Logout Frame (680 Uncorrectable)
CPU and System Correctable Machine Check Logout Frame
Environmental Error Logout Frame (680 Correctable)
Platform Logout Frame Register Translation
Appendix E
Index
Examples
4-16 memtest
4-17 net -ic and net -s
4-18 nettest
4-19 set sys_serial_num
4-20 show error
4-21 show fru
4-22 show status
4-23 sys_exer
4-24 test -lb
5-1 Console Level Environmental Error Logout Frame
6-1 set ocp_text
6-2 set password
6-3 set secure
6-4 clear password
6-5 Booting Linux
7-1 set com1_mode
7-2 status
7-3 env
7-4 dump
7-5 power on/off
7-6 halt in/out
7-7 reset
7-8 Dial-In Configuration
7-9 Dial-Out Alert Configuration
7-10 set escape
Preface
Intended Audience
This manual is for service providers and self-maintenance customers who are
responsible for servicing ES40 systems.
WARNING: To prevent injury, access is limited to persons who
have appropriate technical training and experience. Such
persons are expected to understand the hazards of working
within this equipment and take measures to minimize danger to
themselves or others. These measures include:
1. Remove any jewelry that may conduct electricity.
2. If accessing the system card cage, power down the system and
wait 2 minutes to allow components to cool.
3. Wear an anti-static wrist strap when handling internal
components.
Document Structure
This manual uses a structured documentation design. Topics are organized into
small sections, usually consisting of two facing pages. Most topics begin with an
abstract that provides an overview of the section, followed by an illustration or
example. The facing page contains descriptions, procedures, and syntax
definitions.
Appendix C, DPR Address Layout, shows the address layout of the dual-port RAM (DPR).
Documentation Titles
1
Title
Order Number
QA-6E88A-G8
EK-ES240-UG
EK-ES240-UI
EK-ES240-PD
EK-ES240-RN
AG-RF9HA-BE
Maintenance Kit                  QZ-01BAB-GZ
Service Guide                    EK-ES240-SV
Service Guide HTML Help          AG-RKAKA-BE
Illustrated Parts Breakdown      EK-ES240-IP
EK-ES240-RG
EK-ES4RM-TP
EK-ES4M2-UP
EK-MS610-DM
EK-KN610-UP
EK-KN610-CL
Chapter 1
System Overview
System Architecture
System Enclosures
Control Panel
System Motherboard
CPU Card
PCI Backplane
Power Supplies
Fans
System Access
Console Terminal
1.1
System Architecture
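[Figure: system architecture block diagram. The C-chip drives the control lines for the eight D-chips and connects over the CAPbus to two P-chips, each serving a 64-bit PCI bus. The CPUs, each with a B-cache, connect to the D-chips over the CPU data bus; the memory arrays (1 or 2 per memory data bus) connect over the memory data buses; the P-chips connect over the PAD bus.]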
This system is designed to fully exploit the potential of the Alpha EV6 and
EV67 chips by using a switch-based (or point-to-point) interconnect system.
With a traditional bus design, the processors, memory, and I/O modules share
the bus. As the number of bus users increases, the transactions interfere with
one another, increasing latency and decreasing aggregate bandwidth. With a
switch-based system, speed is maintained and little degradation in performance
occurs as the number of CPUs, memory, and I/O users increases.
The switched system interconnect uses a set of complex microprocessor 21272
support chips that route the traffic over multiple paths. This chipset consists of
one C-chip, two P-chips, and eight D-chips.
C-chip. Provides the command interface from the CPUs and main memory.
The C-chip allows each CPU to do transactions simultaneously.
D-chips. Provide the data path for the CPUs, main memory, and I/O.
1.2
System Enclosures
[Figure: system enclosures: rackmount, pedestal, and tower]
Model Variants
ES40 systems are offered in two models. The entry-level model provides
connectors for four DIMMs on each of the memory motherboards (MMBs) and
connectors for six PCI options on the PCI backplane. To upgrade from Model 1
to Model 2, you replace the PCI backplane and the four memory motherboards.
Model 1                          Model 2
1-4 CPUs                         1-4 CPUs
Up to 16 DIMMs                   Up to 32 DIMMs
(4 DIMMs on each MMB)            (8 DIMMs on each MMB)
6 PCI slots                      10 PCI slots
Common Components
The following components are common to all ES40 systems:
CD-ROM drive
Up to two storage drive cages that house up to four 1.6-inch drives per cage
A 25-pin parallel port, two 9-pin serial ports, two universal serial bus (USB)
ports, mouse and keyboard ports, and one MMJ connector for a local console
terminal
1.3
1.4
Power supplies
PCI bulkhead
I/O ports
1.5
USB ports.
1.5
Control Panel
The control panel provides system controls and status indicators. The
controls are the Power, Halt, and Reset buttons. A 16-character back-lit
alphanumeric display indicates system state. The panel has two LEDs:
a green Power OK indicator and an amber Halt indicator.
Power LED (green). Lights when the power button is depressed and
system power passes initial checks.
Reset button. A momentary contact switch that restarts the system and
reinitializes the console firmware. Power-up messages are displayed, and
then the console prompt is displayed or the operating system boot
messages are displayed, depending on how the startup sequence has been
defined.
Halt LED (amber). Lights when you press the Halt button.
Halt button. Halts the system.
If the operating system is running, pressing the Halt button halts the
operating system and returns to the SRM console.
If the Halt button is latched when the system is reset or powered up,
the system halts in the SRM console. Systems that are configured to
autoboot cannot boot until the Halt button is unlatched.
Commands issued from the remote management console (RMC) can be used to
reset, halt, and power the system on or off.
RMC Command
Function
Reset
1.6
System Motherboard
The system motherboard is located on the floor of the system card cage.
It has slots for the CPUs and memory motherboards (MMBs) and has
the PCI backplane interconnect.
[Figure: system motherboard layout showing the CPU slots (CPU0 through CPU3), the memory motherboard slots (MMB0 through MMB3), the C-chip, two P-chips, eight D-chips, the connector to the PCI/I/O backplane, and the Vterm and Cterm regulators]
The system motherboard has the majority of the logic for the system, including:
CPU connectors
MMB connectors
RMC logic
1.7
CPU Card
An ES40 can have up to four CPU cards. In addition to the Alpha EV6 or
EV67 chip, the CPU card has a 4-Mbyte second-level cache and a 2.2V
DC-to-DC converter with heatsink that provides the required voltage to
the Alpha chip. Power-up diagnostics are stored in a flash SROM on the
card.
1.8
The system has two 256-bit wide memory data buses, which can move
large amounts of data simultaneously.
[Figure: memory architecture. The memory motherboards (MMBs) connect through Data Bus 0 and Data Bus 1 to all eight D-chips, under control of the C-chip.]
Memory Architecture
Memory throughput in this system is maximized by the following features:
Very low memory latency (120 ns) and high bandwidth with 12 ns clock
ECC memory
Each data bus is 256 bits (32 bytes) wide. The memory bus speed is 83 MHz.
This yields 2.6 GB/sec of bandwidth per bus (32 bytes x 83 MHz = 2.6 GB/sec). With
both buses active, the maximum bandwidth is 5.2 GB/sec.
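The bandwidth arithmetic above can be checked with a short sketch. The constants come from the text; the function name and the choice of 1 MB = 10**6 bytes are assumptions made for illustration.

```python
# Sketch: reproduce the memory bandwidth figures quoted in the text.
BUS_WIDTH_BITS = 256   # width of each memory data bus (from the text)
BUS_CLOCK_MHZ = 83     # memory bus speed in MHz (from the text)
NUM_DATA_BUSES = 2     # two independent data buses (from the text)


def bus_bandwidth_mb_per_sec(width_bits: int, clock_mhz: int) -> int:
    """Peak transfer rate of one bus in MB/sec (assuming 1 MB = 10**6 bytes)."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_mhz


per_bus = bus_bandwidth_mb_per_sec(BUS_WIDTH_BITS, BUS_CLOCK_MHZ)
total = per_bus * NUM_DATA_BUSES
print(f"per bus: {per_bus} MB/sec, both buses: {total} MB/sec")
```

The per-bus result is 2656 MB/sec, which the guide states as 2.6 GB/sec; the 5.2 GB/sec maximum is twice that stated figure.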
The switch interconnect design takes full advantage of the capabilities of the
two wide data buses. The 256 data bits are distributed equally over two
memory motherboards (MMBs). Simultaneously, in a read operation, 128 bits
come from one MMB and the other 128 bits come from another MMB, to make
one 256-bit read. Another 256-bit read operation can occur at the same time on
the other independent data bus.
In addition, two address buses per MMB (one for each array) allow
overlapping/pipelined accesses to maximize use of each data bus. When all
arrays are identical (same size and speed), the memory is interleaved; that is,
sequential blocks of memory are distributed across all four arrays.
Memory Options
Each memory option consists of a set of four 100 MHz, 200-pin
industry-standard DIMMs. The DIMMs are synchronous DRAMs. The Model 1 system
supports up to four memory options (16 DIMMs), and the Model 2 system
supports up to eight options (32 DIMMs). Memory options are available in the
following sizes:
Memory options are installed into memory motherboards (MMBs) located on the
system motherboard (see Figure 1-7). There are four MMBs. The MMBs have
either four or eight slots for installing DIMMs.
See Chapter 6 for memory configuration.
1.9
PCI Backplane
The PCI backplane has two independent 64-bit, 33 MHz PCI buses that
support 64-bit PCI slots. The 64-bit PCI slots are split across the two
buses. The PCI buses support 3.3 V and 5 V options.
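[Figure: PCI block diagram. P-chip 0 serves PCI 0, which has 4 or 3 PCI slots and the bridge to the COM1, COM2, modem, printer, floppy, flash ROM, keyboard, mouse, CD-ROM, and USB functions (including NVRAM functions). P-chip 1 serves PCI 1, which has 6 or 3 PCI slots. The C-chip handles interrupts and configuration.]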
Supports three address spaces: PCI I/O, PCI memory, and PCI configuration
space
I/O Implementation
In a system with 10 I/O slots, PCI 0 has 4 slots, and PCI 1 has 6 slots. In a
system with 6 slots, each PCI has 3 slots; the middle four connectors are not
present.
The Acer Labs 1543C chip provides the bridge from PCI 0 to ISA. The C-chip
controls accesses to memory on behalf of both P-chips.
I/O Ports
The I/O ports are shown in Section 1.5.
[Figure: RMC logic block diagram showing the RMC PIC processor and its PICADBUS, the address latch, the DUART, the COM1 (modem port) and system COM1 UART connections, the dual-port SRAM, the bus isolator, the RMC flash RAM, the system power controller (SPC) PIC, and the TIG and SPC register array, with AUX5 and PWR5 power domains and address, data, status, and control lines]
Address latch
I2C temperature sensors
One to three power supplies provide power to components in the system box.
The system supports redundant power configurations to ensure continued
system operation if a power supply fails. See Chapter 6 for power supply
configurations.
When more than one power supply is installed, the supplies share the load. The
power supplies select line voltage automatically (120V or 240V and 50 Hz or
60 Hz).
Power Supply LEDs
Each power supply has two green LEDs that indicate the state of power to the
system.
+5 V Auxiliary
1.12 Fans
The system has six hot-plug fans that provide front-to-back airflow.
The system fans are shown in Figure 1-13 and described in Table 1-1.
Area Cooled
Power supplies
Left drive cage
4.5-in.
,
4.5-in.
4.5-in.
redundant
6.75-in.
main fan
Figure 1-15 Hard Disk Storage Cage with Drives (Tower View)
Both the tower and pedestal systems have a small front door through which the
control panel and removable media devices are accessible. At the time of delivery, the system keys are taped inside this door.
The tower front door has a lock that lets you secure access to the disk drives and
to the rest of the system.
The pedestal has two front doors, both of which can be locked. The upper door
secures the disk drives and access to the rest of the system, and the lower door
secures the expanded storage.
Chapter 2
Troubleshooting
This chapter describes the starting points for diagnosing problems on ES40
systems. The chapter also provides information resources.
Questions to Consider
Diagnostic Tables
Information Resources
2.1
Questions to Consider
If the operating system is down, but you are able to access the SRM
console, use the console environment diagnostic tools, including the OCP
display, power-up display, and SRM commands.
If you are unable to access the SRM console, enter the RMC CLI and
issue commands to determine the hardware status. See Chapter 7.
If the operating system has crashed and rebooted, the CCAT (Compaq
Crash Analysis Tool), the Compaq Analyze service tools (to interpret
error logs), the SRM crash command, operating system exercisers, and
DEC VET can be used to diagnose system problems.
2.2
Diagnostic Tables
Action
Reference
Chapter 7
Appendix B
Action
Reference
Chapter 3
Chapter 1
Chapter 6
Chapter 6
Chapter 7
Action
Reference
Console firmware is
corrupted. Load new
firmware with fail-safe
loader.
Chapter 3
Chapter 3 and
Chapter 8
Chapter 3
Chapter 4
Power-up screen or
console event log
indicates problems with
mass storage devices.
Checking seating of
modules.
Action
Reference
Chapter 6
Chapter 6
Chapter 4
Action
Reference
Chapter 4
Operating system
has crashed and
rebooted.
Chapter 7
Chapter 5
2.2
This section lists some of the tools and utilities available for acceptance
testing and diagnosis and gives recommendations for their use.
2.2.1
2.2.1
Loopback Tests
Internal and external loopback tests are used to test the components on the I/O
connector assembly (junk I/O) and to test Ethernet cards. The loopback tests
are a subset of the SRM diagnostics.
Use loopback tests to isolate problems with the COM2 serial port, the parallel
port, and Ethernet controllers. See the test command in Chapter 4 for
instructions on performing loopback tests.
2.2.2
SRM console commands are used to set and examine environment variables and
device parameters. For example, the show configuration and show device
commands are used to examine the configuration, and the set envar and show
envar commands are used to set and view environment variables.
SRM commands are also used to invoke ROM-based diagnostics and to run
native exercisers. For example, the test and sys_exer commands are used to
test the system.
See Chapter 6 for information on configuration-related console commands and
environment variables. See Chapter 4 for information on running console
exercisers. See Appendix A for a list of console commands used most often on
ES40 systems.
2.2.2
The remote management console (RMC) is used for managing the server either
locally or remotely. It also plays a key role in error analysis by passing error log
information to the dual-port RAM (DPR), which is shared between the RMC and
the system motherboard logic, so that this information can be accessed by the
system. RMC also controls the control panel display. RMC has a command-line
interface from which you can enter a few diagnostic commands.
RMC can be accessed as long as the power cord for a working supply is plugged
into the AC wall outlet and a console terminal is attached to the system. This
feature ensures that you can gather information when the operating system is
down and the SRM console is not accessible. See Chapter 7.
2.2.3
The Verifier and Exerciser Tool (DEC VET) is an on-line diagnostic tool used to
ensure the proper installation and operation of hardware and base operating
system software. Use DEC VET as part of acceptance testing to ensure that the
CPU, memory, disk, tape, file system, and network are interacting properly.
2.2.3
Crash Dumps
For fatal errors, the operating systems save the contents of memory to a crash
dump file. This file can be used to determine why the system crashed.
CCAT, the Compaq Crash Analysis Tool, is the primary crash dump analysis
tool for analyzing crash dumps on Alpha systems. CCAT compares the results
of a crash dump with a set of rules. If the results match one or more rules,
CCAT notifies the system user of the cause of the crash and provides
information to avoid similar crashes in the future.
2.2.4
2.3
Information Resources
2.3.1
2.3.2
The information contained in this guide, including the FRU procedures and
illustrations, is available in HTML Help format as part of the Maintenance Kit
(QZ-01BAB-GZ). It can also be accessed from The Learning Utility and ProSIC
Web sites.
2.2.4
The firmware resides in the flash ROM on the system motherboard. You can
obtain the latest system firmware from CD-ROM or over the network.
Quarterly Update Service
The Alpha Systems Firmware Update Kit CD-ROM is available by subscription.
Alpha Firmware Internet Access
You can obtain Alpha Firmware updates from the World Wide Web from the
following Web site:
http://ftp.digital.com/pub/Digital/Alpha/firmware/readme.html
The README file describes the firmware directory structure and how to
download and use the files.
If you do not have a Web browser, you can download the files using
anonymous ftp:
ftp.digital.com/pub/Digital/Alpha/firmware
2.3.3
Fail-Safe Loader
The fail-safe loader (FSL) allows you to boot the firmware update utility in an
attempt to repair corrupted console files that reside within the flash ROMs on
the system motherboard. You can download the fail-safe loader from the
Internet (using the firmware update URL) to create your own fail-safe loader
diskettes. See Chapter 3 for information on forcing a fail-safe floppy load.
2.3.4
Software Patches
Software patches for the supported operating systems are available from the
World Wide Web as follows:
http://www.digital.com/alphaserver/support.html
2.3.5
You can download up-to-date files and late-breaking technical information from
the Internet.
The information includes firmware updates, the latest configuration utilities,
software patches, lists of supported options, and more.
http://www.digital.com/alphaserver/es40/es40.html
2.3.6
Supported Options
Chapter 3
Power-Up Diagnostics
and Display
This chapter describes the power-up process and RMC, SROM, and SRM
power-up diagnostics. The following topics are covered:
Power-Up Displays
3.1
The power-up process begins with the power-on of the power supplies.
After the AC and DC power-up sequences are completed, the remote
management console (RMC) reads EEROM information and deposits it
into the DPR. The SROM minimally tests the CPUs, initializes and tests
backup cache, and minimally tests memory. Finally, the SROM loads
the SRM console program into memory and jumps to the first
instruction in the console program.
There are three distinct sets of power-up diagnostics:
1.
2.
3.
3.2
1. When the power cord is plugged into the wall outlet, 5V auxiliary AC
voltage is enabled. The 5 V AUX LEDs on the power supplies are lit, and
the system power controller and RMC are initialized.
2. Pressing the Power button on the control panel or subsequently issuing the
power-on command from the RMC turns on power to the power supplies,
CPU converters, and VTERM regulators. The POK LEDs on the power
supplies are lit and the power supplies are tested. If all power supplies are
bad, power-up stops. All DC/DC converters and regulators are then tested.
If any converter or regulator is bad, power-up stops.
3. CPU_DCOK and SYS_DC_OK are set to true, which means that DC power
on the CPUs and system is okay. All CPUs load the initial Y divisor (clock
multiplier). The OCP power LED is lit.
4. SYS_RESET is set to false. This setting releases the system motherboard
logic and PCI backplane logic from the Reset state.
5. The primary CPU is selected and CPU_(P)_RESET is set to false. This
allows the primary CPU to attempt to load flash SROM code.
6. If the primary CPU is good, it loads flash SROM. If bad, the system tries
the next available CPU and if that CPU is good, it becomes the primary.
The remaining CPUs load flash SROM. The SROM power-up then
continues, as described in Section 3.3.
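The primary CPU selection in steps 5 and 6 can be sketched as follows. This is an illustration only: the function names and the list-of-booleans representation of CPU health are hypothetical, and the scan order (CPU0 first) follows the order of precedence described in Section 3.3.1.

```python
from typing import List, Optional


def select_primary_cpu(cpu_good: List[bool]) -> Optional[int]:
    """Return the index of the first good CPU, or None if no CPU is usable.

    cpu_good[i] is True when CPU i can load flash SROM code (hypothetical
    representation of the power-up check).
    """
    for index, good in enumerate(cpu_good):
        if good:
            return index
    return None


def remaining_cpus(cpu_good: List[bool], primary: int) -> List[int]:
    """Good CPUs other than the primary; these load flash SROM afterwards."""
    return [i for i, good in enumerate(cpu_good) if good and i != primary]


# Example: CPU0 is bad, so CPU1 becomes the primary.
state = [False, True, True, True]
primary = select_primary_cpu(state)
print("primary:", primary, "remaining:", remaining_cpus(state, primary))
```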
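[Figure: power-up flowcharts. Power sequence: turn on the VTERM regulators; check whether each CPU is "alive" and disable any CPU that is not; all CPUs reload the initial Y divisor; continue SROM power-up. SROM sequence: determine the configuration; if it is good, reload using flash SROM; initialize the EV6; test PCI; release the CPUs; run the B-cache tests; load SRM.]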
3.3
Power-Up Displays
NOTE: The power-up text that is displayed on the screen depends on what kind
of terminal is connected as the console terminal: VT or VGA.
If the SRM console environment variable is set to serial, the entire
power-up display, consisting of the SROM and SRM power-up
messages, is displayed on the VT terminal screen. If console is set to
graphics, no SROM messages are displayed, and the SRM messages
are delayed until VGA initialization has been completed.
Section 3.3.1 describes the SROM power-up sequence and shows the SROM
power-up messages and corresponding OCP messages.
Section 3.3.2 shows the messages that are displayed once the SROM has
transferred control to the SRM console.
3.3.1
OCP Message
MHz
PCI Test
Power on
Reload
RelCPU1
RelCPU2
RelCPU3
BC Data
BC Addr
Size Mem
Cfg Mem
Load ROM
Jump to
Console
When the system powers up, the SROM code is loaded into the I-cache
(instruction cache) on the first available CPU, which becomes the primary
CPU. The order of precedence is CPU0, CPU1, and so on. The primary
CPU attempts to access the PCI bus. If it cannot, either a hang or a failure
occurs, and this is the only message displayed.
The primary CPU interrogates the I2C EEROM data as stored in the DPR. The
primary CPU determines the optimum CPU and system configuration to
jump to.
The primary CPU next checks the SROM checksum to determine the
validity of the flash SROM sectors.
If flash SROM is invalid, the primary CPU reports the error and continues
the execution of the SROM code. Invalid flash SROM must be reprogrammed.
If flash SROM is good, the primary CPU programs appropriate registers
with the values from the flash data and selects itself as the target CPU to
be loaded.
The primary CPU (usually CPU0) initializes and then loads the flash
SROM code to the next CPU. That CPU then initializes the EV6/EV67
(21264 chip) and marks itself as a secondary CPU. Once the primary CPU
sees the secondary, it loads the flash SROM code to the next CPU until all
remaining CPUs are loaded.
The flash SROM performs B-cache tests. For example, the ECC data test
verifies the detection logic for single- and double-bit errors.
The primary CPU sizes memory and initiates all memory tests. The
memory is tested for address and data errors for the first 32 MB of
memory. It also initializes all the sized memory in the system.
If a memory failure occurs, an error is reported. An untested memory
array is assigned to address 0 and the failed memory array is deassigned.
The memory tests are re-run on the first 32 MB of memory. If all memory
fails, the No Memory Available message is reported and the system halts.
If all memory passes, the primary CPU loads the console and transfers
control to it.
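The memory test fallback described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: the function name, the callback that stands in for the 32 MB address and data test, and the order in which arrays are tried are all hypothetical.

```python
from typing import Callable, List, Optional


def find_bootable_array(arrays: List[int],
                        test_first_32mb: Callable[[int], bool]) -> Optional[int]:
    """Assign each array to address 0 in turn and test the first 32 MB.

    Returns the first array that passes, or None if all memory fails.
    The order in which arrays are tried is an assumption of this sketch.
    """
    for array in arrays:
        if test_first_32mb(array):
            return array  # console can now be loaded into this memory
        # A failed array is reported and deassigned; the next untested
        # array is then assigned to address 0 and the tests are re-run.
        print(f"array {array}: memory test failed, array deassigned")
    print("No Memory Available")
    return None


# Example: arrays 0 and 1 fail, array 2 passes.
chosen = find_bootable_array([0, 1, 2, 3], lambda array: array >= 2)
print("booting from array", chosen)
```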
3.3.2
The primary CPU prints a message indicating that it is running the console.
Starting with this message, the power-up display is sent to any console
terminal, regardless of the state of the console environment variable.
If console is set to graphics, the display from this point on is saved in a
memory buffer and displayed on the VGA monitor after the PCI buses are
sized and the VGA device is initialized.
The console is started on the secondary CPUs. The example shows a four-processor system.
3.3.3
.
.
.
bus 0, slot 15 -- dqb -- Acer Labs M1543C IDE
starting drivers
entering idle loop
initializing keyboard
starting console on CPU 1
initialized idle PCB
initializing idle process PID
lowering IPL
CPU 1 speed is 500 MHz
create powerup
.
.
.
Memory Testing and Configuration Status
  Array       Size       Base Address    Intlv Mode
--------- ---------- ---------------- ----------
    0       256Mb    0000000060000000    2-Way
    1       512Mb    0000000040000000    2-Way
    2       256Mb    0000000070000000    2-Way
    3      1024Mb    0000000000000000    2-Way
2048 MB of System Memory
Testing the System
Testing the Disks (read only)
Testing the Network
Partition 0, Memory base: 000000000, size: 080000000
initializing GCT/FRU at offset 1dc000
AlphaServer ES40 Console V5.6-102, built on Dec 14 1999 at 01:57:42
P00>>>show heap_expand
heap_expand             64KB
P00>>>
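When reading a memory configuration display like the one in this example, it can help to confirm that the array sizes add up to the reported total and that the base addresses tile the address space without gaps. The sketch below does that for the values shown in the example. The data structure and function names are hypothetical, and the "Mb" sizes are assumed to mean megabytes, as the "2048 MB of System Memory" line suggests.

```python
MB = 1024 * 1024

# (array number, size in MB, base address), copied from the example display.
ARRAYS = [
    (0, 256, 0x0000000060000000),
    (1, 512, 0x0000000040000000),
    (2, 256, 0x0000000070000000),
    (3, 1024, 0x0000000000000000),
]


def total_mb(arrays):
    """Sum of all array sizes in MB."""
    return sum(size for _, size, _ in arrays)


def is_contiguous(arrays):
    """True if the arrays, sorted by base address, cover memory with no gaps."""
    next_base = 0
    for _, size, base in sorted(arrays, key=lambda entry: entry[2]):
        if base != next_base:
            return False
        next_base = base + size * MB
    return True


print(total_mb(ARRAYS), "MB total; contiguous:", is_contiguous(ARRAYS))
```

The total of 2048 MB corresponds to the partition size of 080000000 (hexadecimal) bytes reported in the same display.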
3.3.4
The SRM console event log helps you troubleshoot problems that do not
prevent the system from coming up to the SRM console. The console
event log consists of status messages received during power-up self-tests.
3.4
3.4.1
Associated
Messages
Jump to
Console
1-3
Meaning
SROM code has completed execution. System jumps to
SRM console. SRM messages should start to be
displayed. If no SRM messages are displayed, it may
indicate corrupted firmware. See Section 3.4.2.
VGA monitor not plugged in. The first beep is a long
beep.
1-1-4
ROM err
2-1-2
Cfg ERR n
Cfg ERR s
1-2-4
BC error
CPU error
BC bad
1-3-3
No mem
A few SROM error messages that appear on the operator control panel are
announced by audible error beep codes, as indicated in Table 3-1. For example,
a 1-1-4 beep code consists of one beep, a pause (indicated by the hyphen), one
beep, a pause, and a burst of four beeps. This beep code is accompanied by the
message ROM err.
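The beep code notation can be expanded mechanically, as the following sketch shows. The function name and the number-word table are hypothetical; the wording of the output follows the description of the 1-1-4 code above.

```python
NUMBER_WORDS = {2: "two", 3: "three", 4: "four", 5: "five"}


def describe_beep_code(code: str) -> str:
    """Expand a beep code such as '1-1-4' into words.

    Each hyphen is a pause; each number is a single beep or a burst of beeps.
    """
    groups = [int(part) for part in code.split("-")]
    phrases = []
    for count in groups:
        if count == 1:
            phrases.append("one beep")
        else:
            word = NUMBER_WORDS.get(count, str(count))
            phrases.append(f"a burst of {word} beeps")
    return ", a pause, ".join(phrases)


print(describe_beep_code("1-1-4"))
print(describe_beep_code("1-3-3"))
```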
Related messages are also displayed on the console terminal if the console
device is connected to the serial line and the SRM console environment
variable is set to serial.
3.4.2
Checksum Error
***** Loadable Firmware Update Utility *****
-------------------------------------------------------------
Function    Description
-------------------------------------------------------------
Display     Displays the system's configuration table.
Exit        Done exit LFU (reset).
List        Lists the device, revision, firmware name, and
            update revision.
Readme      Lists important release information.
Update      Replaces current firmware with loadable data
            image.
Verify      Compares loadable and hardware images.
? or Help   Scrolls this function table.
-------------------------------------------------------------
UPD> update
The sequence shown in Example 3-5 is as follows:
1. The system detects the checksum error and writes a message to the console
screen.
2. The system attempts to automatically load the FSL program from the
floppy drive.
3. At the P00>>> console prompt, boot the Loadable Firmware Update Utility
(LFU) from the Alpha Systems Firmware CD (shown in the example as the
variable update_cd).
4. After the entering idle loop message, the banner for the Loadable
Firmware Update Utility is displayed.
5. At the UPD> prompt, enter the update command to load the new console
firmware images.
NOTE: For more information on the LFU, see the Firmware Updates Web site:
http://ftp.digital.com/pub/digital/Alpha/firmware/
3.4.3
No MEM Error
If the SROM code cannot find any usable memory, a 1-3-3 beep code is
issued (one beep, a pause, a burst of three beeps, a pause, and another
burst of three beeps), and the message No MEM is displayed on the
OCP. The system does not come up to the console program. This error
indicates missing or bad DIMMs.
The OCP and console terminal display text similar to the following:
Failed M:1 D:2
Failed M:1 D:1
Failed M:0 D:2
Failed M:0 D:1
Incmpat M:1 D:4
Incmpat M:1 D:3
Incmpat M:0 D:4
Incmpat M:0 D:3
Missing M:3 D:2
Illegal M:2 D:2
No usable memory detected
Indicates that some DIMMs in this array are mismatched. All DIMMs in
the affected array are marked as incompatible (incmpat).
Indicates that a DIMM in this array is missing. All missing DIMMs in the
affected array are marked as missing.
Indicates that the DIMM data for this array is unreadable. All unreadable
DIMMs in the affected array are marked as illegal.
See Chapter 6 for memory configuration rules.
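When many DIMMs are reported at once, it can help to group the status lines from the display above. The following Python sketch parses lines of the form shown in the example. It assumes, as an interpretation rather than something stated in this section, that M is the memory motherboard (MMB) number and D is the DIMM number; the function names are hypothetical.

```python
import re

# One status keyword followed by "M:<n> D:<n>", as in the display above.
STATUS_LINE = re.compile(r"^(Failed|Incmpat|Missing|Illegal)\s+M:(\d+)\s+D:(\d+)$")


def parse_dimm_status(text):
    """Return (status, m, d) tuples for every recognizable status line."""
    results = []
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            results.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return results


def group_by_status(entries):
    """Group the (m, d) positions under each status keyword."""
    groups = {}
    for status, m, d in entries:
        groups.setdefault(status, []).append((m, d))
    return groups


sample = """Failed M:1 D:2
Incmpat M:0 D:3
Missing M:3 D:2
Illegal M:2 D:2
No usable memory detected"""

print(group_by_status(parse_dimm_status(sample)))
```

Lines that do not match the pattern, such as the final summary line, are skipped.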
3.4.4
Meaning
AC loss
CPUn failed
CPU failed. n is 0, 1, 2, or 3.
VTERM failed
CTERM failed
Fan5, 6 failed
OverTemp failure
No CPU in slot 0
TIG error
Meaning
PSn failed
OverTemp Warning
Fann failed
5V bulk warn
VTERM warn
CTERM warn
3.4.5
The SROM power-up identifies errors that may or may not prevent the
system from coming up to the console. It is possible that these errors
may prevent the system from successfully booting the operating
system. Errors encountered during SROM power-up are displayed on
the OCP. Some errors are also displayed on the console terminal
screen if the console output is set to serial.
Table 3-4 lists the SROM error messages. The code numbers shown in the
Code column are displayed in place of OCP or SROM messages if the SROM
flash is invalid.
Code  SROM Message  OCP Message
FD                  PCI Err
FA                  No Mem
EF                  BC Error
EE                  BC Error
ED                  BC Error
EC                  CPU Err
EB                  CPU Err
EA                  BC Error
E9                  BC Error
E8                  BC Error
E7                  BC Error
E6                  ROM Err
E5                  Flpy Err
E4                  TOY Err
E3                  Mem Err
E2                  Mem Err
E1                  Mem Err
E0                  Mem Err
7F                  CfgERR 3
7E                  CfgERR 2
7D                  CfgERR 1
7C                  CfgERR 0
7B                  BC Bad 3
7A                  BC Bad 2
79                  BC Bad 1
78                  BC Bad 0
77                  MtrERR 3
76                  MtrERR 2
75                  MtrERR 1
74                  MtrERR 0
73                  RCPU 3 E
72                  RCPU 2 E
71                  RCPU 1 E
70                  RCPU 0 E
6F                  CfgERR S
3.5
Under some circumstances, you may need to force the activation of the
FSL. For example, if you install a system motherboard that has an
older version of the firmware than your system requires, you may not
be able to bring up the SRM console. In that case you need to force a
floppy load so that you can update the SRM firmware.
[Figure: function jumpers J21, J20, J22, and J23 (pins 1, 2, 3 each) and E296 (positions 1 to 10, ON/OFF) on the system motherboard]
1. Turn off the system. Unplug the power cord from each power supply and
wait for the 5V AUX indicators to extinguish.
2. Remove enclosure covers (tower and pedestal) or the front bezel (rackmount)
to access the system chassis. See Chapter 8 for illustrations.
3. Remove the fan cover and the system card cage cover to gain access to the
system motherboard. See Chapter 8 for illustrations.
4. Remove MMB 1 (closest to the PCI backplane) so that you can access the
function jumpers.
5. Locate the J22 function jumper on the system motherboard. See
Figure 3-2.
6. Enable the fail-safe loader by moving the J22 jumper from pins 1 and 2 to
pins 2 and 3.
NOTE: The J20 and J23 function jumpers must be in their default positions
over pins 1 and 2.
7. Replace the chassis covers and enclosure covers. Plug in the power supplies.
8. Insert the LFU diskette into the floppy drive, and insert the update CD into
the CD-ROM drive.
9. Power up the system and check the control panel display for progress
messages.
10. At the P00>>> prompt, boot the update CD. Enter update at the UPD>
prompt and press Return. Enter yes at the Confirm update prompt.
11. After the update is complete, turn off the system and unplug the power
supplies.
12. Place J22 over pins 1 and 2.
13. Replace MMB 1.
14. Replace the chassis covers and enclosure covers, plug in the power supplies,
and power up the system.
NOTE: For more information on the LFU, see the Firmware Updates Web site:
http://ftp.digital.com/pub/digital/Alpha/firmware/
3.6
If the RMC is not working, the control panel displays the following message:
Bad RMC flash
The SRM console also sends a message to the terminal screen:
*** Error - RMC detected power up error - RMC Flash corrupted ***
You can update the remote management console firmware from flash ROM
using the LFU.
1. Load the update medium.
2. At the UPD> prompt, exit from the update utility, and answer y to the
manual update prompt. Enter update RMC to update the firmware.
UPD> exit
Do you want to do a manual update [y/(n)] y
***** Loadable Firmware Update Utility *****
-------------------------------------------------------------
Function    Description
-------------------------------------------------------------
Display     Displays the system's configuration table.
Exit        Done exit LFU (reset).
List        Lists the device, revision, firmware name, and
            update revision.
Readme      Lists important release information.
Update      Replaces current firmware with loadable data
            image.
Verify      Compares loadable and hardware images.
? or Help   Scrolls this function table.
-------------------------------------------------------------
UPD> update RMC
.
.
.
NOTE: For more information on the LFU, see the Firmware Updates Web site:
http://ftp.digital.com/pub/digital/Alpha/firmware/
Chapter 4
SRM Console Diagnostics
4.1
Diagnostic commands are used to test the system and help diagnose
failures. Table 4-1 gives a summary of the SRM diagnostic commands
and related commands. See Chapter 6 for a list of SRM environment
variables, and see Appendix A for a list of SRM commands most
commonly used for the ES40 system.
Function
buildfru
cat el
Displays the console event log. Same as more el, but scrolls
rapidly. The most recent errors are at the end of the event
log and are visible on the terminal screen.
clear_error
crash
deposit
examine
exer
floppy_write
grep
hd
info
kill
kill_diags
more el
Same as cat el, but displays the console event log one
screen at a time.
memexer
memtest
net -ic
net -s
nettest
set sys_serial_num
show error
show fru
show_status
sys_exer
sys_exer -lb
Runs console loopback tests for the COM2 serial port and
the parallel port during the sys_exer test sequence.
test
test -lb
Runs loopback tests for the COM2 serial port and the
parallel port in addition to verifying the configuration of
devices.
4.2
buildfru
2
Example 4-1 buildfru
P00>>>
P00>>>
P00>>>
P00>>>
buildfru
buildfru
buildfru
buildfru
-s smb0.mmb0.dim1 80 47 46 45 44 43 42 41
Building of the FRU descriptor on a DIMM with the -s qualifier, pass offset
80, and value of 45
Building of the FRU descriptor on a DIMM with the -s qualifier, pass offset
80, and many sequential data bytes
The information supplied on the buildfru command line includes the console
name for the FRU, part number, serial number, model number, and optional
information. The buildfru command facilitates writing the FRU information to
the EEPROM on the device.
Use the show fru command to display the FRU table created with buildfru.
Use the show error command to display FRUs that have errors logged to them.
Typically, you only need to use buildfru in Field Service if you replace a device
for which the information displayed with the show fru command is inaccurate
or missing. After replacing the device, use buildfru to build the new FRU
descriptor.
NOTE: Be sure to enter the FRU information carefully. If you enter incorrect
information, the callout used by Compaq Analyze will not be accurate.
Three areas of the EEPROM can be initialized: the FRU generic data, the FRU
specific data, and the system specific data. Each area has its own checksum,
which is recalculated any time that segment of the EEPROM is written.
When the buildfru command is executed, the FRU EEPROM is first flooded
with zeros and then the generic data, the system specific data, and EEPROM
format version information are written and checksums are updated. For certain
FRUs, such as CPU modules, additional FRU specific data can be entered
using the -s option. This data is written to the appropriate region, and its
corresponding checksum is updated.
FRU Assembly Hierarchy
Alpha-based systems can be decomposed into a collection of FRUs. Some FRUs
carry various levels of nested FRUs. For instance, the system motherboard is a
FRU that carries a number of child FRUs. A child, such as a memory
motherboard (MMB), may carry a number of its own children, DIMMs. The
naming convention for FRUs represents the assembly hierarchy.
The following is the general form of a FRU name:
<frun>[.<frun>[.<frun>]]
The fru is a placeholder for the appropriate FRU type at that level and n is the
number of that FRU instance on that branch of the system hierarchy.
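To make the naming form concrete, the following Python sketch splits a FRU name into its levels. It is illustrative only; the function name is hypothetical and the console does its own parsing. The uppercase conversion mirrors the statement elsewhere in this section that fields are set to uppercase before being written.

```python
import re

# One level of a FRU name: a FRU type followed by its instance number.
_LEVEL = re.compile(r"^([A-Za-z]+)(\d+)$")


def parse_fru_name(name):
    """Split a name such as "smb0.mmb0.dim1" into (type, instance) pairs."""
    levels = []
    for part in name.split("."):
        match = _LEVEL.match(part)
        if match is None:
            raise ValueError(f"not of the form <fru><n>: {part!r}")
        levels.append((match.group(1).upper(), int(match.group(2))))
    if not 1 <= len(levels) <= 3:
        raise ValueError("a FRU name has one to three levels")
    return levels


print(parse_fru_name("smb0.mmb0.dim1"))  # [('SMB', 0), ('MMB', 0), ('DIM', 1)]
```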
The ES40 FRU assembly hierarchy has three levels. The FRU types from the
top to the bottom of the hierarchy are as follows:
Level         FRU Type   Meaning
First Level   SMB        System motherboard
              JIO        I/O connector module (junk I/O)
              OCP        Operator control panel
              PWR (0-2)  Power supplies
              FAN        Fans
Second Level  CPU (0-3)  CPUs
              MMB (0-3)  Memory motherboards
              CPB        PCI backplane
Third Level   DIM (1-8)  Memory DIMMs
              PCI (0-9)  PCI slots
              SBM (0-1)  SCSI backplane
To build a FRU descriptor for a lower level FRU, point back to the higher level
FRUs to which it is associated. For example, to build a descriptor for a DIMM,
point back to the MMB on which it resides and then to the system motherboard.
All fields are automatically set to uppercase before writing to EEPROM. See
Example 4-1.
If you enter the buildfru data correctly for a device that has an EEPROM to
program, nothing is displayed after you enter the command. If you enter
incorrect data or the device does not have an EEPROM to program, an error
message similar to the following is displayed:
P00>>>
P00>>> buildf fan4 54-12345-01.a001 ay84412345
Device FAN4 does not support setting FRU values
P00>>>
Syntax
buildfru ( <fru_name> <part_num> <serial_num> [<misc> [<other>]]
or
-s <fru_name> <offset> <byte> [<byte>...] )
Arguments
<fru_name>
Console name for this FRU. This name reflects the position
of the FRU in the assembly hierarchy.
<part_num>
<serial_num>
<misc>
<other>
<offset>
<byte>...
Options
-s
4.3
The cat el and more el commands display the contents of the console
event log.
In Example 4-2, the console reports that CPU 1 did not power up and fans 1
and 2 failed.
Example 4-2 more el
>>> more el
*** Error - CPU 1 failed powerup diagnostics ***
Secondary start error
EV6 BIST         = 1
STR status       = 1
CSC status       = 1
PChip0 status    = 1
PChip1 status    = 1
DIMx status      = 0
TIG Bus status   = 1
DPR status       = 0
CPU speed status = 0
CPU speed        = 0
Powerup time     = 00-00-00 00:00:00
CPU SROM sync    = 0
*** Error - Fan 1 failed ***
*** Error - Fan 2 failed ***
CPU 1 failed.
4.4
clear_error
Example 4-3 clear_error
P00>>> clear_error smb0
P00>>>
Clears all errors logged in the FRU EEPROM on the system motherboard
(SMB0).
The clear_error command clears TDD, SDD, and checksum errors. Hardware
failures and unreadable EEPROM errors are not cleared. See Table 4-2.
Syntax
clear_error <fruname>
clear_error all
See the show error command for information on the types of errors that might
be logged to the FRU EEPROMs.
4.5
crash
The SRM crash command forces a crash dump to the selected device for
UNIX and OpenVMS systems.
P00>>> crash
CPU 0 restarting
DUMP: 19837638 blocks available for dumping.
DUMP: 118178 wanted for a partial compressed dump.
DUMP: Allowing 2060017 of the 2064113 available on 0x800001
device string for dump = SCSI 1 1 0 0 0 0 0.
DUMP.prom: dev SCSI 1 1 0 0 0 0 0, block 2178787
DUMP: Header to 0x800001 at 2064113 (0x1f7ef1)
device string for dump = SCSI 1 1 0 0 0 0 0.
DUMP.prom: dev SCSI 1 1 0 0 0 0 0, block 2178787
DUMP: Dump to 0x800001: .......: End 0x800001
device string for dump = SCSI 1 1 0 0 0 0 0.
DUMP.prom: dev SCSI 1 1 0 0 0 0 0, block 2178787
DUMP: Header to 0x800001 at 2064113 (0x1f7ef1)
succeeded
halted CPU 0
halt code = 5
HALT instruction executed
PC = fffffc0000568704
P00>>>
Use the crash command when the system has hung and you are able to halt it
with the Halt button or the RMC halt in command. The crash command
restarts the operating system and forces a crash dump to the selected device.
See the OpenVMS Alpha System Dump Analyzer Utility Manual for
information on how to interpret OpenVMS crash dump files.
See the Guide to Kernel Debugging for information on using the Tru64
UNIX Krash Utility.
4.6
examine
P00>>> e dpr:34f0 -l -n 5
dpr:            34F0 00000000
dpr:            34F4 00000000
dpr:            34F8 00000000
dpr:            34FC 00000000
dpr:            3500 204D5253
dpr:            3504 352E3558
P00>>>
Deposit
The deposit command stores data in the location specified. If no options are
given, the system uses the options from the preceding deposit command.
If the specified value is too large to fit in the data size listed, the console ignores
the command and issues an error. If the data is smaller than the data size, the
higher order bits are filled with zeros.
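The size check described above can be sketched as follows. This is an illustration in Python, not the console's code. The bit widths assigned to the b, w, l, q, o, and h qualifiers are the conventional Alpha byte, word, longword, quadword, octaword, and hexword sizes, which this section does not spell out, so treat them as an assumption; the function name is hypothetical.

```python
# Assumed data sizes in bits for each qualifier (not listed in this section).
DATA_SIZE_BITS = {"b": 8, "w": 16, "l": 32, "q": 64, "o": 128, "h": 256}


def deposit_value(value, size="l"):
    """Return the value as stored: zero-filled to the data size, or raise if too large."""
    bits = DATA_SIZE_BITS[size]
    if value < 0 or value >= (1 << bits):
        raise ValueError(f"value does not fit in {bits} bits")
    width = bits // 4  # hex digits needed for this data size
    return f"{value:0{width}X}"


print(deposit_value(0x5A, "l"))  # 0000005A
print(deposit_value(0x5A, "q"))  # 000000000000005A
```

A value of 0x100 with the b qualifier raises an error instead, which corresponds to the console ignoring the command.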
In Example 4-4:
Examine
The examine command displays the contents of a memory location, a register,
or a device.
If no options are given, the system uses the options from the preceding
examine command. If conflicting address space or data sizes are specified, the
console ignores the command and issues an error.
For data lengths longer than a longword, each longword of data should be
separated by a space.
In Example 4-4:
Examine the DPR starting at location 34f0 and continuing through the
next 5 locations, and display the data size in longwords.
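The last two longwords in the examine output (204D5253 and 352E3558) look like text. Assuming they hold ASCII characters stored in little-endian byte order, which is an inference and not something the example states, they can be decoded with a few lines of Python. The function name is hypothetical.

```python
import struct


def longwords_to_text(longwords):
    """Pack each 32-bit value little-endian and decode the bytes as ASCII."""
    raw = b"".join(struct.pack("<I", value) for value in longwords)
    return raw.decode("ascii")


print(longwords_to_text([0x204D5253, 0x352E3558]))  # SRM X5.5
```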
Syntax
deposit [-{b,w,l,q,o,h}] [-{n value, s value}] [space:] address data
examine [-{b,w,l,q,o,h}] [-{n value, s value}] [space:] address
-b
-w
-l (default)
-q
-o
-h
-d
-n value
-s value
dev_name
eerom
fpr
gpr
ipr
pcicfg
pciio
pcimem
pt
pmem
vmem
Virtual memory.
offset
data
Data to be deposited.
4.7
exer
A read operation reads from a device that you specify into a buffer.
The exer command uses two buffers, buffer1 and buffer2, to carry out the
operations. A read or write operation can be performed using either buffer. A
compare operation uses both buffers.
Example 4-5 exer
P00>>> exer dk*.* -p 0 -secs 36000
Read SCSI disks for the entire length of each disk. Repeat this until 36000
seconds, 10 hours, have elapsed. All disks will be read concurrently. Each block
read will occur at a random block number on each disk.
P00>>> ls -l dk*.*
r--dk
0/0
0
P00>>> exer dk*.* -bc 10 -sec 20 -m -a r
dka0.0.0.0.0 exer completed
packet
IOs
8192 3325
27238400
0
166
dka0.0.0.0.0
1360288
elapsed idle
20
19
A destructive write test over block numbers 0 through 100 on disk dka0. The
packet size is 2048 bytes. The action string specifies the following sequence of
operations:
1. Set the current block address to a random block number on the disk
between 0 and 97. A four block packet starting at block numbers 98, 99, or
100 would access blocks beyond the end of the length to be processed so 97 is
the largest possible starting block address of a packet.
2. Write a packet of hex 5a's from buffer1 to the current block address.
3. Set the current block address to what it was just prior to the previous write
operation.
4. From the current block address read a packet into buffer2.
5. Compare buffer1 with buffer2 and report any discrepancies.
6. Repeat steps 1 through 5 until enough packets have been written to satisfy
the length requirement of 101 blocks.
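The six steps above can be sketched against an in-memory stand-in for the disk. This Python example is a simulation only. It assumes 512-byte blocks (a 2048-byte packet of four blocks), uses Python's random module in place of exer's own generator, and rounds the packet count up to cover 101 blocks; all of those choices and the names are assumptions made for illustration.

```python
import random

BLOCK_SIZE = 512            # bytes per block (assumed: 2048-byte packet / 4 blocks)
BLOCKS_PER_PACKET = 4
FIRST_BLOCK, LAST_BLOCK = 0, 100


def run_write_test(disk, seed=0):
    """Write, re-read, and compare random packets; return the mismatch count."""
    rng = random.Random(seed)
    length = LAST_BLOCK - FIRST_BLOCK + 1                 # 101 blocks
    packets = -(-length // BLOCKS_PER_PACKET)             # ceiling division (assumed)
    packet_bytes = BLOCK_SIZE * BLOCKS_PER_PACKET
    buffer1 = bytes([0x5A]) * packet_bytes
    mismatches = 0
    for _ in range(packets):
        # Step 1: random start block; 97 is the largest start that fits a packet.
        start = rng.randint(FIRST_BLOCK, LAST_BLOCK - BLOCKS_PER_PACKET + 1)
        offset = start * BLOCK_SIZE
        # Step 2: write buffer1 at the current block address.
        disk[offset:offset + packet_bytes] = buffer1
        # Steps 3 and 4: go back to the same address and read into buffer2.
        buffer2 = bytes(disk[offset:offset + packet_bytes])
        # Step 5: compare the two buffers.
        if buffer1 != buffer2:
            mismatches += 1
    return mismatches


disk = bytearray((LAST_BLOCK + 1) * BLOCK_SIZE)
print(run_write_test(disk))  # 0
```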
The packet size, also known as the I/O size, which is the number of bytes
read or written in one I/O operation
Syntax
exer ( [-sb <start_block>] [-eb <end_block>] [-p <pass_count>]
[-l <blocks>] [-bs <block_size>] [-bc <block_per_io>]
[-d1 <buf1_string>] [-d2 <buf2_string>] [-a <action_string>]
[-sec <seconds>] [-m] [-v] [-delay <milliseconds>]
<device_name>... )
Arguments
device_name
Options
-sb <start_block>
-eb <end_block>
-p <pass_count>
-l <blocks>
-bs <block_size>
-bc <block_per_io>
-d1 <buf1_string>
-d2 <buf2_string>
-a <action_string>
?
Seek to a random block offset within the
specified range of blocks. exer calls the program,
random, to deal each of a set of numbers once. exer
chooses a set that is a power of two and is greater
than or equal to the block range. Each call to random
results in a number that is then mapped to the set of
numbers that are in the block range and exer seeks
to that location in the filestream. Since exer starts
with the same random number seed, the set of
random numbers generated will always be over the
same set of block range numbers.
Zero buffer 1
Zero buffer 2
-sec <seconds>
-m
-v
-delay <millisecs>
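The random seek described above deals each number in a power-of-two set exactly once and maps the results onto the block range. One way to get that behavior is a full-period linear congruential generator over the power-of-two set, discarding values that fall outside the range. The Python sketch below shows that technique; it is a swapped-in illustration and is not claimed to be the algorithm that exer or random actually uses.

```python
def deal_blocks(block_count, seed=0):
    """Yield every block number in range(block_count) exactly once, in a fixed order."""
    # Smallest power of two that is greater than or equal to the block range.
    size = 1
    while size < block_count:
        size *= 2
    x = seed % size
    for _ in range(size):
        if x < block_count:      # keep only values inside the block range
            yield x
        # With multiplier 5 and increment 1 this step has full period modulo 2**k.
        x = (5 * x + 1) % size


order = list(deal_blocks(101))
print(len(order), sorted(order) == list(range(101)))  # 101 True
```

Because the seed is fixed, repeated runs visit the blocks in the same order, which matches the note above that the same set of numbers is always generated.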
4.8
floppy_write
Example 4-6 floppy_write
P00>>> floppy_write
Destructive Test of the Floppy started
P00>>> show_status
   ID       Program       Device     Pass
-------- ------------ ------------ ------
00000001         idle system            0
00000c37     exer_kid dva0.0.0.100      0
The floppy_write script uses exer to run a write test on the floppy. The test
runs in the background. Use the show_status command to display the progress of the test. Use the kill or kill_diags command to terminate the test.
4.9
grep
Example 4-7 grep
P00>>> show fru | grep PCI
SMB0.CPB0.PCI1    0    DE500-BA Network Cont
SMB0.CPB0.PCI4    0    DEC PowerStorm
SMB0.CPB0.PCI5    0    NCR 53C895
P00>>>
In Example 4-7 the output of the show fru command is piped into grep (the
vertical bar is the piping symbol), which keeps only the lines that contain PCI.
Grep supports the following metacharacters:
^
[]
*
Repeated matching; when placed after a pattern, indicates that the pattern
should match any number of times. For example, [a-z][0-9]* matches a lowercase
letter followed by zero or more digits.
+
Repeated matching; when placed after a pattern, indicates that the pattern
should match one or more times. For example, [0-9]+ matches any non-empty
sequence of digits.
?
Optional matching; indicates that the pattern can match zero or one times. For
example, [a-z][0-9]? matches a lowercase letter alone or followed by a single digit.
\
Quote character; prevents the character that follows from having special meaning.
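The repetition, optional, and quote metacharacters described above behave the same way in most regular expression engines, so the examples can be tried away from the console. The Python sketch below uses the standard re module as a stand-in for the console grep; it is not the console's implementation, and it tests whole strings, whereas grep looks for a match anywhere in a line.

```python
import re


def matches(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


print(matches(r"[a-z][0-9]*", "a"))    # True: zero digits is allowed
print(matches(r"[a-z][0-9]*", "a42"))  # True
print(matches(r"[0-9]+", ""))          # False: at least one digit is required
print(matches(r"[0-9]+", "2048"))      # True
print(matches(r"[a-z][0-9]?", "a4"))   # True
print(matches(r"[a-z][0-9]?", "a42"))  # False: at most one digit
print(matches(r"\*", "*"))             # True: the quoted * is a literal character
```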
Syntax
grep ( [-{c|i|n|v}] [-f <file>] [<expression>] [<file>...] )
Arguments
<expression>
<file>...
Options
-c
-i
-n
-v
-f <file>
4.10 hd
The hd command dumps the contents of a file (byte stream) in
hexadecimal and ASCII.
Example 4-8 hd
P00>>> hd -eb 0 dpr:2b00
block 0
00000000 48 45 4C 4C 4F FF FF FF FF FF FF FF FF FF FF FF HELLO...........
00000010 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000020 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000030 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 3A ...............:
00000040 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000050 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000060 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000070 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000080 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000090 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000000a0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000000b0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000000c0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000000d0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000000e0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000000f0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000100 48 45 4C 4C 4F FF FF FF FF FF FF FF FF FF FF FF HELLO...........
00000110 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000120 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000130 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 3A ...............:
00000140 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000150 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000160 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000170 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000180 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
00000190 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000001a0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000001b0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000001c0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000001d0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000001e0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
000001f0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................
P00>>>
Syntax
hd [-{byte|word|long|quad}] [-{sb|eb} <n>] <file>[:<offset>].
Arguments
<file>[:<offset>]
Options
-byte
-word
-long
-quad
-sb <n>
Start block
-eb <n>
End block
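For readers who want to inspect a saved byte stream away from the console, the following Python sketch produces a dump in the same general layout as the hd example: an offset, sixteen bytes in hexadecimal, and the ASCII rendering with a period for non-printable bytes. It is an independent illustration, not the hd source, and the exact column spacing is an assumption.

```python
def hex_dump(data, width=16):
    """Return dump lines: offset, hex bytes, and printable ASCII (others as '.')."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        text = "".join(chr(byte) if 0x20 <= byte < 0x7F else "." for byte in chunk)
        lines.append(f"{offset:08x} {hex_part} {text}")
    return lines


data = b"HELLO" + b"\xff" * 11
for line in hex_dump(data):
    print(line)
```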
4.11 info
The info command displays registers and data structures. You can
enter the command by itself or followed by a number (0, 1, 2, 3, or 4). If
you do not specify a number, a list of selections is displayed and you
are prompted to enter a selection.
Example 4-9 info 0
P00>>>info
0. HWRPB MEMDSC
1. Console PTE
2. GCT/FRU 5
3. Dump System CSRs
4. IMPURE area (abbreviated)
Enter selection: 0
HWRPB: 2000
MEMDSC:2d40
Cluster count: 5
For information about the data displayed by the info commands, see the
following documents:
For info 0, info 1, and info 4, see the Alpha System Reference Manual,
Third Edition (EY-W938E-DP), available from Digital Press, an imprint of
Butterworth-Heinemann.
For info 2, see the Galaxy Console and Alpha Systems V5.0 FRU
Configuration Tree Specification.
info 0
info 1
Displays the page table entries (PTE) used by the console and operating
system to map virtual to physical memory. Valid data is displayed only
after a boot operation.
info 2
info 3
Dumps the contents of the system control status registers (CSRs) for the
C-chip, D-chip, and P-chips.
info 4
Displays the per CPU impure area in abbreviated form. The console uses
this scratch area to save processor context.
000000003FFA8000 0000000100001101 va 0000000010000000 pa 0000000000002000
000000003FFA8008 0000000200001101 va 0000000010002000 pa 0000000000004000
000000003FFA8010 0000000300001101 va 0000000010004000 pa 0000000000006000
000000003FFA8018 0000000400001101 va 0000000010006000 pa 0000000000008000
000000003FFA8020 0000000500001101 va 0000000010008000 pa 000000000000A000
000000003FFA8028 0000000600001101 va 000000001000A000 pa 000000000000C000
000000003FFA8030 0000000700001101 va 000000001000C000 pa 000000000000E000
000000003FFA8038 0000000800001101 va 000000001000E000 pa 0000000000010000
000000003FFA8040 0000000900001101 va 0000000010010000 pa 0000000000012000
000000003FFA8048 0000000A00001101 va 0000000010012000 pa 0000000000014000
000000003FFA8050 0000000B00001101 va 0000000010014000 pa 0000000000016000
000000003FFA8058 0000000C00001101 va 0000000010016000 pa 0000000000018000
000000003FFA8060 0000000D00001101 va 0000000010018000 pa 000000000001A000
000000003FFA8068 0000000E00001101 va 000000001001A000 pa 000000000001C000
000000003FFA8070 0000000F00001101 va 000000001001C000 pa 000000000001E000
000000003FFA8078 0000001000001101 va 000000001001E000 pa 0000000000020000
000000003FFA8080 0000001100001101 va 0000000010020000 pa 0000000000022000
000000003FFA8088 0000001200001101 va 0000000010022000 pa 0000000000024000
000000003FFA8090 0000001300001101 va 0000000010024000 pa 0000000000026000
000000003FFA8098 0000001400001101 va 0000000010026000 pa 0000000000028000
000000003FFA80A0 0000001500001101 va 0000000010028000 pa 000000000002A000
000000003FFA80A8 0000001600001101 va 000000001002A000 pa 000000000002C000
000000003FFA80B0 0000001700001101 va 000000001002C000 pa 000000000002E000
000000003FFA80B8 0000001800001101 va 000000001002E000 pa 0000000000030000
000000003FFA80C0 0000001900001101 va 0000000010030000 pa 0000000000032000
000000003FFA80C8 0000001A00001101 va 0000000010032000 pa 0000000000034000
000000003FFA80D0 0000001B00001101 va 0000000010034000 pa 0000000000036000
000000003FFA80D8 0000001C00001101 va 0000000010036000 pa 0000000000038000
000000003FFA80E0 0000001D00001101 va 0000000010038000 pa 000000000003A000
000000003FFA80E8 0000001E00001101 va 000000001003A000 pa 000000000003C000
.
.
.
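The relationship between the page table entries and the physical addresses in the listing above can be checked with a short calculation. The Python sketch below assumes the Alpha page table entry layout with the page frame number in the upper 32 bits and an 8 KB page size; the virtual address formula is simply the pattern observed in this listing (one 8 KB page per 8-byte entry, starting at 10000000), not a general rule. Function and constant names are hypothetical.

```python
PAGE_SHIFT = 13              # 8 KB pages (assumed)
PT_BASE = 0x3FFA8000         # address of the first entry in the listing
VA_BASE = 0x10000000         # virtual address mapped by the first entry


def pte_to_pa(pte):
    """Physical address of the page: page frame number times the page size."""
    return (pte >> 32) << PAGE_SHIFT


def entry_to_va(entry_address):
    """Virtual address mapped by the 8-byte entry at entry_address (listing pattern)."""
    index = (entry_address - PT_BASE) // 8
    return VA_BASE + (index << PAGE_SHIFT)


print(f"{pte_to_pa(0x0000000100001101):016X}")  # 0000000000002000
print(f"{pte_to_pa(0x0000001E00001101):016X}")  # 000000000003C000
print(f"{entry_to_va(0x3FFA80E8):016X}")        # 000000001003A000
```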
1de000
c0b531e5309ee27d
8000
5
2
1
0
GCT_ROOT_NODE
Root->lock
Root->transient_level
Root->Current_level
Root->console_req
Root->min_alloc
Root->min_align
Root->base_alloc
Root->base_align
Root->max_phys_addr
Root->mem_size
Root->platform_type
Root->platform_name
Root->primary_instance
Root->first_free
Root->high_limit
Root->lookaside
Root->available
Root->max_partition
Root->partitions
Root->communities
Root->max_plat_partition
Root->max_frag
Root->max_desc
Root->galaxy_id
Root->bindings
ffffffff
1
1
200000
100000
100000
2000000
2000000
800000000
80000000
140500000022
200
0
0
7d40
0
0
1
100
140
2
10
4
1de108
180
1
2740 cnt 1
2800 cnt 1
28c0 cnt 1
.
.
.
dump each node ? (Y/<N>)
CSRs:
801a0000000
0000000800000000
002140869A19796F
00000F6414000125
0000000060005005
0000000040006105
0000000070005005
0000000000007105
E084001000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0300000000000000
0000000000001333
F37FF37FF37FF37F
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
0080
0000
0040
0100
0140
0180
01c0
0200
0240
0600
0640
0280
02c0
0680
06c0
0300
0580
05c0
DCHIP
DSC
DSC2
STR
DREV
CSRs:
801b0000000
7F7F7F7F7F7F7F7F
7F7F7F7F7F7F7F7F
3939393939393939
0101010101010101
:
:
:
:
0800
08c0
0840
0880
PCHIP 0 CSRs:   80180000000
PERR0    0008000000000000 : 03c0
WSBA0    0000000000800000 : 0000
WSBA1    0000000080000001 : 0040
WSBA2    0000000000000000 : 0080
WSBA3    0000000000000000 : 00c0
WSM0     0000000000700000 : 0100
WSM1     000000003FF00000 : 0140
WSM2     0000000000000000 : 0180
WSM3     0000000000000000 : 01c0
TBA0     0000000000000000 : 0200
TBA1     0000000000000000 : 0240
TBA2     0000000000000000 : 0280
TBA3     0000000000000000 : 02c0
PCTL     0000224301440091 : 0300
PLAT     0000000000000000 : 0340
MASK     0000000000000F77 : 0400
PCHIP 1 CSRs:   80380000000
PERR1    0008000000000000 : 03c0
WSBA0    0000000000800000 : 0000
WSBA1    0000000080000001 : 0040
WSBA2    0000000000000000 : 0080
WSBA3    0000000000000000 : 00c0
WSM0     0000000000700000 : 0100
WSM1     000000003FF00000 : 0140
WSM2     0000000000000000 : 0180
WSM3     0000000000000000 : 01c0
TBA0     0000000000000000 : 0200
TBA1     0000000000000000 : 0240
TBA2     0000000000000000 : 0280
TBA3     0000000000000000 : 02c0
PCTL     0000624301440091 : 0300
PLAT     0000000000000000 : 0340
MASK     0000000000000F77 : 0400
TIG CSRs:   c00000
TRR      0000000000000000 : 0000
SMIR     0000000000000000 : 0001
CPUIR    0000000000000000 : 0002
PSIR     0000000000000000 : 0003
MIR      0000000000000000 : 0004
CSLEEP   0000000000000000 : 000d
SMCR     0000000000000000 : 000e
EVHR     0000000000000000 : 0017
FRAR0    0000000000000000 : 001c
FRAR1    0000000000000000 : 001d
cpu00
00004200
00000001
00000000
00000000
00000000
000001c8
00000000
00000000
8ff00000
00152000
00000000
00000000
00000000
00400405
00000000
20000000
00000020
00000000
00000000
20000000
00000020
00001fc0
00000000
00008000
00000000
0f340387
00000000
00000000
00000000
00000004
00000000
00000000
00000000
00000000
00000000
000000b0
00000000
00000020
00000000
000000c3
00000000
00000000
00000000
7bbefcc1
00000007
cpu01
cpu02
cpu03
00004800 00004e00 00005400
00000001 00000001 00000001
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
000001c8 000001c8 000001c8
00000000 00000000 00000000
00000000 00000000 00000000
8ff00000 8ff00000 8ff00000
fe00385f fe00385f fe00385f
00000801 00000801 00000801
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
20000000 20000000 20000000
00000020 00000020 00000020
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
00000020 00000020 00000020
00001fc0 00001fc0 00001fc0
00000000 00000000 00000000
00008000 00008000 00008000
00000000 00000000 00000000
0f340387 0f340387 0f340387
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
00000004 00000004 00000004
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
000000e1 000000e1 000000e1
00000000 00000000 00000000
00000020 00000020 00000020
00000000 00000000 00000000
000000c3 000000c3 000000c3
00000000 00000000 00000000
00000000 00000000 00000000
00000000 00000000 00000000
7bbefcc1 7bbefcc1 7bbefcc1
00000007 00000007 00000007
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
0000
0004
0008
000c
0210
0214
0318
031c
0320
0324
0328
032c
0330
0334
0338
033c
0340
0344
0348
034c
0350
0354
0358
035c
0360
0364
0368
036c
0370
0374
0378
037c
0380
0384
0388
038c
0390
0394
0398
039c
03a0
03a4
03a8
03ac
Device         Pass Hard/Soft Bytes Written Bytes Read
------------ ------ --------- ------------- ----------
system            0    0    0             0          0
memory           12    0    0    6719275008 6719275008
memory           12    0    0    6689914880 6689914880
memory           11    0    0    6689914880 6689914880
dka0.0.0.2.1      0    0    0             0    8612352
dka100.1.0.2      0    0    0             0    8649728
dka200.2.0.2      0    0    0             0    8649728
dqa0.0.0.15.      0    0    0             0    3544064
dfa0.0.0.2.1     84    0    0             0    8619520
dfb0.0.0.102   1066    0    0             0  109256192
dva0.0.0.100      0    0    0             0     980992
ewa0.0.0.4.1    362    0    1       1018720    1018496
IOs
IOs
112
elapsed idle
bytes read bytes written
28672
28672
secs
16
4.13 memexer
The memexer command runs a specified number of memory exercisers
in the background. Nothing is displayed unless an error occurs. Each
exerciser tests all available memory in twice the backup cache size
blocks for each pass.
The following example shows no errors.
Device         Pass Hard/Soft Bytes Written Bytes Read
------------ ------ --------- ------------- ----------
system            0    0    0             0          0
memory           12    0    0    6719275008 6719275008
memory           12    0    0    6689914880 6689914880
memory           11    0    0    6689914880 6689914880
dka0.0.0.2.1      0    0    0             0    8612352
dka100.1.0.2      0    0    0             0    8649728
dka200.2.0.2      0    0    0             0    8649728
dqa0.0.0.15.      0    0    0             0    3544064
dfa0.0.0.2.1     84    0    0             0    8619520
dfb0.0.0.102   1066    0    0             0  109256192
dva0.0.0.100      0    0    0             0     980992
ewa0.0.0.4.1    362    0    1       1018720    1018496
The following example shows a memory compare error indicating bad DIMMs.
In most cases, the failing bank and DIMM position are specified in the error
message.
P00>>> memexer 3
*** Hard Error - Error #41 - Memory compare error
Diagnostic Name
memtest
Expected value:
Received value
Failing addr:
ID
00000193
25c07
35c07
a11848
Device Pass
brd0
114
Test
1
Hard/Soft
0
11-FEB-1999
12:00:01
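When a compare error reports an expected and a received value, the bit positions that differ can narrow down the failing data line. The Python sketch below is a diagnostic aid written for illustration, not a console feature. It takes an exclusive OR of the two values from the example above (expected 25c07, received 35c07) and lists the bit numbers that differ.

```python
def differing_bits(expected, received):
    """Return the bit positions (0 is least significant) where the values differ."""
    diff = expected ^ received
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


print(differing_bits(0x25C07, 0x35C07))  # [16]
```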
If the memory configuration is very large, the console might not test all of the
memory. The upper limit is 1 GB.
Use the show_status command to display the progress of the tests. Use the
kill or kill_diags command to terminate the test.
Syntax
memexer [number]
Arguments
[number]
4.14 memtest
The memtest command exercises a specified section of memory.
Typically memtest is run from the built-in console script. Advanced
users may want to use the specific options described here.
Base Address      Intlv Mode
----------------  ----------
0000000060000000  2-Way
0000000040000000  2-Way
0000000070000000  2-Way
0000000000000000  2-Way
ID        Device  Pass  Test  Hard/Soft  2-JAN-2000
00000118  brd0    1     1     1    0     12:00:01
fffffffe
ffffffff
400004
Starting address
Length of the section to test in bytes
Passcount. In this example, the test will run for 10 passes.
The test detected a failure on DIMM 3, which is located on MMB 2.
Use the show_status command to display the progress of the test. Use the kill
or kill_diags command to terminate the test.
Memtest provides a graycode memory test. The test writes to memory and
then reads the previously written value for comparison. The section of memory
that is tested has its data destroyed. The -z option allows testing outside of the
main memory pool. Use caution because this option can overwrite the console.
Memtest may be run on any specified address. If the -z option is not included
(default), the address is verified and allocated from the firmware's memory
zone. If the -z qualifier is included, the test is started without verification of the
starting address.
When a starting address is specified, the memory is allocated beginning at the
starting address minus 32 bytes, for the length specified. The extra 32 bytes
are reserved for the allocation header information. Therefore, if a
starting address of 0xa00000 and a length of 0x100000 are requested, the area
from 0x9fffe0 through 0xb00000 is reserved. This may be confusing if you try to
start two memtest processes simultaneously, one beginning at 0xa00000
for a length of 0x100000 and the other at 0xb00000 for a length of 0x100000.
The second memtest process reports that it is "Unable to allocate
memory of length 100000 at starting address b00000." Instead, the second
process should use the starting address 0xb00020.
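The address arithmetic above can be checked with a short sketch. This is an illustration only, not part of the console firmware: the function names are hypothetical, and the sketch assumes the upper bound of the reserved area is exclusive, which is what makes 0xb00020 a valid second starting address.

```python
HEADER_BYTES = 32  # allocation header size stated in the text


def reserved_range(start, length):
    """Return the half-open range [low, high) reserved for one memtest run.

    Assumption: the 32-byte header sits immediately below the starting
    address and the end address is exclusive.
    """
    return (start - HEADER_BYTES, start + length)


def overlaps(a, b):
    """True if two half-open ranges share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]


first = reserved_range(0xA00000, 0x100000)
clash = reserved_range(0xB00000, 0x100000)  # header falls inside the first area
fixed = reserved_range(0xB00020, 0x100000)  # header starts where the first area ends

print([hex(v) for v in first])   # bounds of the first reserved area
print(overlaps(first, clash))    # the second process cannot allocate
print(overlaps(first, fixed))    # shifting the start by 0x20 avoids the clash
```

The first range evaluates to 0x9fffe0 through 0xb00000, matching the figures in the text.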
NOTE: If memtest is used to test large sections of memory, testing may take a
while to complete. If you issue a Ctrl/C or kill PID in the middle of
testing, memtest may not abort right away. For speed reasons, a check
for a Ctrl/C or kill is done outside of any test loops. If this is not
satisfactory, you can run concurrent memtest processes in the
background with shorter lengths within the target range.
Memtest Test 1 Graycode Test
Memtest Test 1 uses a graycode algorithm to test a specified section of memory.
The graycode algorithm used is: data = (x>>1)^x, where x is an incrementing
value.
Three passes are made of the memory under test.
The first pass writes alternating graycode and inverse graycode to each group
of four longwords. This causes many data bits to toggle between each 16-byte
write. For example, the graycode patterns for a 32-byte block would be:
Graycode(0)          00000000
Graycode(1)          00000001
Graycode(2)          00000003
Graycode(3)          00000002
Inverse Graycode(4)  FFFFFFF9
Inverse Graycode(5)  FFFFFFF8
Inverse Graycode(6)  FFFFFFFA
Inverse Graycode(7)  FFFFFFFB
The second pass reads each location, verifies the data, and writes the
inverse of the data, one longword at a time. This causes all data bits to be
written as a one and zero.
You can specify the -f (fast) option so that the explicit data verify sections of the
second and third loops are not performed. This does not catch address shorts
but stresses memory with a higher throughput. The ECC/EDC logic can be used
to detect failures.
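The first-pass data patterns listed above can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic, not the diagnostic's source code; the function names are hypothetical, and the assumption that the graycode/inverse alternation continues in groups of four longwords beyond the 32-byte example is inferred from the description.

```python
MASK32 = 0xFFFFFFFF  # one longword is 32 bits


def graycode(x):
    """Graycode as given in the text: data = (x >> 1) ^ x."""
    return ((x >> 1) ^ x) & MASK32


def first_pass_pattern(n_longwords):
    """Return the longwords written by the first pass.

    Assumption: groups of four longwords alternate between graycode
    and its bitwise inverse, starting with graycode.
    """
    pattern = []
    for x in range(n_longwords):
        value = graycode(x)
        if (x // 4) % 2 == 1:  # every second group of four is inverted
            value ^= MASK32
        pattern.append(value)
    return pattern


for index, value in enumerate(first_pass_pattern(8)):
    print(f"{index}: {value:08X}")
```

For eight longwords (a 32-byte block) the values agree with the table in the text: 00000000, 00000001, 00000003, 00000002, FFFFFFF9, FFFFFFF8, FFFFFFFA, FFFFFFFB.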
Syntax
memtest ( [-sa <start_address>] [-ea <end_address>] [-l <length>]
[-bs <block_size>] [-i <address_inc>] [-p <pass_count>]
[-d <data_pattern>] [-rs <random_seed>] [-ba <block_address>]
[-t <test_mask>] [-se <soft_error_threshold>]
[-g <group_name>] [-rb] [-f] [-m] [-z] [-h] [-mb] )
Options
-sa
-ea
-l
Length of section to test in bytes, default is the zone size with the
-rb option and the block_size for all other tests. -l has precedence
over -ea.
-bs
Block (packet) size in bytes in hex, default 8192 bytes. This is used
only for the random block test. For all other tests the block size
equals the length.
-i
-p
-t
-g
Group name
-se
-f
-m
Timer. Prints out the run time of the pass. Default = off.
-z
-d
Used only for march test (2). Uses this pattern as test pattern.
Default = 5s
-h
-rs
Used only for random test (3). Uses this data as the random seed
to vary random data patterns generated. Default = 0.
-rb
-mb
Memory barrier flag. Used only in the -f graycode test. When set
an mb is done after every memory access. This guarantees serial
access to memory.
-ba
Used only for block test (4). Uses the data stored at this address to
write to each block.
4.15 net
The net command performs maintenance operations on a specified
Ethernet port. Net -ic initializes the MOP counters for the specified
Ethernet port, and net -s displays the current status of the port,
including the contents of the MOP counters.
MOP BLOCK:
Network list size: 0
MOP COUNTERS:
Time since zeroed (Secs): 3
TX:
Bytes: 0 Frames: 0
Deferred: 0 One collision: 0 Multi collisions: 0
TX Failures:
Excessive collisions: 0 Carrier check: 0 Short circuit: 0
Open circuit: 0 Long frame: 0 Remote defer: 0
Collision detect: 0
RX:
Bytes: 0 Frames: 0
Multicast bytes: 0 Multicast frames: 0
RX Failures:
Block check: 0 Framing error: 0 Long frame: 0
Unknown destination: 0 Data overrun: 0 No system buffer: 0
No user buffers: 0
P00>>>
Syntax
net [-ic]
net [-s]
Arguments
<port_name>
4.16 nettest
The nettest command tests the network ports using MOP loopback.
Typically nettest is run from the built-in console script. Advanced
users may want to use the specific options and environment variables
described here.
e*
Nettest performs a network test. It can test the ei* or ew* ports in internal
loopback, external loopback, or live network loopback mode.
Nettest contains the basic options to run MOP loopback tests. Many
environment variables can be set from the console to customize nettest before
nettest is started. The environment variables, a brief description, and their
default values are listed in the syntax table in this section. Each variable name
is preceded by e*a0_ or e*b0_ to specify the desired port.
You can change other network driver characteristics by modifying the port
mode. See the -mode option.
Use the show_status display to determine the process ID when terminating an
individual diagnostic test. Use the kill or kill_diags command to terminate
tests.
Syntax
nettest ( [-f <file>] [-mode <port_mode>] [-p <pass_count>]
[-sv <mop_version>] [-to <loop_time>] [-w <wait_time>]
[<port>] )
Arguments
<port>
Options
-f <file>
-mode <port_mode>
-p <pass_count>
-sv <mop_version>
-to <loop_time>
-w <wait_time>
Environment Variables
e*a*_loop_count
e*a*_loop_inc
e*a*_loop_patt
loop_size
SMB0
001f8408
SMB0
001f8408
001f8418
001f8428
001f8438
001f8448
001f8458
SMB0
001f8408
001f8418
001f8428
001f8438
SMB0
001f8408
001f8418
001f8428
001f8438
SMB0
001f8408
001f8418
001f8428
001f8438
001f8408
001f8418
001f8428
001f8438
SMB0
P00>>>
15
................
................
................
................
................
................
........
....S...........
................
................
...............Y
J........54-1234
5-01.A001 ...D.
4Q.AAAAAAAAAAAAA
................
................
................
................
................
................
................
................
.............J!.
The output of the show error command is based on information logged to the
serial control bus EEPROMs on the system FRUs. Both the operating system
and the ROM-based diagnostics log errors to the EEPROMs. This functionality
allows you to generate an error log from the console environment. No errors are
displayed for fans or the OCP because these components do not have an
EEPROM.
Syntax
show error
All FRUs with errors are displayed. If no errors are logged, nothing is displayed
and you are returned to the SRM console prompt.
Example 4-20 shows TDD, SDD, checksum, and sys_serial_num mismatch
errors logged to the EEPROM on the system motherboard (SMB0). Table 4-2
provides a reference to these errors. The bit masks correspond to the bit masks
that would be displayed in the E field of the show fru command.
FRU to which errors are logged; in this example the system motherboard,
SMB0.
A TDD error has been logged. TDDs (test-directed diagnostics) test specific
functions sequentially. Typically, nothing else is running during the test.
TDDs are performed in SROM or XSROM or early in the console power-up
flow.
Text Message
01
02
04
08
<fruname> EEPROM
Unreadable
Reserved.
10
20
40
80
<fruname> SYS_SERIAL_NUM
Mismatch
FRUname              E   Part#               Serial#     Misc.     Other
SMB0                 00  54-25385-01.E01     AY94412345
SMB0.CPU0            00  54-30158-A5         NI90260078
SMB0.CPU1            00  54-30158-A5         NI90260073
SMB0.CPU2            00  54-30158-A5         NI90260056
SMB0.CPU3            00  54-30158-A5         NI90260071
SMB0.MMB0            00  54-25582-01.B02     AY90112345
SMB0.MMB0.DIM1       00  54-24941-EA.A01CPQ  NI90202001
SMB0.MMB0.DIM2       00  54-24941-EA.A01CPQ  NI90200102
SMB0.MMB0.DIM3       00  54-24941-EA.A01CPQ  NI90200103
SMB0.MMB0.DIM4       00  54-24941-EA.A01CPQ  NI90200104
SMB0.MMB0.DIM5       00  54-24941-EA.A01CPQ  NI90202005
SMB0.MMB0.DIM6       00  54-24941-EA.A01CPQ  NI90202006
SMB0.MMB1            00  54-25582-01.B02     AY90112301
SMB0.MMB1.DIM1       00  54-25053-BA.A01CPQ  NI90112341
SMB0.MMB1.DIM2       00  54-25053-BA.A01CPQ  NI90112342
SMB0.MMB1.DIM3       00  54-25053-BA.A01CPQ  NI90112343
SMB0.MMB1.DIM4       00  54-25053-BA.A01CPQ  NI90112344
SMB0.MMB1.DIM5       00  54-25053-BA.A01CPQ  NI90112345
SMB0.MMB1.DIM6       00  54-25053-BA.A01CPQ  AY80112346
SMB0.MMB2            00  54-25582-01.B02     AY80012302
SMB0.MMB2.DIM1       00  54-25053-BA.A01CPQ  NI90112331
SMB0.MMB2.DIM2       00  54-25053-BA.A01CPQ  AY80112332
SMB0.MMB2.DIM3       00  54-25053-BA.A01CPQ  AY80112333
SMB0.MMB2.DIM4       00  54-25053-BA.A01CPQ  AY80112334
SMB0.MMB2.DIM5       00  54-25053-BA.A01CPQ  AY80112335
SMB0.MMB2.DIM6       00  54-25053-BA.A01CPQ  AY80112336
SMB0.MMB3            00  54-25582-01.B02     AY90112303
SMB0.MMB3.DIM1       00  54-25053-BA.A01CPQ  AY80112341
SMB0.MMB3.DIM2       00  54-25053-BA.A01CPQ  AY80112342
SMB0.MMB3.DIM3       00  54-25053-BA.A01CPQ  AY80112343
SMB0.MMB3.DIM4       00  54-25053-BA.A01CPQ  AY80112344
SMB0.MMB3.DIM5       00  54-25053-BA.A01CPQ  AY80112345
SMB0.MMB3.DIM6       00  54-25053-BA.A01CPQ  AY80112346
SMB0.CPB0            00  54-25573-01         AY80100999
SMB0.CPB0.PCI4       00  ELSA GLoria Synergy
SMB0.CPB0.PCI5       00  NCR 53C895
SMB0.CPB0.PCIA       00  DE500-BA Network Cont
SMB0.CPB0.SBM0       00
PWR0                 00  30-49448-01.A02     2P90700557  API-7850
PWR1                 00  30-49448-01.A02     2P90700558  API-7850
FAN1                 00  70-40073-01                     Fan
FAN2                 00  70-40073-01                     Fan
FAN3                 00  70-40072-01                     Fan
FAN4                 00  70-40071-01                     Fan
FAN5                 00  70-40073-02                     Fan
FAN6                 00  70-40074-01                     Fan
JIO0                 00  54-25575-01                     Junk I/O
OCP0                 00  70-33894-0x                     OCP
P00>>>
FRUname
Part #
Serial #
Model/Other
Alias/Misc
Table 4-3 lists bit assignments for failures that could potentially be listed in the
E (error) field of the show fru command. Because the E field is only two
characters wide, bits are ORed together if the device has multiple errors. For
example, the E field for a FRU with both TDD (02) and SDD (04) errors would
be 06:
010 | 100 = 110 (6)
Meaning
01
Hardware failure
02
04
08
Reserved
10
20
40
80
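The ORing of E-field bits can be illustrated with a short sketch. It is not console code: the function names are hypothetical, and only the bit meanings that are legible in this section (01, 02, 04, 08) are labeled. The remaining bits are reported by number only.

```python
# Bit meanings taken from the surrounding text; bits 10 through 80 are
# deliberately left unlabeled in this sketch.
KNOWN_BITS = {
    0x01: "Hardware failure",
    0x02: "TDD error",
    0x04: "SDD error",
    0x08: "Reserved",
}


def combine(*masks):
    """OR several error masks into the two-character E field."""
    value = 0
    for mask in masks:
        value |= mask
    return f"{value:02X}"


def decode(e_field):
    """Split a two-character E field back into its individual bits."""
    value = int(e_field, 16)
    names = []
    for bit in (0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80):
        if value & bit:
            names.append(KNOWN_BITS.get(bit, f"bit {bit:02X} (not labeled in this sketch)"))
    return names


print(combine(0x02, 0x04))  # TDD and SDD together
print(decode("06"))
```

Combining 02 and 04 yields 06, as in the example above, and decoding 06 returns the TDD and SDD entries.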
4.20 show_status
The show_status command displays the progress of diagnostics. The
command reports one line of information per executing diagnostic.
Many of the diagnostics run in the background and provide
information only if an error occurs.
ID       Program      Device       Pass   Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ----------
00000001 idle         system            0    0    0             0          0
0000125e memtest      memory           12    0    0    6719275008 6719275008
00001261 memtest      memory           12    0    0    6689914880 6689914880
00001268 memtest      memory           11    0    0    6689914880 6689914880
0000126f exer_kid     dka0.0.0.2.1      0    0    0             0    8612352
00001270 exer_kid     dka100.1.0.2      0    0    0             0    8649728
00001271 exer_kid     dka200.2.0.2      0    0    0             0    8649728
00001278 exer_kid     dqa0.0.0.15.      0    0    0             0    3544064
00001280 exer_kid     dfa0.0.0.2.1     84    0    0             0    8619520
00001281 exer_kid     dfb0.0.0.102   1066    0    0             0  109256192
0000128e exer_kid     dva0.0.0.100      0    0    0             0     980992
00001381 nettest      ewa0.0.0.4.1    362    0    1       1018720    1018496
P00>>>
Process ID
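When show_status output is captured from a serial console log, a small script can pick out the diagnostics that have recorded errors. The sketch below is an assumption-laden illustration: it expects each data row to contain eight whitespace-separated fields (ID, program, device, pass, hard errors, soft errors, bytes written, bytes read) as in the example, and the names StatusRow, parse_status, and rows_with_errors are hypothetical.

```python
from typing import NamedTuple


class StatusRow(NamedTuple):
    pid: str
    program: str
    device: str
    passes: int
    hard: int
    soft: int
    bytes_written: int
    bytes_read: int


def parse_status(text):
    """Parse captured show_status output into StatusRow records.

    Header, separator, and prompt lines are skipped because they do not
    have exactly eight fields with five trailing integers.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 8 or not all(p.isdigit() for p in parts[3:]):
            continue
        rows.append(StatusRow(parts[0], parts[1], parts[2], *map(int, parts[3:])))
    return rows


def rows_with_errors(rows):
    """Return rows whose hard or soft error count is nonzero."""
    return [row for row in rows if row.hard or row.soft]


SAMPLE = """\
ID       Program      Device       Pass   Hard/Soft Bytes Written Bytes Read
00000001 idle         system            0    0    0             0          0
0000125e memtest      memory           12    0    0    6719275008 6719275008
00001381 nettest      ewa0.0.0.4.1    362    0    1       1018720    1018496
P00>>>
"""

for row in rows_with_errors(parse_status(SAMPLE)):
    print(row.pid, row.program, row.device, "hard:", row.hard, "soft:", row.soft)
```

The process ID in the first field is the value to supply to the kill command when terminating an individual diagnostic.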
4.21 sys_exer
The sys_exer command exercises the devices displayed with the show
config command. Tests are run concurrently and in the background.
Nothing is displayed after the initial test startup messages unless an
error occurs.
4.22 test
The test command verifies all the devices in the system. This command
can be used on all supported operating systems.
To run a complete diagnostic test using the test command, the system
configuration must include:
Read-only tests: DK* disks, DR* disks, DQ* disks, MK* tapes, DV* floppy.
NOTE: You must install media to test disks, tapes, and the floppy drive. Since
no write tests are performed, it is safe to test disks and tapes that
contain data.
3. Console loopback tests if -lb argument is specified: COM2 serial port and
parallel port.
4. VGA console tests: These tests are run only if the console environment
variable is set to serial. The VGA console test displays rows of the word
compaq.
5.
Chapter 5
Error Logs
This chapter tells how to interpret error logs reported by the operating system.
The following topics are covered:
Machine Checks/Interrupts
5.1
5.1.1
5.1.2
When you invoke the Compaq Analyze GUI, the node localhost opens
by default for all operating systems. The localhost is the system on
which CA is running. If an event has occurred, it is listed under
localhost Events. See Figure 5-1.
5.1.3
After you select the Problem Found report and click on Display
Information, a full description of the error is displayed and probable
FRUs and their location are called out. Figure 5-3 shows the beginning
of a Compaq Analyze problem found report.
Managed Entity
The Managed Entity designator includes the system host name (typically a
computer name for networking purposes), the type of computer system
(Compaq AlphaServer ES40), and the error event identification. The error
event identification uses the new common event header Event_ID_Prefix and
Event_ID_Count components. The Event_ID_Prefix refers to an OS-specific
identification for this event type. The Event_ID_Count indicates which
occurrence of this event type the event is.
Brief Description
The Brief Description designator indicates whether the error event is related to
the CPU, system (PCI, storage, and so on), or environmental subsystem.
Callout ID
The last 12 characters of the Callout ID designator can be used to determine the
revision level of the analysis rule-set that is being used.
Severity
The Severity designator indicates the severity of the problem.
Severity  Service
Level     Relevance    Comments
          Critical     Not currently used.
1         Major        Fatal event that typically requires service.
2         Minor        Non-fatal or redundant warning event that
                       typically requires future service, but system still
                       operates normally.
3         Information  System service event such as enclosure PCI or fan
                       door is open and requires closing.
          Unknown      Not currently used.
Reporting Node
The Reporting Node designator is synonymous with the Managed Entity host
name when Compaq Analyze is used to diagnose problems on the system on
which it is running. For future implementations, the reporting node may be a
system server reporting about a client within an enterprise computing
environment.
Full Description
The Full Description designator provides detailed error information, which can
include a description of the detected fault or error condition, the specific address
or data bit where this fault or error occurred, the probable FRU list, and service
related information.
FRU List
The FRU List designator lists the most probable defective FRUs. This list
indicates that service needs to be administered to one or more of these FRUs.
The information typically includes the FRU probability, manufacturer, system
device type, system physical location, part number, serial number, and
firmware revision level (if applicable).
In Figure 5-4 the most probable failing FRU is DIMM 3 on MMB1. The next
most probable is the system motherboard, and the least probable is MMB1.
Evidence
The Evidence designator provides information that leads Compaq Analyze to
identify the failing FRU and its location. A portion of the Evidence designator
is shown in Figure 5-5. The evidence provided depends on the type of error that
is detected. The error types are:
CPU Correctable Error (630)
CPU Uncorrectable Error (670)
System Correctable Error (620)
System Uncorrectable Error (660)
System Correctable Environmental (680)
Brief descriptions of the errors in these categories are given in Section 5.3. See
Appendix D for the source data Compaq Analyze uses to isolate to the FRUs.
The Evidence designator provides a hex dump of the error event information
that triggered the indictment. The evidence is broken into segments and
described as follows:
5.2
3.
Logs double error halt error frames into the flash ROM
For single halts, logs the uncorrectable logout frame into the DPR.
Memory DIMMs
5.3 Machine Checks/Interrupts
The exceptions that result from hardware system errors are called
machine checks/interrupts. They occur when a system error is detected
during the processing of a data request.
During the error-handling process, errors are first handled by the appropriate
PALcode error routine and then by the associated operating system error
handler. PALcode transfers control to the operating system through the PAL
handler.
Table 5-2 lists the machine checks/interrupts that are related to error events.
The designations 630, 670, 620, 660, and 680 indicate a system control
block (SCB) offset to the fatal system error handler for Tru64 UNIX and
OpenVMS.
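As a quick reference, the five SCB designations can be kept in a lookup table and applied to console messages that report an event "through vector nnn". The sketch below is illustrative only: the error-type names are the ones listed for the Evidence designator earlier in this chapter, the message format is assumed from the console output shown later in the chapter, and the function names are hypothetical.

```python
import re

# SCB offset designation -> error type, as listed in this chapter.
ERROR_TYPES = {
    "630": "CPU Correctable Error",
    "670": "CPU Uncorrectable Error",
    "620": "System Correctable Error",
    "660": "System Uncorrectable Error",
    "680": "System Correctable Environmental",
}

# Assumed message shape: "... through vector 680 on CPU 0"
EVENT_RE = re.compile(r"through vector (\w+) on CPU (\d+)")


def classify_event(line):
    """Return (vector, cpu, error type) for a console event line, or None."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    vector, cpu = match.group(1), int(match.group(2))
    return vector, cpu, ERROR_TYPES.get(vector, "unknown vector")


print(classify_event("*** unexpected system event through vector 680 on CPU 0"))
```

The designations are handled as strings so that the sketch makes no assumption about their numeric base.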
Error Descriptions
5.3.1
63 56 55 48 47 40 39 32 31 24 23 16 15
ech0000
NEW COMMON OS HEADER
ech+nnnn
lfh0000
lfh+nnnn
lfev60000
lfev6+nnnn
lfctt_A0[u]
lfctt_A8[u]
lfctt_B0[u]
lfctt_B8[u]
lfctt_C0[u]
SESF<15:0>=
0002(hex)
lfett_C8[u]
Pchip1 Extended Tsunami/Typhoon System Packet
lfett_138[u]
eelcb_140
eelcb_190
eelcb_1E0
eelcb_230
eelcb_280
eelcb_2D0
2D8
5.4
door_open        0000000000000004
temp_warning     0000000000000000
fan_ctrl_fault   0000000000000000
power_down_code  0000000000000000
reserved_1       0000000000000000
P00>>>
*** unexpected system event through vector 680 on CPU 0
os_flags         0000000000000000
cchip_dirx       0004000000000000
tig_smir         0000000000000008
tig_cpuir        000000000000000f
tig_psir         0000000000000003
lm78_isr         0000000000000000
door_open        0000000000000040
temp_warning     0000000000000000
fan_ctrl_fault   0000000000000000
power_down_code  0000000000000000
reserved_1       0000000000000000
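When reviewing a captured 680 event frame, it can help to list only the fields that are nonzero. The following sketch is a reading aid under stated assumptions: it treats each field as a name followed by a 16-digit hexadecimal value, it does not interpret what any bit means, and the names parse_fields and nonzero_fields are hypothetical.

```python
import re

# A field is a name followed by exactly 16 hexadecimal digits.
FIELD_RE = re.compile(r"(\w+)\s+([0-9a-fA-F]{16})\b")


def parse_fields(text):
    """Return {field name: integer value} for every name/value pair found."""
    return {name: int(value, 16) for name, value in FIELD_RE.findall(text)}


def nonzero_fields(text):
    """Return only the fields whose value is not zero."""
    return {name: value for name, value in parse_fields(text).items() if value}


FRAME = """\
*** unexpected system event through vector 680 on CPU 0
os_flags         0000000000000000
cchip_dirx       0004000000000000
tig_smir         0000000000000008
tig_cpuir        000000000000000f
tig_psir         0000000000000003
lm78_isr         0000000000000000
door_open        0000000000000040
temp_warning     0000000000000000
fan_ctrl_fault   0000000000000000
power_down_code  0000000000000000
reserved_1       0000000000000000
"""

for name, value in nonzero_fields(FRAME).items():
    print(f"{name:<16} {value:016x}")
```

Because the pattern matches on whitespace rather than line layout, it also accepts captures in which a name and its value were split across two lines.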
Chapter 6
System Configuration and Setup
This chapter describes how to configure and set up ES40 systems. The
following topics are covered:
System Consoles
Configuring Devices
Booting Linux
OpenVMS Galaxy
6.1 System Consoles
SRM Console
Systems running the Tru64 UNIX or OpenVMS operating systems are
configured from the SRM console, a command-line interface (CLI). From the
CLI you can enter commands to configure the system, view the system
configuration, boot the system, and run ROM-based diagnostics.
AlphaBIOS Console
AlphaBIOS is an enhanced BIOS graphical user interface for Compaq Alpha
platforms. AlphaBIOS is used to run AlphaBIOS-compliant utilities. From
the Utilities menu on the Setup screen, you can select options to run maintenance programs and display error frames for hardware errors logged to the
flash ROM.
RMC CLI
The remote management console (RMC) provides a command-line interface
(CLI) for controlling the system. You can use the CLI either locally or remotely
(modem connection) to power the system on and off, halt or reset the system,
and monitor the system environment. You can also use the dump, env, and
status commands to help diagnose errors. See Chapter 7 for details.
6.1.1
If console is set to graphics, the SRM console expects to find a VGA card
connected to PCI 0 and, if so, displays power-up information on the VGA
monitor after VGA initialization has been completed.
You can verify the display device with the SRM show console command and
change the display device with the SRM set console command. If you change
the display device setting, you must reset the system (with the Reset button or
the init command) to put the new setting into effect.
In the following example, the user displays the current console device (a
graphics device) and then resets it to a serial device. After the system
initializes, output will be displayed on the serial terminal.
P00>>> show console
console                 graphics
P00>>> set console serial
P00>>> init
.
.
.
6.1.2
6.2
show config
show device
show fru
show memory
6.1
To check the setting for a specific environment variable, enter the show
envar command, where the name of the environment variable is
substituted for envar.
To reset an environment variable, use the set envar command, where the
name of the environment variable is substituted for envar.
set envar
The set command sets or modifies the value of an environment variable. It can
also be used to create a new environment variable if the name used is unique.
Environment variables pass configuration information between the console and
the operating system. Their settings determine how the system powers up, boots
the operating system, and operates. The syntax is:
set envar value
envar
value
New values for the following environment variables take effect only after you
reset the system by pressing the Reset button or issuing the init command.
auto_action
console
cpu_enabled
os_type
pk*0_fast
pk*0_host_id
pk*0_soft_term
console_memory_allocation
show envar
The show envar command displays the current value (or setting) of an
environment variable. The syntax is:
show envar
envar
Table 6-1 summarizes the SRM environment variables used most often on the
ES40 system.
Attributes
1
Description
auto_action
NV,W
bootdef_dev
NV,W
boot_file
NV,W
boot_osflags
NV,W
NV: Nonvolatile. The last value saved by system software or set by console commands
is preserved across cold bootstraps (when the system goes through a full initialization)
and long power outages.
W: Warm nonvolatile. The last value set by system software is preserved across warm bootstraps
(UNIX shutdown -r command, OpenVMS REBOOT command, or a crash and reboot; not all of
the SRM initialization is run) and restarts.
D: Full dump; implies s as well. By default, if
Tru64 UNIX crashes, it completes a partial
memory dump. Specifying D forces a full dump at
system crash.
com2_baud
NV,W
com1_flow
com2_flow
NV,W
com1_mode
NV
com1_modem
com2_modem
NV,W
console
NV
console_memory_allocation
NV
NV
ei*0_inet_init or
ew*0_inet_init
NV
ei*0_mode or
ew*0_mode
NV
ei*0_protocols or
ew*0_protocols
NV
heap_expand
NV
kbd_hardware_type
NV
kzpsa_host_id
language
NV
memory_test
NV
ocp_text
NV
os_type
NV
password
NV
pci_parity
NV
pk*0_fast
NV
pk*0_host_id
NV
pk*0_soft_term
NV
sys_serial_num
NV
tt_allow_login
NV
6.2
Tru64 UNIX and OpenVMS systems are factory set to halt in the SRM
console. You can change these defaults, if desired.
Systems can boot automatically (if set to autoboot) from the default boot device
under the following conditions:
6.2.1
With the boot setting, the operating system boots automatically after the
SRM init command is issued or the Reset button is pressed.
With the restart setting, the operating system boots automatically after the
SRM init command is issued or the Reset button is pressed, and it also
reboots after an operating system crash.
To set the default action to boot, enter the following SRM commands:
P00>>> set auto_action boot
P00>>> init
See the User Interface Guide for more information.
6.3
You can change the default boot device with the set bootdef_dev
command.
To designate a default boot device, use the set bootdef_dev SRM console
command. For example, to set the boot device to the IDE CD-ROM, enter
commands similar to the following:
P00>>> show bootdef_dev
bootdef_dev             dka400.4.0.1.1
P00>>> set bootdef_dev dqa500.5.0.1.1
P00>>> show bootdef_dev
bootdef_dev             dqa500.5.0.1.1
See the User Interface Guide for more information.
6.4
Depending upon the type of hardware you have, you may have to run
hardware configuration utilities. Hardware configuration diskettes
are shipped with your system or with options that you order.
Typical configuration utilities include:
6.4.1
4. In the Run Maintenance Program dialog box, type the name of the program
to be run in the Program Name field. Then Tab to the Location list box, and
select the hard disk partition, floppy disk, or CD-ROM drive from which to
run the program.
5. Press Enter to execute the program.
Figure 6-2 Run Maintenance Program Dialog Box
[Screen image: the Run Maintenance Program dialog box in AlphaBIOS Setup, with
Program Name set to arccf.exe and Location set to A:. The Location list offers
A:, CD:, Disk 0 Partition 1, Disk 0 Partition 2, and Disk 1 Partition 1.
ENTER=Execute.]
6.4.2
6.4.3
Utilities are run from a serial terminal the same way as from a VGA
monitor. The menus are the same, but some key mappings are different.
            VTxxx Key
F1          Ctrl/A
F2          Ctrl/B
F3          Ctrl/C
F4          Ctrl/D
F5          Ctrl/E
F6          Ctrl/F
F7          Ctrl/P
F8          Ctrl/R
F9          Ctrl/T
F10         Ctrl/U
Insert      Ctrl/V
Delete      Ctrl/W
Backspace   Ctrl/H
Escape      Ctrl/[
1.
2.
3.
4. In the Run Maintenance Program dialog box, type the name of the program
to be run in the Program Name field. Then tab to the Location list box, and
select the hard disk partition, floppy disk, or CD-ROM drive from which to
run the program.
5.
6.4.4
Issue the alphabios command to start AlphaBIOS Setup. (If the system has
a VGA monitor, you can set the SRM console environment variable to
graphics.)
2.
3. In the Run Maintenance Program dialog box, type arccf in the Program
Name: field.
4. Press Enter to execute the program. The Main menu displays the following
options:
[01.View/Update Configuration]
02.Automatic Configuration
03.New Configuration
04.Initialize Logical Drive
05.Parity Check
06.Rebuild
07.Tools
08.Select Controller
09.Controller Setup
10.Diagnostics
6.5
The set password and set secure commands set SRM security. The
login command turns off security for the current session. The clear
password command returns the system to user mode.
The SRM console has two modes, user mode and secure mode.
User mode allows you to use all SRM console commands. User mode is the
default mode.
Secure mode allows you to use only the boot and continue commands. The
boot command cannot take command-line parameters when the console is
in secure mode. The console boots the operating system using the
environment variables stored in NVRAM (boot_file, bootdef_dev, boot_osflags).
Setting a password. If a password has not been set and the set password
command is issued, the console prompts for a password and verification.
The password and verification are not echoed.
Changing a password. If a password has been set and the set password
command is issued, the console prompts for the new password and verification, then prompts for the old password. The password is not changed if the
validation password entered does not match the existing password stored in
NVRAM.
The set secure command puts the console into secure mode. A
password must be set before you can issue set secure. Once the console is
secure, only the boot and continue commands can be used. The boot
command cannot take command-line parameters.
Entering the login command turns off security features for the current
console session. This allows the operator to enter any SRM command; in
this case, a boot command with command-line parameters.
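The rules above can be summarized as a small state model. The sketch below is a teaching aid built only from the behavior described in this section; it is not the console's implementation, the class and method names are hypothetical, and details the text does not state (for example, whether clear password prompts for the existing password) are not modeled.

```python
class SrmSecurityModel:
    """Toy model of SRM user mode and secure mode as described in the text."""

    def __init__(self):
        self.password = None    # no password set: console is in user mode
        self.secure = False
        self.logged_in = False  # True after login, for the current session

    def set_password(self, new, old=None):
        """Set or change the password; a change requires the old password."""
        if self.password is not None and old != self.password:
            return False
        self.password = new
        return True

    def set_secure(self):
        if self.password is None:
            raise ValueError("a password must be set before issuing set secure")
        self.secure = True
        self.logged_in = False

    def login(self, password):
        """Turn off security for the current session if the password matches."""
        self.logged_in = password == self.password
        return self.logged_in

    def clear_password(self):
        """Return the console to user mode (any prompt is not modeled here)."""
        self.password = None
        self.secure = False
        self.logged_in = False

    def allows(self, command_line):
        """True if the command would be accepted in the current mode."""
        if not self.secure or self.logged_in:
            return True
        words = command_line.split()
        # In secure mode only a bare boot, or continue, is accepted.
        return words == ["boot"] or words[:1] == ["continue"]


console = SrmSecurityModel()
console.set_password("example-password")
console.set_secure()
print(console.allows("boot"), console.allows("boot dka0"), console.allows("show config"))
console.login("example-password")
print(console.allows("boot dka0"))
```

The device name dka0 and the password string are placeholders chosen for the illustration.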
6.6 Configuring Devices
6.6.1 CPU Configuration
[Figure: CPU slot locations, labeled CPU 3, CPU 2, CPU 1, and CPU 0]
A CPU must be installed in slot 0. The system will not power up without a
CPU in slot 0.
2.
3.
6.6.2 Memory Configuration
Fill sets in numerical order. Populate all 4 slots in Set 0, then populate Set
1, and so on.
An array is one set for systems that support 16 DIMMs and two sets for
systems that support 32 DIMMs.
DIMMs in an array must be the same capacity and type. For example,
suppose you have populated Sets 0, 1, 2, and 3. When you populate Set 4,
the DIMMs must be the same capacity and type as those installed in Set 0.
Similarly, Set 5 must be populated with DIMMs of the same capacity and
type as are in Set 1, and so on, as indicated in the following table.
Array   Model 2 System         Model 1 System
        (Supports 32 DIMMs)    (Supports 16 DIMMs)
0       Sets 0 and 4           Set 0
1       Sets 1 and 5           Set 1
2       Sets 2 and 6           Set 2
3       Sets 3 and 7           Set 3
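The set-to-array relationship and the matching rule can be expressed in a few lines. This is a planning aid sketched from the rules in this section, not a configuration tool: the function names are hypothetical, and the capacity and type values in the usage example are made-up placeholders.

```python
def array_for_set(set_number, max_dimms=32):
    """Return the array that a DIMM set belongs to.

    On a 32-DIMM system, sets n and n + 4 share array n (sets 0-7);
    on a 16-DIMM system each of sets 0-3 is its own array.
    """
    set_count = {16: 4, 32: 8}[max_dimms]
    if not 0 <= set_number < set_count:
        raise ValueError(f"set {set_number} does not exist on a {max_dimms}-DIMM system")
    return set_number % 4


def mismatched_arrays(populated_sets, max_dimms=32):
    """Return arrays whose sets differ in capacity or type.

    populated_sets maps set number -> (capacity, dimm_type); both values
    are whatever labels the caller chooses.
    """
    first_seen = {}
    bad = set()
    for set_number, kind in sorted(populated_sets.items()):
        array = array_for_set(set_number, max_dimms)
        if first_seen.setdefault(array, kind) != kind:
            bad.add(array)
    return sorted(bad)


# Hypothetical population: array 1 mixes two capacities.
plan = {
    0: (256, "stacked"), 4: (256, "stacked"),
    1: (128, "unstacked"), 5: (256, "unstacked"),
}
print(mismatched_arrays(plan))
```

Filling sets in numerical order, as the first rule requires, is not checked by this sketch.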
[Figure: unstacked DIMMs and stacked DIMMs]
Base Address      Intlv Mode
----------------  ----------
0000000060000000  2-Way
0000000040000000  2-Way
0000000070000000  2-Way
0000000000000000  2-Way
[Figure: DIMM set locations on MMB 0 through MMB 3, connectors J1 through J8.
Array 0: Sets 0 & 4. Array 1: Sets 1 & 5. Array 2: Sets 2 & 6. Array 3: Sets 3 & 7.]
[Figure: DIMM set locations on MMB 0 through MMB 3, connectors J1 through J8
(second view). Array 0: Sets 0 & 4. Array 1: Sets 1 & 5. Array 2: Sets 2 & 6.
Array 3: Sets 3 & 7.]
6.6.3 PCI Configuration
[Figure: PCI slot locations. 10-Slot System: slots 1 through 10. 6-Slot System:
slots 1, 2, 3, 8, 9, and 10.]
The PCI slots are split across two independent 64-bit, 33 MHz PCI buses: PCI0
and PCI1. These buses correspond to Hose 0 and Hose 1 in the system logical
configuration. The slots on each bus are listed below.
System Variant     Slots on PCI 0   Slots on PCI 1
Six-slot system    1-3              8-10
Ten-slot system    1-4              5-10
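The slot assignments can be captured in a lookup for use in planning scripts. This is a minimal sketch that assumes the slot ranges are 1-3 and 8-10 for the six-slot system and 1-4 and 5-10 for the ten-slot system, consistent with the slot numbering in the figure; the names SLOT_MAP and pci_bus_for_slot are hypothetical.

```python
# variant -> {PCI bus (hose) number: slots on that bus}
SLOT_MAP = {
    "six-slot": {0: range(1, 4), 1: range(8, 11)},
    "ten-slot": {0: range(1, 5), 1: range(5, 11)},
}


def pci_bus_for_slot(variant, slot):
    """Return 0 or 1 for the PCI bus (Hose 0 or Hose 1) serving a slot."""
    for bus, slots in SLOT_MAP[variant].items():
        if slot in slots:
            return bus
    raise ValueError(f"slot {slot} is not present on a {variant} system")


print(pci_bus_for_slot("six-slot", 9))
print(pci_bus_for_slot("ten-slot", 4))
```

A check of this kind is useful for options that are restricted to one bus, since the bus can be confirmed before the card is installed.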
Some PCI options require drivers to be installed and configured. These options
come with a floppy or a CD-ROM. Refer to the installation document that came
with the option and follow the manufacturer's instructions.
[Figure: PCI slot numbering. Ten-slot system: slots 1 through 10. 6-Slot System:
slots 1, 2, 3, 8, 9, and 10.]
Restrictions
OpenVMS: If you have a KZPAC RAID controller, it must be installed in a slot
on PCI bus 1. It cannot be installed on PCI bus 0.
Tru64 UNIX: Multifunction PCI options cannot be installed in PCI bus 0, slot
1 or slot 2. Multifunction options currently include:
6.6.4
[Figure: power supply locations 0, 1, and 2 (tower)]
Two Power Supplies. Two power supplies are required if the system has
more than two CPUs or if the system has a second storage cage.
Redundant Power Supply. If one power supply fails, the redundant supply
provides power and the system continues to operate normally. A second power
supply adds redundancy for an entry-level system such as the system described
under Single Power Supply. A third power supply adds redundancy for a
system that requires two power supplies.
Recommended Installation Order. Generally, power supply 0 is installed
first, power supply 1 second, and power supply 2 third, but the supplies can be
installed in any order. See Figure 6-10. The power supply numbering
corresponds to the numbering displayed by the SRM show power command.
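The power supply rules reduce to a simple calculation. The sketch below encodes only what is stated above, plus one assumption: that a system with at most two CPUs and no second storage cage needs a single supply, which is implied by the redundancy paragraph. The function names are hypothetical.

```python
def required_supplies(cpu_count, second_storage_cage):
    """Minimum number of power supplies for a configuration.

    Two supplies are required with more than two CPUs or with a second
    storage cage; otherwise one supply is assumed to be sufficient.
    """
    return 2 if cpu_count > 2 or second_storage_cage else 1


def supplies_for_redundancy(cpu_count, second_storage_cage):
    """Number of supplies needed so that one can fail without an outage."""
    return required_supplies(cpu_count, second_storage_cage) + 1


print(required_supplies(2, False), supplies_for_redundancy(2, False))
print(required_supplies(4, True), supplies_for_redundancy(4, True))
```

The largest result is three, which matches the three supply positions (0, 1, and 2).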
6.7
[Figure: storage cage installation, showing the left cage, the right cage, and
connectors J2, J9, and J10]
1. Shut down the operating system and turn off power to the system. Unplug
the power cord from each power supply.
2. Remove enclosure panels and remove the cover from the PCI card cage.
3.
4. Unscrew the four screws securing the disk cage filler plate and set them
aside. Discard the filler plate.
5. Set the jumper (J10) to the parked position (one pin only).
6. Slide the cage into the system chassis, and replace the four screws.
7.
8. Plug one end of the 68-conductor SCSI cable (17-04867-01) into the SCSI
controller. Route it through the opening in the PCI cage. Snap open
the cable management clip, route the cable through, and close the clip.
Plug the other end of the cable into the storage backplane.
9. Plug the 16-position end of the 29-inch cable (17-04914-01) into the PCI
backplane. Route the cable through the opening in the PCI cage and plug
the 14-position end into the J2 connector on the storage cage.
10. Replace the PCI card cage cover and enclosure panels.
11. Install hard drives.
Installing the Left Cage (or Bottom Cage)
1.
Shut down the operating system and turn off power to the system. Unplug
the power cord from each power supply.
2.
Remove enclosure panels and remove the cover from the PCI card cage.
3.
Pull out fans 3 and 4, which are blocking access to the cabling.
4.
5.
Unscrew the four screws securing the disk cage filler plate and set them
aside. Discard the filler plate.
6.
7.
8.
9.
Plug one end of the 68-conductor SCSI cable (17-04867-01) into the SCSI
controller. Route it through the opening in the PCI cage. Snap open
the cable management clip, route the cable through, and close the clip.
Plug the other end of the cable into the storage backplane.
10. Plug the end of the 6-inch cable (17-04960-01) marked "out" into the J9
connector on the back of the right (or top) cage, and plug the end marked
"in" into the J2 connector on the left (or bottom) cage.
NOTE: Cable 17-04914-01 and cable 17-04960-01 are mutually exclusive.
11. Slide the cage the rest of the way into the system chassis and replace the
four screws set aside previously.
12. Replace the fans.
13. Replace the PCI card cage cover and enclosure panels.
14. Install hard drives.
Verification
1. Turn on power to the system.
2. When the system powers up to the P00>>> prompt, enter the SRM show
device command to determine the device name. For example, look for dq,
dk, ew, and so on.
6.8 Booting Linux
boot_dev          dka0.0.0.0.0
boot_file         2/boot/vmlinux.gz
boot_osflags      root=/dev/sda2
boot_reset        OFF
bootdef_dev       dka0.0.0.0.0
booted_dev
booted_file
booted_osflags
P00>>> boot
Enter the show boot* command to verify the boot settings. Example 6-5
shows boot parameters for Red Hat. The boot file for SuSE is
2/boot/vmlinuz.
6.9 OpenVMS Galaxy
CAUTION: The file structures of the operating systems are incompatible. When
you switch between operating systems, you cannot read the data off
disks associated with the operating system that was running
previously.
Be sure to remove the system and data disks for the operating
system you will not be using. Otherwise, you risk corrupting data
on the system disk.
CAUTION: Before switching operating systems, make a note of the boot path
and location of the system disk (controller, SCSI ID number, and so
on) of the operating system you are removing so that you can restore
that operating system at a later date.
1.
View and save the boot parameters for the operating system you are
removing.
2.
Shut down the operating system and power off the system. Unplug the
power cord from each power supply.
3.
4.
Remove any options that are not supported on the operating system you are
installing and replace them with supported options.
5.
Remove the system disk and data disks and insert the system and data
disks for the operating system you are installing.
6.
7.
8.
Chapter 7
Using the Remote Management Console
You can manage the system through the remote management console (RMC).
The RMC is implemented through an independent microprocessor that resides
on the system motherboard. The RMC also provides access to the repository for
all error information in the system.
This chapter explains the operation and use of the RMC. Sections are:
RMC Overview
Operating Modes
Terminal Setup
Troubleshooting Tips
7.1 RMC Overview
Monitors thermal sensors on the CPUs, the PCI backplane, and the power
supplies
Controls the operator control panel (OCP) display and writes status
messages on the display
Shuts down the system if any fatal conditions exist. For example:
Provides a command-line interface (CLI) for the user to control the system.
From the CLI you can power the system on and off, halt or reset the system,
and monitor the system environment.
Passes error log information to the DPR so that this information can be
accessed by the system.
7.2 Operating Modes
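[Figure 7-1: Data flow in Through mode. Labeled elements: system COM1 port, DUART, RMC PIC processor with UART, modem port, RMC modem port (remote) with modem, RMC COM1 port (local), and RMC> prompts at the remote and local terminals.]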
Through Mode
Through mode is the default operating mode. The RMC routes every character
of data between the internal system COM1 port and the active external port,
either the local COM1 serial port (MMJ) or the 9-pin modem port. If a modem
is connected, the data goes to the modem. The RMC filters the data for a
specific escape sequence. If it detects the escape sequence, it connects to the
RMC CLI.
Figure 7-1 illustrates the data flow in Through mode. The internal system
COM1 port is connected to one port of the DUART chip, and the other port is
connected to a 9-pin external modem port, providing full modem controls. The
DUART is controlled by the RMC microprocessor, which moves characters
between the two UART ports. The local MMJ port is always connected to the
internal UART of the microprocessor. The escape sequence signals the RMC to
connect to the CLI. Data issued from the CLI is transmitted between the RMC
microprocessor and the active port that connects to the RMC CLI.
NOTE: The internal system COM1 port should not be confused with the
external COM1 serial port on the back of the system. The internal
COM1 port is used by the system software to send data either to the
COM1 port on the system or to the RMC modem port if a modem is
connected.
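The escape-sequence filtering described for Through mode can be pictured as a sliding window over the character stream. The following Python sketch is illustrative only and does not represent the RMC firmware; the class and method names are hypothetical. It uses the default sequence ^[^[RMC reported by the RMC status command, and it ignores case because this guide shows the sequence typed both as RMC and as rmc. That case handling is an assumption.

```python
from collections import deque

# ^[^[RMC: two ESC characters followed by the letters RMC.
DEFAULT_ESCAPE = "\x1b\x1bRMC"


class EscapeFilter:
    """Watch a character stream for the escape sequence (illustrative only)."""

    def __init__(self, sequence=DEFAULT_ESCAPE):
        # Assumption: matching is case-insensitive.
        self.sequence = sequence.lower()
        # Keep only as many recent characters as the sequence is long.
        self.window = deque(maxlen=len(self.sequence))

    def feed(self, ch):
        """Add one character; return True when the sequence has just completed."""
        self.window.append(ch.lower())
        if "".join(self.window) == self.sequence:
            self.window.clear()
            return True
        return False


if __name__ == "__main__":
    escape_filter = EscapeFilter()
    for ch in "show config\r\x1b\x1brmc":
        if escape_filter.feed(ch):
            print("escape sequence detected: connect to the RMC CLI")
```

A fixed-length window keeps the check simple and handles repeated leading characters correctly, which matters for a sequence that starts with two identical ESC characters.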
Local Mode
You can set a Local mode in which only the local channel can communicate with
the system COM1 port. In Local mode the modem is prevented from sending
characters to the system COM1 port, but you can still connect to the RMC CLI
from the modem.
7.2.1 Bypass Modes
For modem connection, you can set the operating mode so that data
and control signals partially or completely bypass the RMC. The
bypass modes are Snoop, Soft Bypass, and Firm Bypass.
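[Figure 7-2: Data flow in the bypass modes. Labeled elements: system COM1 port, DUART, RMC PIC processor with UART, bypass path, modem port, RMC modem port (remote) with modem, RMC COM1 port (local), and RMC> prompts at the remote and local terminals.]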
Figure 7-2 shows the data flow in the bypass modes. Note that the internal
system COM1 port is connected directly to the modem port.
NOTE: You can connect a serial terminal to the modem port in any of the
bypass modes.
The local terminal is still connected to the RMC and can still connect to the
RMC CLI to switch the COM1 mode if necessary.
Snoop Mode
In Snoop mode data partially bypasses the RMC. The data and control signals
are routed directly between the system COM1 port and the external modem
port, but the RMC taps into the data lines and listens passively for the RMC
escape sequence. If it detects the escape sequence, it connects to the RMC CLI.
The escape sequence is also passed to the system on the bypassed data lines. If
you decide to change the default escape sequence, be sure to choose a unique
sequence so that the system software does not interpret characters intended for
the RMC.
In Snoop mode the RMC is responsible for configuring the modem for dial-in as
well as dial-out alerts and for monitoring the modem connectivity.
Because data passes directly between the two UART ports, Snoop mode is
useful when you want to monitor the system but also ensure optimum COM1
performance.
Soft Bypass Mode
In Soft Bypass mode all data and control signals are routed directly between the
system COM1 port and the external modem port, and the RMC does not listen
to the traffic on the COM1 data lines. The RMC is responsible for configuring
the modem and monitoring the modem connectivity. If the RMC detects loss of
carrier or the system loses power, it switches automatically into Snoop mode. If
you have set up the dial-out alert feature, the RMC pages the operator if an
alert is detected and the modem line is not in use.
Soft Bypass mode is useful if management applications need the COM1 channel
to perform a binary download, because it ensures that RMC does not
accidentally interpret some binary data as the escape sequence.
After downloading binary files, you can set the com1_mode environment
variable from the SRM console to switch back to Snoop mode or other modes for
accessing the RMC, or you can hang up the current modem session and
reconnect it.
Firm Bypass Mode
In Firm Bypass mode all data and control signals are routed directly between
the system COM1 port and the external modem port. The RMC does not
configure or monitor the modem. Firm Bypass mode is useful if you want the
system, not the RMC, to fully control the modem port and you want to disable
RMC remote management features such as remote dial-in and dial-out alert.
You can switch to other modes by resetting the com1_mode environment
variable from the SRM console, but you must then set up the RMC again from
the local terminal.
7.3 Terminal Setup
You can use the RMC from a modem hookup or the serial terminal
connected to the system. As shown in Figure 7-3, a modem is
connected to the dedicated 9-pin modem port and a terminal is
connected to the COM1 serial port/terminal port (MMJ).
VT
PK0934
7.4
You type an escape sequence to connect to the RMC CLI. You can
connect to the CLI from any of the following: a modem, the local serial
console terminal, the local VGA monitor, or the system. The system
includes the operating system, SRM, AlphaBIOS, or an application.
You can connect to the RMC CLI from the local terminal regardless of the
current operating mode.
You can connect to the RMC CLI from the modem if the RMC is in Through
mode, Snoop mode, or Local mode. In Snoop mode the escape sequence is
passed to the system and displayed.
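The access rules for the operating modes can be summarized in a short Python sketch. It is a reading aid, not firmware behavior; the function names are hypothetical, and the mode strings follow the com1_mode values used in this chapter.

```python
# Modes in which the modem can connect to the RMC CLI, per the text above.
MODEM_CLI_MODES = {"through", "snoop", "local"}
ALL_MODES = MODEM_CLI_MODES | {"soft_bypass", "firm_bypass"}


def modem_can_reach_cli(mode):
    """Return True if a modem session can connect to the RMC CLI in this mode."""
    if mode not in ALL_MODES:
        raise ValueError("unknown com1_mode: %r" % mode)
    return mode in MODEM_CLI_MODES


def mode_after_carrier_loss(mode):
    """Soft Bypass switches to Snoop on loss of carrier or system power.

    Other modes are returned unchanged in this sketch.
    """
    if mode not in ALL_MODES:
        raise ValueError("unknown com1_mode: %r" % mode)
    return "snoop" if mode == "soft_bypass" else mode


if __name__ == "__main__":
    for mode in sorted(ALL_MODES):
        print(mode, modem_can_reach_cli(mode), mode_after_carrier_loss(mode))
```

The local terminal is not modeled because it can connect to the RMC CLI regardless of the current operating mode.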
7.5
Sets the baud rate of the COM1 serial port and the
modem port. The default is 9600.
com1_flow
com1_mode
com1_modem
See the ES40 User Interface Guide for information on setting SRM environment
variables.
7.6
Command Conventions
Observe the following conventions for entering RMC commands:
For commands consisting of two words, enter the entire first word and at
least one letter of the second word. For example, you can enter disable a
for disable alert.
For commands that have parameters, you are prompted for the parameter.
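The two-word abbreviation rule can be sketched as a prefix match on the second word. The Python below is a minimal illustration under stated assumptions: the command list contains only two-word commands named in this chapter and is not complete, the function name is hypothetical, and returning None for an ambiguous entry such as power o is a choice made for this sketch, since the guide does not say how the RMC treats that case.

```python
# Partial list: two-word commands named in this chapter.
KNOWN_COMMANDS = [
    "disable alert",
    "send alert",
    "power on",
    "power off",
    "halt in",
    "halt out",
]


def expand(entry, commands=KNOWN_COMMANDS):
    """Expand an abbreviated two-word command, or return None if no unique match.

    The first word must be entered in full; the second word needs at
    least one letter.
    """
    words = entry.split()
    if len(words) != 2:
        return None
    first, partial = words
    matches = [
        command
        for command in commands
        if command.split()[0] == first and command.split()[1].startswith(partial)
    ]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    print(expand("disable a"))
```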
7.6.1
Use the set com1_mode command from SRM or RMC to define the
COM1 data flow paths.
You can set com1_mode to one of the following values:
through
All data passes through RMC and is filtered for the escape
sequence. This is the default.
snoop
Data partially bypasses RMC, but RMC taps into the data lines
and listens passively for the escape sequence.
soft_bypass
firm_bypass
local
Changes the focus of the COM1 traffic to the local MMJ port if
RMC is currently in one of the bypass modes or is in Through
mode with an active remote session.
NOTE: For more details, see the ES40 User Interface Guide.
7.6.2
Example 7-2 status
RMC> status
PLATFORM STATUS
On-Chip Firmware Revision: V1.0
Flash Firmware Revision: V1.2
Server Power: ON
System Halt: Deasserted
RMC Power Control: ON
Escape Sequence: ^[^[RMC
Remote Access: Enabled
RMC Password: set
Alert Enable: Disabled
Alert Pending: YES
Init String: AT&F0E0V0X0S0=2
Dial String: ATXDT9,15085553333
Alert String: ,,,,,,5085553332#;
Com1_mode: THROUGH
Last Alert: CPU door opened
Logout Timer: 20 minutes
User String:
Meaning
On-Chip Firmware
Revision:
Flash Firmware
Revision:
Server Power:
ON = System is on.
OFF = System is off.
System Halt:
Escape Sequence:
Remote Access:
RMC Password:
Alert Enable:
Alert Pending:
Init String:
Dial String:
Alert String:
Com1_mode:
Last Alert:
Logout Timer:
User String:
7.6.3
The env command provides a snapshot of the system environment.
Example 7-3 env
RMC> env
System Hardware Monitor
Temperature (warnings at 45.0C, power-off at 50.0C)
CPU0: 26.0C    CPU1: 26.0C    CPU2: 27.0C    CPU3: 26.0C
Zone0: 29.0C   Zone1: 30.0C   Zone2: 31.0C
Fan RPM
Fan1: 2295     Fan2: 2295     Fan3: 2205
Fan4: 2235     Fan5: OFF      Fan6: 2518
CPU1: +2.192V  CPU2: +2.192V  CPU3: +2.192V
CPU1: +1.488V  CPU2: +1.488V  CPU3: +1.488V
Fan RPM. With the exception of Fan 5, all fans are powered as long as the
system is powered on. Fan 5 is OFF unless Fan 6 fails.
CPU CORE voltage and CPU I/O voltage. In a healthy system, the core
voltage for all CPUs should be the same, and the I/O voltage for all CPUs
should be the same.
Bulk power supply voltage. The Vterm and Cterm voltage regulators are
located on the system motherboard.
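The health checks described for the env display can be expressed as a few small Python functions. This is a sketch for reading env output by hand or by script, not an RMC interface; all names are hypothetical. The thresholds are the ones printed in the env example (warning at 45.0C, power-off at 50.0C). Treating a reading exactly at a threshold as having reached it is an assumption.

```python
WARNING_C = 45.0    # from the env display: "warnings at 45.0C"
POWER_OFF_C = 50.0  # from the env display: "power-off at 50.0C"


def temperature_state(celsius):
    """Classify a temperature reading against the env thresholds."""
    if celsius >= POWER_OFF_C:
        return "power-off"
    if celsius >= WARNING_C:
        return "warning"
    return "ok"


def voltages_consistent(readings, tolerance=0.0):
    """Return True if all CPUs report the same voltage, within a tolerance."""
    return max(readings) - min(readings) <= tolerance


def fan5_state_expected(fan5_running, fan6_running):
    """Fan 5 is OFF unless Fan 6 fails, so exactly one of the two should run."""
    return fan5_running != fan6_running


if __name__ == "__main__":
    print(temperature_state(26.0))
    print(voltages_consistent([2.192, 2.192, 2.192]))
    print(fan5_state_expected(fan5_running=False, fan6_running=True))
```

Apply voltages_consistent separately to the core readings and to the I/O readings, since the two groups have different nominal values.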
7.6.4
Example 7-4 dump
RMC> dump
Address: 10
Count: ee
0010:03 31 07 28 01 09 00 00 00 00 00 00 00 00 00 00
0020:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0030:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0040:01 80 01 01 01 01 01 01 00 00 00 00 00 00 00 00
0050:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0060:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0070:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0080:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0090:00 00 00 00 00 00 00 00 00 00 1D 00 19 18 19 00
00A0:00 00 00 00 00 00 00 00 00 00 00 FF FF FA FA 3B
00B0:00 00 00 00 00 00 00 00 00 00 BA 00 00 00 00 00
00C0:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00D0:00 00 00 00 00 00 00 00 00 00 22 00 00 00 00 00
00E0:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00F0:00 00 00 00 00 00 00 00 00 10 00 00 00 0A 03 0A
RMC>
DPR address
Bytes 10:15 are the time stamp. See Appendix C for the meaning of other
locations.
Number of bytes dumped (in hex). In the example the dump command
dumps EF bytes from address 10.
The dump command allows you to dump data from the DPR. You can use this
command locally or remotely if you are not able to access the SRM console
because of a system crash.
The dump command accepts two arguments:
Address:
Count:
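When dump output is captured to a file, the rows can be decoded with a short script. The Python sketch below assumes each row is a hexadecimal address, a colon, and space-separated hexadecimal bytes; the function names are hypothetical. It extracts the raw bytes at addresses 10 through 15, which this section identifies as the time stamp, and does not interpret them. See Appendix C for the meaning of DPR locations.

```python
def parse_dump_row(row):
    """Split one dump row into (start address, bytes)."""
    address_text, _, data_text = row.partition(":")
    return int(address_text, 16), bytes.fromhex(data_text)


def timestamp_bytes(rows):
    """Return the raw bytes at DPR addresses 0x10 through 0x15."""
    memory = {}
    for row in rows:
        base, data = parse_dump_row(row)
        for offset, value in enumerate(data):
            memory[base + offset] = value
    return bytes(memory[address] for address in range(0x10, 0x16))


if __name__ == "__main__":
    # First row of the dump example in this section.
    rows = ["0010:03 31 07 28 01 09 00 00 00 00 00 00 00 00 00 00"]
    print(timestamp_bytes(rows).hex(" "))
```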
7.6.5
The RMC power {on, off}, halt {in, out}, and reset commands perform
the same functions as the buttons on the operator control panel.
If the system has been powered off with the Power button, the RMC cannot
power the system on. If you enter the power on command, the message
Power button is OFF is displayed, indicating that the command will have
no effect.
If the system has been powered on with the Power button, and the power
off command is used to turn the system off, you can toggle the Power
button to power the system back on.
When you issue the power on command, the terminal exits RMC and
reconnects to the server's COM1 port.
in
to COM port
out
to COM port
The halt out command cannot release the halt if the Halt button is latched in.
If you enter the halt out command, the message Halt button is IN is
displayed, indicating that the command will have no effect. Toggling the Power
button on the operator control panel overrides the halt in condition.
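The interaction between the RMC commands and the front-panel buttons can be modeled as two guard checks. This Python sketch is illustrative and its function names are hypothetical; the two messages are the ones quoted in this section. On success the functions return an empty message, because this section does not show the success output for every command.

```python
def rmc_power_on(power_button_on):
    """Return (succeeded, message) for the RMC power on command."""
    if not power_button_on:
        # The RMC cannot power on a system turned off with the Power button.
        return False, "Power button is OFF"
    return True, ""


def rmc_halt_out(halt_button_latched_in):
    """Return (succeeded, message) for the RMC halt out command."""
    if halt_button_latched_in:
        # The command cannot release a halt held by the latched Halt button.
        return False, "Halt button is IN"
    return True, ""


if __name__ == "__main__":
    print(rmc_power_on(power_button_on=False))
    print(rmc_halt_out(halt_button_latched_in=True))
```

In both cases the physical button takes precedence over the RMC command.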
Reset
The RMC reset command restarts the system. The terminal exits RMC and
reconnects to the server's COM1 port.
Example 7-7 reset
RMC> reset
Returning to COM port
7.6.6
Before you can dial in through the RMC modem port or enable the
system to call out in response to system alerts, you must configure RMC
for remote dial-in.
Connect your modem to the 9-pin modem port and turn it on. Connect to the
RMC CLI from either the local serial terminal or the local VGA monitor to set
up the parameters.
Enables remote access to the RMC modem port by configuring the modem
with the setting stored in the initialization string.
Verifies the settings. Check that the Remote Access field is set to Enabled.
Dialing In
The following example shows the screen output when a modem connection is
established.
ATDT915085553333
RINGING
RINGING
CONNECT 9600/ARQ/V32/LAPM
RMC Password: *********
Welcome to RMC V1.2
P00>>> ^[^[rmc
RMC>
1. At the RMC> prompt, enter commands to monitor and control the remote
system.
2. When you have finished a modem session, enter the hangup command to
cleanly terminate the session and disconnect from the server.
7.6.7
When you are not monitoring the system from a modem connection,
you can use the RMC dial-out alert feature to remain informed of
system status. If dial-out alert is enabled, and the RMC detects alarm
conditions within the managed system, it can call a preset pager
number.
You must configure remote dial-in for the dial-out feature to be enabled. See
Section 7.6.6.
To set up the dial-out alert feature, connect to the RMC CLI from the local
serial terminal or local VGA monitor.
The RMC dials your pager and sends a message identifying the system.
You connect to the RMC CLI, check system status with the env command,
and, if the situation requires, power down the managed system.
When the problem is resolved, you power up and reboot the system.
The elements of the dial string and alert string are shown in Table 7-2. Paging
services vary, so you need to become familiar with the options provided by the
paging service you will be using. The RMC supports only numeric messages.
Sets the string to be used by the RMC to dial out when an alert condition
occurs. The dial string must include the appropriate modem commands to
dial the number.
Sets the alert string, typically the phone number of the modem connected
to the remote system. The alert string is appended after the dial string,
and the combined string is sent to the modem when an alert condition is
detected.
Forces an alert condition. This command is used to test the setup of the
dial-out alert function. It should be issued from the local serial terminal or
local VGA monitor. As long as no one connects to the modem and there is
no alert pending, the alert will be sent to the pager immediately. If the
pager does not receive the alert, re-check your setup.
Clears the current alert so that the RMC can capture a new alert. The last
alert is stored until a new event overwrites it. The Alert Pending field of
the status command becomes NO after the alert is cleared.
Verifies the settings. Check that the Alert Enable field is set to Enabled.
Clears any alert that may be pending. This ensures that the send alert
command will generate an alert condition.
NOTE: If you do not want dial-out paging enabled at this time, enter the
disable alert command after you have tested the dial-out alert
function. Alerts continue to be logged, but no paging occurs.
AT = Attention.
X = Forces the modem to dial blindly (not seek the dial
tone). Enter this character if the dial-out line modifies its dial
tone when used for services such as voice mail.
D = Dial
T = Tone (for touch-tone)
9,
The number for an outside line (in this example, 9). Enter the
number for an outside line if your system requires it.
, = Pause for 2 seconds.
15085553333
Alert String
,,,,,,
5085553332#
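The way the dial string and alert string combine can be checked with a short Python sketch. The strings are the example values shown by the status command in this chapter; the function names are hypothetical. The sketch assumes each comma in the alert string means the same 2-second pause that the table gives for the dial string.

```python
PAUSE_SECONDS_PER_COMMA = 2  # ", = Pause for 2 seconds."


def combined_string(dial_string, alert_string):
    """The alert string is appended after the dial string."""
    return dial_string + alert_string


def pause_seconds(modem_string):
    """Total pause time contributed by the commas in a string."""
    return modem_string.count(",") * PAUSE_SECONDS_PER_COMMA


if __name__ == "__main__":
    dial = "ATXDT9,15085553333"
    alert = ",,,,,,5085553332#;"
    print(combined_string(dial, alert))
    print(pause_seconds(alert), "seconds of pause before the numeric message")
```

The pause total is useful when tuning the alert string for a paging service that needs more or less time to answer before the numeric message is sent.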
7.6.8
CAUTION: Be sure to record the new escape sequence. Restoring the default
sequence requires moving a jumper on the system motherboard.
7-29
7.7
1 2 3
J24
J25
J26
J31
1 2
J3
J2
J1
PK0211
7-30
7.8 Troubleshooting Tips
Possible Cause
Suggested Solution
7-32
Possible Cause
Suggested Solution
On AC power-up, RMC
defers initializing the
modem for 30 seconds to
allow the modem to
complete its internal
diagnostics and
initializations.
During a remote
connection, you see a
+++ string on the
screen.
The message
unknown command
is displayed when you
enter a carriage return
by itself.
Chapter 8
FRU Removal and Replacement
This chapter describes the procedures for removing and replacing FRUs on
ES40 systems.
Unless otherwise specified, install a FRU by reversing the steps shown in the
removal procedures.
NOTE: If you are installing or replacing CPU cards, memory DIMMs, or PCI
cards, become familiar with the location of the card slots and
configuration rules. See Chapter 6.
IMPORTANT!
8.1 FRUs
Description
Cables
17-04787-01
17-04785-01
17-04786-01
17-03971-07
17-04678-02
17-03970-04
17-04400-06
17-04867-01
17-03971-08
17-04914-01
Fans
70-40074-01    Fan 6
70-40073-01    Fans 1 and 2
70-40073-02    Fan 5
70-40072-01    Fan 3
70-40071-01    Fan 4
8-2
Description
CPU Modules
54-30158-03
54-30158-05
54-30158-A3
54-30158-A5
54-30362-01
Memory DIMMs
54-25053-BA
54-24941-EA
54-24941-FA
54-24941-JA
OCP
54-25582-01
54-25582-02
70-31349-01
Speaker assembly
30-50802-02
54-25385-01
System motherboard
54-25575-01
54-25573-01
54-25573-02
8-3
Description
30-49448-01
SN-LKQ46-Ax
Keyboard, OpenVMS
SN-LKQ47-Ax
SN-PBQWS-WA
Mouse, 3-button
12-37977-02
3X-RRD32-AC
3R-A0284-AA
RX23L-AC
Floppy drive
8.1.1 Power Cords
               Country            Length
BN26J-1K                          75 in.
3X-BN46F-02    Japan              2.5 m
BN19H-2E                          2.5 m
BN19C-2E       Central Europe     2.5 m
BN19A-2E       UK, Ireland        2.5 m
BN19E-2E       Switzerland        2.5 m
BN19K-2E       Denmark            2.5 m
BN19M-2E       Italy              2.5 m
BN19S-2E                          2.5 m
8.1.2 FRU Locations
CPU Cards
Fans
OCP
PCI
Backplane
Fans
Secondary
Drive Cage
Floppy Drive
Primary
Drive Cage
8-6
CD-ROM Drive
PK0285
Speaker
Power Harness
Access Cover
Power
Supplies
System
Motherboard
PK0286
8-7
8.1.3
The system must be shut down before you replace most FRUs. The
exceptions are power supplies, individual fans, and hard drives. After
replacing FRUs you must clear the system error information repository
with the SRM clear_error all command.
Tools
You need the following tools to remove or replace FRUs.
Hot-Plug FRUs
The following are hot-plug FRUs. You can replace them while the system is
operating.
Power supplies
Individual fans
8-8
8-9
8.2
Open and remove the front door. Loosen the captive screws that allow
you to remove the top and side panels.
3
PK0221
8-10
2.
To remove the top panel, loosen the top left and top right captive screws.
Slide the top panel back and lift it off the system.
3.
To remove the left panel, loosen the captive screw at the top and the
captive screw at the bottom. Slide the panel back and then tip it
outward. Lift it off the system.
8-11
PK0234
8-12
2.
To remove the top enclosure panel, loosen the top left and top right captive
screws. Slide the top panel back and lift it off the system.
3.
To remove the right enclosure panel, loosen the captive screw shown in .
Slide the panel back and then tip it outward. Lift the panel from the three
tabs.
8-13
8.3
WARNING: Pull out the stabilizer bar and extend the leveler
foot to the floor before you pull out the system. This precaution
prevents the cabinet from tipping over.
3
2
PK0288
8-14
2.
Pull out the stabilizer bar at the bottom of the cabinet until it stops.
3.
Extend the leveler foot at the end of the stabilizer bar to the floor.
4.
5.
Remove and set aside the two screws (one per side), if present, that
secure the system to the cabinet.
6.
PK1211
8-15
8.4
The system chassis has three covers: the fan cover, the system card
cage cover, and the PCI card cage cover. Remove a cover by loosening
the quarter-turn captive screw, pulling up on the ring, and sliding the
cover from the system chassis.
V @ >240VA
8-16
Figure 8-7 and Figure 8-8 show the location and removal of covers on the tower
and pedestal/rackmount systems, respectively. The numbers in the illustrations
correspond to the following:
System card cage cover. This area contains CPUs, memory DIMMs,
MMBs, and system motherboard. To remove the system card cage cover,
you must first remove the fan area cover . An interlock switch shuts the
system down when you remove the system card cage cover.
PCI card cage cover. This area contains PCI cards, the PCI backplane, and
four fans.
Fan area cover. This area contains the 6.75-in main system fan and a
redundant fan.
8-17
2
1
1
4
PK0216
8-18
1
2
3
2
5
PK0215
8.5 Power Supply
PK0232a
8-20
2.
Loosen the three Phillips screws that secure the power supply bracket.
(Do not remove the screws.) Remove the bracket .
3.
Loosen the captive screw on the latch and swing the latch to unlock the
power supply.
4.
NOTE: When installing an additional supply, remove the screw and blank
cover on the slot into which you are installing the supply.
Verification
1.
Plug the AC power cord into the supply. Wait a few seconds for the POK
LED to light.
2.
8.6 Fans
6
Unlock
Lock
3
4
PK0208
8-22
The fans are hot-plug components. You can replace individual fans while the
system is running.
WARNING: Contact with a moving fan can cause
severe injury to fingers. Avoid contact or remove
power before accessing the fans.
Replacing Fans
1. Remove the cover from the fan area (fans and ) or the PCI card cage
(fans ,,, and ).
2. Pull the pop-up latch to unlock it, and lift the fan out of the system. Fan
has no pop-up latch. It is held in place by fan .
3. Install the new fan, taking care to align it as it slides in. Press the pop-up
latch to lock the fan in place.
4. Replace the cover to the fan area or the PCI card cage.
Verification RMC
1.
2.
8-23
8.7
8-24
PK0938a
Shut the system down before removing and replacing a hard drive.
2.
3.
Push the button to release the plastic handle on the front of the drive
carrier. Pull out the plastic handle toward you and slide the drive out.
NOTE: Remove the blank cover from the next available slot before installing an
additional hard disk drive.
8.8 CPUs
PK0240a
V @ >240VA
8-26
2.
During power-up, observe the screen display. The newly installed CPU
should appear in the display.
3.
Issue the show config command. The new CPU should be listed as one of
the processors.
8.9 Memory DIMMs
1
Pedestal/Rack
1
3
Tower
2
J1
J2
J3
J4
J5
J6
J8
J7
PK0278
8-28
V @ >240VA
8-29
PK0953a
8-30
4. Install the new DIMM. Align the notches on the gold fingers with the
connector keys (Figure 8-14) and secure the DIMM with the clips on the
MMB slot.
5. Reinstall the MMB and secure it to the system backplane with the clips.
Verification
1.
2.
3.
Issue the show memory command to display the total amount of memory
in the system.
8-31
PK0245
V @ >240VA
8-32
2.
If installing a new card, remove and discard the bulkhead filler plate
from the PCI slot.
3.
4.
5.
Verification
1. Turn on power to the system.
2. During power-up, observe the screen display for PCI information. The new
option should be listed in the display.
3. Issue the SRM show config command. Examine the PCI bus information
in the display to make sure that the new option is listed.
4. Enter the SRM show device command to display the device name of the
new option.
8-33
PK0282
8-34
8-35
2
1
3
4
PK0287
8-36
2.
Unplug the signal cable and power cable from all devices except the
floppy.
3.
Remove and set aside the four screws that secure the removable media
cage.
4.
5.
Unplug the signal cable and power cable from the floppy.
6.
Remove the four screws that secure the device and set aside the screws.
Slide the device out of the storage slot.
NOTE: When installing a removable media device, remove the blank bezel from
the next available slot. For installation instructions, see the ES40
Owner's Guide.
For information on installing disk cages, see the Owner's Guide.
8-37
2
1
3
4
5
PK0281
8-38
2.
Unplug the signal cable and power cable from all devices except the
floppy.
3.
Remove and set aside the four screws that secure the removable media
cage.
4.
5.
Unplug the signal cable and power cable from the floppy.
6.
Remove the four screws that secure the floppy drive, and slide the drive
out.
7.
Remove the mounting brackets (two screws in each bracket) from the
drive.
8-39
PK0284
8-40
8-41
5
6
7
PK0279
Connecting Cable
17-04785-01
17-03970-04
17-04786-01
70-31349-01
17-04678-02
17-03971-07
17-04914-01 (if present)
17-04400-06
V @ >240VA
8-42
Connects To:
Fans
Floppy
Cover sensors
Speaker
CD-ROM
OCP
Storage disk cage
I/O controller module
8-43
1
PK0280
8-44
NOTE: When installing a new PCI backplane, align the backplane on the guide
pins, and press the board firmly until it is seated. Seating the PCI
backplane requires considerable pressure. When seating the PCI
backplane in a cabinet, a second person should brace the chassis to
ensure that no excessive stress is placed on the rails.
8-45
8
2
7
5
4
9
6
1
4
9
8-46
PK1207
CAUTION: When removing the system motherboard, be careful not to flex the
board. Flexing the board may damage the BGA component
connections.
NOTE: Removing the system motherboard requires the removal of other FRUs.
Review the removal procedures for the fans, MMBs, CPUs, and drive
cage before beginning the system motherboard removal procedure.
1.
2.
3.
Record the positions of the MMBs and CPUs, and remove the MMBs and
CPUs.
4.
5.
Loosen the three captive Phillips screws holding the middle support
bracket. The screws pop up when sufficiently loosened. Pull the bracket
straight out.
6.
Remove the drive cage (left cage in pedestal/rack, bottom cage in tower), if
installed, or the blank panel.
7.
Remove the two Phillips flat-head screws that secure the small cover to
the left side (pedestal/rack) or bottom (tower) of the system and remove the
panel. Set aside the screws. (Removing the small cover provides better
access to the power harness bracket.)
8.
9.
10. Using a nut driver, loosen the three nuts (7 mm) on the flange over the
intermodule connector so that it can move freely. Move the flange up from
the connector and tighten one of the flange nuts to keep the flange out of the
way.
NOTE: After replacing the motherboard, loosen the flange nut and push the
flange down to the intermodule connector. Retighten the nuts on the
flange. While tightening the nuts, put pressure on the flange to
compress it into the connector.
11. Remove the three Phillips screws that secure the system motherboard.
12. A white plastic ejector and two holes in the sheet metal under it are used
to help disengage the motherboard. Insert a screwdriver through the hole
in the ejector into the closest hole and pry the system motherboard away
from the PCI backplane. Insert the screwdriver into the second hole that is
now exposed and pry again to fully disengage the system motherboard
connector from the PCI backplane.
13. Extract the system motherboard.
8-48
2.
3.
Enter the set sys_serial_num command to set the system serial number.
(The serial number is on a sticker on the back of the system.) For example:
P00>>> set sys_serial_num NI900100022
The serial number will be propagated to all FRU devices that have EEPROMs.
8-49
7
8
2
5
1
Front
8
7
Back
8-50
PK1208
NOTE: Removing the power harness requires the removal of other system
FRUs. Review the removal procedures for the power supplies, fans, and
drive cage before beginning the harness removal procedure.
1.
Remove the power supplies and any blank power supply panels.
2.
3.
4.
Unplug the connectors to each removable media device (except the floppy).
5.
Remove the four screws that secure the removable media cage. Slide out
the cage to access the floppy power connector. Disconnect the floppy power
connector and slide the cage back in.
6.
7.
8.
Remove the drive cage (left cage in pedestal/rack, bottom cage in tower), if
installed, or the blank panel.
9.
Remove the two Phillips flat-head screws that secure the small cover to
the left side (pedestal/rack) or bottom (tower) of the system and remove the
panel. Set aside the screws. (Removing the small cover provides better
access to the power harness bracket.)
10. Remove the power harness bracket as follows: Push up on the spring
latch to release the bracket, slide the bracket forward, and remove it.
11. Unplug the five connectors on the bottom of the system motherboard.
12. Remove the two screws and two plastic bushings on each of the three
power supply connectors. The screws are located deep inside the power
supply cavity. Set aside the screws and bushings for reinstallation.
13. Starting with the left connector (as viewed from the rear of the system), pull
the connector to the right and angle it so that you can push the left end out
through the opening.
14. Remove the power harness.
Appendix A
SRM Console Commands
This appendix lists the SRM console commands that are most frequently used
with the ES40 family of systems.
Table A-1 SRM Commands Used on ES40 Systems
Command
Function
alphabios
boot
buildfru
cat el
Displays the console event log. Same as more el, but scrolls
rapidly. The most recent errors are at the end of the event log and
are visible on the terminal screen.
clear error
continue
crash
deposit
edit
Invokes the console line editor on a RAM file or on the user power-up
script, nvram, which is always invoked during the power-up sequence.
examine
exer
floppy_write
galaxy
Same as lpinit.
grep
hd
help command
info
init
kill
kill_diags
lpinit
man
memexer
memtest
more el
Same as cat el, but displays the console event log one screen at a
time.
net -ic
net -s
nettest
prcache
rmc
set envar
show envar
show config
show device
show error
show fru
show memory
show pal
show power
show_status
show version
sys_exer
sys_exer -lb
Runs console loopback tests for the COM2 serial port and the
parallel port during the sys_exer test sequence.
test
test -lb
Runs loopback tests for the COM2 serial port and the parallel port
in addition to verifying the configuration of devices.
Appendix B
Jumpers and Switches
This appendix lists and describes the configuration jumpers and switches on the
system motherboard and PCI board. Sections are as follows:
Setting Jumpers
B.1
The RMC jumpers can be used to override the RMC defaults. For
example, if a high-speed modem is connected to COM1, you can disable
J31 to prevent RMC from receiving characters that might cause
interference. The SPC jumpers are reserved.
[Figure: locations of jumpers J24, J25, J26, J31, J1, J2, and J3]
Description
J24
J25
J26
J31
J1
J2
J3
B.2
TIG/SROM jumpers allow you to load the TIG if flash RAM is corrupted
or load the fail-safe loader (FSL) if SRM firmware is corrupted.
[Figure: jumpers J20, J21, J22, and J23 and switchpack E296 (switches 1 through 10, ON and OFF positions)]
Description
J21
J20
J22
J23
Meaning
000
Normal
001
010
111
Switchpack E296 sets the clock speed for the system motherboard. The settings
should not be changed.
SW1 SYS_EXT_DELAY1 (off)
SW2 SYS_EXT_DELAY0 (on)
SW3 SYS_FILL_DELAY (off)
SW4 CPU_CFWD_PSET (off)
SW5 PCI_CLK_DIV_IN1 (off)
SW6 PCI_CLK_DIV_IN0 (on)
SW7 Y_DIV3 (on)
SW8 Y_DIV2 (on)
SW9 Y_DIV1 (off)
SW10 Y_DIV0 (off)
B.3
[Figure: switchpack E16 (switches 1 through 10, ON and OFF positions)]
SW1 M0 (on)
SW2 M1 (on)
SW3 M2 (on)
SW4 M3 (off)
SW5 M4 (on)
SW6 M5 (off)
SW7 M6 (on)
SW8 N0 (off)
SW9 N1 (on)
SW10 XTAL_SEL (off)
B.4
You can set J31 on the PCI board to force DTR so that a modem will not
be disconnected if the system is power cycled. Check J13 if the system
is losing time or the operating system comes up with a very inaccurate
time.
Description
J31
J20
J21
J13
NOTE: The operating systems use different algorithms for system time. If you
switch between operating systems (for example, between UNIX and
OpenVMS), be sure to reset the time at the operating system level.
B.5
Setting Jumpers
1. Shut down the operating system.
2. Shut down power on all external options connected to the system.
3. Turn off power to the system.
4. Unplug the power cord from each power supply.
5. Remove enclosure panels and chassis covers to gain access to the system
motherboard or PCI board.
If you are setting RMC jumpers, remove CPU 1 to gain access to the
jumpers.
If you are setting PCI jumpers, you typically do not need to remove any
PCI cards. However, if you have a full-length card in slot 10, remove it.
6. Locate the jumper you need to set. Refer to the illustrations in this chapter.
Set the jumpers as needed.
7. Reinstall any modules you removed.
8. Reinstall the chassis covers and enclosure panels.
9. Plug the power cords into the supplies.
Appendix C
DPR Address Layout
This appendix shows the address layout of the dual-port RAM (DPR). Use the
SRM examine dpr:address command (where address is the offset from the
base of the DPR) or use the RMC dump command to view locations in the DPR.
See Appendix D for definitions of locations written when environmental error
events occur.
Location Logical
(Hex)
Indicator
Written
By
0
1
2
3
4
0
1
2
3
4
SROM
SROM
SROM
SROM
SROM
SROM
6
7
8
6
7
8
SROM
SROM
SROM
9
A
9
A
SROM
SROM
B
C
D:F
10:15
B
C
-
SROM
SROM
SROM
Used For
EV6 BIST status
1=good 0=bad
Bit[7]=Master Bits[0,1]=CPU_ID
Test STR status
1=good 0=bad
Test CSC status
1=good 0=bad
Test Pchip 0 PCTL status
1=good
0=bad
Test Pchip 1 PCTL status
1=good
0=bad
Test DIMx status
1=good 0=bad
Test TIG bus status
Dual-Port RAM test DD= started
Status of DPR test
1=good 0=bad
Status of CPU speed function FF=good
0=bad
Lower byte of CPU speed in MHz
Upper byte of CPU speed in MHz
Reserved
Power On Time Stamp for CPU 0, written as BCD
Byte 10 = Hours (0-23)
Byte 11 = Minutes (0-59)
Byte 12 = Seconds (0-59)
Byte 13 = Day of Month (1-31)
Byte 14 = Month (1-12)
Byte 15 = Year (0-99)
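Because the time stamp is stored as BCD, each byte carries two decimal digits, one per nibble. The following Python sketch shows one way to decode the six bytes at locations 10 through 15 once they have been read out of the DPR. The function names and the sample bytes are hypothetical and are not part of the console firmware.

```python
def bcd_to_int(byte: int) -> int:
    """Convert one packed-BCD byte (two decimal digits) to an integer."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"not a valid BCD byte: {byte:#04x}")
    return high * 10 + low


def decode_power_on_timestamp(raw: bytes) -> dict:
    """Decode DPR bytes 10:15 in the order given in the table."""
    if len(raw) != 6:
        raise ValueError("expected exactly six bytes (DPR locations 10:15)")
    names = ("hours", "minutes", "seconds", "day", "month", "year")
    return {name: bcd_to_int(value) for name, value in zip(names, raw)}


# Hypothetical sample bytes, not taken from a real system.
sample = bytes([0x13, 0x45, 0x09, 0x27, 0x11, 0x99])
print(decode_power_on_timestamp(sample))
```

The year is kept as the two-digit value stored in the DPR; choosing a century is left to the caller.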
Table C1
17:1D
1E
1F
20:3F
40:5F
60:7F
80
SROM
SROM
SROM
20
20
20
80
SROM
Used For
SROM Power On Error Indication for CPU is
alive. For example; 0 = no error, 2 = Secondary
time-out Error, 3 = Bcache Error
Unused
Last sync state reached; 80=Finished GOOD
Size of Bcache in MB
Repeat for CPU1 of CPU0 0-1F
Repeat for CPU2 of CPU0 0-1F
Repeat for CPU3 of CPU0 0-1F
Array 0 (AAR 0) Configuration
Bits<7:4>
4 = non split, lower set only
5 = split, lower set only
9 = split, upper set only
D = split, 8 DIMMs
F = Twice split, 8 DIMMs
Bits<3:0>
0 = Configured, Lowest array
1 = Configured, Next lowest array
2 = Configured, Second highest array
3 = Configured, Highest array
4 = Misconfigured, Missing DIMM(s)
8 = Misconfigured, Illegal DIMM(s)
C = Misconfigured, Incompatible DIMM(s)
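The array configuration byte packs two codes into one location: the split state in bits <7:4> and the array state in bits <3:0>. As an illustration only, the Python sketch below splits a byte read from one of these locations into its two nibbles and looks each one up. The dictionary and function names are hypothetical; the code values are the ones listed in the table.

```python
SPLIT_STATE = {  # Bits<7:4>
    0x4: "non split, lower set only",
    0x5: "split, lower set only",
    0x9: "split, upper set only",
    0xD: "split, 8 DIMMs",
    0xF: "twice split, 8 DIMMs",
}

ARRAY_STATE = {  # Bits<3:0>
    0x0: "configured, lowest array",
    0x1: "configured, next lowest array",
    0x2: "configured, second highest array",
    0x3: "configured, highest array",
    0x4: "misconfigured, missing DIMM(s)",
    0x8: "misconfigured, illegal DIMM(s)",
    0xC: "misconfigured, incompatible DIMM(s)",
}


def decode_array_config(byte: int) -> tuple:
    """Return (split state, array state) for an array configuration byte."""
    upper, lower = (byte >> 4) & 0xF, byte & 0xF
    return (SPLIT_STATE.get(upper, "unlisted code"),
            ARRAY_STATE.get(lower, "unlisted code"))


print(decode_array_config(0x51))
```

Codes that do not appear in the table are reported as "unlisted code" rather than guessed.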
81
SROM
82
83
84
85
86
87
88:8B
82
83
84
85
86
87
SROM
SROM
SROM
SROM
SROM
SROM
SROM
8C:8F
8C-8F
SROM
90
91
92
90
91
92
RMC
RMC
RMC
Used For
Array 0 (AAR 0) Size (x64 Mbytes)
0 = no good memory
1 = 64 Mbyte
2 = 128 Mbyte
4 = 256 Mbyte
8 = 512 Mbyte
10 = 1 Gbyte
20 = 2 Gbyte
40 = 4 Gbyte
80 = 8 Gbyte
Array 1 (AAR 1) Configuration
Array 1 (AAR 1) Size (x64 Mbytes)
Array 2 (AAR 2) Configuration
Array 2 (AAR 2) Size (x64 Mbytes)
Array 3 (AAR 3) Configuration
Array 3 (AAR 3) Size (x64 Mbytes)
Byte to define failed DIMMs for MMBs
88 - MMB 0
89 - MMB 1
8A - MMB 2
8B - MMB 3
Bit set indicates failure.
Bit definitions (bit 0 = DIMM 1, bit 1 = DIMM 2,
bit 2 = DIMM 3, bit 7 = DIMM 8)
Byte to define misconfigured DIMMs for MMBs
8C - MMB 0
8D - MMB 1
8E - MMB 2
8F - MMB 3
Bit definitions (bit 0 = DIMM 1, bit 1 = DIMM 2,
bit 2 = DIMM 3, bit 7 = DIMM 8)
Power Supply/VTERM present
Power Supply PS_POK bits
AC input value from Power Supply
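Two of the encodings on this page reduce to simple arithmetic. The array size byte is a multiple of 64 Mbytes (for example, hex 10 is 16 x 64 Mbytes = 1 Gbyte), and each failed or misconfigured DIMM byte is a bit mask in which bit n stands for DIMM n + 1. The Python sketch below illustrates both; the function names are hypothetical.

```python
def array_size_mbytes(size_byte: int) -> int:
    """Array size in Mbytes; the DPR byte counts units of 64 Mbytes."""
    return size_byte * 64


def dimms_flagged(mask: int) -> list:
    """Return the DIMM numbers (1-8) whose bits are set in an MMB byte."""
    return [bit + 1 for bit in range(8) if mask & (1 << bit)]


print(array_size_mbytes(0x10))   # size byte hex 10
print(dimms_flagged(0x85))       # bits 0, 2, and 7 set
```

The same `dimms_flagged` helper applies to the failed DIMM bytes (88 through 8B) and to the misconfigured DIMM bytes (8C through 8F), since both use the same bit definitions.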
93
97
9A
A0
RMC
RMC
RMC
RMC
AA
RMC
AB
RMC
AC
AD
AE
AF
RMC
RMC
RMC
RMC
B0
RMC
B1
RMC
Used For
Temperature from CPU(x) in BCD
Temperature Zone(x) from 3 PCI temp sensors
Fan Status; Raw Fan speed value
Failure registers used as part of the 680 machine
check logout frame. See Appendix D.
Fan status (bit 0 = fan 1, bit 1 = fan 2, and so on);
1 indicates good; 0 indicates fan failure
Status of RMC to read I2C bus of MMB0 DIMMs
Definition:
Bit 7 - DIMM 8 0=OK 1=Fail
Bit 6 - DIMM 7
Bit 5 - DIMM 6
Bit 0 - DIMM 1
Status of RMC to read I2C bus of MMB1 DIMMs
Status of RMC to read I2C bus of MMB2 DIMMs
Status of RMC to read I2C bus of MMB3 DIMMs
Status of RMC to read MMB and CPU I2C buses
Definition:
Bit 7 - MMB3 0=OK 1=Fail
Bit 6 - MMB2
Bit 5 - MMB1
Bit 4 - MMB0
Bit 3 - CPU3
Bit 2 - CPU2
Bit 1 - CPU1
Bit 0 - CPU0
Status of RMC to read CPB (PCI backplane) I2C EEROM
0=OK 1 = fail
Status of RMC to read CSB (motherboard) I2C EEROM
0=OK 1 = fail
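The MMB and CPU bus status byte uses one bit per device, with a set bit meaning that the RMC could not read that device. A minimal Python sketch of turning such a byte into a list of device names follows; the names `DEVICE_BY_BIT` and `failed_reads` are hypothetical, and the bit order is the one given in the definition above.

```python
# Index in the tuple = bit number in the status byte.
DEVICE_BY_BIT = ("CPU0", "CPU1", "CPU2", "CPU3",
                 "MMB0", "MMB1", "MMB2", "MMB3")


def failed_reads(status: int) -> list:
    """Return the devices whose bit is set (1 = fail) in the status byte."""
    return [name for bit, name in enumerate(DEVICE_BY_BIT)
            if status & (1 << bit)]


print(failed_reads(0x21))  # bits 0 and 5 set
```

A status byte of 0 yields an empty list, meaning every device was read successfully.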
RMC
B3:B9
Unused
BA
BB
RMC
RMC
BC
BD
BE
RMC
RMC
RMC
BF
C0:D8
D9
DA
RMC
RMC
TIG
DB:E3
E4:EC
ED:F5
F6:F8
F9
FA:FB
RMC
RMC
RMC
Unused
Firmware
Firmware
FA
Used For
Status of RMC to read SCSI backplane
Definition:
Bit 0 SCSI backplane 0
Bit 1 SCSI backplane 1
Bit 4 Power supply 0
Bit 5 Power supply 1
Bit 6 Power supply 2
Unused
I2C done, BA = finished
RMC Power on Error indicates error during
power-up (1=Flash Corrupted)
RMC flash update error status
Copy of PS input Value. See Appendix D.
Copy of the byte from the I/O expanders on the
SPC loaded by the RMC on fatal errors. See
Appendix D.
Reason for system failure. See Appendix D.
Unused
Baud rate
Indicates TIG finished loading its code (0xAA
indicates done)
Fan/Temp info from PS1
Fan/Temp info from PS2
Fan/Temp info from PS3
Unused
Buffer Size (0-0xFF) or 1 to 256 bytes
Command address qualifier
FA = lower byte, FB = upper byte
Table C1
Location
(Hex)
FC
FC
RMC
FD
FD
RMC
FE
FE
Firmware
FF
FF
Firmware
100:1FF
100
RMC
200:2FF
300:3FF
400:4FF
500:5FF
600:7FF
700:7FF
800:8FF
900:9FF
A00:AFF
B00:BFF
C00:CFF
D00:DFF
E00:EFF
F00:FFF
200
300
400
500
600
700
800
900
A00
B00
C00
D00
E00
F00
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
Used For
Command status associated with the RMC
response to a request from the firmware
0 = successful completion
80 = unsuccessful completion
81 = invalid command code
82 = invalid command qualifier
Command ID associated with the RMC
response to a request from the firmware
Command Code associated with a command
sent to the RMC
1 = update I2C EEROM
2 = update baud rate
3 = display to OCP
F0 = update RMC flash
Command ID associated with a command
sent to the RMC
Copy of EEROM on MMB0 J1 DIMM 1,
initially read on I2C bus by RMC when 5
volts supply turned on. Written by Compaq
Analyze after error diagnosed to particular
FRU
Copy of EEROM on MMB0 J2 DIMM 2
Copy of EEROM on MMB0 J3 DIMM 3
Copy of EEROM on MMB0 J4 DIMM 4
Copy of EEROM on MMB0 J5 DIMM 5
Copy of EEROM on MMB0 J6 DIMM 6
Copy of EEROM on MMB0 J7 DIMM 7
Copy of EEROM on MMB0 J8 DIMM 8
Copy of EEROM on MMB1 J1 DIMM 1
Copy of EEROM on MMB1 J2 DIMM 2
Copy of EEROM on MMB1 J3 DIMM 3
Copy of EEROM on MMB1 J4 DIMM 4
Copy of EEROM on MMB1 J5 DIMM 5
Copy of EEROM on MMB1 J6 DIMM 6
Copy of EEROM on MMB1 J7 DIMM 7
Table C1
Location
(Hex)
1000:10FF
1100:11FF
1200:12FF
1300:13FF
1400:14FF
1500:15FF
1600:16FF
1700:17FF
1800:18FF
1900:19FF
1A00:1AFF
1B00:1BFF
1C00:1CFF
1D00:1DFF
1E00:1EFF
1F00:1FFF
2000:20FF
2100:21FF
2200:22FF
2300:23FF
2400:24FF
2500:25FF
2600:26FF
2700:27FF
2800:28FF
2900:29FF
2A00:2AFF
2B00:2BFF
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
RMC
Used For
Copy of EEROM on MMB1 J8 DIMM 8
Copy of EEROM on MMB2 J1 DIMM 1
Copy of EEROM on MMB2 J2 DIMM 2
Copy of EEROM on MMB2 J3 DIMM 3
Copy of EEROM on MMB2 J4 DIMM 4
Copy of EEROM on MMB2 J5 DIMM 5
Copy of EEROM on MMB2 J6 DIMM 6
Copy of EEROM on MMB2 J7 DIMM 7
Copy of EEROM on MMB2 J8 DIMM 8
Copy of EEROM on MMB3 J1 DIMM 1
Copy of EEROM on MMB3 J2 DIMM 2
Copy of EEROM on MMB3 J3 DIMM 3
Copy of EEROM on MMB3 J4 DIMM 4
Copy of EEROM on MMB3 J5 DIMM 5
Copy of EEROM on MMB3 J6 DIMM 6
Copy of EEROM on MMB3 J7 DIMM 7
Copy of EEROM on MMB3 J8 DIMM 8
Copy of EEROM from CPU0
Copy of EEROM from CPU1
Copy of EEROM from CPU2
Copy of EEROM from CPU3
Copy of MMB 0 J5 FRU EEROM
Copy of MMB 1 J7 FRU EEROM
Copy of MMB 2 J6 FRU EEROM
Copy of MMB 3 J8 FRU EEROM
Copy of EEROM on CPB (PCI backplane)
Copy of EEROM on CSB (motherboard)
Last EV6 Correctable Error: ASCII
character string that indicates correctable
error occurred, type, FRU, and so on. Backed
up in CSB (motherboard) EEROM. Written
by Compaq Analyze
Table C1
Location
(Hex)
Logical
Written
Indicator By
2C00:2CFF
2C00
RMC
2D00:2DFF
2D00
RMC
2E00:2FFF
2E00
RMC
3000:3008
3009:300B
SROM
RMC
300C:300E
RMC
300F:3010
3011:30FF
3100:31FF
3200:32FF
3300:33FF
3400
3401
300F
RMC
Unused
RMC
RMC
RMC
SROM
SROM
3402
3403:340F
SROM
SROM/SRM
3410:3417
SROM/SRM
Used For
Last Redundant Failure: ASCII
character string that indicates redundant
failure occurred, type, FRU, and so on.
Backed up in system CSB (motherboard)
EEROM. Written by Compaq Analyze
Last System Failure: ASCII character
string that indicates system failure
occurred, type, FRU, and so on. Backed
up in CSB (motherboard) EEROM.
Written by Compaq Analyze.
Uncorrectable machine logout frame (512
bytes)
SROM Version (ASCII string)
Rev level of RMC: the first byte is the letter
revision [x/t/v]; the next 2 bytes are major/minor.
This is the rev level of the RMC on-chip
code.
Rev level of RMC: the first byte is the letter
revision [x/t/v]; the next 2 bytes are major/minor.
This is the rev level of the RMC flash
code.
Revision Field of the DPR Structure
Unused
Copy of PS0 EEROM (first 256 bytes)
Copy of PS1 EEROM (first 256 bytes)
Copy of PS2 EEROM (first 256 bytes)
Size of Bcache in MB
Flash SROM is valid flag; 8 = valid,
0 = invalid
Systems errors determined by SROM
Reserved for future SROM/SRM
communication
Jump to address for CPU0
Table C1
Location
(Hex)
3418
3419
SROM/SRM
SROM
341A:341E
SROM
341F
SROM/SRM
3420:342F
3430:343F
3440:344F
3450:349F
SROM/SRM
SROM/SRM
SROM/SRM
SROM/
RMC
34A0:34A7
SROM
34A8:34AF
SROM
34B0:34B7
SROM
34B8:34CF
SROM
34C0:34FF
34C0
SROM
Used For
Waiting to jump to flag for CPU0
Shadow of value written to EV6 DC_CTL
register.
Shadow of most recent writes to EV6
CBOX Write-many chain.
Reserved for future SROM/SRM
communication
Repeat for CPU1 of CPU0 3410-341F
Repeat for CPU2 of CPU0 3410-341F
Repeat for CPU3 of CPU0 3410-341F
Reserved for SROM mini-console via
RMC communication area. Future
design.
Array 0 to DIMM ID translation
Bits<4:0>
Bits<7:5>
0 = Exists, No Error
Bits <2:0> =
1 = Expected Missing DIMM + 1 (1-8)
2 = Error - Missing
DIMM(s)
Bits <4:3> =
4 = Error - Illegal
MMB (0-3)
DIMM(s)
6 = Error Incompatible
DIMM(s)
Repeat for Array 1 of Array 0
34A0:34A7
Repeat for Array 2 of Array 0
34A0:34A7
Repeat for Array 3 of Array 0
34A0:34A7
Used as scratch area for SROM
Table C1
Location
(Hex)
3500:35FF
3600:36FF
3700:37FF
3800:3AFF
3B00:3BFF
3C00:3CFF
3D00:3DFF
3E00:3EFF
3F00:3FFF
Firmware
3600
SRM
SRM
RMC
RMC
RMC
RMC
RMC
RMC
Used For
Used as the dedicated buffer in which
SRM writes OCP or FRU EEROM data.
Firmware will write this data, RMC will
only read this data.
Reserved
Reserved
RMC scratch space
First SCSI backplane EEROM
Second SCSI backplane EEROM
PS0 second 256 bytes
PS1 second 256 bytes
PS2 second 256 bytes
Appendix D
Registers
This appendix describes 21264 (EV6 and EV67) internal processor registers;
21272 (Tsunami/Typhoon) system support chipset registers; and dual-port RAM
(DPR) registers that are related to general logout frame errors. It also provides
CPU and system uncorrectable and correctable machine logout frames and error
state bit definitions of all the platform logout frame registers.
21264 (EV6) Registers
Ibox Status Register (I_STAT)
Memory Management Status Register (MM_STAT)
Dcache Status Register (DC_STAT)
Cbox Read Register
Exception Address Register (EXC_ADDR)
Interrupt Enable and Current Processor Mode Register (IER_CM)
Interrupt Summary Register (ISUM)
PAL Base Register (PAL_BASE)
Ibox Control Register (I_CTL)
Process Context Register (PCTX)
21272 (Tsunami/Typhoon) System Registers
21272-CA Cchip Miscellaneous Register (MISC)
21272-CA Device Interrupt Request Register (DIRn, n=0,1,2,3)
21272-CA Pchip Error Register (PERROR)
21272-CA Array Address Registers
DPR Registers
DPR Registers (for 680 correctable error state capture)
DPR Registers (for I2C bus)
DPR Registers (power supply status from I2C bus)
DPR 680 Fatal Registers (for 680 uncorrectable error state capture)
D.1
The Ibox Status Register (I_STAT) is read only by PAL code and is an
element in the CPU or system uncorrectable and correctable machine
check error logout frame.
63
32
31 30 29 28
DPE
TPE
Bits
Type
Description
Reserved
<63:31>
RO
DPE
<30>
W1C
TPE
<29>
W1C
Reserved
<28:0>
RO
D.2
63
31
32
11 10 9
4 3 2 1
DC_TAG_PERR
OPCODE[5:0]
FOW
FOR
ACV
WR
Bits
Reserved
<63:11>
Type
Description
Reserved for Compaq.
DC_TAG_PERR
<10>
RO
OPCODE <9:4>
RO
FOW
<3>
RO
FOR
<2>
RO
ACV
<1>
RO
WR
<0>
RO
D.3
The Dcache Status Register (DC_STAT) is read only by PAL code and is
an element in the CPU or system uncorrectable and correctable
machine check error logout frame.
63
31
32
5 4 3 2 1 0
SEO
ECC_ERR_LD
ECC_ERR_ST
TPERR_P1
TPERR_P0
Bits
Type
Description
Reserved
<63:5>
SEO
<4>
W1C
ECC_ERR_LD
<3>
W1C
ECC_ERR_ST
<2>
W1C
TPERR_P1
<1>
W1C
TPERR_P0
<0>
W1C
D.4
The Cbox Read Register is read only by PAL code and is an element in
the CPU or system uncorrectable and correctable machine check error
logout frame.
Description
C_SYNDROME_1<7:0>
C_SYNDROME_0<7:0>
C_STAT<4:0>
Bits
Error Status
00000
00001
00010
00011
DSTREAM_MEM_ERR
00100
DSTREAM_BC_ERR
00101
DSTREAM_DC_ERR
0011X
PROBE_BC_ERR
01000
Reserved
01001
Reserved
01010
Reserved
01011
ISTREAM_MEM_ERR
Description
C_STAT<4:0>
(continued)
Bits
Error Status
01100
ISTREAM_BC_ERR
01101
Reserved
0111X
Reserved
10011
DSTREAM_MEM_DBL
10100
DSTREAM_BC_DBL
11011
ISTREAM_MEM_DBL
11100
ISTREAM_BC_DBL
C_STS<3:0>
C_ADDR<42:6>
Status of Block
Reserved
Parity
Valid
Dirty
Shared
D.5
32
PC[63:32]
31
2 1
PC[31:2]
PAL
D.6
63
39 38
33 32
EIEN[5:0]
SLEN
31 30 29 28
14 13 12
4 3
CREN
PCEN[1:0]
SIEN[15:1]
ASTEN
CM[1:0]
Table D5
Name
Extent
Type
Description
Reserved
[63:39]
EIEN[5:0]
[38:33]
RW
SLEN
[32]
RW
CREN
[31]
RW
PCEN[1:0]
[30:29]
RW
SIEN[15:1]
[28:14]
RW
ASTEN
[13]
RW
Reserved
[12:5]
CM[1:0]
[4:3]
Reserved
RW
Current Mode
00
Kernel
01
Executive
10
Supervisor
11
User
[2:0]
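Reading a field out of IER_CM is a shift and a mask using the extents in the table. The Python sketch below shows this for the current mode field CM[1:0] at bits [4:3], with a small general-purpose helper. The helper and function names are hypothetical; the mode encodings are the ones listed in the table.

```python
MODES = ("Kernel", "Executive", "Supervisor", "User")  # index = CM[1:0]


def field(value: int, high: int, low: int) -> int:
    """Extract bits [high:low] (inclusive) from a register value."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def current_mode(ier_cm: int) -> str:
    """Return the processor mode named by CM[1:0], bits [4:3] of IER_CM."""
    return MODES[field(ier_cm, 4, 3)]


print(current_mode(0x18))        # CM = 11
print(field(1 << 13, 13, 13))    # ASTEN, bit [13]
```

The same `field` helper works for the other extents in the table, for example `field(value, 38, 33)` for EIEN[5:0].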
D.7
39 38
33 32
EI[5:0]
SL
31 30 29 28
14 13
11 10 9 8 7 6 5 4 3 2
CR
PC[1:0]
SI[15:1]
ASTU
ASTS
ASTE
ASTK
Extent
Type
Description
Reserved
[63:39]
EI[5:0]
[38:33]
RO
External Interrupts
SL
[32]
RO
CR
[31]
RO
PC[1:0]
[30:29]
RO
SI[15:1]
[28:14]
Reserved
[13:11]
ASTU, ASTS
[10],[9]
RO
Software Interrupts
RO
AST Interrupts
For each processor mode, the bit is
set if an associated AST interrupt is
pending. This includes the mode's
ASTER and ASTRR bits and
whether the processor mode value
held in the IER_CM register is
greater than or equal to the value
for the mode.
Reserved
[8:5]
ASTE, ASTK
[4],[3]
RO
AST Interrupts
For each processor mode, the bit is
set if an associated AST interrupt is
pending. This includes the mode's
ASTER and ASTRR bits and
whether the processor mode value
held in the IER_CM register is
greater than or equal to the value
for the mode.
Reserved
[2:0]
D.8
63
44 43
32
PAL_BASE[43:32]
31
15 14
PAL_BASE[31:15]
Extent
Type
Description
Reserved
[63:44]
RO, 0
PAL_BASE[43:15]
[43:15]
RW
Reserved
[14:0]
RO, 0
D.9
48 47
32
SEXT(VPTB[47])
VPTB[47:32]
31 30 29
23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5
3 2 1 0
VPTB[31:30]
CHIP_ID[5:0]
BIST_FAIL
TB_MB_EN
MCHK_EN
CALL_PAL_R23
PCT1_EN
PCT0_EN
SINGLE_ISSUE_H
VA_FORM_32
VA_48
SL_RCV
SL_XMIT
HWE
BP_MODE[1:0]
SBE[1:0]
SDE[1:0]
SPE[2:0]
IC_EN[1:0]
SPCE[0]
Extent
Type
Description
SEXT(VPTB[47])
[63:48]
RW,0
VPTB[47:30]
[47:30]
RW,0
CHIP_ID[5:0]
[29:24]
RO
BIST_FAIL
[23]
RO,0
TB_MB_EN
[22]
RW,0
MCHK_EN
[21]
RW,0
CALL_PAL_R23
[20]
RW,0
PCT1_EN
[19]
RW,0
Extent
Type
Description
PCT0_EN
[18]
RW,0
SINGLE_ISSUE_H
[17]
RW,0
VA_FORM_32
[16]
RW,0
VA_48
[15]
RW,0
SL_RCV
[14]
RO
Extent
Type
Description
SL_XMIT
[13]
WO
HWE
[12]
RW,0
BP_MODE[1:0]
[11:10]
RW,0
SBE[1:0]
[9:8]
RW,0
SDE[1:0]
[7:6]
RW,0
Extent
Type
Description
SPE[2:0]
[5:3]
RW,0
IC_EN[1:0]
[2:1]
RW,3
SPCE
[0]
RW,0
47 46
39 38
32
ASN[7:0]
31
13 12
9 8
5 4 3 2 1
ASTRR[3:0]
ASTER[3:0]
FPE
PPCE
The following table lists the correspondence between IPR index bits and register
fields.
IPR Index Bit
Register Field
ASN
ASTER
ASTRR
PPCE
FPE
Extent
Type
Description
Reserved
ASN[7:0]
Reserved
ASTRR[3:0]
[63:47]
[46:39]
[38:13]
[12:9]
RW
RW
ASTER[3:0]
[8:5]
RW
Reserved
FPE
[4:3]
[2]
RW,1
PPCE
[1]
RW
Address
Access
RW
44 43
63
40 39
32
reserved
DEVSUP
REV
31
29 28 27
25 24 23
20 19
16 15
000
12
11
4 3 2 1 0
00
NXM
NXS
ACL
ABT
ABW
IPREQ
IPINTR
ITINTR
CPUID
Bits
Type
Initial
State Description
RES
<63:44>
MBZ, RAZ
DEVSUP
<43:40>
WO
REV
<39:32>
RO
NXS
<31:29>
RO
NXM
<28>
R, W1C
RES
<27:25>
MBZ, RAZ
Reserved.
ACL
<24>
WO
Arbitration clear: writing a 1
to this bit clears the ABT and
ABW fields.
ABT
<23:20>
R, W1S
Arbitration try: writing a 1 to
these bits sets them.
ABW
<19:16>
R, W1S
Arbitration won: writing a 1 to
these bits sets them unless one
is already set, in which case the
write is ignored.
IPREQ
<15:12>
WO
Interprocessor interrupt
request: write a 1 to the bit
corresponding to the CPU you
want to interrupt. Writing a 1
here sets the corresponding bit
in the IPINTR.
Reserved.
Bits
Type
Initial
State Description
IPINTR
<11:8>
R, W1C
Interprocessor interrupt
pending: one bit per CPU. Pin
irq<3> is asserted to the CPU
corresponding to a 1 in this
field.
ITINTR
<7:4>
R, W1C
RES
<3:2>
MBZ, RAZ
Reserved.
CPUID
<1:0>
RO
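When a MISC value appears in a logout frame, it can be convenient to break it into its named fields in one pass. The Python sketch below does this with the bit ranges from the table above. It is an illustration only: the names `MISC_FIELDS`, `extract`, and `decode_misc` are hypothetical, and the sample value is made up.

```python
# Field name -> (high bit, low bit), taken from the MISC register table.
MISC_FIELDS = {
    "DEVSUP": (43, 40),
    "REV": (39, 32),
    "NXS": (31, 29),
    "NXM": (28, 28),
    "ACL": (24, 24),
    "ABT": (23, 20),
    "ABW": (19, 16),
    "IPREQ": (15, 12),
    "IPINTR": (11, 8),
    "ITINTR": (7, 4),
    "CPUID": (1, 0),
}


def extract(value: int, high: int, low: int) -> int:
    """Extract bits <high:low> (inclusive) from a 64-bit register value."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_misc(value: int) -> dict:
    """Return every named MISC field as an integer."""
    return {name: extract(value, high, low)
            for name, (high, low) in MISC_FIELDS.items()}


# Made-up sample: REV = 08, NXS = 4, NXM set, CPUID = 2.
sample = (0x08 << 32) | (0x4 << 29) | (1 << 28) | 0x2
for name, field_value in decode_misc(sample).items():
    print(f"{name:7} {field_value:#x}")
```

Reserved ranges (<63:44>, <27:25>, <3:2>) are deliberately left out of the dictionary.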
Address
Access
RO
63
32
58 57 56 55
00
31
Bits
Type
Initial
State Description
ERR
<63:58>
RO
RES
<57:56>
RO
NXS
<55:0>
RO
Address
Access
63
RW
56 55
52 51 50
44 43
40 39
32
ADDR
INV
CMD
SYN
31
16 15
12 11 10 9 8 7 6 5 4 3 2 1
ADDR
RES
CRE
UECC
RES
NDS
RDPE
TA
APE
SGE
DCRTO
PERR
SERR
LOST
Bits
Type
Initial
State Description
SYN
<63:56>
RO
CMD
<55:52>
RO
INV
ADDR
<51>
<50:16>
RO Rev1
RAZ Rev0
RO
Value
Command
0000
0001
0011
Others
DMA read
DMA read-modify-write
SGTE read
Reserved
Mode
0
1
RES
<15:12> MBZ,
RAZ
Reserved
CRE
<11>
R, W1C
UECC
<10>
R, W1C
RES
<9>
MBZ,
RAZ
Reserved.
NDS
<8>
R, W1C
RDPE
<7>
R,W1C
TA
<6>
R, W1C
APE
<5>
R, W1C
SGE
<4>
R, W1C
DCRTO <3>
R, W1C
PERR
<2>
R, W1C
SERR
<1>
R, W1C
LOST
<0>
R, W1C
Type
Initial
State Description
Name
Bits
Type
Init
RES
ADDR
<63:35>
<34:24>
MBZ,RAZ 0
RW
0
RES
DBG
<23:17>
16
MBZ,RAZ 0
RW
0
ASIZ
<15:12>
RW
RES
TSA
SA
<11:10>
<9>
<8>
MBZ,RAZ 0
RW
0
RW
0
Description
Reserved.
Base address Bits <34:24> of the physical
byte address of the first byte in the array.
(<34:32> are used in Typhoon only; <34:28>
are valid)
Reserved.
Enables this memory port to be used as a debug
interface.
Array size (<15> is used in Typhoon only).
Value
Size
0000
0 (bank disabled)
0001
16MB
0010
32MB
0011
64MB
0100
128MB
0101
256MB
0110
512MB
0111
1GB
1000
2GB (Typhoon only)
1001
4GB (Typhoon only)
1010
8GB (Typhoon only)
1011-1111 Reserved.
Reserved.
Twice-split array (Typhoon only)
Split array.
Bits
Type
RES
ROWS
<7:4>
<3:2>
MBZ,RAZ 0
RW
0
BNKS
<1:0>
RW
Init
Description
Reserved.
Number of row bits in the SDRAMs.
Value
Number of Bits
0
11
1
12
2
13
3
Reserved
Number of bank bits in the SDRAMs
Value
Number of Bits
0
1
1
2
2
3 (Typhoon only)
3
Reserved
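The Array Address Register fields above can be decoded with shifts and masks. The Python sketch below is a minimal illustration under the bit positions given in the table (ADDR <34:24>, ASIZ <15:12>, TSA <9>, SA <8>, ROWS <3:2>, BNKS <1:0>). The function name and the sample register value are hypothetical. The size calculation relies on the observation that each ASIZ step from 0001 (16MB) to 1010 (8GB) doubles the size.

```python
ROW_BITS = {0: 11, 1: 12, 2: 13}    # ROWS value -> number of row bits
BANK_BITS = {0: 1, 1: 2, 2: 3}      # BNKS value -> number of bank bits


def decode_aar(value: int) -> dict:
    """Decode the documented fields of an Array Address Register value."""
    asiz = (value >> 12) & 0xF
    if asiz == 0:
        size_mb = 0                     # bank disabled
    elif asiz <= 0b1010:
        size_mb = 16 << (asiz - 1)      # 16MB doubled for each step
    else:
        size_mb = None                  # 1011-1111 are reserved
    return {
        "base": value & (0x7FF << 24),  # bits <34:24> of the byte address
        "size_mb": size_mb,
        "twice_split": bool((value >> 9) & 1),
        "split": bool((value >> 8) & 1),
        "row_bits": ROW_BITS.get((value >> 2) & 0x3),
        "bank_bits": BANK_BITS.get(value & 0x3),
    }


# Hypothetical value: base 40000000, ASIZ 0101, SA set, ROWS 1, BNKS 1.
info = decode_aar(0x40005105)
print(f"base={info['base']:08X} size={info['size_mb']}MB split={info['split']}")
```

Reserved ROWS and BNKS codes come back as `None`, so a caller can tell them apart from valid values.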
Table D14
DPR
Location
Description
A0
A1
Bit 0
1
2
3
4
5
6
7
Bit 0
2
Table D14
DPR
Location
A2
A3
Reserved
A4
A5
Bit 7
Bit 6
Bit 5
Bit 4
Bit 3
Bit 2
Bit 1-0
Table D14
DPR
Location
A6
A7
A8
A9
Fan 1
Fan 2
Fan 3
Fan 4
Fan 5
Fan 6
Definition
DB/E4/ED
PS_ID0_L
PS_ID1_L
Reserved (Pulled up so bit is always enabled)
Thermal_Shutdown_H
Tied to High within PS
DC/E5/EE
DD/E6/EF
DE/E7/F0
DF/E8/F1
Fan_Speed (0x8B = 7 V)
E0/E9/F2
E1/EA/F3
Power_supply_internal_temperature (hot)
Byte represents a temp value
1 bit = 0.756 C
E2/EB/F4
Power_supply_inlet_temperature
1 bit = 0.266 C
E3/EC/F5
Spare
NOTE: The DPR locations refer to power supplies. For example, DB/E4/ED = power
supply 0/1/2. The same is true for all locations listed in the table.
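Each row of the table lists the same item for power supplies 0, 1, and 2, and the three locations in a row are always 9 bytes apart (DB, E4, ED). A short Python sketch of that mapping follows; the function name is hypothetical, and it assumes only the spacing visible in the table.

```python
PS_BLOCK_STRIDE = 0xE4 - 0xDB   # 9 bytes between consecutive supplies


def ps_dpr_location(ps0_location: int, supply: int) -> int:
    """Map a power supply 0 DPR location (DB-E3) to supply 0, 1, or 2."""
    if not 0xDB <= ps0_location <= 0xE3:
        raise ValueError("expected a power supply 0 location, DB through E3")
    if supply not in (0, 1, 2):
        raise ValueError("power supply number must be 0, 1, or 2")
    return ps0_location + supply * PS_BLOCK_STRIDE


print(f"{ps_dpr_location(0xE2, 1):X}")  # inlet temperature byte, supply 1
```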
Definition
BD
BE
BF
56 55
48 47
40 39
32 31
24
23
16 15
0 Offset(Hex)
Frame Size(00C8)
NOTE: For CPU uncorrectable, offsets B0-B8 will be zeroed, and for system
uncorrectable, offsets 18-98 will be zeroed.
00000000
00000008
00000010
00000018
00000020
00000028
00000030
00000038
00000040
00000048
00000050
00000058
00000060
00000068
00000070
00000078
00000080
00000088
00000090
00000098
000000A0
000000A8
000000B0
000000B8
000000C0
56 55
48 47
40 39
Revision (1)
32 31
24 23
16 15
8 7
0 Offset (Hex)
Type (3)
Class (12)
Length (80)
Processor WHAMI
Retryable/Second Error Flags
Frame Size (0070)
1
System Area Offset (0020)
EV6 Area Offset (0020)
Machine Check Frame Revision
Machine Check Code (206)
Software Error Summary Flags
Cchip CPUx Device Interrupt Request Register (DIRx System Primary CPU
Fault Watcher)
Environ_QW_1 (TIG System Management Information Register (SMIR))
Environ_QW_2 (TIG CPU Information Register (CPUIR))
Environ_QW_3 (TIG Power Supply Information Register (PSIR))
Environ_QW_4 (System_PS/Temp/Fan_Fault - LM78_ISR )
Environ_QW_5 (System_Doors)
Environ_QW_6(System_Temperature_Warning)
Environ_QW_7(System_Fan_Control_Fault)
Environ_QW_8(Fatal_Power_Down_Codes)
Environ_QW_9(Environmental Reserved 1)
00000000
00000008
00000010
00000018
00000020
00000028
00000030
00000038
00000040
00000048
00000050
00000058
00000060
00000068
00000070
00000078
NOTE: Only Environ_QW_8 contains valid error state capture. All other
Environ_QW_1-7, 9 will be zeroed.
56 55
48 47
40 39
32 31
24 23
16 15
NOTE: For CPU correctable, offsets 68-78 will be zeroed, and for system
uncorrectable, offsets 18-50 will be zeroed.
Offset
0 (Hex)
00000000
00000008
00000010
00000018
00000020
00000028
00000030
00000038
00000040
00000048
00000050
00000058
00000060
00000068
00000070
00000078
56 55
48 47
40 39
32 31
24 23
16 15
8 7
0 Offset (Hex)
00000000
00000008
00000010
00000018
00000020
00000028
00000030
00000038
00000040
00000048
00000050
00000058
00000060
00000068
NOTE: Only Environ_QW_1-7 contain valid error state capture.
Environ_QW_8 and Environ_QW_9 will be zeroed.
Bit Field
C_SYNDROME_0
<7:0>
Data Bit
00
01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
<7:0>(Hex)
4F
4A
52
54
57
58
5B
5D
A2
A4
A7
A8
AB
AD
B0
B5
8F
8A
92
94
97
98
9B
9D
62
64
67
68
6B
6D
Data Bit
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
Bit Field
C_SYNDROME_0
(continued)
C_SYNDROME_1
<7:0>
C_STAT
<4:0>
C_STS
<7:4>
<3:0>
C_ADDR
<42:6>
SNGL: Single-bit error leading to correctable error; DBL: double-bit error leading to
uncorrectable error.
Bit Field
<63:41>
<40>
<39>
<38>
<37:34>
<33>
<32:30>
<29>
Reserved
ProfileMe Mispredict Trap
ProfileMe Trap
ProfileMe Load-Store Order Trap
ProfileMe Trap Types
ProfileMe Icache Miss
ProfileMe Counter 0 Overcount
Set = Icache encountered a parity error on instruction fetch
and a replay trap is performed, which generates a correctable
read interrupt.
Reserved
<28:0>
DC_STAT
<4:0>
MM_STAT
<3:0>
<10>
<9:4>
Bit Field
EXC_ADDR
<0>
<63:2>
IER_CM
<4:3>
I_SUM
<13>
<28:14>
<30:29>
<31>
<32>
<38:33>
<4:3>
<10:9>
PAL_BASE
<28:14>
<32>
<31>
<30:29>
<38:33>
<43:15>
Bit Field
<2:1>
<7:6>
<13>
<14>
<15>
<20>
<21>
<29:24>
<47:30>
PCTX
<0>
<1>
<2>
<4:3>
<8:5>
<12:9>
<38:13>
<46:39>
<63:47>
Software Error
Summary Flags
<0>
<1>
<2>
<63:3>
Bit Field
MISC
<43:40>
Suppress IRQ1 interrupts to 1(Hex) for CPU0, 2(Hex) for CPU1, 4(Hex) for
CPU2, and 8(Hex) for CPU3 Cchip
Cchip Revision Level : 00-07(Hex) for C2, 08-0F(Hex) for C4
0(Hex) for CPU0, 1(Hex) for CPU1, 2(Hex) for CPU2, 3(Hex) for CPU3,
4(Hex) for Pchip0, 5(Hex) for Pchip1, as device (source) which caused the
NXM
Set = NXM address detected, <31:29> are locked, DRIR <63> is set
Write 1 = Arbitration Clear
=1(Hex) for CPU0, 2(Hex) for CPU1, 4(Hex) for CPU2, and 8(Hex) for
CPU3 Arbitration Trying
=1(Hex) for CPU0, 2(Hex) for CPU1, 4(Hex) for CPU2, and 8(Hex) for
CPU3 Arbitration Won
=1(Hex) for CPU0, 2(Hex) for CPU1, 4(Hex) for CPU2, and 8(Hex) for
CPU3 to set interprocessor interrupt request.
=1(Hex) for CPU0, 2(Hex) for CPU1, 4(Hex) for CPU2, and 8(Hex) for
CPU3 interprocessor interrupt (IRQ<3>) pending
=1(Hex) for CPU0, 2(Hex) for CPU1, 4(Hex) for CPU2, and 8(Hex) for
CPU3 interval timer interrupt (IRQ<2>) pending
=00(Bin) for CPU0, 01(Bin) for CPU1, 10(Bin) for CPU2, 11(Bin) for CPU3
ID performing the read.
<39:32>
<31:29>
<28>
<24>
<23:20>
<19:16>
<15:12>
<11:8>
<7:4>
<1:0>
Bit Field
DIRx
<63>
<62>
<61>
<60>
<59>
<58>
<57:56>
<55>
<54>
<53>
<52>
<51>
<50>
<49>
<48>
<47:44>
<43:40>
<39:36>
<35:32>
<31:28>
<27:24>
<23:20>
<19:16>
<15:12>
<11:8>
<7:0>
Bit Field
<63:56>
<55:52>
<51>
<50:16>
<15:12>
<11>
<10>
<9>
<8>
<7>
<6>
<5>
<4>
<3>
<2>
<1>
<0>
CPUIR
(Environ_QW_2)
PSIR
(Environ_QW_3)
Bit Field
<7>
<6>
<5>
<4>
<3>
<2>
<1>
<0>
<7>
<6>
<5>
<4>
<3>
<2>
<1>
<0>
<7>
<6>
<5>
<4>
<3>
<2>
<1>
<0>
Bit Field
<0>
<1>
<2>
<3>
<4>
<5>
<6>
<7>
<8>
<9>
<10>
<15:11>
<16>
<17>
<18>
<19>
<20>
<21>
<22>
<23>
<31:24>
<32>
<33>
<34>
<35>
<36>
<37>
<38>
<39>
<41:40>
<42>
<43>
<44>
<45>
<46>
<47>
<63:48>
System_Temperature_Warning
(Environ_QW_6)
Bit Field
<0>
<1>
<2>
<3>
<4>
<5>
<6>
<7>
<63:8>
<0>
<1>
<2>
<3>
<4>
Unused
Set = System CPU door is open
Set = System Fan door is open
Set = System PCI door is open
Unused
Set = System CPU door is closed
Set = System Fan door is closed
Set = System PCI door is closed
Unused
Set = CPU0 temperature warning fault has occurred
Set = CPU1 temperature warning fault has occurred
Set = CPU2 temperature warning fault has occurred
Set = CPU3 temperature warning fault has occurred
Set = System temperature zone 0 warning fault has
occurred
Set = System temperature zone 1 warning fault has
occurred
Set = System temperature zone 2 warning fault has
occurred
Unused
Set = System Fan 1 is not responding to RMC Commands
Set = System Fan 2 is not responding to RMC Commands
Set = System Fan 3 is not responding to RMC Commands
Set = System Fan 4 is not responding to RMC Commands
Set = System Fan 5 is not responding to RMC Commands
Set = System Fan 6 is not responding to RMC Commands
Unused
Set = CPU fans 5/6 at maximum speed
Set = CPU fans 5/6 reduced speed from maximum
Set = PCI fans 1-4 at maximum speed
Set = PCI fans 1-4 reduced speed from maximum.
<5>
<6>
System_Fan_Control_Fault
(Environ_QW_7)
<63:7>
<0>
<1>
<2>
<3>
<4>
<5>
<7:6>
<8>
<9>
<10>
<11>
Bit Field
Fatal_Power_Down_Codes
(Environ_QW_8)
<0>
<1>
<2>
<3:7>
<8>
<9>
<10>
<11>
<12>
<13>
<14>
<15>
<16>
<17>
<18>
<19>
<20>
<21>
<22>
<23>
<63:24>
Appendix E
Isolating Failing DIMMs
This appendix explains how to manually isolate a failing DIMM from the failing
address and failing data bits. It also covers how to isolate single-bit errors. The
following topics are covered:
E.1
Memory Addresses
801.A000.0000
801.A000.0100
801.A000.0140
801.A000.0180
801.A000.01C0
DPR Locations
DPR:80
DPR:82
DPR:84
DPR:86
Memory Addresses
801.1000.2000
801.1000.2080
801.1000.2100
801.1000.2180
E.2
1.
Find the failing array by using the failing address and the Array Address
Registers (AARs; see Appendix D). Use the AAR base address and size to
create an address range to compare against the failing address.
For example, if the AAR1 base address is 40000000 (1 GB) and its size is
10000000 (256 MB), the address range is 40000000-4FFFFFFF
(1 GB to 1.25 GB). Compare the failing address against this range.
2.
Original
Array 0
Original
Array 1
Original
Array 2
Original
Array 3
00
01
Real Array 0
Real Array 1
Real Array 1
Real Array 0
Real Array 2
Real Array 3
Real Array 3
Real Array 2
10
11
Real Array 2
Real Array 3
Real Array 3
Real Array 2
Real Array 0
Real Array 1
Real Array 1
Real Array 0
3.
After finding the real array, determine whether it is the lower array set or
the upper array set. Use DPR locations 80, 82, 84, and 86 listed in
Table E-1. Table E-3 shows the description of these locations.
82
84
86
Description
Array 0 (AAR 0) Configuration
Bits<7:4>
4 = non split, lower set only
5 = split, lower set only
9 = split, upper set only
D = split, 8 DIMMs
F = Twice split, 8 DIMMs
Bits<3:0>
0 = Configured, Lowest array
1 = Configured, Next lowest array
2 = Configured, Second highest array
3 = Configured, Highest array
4 = Misconfigured, Missing DIMM(s)
8 = Misconfigured, Illegal DIMM(s)
C = Misconfigured, Incompatible DIMM(s)
Array 1 (AAR 1) configuration
Array 2 (AAR 2) configuration
Array 3 (AAR 3) configuration
4.
Array
Size
256MB
Lower Set
4&5
9
Upper Set
D&F
Bit <27> == 0 Lower Set, 1 Upper Set
512MB
Lower Set
Upper Set
1GB
Lower Set
Upper Set
2GB
Lower Set
Upper Set
4GB
Lower Set
Upper Set
8GB
Lower Set
Upper Set
5.
Now that you have the real array, the failing data/check bits, and the
correct set, use Table E-4 to find the failing DIMM or DIMMs.
The table shows data bits 0-255 and check bits 0-31. These data bits indicate a
single-bit error. An SROM compare error would yield address and data bits
from 0-63. When you convert the address to be in the correct range, the failing
data would be somewhere between 0 and 255.
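The address-range comparison used in the first step of this procedure can be sketched in Python. This is a minimal illustration, assuming the base address and size of each array have already been read from the AARs. The function names are hypothetical, and the AAR0 entry in the sample data is made up; only the AAR1 values come from the example in the procedure.

```python
from typing import Dict, Optional, Tuple


def array_range(base: int, size: int) -> Tuple[int, int]:
    """Return the inclusive (first, last) address covered by an array."""
    return base, base + size - 1


def find_array(failing_address: int,
               arrays: Dict[int, Tuple[int, int]]) -> Optional[int]:
    """Return the array whose range contains the address, or None.

    `arrays` maps an array number to its (base, size) pair. Arrays with
    a size of 0 are treated as disabled and skipped.
    """
    for number, (base, size) in arrays.items():
        if size == 0:
            continue
        first, last = array_range(base, size)
        if first <= failing_address <= last:
            return number
    return None


# AAR1 values from the worked example; the AAR0 entry is hypothetical.
arrays = {0: (0x00000000, 0x40000000), 1: (0x40000000, 0x10000000)}
first, last = array_range(*arrays[1])
print(f"{first:08X}-{last:08X}")
print(find_array(0x48000000, arrays))
```

The number returned is the original array; the table that maps original arrays to real arrays still has to be applied before looking up the DIMM.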
Data  Array 0               Array 1               Array 2               Array 3
bit   Lower Set  Upper Set  Lower Set  Upper Set  Lower Set  Upper Set  Lower Set  Upper Set
0     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
1     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
2     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
3     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
4     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
5     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
6     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
7     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
8     M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
9     M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
10    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
11    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
12    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
13    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
14    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
15    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
16    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
17    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
18    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
19    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
20    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
21    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
22    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
23    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
24    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
25    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
26    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
27    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
28    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
29    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
30    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
31    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
32    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
33    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
34    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
35    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
36    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
37    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
38    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
39    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
40    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
41    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
42    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
43    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
44    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
45    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
46    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
47    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
48    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
49    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
50    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
51    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
52    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
53    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
54    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
55    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
56    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
57    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
58    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
59    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
60    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
61    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
62    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
63    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
64    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
65    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
66    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
67    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
68    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
69    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
70    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
71    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
72    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
73    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
74    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
75    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
76    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
77    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
78    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
79    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
80    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
81    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
82    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
83    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
84    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
85    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
86    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
87    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
88    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
89    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
90    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
91    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
92    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
93    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
94    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
95    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
96    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
97    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
98    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
99    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
100   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
101   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
102   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
103   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
104   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
105   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
106   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
107   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
108   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
109   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
110   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
111   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
112   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
113   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
114   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
115   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
116   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
117   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
118   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
119   M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
120   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
121   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
122   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
123   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
124   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
125   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
126   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
127   M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
128   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
129   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
130   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
131   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
132   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
133   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
134   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
135   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
136   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
137   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
138   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
139   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
140   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
141   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
142   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
143   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
144   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
145   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
146   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
147   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
148   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
149   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
150   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
151   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
152   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
153   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
154   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
155   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
156   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
157   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
158   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
159   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
160   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
161   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
162   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
163   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
164   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
165   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
166   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
167   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
168   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
169   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
170   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
171   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
172   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
173   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
174   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
175   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
176   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
177   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
178   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
179   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
180   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
181   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
182   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
183   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
184   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
185   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
186   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
187   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
188   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
189   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
190   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
191   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
192   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
193   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
194   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
195   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
196   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
197   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
198   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
199   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
200   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
201   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
202   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
203   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
204   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
205   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
206   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
207   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
208   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
209   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
210   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
211   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
212   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
213   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
214   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
215   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
216   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
217   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
218   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
219   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
220   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
221   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
222   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
223   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
224   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
225   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
226   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
227   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
228   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
229   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
230   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
231   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
232   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
233   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
234   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
235   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
236   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
237   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
238   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
239   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
240   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
241   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
242   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
243   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
244   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
245   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
246   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
247   M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
248   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
249   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
250   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
251   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
252   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
253   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
254   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
255   M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
Check Array 0               Array 1               Array 2               Array 3
bit   Lower Set  Upper Set  Lower Set  Upper Set  Lower Set  Upper Set  Lower Set  Upper Set
0     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
1     M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
2     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
3     M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
4     M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
5     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
6     M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
7     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
8     M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
9     M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
10    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
11    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
12    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
13    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
14    M:0 D:1    M:0 D:5    M:2 D:1    M:2 D:5    M:0 D:3    M:0 D:7    M:2 D:3    M:2 D:7
15    M:1 D:1    M:1 D:5    M:3 D:1    M:3 D:5    M:1 D:3    M:1 D:7    M:3 D:3    M:3 D:7
16    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
17    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
18    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
19    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
20    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
21    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
22    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
23    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
24    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
25    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
26    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
27    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
28    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
29    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
30    M:0 D:2    M:0 D:6    M:2 D:2    M:2 D:6    M:0 D:4    M:0 D:8    M:2 D:4    M:2 D:8
31    M:1 D:2    M:1 D:6    M:3 D:2    M:3 D:6    M:1 D:4    M:1 D:8    M:3 D:4    M:3 D:8
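The table entries follow a regular pattern, which makes a quick cross-check possible. The following Python sketch reproduces the M and D values from the bit number, the real array, and the set. The pattern (and the function and constant names) is inferred from the table entries and is not stated by this guide, so use the table itself as the authority.

```python
# Positions, counted in groups of eight data bits within each 64-bit group,
# whose rows show M:1 / M:3; all other positions show M:0 / M:2.
# Check bits follow the same pattern by check bit number modulo 8.
M1_POSITIONS = {0, 2, 5, 7}


def dimm_for(bit, array, upper_set, check_bit=False):
    """Return (M, D) for a data bit (0-255) or a check bit (0-31)."""
    if check_bit:
        position, even_dimm = bit % 8, bit >= 16
    else:
        position, even_dimm = (bit // 8) % 8, bit >= 128
    # Arrays 1 and 3 use M:2 / M:3; arrays 0 and 2 use M:0 / M:1.
    m = (1 if position in M1_POSITIONS else 0) + (2 if array % 2 else 0)
    # Arrays 2 and 3 add 2 to D, the upper set adds 4, and the upper half
    # of the bit range selects the even-numbered D value.
    d = 1 + (2 if array >= 2 else 0) + (4 if upper_set else 0) + (1 if even_dimm else 0)
    return m, d


print(dimm_for(10, array=0, upper_set=False))                  # (0, 1)
print(dimm_for(137, array=3, upper_set=True))                  # (2, 8)
print(dimm_for(17, array=2, upper_set=True, check_bit=True))   # (0, 8)
```

If a result from the sketch ever disagrees with the table entry for the same bit, array, and set, use the table entry.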
E.3
The procedure for detection down to the set of DIMMs for a single-bit
error is very similar to the procedure described in the previous
sections. However, you cannot isolate down to a specific data or check
bit.
The 21264 (EV6) chip detects and reports a C_ADDR<42:6> failing address that
is accurate to the cache block (64 bytes). The syndrome registers (Table E-5)
detect data syndrome information, providing isolation down to the low or high
quadword of the target octaword in which the fault was detected. Each
of the syndrome registers can report 64 data bits (the quadword) and 8
check bits (memory data bus ECC bits).
Table E-5 shows the syndrome hexadecimal to physical data or check bit
decoding. For example, if you have an EV6 single-bit C_Syndrome_0
hexadecimal error value equal to 23, the second column indicates the decoded
physical data or check bit for this encoding. Use these physical data bits in
conjunction with the previously described isolation procedure to isolate the
failing DIMMs.
C_Syndrome 0
C_Syndrome 1
CE
CB
D3
D5
D6
D9
DA
DC
23
25
26
29
2A
2C
31
34
0E
0B
13
15
16
19
1A
1C
E3
E5
E6
E9
EA
EC
F1
F4
4F
4A
52
54
57
58
5B
5D
A2
A4
A7
A8
AB
AD
B0
B5
8F
8A
92
94
97
98
9B
9D
62
64
67
68
6B
6D
70
75
01
02
04
08
10
20
40
80
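A syndrome lookup can be sketched in Python as follows. The physical-bit columns are not reproduced in the listing above, so this sketch assumes that the first 64 codes correspond, in the order listed, to data bits 0 through 63 of the quadword and that the last eight correspond to check bits 0 through 7. That ordering is an assumption, and the function name is hypothetical; verify any result against the complete table.

```python
# Syndrome codes in the order in which they are listed above.
SYNDROMES = """
CE CB D3 D5 D6 D9 DA DC 23 25 26 29 2A 2C 31 34
0E 0B 13 15 16 19 1A 1C E3 E5 E6 E9 EA EC F1 F4
4F 4A 52 54 57 58 5B 5D A2 A4 A7 A8 AB AD B0 B5
8F 8A 92 94 97 98 9B 9D 62 64 67 68 6B 6D 70 75
01 02 04 08 10 20 40 80
""".split()


def decode_syndrome(value):
    """Return ("data", n) or ("check", n) by list position, or None if unlisted."""
    code = f"{value:02X}"
    if code not in SYNDROMES:
        return None
    position = SYNDROMES.index(code)
    # Assumed layout: 64 data-bit codes followed by 8 check-bit codes.
    if position < 64:
        return ("data", position)
    return ("check", position - 64)


print(decode_syndrome(0x23))   # ('data', 8)
```

A value that does not appear in the list returns None. Such a value is not one of the single-bit encodings listed above.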
Index
A
AAR memory addresses, E-2
Acceptance testing, 2-10
Alpha System Reference Manual, 4-27
AlphaBIOS console
boot screen, 6-2
running in serial mode, 6-23
AlphaBIOS utilities, 6-19
Architecture, 1-2
Auto start, 6-16
UNIX or OpenVMS, 6-17
auto_action environment variable, 6-8,
6-17
auto_action environment variable,
SRM, 6-7
Autoboot, 6-16
AUX_5V LED, 1-25
AUX_5V power supply, 1-20
Auxiliary power supply, RMC, 7-3
B
Beep codes, 3-20
Boot device, setting, 6-18
Boot problems, 2-7
Boot screen, AlphaBIOS, 6-2
boot_file environment variable, 6-8
boot_osflags environment variable, 6-8
bootdef_dev environment variable, 6-8
Booting Linux, 6-44
buildfru command, 4-4
Bypass modes, 7-6
Bypassing the RMC, 7-6
C
Cables, 8-2
cat el command, 4-8
CCAT, 2-11
C-chip, 1-3
CD-ROM drive, 1-6
part number, 8-4
Chassis
accessing in a cabinet, 8-14
front components, 1-6
rear components, 1-7
removing covers from, 8-16
Checksum error, 3-22
Chipset, 1-3
clear password command, 6-27
clear_error all command, 4-10, 8-1, 8-9
clear_error command, 4-10, 4-50
Clearing checksum errors, 4-50
Clearing errors, 4-10
Clock generator settings, B-6
COM1 data flow, defining, 7-15
COM1 environment variables, 7-12
COM1 MMJ port, 1-9
com1_modem environment variable,
6-11
com1_baud environment variable, 6-10
com1_flow environment variable, 6-10
com1_mode environment variable, 6-10,
7-4
COM2 and parallel port
loopback tests, 4-56
COM2 port, 1-9
com2_baud environment variable, 6-10
com2_flow environment variable, 6-10
com2_modem environment variable,
6-11
Command conventions, RMC, 7-14
Compaq Analyze, 2-9
and SDD errors, 4-50
and TDD errors, 4-50
documentation, 5-3
event screen, 5-5
evidence designator, 5-10
removing, 8-26
CPU correctable error (630), 5-14
CPU uncorrectable error (670), 5-14
cpu_enabled environment variable, 6-11
crash command, 4-11
Crash dumps, 2-10, 4-11
Crashes, troubleshooting, 2-8
D
Data buses, 1-17
Data structures, displaying, 4-26
DC_STAT, D-6
Dcache Status Register, D-6
D-chips, 1-3
DEC VET, 2-10
DECevent, 5-2
deposit and examine commands, 4-12
Devices, configuring, 6-29
Devices, verifying, 4-58
Diagnostic commands
buildfru, 4-4
cat el, 4-8
clear_error, 4-10
clear_error all, 4-10
crash, 4-11
deposit and examine, 4-12
exer, 4-16
floppy_write, 4-21
grep, 4-22
hd, 4-24
info, 4-26
kill, 4-33
kill_diags, 4-33
memexer, 4-34
memtest, 4-36
more el, 4-8
net, 4-41
net -ic, 4-41
net -s, 4-41
nettest, 4-43
set sys_serial_num, 4-47
show error, 4-48
show fru, 4-51
show_status, 4-54
sys_exer, 4-56
test, 4-58
E
ECC logic, 5-13
ei*0_inet_init environment variable,
6-11
ei*0_mode environment variable, 6-12
F
Fail-safe loader, 2-13, 3-22
activating, 3-30, 3-31
jumpers, 3-30
Fans, 1-26
part numbers, 8-2
replacing, 8-22
G
Galaxy, OpenVMS, 6-46
Graphics mode, 6-19
grep command, 4-22
Greycode test, 4-37, 4-38
H
Halt button, 1-11
with login command, 6-28
halt in/out commands (RMC), 1-11, 7-23
Halt LED, 1-11
Halt, remote, 1-11, 7-23
hangup command (RMC), 7-25
Hard drive, removing, 8-24
Hard drives, 1-29
Hardware configuration
viewing, 6-5
hd command, 4-24
Heap space, resizing, 3-14
heap_expand environment variable,
6-12
Hex dump, 4-24
I
I/O connector assembly, removing, 8-40
I/O connectors, 1-8
I/O control logic, 1-18
I/O implementation, 1-19
info 0 command, 4-26
info 1 command, 4-28
info 2 command, 4-29
info 3 command, 4-30
info 4 command, 4-32
info command, 4-26
Information resources, 2-12
Installing disk cages, 6-40
Interlock switch, 8-17
Internal processor registers (21264),
D-1
Interrupts, 5-14
J
Jumpers
PCI, B-8
RMC and SPC, B-2
setting, B-10
TIG/SROM, B-4
Jumpers and switches, B-1
Junk I/O. See I/O connector assembly
K
kbd_hardware_type environment
variable, 6-12
Key mapping, AlphaBIOS, 6-23
Keyboard port, 1-9
Keys, 1-30
kill command, 4-33
kill_diags command, 4-33
KZPAC-xx RAID controllers, 6-25
kzpsa_host_id environment variable,
6-13
L
language environment variable, 6-13
LEDs
M
Machine checks, 5-14
memexer command, 4-34
Memory allocation, SRM, 3-14
Memory architecture, 1-16
Memory buses, 1-3
Memory configuration, 6-31
pedestal, 6-34
tower, 6-35
Memory exercisers, 4-34, 4-36
Memory failure, 3-9
Memory interleaving, 1-17
Memory motherboards. See MMBs
Memory options, 1-17
memory_text environment variable,
6-13
memtest command, 4-36
Memtest test 1, 4-38
Microprocessor, 21264, 1-15
MM_STAT Register, D-4
MMBs, 1-17
location of, 1-12
part number, 8-3
removing, 8-28
Model 1 and Model 2 systems, 1-5
Modem port, 1-9
Modules, processor, 1-12
MOP loopback tests, 4-44
more el command, 3-19, 4-8
N
net command, 4-41
net -ic command, 4-41
net -s command, 4-41
nettest command, 4-43
Network ports, testing, 4-43
No MEM error, 3-24
O
OCP, 1-10
customized message, 6-4
error messages, 3-20
OCP assembly, removing, 8-34
ocp_text environment variable, 6-13
OpenVMS Galaxy, 6-46
Operating system exercisers, 2-10
Operating systems
errors reported by, 2-8
switching between, 6-47
switching to UNIX or OpenVMS, 6-48
Operator control panel. See OCP
Options, supported, 2-14
os_type environment variable, 6-13
P
Pagers, 7-27
PAL handler, 5-12
PALcode
error routines, 5-14
exception/interrupt handling, 5-12
Parallel port, 1-9
password environment variable, 6-13
Patches, 2-13
P-chips, 1-3
PCI backplane, 1-18
cables, 8-42
part numbers, 8-3
removing, 8-44
PCI bus implementation, 1-19
PCI buses, 6-36
PCI card
Q
quit command (RMC), 7-10
R
RAID utility, running, 6-25
RCM tool, 2-11
Real failed array, finding, E-3
Redundant power supply, 6-39
Register translation, platform logout
frames, D-46
Registers, D-1
Registers (21272)
AAR0-AAR3, D-35
MISC, D-26
PERROR, D-31
Registers (21272) DIRn, D-29
Registers (EV6/EV67)
Cbox Read, D-8
DC_STAT, D-6
EXC_ADDR, D-10
I_CTL, D-18
I_STAT, D-2
IER_CM, D-12
ISUM, D-14
MM_STAT, D-4
PAL_BASE, D-16
PCTX, D-23
Registers, displaying, 4-26
Remote management console. See RMC
Remote power-on/off, 7-22
Remote system management logic, 1-20
Removable media, 1-28
removing 5.25-inch device, 8-36
Removable media bays, 1-6
Removal and replacement, 8-1
Removing covers from chassis, 8-16
Removing enclosure panels, 8-10
from a pedestal, 8-13
from a tower, 8-11
S
SCB offsets, 5-14
SCSI breakouts, 1-9
SDD errors, 4-49
Security
SRM, 6-26
Serial mode, 6-19
setting up, 6-22
Serial number mismatch, 4-49
Serial terminal, 1-32, 6-3
running utilities from, 6-23
set-up, 6-22
Service tools CD, 2-12
set com1_mode command (RMC), 7-15
set console command, 6-3
set envar command, 6-7
set escape command (RMC), 7-29
set heap_expand command, 3-14
set ocp_text command, 6-4
set password command, 6-26
set secure command, 6-27
set sys_serial_num command, 4-47
Setting environment variables, 6-6
show boot* command, 6-5
show config command, 6-5
show console command, 6-3
show device command, 6-5
show envar command, 6-7
show error command, 4-48
message translation, 4-50
show fru command, 4-51, 6-5
show fru E field, 4-53
show memory command, 6-5
show power command, 6-39
show_status command, 4-54
Single-bit errors (EV6), detecting, E-16
Slot locations, PCI, 6-36
Slot numbers
CPUs, 6-29
PCI, 6-36
Snoop mode, 7-7
Soft Bypass mode, 7-7
Software patches, 2-13
T
TDD errors, 4-49
Technical information, on Web, 2-14
Terminal setup (RMC), 7-9
U
UART ports, 7-5
Updating RMC, 3-32
USB ports, 1-9
Utilities
AlphaBIOS, 6-19
running from serial terminal, 6-23
running from VGA, 6-20
Utilities menu, 6-20
V
Verifying devices, 4-58
VGA console tests, 4-59
VGA controller, slot for, 6-37
VGA monitor, 1-32, 6-3
VT terminal, 6-3
W
Warning messages, RMC, 3-27