FAS Systems System-Level Diagnostics Guide
Contents
Introduction to system-level diagnostics
Requirements for running system-level diagnostics
How to use online command-line help
Running system installation diagnostics
Running system panic diagnostics
Running slow system response diagnostics
Running hardware installation diagnostics
Running device failure diagnostics
Copyright information
Trademark information
How to send your comments
Index
General requirements
Each system being tested must be on a separate network.
The network interface test assigns unique static IP addresses, beginning with 172.25.150.23, to all
available network interfaces on a storage system. This results in network interface ports on
different storage controllers being assigned the same IP address. If all the systems being tested are
on the same network, then duplicate IP address warning messages appear on the connected
consoles. These warning messages do not affect the test results.
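The duplicate-address behavior can be illustrated with a short sketch. It assumes, purely for illustration, that addresses are handed out sequentially from 172.25.150.23 in interface order; the guide states only the starting address. The function name and the port lists are hypothetical.

```python
import ipaddress

# Starting address taken from the guide.
START = ipaddress.IPv4Address("172.25.150.23")

def assign_test_addresses(interfaces):
    """Map each interface name to an address, counting up from START.

    Sequential assignment is an assumption made for this sketch.
    """
    return {name: str(START + i) for i, name in enumerate(interfaces)}

# Two controllers with the same port layout (hypothetical port lists).
controller_a = assign_test_addresses(["e0a", "e0b"])
controller_b = assign_test_addresses(["e0a", "e0b"])

# Addresses present on both controllers would clash on a shared network.
duplicates = sorted(set(controller_a.values()) & set(controller_b.values()))
print(duplicates)
```

Under this assumption every address is duplicated across the two controllers, which is why each system under test should sit on its own network.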
NIC requirements
For best performance, all adjacent network interface ports on the system must be connected using
a standard Ethernet cable.
Examples of adjacent ports are e0a and e0b or e2c and e2d.
Attention: e0M and e0P ports cannot be connected together due to an internal switch
connection. In systems with e0M and e0P ports, the most efficient pairings are e0M with e0a
and e0P with e0b.
If the system has many network interface ports, you might need to run the NIC
system-level diagnostic test several times, limiting each run to no more than two pairs.
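The pairing and batching rules above can be sketched as a small planning helper. This is an illustrative sketch only: the function name is hypothetical, and the e2a/e2b pair is an assumed example alongside the pairs named in the guide.

```python
def plan_nic_runs(pairs, max_pairs_per_run=2):
    """Split connected port pairs into runs of at most max_pairs_per_run."""
    for a, b in pairs:
        # e0M and e0P share an internal switch and must not be cabled together.
        if {a, b} == {"e0M", "e0P"}:
            raise ValueError("e0M and e0P cannot be connected together")
    return [pairs[i:i + max_pairs_per_run]
            for i in range(0, len(pairs), max_pairs_per_run)]

# Pairings recommended by the guide, plus hypothetical expansion-card pairs.
pairs = [("e0M", "e0a"), ("e0P", "e0b"), ("e2a", "e2b"), ("e2c", "e2d")]
runs = plan_nic_runs(pairs)
for number, run in enumerate(runs, start=1):
    print(number, run)
```

Four pairs therefore need two separate runs of the NIC test.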
SAS requirements
When running the SAS system-level diagnostic tests, adjacent SAS ports must be connected for
best performance; storage shelves must be disconnected from the ports.
Note: Connecting adjacent SAS ports is no longer required for systems running
Data ONTAP 8.2; however, only the internal loopback test is run for systems with
unconnected SAS ports.
FC-AL requirements
When running the FC-AL system-level diagnostic tests, you must have loopback hoods on FC-AL
interfaces on the motherboard or expansion adapters for best performance; all other cables for
storage or Fibre Channel networks must be disconnected from the ports.
Note: Loopback hoods on FC-AL interfaces are no longer required for
systems running Data ONTAP 8.2; however, without them the scope of the test coverage on the
interface is reduced.
Interconnect requirements
Both platform controller modules in a dual controller system must be in Maintenance mode for
the interconnect system-level diagnostic test to run.
Attention: You will receive a warning message if you attempt to run the interconnect system-
level diagnostic test with other system-level diagnostic tests.
You can also type the question mark at the command line for a list of all the commands that are
available at the current level of administration (administrative or advanced).
The following example shows the result of entering the environment help command at the
storage system command line. The command output displays the syntax help for the environment
commands.
[status] [shelf_power_status] |
[status] [chassis [all | list-sensors | Fan | Power | Temp | Power
Supply | RTC Battery | NVRAM4-temperature-7 | NVRAM4-battery-7]]
Steps
1. At the storage system prompt, enter the following command to get to the LOADER prompt:
halt
2. On the node with the replaced component, enter the following command at the LOADER prompt:
boot_diags
Note: You must enter this command from the LOADER prompt for system-level diagnostics to
function properly. The boot_diags command starts special drivers designed specifically for
system-level diagnostics.
Important: During the boot_diags process, you might see the following prompts:
A prompt warning of a system ID mismatch and asking to override the system ID.
A prompt warning that when entering Maintenance mode in an HA configuration you must
ensure that the partner remains down.
You can safely respond y to these prompts.
5. Run all the default selected diagnostic tests on your storage system by entering the following
command:
sldiag device run
Your storage system provides the following output while the tests are still running:
After all the tests are complete, the following response appears by default:
*> <SLDIAG:_ALL_TESTS_COMPLETED>
7. Verify that there are no hardware problems on your new storage system by entering the following
command:
sldiag device status -long -state failed
The example shows that the tests were run without the appropriate hardware:
Example
The following example shows how the full status of failures that occurred is displayed:
STATUS: Completed
LOOP: 1/1
TEST END --------------------------------------------
STATUS: Completed
Starting test on Fcal Adapter: 0b
Started gathering adapter info.
Adapter get adapter info OK
Adapter fc_data_link_rate: 1Gib
Adapter name: QLogic 2532
Adapter firmware rev: 4.5.2
Adapter hardware rev: 2
ioctl_status.class_type = 0x1
ioctl_status.subclass = 0x3
ioctl_status.info = 0x0
Started INTERNAL LOOPBACK:
INTERNAL LOOPBACK OK
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
END DATE: Sat Jan 3 23:12:07 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------
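If the output of the failed-status command is captured to a text file, the error count and error messages can be pulled out with a short script. The following is a minimal sketch that assumes the line formats shown in the example above; the function name and the capture step are hypothetical and not part of sldiag.

```python
import re

# Matches lines such as "Error Count: 2 Run Time: 70 secs".
ERROR_COUNT = re.compile(r"Error Count:\s*(\d+)\s+Run Time:\s*(\d+)\s*secs")

def summarize_failures(output):
    """Return (total_errors, total_run_time_secs, error_messages)."""
    errors, run_time, messages = 0, 0, []
    for line in output.splitlines():
        match = ERROR_COUNT.search(line)
        if match:
            errors += int(match.group(1))
            run_time += int(match.group(2))
        elif line.startswith(">>>>> ERROR"):
            # Keep only the text that follows the "ERROR," tag.
            messages.append(line.partition("ERROR,")[2].strip())
    return errors, run_time, messages

sample = """Started INTERNAL LOOPBACK:
INTERNAL LOOPBACK OK
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
END DATE: Sat Jan 3 23:12:07 GMT 2009"""

summary = summarize_failures(sample)
print(summary)
```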
Steps
1. At the storage system prompt, enter the following command to get to the LOADER prompt:
halt
2. On the node with the replaced component, enter the following command at the LOADER prompt:
boot_diags
Note: You must enter this command from the LOADER prompt for system-level diagnostics to
function properly. The boot_diags command starts special drivers designed specifically for
system-level diagnostics.
Important: During the boot_diags process, you might see the following prompts:
A prompt warning of a system ID mismatch and asking to override the system ID.
A prompt warning that when entering Maintenance mode in an HA configuration you must
ensure that the partner remains down.
You can safely respond y to these prompts.
After all the tests are complete, you receive the following default response:
*> <SLDIAG:_ALL_TESTS_COMPLETED>
5. Identify the cause of the system panic by entering the following command:
sldiag device status -long -state failed
The example shows that the tests were run without the appropriate hardware:
Example
The following example displays the full status of the failures that occurred:
STATUS: Completed
LOOP: 1/1
TEST END --------------------------------------------
STATUS: Completed
Starting test on Fcal Adapter: 0b
Started gathering adapter info.
Adapter get adapter info OK
Adapter fc_data_link_rate: 1Gib
Adapter name: QLogic 2532
Adapter firmware rev: 4.5.2
Adapter hardware rev: 2
ioctl_status.class_type = 0x1
ioctl_status.subclass = 0x3
ioctl_status.info = 0x0
Started INTERNAL LOOPBACK:
INTERNAL LOOPBACK OK
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
END DATE: Sat Jan 3 23:12:07 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------
Steps
1. At the storage system prompt, enter the following command to get to the LOADER prompt:
halt
2. On the node with the replaced component, enter the following command at the LOADER prompt:
boot_diags
Note: You must enter this command from the LOADER prompt for system-level diagnostics to
function properly. The boot_diags command starts special drivers designed specifically for
system-level diagnostics.
Important: During the boot_diags process, you might see the following prompts:
A prompt warning of a system ID mismatch and asking to override the system ID.
A prompt warning that when entering Maintenance mode in an HA configuration you must
ensure that the partner remains down.
You can safely respond y to these prompts.
Your storage system provides the following output while the tests are still running:
After all the tests are complete, the following response appears by default:
*> <SLDIAG:_ALL_TESTS_COMPLETED>
5. Identify the cause of the system sluggishness by entering the following command:
sldiag device status -long -state failed
The example shows that the tests were run without the appropriate hardware:
Example
The following example pulls up the full status of failures that occurred:
STATUS: Completed
ib3a: could not set loopback mode, test failed
END DATE: Sat Jan 3 23:11:04 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------
STATUS: Completed
Starting test on Fcal Adapter: 0b
Started gathering adapter info.
Adapter get adapter info OK
ioctl_status.class_type = 0x1
ioctl_status.subclass = 0x3
ioctl_status.info = 0x0
Started INTERNAL LOOPBACK:
INTERNAL LOOPBACK OK
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
END DATE: Sat Jan 3 23:12:07 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------
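Because the long status output reports several tests one after another, it can help to split captured output into one block per test. The sketch below assumes that each test ends with a line starting with "TEST END", as in the example above; the function names are hypothetical.

```python
def split_test_blocks(output):
    """Split long status output into per-test blocks ending at 'TEST END'."""
    blocks, current = [], []
    for line in output.splitlines():
        if line.startswith("TEST END"):
            if current:
                blocks.append(current)
            current = []
        else:
            current.append(line)
    if current:  # keep trailing lines that have no closing delimiter
        blocks.append(current)
    return blocks

def failed_blocks(blocks):
    """Keep blocks that contain an explicit failure or error line."""
    return [block for block in blocks
            if any("test failed" in line or "ERROR" in line for line in block)]

sample = """STATUS: Completed
ib3a: could not set loopback mode, test failed
END DATE: Sat Jan 3 23:11:04 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------
STATUS: Completed
Starting test on Fcal Adapter: 0b
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
END DATE: Sat Jan 3 23:12:07 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------"""

blocks = split_test_blocks(sample)
print(len(blocks), len(failed_blocks(blocks)))
```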
Steps
After you issue the command, wait until the system stops at the LOADER prompt.
Important: During the boot process, you might see the following prompts:
A prompt warning of a system ID mismatch and asking to override the system ID.
A prompt warning that when entering Maintenance mode in an HA configuration you
must ensure that the healthy node remains down.
You can safely respond y to these prompts.
3. On the node with the replaced component, enter the following command at the LOADER prompt:
boot_diags
Note: You must enter this command from the LOADER prompt for system-level diagnostics to
function properly. The boot_diags command starts special drivers designed specifically for
system-level diagnostics.
Important: During the boot_diags process, you might see the following prompts:
A prompt warning of a system ID mismatch and asking to override the system ID.
A prompt warning that when entering Maintenance mode in an HA configuration you must
ensure that the partner remains down.
You can safely respond y to these prompts.
Your storage system provides the following output while the tests are still running:
After all the tests are complete, the following response appears by default:
*> <SLDIAG:_ALL_TESTS_COMPLETED>
6. Verify that no hardware problems resulted from the addition or replacement of hardware
components on your storage system by entering the following command:
sldiag device status [-dev devtype|mb|slotslotnum] [-name device] -long
-state failed
The example shows that the tests were run without the appropriate hardware:
Example
The following example pulls up the full status of failures resulting from testing a newly installed
FC-AL adapter:
STATUS: Completed
Starting test on Fcal Adapter: 0b
Started gathering adapter info.
Adapter get adapter info OK
Adapter fc_data_link_rate: 1Gib
Adapter name: QLogic 2532
Adapter firmware rev: 4.5.2
Adapter hardware rev: 2
ioctl_status.class_type = 0x1
ioctl_status.subclass = 0x3
ioctl_status.info = 0x0
Started INTERNAL LOOPBACK:
INTERNAL LOOPBACK OK
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
END DATE: Sat Jan 3 23:12:07 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------
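The optional -dev and -name filters in the status command above can be assembled with a small helper when commands are generated by a script. This sketch only builds the command string from the syntax given in the steps; the function name is hypothetical, and the values fcal and 0b are assumed examples taken from the sample output.

```python
def build_status_command(dev=None, name=None):
    """Build the failed-status command; dev and name are optional filters."""
    parts = ["sldiag", "device", "status"]
    if dev is not None:
        parts += ["-dev", dev]
    if name is not None:
        parts += ["-name", name]
    parts += ["-long", "-state", "failed"]
    return " ".join(parts)

# No filters: report failures for every tested device.
print(build_status_command())
# Assumed example values: restrict the report to one FC-AL adapter.
print(build_status_command(dev="fcal", name="0b"))
```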
Steps
1. At the storage system prompt, enter the following command to get to the LOADER prompt:
halt
2. On the node with the replaced component, enter the following command at the LOADER prompt:
boot_diags
Note: You must enter this command from the LOADER prompt for system-level diagnostics to
function properly. The boot_diags command starts special drivers designed specifically for
system-level diagnostics.
Important: During the boot_diags process, you might see the following prompts:
A prompt warning of a system ID mismatch and asking to override the system ID.
A prompt warning that when entering Maintenance mode in an HA configuration you must
ensure that the partner remains down.
You can safely respond y to these prompts.
4. View the status of the test by entering the following command:
sldiag device status
Your storage system provides the following output while the tests are still running:
After all the tests are complete, the following response appears by default:
*> <SLDIAG:_ALL_TESTS_COMPLETED>
5. Identify any hardware problems by entering the following command:
sldiag device status [-dev devtype|mb|slotslotnum] [-name device] -long -state failed
The example shows that the tests were run without the appropriate hardware:
If the tests were completed without any failures, there are no hardware problems and your
storage system returns to the prompt.
a. Clear the status logs by entering the following command:
sldiag device clearstatus [-dev devtype|mb|slotslotnum]
b. Verify that the log is cleared by entering the following command:
sldiag device status [-dev devtype|mb|slotslotnum]
The following default response is displayed:
Example
The following example shows how the full status of failures resulting from testing the FC-AL
adapter is displayed:
DEVTYPE: fcal
NAME: Fcal Loopback Test
START DATE: Sat Jan 3 23:10:56 GMT 2009
STATUS: Completed
Starting test on Fcal Adapter: 0b
Started gathering adapter info.
Adapter get adapter info OK
Adapter fc_data_link_rate: 1Gib
Adapter name: QLogic 2532
Adapter firmware rev: 4.5.2
Adapter hardware rev: 2
ioctl_status.class_type = 0x1
ioctl_status.subclass = 0x3
ioctl_status.info = 0x0
Started INTERNAL LOOPBACK:
INTERNAL LOOPBACK OK
Error Count: 2 Run Time: 70 secs
>>>>> ERROR, please ensure the port has a shelf or plug.
END DATE: Sat Jan 3 23:12:07 GMT 2009
LOOP: 1/1
TEST END --------------------------------------------
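The START DATE and END DATE lines in the example above can be used to cross-check the reported run time. The following sketch assumes the timestamp layout shown in the example (with a literal GMT zone name) and the default English locale for day and month names; the function name is hypothetical.

```python
from datetime import datetime

# Layout of the timestamps in the example output, with a literal "GMT".
DATE_FORMAT = "%a %b %d %H:%M:%S GMT %Y"

def elapsed_seconds(start_line, end_line):
    """Return whole seconds between a START DATE line and an END DATE line."""
    start = datetime.strptime(start_line.split(": ", 1)[1], DATE_FORMAT)
    end = datetime.strptime(end_line.split(": ", 1)[1], DATE_FORMAT)
    return int((end - start).total_seconds())

elapsed = elapsed_seconds("START DATE: Sat Jan 3 23:10:56 GMT 2009",
                          "END DATE: Sat Jan 3 23:12:07 GMT 2009")
print(elapsed)  # 71
```

The result can be compared with the Run Time value printed in the same block.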
Copyright information
Copyright 1994-2013 NetApp, Inc. All rights reserved. Printed in the U.S.
No part of this document covered by copyright may be reproduced in any form or by any means
(graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an
electronic retrieval system) without prior written permission of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and
disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE,
WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice.
NetApp assumes no responsibility or liability arising from the use of products described herein,
except as expressly agreed to in writing by NetApp. The use or purchase of this product does not
convey a license under any patent rights, trademark rights, or any other intellectual property rights of
NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents,
or pending applications.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to
restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer
Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).
Trademark information
NetApp, the NetApp logo, Network Appliance, the Network Appliance logo, Akorri,
ApplianceWatch, ASUP, AutoSupport, BalancePoint, BalancePoint Predictor, Bycast, Campaign
Express, ComplianceClock, Cryptainer, CryptoShred, CyberSnap, Data Center Fitness, Data
ONTAP, DataFabric, DataFort, Decru, Decru DataFort, DenseStak, Engenio, Engenio logo, E-Stack,
ExpressPod, FAServer, FastStak, FilerView, Flash Accel, Flash Cache, Flash Pool, FlashRay,
FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexSuite, FlexVol, FPolicy, GetSuccessful,
gFiler, Go further, faster, Imagine Virtually Anything, Lifetime Key Management, LockVault, Mars,
Manage ONTAP, MetroCluster, MultiStore, NearStore, NetCache, NOW (NetApp on the Web),
Onaro, OnCommand, ONTAPI, OpenKey, PerformanceStak, RAID-DP, ReplicatorX, SANscreen,
SANshare, SANtricity, SecureAdmin, SecureShare, Select, Service Builder, Shadow Tape,
Simplicity, Simulate ONTAP, SnapCopy, Snap Creator, SnapDirector, SnapDrive, SnapFilter,
SnapIntegrator, SnapLock, SnapManager, SnapMigrator, SnapMirror, SnapMover, SnapProtect,
SnapRestore, Snapshot, SnapSuite, SnapValidator, SnapVault, StorageGRID, StoreVault, the
StoreVault logo, SyncMirror, Tech OnTap, The evolution of storage, Topio, VelocityStak, vFiler,
VFM, Virtual File Manager, VPolicy, WAFL, Web Filer, and XBB are trademarks or registered
trademarks of NetApp, Inc. in the United States, other countries, or both.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. A complete and current list of
other IBM trademarks is available on the web at www.ibm.com/legal/copytrade.shtml.
Apple is a registered trademark and QuickTime is a trademark of Apple, Inc. in the United States
and/or other countries. Microsoft is a registered trademark and Windows Media is a trademark of
Microsoft Corporation in the United States and/or other countries. RealAudio, RealNetworks,
RealPlayer, RealSystem, RealText, and RealVideo are registered trademarks and RealMedia,
RealProxy, and SureStream are trademarks of RealNetworks, Inc. in the United States and/or other
countries.
All other brands or products are trademarks or registered trademarks of their respective holders and
should be treated as such.
NetApp, Inc. is a licensee of the CompactFlash and CF Logo trademarks.
NetApp, Inc. NetCache is certified RealSystem compatible.