Isilon x200 Replace Boot Drive
Topic
x200
Selections
Select Node Type: Non-CTO
X200 - Select Non-CTO Node Activity: Replace Boot Drive
REPORT PROBLEMS
If you find any errors in this procedure or have comments regarding this application, send email to
SolVeFeedback@emc.com
Copyright © 2019 Dell Inc. or its subsidiaries. All Rights Reserved.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION (“EMC”)
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE
INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-
INFRINGEMENT AND ANY WARRANTY ARISING BY STATUTE, OPERATION OF LAW, COURSE OF
DEALING OR PERFORMANCE OR USAGE OF TRADE. IN NO EVENT SHALL EMC BE LIABLE FOR
ANY DAMAGES WHATSOEVER INCLUDING DIRECT, INDIRECT, INCIDENTAL, CONSEQUENTIAL,
LOSS OF BUSINESS PROFITS OR SPECIAL DAMAGES, EVEN IF EMC HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice. Use, copying, and distribution of any EMC software described in this
publication requires an applicable software license.
Dell, EMC, Dell EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other
trademarks may be the property of their respective owners.
Contents
Preliminary Activity Tasks
Read, understand, and perform these tasks
Task 21: Run the ABR script
Task 22: Generate an ABR
Task 23: Remove the FRU package from the node
Task 24: Update drive firmware
Task 25: Verify a drive firmware update
Task 26: Gather logs
Task 27: Returning a failed part to Isilon
Task 28: Update the install database
Where to go for support
Preliminary Activity Tasks
This section may contain tasks that you must complete before performing this procedure.
Table 1 List of cautions, warnings, notes, and/or KB solutions related to this activity
2. [ ] This is a link to the top trending service topics. These topics may or may not be related to this
activity. This is merely a proactive attempt to make you aware of any KB articles that may be associated
with this product.
Isilon Top Service Topics
General Information for Removing and Installing FRUs
This section describes precautions you must take and general procedures you must follow when
removing, installing, or storing field-replaceable units (FRUs). The procedures in this section apply to FRU
handling during hardware upgrades as well as during general replacement.
FRUs are designed to be powered up at all times. This means you can accomplish FRU replacements
and most hardware upgrades while the cabinet is powered up. To maintain proper airflow for cooling and
to ensure EMI compliance, make sure all front bezels, filler panels, and filler modules are reinstalled after
the FRU replacement or hardware upgrade is completed.
IMPORTANT: These procedures are not a substitute for the use of an ESD kit. You should follow them
only in the event of an emergency.
Before touching any FRU, touch a bare (unpainted) metal surface of the enclosure.
Before removing any FRU from its antistatic bag, place one hand firmly on a bare metal surface of the
enclosure, and at the same time, pick up the FRU while it is still sealed in the antistatic bag. Once you
have done this, do not move around the room or contact other furnishings, personnel, or surfaces
until you have installed the FRU.
When you remove a FRU from the antistatic bag, avoid touching any electronic components and
circuits on it.
If you must move around the room or touch other surfaces before installing a FRU, first place the
FRU back in the antistatic bag. When you are ready again to install the FRU, repeat these
procedures.
This node contains two flash boot drives. The boot drives contain data vital to the health of the node,
including the OneFS operating system and backups of the node journal. The two boot drives mirror each
other, which provides a backup source of data if one of the drives fails.
You must identify which of the two boot drives has failed before you shut down the node. If you
accidentally remove the healthy drive, the data on the node can be lost.
CAUTION: If this procedure is not followed accurately, data loss and severe disruption of cluster
operations can occur. Perform every step in this procedure; if the system does not respond as expected,
contact Isilon Technical Support.
CAUTION: Perform this procedure on only one node at a time. Performing maintenance on multiple
nodes in parallel may lower the protection level of the cluster, put data at risk, and lead to the interruption
of client workflows.
Task 2: Install a drive support package
Before you install a new drive, download and install the drive support package. The drive support
package updates drive configuration information on the node, and contains the latest firmware versions
for Isilon qualified drives.
About this task
The drive support package is only supported by clusters running OneFS 7.1.1 or later. If your cluster is
running an earlier version of OneFS, you can skip this step.
Procedure
1. [ ] Go to the EMC Support page that lists all the available versions of the drive support package.
2. [ ] Click the latest version of the drive support package and download the file.
Note: See the Considerations for installing the latest drive support package section in order to select
the appropriate variant of the package. If you are unable to download the package, contact EMC
Isilon Technical Support for assistance.
3. [ ] Open a secure shell (SSH) connection to any node in the cluster and log in.
4. [ ] Create or check for the availability of the directory structure
/ifs/data/Isilon_Support/dsp.
5. [ ] Copy the downloaded file to the dsp directory through SCP, FTP, SMB, NFS, or any other
supported data-access protocol.
6. [ ] Unpack the drive support package by running the following command:
tar -zxvf Drive_Support_<version>.tgz
7. [ ] Install the package by running the following command:
isi_dsp_install Drive_Support_<version>.tar
Note: You must run the isi_dsp_install command to install the drive support package. Do not
use the isi pkg command.
Procedure
1. [ ] Download the latest FRU package from ftp://ftp.emc.com/outgoing/Fru_Package/.
2. [ ] Note the name of the FRU package. You will use the name for other commands.
Package names follow this convention:
IsiFru_Package_<date-time-stamp>.tgz
For example: IsiFru_Package_201507072125.tgz
3. [ ] Place the FRU package on the cluster through a network drop, or by asking someone at the
cluster site to place the package for you. If neither of these options is available to you, contact Isilon
Technical Support for assistance.
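Because the package name is reused in later commands, it can help to check it against the naming convention before you continue. The following Python sketch is illustrative only and is not part of the FRU package tooling. It assumes that the date-time stamp is 12 digits in the form YYYYMMDDHHMM, which is inferred from the single example name above; the function name is hypothetical.

```python
import re
from datetime import datetime

# Naming convention from the procedure: IsiFru_Package_<date-time-stamp>.tgz
# Assumption: the stamp is 12 digits laid out as YYYYMMDDHHMM, inferred from
# the example name IsiFru_Package_201507072125.tgz.
FRU_NAME = re.compile(r"^IsiFru_Package_(\d{12})\.tgz$")


def parse_fru_package_name(filename):
    """Return the package time stamp, or None if the name does not match."""
    match = FRU_NAME.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d%H%M")
    except ValueError:
        # Twelve digits that do not form a valid date and time.
        return None


if __name__ == "__main__":
    stamp = parse_fru_package_name("IsiFru_Package_201507072125.tgz")
    print(stamp)  # 2015-07-07 21:25:00
```

A result of None suggests that the file was renamed or is not a FRU package; in that case, download the package again rather than guessing the name.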
The boot drives are listed in the left column. In the previous example, both boot drives are healthy. If
one of the boot drives has failed, the drive will not appear in the output.
Make note of whether the failed boot drive is the ada1 or ada0 device, and then use the following
table to determine the location of the boot drive inside the node.
Make note of the board drive slot that contains the failed boot drive. The ada1 drive is on the right
side of the boot carrier card. The ada0 drive is on the left side of the boot carrier card.
CAUTION: If both drives appear to have failed, do not continue. Contact Isilon Technical Support
immediately.
Earlier than OneFS 8.0
atacontrol list
The following information appears:
ATA channel 0:
Master: no device present
Slave: no device present
ATA channel 1:
Master: no device present
Slave: no device present
ATA channel 2:
Master: ad4 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
Slave: no device present
ATA channel 3:
Master: no device present
Slave: ad7 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
ATA channel 4:
Master: no device present
Slave: no device present
ATA channel 5:
Master: no device present
Slave: no device present
The boot drives are listed under ATA channel 2 (master) and ATA channel 3 (slave). In the
previous example, both boot drives are healthy. If one of the boot drives has failed, the display reads
no device present for that drive.
Make note of whether the failed boot drive is the ad4 or ad7 device, and then use the following table
to determine the location of the boot drive inside the node.
Make note of the board drive slot that contains the failed boot drive.
CAUTION: If both drives appear to have failed, do not continue. Contact Isilon Product Support
immediately.
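The check described above for OneFS versions earlier than 8.0 can be sketched in Python as follows. This is a minimal illustration, not an Isilon tool: it assumes the atacontrol list output has been captured as text, and it takes the expected slots (ad4 on ATA channel 2 master, ad7 on ATA channel 3 slave) from the example in this procedure. The function names are hypothetical.

```python
import re

# Boot drive slots as described in this procedure for OneFS earlier than 8.0:
# ad4 on ATA channel 2 (master) and ad7 on ATA channel 3 (slave).
EXPECTED_BOOT_DRIVES = {(2, "master"): "ad4", (3, "slave"): "ad7"}

# Abbreviated copy of the healthy example output shown in this procedure.
HEALTHY_SAMPLE = """\
ATA channel 2:
Master: ad4 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
Slave: no device present
ATA channel 3:
Master: no device present
Slave: ad7 <SanDisk SSD P4 8GB/SSD 8.10> Serial ATA v1.0 II
"""


def present_ata_devices(atacontrol_output):
    """Map (channel, role) to the device name for every populated slot."""
    devices = {}
    channel = None
    for raw_line in atacontrol_output.splitlines():
        line = raw_line.strip()
        header = re.match(r"ATA channel (\d+):", line)
        if header:
            channel = int(header.group(1))
            continue
        slot = re.match(r"(Master|Slave):\s+(\S+)", line)
        if slot and channel is not None and not line.endswith("no device present"):
            devices[(channel, slot.group(1).lower())] = slot.group(2)
    return devices


def missing_boot_drives(atacontrol_output):
    """Return the expected boot drives that do not appear in the output."""
    found = present_ata_devices(atacontrol_output)
    return [name for slot, name in EXPECTED_BOOT_DRIVES.items()
            if found.get(slot) != name]


if __name__ == "__main__":
    print(missing_boot_drives(HEALTHY_SAMPLE))  # [] when both drives are present
```

If the returned list contains both ad4 and ad7, treat this the same way as the caution above: do not continue, and contact support.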
3. [ ] If both drives appear to be healthy, one of the drives may have partially failed. To identify a
partially failed drive, check the status of the individual partition mirrors by typing the following
command:
gmirror status
From left to right, the output displays the name of each mirror, the status of the mirror relationship,
and the component IDs for each boot drive.
The following example shows the boot drive partition layout in a healthy node. The mirrors for each
partition show:
a value of COMPLETE in the Status column.
the component IDs for both boot drives in the Components column. The component IDs are a
combination of the OneFS Drive ID, and the partition number (the number following the letter p).
Both boot drives are listed for each mirror with the exception of the var-crash mirror, which only
lists the slave drive.
Note: If you are running OneFS 8.0 or later, your OneFS Drive IDs will display as ada0 or ada1. The
partition numbers in the display may differ from the following example.
The following example shows the boot drive partition layout as it appears in the event of a failed boot
drive. A failed boot drive forces the mirrors for a partition to show:
a value of DEGRADED in the Status column.
only the component ID of the healthy boot drive in the Components column. The failed boot drive
does not appear.
Note: DEGRADED does not refer to a specific drive, but to the mirror relationship between the drives.
If a drive appears in the Components column next to the DEGRADED status, it is healthy and should
not be removed.
In the previous example, ad7p4 is missing from the degraded partition mirror/root0, and ad7p6 is
missing from the degraded partition mirror/var0. The missing drive, ad7, is the failed drive.
Determine which drive has failed. Use the previous table to determine which board drive slot contains
the failed boot drive and make a note of the number (J3 or J4).
Note: If both drives have failed, do not continue. Contact Isilon Product Support.
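The reasoning above (a DEGRADED mirror lists only the healthy drive, so the failed drive is the one that is absent) can be sketched in Python. This is an illustration under stated assumptions: the sample text is reconstructed from the description in this procedure and is not captured output, the layout is assumed to be a Name, Status, and Components row followed by optional indented rows for additional components, and the function names are hypothetical. For OneFS 8.0 or later, pass ("ada0", "ada1") as the boot drive names.

```python
# Illustrative input only: reconstructed from the description in this
# procedure (ad7p4 missing from mirror/root0, ad7p6 missing from mirror/var0).
ILLUSTRATIVE_SAMPLE = """\
Name          Status    Components
mirror/root0  DEGRADED  ad4p4
mirror/var0   DEGRADED  ad4p6
"""


def parse_gmirror_status(text):
    """Return {mirror name: (status, [component IDs])}."""
    mirrors = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "Name":
            continue  # skip blank lines and the header row
        if fields[0].startswith("mirror/"):
            current = fields[0]
            mirrors[current] = (fields[1], fields[2:3])
        elif current is not None:
            # Continuation row: an additional component of the same mirror.
            mirrors[current][1].append(fields[0])
    return mirrors


def failed_drive_candidates(mirrors, boot_drives=("ad4", "ad7")):
    """Return boot drives that are absent from at least one DEGRADED mirror."""
    suspects = set()
    for status, components in mirrors.values():
        if status != "DEGRADED":
            continue
        # A component ID is the drive ID plus "p" and the partition number.
        listed = {component.split("p")[0] for component in components}
        suspects.update(set(boot_drives) - listed)
    return sorted(suspects)


if __name__ == "__main__":
    mirrors = parse_gmirror_status(ILLUSTRATIVE_SAMPLE)
    print(failed_drive_candidates(mirrors))  # ['ad7']
```

Only DEGRADED mirrors are examined, so a COMPLETE mirror that normally lists a single drive, such as var-crash, does not produce a false result. If the function returns both drives, do not continue; contact support as stated in the note above.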
Procedure
1. [ ] Connect to an available node in the cluster with a serial cable or network drop.
2. [ ] Determine the IP address of the node you are powering down by typing the command:
isi status -q
3. [ ] From the node that you connected to, open a secure shell (SSH) connection to the node that is
to be shut down by typing the command:
ssh <node_ip_address>
4. [ ] Power down the node by typing the following command:
shutdown -p now
If the node does not respond to the shutdown command, press the Power button on the node three
times, and then wait five minutes. If the node still does not shut down, you are at risk of losing data.
Do not proceed. Contact EMC Isilon Technical Support for assistance.
CAUTION: A forced power down should be attempted only if a node is unresponsive. Forcing the
power down of a healthy node can result in data loss.
5. [ ] Verify that the node is powered down by typing the following command:
isi status -q
Confirm that the node has a status of D--R (Down, Read Only). See node 3 in the following example.
ID |IP Address |DASR| In Out Total| Used / Size |Used / Size
---+---------------+----+-----+-----+-----+------------------+-
1|10.53.217.201 | OK | 48M| 0| 48M| 19G/ 6.2T(< 1%)|(No SSDs)
2|10.53.217.202 | OK | 46M| 0| 46M| 23G/ 6.2T(< 1%)|(No SSDs)
3|10.53.217.203 |D--R| n/a| n/a| n/a| n/a/ n/a( n/a)|n/a/n/a( n/a)
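The status check in step 5 can also be done on captured output. The following Python sketch is illustrative only: it assumes the isi status -q rows are separated by the | character exactly as in the example above, with the node ID, IP address, and DASR flags in the first three columns. The function names are hypothetical.

```python
# Copy of the example output shown in this procedure.
STATUS_SAMPLE = """\
ID |IP Address |DASR| In Out Total| Used / Size |Used / Size
---+---------------+----+-----+-----+-----+------------------+-
1|10.53.217.201 | OK | 48M| 0| 48M| 19G/ 6.2T(< 1%)|(No SSDs)
2|10.53.217.202 | OK | 46M| 0| 46M| 23G/ 6.2T(< 1%)|(No SSDs)
3|10.53.217.203 |D--R| n/a| n/a| n/a| n/a/ n/a( n/a)|n/a/n/a( n/a)
"""


def node_health(status_output):
    """Map node ID to (IP address, DASR flags) for each node row."""
    nodes = {}
    for line in status_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Node rows start with a numeric ID; header and rule lines do not.
        if len(fields) >= 3 and fields[0].isdigit():
            nodes[int(fields[0])] = (fields[1], fields[2])
    return nodes


def is_powered_down(status_output, node_id):
    """Return True if the node reports D--R (Down, Read Only)."""
    entry = node_health(status_output).get(node_id)
    return entry is not None and entry[1] == "D--R"


if __name__ == "__main__":
    print(is_powered_down(STATUS_SAMPLE, 3))  # True
    print(is_powered_down(STATUS_SAMPLE, 1))  # False
```

A node that is missing from the output returns False, so an absent row is never mistaken for a confirmed shutdown.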
Note: If there are transceivers connected to the end of your IB or ethernet cables, make sure to
remove them with the cables. If you are using fiber ethernet cables, you will need to disconnect the
cable from the transceiver, then remove the transceiver from the node.
DANGER: Slide the node out from the rack slowly. Do not extend the rails completely until you
confirm that the node is latched and safely secured to the rails.
WARNING: Properly ground yourself to prevent electrostatic discharge from damaging the node. For
example, attach an ESD strap to your wrist and the node chassis.
Procedure
1. [ ] Loosen the captive screw that secures the node top panel.
2. [ ] Slide the top panel toward the rear of the node, and then lift the top panel to access the node
interior.
Figure 1 Cross bracket
2. [ ] Remove the cross bracket by pressing on both sides of the node chassis where the cross
bracket is connected, then lift the bracket out of the node.
1. Boot drive
Procedure
1. [ ] Locate the two board drive slots that contain the boot drives. The slots are labeled J3 and J4.
Gently pull the failed boot drive from the board drive slot.
1. J3 connector 2. J4 connector
2. [ ] Insert the replacement boot drive into the empty boot drive slot and gently press down to secure
the drive.
WARNING: The cross bracket sits directly above the boot drives. Use caution when installing the cross
bracket so that the boot drives are not dislodged or damaged.
WARNING: The chassis intrusion switch can be damaged if the top panel is slid too far back on the
node.
2. [ ] Tighten the captive top panel screw to secure the top panel to the node.
WARNING: Slide the node slowly so you do not slam the node into the rack and damage the node.
2. [ ] Reconnect the ethernet, InfiniBand, and power cables to the back of the node.
3. [ ] Secure the node to the rack cabinet.
4. [ ] Replace the node front panel.
2. [ ] Disconnect the InfiniBand cables from the back of the node.
3. [ ] Connect directly to the node using a serial cable.
Note: If you are running OneFS 8.0 or later, your OneFS Drive IDs will display as ada0 or ada1. The
partition numbers in the display may differ from the following example.
Confirm that the values in the Status column all read COMPLETE.
2. [ ] Verify boot drive information. Depending on your version of OneFS, run one of the following
commands:
OneFS 8.0 or later
camcontrol devlist | grep ad
The following information appears:
<SanDisk SSD P4 8GB SSD 8.10> at scbus1 target 0 lun 0 (pass13,ada0)
<SanDisk SSD P4 8GB SSD 8.10> at scbus2 target 1 lun 0 (pass14,ada1)
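To confirm that both boot drives are listed after the replacement, the camcontrol output can be checked with a short script. The Python sketch below is illustrative only: it assumes each row follows the layout of the example above (model in angle brackets, bus address, then device names in parentheses), and the function names are hypothetical.

```python
import re

# Copy of the example output shown in this procedure.
DEVLIST_SAMPLE = """\
<SanDisk SSD P4 8GB SSD 8.10> at scbus1 target 0 lun 0 (pass13,ada0)
<SanDisk SSD P4 8GB SSD 8.10> at scbus2 target 1 lun 0 (pass14,ada1)
"""

DEVLIST_ROW = re.compile(
    r"^<(?P<model>[^>]+)>\s+at\s+scbus\d+\s+target\s+\d+\s+lun\s+\d+\s+"
    r"\((?P<names>[^)]+)\)$"
)


def boot_devices(devlist_output):
    """Map each adaN device name to the drive model reported for it."""
    found = {}
    for line in devlist_output.splitlines():
        row = DEVLIST_ROW.match(line.strip())
        if row is None:
            continue
        for name in row.group("names").split(","):
            if re.fullmatch(r"ada\d+", name):
                found[name] = row.group("model")
    return found


def missing_boot_devices(devlist_output, expected=("ada0", "ada1")):
    """Return the expected boot drives that are not listed."""
    found = boot_devices(devlist_output)
    return [name for name in expected if name not in found]


if __name__ == "__main__":
    print(missing_boot_devices(DEVLIST_SAMPLE))  # [] when both drives are listed
```

An empty result means that both ada0 and ada1 are reported; any name in the result is a drive that the node does not see.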
Note: It is not necessary to update CTO and as-built information on non-CTO nodes. If you are
completing maintenance on a non-CTO node, skip all steps related to the FRU package.
Note: If your cluster is running in SmartLock compliance mode with OneFS 7.0.2.10 or later, 7.0.1.4 or
later, or 7.1.1.0 or later you will need to enter the provided compliance mode commands to run the FRU
scripts. If your cluster is running in compliance mode but is not running one of these versions, you will
need to upgrade your OneFS version to support the compliance mode commands. Contact Isilon
Technical Support.
2. [ ] Unpack the FRU package by running the following command:
tar -zxvf IsiFru_Package_<date-time-stamp>.tgz
3. [ ] Type cd to change to the directory containing the FRU tar.
4. [ ] Install the package. Depending on your version of OneFS, run one of the following commands:
OneFS 8.0 or later
isi upgrade patches install IsiFru_Package_<date-time-stamp>_.tar
Earlier than OneFS 8.0
isi pkg install IsiFru_Package_<date-time-stamp>.tar
As the package installs, the following message appears:
Preparing to install the package...
Checking the package for installation...
Installing the package
Committing the installation...
Package is committed.
Note: You must include the period at the end of the command.
Sending an ABR to Isilon with no connectivity
If no external connectivity is available, the As Built Record on a Configure to Order (CTO) node cannot be
automatically delivered to Isilon Technical Support.
If external connectivity is available, the ABR is automatically generated and delivered to Isilon Technical
Support. If there is no external connectivity available, you must generate and copy the ABR from the
node, and then send the ABR to Isilon Technical Support through an alternate connection.
This procedure explains how to update drive firmware on clusters running OneFS 7.1.1 or later. If your
cluster is running an earlier version of OneFS, you must download and install the latest drive firmware
package. For more information, see the latest drive firmware package release notes available on
https://support.emc.com/.
Note: Do not restart or power off nodes while drive firmware is being updated on the cluster.
Procedure
1. [ ] Open a secure shell (SSH) connection to any node in the cluster and log in.
2. [ ] Depending on your version of OneFS, run one of the following commands to update the drive
firmware for your cluster:
OneFS 8.0 or later
To update the drive firmware for your entire cluster, run the following command:
isi devices drive firmware update start all --node-lnn all
To update the drive firmware for a specific node only, run the following command:
isi devices drive firmware update start all --node-lnn <node-number>
OneFS 7.1.1 - OneFS 8.0
For OneFS versions between 7.1.1 and 8.0, run the following command on each node that requires a
drive firmware update:
isi devices -a fwupdate
CAUTION: You must wait for one node to finish updating before you initiate an update on the next
node. To confirm that a node has finished updating, run the following command:
isi devices -d <node-number>
Updating the drive firmware of a single drive takes approximately 15 seconds, depending on the drive
model. OneFS updates drives sequentially.
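Because drives are updated one at a time, a rough lower bound on the update time for a node is the number of drives multiplied by the time per drive. The Python sketch below only illustrates that arithmetic. The 15-second figure is the approximate value stated above, and the drive count in the example is a hypothetical input, not a statement about any particular node.

```python
def estimated_update_seconds(drive_count, seconds_per_drive=15):
    """Rough time for sequential drive firmware updates on one node.

    seconds_per_drive defaults to the approximate figure in this procedure;
    the actual time depends on the drive model.
    """
    if drive_count < 0:
        raise ValueError("drive_count must not be negative")
    return drive_count * seconds_per_drive


if __name__ == "__main__":
    # Hypothetical example: 12 drives need updating on the node.
    print(estimated_update_seconds(12))  # 180
```

Treat the result as an estimate only, and always confirm completion with the command given in the caution above before you start the next node.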
OneFS 8.0 or later
isi devices drive firmware list
Earlier than OneFS 8.0
isi drivefirmware status
If all drives have been updated, the Desired FW column is empty.
3. [ ] Verify that all affected drives are operating in a healthy state by running the following command:
OneFS 8.0 or later
isi devices drive list --node-lnn all
Earlier than OneFS 8.0
isi devices
If a drive is operating in a healthy state, [HEALTHY] appears in the status column.
Online Support
    Live Chat
    Create a Service Request
Telephone Support
    United States: 1-800-SVC-4EMC (800-782-4362)
    Canada: 800-543-4782
    Worldwide: +1-508-497-7901
    For local phone numbers for a specific country, see EMC Customer Support Centers.
Help with Online Support
    For questions specific to EMC Online Support registration or access, email support@emc.com.
Isilon Info Hubs
    For the list of Isilon info hubs, see the Isilon Info Hubs page on the EMC Isilon Community
    Network. Isilon info hubs organize Isilon documentation, videos, blogs, and user-contributed
    content into topic areas, making it easy to find content about subjects that interest you.