
ONTAP Cluster Fundamentals

© 2018 NetApp, Inc. All rights reserved. Legal Notices

Welcome to ONTAP Cluster Fundamentals.

1
The ONTAP Cluster Fundamentals course:
▪ Is for cluster administrators of any experience level
▪ Is divided into five modules:
▪ Clusters
▪ Management
▪ Networking
▪ Storage Virtual Machines
▪ Maintenance

▪ Is followed by a final assessment

The ONTAP Cluster Fundamentals course is written for cluster administrators of any
experience level. The course is divided into five modules, with each module based on
a specific topic. The course is followed by a final assessment.

2
Foundational courses:
▪ ONTAP Cluster Fundamentals
▪ ONTAP NAS Fundamentals
▪ ONTAP SAN Fundamentals
▪ ONTAP Data Protection Fundamentals

Intermediate courses:
▪ ONTAP Cluster Administration
▪ ONTAP NFS Administration
▪ ONTAP SMB Administration
▪ ONTAP SAN Administration
▪ ONTAP Data Protection Administration
▪ ONTAP Compliance Solutions Administration

Each course of the ONTAP 9 Data Management Software training focuses on a particular topic. You build your knowledge as you progress up the foundational column, so you should take the fundamentals courses in the order shown. Likewise, you build your knowledge as you progress up the intermediate column. The foundational courses are prerequisites for the intermediate courses. The courses are color coded to enable you to identify the relationships. For example, the ONTAP NAS Fundamentals, ONTAP NFS Administration, and ONTAP SMB Administration courses focus on NAS.

The location marker indicates the course that you are attending. You should complete
this course before you attend the ONTAP Cluster Administration course.

3
How to Complete This Course
ONTAP Cluster Fundamentals Pre-Assessment
▪ If you achieved 80% or greater:
▪ Review any of the ONTAP Cluster Fundamentals modules (optional)
▪ Take the final assessment
▪ If you received a list of recommended course modules:
▪ Study the recommended course modules, or study all course modules
▪ Take the final assessment

If you achieved 80% or greater on all the modules of the ONTAP Cluster Fundamentals Pre-Assessment, you are welcome to review any of the ONTAP Cluster Fundamentals modules, or you can go directly to the final assessment.
If you did not achieve 80% or greater on all the modules, you received a list of
recommended course modules. At a minimum, you should study the recommended
course modules, but you are encouraged to study all five. Then take the final
assessment to complete the course.

4
ONTAP Cluster Fundamentals:
Clusters

© 2018 NetApp, Inc. All rights reserved. Legal Notices

Welcome to ONTAP Cluster Fundamentals: Clusters.

5
Course Modules
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
5. Maintenance

The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.

This module was written for cluster administrators and provides an introduction to the
concept of a cluster.

6
About This Module

This module focuses on enabling you to do the following:
▪ Identify the components that make up a cluster
▪ Describe the cluster configurations that are supported
▪ Create and configure a cluster
▪ Describe the physical storage components
▪ Describe the Write Anywhere File Layout (WAFL) file system

This module identifies and describes the components that make up a cluster. The
module also describes the supported cluster configurations and details the steps that
are required to create and configure a cluster. Then the module discusses the
physical storage components and the Write Anywhere File Layout file system, also
known as the WAFL file system.

7
NetApp ONTAP Is the Foundation for Your Data Fabric

[Diagram: the Data Fabric provides data mobility and seamless data management across departments or remote offices, the on-premises data center, and off-premises clouds.]

Data Fabric powered by NetApp weaves hybrid cloud mobility with uniform data
management.

Data Fabric seamlessly connects multiple data-management environments across disparate clouds into a cohesive, integrated whole. Organizations maintain control over managing, securing, protecting, and accessing data across the hybrid cloud, no matter where the data is located. IT has the flexibility to choose the right set of resources and the freedom to change the resources whenever necessary. NetApp works with new and existing partners to continually add to the fabric.

For more information about Data Fabric, see the Welcome to Data Fabric video. A link
to this video is available in the Resources section.

8
Lesson 1
Cluster Components

Lesson 1, Cluster Components.

9
Harness the Power of the Hybrid Cloud

▪ Simplify data management for


any application, anywhere
▪ Accelerate and protect data across
the hybrid cloud
▪ Future-proof your data infrastructure

This lesson introduces NetApp ONTAP 9 data management software and the
components that make up a cluster.

A basic knowledge of the components helps you to understand how ONTAP can
simplify the transition to the modern data center.

10
Clusters

[Diagram: FAS and All Flash FAS nodes joined by a cluster interconnect.]

For product specifications, see the Hardware Universe: hwu.netapp.com

You might be wondering, “What exactly is a cluster?” To answer that question, this
lesson examines the components individually, but begins with a high-level view.

A cluster is one or more FAS controllers or All Flash FAS controllers that run ONTAP.
A controller running ONTAP is called a “node.” In clusters with more than one node, a
cluster interconnect is required so that the nodes appear as one cluster.

A cluster can be a mix of various FAS and All Flash FAS models, depending on the
workload requirements. Also, nodes can be added to or removed from a cluster as
workload requirements change. For more information about the number and types of
nodes, see the Hardware Universe at hwu.netapp.com. A link is provided in the
module resources.

11
Nodes
What a node consists of:
▪ A FAS or All Flash FAS controller running ONTAP software:
▪ Network ports
▪ Expansion slots
▪ Nonvolatile memory (NVRAM or NVMEM)
▪ Disks

For product specifications, see the Hardware Universe.

[Diagram: a node consists of a controller and a disk shelf.]

A node consists of a FAS controller or an All Flash FAS controller that is running
ONTAP software. The controller contains network ports, expansion slots, and
NVRAM or NVMEM. Disks are also required. The disks can be internal to the
controller or in a disk shelf.

For information about specific controller models, see the product documentation on
the NetApp Support site, or see the Hardware Universe.
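As a quick illustration only, you can list the nodes in a cluster and their controller models from the clustershell. The cluster and field names below are hypothetical examples, and the available fields vary by ONTAP version.

   cluster1::> system node show -fields model,uptime,health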

12
High-Availability Pairs

▪ Characteristics of high-availability (HA) pairs:
▪ Two connected nodes that form a partnership
▪ Connections to the same disk shelves
▪ Ability of the surviving node to take control of the failed partner’s disks

▪ Components of HA pair connections:
▪ HA interconnect
▪ Multipath HA shelf connectivity
▪ Cluster interconnect connectivity

[Diagram: a FAS8060 with an internal interconnect; nodes 1 and 2 are connected to disk shelves 1 and 2.]

In multinode clusters, high-availability (HA) pairs are used. An HA pair consists of two nodes that are connected to form a partnership. The nodes of the pair are connected to the same shelves. Each node owns its disks. However, if either of the nodes fails, the partner node can control all the disks, both its own and its partner's.

The controllers in the nodes of an HA pair are connected either through an HA interconnect that consists of adapters and cables or through an internal interconnect. In this example, the FAS8060 model uses an internal interconnect. The nodes must be connected to the same shelves using redundant paths. The nodes also need to be connected to a cluster interconnect, even if the cluster is composed of only one HA pair.
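For reference, a hedged sketch of checking takeover readiness from the clustershell follows; the cluster and node names are hypothetical.

   cluster1::> storage failover show
   cluster1::> storage failover show -node cluster1-01 -fields possible,state-description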

13
Networks
▪ Cluster interconnect:
▪ Connection of nodes
▪ Private network

▪ Management network:
▪ For cluster administration
▪ Management and data may be on a shared Ethernet network

▪ Data network:
▪ One or more networks that are used for data access from clients or hosts
▪ Ethernet, FC, or converged network

Clusters require one or more networks, depending on the environment.

In multinode clusters, nodes need to communicate with each other over a cluster
interconnect. In a two-node cluster, the interconnect can be switchless. When more
than two nodes are added to a cluster, a private cluster interconnect using switches is
required.

The management network is used for cluster administration. Redundant connections to the management ports on each node and the management ports on each cluster switch should be provided to the management network. In smaller environments, the management and data networks might be on a shared Ethernet network.

For clients and hosts to access data, a data network is also required. The data network can be composed of one or more networks that are primarily used for data access by clients or hosts. Depending on the environment, there might be an Ethernet, FC, or converged network. These networks can consist of one or more switches, or even redundant networks.

14
Ports and Logical Interfaces

[Diagram: layered networking stack]
Logical: Logical interface (LIF) – svm1-mgmt, svm1-data1
Virtual: Virtual LAN (VLAN) – a0a-50, a0a-80; Interface group – a0a
Physical: Port – e2a, e3a

Nodes have various physical ports that are available for cluster traffic, management
traffic, and data traffic. These ports need to be configured appropriately for the
environment.

Ethernet ports can be used directly or combined by using interface groups. Also,
physical Ethernet ports and interface groups can be segmented by using virtual
LANs, or VLANs. Interface groups and VLANs are called virtual ports, and virtual
ports are treated similarly to physical ports.

A logical interface, or LIF, represents a network access point to a node in the cluster.
A LIF can be associated with a physical port, an interface group, or a VLAN to
interface with the management network or data network.
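To illustrate how these layers build on one another, the following clustershell sketch creates an interface group, a VLAN on that group, and a data LIF on the VLAN. The node, SVM, port, and address values are hypothetical, and exact options vary by ONTAP version.

   cluster1::> network port ifgrp create -node cluster1-01 -ifgrp a0a -distr-func ip -mode multimode_lacp
   cluster1::> network port ifgrp add-port -node cluster1-01 -ifgrp a0a -port e2a
   cluster1::> network port vlan create -node cluster1-01 -vlan-name a0a-50
   cluster1::> network interface create -vserver svm1 -lif svm1-data1 -role data -home-node cluster1-01 -home-port a0a-50 -address 192.168.50.10 -netmask 255.255.255.0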

15
ONTAP Storage Architecture

[Diagram: dynamic virtualization engine]
Logical layer: files and LUNs in FlexVol volumes
Physical layer: aggregate built from RAID groups of disks

The ONTAP storage architecture uses a dynamic virtualization engine, where data
volumes are dynamically mapped to physical space.

Disks are grouped into RAID groups. An aggregate is a collection of physical disk
space that contains one or more RAID groups. Each aggregate has a RAID
configuration and a set of assigned disks. The disks, RAID groups, and aggregates
make up the physical storage layer.

Within each aggregate, you can create one or more FlexVol volumes. A FlexVol
volume is an allocation of disk space that is a portion of the available space in the
aggregate. A FlexVol volume can contain files or LUNs. The FlexVol volumes, files,
and LUNs make up the logical storage layer.

16
Physical Storage
▪ Disk:
▪ Disk ownership can be assigned to one controller.
▪ A disk can be used as a spare or added to a
RAID group.

▪ RAID group:
▪ A RAID group is a collection of disks.
▪ Data is striped across the disks.

▪ Aggregate:
▪ One or more RAID groups can be used to form
an aggregate.
▪ An aggregate is owned by one controller.

There are three parts that make up the physical storage on a node.

When a disk enters the system, the disk is unowned. Ownership is automatically or
manually assigned to a single controller. After ownership is assigned, a disk will be
marked as spare until the disk is used to create an aggregate or added to an existing
aggregate.

A RAID group is a collection of disks across which client data is striped and stored.

To support the differing performance and data sharing needs, you can group the
physical data storage resources into one or more aggregates. Aggregates can contain
one or more RAID groups, depending on the desired level of performance and
redundancy. Although aggregates can be owned by only one controller, aggregates
can be relocated to the HA partner for service or performance reasons.
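As a hedged example, aggregate relocation between HA partners can be started from the clustershell; the node and aggregate names below are hypothetical.

   cluster1::> storage aggregate relocation start -node cluster1-01 -destination cluster1-02 -aggregate-list aggr1_data
   cluster1::> storage aggregate relocation show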

17

Logical Storage
▪ Storage virtual machine (SVM):
▪ Container for data volumes
▪ Client data is accessed through a LIF

▪ Volume:
▪ Logical data container for files or LUNs
▪ ONTAP provides three types of volumes: FlexVol volumes, FlexGroup volumes, and Infinite volumes

▪ LIF:
▪ Representation of the network address that is associated with a port
▪ Access to client data

[Diagram: an SVM with FlexVol volumes in a cluster; clients access data through a data LIF.]

A storage virtual machine, or SVM, contains data volumes and logical interfaces, or
LIFs. The data volumes store client data which is accessed through a LIF.

A volume is a logical data container that might contain files or LUNs. ONTAP software
provides three types of volumes: FlexVol volumes, FlexGroup volumes, and Infinite
volumes. Volumes contain file systems in a NAS environment and LUNs in a SAN
environment.

A LIF represents the IP address or worldwide port name (WWPN) that is associated
with a port. Data LIFs are used to access client data.

18
SVM with FlexVol Volumes
▪ FlexVol volume:
▪ Representation of the file system in a NAS environment
▪ Container for LUNs in a SAN environment

▪ Qtree:
▪ Partitioning of FlexVol volumes into smaller segments
▪ Management of quotas, security style, and CIFS opportunistic lock (oplock) settings

▪ LUN: Logical unit that represents a SCSI disk

[Diagram: an SVM in a cluster with a FlexVol volume that contains qtrees Q1–Q3 for client access and a LUN for host access, each reached through a data LIF.]

An SVM can contain one or more FlexVol volumes. In a NAS environment, volumes
represent the file system where clients store data. In a SAN environment, a LUN is
created in the volumes for a host to access.

Qtrees can be created to partition a FlexVol volume into smaller segments, much like
directories. Qtrees can also be used to manage quotas, security styles, and CIFS
opportunistic lock settings, or oplock settings.

A LUN is a logical unit that represents a SCSI disk. In a SAN environment, the host
operating system controls the reads and writes for the file system.
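For illustration, a qtree and a LUN might be created inside a FlexVol volume with commands like the following sketch. The SVM, volume, and path names are hypothetical, and exact options vary by ONTAP version.

   cluster1::> volume qtree create -vserver svm1 -volume vol1 -qtree q1 -security-style unix
   cluster1::> lun create -vserver svm1 -path /vol/vol1/lun1 -size 100GB -ostype linux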

19

FlexGroup Volumes
▪ A scale-out NAS container constructed from a group of FlexVol volumes, which are called “constituents.”
▪ Constituents are placed evenly across the cluster to automatically and transparently share a traffic load.

FlexGroup volumes provide the following benefits:
▪ High scalability: essentially unlimited
▪ Performance: consistently low latency
▪ Manageability: visually the same as FlexVol volumes

In addition to containing FlexVol volumes, an SVM can contain one or more FlexGroup volumes. A FlexGroup volume is a scale-out NAS container that leverages the cluster resources to provide performance and scale. A FlexGroup volume contains a number of constituents that automatically and transparently share a traffic load.

FlexGroup volumes provide several benefits:
• High scalability: The maximum size for a FlexGroup volume in ONTAP 9.1 and later is 20 PB, with 400 billion files on a 10-node cluster.
• Performance: FlexGroup volumes can leverage the resources of an entire cluster to serve high-throughput and low-latency workloads.
• Manageability: A FlexGroup volume is a single namespace container that enables simplified management that is similar to the management capability provided by FlexVol volumes.

For more information about FlexGroup volumes, see the Scalability and Performance
Using FlexGroup Volumes Power Guide.
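As a hedged sketch, a FlexGroup volume can be provisioned across several aggregates in one command; the SVM, aggregate, and volume names and the sizes below are hypothetical, and the available options depend on the ONTAP release.

   cluster1::> volume create -vserver svm1 -volume fg1 -aggr-list aggr1_node1,aggr1_node2 -aggr-list-multiplier 4 -size 100TB -junction-path /fg1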

20
SVM with Infinite Volume
▪ Infinite Volume:
▪ One scalable volume that can store up to 2 billion files and tens of petabytes of data
▪ Several constituents

▪ Constituent roles:
▪ The data constituents store data.
▪ The namespace constituent tracks file names, directories, and the file's physical data location.
▪ The namespace mirror constituent is a data protection mirror copy of the namespace constituent.

[Diagram: an SVM in a cluster with an Infinite Volume that is accessed through a data LIF; the volume is made up of data constituents, a namespace constituent, and a namespace mirror constituent.]

An SVM can contain one infinite volume. An infinite volume appears to a NAS client
as a single, scalable volume that can store up to 2 billion files and tens of petabytes of
data. Each infinite volume consists of several, typically dozens, of separate
components called constituents.

Constituents play one of various roles.

The data constituents, shown on the slide in blue, store the file’s physical data.
Clients are not aware of the data constituents and do not interact directly with them.
When a client requests a file from an infinite volume, the node retrieves the file's data
from a data constituent and returns the file to the client.

Each infinite volume has one namespace constituent, shown on the slide in green.
The namespace constituent tracks file names, directories, and the file's physical data
location. Clients are also not aware of the namespace constituent and do not interact
directly with the namespace constituent.

A namespace mirror constituent, shown on the slide in red, is a data protection mirror
copy of the namespace constituent. It provides data protection of the namespace
constituent and support for incremental tape backup of infinite volumes.

For more information about infinite volumes, see the Infinite Volumes Management
Guide.

21
Knowledge Check
▪ Match each term with the term’s function.

Cluster Provides seamless scalability

Node Controls its physical storage and network resources

HA pair Provides availability of partner’s physical resources during a node failover

Aggregate A collection of RAID groups

SVM Owns its logical storage and network resources

FlexVol Volume Represents a file system

LIF Provides a network access point to an SVM

Match each term with the term’s function.

22
Knowledge Check
▪ Which three are network types? (Choose three.)
▪ Cluster interconnect
▪ Management network
▪ Data network
▪ HA network

Which three are network types?

23
Lesson 2
Cluster Configurations

Lesson 2, Cluster Configurations.

24
Consolidate Across Environments with ONTAP 9
Simplify data management for any application, anywhere

[Diagram: ONTAP 9 provides common data management across deployment options: storage array, converged, heterogeneous, SDS, near cloud, and cloud.]
SDS = software-defined storage

ONTAP is mostly known as the data management software that runs on FAS and All
Flash FAS controllers. ONTAP 9 has many deployment options to choose from.

ONTAP can be deployed on engineered systems, which includes FAS and All Flash
FAS; converged systems, which includes FAS and All Flash FAS as part of a FlexPod
solution; third-party or E-Series storage arrays that use FlexArray virtualization
software; or near the cloud with NetApp Private Storage (NPS), which uses FAS or All
Flash FAS systems.

ONTAP can also be deployed on commodity hardware as software-defined storage using ONTAP Select, or in the cloud using ONTAP Cloud.

Whichever deployment type you choose, you manage ONTAP in much the same
way, for a variety of applications. Although the ONTAP Cluster Fundamentals course
focuses on ONTAP clusters using FAS or All Flash FAS, the knowledge is also
applicable to all the deployment options.

25
Supported Cluster Configurations

Single-Node

Two-Node Switchless Multinode Switched

MetroCluster

NetApp supports single-node configurations, two-node switchless configurations, multinode switched configurations, and MetroCluster configurations.

26

Single-Node Cluster
▪ Single-node cluster:
▪ Special implementation of a cluster that runs on a
standalone node
▪ Appropriate when your workload requires only one
node and does not need nondisruptive operations
▪ Use case: Data protection for a remote office

▪ Features and operations that are not supported:
▪ Storage failover and cluster high availability
▪ Multinode operations

A single-node cluster is a special implementation of a cluster running on a standalone node. You can deploy a single-node cluster if your workload requires only one node and does not need nondisruptive operations. For example, you could deploy a single-node cluster to provide data protection for a remote office.

Some features and operations are not supported for single-node clusters. Because single-node clusters operate in a standalone mode, storage failover and cluster high availability are not available. If the node goes offline, clients cannot access data stored in the cluster. Also, any operation that requires more than one node cannot be performed. For example, you cannot move volumes, perform most copy operations, or back up cluster configurations to other nodes.

27
Understanding HA Pairs
▪ HA pairs provide hardware redundancy to
do the following:
▪ Perform nondisruptive operations and upgrades
▪ Provide fault tolerance
▪ Enable a node to take over its partner’s storage and
later give back the storage
▪ Eliminate most hardware components and cables as
single points of failure
▪ Improve data availability

HA pairs provide hardware redundancy that is required for nondisruptive operations and fault tolerance. The hardware redundancy gives each node in the pair the software functionality to take over its partner's storage and later give back the storage. These features also provide the fault tolerance required to perform nondisruptive operations during hardware and software upgrades or maintenance.

A storage system has various single points of failure, such as certain cables or
hardware components. An HA pair greatly reduces the number of single points of
failure. If a failure occurs, the partner can take over and continue serving data until
the failure is fixed. The controller failover function provides continuous data
availability and preserves data integrity for client applications and users.

28
HA Interconnect

[Diagram: Node 1 and Node 2 are joined by an HA interconnect; each node has a primary connection to its own storage and a standby connection to its partner's storage. Multipath HA redundant storage connections are not shown.]

Each node in an HA pair requires an HA interconnect between the controllers and connections to both its own disk shelves and its partner node's shelves.

This example uses a standard FAS8080 EX HA pair with native DS4246 disk shelves.
The controllers in the HA pair are connected through an HA interconnect that consists
of adapters and cables. When the two controllers are in the same chassis, adapters
and cabling are not required because connections are made through an internal
interconnection. To validate an HA configuration, use the Hardware Universe.

For multipath HA support, redundant primary and secondary connections are also
required. For simplicity, these connections are not shown on the slide. Multipath HA is
required on all HA pairs except for some FAS2500 series system configurations,
which use single-path HA and lack the redundant standby connections.

29
Two-Node Cluster Interconnect

In a two-node switchless cluster, ports are connected directly between nodes.

[Diagram: cluster interconnect ports on a FAS8060, with four onboard 10-GbE ports per controller.]

In clusters with more than one node, a cluster interconnect is required. This example
shows a FAS8060 system that has two controllers installed in the chassis. Each
controller has a set of four onboard 10-GbE ports that can be used to connect to the
cluster interconnect.

In a two-node switchless cluster, a redundant pair of these ports is cabled together as shown on this slide.

30
Switched Clusters

[Diagram: the cluster interconnect uses two cluster switches that are connected by Inter-Switch Links (ISLs).]

If your workload requires more than two nodes, the cluster interconnect requires
switches. The cluster interconnect requires two dedicated switches for redundancy
and load balancing. Inter-Switch Links (ISLs) are required between the two switches.
There should always be at least two cluster connections, one to each switch, from
each node. The required connections vary, depending on the controller model.

After the cluster interconnect is established, you can add more nodes as your
workload requires.

For more information about the maximum number and models of controllers
supported, see the Hardware Universe.

For more information about the cluster interconnect and connections, see the Network
Management Guide.

31
MetroCluster
Benefits of MetroCluster software:
▪ Zero data loss
▪ Failover protection
▪ Nondisruptive upgrades

MetroCluster uses mirroring to protect the data in a cluster. The MetroCluster continuous-availability and disaster recovery software delivers zero data loss, failover protection, and nondisruptive upgrades.

MetroCluster provides disaster recovery through one MetroCluster command. The command activates the mirrored data on the survivor site.

32
MetroCluster Configurations

Two-Node Configuration:
▪ Single-node cluster at each site
▪ Protects data on a cluster level

Four-Node Configuration:
▪ Two-node cluster at each site
▪ Protects data on a local level and a cluster level

Eight-Node Configuration:
▪ Four-node cluster at each site
▪ Protects data on a local level and a cluster level

[Diagram: each configuration shows Cluster A in Data Center A and Cluster B in Data Center B.]

There are various two-node, four-node and eight-node MetroCluster configurations.

In a two-node configuration, each site or data center contains a cluster that consists
of a single node. The nodes in a two-node MetroCluster configuration are not
configured as an HA pair. However, because all storage is mirrored, a switchover
operation can be used to provide nondisruptive resiliency similar to that found in a
storage failover in an HA pair.

In a four-node configuration, each site or data center contains a cluster that consists
of an HA pair. A four-node MetroCluster configuration protects data on a local level
and on a cluster level.

In an eight-node configuration, each site contains a four-node cluster that consists of two HA pairs. Like a four-node MetroCluster, an eight-node MetroCluster configuration protects data on both a local level and a cluster level.

For more information about the MetroCluster configurations, see the MetroCluster
Management and Disaster Recovery Guide.

33
Knowledge Check
▪ Which cluster configuration provides a cost-effective,
nondisruptively scalable solution?
▪ Single-node
▪ Two-node switchless
▪ Multinode switched
▪ MetroCluster

Which cluster configuration provides a cost-effective, nondisruptively scalable solution?

34
Knowledge Check
▪ What is the maximum number of cluster switches that can be used in a
multinode switched cluster configuration?
▪ One
▪ Two
▪ Three
▪ Four

What is the maximum number of cluster switches that can be used in a multinode
switched cluster configuration?

35
Lesson 3
Create and Configure a Cluster

Lesson 3, Create and Configure a Cluster.

36

Creating a Cluster
▪ Cluster creation methods:
▪ Cluster setup wizard, using the CLI
▪ Guided Cluster Setup, using OnCommand
System Manager

▪ The CLI method:
▪ Create the cluster on the first node.
▪ Join remaining nodes to the cluster.
▪ Configure the cluster time and AutoSupport.

▪ The Guided Cluster Setup method:
▪ Use your web browser.
▪ Use this link: https://<node-management-IP-address>

After installing the hardware, you can set up the cluster by using the cluster setup wizard (via
the CLI) or, in ONTAP 9.1 and later, by using the Guided Cluster Setup (via OnCommand
System Manager).

Before you set up a cluster, you should use a cluster setup worksheet to record the values that
you will need during the setup process. Worksheets are available on the NetApp Support
website.

Whichever method you choose, you begin by using the CLI to enter the cluster setup wizard
from a single node in the cluster. The cluster setup wizard prompts you to configure the node
management interface. Next, the cluster setup wizard asks whether you want to complete the
setup wizard by using the CLI.

If you press Enter, the wizard continues using the CLI to guide you through the configuration.
When you are prompted, enter the information that you collected on the worksheet. After
creating the cluster, you use the node setup wizard to join nodes to the cluster one at a time.
The node setup wizard helps you to configure each node's node-management interface.

It is recommended that, after you complete the cluster setup and add all the nodes, you
configure additional settings, such as the cluster time and AutoSupport.

If you choose to use the Guided Cluster Setup, instead of the CLI, use your web browser to
connect to the node management IP that you configured on the first node. When prompted,
enter the information that you collected on the worksheet. The Guided Cluster Setup discovers
all the nodes in the cluster and configures them at the same time.

For more information about setting up a cluster, see the Software Setup Guide.
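A minimal sketch of the CLI flow follows, assuming a hypothetical cluster name and NTP server; the wizard prompts and available options differ by ONTAP version.

   ::> cluster setup
   (follow the prompts to create the cluster on the first node, then run the wizard on each remaining node and choose to join the cluster)

   cluster1::> cluster show
   cluster1::> cluster time-service ntp server create -server time.example.com
   cluster1::> system node autosupport modify -node * -state enable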

37
Cluster Administration
▪ Cluster administrators administer
the entire cluster:
▪ All cluster resources
▪ SVM creation and management
▪ Access control and roles
▪ Resource delegation

▪ Login credentials:
▪ The default user name is “admin.”
▪ Use the password that was created
during cluster setup.

You access OnCommand System Manager through a web browser by entering the
cluster administration interface IP address that was created during cluster setup. You
log in as cluster administrator to manage the entire cluster. You manage all cluster
resources, the creation and management of SVMs, access control and roles, and
resource delegation.

To log in to the cluster, you use the default user name “admin” and the password that
you configured during cluster creation.

38
Managing Resources in a Cluster

OnCommand System Manager:
▪ Visual representation of the available resources
▪ Wizard-based resource creation
▪ Best-practice configurations
▪ Limited advanced operations

The CLI:
▪ Manual or scripted commands
▪ Manual resource creation that might require many steps
▪ Ability to focus and switch between specific objects quickly

There are many tools that can be used to create and manage cluster resources, each
with their own advantages and disadvantages. This slide focuses on two tools.

OnCommand System Manager is a web-based UI that provides a visual representation of the available resources. Resource creation is wizard-based and adheres to best practices. However, not all operations are available. Some advanced operations might need to be performed by using commands in the CLI. Also, the interface may change between ONTAP versions as new features are added.

The CLI can also be used to create and configure resources. Commands are entered
manually or through scripts. Instead of the wizards that are used in System Manager,
the CLI might require many manual commands to create and configure a resource.
Although manual commands give the administrator more control, manual commands
are also more prone to mistakes that can cause issues. One advantage of using the
CLI is that the administrator can quickly switch focus without having to move through
System Manager pages to find different objects.
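For example, creating a volume and checking it from the CLI might look like the following sketch; the SVM, aggregate, and path names are hypothetical.

   cluster1::> volume create -vserver svm1 -volume vol1 -aggregate aggr1_node1 -size 10GB -junction-path /vol1
   cluster1::> volume show -vserver svm1 -volume vol1 -fields size,aggregate,junction-path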

39
Knowledge Check
▪ In OnCommand System Manager, which user name do you use to
manage a cluster?
▪ admin
▪ administrator
▪ root
▪ vsadmin

In OnCommand System Manager, which user name do you use to manage a cluster?

40
Knowledge Check
▪ In the CLI, which user name do you use to manage a cluster?
▪ admin
▪ administrator
▪ root
▪ vsadmin

In the CLI, which user name do you use to manage a cluster?

41
Lesson 4
Physical Storage

Lesson 4, Physical Storage.

42
ONTAP Storage Architecture

[Diagram: logical layer (files and LUNs in FlexVol volumes) on top of the physical layer (aggregate built from RAID groups of disks).]

This lesson focuses on the physical storage layer. The physical storage layer consists
of disks, RAID groups, and the aggregate.

43
Disk Types

ONTAP Disk Type | Disk Class | Industry-Standard Disk Type | Description
BSAS | Capacity | SATA | Bridged SAS-SATA disks
FSAS | Capacity | NL-SAS | Near-line SAS
mSATA | Capacity | SATA | SATA disk in multidisk carrier storage shelf
SAS | Performance | SAS | Serial-attached SCSI
SSD | Ultra-performance | SSD | Solid-state drive
ATA | Capacity | SATA | FC-connected Serial ATA
FC-AL | Performance | FC | Fibre Channel
LUN | Not applicable | LUN | Array LUN
VMDISK | Not applicable | VMDK | Virtual Machine Disks that VMware ESX formats and manages

At the lowest level, data is stored on disks. The disks that are most commonly used
are SATA disks for capacity, SAS disks for performance, and solid-state drives, or
SSDs, for ultra-performance.

The Virtual Machine Disk, or VMDISK, is used in software-only versions of ONTAP,


for example, ONTAP Select.

The LUN disk type is not the same as a LUN that is created in a FlexVol volume. The
LUN disk type appears when the FlexArray storage virtualization software presents
an array LUN to ONTAP.

44
Identifying Disks

[Diagram: a DS4246 shelf with its shelf ID.]

SAS Disk Name = <stack_id>.<shelf_id>.<bay>
Example: 1.0.22

In all storage systems, disks are named to enable the quick location of a disk. The
example identifies disk 1.0.22 located in a DS4246 shelf.

ONTAP assigns the stack ID, which is unique across the cluster. The shelf ID is set
on the storage shelf when the shelf is added to the stack or loop. The bay is the
position of the disk within its shelf.
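You can confirm where a disk lives by querying it by name, as in this hedged sketch; the cluster name is hypothetical, and the available fields vary by ONTAP version.

   cluster1::> storage disk show -disk 1.0.22 -fields shelf,bay,container-type,owner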

45
Array LUNs
▪ Array LUNs are presented to ONTAP using FlexArray storage virtualization software:
▪ An array LUN is created on the enterprise storage array and presented to ONTAP.
▪ Array LUNs can function as hot spares or be assigned to aggregates.

▪ Array LUNs in an aggregate:
▪ Aggregates use RAID 0.
▪ Aggregates can contain only array LUNs.

[Diagram: an E-Series or enterprise storage array presents array LUNs, which are grouped into an aggregate.]

Like disks, array LUNs can be used to create an aggregate. With the FlexArray
storage virtualization software licenses, you enable an enterprise storage array to
present an array LUN to ONTAP. An array LUN uses an FC connection type.

The way that ONTAP treats an array LUN is similar to the way it treats a typical disk.
When array LUNs are in use, the aggregates are configured with RAID 0. RAID
protection for the array LUN is provided by the enterprise storage array, not ONTAP.
Also, the aggregate can contain only other array LUNs. The aggregate cannot contain
hard disks or SSDs.

For more information about array LUNs, see the FlexArray Virtualization
Implementation Guides.

46
Disks and Aggregates
▪ What happens when a disk is inserted into a system:
▪ The disk is initially “unowned.”
▪ By default, disk ownership is assigned automatically.
▪ Disk ownership can be changed.

▪ What happens after ownership is assigned:
▪ The disk functions as a hot spare.
▪ The disk can be assigned to an aggregate.

[Diagram: unowned disks become spare disks after ownership is assigned, and spare disks are used to build an aggregate.]

When a disk is inserted into a storage system’s disk shelf or a new shelf is added, the
disk is initially unowned. By default, the controller takes ownership of the disk. In an
HA pair, only one of the controllers can own a particular disk, but ownership can be
manually assigned to either controller.

After disk ownership is assigned, the disk functions as a spare disk.

When an aggregate is created or disks are added to an aggregate, the spare disks
are used.
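A hedged sketch of checking and assigning ownership from the clustershell follows; the disk and node names are hypothetical.

   cluster1::> storage disk show -container-type unassigned
   cluster1::> storage disk assign -disk 1.0.22 -owner cluster1-01
   cluster1::> storage aggregate show-spare-disks -original-owner cluster1-01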

47
RAID Groups
▪ Disks are added to RAID groups within an aggregate.
▪ Disks must be the same type:
▪ SAS, SATA, or SSD
▪ Array LUNs

▪ Disks should be the same speed and size:
▪ SAS speeds: 15K or 10K
▪ SATA speed: 7.2K

▪ You should always provide enough hot spares.

[Diagram: a RAID group with data disks, a parity disk, and a double-parity disk, plus hot spares.]

When an aggregate is created or disks are added to an aggregate, the disks are
grouped into one or more RAID groups. Disks within a RAID group protect each other
in the event of a disk failure. Disk failure is discussed on the next slide.

Disks within a RAID group or aggregate must be the same type and usually the same
speed.

You should always provide enough hot spares for each disk type. That way, if a disk
in the group fails, the data can be reconstructed on a spare disk.

48
RAID Types
▪ RAID 4:
▪ RAID 4 provides a parity disk to protect the data in the event of a single-disk failure.
▪ RAID 4 data aggregates require a minimum of three disks.

▪ RAID-DP:
▪ RAID-DP provides two parity disks to protect the data in the event of a double-disk failure.
▪ RAID-DP data aggregates require a minimum of five disks.

▪ RAID-TEC:
▪ RAID-TEC provides three parity disks to protect the data in the event of a triple-disk failure.
▪ RAID-TEC data aggregates require a minimum of seven disks.

[Diagram: data disks with a parity disk, a double-parity disk, and a triple-parity disk.]

Three primary RAID types are used in ONTAP: RAID 4, RAID-DP, and RAID-TEC.

RAID 4 provides a parity disk to protect data in the event of a single-disk failure. If a
data disk fails, the system uses the parity information to reconstruct the data on a
spare disk. When you create a RAID 4 data aggregate, a minimum of three disks are
required.

RAID-DP technology provides two parity disks to protect data in the event of a
double-disk failure. If a second disk fails or becomes unreadable during
reconstruction when RAID 4 is in use, the data might not be recoverable. With RAID-
DP technology, a second parity disk can also be used to recover the data. When you
create a RAID-DP data aggregate, a minimum of five disks are required. RAID-DP is
the default for most disk types.

RAID-TEC technology provides three parity disks to protect data in the event of a
triple-disk failure. As disks become increasingly larger, RAID-TEC can be used to
reduce exposure to data loss during long rebuild times. When you create a RAID-TEC
data aggregate, a minimum of seven disks are required. RAID-TEC is the default for
SATA and near-line SAS hard disks that are 6 TB or larger.
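For illustration, the RAID type can be specified when a data aggregate is created, as in this sketch; the aggregate and node names and the disk count are hypothetical.

   cluster1::> storage aggregate create -aggregate aggr1_data -node cluster1-01 -diskcount 7 -raidtype raid_tec
   cluster1::> storage aggregate show -aggregate aggr1_data -fields raidtype,size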

49
Aggregates
▪ Aggregates are composed of RAID groups that contain disks or array LUNs:
▪ All RAID groups must be the same RAID type.
▪ Aggregates contain the same disk type.

▪ Aggregates have a single copy of data, which is called a plex:
▪ A plex contains all the RAID groups that belong to the aggregate.
▪ Mirrored aggregates have two plexes.
▪ A pool of hot spare disks is assigned to each plex.

[Diagram: a storage system with an aggregate that contains plex0 (pool 0) with RAID groups rg0 and rg1, plus pool 0 hot spares.]

To support the differing security, backup, performance, and data sharing needs of
your users, you can group the physical data storage resources on your storage
system into one or more aggregates. You can then design and configure these
aggregates to provide the appropriate level of performance and redundancy.

Each aggregate has its own RAID configuration, plex structure, and set of assigned
disks or array LUNs. Aggregates can contain multiple RAID groups, but the RAID
type and disk type must be the same.

Aggregates contain a single copy of data, which is called a plex. A plex contains all
the RAID groups that belong to the aggregate. Plexes can be mirrored by using the
SyncMirror software, which is most commonly used in MetroCluster configurations.
Each plex is also assigned a pool of hot spare disks.

50
Aggregate Types

▪ Root aggregate (aggr0):
▪ Creation is automatic during system initialization.
▪ Container is only for the node’s root volume with log files and configuration information.
▪ ONTAP prevents you from creating other volumes in the root aggregate.

▪ Data aggregate:
▪ Default of RAID-DP with a five-disk minimum for most disk types
▪ Container for SAS, SATA, SSD, or array LUNs

Each node of an HA pair requires three disks to be used for a RAID-DP root
aggregate, which is created when the system is first initialized. The root aggregate
contains the node’s root volume, named vol0, which contains configuration
information and log files. ONTAP prevents you from creating other volumes in the root
aggregate.

Aggregates for user data are called non-root aggregates or data aggregates. Data
aggregates must be created before any data SVMs or FlexVol volumes. When you
are creating data aggregates, the default is RAID-DP with a minimum of five disks for
most disk types. The aggregate can contain hard disks, SSDs, or array LUNs.

51
Advanced Disk Partitioning

▪ Advanced Disk Partitioning (ADP):
▪ Shared disks for more efficient resource use
▪ Lower disk consumption requirements

▪ Partitioning types:
▪ Root-data
▪ Root-data-data (not shown)

▪ Default configuration for:
▪ Entry-level FAS2xxx systems
▪ All Flash FAS systems

[Diagram: 12 disks in a root-data partitioning layout. Each disk has a small root partition, which holds the Node1 and Node2 root aggregates with their parity and spares, and a larger data partition, which holds the user aggregate with its parity and spares.]

All nodes require a dedicated root aggregate of three disks, and a spare disk should
be provided for each node. Therefore, a 12-disk, entry-level system, as shown here,
would require at least eight disks before a data aggregate could even be created.
This configuration creates a challenge for administrators because the four remaining
disks do not meet the five-disk minimum for a RAID DP data aggregate.

Advanced Disk Partitioning, or ADP, overcomes this challenge by enabling the controllers to share disks. This configuration lowers the disk consumption requirements.

ADP reserves a small slice from each disk to create the root partition that can be used for the root aggregates and hot spares. The remaining larger slices are configured as data partitions that can be used for data aggregates and hot spares. The partitioning type that is shown is called root-data partitioning. A second type of partitioning that is called root-data-data partitioning creates one small partition as the root partition and two larger, equally sized partitions for data.

ADP is the default configuration for entry-level systems and for All Flash FAS
systems. Different ADP configurations and partitioning types are available, depending
on the controller model, disk type, disk size, or RAID type.

For more information about ADP configurations, see the Hardware Universe.

52
Hybrid Aggregates
Flash Pool aggregate

▪ What Flash Pool aggregates contain:
▪ SAS or SATA disks for user data
▪ SSDs for high-performance caching

▪ How Flash Pool improves performance:
▪ Offloads random read operations
▪ Offloads repetitive random write operations

▪ Use case: Online transactional processing (OLTP) workloads

A Flash Pool aggregate is a special type of hybrid data aggregate.

A Flash Pool aggregate combines SAS or SATA disks and SSDs to provide a high-
performance aggregate that is more economical than an SSD aggregate. The SSDs
provide a high-performance cache for the active dataset of the data volumes that are
provisioned on the Flash Pool aggregate. The cache offloads random read operations
and repetitive random write operations to improve response times and overall
throughput for disk I/O-bound data access operations.

Flash Pool can improve workloads that use online transactional processing, or OLTP,
for example a database application’s data. Flash Pool does not improve performance
of predominantly sequential workloads.
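A hedged sketch of converting an existing HDD aggregate into a Flash Pool aggregate by enabling hybrid mode and adding SSDs follows; the aggregate name and SSD count are hypothetical.

   cluster1::> storage aggregate modify -aggregate aggr1_data -hybrid-enabled true
   cluster1::> storage aggregate add-disks -aggregate aggr1_data -disktype SSD -diskcount 4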

53
Hybrid Aggregates
FabricPool aggregate

▪ What FabricPool aggregates contain:
▪ A performance tier for frequently accessed (“hot”) data, which is located on an all-SSD aggregate
▪ A capacity tier for infrequently accessed (“cold”) data, which is located on an object store

▪ How FabricPool can enhance the efficiency of your storage system:
▪ Automatically tier data based on frequency of use
▪ Move inactive data to lower-cost cloud storage
▪ Make more space available on primary storage for active workloads

[Diagram: hot data stays on the on-premises performance tier; cold data is tiered to a public or private cloud object store.]

A FabricPool aggregate is a type of hybrid data aggregate that was introduced in ONTAP 9.2.

A FabricPool aggregate contains a performance tier for frequently accessed (“hot”) data, which is located on an all-SSD aggregate, and a capacity tier for infrequently accessed (“cold”) data, which is located on an object store. FabricPool supports object store types that are in the public cloud using Amazon Simple Storage Service (Amazon S3) and private cloud using StorageGRID Webscale software.

Storing data in tiers can enhance the efficiency of your storage system. FabricPool
stores data in a tier based on whether the data is frequently accessed. ONTAP
automatically moves inactive data to lower-cost cloud storage, which makes more
space available on primary storage for active workloads.

For more information about FabricPool aggregates, see the Disks and Aggregates
Power Guide.
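As a sketch only, attaching an object store to an all-SSD aggregate involves defining the object store and then attaching it. Every name, endpoint, bucket, and key below is a hypothetical placeholder, and the exact parameters depend on the ONTAP release and the object-store provider.

   cluster1::> storage aggregate object-store config create -object-store-name mystore -provider-type AWS_S3 -server s3.amazonaws.com -container-name my-bucket -access-key AKIAEXAMPLE
   cluster1::> storage aggregate object-store attach -aggregate aggr_ssd_01 -object-store-name mystore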

54
Knowledge Check
▪ What is the minimum number of disks that are required to create a
RAID-DP data aggregate (excluding hot spares)?
▪ Two
▪ Three
▪ Four
▪ Five
▪ Six

What is the minimum number of disks that are required to create a RAID-DP data
aggregate (excluding hot spares)?

55
Knowledge Check
▪ What does a Flash Pool aggregate contain?
▪ Hard disks only
▪ Solid state drives (SSDs) only
▪ Hard disks for data storage and SSDs for caching
▪ Hard disks and SSDs that are used for data storage

What does a Flash Pool aggregate contain?

56
Lesson 5
WAFL

Lesson 5, WAFL.

57
Write Anywhere File Layout
Write Anywhere File Layout (WAFL) file system:
▪ Organizes blocks of data on disk into files
▪ FlexVol volumes represent the file system

[Diagram: a FlexVol volume with an inode file; inodes point to data blocks A through E.]

The Write Anywhere File Layout, or WAFL, file system organizes blocks of data on
disks into files. The logical container, which is a FlexVol volume, represents the file
system.

The WAFL file system stores metadata in inodes. The term “inode” refers to index
nodes. Inodes are pointers to the blocks on disk that hold the actual data. Every file
has an inode, and each volume has a hidden inode file, which is a collection of the
inodes in the volume.

58
NVRAM and Write Operations
▪ What happens when a host or client writes to the storage system:
▪ The system simultaneously writes to system memory and logs the data in NVRAM.
▪ If the system is part of an HA pair, the system also mirrors the log to the partner.
▪ The write can safely be acknowledged because the NVRAM is battery-backed memory.

▪ Write operations are sent to disk from system memory at a consistency point (CP).

[Diagram: client writes go to system memory and NVRAM; a CP flushes the writes from system memory to disk.]

When a host or client writes to the storage system, the system simultaneously writes
to system memory and logs the data in NVRAM. If the system is part of an HA pair,
the system also simultaneously mirrors the logs to the partner.

After the write is logged in battery-backed NVRAM and mirrored to the HA pair, the
system can safely acknowledge the write to the host or client.

The system does not write the data to disk immediately. The WAFL file system
caches the writes in system memory. Write operations are sent to disk, with other
write operations in system memory, at a consistency point, or CP. The system only
uses the data that is logged in NVRAM during a system failure, so after the data is
safely on disk, the logs are flushed from NVRAM.

59
Consistency Points
Certain circumstances trigger a CP:
▪ A ten-second timer runs out.
▪ An NVRAM buffer fills up and it is time to flush the writes to disk.
▪ A Snapshot copy is created.

[Diagram: an inode pointing to blocks A–E; at a CP, updated data is written to a new block (D’), and a new Snapshot copy references the original blocks.]

WAFL optimizes all the incoming write requests in system memory before committing
the write requests to disk. The point at which the system commits the data in memory
to disk is called a consistency point because the data in system memory and disks is
consistent then.

A CP occurs at least once every 10 seconds or when the NVRAM buffer is full,
whichever comes first. CPs can also occur at other times, for example when a
Snapshot copy is created.

60
Direct Write Operation

[Diagram: a client write arrives through a network interface card (NIC) or host-bus adapter (HBA), is written to system memory and NVRAM, and is mirrored to the HA partner's NVRAM.]

When a write request is sent from the client, the storage system receives the request
through a network interface card (NIC) or a host-bus adapter (HBA). In this case, the
write is to a volume that is on the node and therefore has direct access. The write is
simultaneously processed into system memory, logged in NVRAM, and mirrored to
the NVRAM of the partner node of the HA pair. After the write has been safely logged,
the write is acknowledged to the client. The write is sent to storage at the next CP.

61
Indirect Write Operation

[Diagram: a client write arrives at one node and is redirected over the cluster interconnect to the node that owns the volume, where it is written to system memory and NVRAM and mirrored to the HA partner's NVRAM.]

If a write request is sent from the client to a volume that is on a different node, the
write request accesses the volume indirectly.

The write request is processed by the node to which the volume is connected. The
write is redirected, through the cluster interconnect, to the node that owns the volume.
The write is simultaneously processed into system memory, logged in NVRAM, and
mirrored to the NVRAM of the partner node of the HA pair. After the write has been
safely logged, the write is acknowledged to the client. The write is sent to storage at
the next CP.

62
Direct Cache Read Operation

[Diagram: a client read is served directly from the node's system memory (read cache).]

When a read request is sent from the client, the storage system receives the request
through a NIC or an HBA. In this case, the read is from a volume that is on the node
and therefore has direct access. The system first checks to see if the data is still in
system memory, which is called read cache. If the data is still in cache, the system
serves the data to the client.

63
Direct Disk Read Operation

[Diagram: the requested data is not in read cache, so the node retrieves it from disk into system memory and then serves it to the client.]

If the data is not in cache, the system retrieves the data into system memory. After the
data is cached, the system serves the data to the client.

64
Indirect Read Operation

[Diagram: a client read arrives at one node and is redirected over the cluster interconnect to the node that owns the volume, which serves the data from read cache or disk.]

If a read request is sent from the client to a volume that is on a different node, the
read request accesses the volume indirectly.

The read is processed by the node to which the volume is connected. The read is
redirected, through the cluster interconnect, to the node that owns the volume. As
with the direct read, the system that owns the volume checks system memory first. If
the data is in cache, the system serves the data to the client. Otherwise, the system
needs to retrieve the data from disk first.

65
Knowledge Check
▪ Match each term with the term’s function.

WAFL Organizes blocks on disk into files

Inode Provides file pointers to the blocks on disk

FlexVol Volume Represents the file system

System Memory Memory that is not battery-backed

NVRAM Memory that is battery-backed

Consistency Point The point at which the data in system memory is committed to disk

Match each term with the term’s function.

66
Knowledge Check
▪ When a client reads or writes to a volume that is on the node that the
client is connected to, access is said to be:
▪ Direct for both reads and writes
▪ Direct for reads, indirect for writes
▪ Direct for writes, indirect for reads
▪ Indirect for reads and writes

When a client reads or writes to a volume that is on the node that the client is
connected to, access is said to be:

67
Knowledge Check
▪ When a client reads or writes to a volume that is on a node other
than the node that the client is connected to, access is said to be:
▪ Direct for both reads and writes
▪ Direct for reads, indirect for writes
▪ Direct for writes, indirect for reads
▪ Indirect for reads and writes

When a client reads or writes to a volume that is on a node other than the node that
the client is connected to, access is said to be:

68
Resources
▪ Welcome to Data Fabric video:
http://www.netapp.com/us/campaigns/data-fabric/index.aspx
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com


Resources

69
ONTAP Cluster Fundamentals:
Management

© 2018 NetApp, Inc. All rights reserved. Legal Notices

Welcome to ONTAP Cluster Fundamentals: Management.

70
Course Modules
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
5. Maintenance

The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.

This module was written for cluster administrators and provides an introduction to the
concept of managing a cluster.

71
About This Module

This module focuses on enabling you to do the following:
▪ Define the role of a cluster administrator
▪ Manage a cluster
▪ List the cluster-configuration options
▪ Monitor a cluster

In this module, you learn about the role of a cluster administrator, the methods that
are used to manage a cluster, and the options for configuration. You also learn about
the ways to monitor a cluster.

72
Lesson 1
Cluster Administration

Lesson 1, Cluster Administration.

73
Administrators
▪ Tasks of cluster administrators:
▪ Administer the entire cluster
▪ Administer the cluster’s storage virtual machines (SVMs)
▪ Can set up data SVMs and delegate SVM administration to SVM administrators

▪ Tasks of SVM administrators:
▪ Administer only their own data SVMs
▪ Can set up storage and network resources, such as volumes, protocols, logical interfaces (LIFs), and services

This module focuses on cluster administration. Two types of administrators can manage a cluster.

Cluster administrators administer the entire cluster and the storage virtual machines,
or SVMs, that the cluster contains. Cluster administrators can also set up data SVMs
and delegate SVM administration to SVM administrators.

SVM administrators administer only their own data SVMs. SVM administrators can
configure certain storage and network resources, such as volumes, protocols,
services, and logical interfaces, or LIFs. What an SVM administrator is allowed to
configure is based on how the cluster administrator has configured the SVM
administrator’s user account.

74
Admin SVM
Admin SVM:
▪ Automatic creation during cluster creation process
▪ Representation of the cluster
▪ Primary access point for administration of nodes, resources, and data SVMs
▪ Not a server of data
▪ A cluster must have at least one data SVM to serve data to its clients.

The cluster management LIF is configured to fail over to any node in the cluster.

[Diagram: the admin SVM in a cluster, reached through the cluster management LIF.]

The admin SVM is used to manage the cluster.

The admin SVM is automatically created during the cluster creation process. There is only one admin SVM, which represents the cluster. Through the cluster management LIF, you can manage any node, resource, or data SVM. Also, the cluster management LIF is configured to be able to fail over to any node in the cluster.

The admin SVM cannot serve data. A cluster must have at least one data SVM to
serve data to its clients. Unless otherwise specified, the term SVM typically refers to a
data-serving SVM, which applies to both SVMs with FlexVol volumes and SVMs with
Infinite Volume. Also, in the CLI, SVMs are displayed as Vservers.
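For example, listing SVMs with their types in the CLI shows the admin SVM alongside node and data SVMs; this is a representative sketch with a hypothetical cluster name.

   cluster1::> vserver show -fields type,subtype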

75
Accessing the Cluster

The CLI:
▪ Console access through a node’s serial port
▪ Secure Shell (SSH) access through the cluster management LIF IP address
▪ Telnet or Remote Shell (RSH) access is disabled by default

OnCommand System Manager:
▪ Web service in ONTAP
▪ Accessible with a browser and the cluster management LIF IP address

This course discusses the CLI and OnCommand System Manager.

You can enter commands in the CLI from a console. You use the serial port or Secure
Shell, or SSH, and the IP address of the cluster management LIF. If the cluster
management LIF is unavailable, one of the node management LIFs can be used.
SSH is enabled by default. SSH and the cluster management LIF are the
recommended access methods. Although Telnet and Remote Shell, or RSH, are
supported, they are not secure protocols and are therefore disabled by default. If
Telnet or RSH is required in your environment, see the steps to enable these
protocols in the System Administration Guide.

If you prefer to use a GUI instead, you can use OnCommand System Manager.
OnCommand System Manager is included with ONTAP as a web service and is
enabled by default. To use a web browser to access System Manager, point the
browser to the IP address of the cluster management LIF.

76
Node Root Aggregate and Volume
▪ Node root aggregate (aggr0):
  ▪ Requirement for every node in the cluster
  ▪ Contains only the node root volume
  NOTE: ONTAP prevents you from creating other volumes in the root aggregate.
▪ Node root volume (vol0):
  ▪ Requirement for every node in the cluster
  ▪ Contains log files for the node and cluster-wide configuration database information
  NOTE: User data should never be stored in the node’s root volume.
(Diagram: an HA pair in which each node has its own root aggregate, and each root aggregate contains that node’s root volume, vol0.)

A common question about clustering is, “How can several individual nodes appear as
one cluster?” The answer involves two parts. The first part of the answer involves
each node’s requirements for resources. The second part of the answer involves the
way that the cluster uses those resources. This slide discusses the node resources.

Every node in the cluster requires an aggregate that is dedicated to the node. This
aggregate is called the node root aggregate. By default, the aggregate is named
aggr0, but the name might include the node name also. The purpose of the node root
aggregate is to store the node root volume. ONTAP prevents you from creating other
volumes in the root aggregate.

By default, the node root volume is named vol0. The node root volume contains
special directories and files for the node. The special files include resources that the
node requires for proper operation, log files for troubleshooting, and cluster-wide
configuration database information. Because this volume is so critical to the node,
user data should never be stored in the node root volume.

77
Replicated Database
▪ Replicated database (RDB):
  ▪ Basis of clustering
  ▪ An instance on each node in the cluster
  ▪ In use by several processes
▪ Replication rings:
  ▪ Consistency
  ▪ Healthy cluster links among all nodes

The RDB stores data for the management, volume location, logical interface, SAN, and configuration replication services.
(Diagram: a four-node cluster of two HA pairs connected by the cluster interconnect, with RDB information maintained in each node's vol0.)

This slide explains the second part of the answer, how the cluster uses the dedicated
node resources.

Clustering is how nodes maintain a configuration with each other. The basis of
clustering is the replicated database, or RDB. Replication is communicated over the
dedicated cluster interconnect.

An instance of the RDB is maintained on each node in the cluster. Several processes
use the RDB to ensure consistent data across the cluster. The processes that the
RDB collects data for include the management, volume location, logical interface,
SAN, and configuration replication services.

Replication rings are sets of identical processes that run on all nodes in the cluster.
Replication rings are used to maintain consistency. Each process maintains its own
ring, which is replicated over the cluster interconnect. Replication requires healthy
cluster links among all nodes; otherwise, file services can become unavailable.

It is important to understand why the RDB is needed to maintain a cluster-wide configuration. However, you should not worry too much about the details. The cluster maintains the RDB and rarely requires administrator intervention.

78
Knowledge Check
1. The admin SVM is created to manage the cluster and serve
data to the cluster administrators.
a. True
b. False

The admin SVM is created to manage the cluster and serve data to the cluster
administrators.

79
Knowledge Check
2. Where is a cluster’s configuration information stored?
a. In the first node’s root volume
b. In every node’s root volume
c. In the first SVM’s root volume
d. In every SVM’s root volume

Where is a cluster’s configuration information stored?

80
Lesson 2
Managing Clusters

Lesson 2, managing clusters.

81
Managing Resources in a Cluster
OnCommand System Manager: The CLI:
▪ Visual representation of the ▪ Manual or scripted commands
available resources ▪ Manual resource creation that might require
▪ Wizard-based resource creation many steps
▪ Best-practice configurations ▪ Ability to focus and switch between specific
▪ Limited advanced operations objects quickly

There are many tools that can be used to create and manage cluster resources, each with its own advantages and disadvantages. This slide focuses on two tools.

OnCommand System Manager is a web-based UI that provides a visual representation of the available resources. Resource creation is wizard-based and adheres to best practices. However, not all operations are available. Some advanced operations might need to be performed by using commands in the CLI. Also, the interface might change between ONTAP versions as new features are added.

The CLI can also be used to create and configure resources. Commands are entered
manually or through scripts. Instead of the wizards that are used in System Manager,
the CLI might require many manual commands to create and configure a resource.
Although manual commands give the administrator more control, manual commands
are also more prone to mistakes that can cause issues. One advantage of using the
CLI is that the administrator can quickly switch focus without having to move through
System Manager pages to find different objects.

82
Clustershell
The default CLI, or shell, in ONTAP is called the “clustershell.”

Clustershell features:
▪ Inline help
▪ Online manual pages
▪ Command history
▪ Ability to reissue a command
▪ Keyboard shortcuts
▪ Queries and UNIX-style patterns
▪ Wildcards

The cluster has different CLIs or shells that are used for different purposes. This
course focuses on the clustershell, which is the shell that starts automatically when
you log in to the cluster.

Clustershell features include inline help, an online manual, history and redo commands, and keyboard shortcuts. The clustershell also supports queries and UNIX-style patterns. Wildcards enable you to match multiple values in command-parameter arguments.
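For example, the following commands illustrate a wildcard query, the command history, and the redo feature. The volume name pattern and the fields shown here are only illustrative examples.

cluster1::> volume show -volume vol0* -fields aggregate,size
cluster1::> history
cluster1::> redo 2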

83
Using the CLI
▪ Command structure:
  ▪ Cluster name at prompt
  ▪ Hierarchy of commands in command directories
  ▪ Choice of command path or directory structure
  ▪ Directory name at prompt
  ▪ Context-sensitive help

login as: admin
Using keyboard-interactive authentication.
Password:

cluster1::> cluster show
Node                  Health  Eligibility
--------------------- ------- ------------
cluster1-01           true    true
cluster1-02           true    true
2 entries were displayed.

cluster1::> cluster
cluster1::cluster> show
Node                  Health  Eligibility
--------------------- ------- ------------
cluster1-01           true    true
cluster1-02           true    true
2 entries were displayed.

cluster1::cluster> ?
  contact-info>    Manage contact information for the cluster.
  create           Create a cluster
  date>            Manage cluster's date and time setting
  ha>              Manage high-availability configuration
  identity>        Manage the cluster's attributes, including name and serial number
  image>           Manage cluster images for automated nondisruptive update
  join             Join an existing cluster using the specified member's IP address or by cluster name
  log-forwarding>  Manage the cluster's log forwarding configuration
  peer>            Manage cluster peer relationships
  setup            Setup wizard
  show             Display cluster node members
  statistics>      Display cluster statistics
  time-service>    Manage cluster time services

cluster1::cluster> top

cluster1::>

The CLI provides a command-based mechanism that is similar to the UNIX tcsh shell.

You start at the prompt, which displays the cluster name. Commands in the CLI are
organized into a hierarchy by command directories. You can run commands in the
hierarchy either by entering the full command path or by navigating through the
directory structure. The directory name is included in the prompt text to indicate that
you are interacting with the appropriate command directory.

To display context-sensitive help, use the question mark. To return to the top of the
menu, use the top command.

84
Privilege Levels in the CLI

Admin:
▪ Most commands and parameters
▪ Default level

Advanced:
▪ Infrequently used commands and parameters
▪ Advanced knowledge requirements
▪ Possible problems from inappropriate use
▪ Advice of support personnel

cluster1::> set -privilege advanced

Warning: These advanced commands are potentially
dangerous; use them only when directed to do so by
technical support.
Do you wish to continue? (y or n): y

cluster1::*> set -privilege admin
cluster1::>

An asterisk appears in the command prompt at the advanced level. The set -privilege admin command returns you to the admin level.

CLI commands and parameters are defined at privilege levels. The privilege levels
reflect the skill levels that are required to perform the tasks.

Most commands and parameters are available at the admin level. The admin level is
the default level that is used for common tasks.

Commands and parameters at the advanced level are used infrequently. Advanced
commands and parameters require advanced knowledge and can cause problems if
used inappropriately. You should use advanced commands and parameters only with
the advice of support personnel.

To change privilege levels in the CLI, you use the set command. An asterisk appears
in the command prompt to signify that you are no longer at the admin level. Changes
to privilege level settings apply only to the session that you are in. The changes are
not persistent across sessions. After completing a task that requires the advanced
privilege, you should change back to admin privilege to avoid entering potentially
dangerous commands by mistake.

There is also a diagnostic privilege level, which is not listed on this slide. Diagnostic
commands and parameters are potentially disruptive to the storage system. Only
support personnel should use diagnostic commands to diagnose and fix problems.

85
Navigating OnCommand System Manager
Main window for ONTAP 9.3 or greater

Your version of OnCommand System Manager might look a little different, depending
on the version of ONTAP software that runs on your cluster. The example that is
displayed here is from a cluster that runs ONTAP 9.3.

After you log in to System Manager, the main window opens. You can use the Guided Problem Solving, Technical Support Chat, or Help menus at any time. Click the Setup icon to manage users, roles, and other cluster settings.

The default view is of the cluster dashboard, which can display cluster details such as
alerts and notifications, health, and performance.

You use the navigation menu on the left side to manage the cluster. For example,
under Storage, you find SVMs and Volumes.

86
Navigating OnCommand System Manager
Main window before ONTAP 9.3

In ONTAP versions before ONTAP 9.3, the navigation menu is below the title bar.

After you log in to OnCommand System Manager, the main window opens. You can
use Help at any time. The default view is of the cluster dashboard, which is similar to
the dashboard for ONTAP 9.3, as previously shown.

87
OnCommand Management Portfolio

(Diagram: the OnCommand management portfolio, including System Manager, Cloud Manager, Unified Manager, Workflow Automation, API Services and Service Level Manager, and Insight, spanning small, midsize, and enterprise environments and private, public, and hybrid clouds.)

Besides the CLI and OnCommand System Manager, there are other products in the
OnCommand management portfolio that you can use to manage storage resources in
a cluster.

OnCommand Workflow Automation enables automation and delegation of all repeatable storage management and storage service tasks.

System Manager provides simplified device-level management. For environments with many clusters, OnCommand Unified Manager manages clusters at scale. From a single dashboard, you can monitor availability, capacity, performance, and protection. Unified Manager and OnCommand WFA can be used together to enable self-service processes such as provisioning and data protection. Also, OnCommand API Services and NetApp Service Level Manager can enable third-party management solutions to manage cluster resources.

88
Knowledge Check
1. What is another name for the default CLI in ONTAP?
a. Systemshell
b. Clustershell
c. Vservershell
d. Rootshell

What is another name for the default CLI in ONTAP?

89
Knowledge Check
2. Which LIF should be used to access OnCommand System
Manager?
a. cluster LIF
b. cluster management LIF
c. node management LIF
d. SVM management LIF

Which LIF should be used to access OnCommand System Manager?

90
Lesson 3
Configuring Clusters

Lesson 3, configuring clusters.

91

Configuring Clusters

Access Control Date and Time Licenses

Jobs and Schedules Alerts

The cluster might require some initial configuration, depending on the environment.
This lesson discusses access control, date and time, licenses, jobs and schedules,
and alerts.

92
Managing Cluster Access

Managing User Accounts:
▪ Create, modify, lock, unlock, or delete user accounts
▪ Reset passwords
▪ Display information for all user accounts

Specifying Access Methods:
▪ Specify methods by which a user account can access the storage system:
  ▪ HTTP
  ▪ ONTAPI
  ▪ SSH
  ▪ Console
  ▪ Service Processor

Using Access-Control Roles:
▪ Use predefined roles
▪ Create additional access-control roles
▪ Modify or delete access-control roles
▪ Specify restrictions for a role’s users

You can control access to the cluster and enhance security by managing user
accounts, access methods, and access-control roles.

You can create, modify, lock, unlock, or delete a cluster user account or an
SVM user account. You can also reset a user's password or display
information for all user accounts.

You must specify the methods, by application, that enable a user account to access
the storage system. A user can be assigned one or more access methods.
Examples of the access methods include the HTTP, ONTAPI (ONTAP API), SSH,
console, and Service Processor.

Role-based access control, or RBAC, limits users' administrative access to the level
that is granted for their role. RBAC enables you to manage users based on the role
that users are assigned to. ONTAP provides several predefined access-control roles.
You can also create additional access-control roles, modify them, delete them, or
specify account restrictions for users of a role.
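For example, a cluster user account with SSH access could be created from the CLI with commands like the following. The user name is an illustrative example, and the -vserver parameter names the admin SVM, which typically matches the cluster name.

cluster1::> security login create -user-or-group-name admin2 -application ssh -authentication-method password -role admin -vserver cluster1
cluster1::> security login show -user-or-group-name admin2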

93
Predefined Cluster Roles

admin

autosupport backup

read-only none

ONTAP provides several predefined roles for the cluster. The admin role is the cluster
superuser, which has access to all commands. The admin role can also create roles,
modify created roles, or delete created roles.

The remaining predefined cluster roles are used for applications, services, or auditing
purposes. The autosupport role includes a predefined AutoSupport account that is
used by AutoSupport OnDemand. Backup applications can use the backup role. The
read-only and none roles are used for auditing purposes.

94
Predefined SVM Roles

vsadmin

vsadmin-volume vsadmin-protocol

vsadmin-backup vsadmin-read-only

Each SVM can have its own user and administration authentication domain. After you create the SVM and user accounts, you can delegate the administration of an SVM to an SVM administrator. The predefined vsadmin role is the SVM superuser and is assigned by default. The vsadmin account typically manages its own local password and key information.

The remaining predefined SVM roles have progressively fewer capabilities. These
SVM roles can be used for applications, services, or auditing purposes.

95
User Accounts

You can manage users from the CLI or OnCommand System Manager. There are two
preconfigured users, admin and AutoSupport.

To add a user, click Add and enter the user name and password. You then add user
login methods. Click Add in the Add User dialog box and then select the application,
authentication method, and role. You can select predefined roles, or you can create
custom roles. Also, you need to repeat the user login methods process for each
application.

96
Date and Time

Ways to configure date and time:
▪ Manually: using the CLI
▪ Automatically: using Network Time Protocol (NTP) servers

After you add an NTP server, the nodes require time to synchronize.

Problems can occur when the cluster time is inaccurate. ONTAP software enables
you to manually set the time zone, date, and time on the cluster. However, you should
configure the Network Time Protocol, or NTP, servers to synchronize the cluster time.

To configure the date and time, click Edit, select the time zone from the menu, enter
the NTP address in the time server field, and click Add. Adding the NTP server
automatically configures all the nodes in the cluster, but each node needs to be
synchronized individually. It might take a few minutes for all the nodes in the cluster to
be synchronized.
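For example, the time zone and an NTP server can also be configured from the CLI with commands like the following. The time zone and server name are illustrative examples.

cluster1::> cluster date modify -timezone America/New_York
cluster1::> cluster time-service ntp server create -server ntp1.example.com
cluster1::> cluster time-service ntp server show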

97
Licenses

▪ A license is a record of
software entitlements.
▪ Before ONTAP 9.3, each
cluster required
a cluster-based
license key.
▪ Certain features or
services might require
additional licenses.
▪ Feature licenses are
issued as packages.

A license is a record of one or more software entitlements. Installing license keys, also known as license codes, enables you to use certain features or services on your cluster. Before ONTAP 9.3, each cluster required a cluster base license key, which you can install either during or after the cluster setup. Some features require additional licenses. ONTAP feature licenses are issued as packages, each of which contains multiple features or a single feature. A package requires a license key, and installing the key enables you to access all features in the package.

To add a license package, click Add and then enter the license keys or license files.
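In the CLI, a license package can be added with commands like the following. The license code shown is only a placeholder.

cluster1::> system license add -license-code XXXXXXXXXXXXXXXXXXXXXXXXXXXX
cluster1::> system license show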

98
Schedules
Schedules for tasks:
▪ Basic schedules are
recurring.
▪ Interval schedules are run
at intervals.
▪ Advanced schedules are
run at a specific instance
(month, day, hour, and
minute).

Many tasks can be configured to run on specified schedules. For example, volume
Snapshot copies can be configured to run on specified schedules. These schedules
are similar to UNIX cron schedules.

There are three types of schedules:
• Schedules that run on specific days and at specific times are called basic schedules.
• Schedules that run at intervals (for example, every number of days, hours, or minutes) are called interval schedules.
• Schedules that are required to run on specific months, days, hours, or minutes are called advanced schedules.

You manage schedules from the protection menu in OnCommand System Manager.
In the Schedules pane, you can create schedules, edit schedules, or delete
schedules.
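Schedules can also be created from the CLI with commands like the following. The schedule names and times are illustrative examples.

cluster1::> job schedule cron create -name daily_2am -hour 2 -minute 0
cluster1::> job schedule interval create -name every30min -minutes 30
cluster1::> job schedule show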

99
Jobs
▪ Are asynchronous
tasks
▪ Are managed by the
job manager
▪ Are typically long-
running operations
▪ Are placed in a job
queue

A job is any asynchronous task that the job manager manages. Jobs are typically
long-running volume operations such as copy, move, and mirror. Jobs are placed in a
job queue.

You can monitor the Current Jobs and view the Job History.
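The same information is available from the CLI; for example, commands like the following list the current jobs and the job history for one node. The node name is an illustrative example.

cluster1::> job show
cluster1::> job history show -node cluster1-01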

100
AutoSupport
▪ Is an integrated
monitoring and
reporting technology
▪ Checks the health of
NetApp systems
▪ Should be enabled on
each node of a cluster

AutoSupport is an integrated and efficient monitoring and reporting technology that, when enabled on a NetApp system, checks the system health on a continual basis. AutoSupport should be enabled on each node of the cluster.

AutoSupport can be enabled or disabled. To configure AutoSupport, click Edit and enter your configuration information.
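From the CLI, AutoSupport can be configured on all nodes with commands like the following. The mail host and email addresses are illustrative examples.

cluster1::> system node autosupport modify -node * -state enable -transport https -mail-hosts mailhost.example.com -from admin@example.com -to storage-team@example.com
cluster1::> system node autosupport show -fields state,transport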

101
Knowledge Check
1. Which name is the name of a predefined cluster role?
a. admin
b. vsadmin
c. svmadmin
d. root

Which name is the name of a predefined cluster role?

102
Knowledge Check
2. Match the feature with one of the functions that the feature provides.

User accounts Specify access methods at the application level

Licenses Enable software entitlements

Are used for long-running volume operations such as


Jobs
copy, move, and mirror

Schedules Specify when tasks run

AutoSupport Logs information about each individual node in a cluster

Match the feature with one of the functions that the feature provides.

103
Lesson 4
Monitoring Clusters

Lesson 4, monitoring clusters.

104
Monitoring Clusters

Resources Performance

Alerting Reporting

Reasons to monitor your storage might include the provisioning and protection of
resources, alerting the administrator about an event, and gathering performance-
related information. You might also monitor storage for use reporting and trend
reporting.

This lesson focuses on monitoring resources. This lesson also introduces some of the
software in the OnCommand management portfolio for monitoring the other items.

105
Active IQ
▪ Dashboard
▪ Inventory of NetApp
systems
▪ Health summary and
trends
▪ Storage efficiency and risk
advisors

▪ Upgrade Advisor
▪ Active IQ mobile app
(iOS and Android)

In addition to OnCommand System Manager, NetApp Active IQ provides predictive analytics and proactive support for your hybrid cloud. Along with an inventory of NetApp systems, you are provided with a predictive health summary and trends. You also get improved storage efficiency information and a system risk profile.

As mentioned earlier, you run Upgrade Advisor when Active IQ provides upgrade recommendations.

You can access Active IQ from NetApp Support or through the Active IQ
mobile app.

106
Using Unified Manager to Monitor
Manage cluster resources at scale

Click links for more details

System Manager provides simplified device-level management, typically on a cluster-by-cluster basis. For larger environments with many clusters, workloads, and protection relationships, use Unified Manager to monitor, manage, and report on cluster resources at scale. From the dashboards, you can monitor availability, capacity, performance, and protection for multiple clusters in your data center. Click the blue links for more detailed information.

107
OnCommand Portfolio
(Diagram: the OnCommand portfolio arranged by complexity of configuration, from basic to complex, and by storage support, from NetApp storage to multivendor storage:
▪ System Manager: simple, web-based, and no storage expertise required. Target audience: small to midsize businesses.
▪ Unified Manager and Workflow Automation: manage at scale, automate storage processes, and data protection. Target audience: midsize to large enterprise customers.
▪ Insight: performance, capacity, configuration, and a strong ROI story. Target audience: large enterprises and service providers.)

There are several management tools to choose from. Examine the use cases and
target audiences of these products.

System Manager provides simplified device-level management, and Unified Manager can be used for monitoring cluster resources at scale. However, these products are used to monitor only ONTAP storage systems. What if you need to monitor the data center infrastructure or storage systems from other vendors? OnCommand Insight enables storage resource management, including configuration and performance management and capacity planning, along with advanced reporting for heterogeneous environments.

108
Knowledge Check
1. Which OnCommand product can you use to monitor space
use in a heterogeneous environment?
a. System Manager
b. Unified Manager
c. Insight
d. Performance Manager

Which OnCommand product can you use to monitor space use in a heterogeneous
environment?

109
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com

When ready, click the Play button to continue.

Resources

110
ONTAP Cluster Fundamentals:
Networking

© 2018 NetApp, Inc. All rights reserved. Legal Notices

Welcome to ONTAP Cluster Fundamentals: Networking.

111
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
Course
5. Maintenance
Modules

The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.

This module was written for cluster administrators and provides an introduction to the
concept of networking in a cluster.

112
About This Module

This module focuses on enabling you to do the following:
▪ List the types of networks that are used by clusters
▪ Identify the types of network ports
▪ Describe IPspaces, broadcast domains, and subnets
▪ Describe network interfaces and their features

In this module, you learn about the networks, ports, IPspaces, broadcast domains,
subnets, and network interfaces that clusters use.

113
Lesson 1
Networks

Lesson 1, networks.

114
Networks: Management and Data
▪ Cluster interconnect:
▪ Connection of nodes
▪ Private network

▪ Management network:
▪ For cluster administration
▪ Management and data may be on a shared
Ethernet network

▪ Data network:
Management Network ▪ One or more networks that are used for data
access from clients or hosts
▪ Ethernet, FC, or converged network
Data Network

This module further examines the networking of a cluster. You can get started by
examining the different types of networks.

In multinode clusters, nodes need to communicate with each other over a cluster
interconnect. In a two-node cluster, the interconnect can be switchless. When more
than two nodes are added to a cluster, a private cluster interconnect using switches is
required.

The management network is used for cluster administration. Redundant connections to the management ports on each node and the management ports on each cluster switch should be provided to the management network. In smaller environments, the management and data networks might be on a shared Ethernet network.

For clients and hosts to access data, a data network is also required. The data network can be composed of one or more networks that are primarily used for data access by clients or hosts. Depending on the environment, there might be an Ethernet, FC, or converged network. These networks can consist of one or more switches, or even redundant networks.

115
Cluster Interconnect
(Diagram: cluster interconnect ports on a FAS8060, which has four onboard 10-GbE ports. In a two-node switchless cluster, ports are connected between nodes.)

This example shows a FAS8060, which has two controllers installed in the chassis.
Each controller has a set of four onboard 10-GbE ports that are used to connect to the
cluster interconnect.

In a two-node switchless cluster, a redundant pair of these ports is cabled together as shown.

116
Cluster Interconnect
Private cluster interconnect
(Diagram: two cluster switches, Cluster Switch A and Cluster Switch B, connected by Inter-Switch Links (ISLs), with each node connected to both switches.)

For more than two nodes, a private cluster interconnect is required. There must be
two dedicated switches, for redundancy and load balancing. Inter-Switch Links, or
ISLs, are required between the two switches. There should always be at least
two cluster connections, one to each switch, from each node. The connections
that are required vary, depending on the controller model and cluster size. The
connections might require all four ports.

For more information about the maximum number and models of controllers that are
supported, see the Hardware Universe at hwu.netapp.com. For more information
about the cluster interconnect and connections, see the Network Management
Guide. Links are provided in the course resources.

117
Management Network
(Diagram: a dedicated management network with two management switches, Management Switch A and Management Switch B, connected by Inter-Switch Links (ISLs). Each node connects to both the cluster interconnect and the management network. Cluster switch management ports should also be connected to the management network.)

Although a dedicated management network is not required, NetApp recommends using a management network that provides redundancy. In this example, the system uses a dedicated two-switch network with Inter-Switch Links (ISLs). You should provide at least two connections, one to each switch, from each node. The connections required vary depending on the controller and switching network. In this example, the management port of the node is connected to management switch B and the first 1-GbE port of the node to management switch A.

You should also connect the management ports of the cluster switches to the
management network for configuration and management of the cluster switches.

118
Data Networks
▪ Ethernet network:
▪ Ethernet ports
▪ Support for NFS, CIFS, and iSCSI protocols

▪ FC network:
▪ FC ports
▪ Support for FC protocol

▪ Converged network:
▪ Unified Target Adapter (UTA) ports
▪ Support for NFS, CIFS, iSCSI, and FCoE protocols

Data Network

The data network might consist of one or more networks. The required networks
depend on which protocols the clients use.

An Ethernet network connects Ethernet ports, which support the NFS, CIFS, and
iSCSI protocols. An FC network connects FC ports, which support the FC protocol. A
converged network combines Ethernet and FC networks into one network. Converged
networks connections use Unified Target Adapter ports, or UTA ports, on the nodes to
enable support for NFS, CIFS, iSCSI, and FCoE protocols.

119
Knowledge Check
1. Which network type requires a private network?
a. Cluster interconnect
b. Management network
c. Data network
d. HA network

Which network type requires a private network?

120
Knowledge Check
2. Which port speed is supported for a cluster interconnect?
a. 1 Gbps
b. 8 Gbps
c. 10 Gbps
d. 16 Gbps

Which port speed is supported for a cluster interconnect?

121
Lesson 2
Network Ports

Lesson 2, network ports.

122
Network Ports and Interfaces

(Diagram: the layers of network ports and interfaces.
Logical: logical interfaces (LIFs), for example svm1-mgmt and svm1-data1.
Virtual: VLANs, for example a0a-50 and a0a-80, and interface groups, for example a0a.
Physical: ports, for example e2a and e3a.)

Nodes have various physical ports that are available for cluster traffic, management
traffic, and data traffic. These ports need to be configured appropriately for the
environment. In this example, Ethernet ports are shown; physical ports also include
FC ports and UTA ports.

Physical Ethernet ports can be used directly or combined by using interface groups.
Also, physical Ethernet ports and interface groups can be segmented by using virtual
LANs, or VLANs. Interface groups and VLANS are considered virtual ports but are
treated similar to physical ports.

Unless specified, the term “network port” includes physical ports, interface groups,
and VLANs.

123
Physical Ports

(Diagram: the rear of a FAS8060, showing expansion slots for additional network adapters, four onboard 10-GbE cluster interconnect ports, four onboard UTA2 ports, four onboard 1-GbE ports, and the management ports.)

Controllers support a range of ports. Each model has several onboard ports. This
example shows a FAS8060 that contains two controllers in an HA pair configuration.

On the right, there are two Ethernet ports reserved for management purposes. To the
left of the management ports are four 1-GbE ports that can be used for data or
management. To the left of the 1-GbE ports are four UTA2 data ports, which can be
configured as either 10-GbE ports or 16-Gbps FC ports. And lastly, there are four 10-
GbE cluster interconnect ports.

Controllers might also have expansion slots to increase the number of ports by
adding network interface cards (NICs), FC host bus adapters (HBAs), or UTAs.

124
Physical Port Identification
▪ Ethernet port name: e<location><letter>
  Examples:
  ▪ e0i is the first onboard 1-GbE port on this controller.
  ▪ e2a would be the first port on the NIC in slot 2.
▪ FC port name: <location><letter>
  Examples:
  ▪ 0a is the first onboard FC port on a controller.
  ▪ 3a is the first port on the host bus adapter (HBA) in slot 3.
▪ UTA2 ports have an Ethernet name and an FC name: e<location><letter> and <location><letter>
  Examples:
  ▪ e0e/0e is the first onboard UTA2 port on this controller.
  ▪ e4a/4a is the first port on the UTA card in slot 4.

Port names consist of two or three characters that describe the port's type and
location.

Ethernet port names consist of three characters. The first character is a lowercase “e,”
to represent Ethernet. The second character represents the location; onboard ports
are labeled zero and expansion cards are labeled by slot number. The third character
represents the order of the ports. The slide shows some examples.

FC port names consist of only two characters. FC port names do not begin with the
lowercase “e,” but otherwise FC port names are named in the same manner as
Ethernet port names. The slide shows some examples. However, the controller model
pictured on the slide does not have any dedicated FC ports.

UTA2 ports are unique. Physically, a UTA2 port is a single port but the UTA2 port can
be configured as either a 10-GbE converged Ethernet port or as a 16-Gbps FC port.
Therefore, UTA2 ports are labeled with both the Ethernet name and the FC name.
The slide shows some examples.

125
Interface Groups
▪ Combine one or more Ethernet interfaces
▪ Interface group modes:
  ▪ Single-mode (active-standby)
  ▪ Static multimode (active-active)
  ▪ Dynamic multimode using Link Aggregation Control Protocol (LACP)
▪ Naming syntax: a<number><letter>, for example, a0a

NOTE: Vendors might use other terms for combining Ethernet interfaces.
(Diagram: a 10-GbE multimode ifgrp in which all links are active, and a 1-GbE single-mode ifgrp with one active link and one standby link.)

Interface groups (ifgrps) combine one or more Ethernet interfaces, which can be
implemented in one of three ways.

In single-mode, one interface is active and the other interfaces are inactive until the
active link goes down. The standby paths are only used during a link failover.

In static multimode, all links are active. Therefore, static multimode provides link
failover and load balancing features. Static multimode complies with the IEEE
802.3ad (static) standard and works with any switch that supports the combining of
Ethernet interfaces. However, static multimode does not have control packet
exchange.

Dynamic multimode is similar to static multimode, except that it complies with the
IEEE 802.3ad (dynamic) standard. When switches that support Link Aggregation
Control Protocol, or LACP, are used, the switch can detect a loss of link status and
dynamically route data. NetApp recommends that when you are configuring interface
groups, you use dynamic multimode with LACP and compliant switches.

All modes support the same number of interfaces per groups, but the interfaces in the
group should always be the same speed and type. The naming syntax for interface
groups is the letter “a,” followed by a number, followed by a letter; for example, a0a.

Vendors might use terms such as link aggregation, port aggregation, trunking,
bundling, bonding, teaming, or EtherChannel.
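For example, a dynamic multimode interface group could be created with commands like the following. The node, interface group, and port names are illustrative examples.

cluster1::> network port ifgrp create -node cluster1-01 -ifgrp a0a -distr-func ip -mode multimode_lacp
cluster1::> network port ifgrp add-port -node cluster1-01 -ifgrp a0a -port e2a
cluster1::> network port ifgrp add-port -node cluster1-01 -ifgrp a0a -port e3a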

126
VLANs

(Diagram: VLAN e0i-170 configured on a network that spans Switch 1, Switch 2, a router, and a management switch, with VLAN70 for clients, VLAN172 for Tenant B, VLAN171 for Tenant A, and VLAN170 for management.)

A physical Ethernet port or interface group can be subdivided into multiple VLANs.
VLANs provide logical segmentation of networks by creating separate broadcast
domains. VLANs can span multiple physical network segments, as shown in the
diagram. VLANs are used because they provide better network security and reduce
network congestion.

Each VLAN has a unique tag that is communicated in the header of every packet. The
switch must be configured to support VLANs and the tags that are in use. The VLAN's
ID is used in the name of the VLAN when it is created. For example, VLAN "e0i-170"
is a VLAN with tag 170, which is in the management VLAN, and it is configured on
physical port e0i.
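A VLAN such as the one in this example could be created with a command like the following. The node name is an illustrative example.

cluster1::> network port vlan create -node cluster1-01 -vlan-name e0i-170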

127
Network Ports

(Diagram: examples of network port choices: VLANs created on physical ports, VLANs created on an interface group, an interface group of physical ports, and plain physical ports.)

NOTE: Interface groups and VLANs cannot be created on cluster interconnect ports.

So you’re probably asking yourself, “What type of network port should I use?” The
answer depends on your environment.

In most small to medium environments and in FC environments, physical ports are used.

In Ethernet environments where multiple physical networks are not possible, it is common to use VLANs to separate management traffic from data traffic. It is also common to use VLANs to separate differing workloads. For example, you might separate NAS traffic from iSCSI traffic for performance and security reasons.

In Ethernet environments where many application servers or hosts are sharing switches and ports, dynamic multimode interface groups of four 10-GbE ports per node are commonly used for load balancing.

Environments that use interface groups typically use VLANs also, for segmentation of
the network. This segmentation is common for service providers that have multiple
clients that require the bandwidth that interface groups provide and the security that
VLANs provide.

And lastly, it is not uncommon for different types of ports to be used in mixed
environments that have various workloads. For example, an environment might use
interface groups with VLANs that are dedicated to NAS protocols, a VLAN that is
dedicated to management traffic, and physical ports for FC traffic.

Interface groups and VLANs cannot be created on cluster interconnect ports.

128
Knowledge Check
1. How would you describe port e3a/3a?
a. The first Ethernet port in expansion slot 3
b. The first UTA2 port in expansion slot 3
c. The third Ethernet port of expansion card A
d. The third UTA2 port in expansion slot 3

How would you describe port e3a/3a?

129
Lesson 3
IPspaces

Lesson 3, IPspaces.

131
IPspace Components

(Diagram: an IPspace contains a broadcast domain; the broadcast domain contains a port and a subnet with the IP address pool 192.168.0.1 – 192.168.0.100; a storage virtual machine (SVM) in the IPspace has a LIF that is assigned an IP address from the subnet on that port.)

ONTAP has a set of features that work together to enable multitenancy. Before
looking at the individual components in depth, consider how they interact with each
other.

An IPspace can be thought of as a logical container that is used to create administratively separate network domains. An IPspace defines a distinct IP address space where there are storage virtual machines, or SVMs. The IPspace contains a broadcast domain, which enables you to group network ports that belong to the same layer 2 network. The broadcast domain contains a subnet, which enables you to allocate a pool of IP addresses for your ONTAP network configuration.

When you create a logical interface, or LIF, on the SVM, the LIF represents a network
access point to the node. The IP address for the LIF can be assigned manually. If a
subnet is specified, the IP address is automatically assigned from the pool of
addresses in the subnet. This assignment works in much the same way that a
Dynamic Host Configuration Protocol (DHCP) server assigns IP addresses.

Next, examine these components individually.

132
IPspaces
(Diagram: a storage service provider cluster with three IPspaces: the Default IPspace with SVM_1 and the default routing table, the Company A IPspace with SVM_A1 and its own routing table, and the Company B IPspace with SVM_B1 and its own routing table. The default network uses 192.168.0.5, while Company A and Company B both use the overlapping address 10.1.2.5. The “cluster” IPspace is not shown.)

The IPspace feature enables the configuration of one cluster so that clients can access the
cluster from more than one administratively separate network domain. Clients can access the
cluster even if those clients are using the same IP address subnet range. This feature enables
separation of client traffic for privacy and security.

An IPspace defines a distinct IP address space in which SVMs reside. Ports and IP addresses
that are defined for an IPspace are applicable only within that IPspace. A distinct routing table is
maintained for each SVM within an IPspace; therefore, no cross-SVM or cross-IPspace traffic
routing occurs.

During the cluster creation, a default IPspace was created. If you are managing storage for one
organization, then you do not need to configure additional IPspaces. If you are managing
storage for multiple organizations on one cluster and you are certain your customers do not
have conflicting networking configurations, you do not need to configure additional IPspaces.

The primary use case for this feature is the storage service provider that needs to connect
customers that are using overlapping IP addresses or ranges. In this example, both Company A
and Company B are using 10.1.2.5 as an IP address for their servers. The service provider
starts the configuration by creating two IPspaces, one for company A and the other for company
B. When the service provider creates SVMs for customer A, they are created in IPspace A.
Likewise, when the service provider creates SVMs for customer B, they are created in IPspace
B.

An IPspace that is named “cluster” that contains the cluster interconnect broadcast domain is
also created automatically during cluster initialization. The “cluster” IPspace is not shown on
this slide.
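For example, the service provider in this scenario could create the additional IPspaces with commands like the following. The IPspace names are illustrative examples.

cluster1::> network ipspace create -ipspace ipspace_A
cluster1::> network ipspace create -ipspace ipspace_B
cluster1::> network ipspace show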

133
Broadcast Domains

(Diagram: the Default broadcast domain, the Company A broadcast domain, and the Company B broadcast domain, each grouping network ports from nodes in the cluster.)

The “cluster” broadcast domain is not shown. Broadcast domains can contain physical ports, interface groups, and VLANs.

A broadcast domain enables you to group network ports that belong to the same layer
2 network. Broadcast domains are commonly used when a system administrator
wants to reserve specific network ports for use by a certain client or group of clients.
Broadcast domains should include network ports from many nodes in the cluster to
provide high availability for the connections to SVMs. A network port can exist in only
one broadcast domain.

This example extends the IPspace example from the previous slide. The default
IPspace, which is automatically created with the cluster, contains the first network
ports from each node. The system administrator created two broadcast domains
specifically to support the customer IPspaces. The broadcast domain for Company
A’s IPspace contains only network ports from the first two nodes. The broadcast
domain for Company B’s IPspace contains one network port from each of the nodes
in the cluster.

A broadcast domain that is named “cluster” that contains the cluster interconnect
ports is also created automatically during cluster initialization. Also, although only
physical ports are used in the example, interface groups and VLANs are also
supported.
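For example, a broadcast domain for Company A could be created with a command like the following. The IPspace, broadcast domain, node, and port names are illustrative examples.

cluster1::> network port broadcast-domain create -ipspace ipspace_A -broadcast-domain bd_A -mtu 1500 -ports cluster1-01:e0d,cluster1-02:e0d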

134
Subnets

(Diagram: a subnet in each broadcast domain: the Default broadcast domain subnet with addresses 192.168.0.1 to 192.168.0.100, the Company A broadcast domain subnet with addresses 10.1.2.5 to 10.1.2.20, and the Company B broadcast domain subnet with addresses 10.1.2.5 to 10.1.2.100.)

Subnets are recommended for easier LIF creation.

A subnet is a pool of IP addresses that is created in a broadcast domain, which belongs to the same layer 3 subnetwork, or subnet. Subnets enable you to allocate specific blocks, or pools, of IP addresses for your network configuration. This allocation enables you to create LIFs more easily when you use the network interface create command, by specifying a subnet name instead of specifying IP address and network mask values.
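For example, the Company A subnet could be created with a command like the following. The names, addresses, and gateway are illustrative examples.

cluster1::> network subnet create -ipspace ipspace_A -broadcast-domain bd_A -subnet-name subnet_A -subnet 10.1.2.0/24 -gateway 10.1.2.1 -ip-ranges "10.1.2.5-10.1.2.20"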

135
Knowledge Check
1. What does a broadcast domain contain?
a) Physical ports only
b) Network ports (physical, interface group, or VLAN)
c) Logical interfaces (LIFs)
d) A pool of IP addresses

What does a broadcast domain contain?

136
Lesson 4
Network Interfaces

Lesson 4, network interfaces.

137
Network Ports and Interfaces

(Diagram: the layers of network ports and interfaces.
Logical: LIFs, for example svm1-mgmt and svm1-data1.
Virtual: VLANs, for example a0a-50 and a0a-80, and interface groups, for example a0a.
Physical: ports, for example e2a and e3a.)

This module examines the logical layer.

138
Logical Interfaces

▪ Logical interface (LIF):
  ▪ Represents the IP address or a worldwide port name (WWPN) that is associated with a network port.
  ▪ LIFs are associated with a particular SVM.
▪ LIF management:
  ▪ Cluster administrators can create, view, modify, migrate, or delete LIFs.
  ▪ SVM administrators can view only the LIFs that are associated with the SVM.

LIF Properties:
▪ Associated SVM
▪ Role
▪ Protocol
▪ Home node and port
▪ Address
▪ Failover policy and group
▪ Firewall policy
▪ Load balancing options

A LIF represents an IP address or worldwide port name (WWPN) that is associated with a network port. You associate a LIF with a physical port, interface group, or VLAN to access a particular network. Also, a LIF is created for an SVM and is associated only with the SVM that the LIF was created for.

LIFs are managed by the cluster administrators, who can create, view, modify, migrate, or delete LIFs. An SVM administrator can only view the LIFs associated with the SVM.

The properties of LIFs include the SVM that the LIF is associated with, the role, the protocols that the LIF supports, the home node, the home port, and the network address information. Depending on the type of LIF, there might be an associated failover policy and group, firewall policy, and load balancing options.

A default firewall policy is automatically assigned to a data, management, or intercluster LIF. For more information about firewall policies, see the Network Management Guide.

139
LIF Roles

▪ Cluster: an interface to the cluster interconnect; scoped to a specific node.
▪ Cluster management: a single management interface for the entire cluster; cluster-wide (any node).
▪ Data: an interface for communication with clients or hosts; scoped to a specific SVM (any node).
▪ Intercluster: an interface for cross-cluster communication, backup, and replication; scoped to a specific node.
▪ Node management: a dedicated interface for managing a particular node; scoped to a specific node.

LIFs are assigned one of five roles.

Cluster LIFs provide an interface to the cluster interconnect, which carries the “intracluster”
traffic between nodes in a cluster. Cluster LIFs are node scoped, meaning they can fail over to
other ports in the cluster broadcast domain but the ports must be on the same node. Cluster
LIFs cannot be migrated or failed over to a different node. Also, cluster LIFs must always be
created on 10-GbE network ports.

The cluster management LIF provides a single management interface for the entire cluster. The
cluster management LIF is cluster-wide, meaning the cluster management LIF can fail over to
any network port, on any node in the cluster, that is in the proper broadcast domain.

Data LIFs provide an interface for communication with clients and are associated with a specific
SVM. Multiple data LIFs from different SVMs can reside on a single network port, but a data LIF
can be associated with only one SVM. Data LIFs that are assigned NAS protocol access can
migrate or fail over throughout the cluster. Data LIFs that are assigned SAN protocol access do
not fail over, but can be moved offline to a different node in the cluster.

Intercluster LIFs provide an interface for cross-cluster communication, backup, and replication.
Intercluster LIFs are also node scoped and can only fail over or migrate to network ports on the
same node. When creating intercluster LIFs, you must create one on each node in the cluster.

Node management LIFs provide a dedicated interface for managing a particular node. Typically
cluster management LIFs are used to manage the cluster and any individual node. Therefore,
node management LIFs are typically only used for system maintenance when a node becomes
inaccessible from the cluster.

140
Data LIFs
▪ NAS data LIFs:
  ▪ Multiprotocol (NFS, CIFS, or both)
  ▪ Manually or automatically assigned IP addresses
  ▪ Failover or migration to any node in the cluster
▪ SAN data LIFs:
  ▪ Single-protocol (FC or iSCSI):
    ▪ FC LIF is assigned a WWPN when created.
    ▪ iSCSI LIF IP addresses can be manually or automatically assigned.
  ▪ No failover
  ▪ Restrictions on migration
(Diagram: an SVM in a cluster with a data LIF for client access and a data LIF for host access to a LUN.)

Data LIFs that are assigned a NAS protocol follow slightly different rules than LIFs
that are assigned a SAN protocol.

Data LIFs that are assigned with NAS protocol access are often called NAS LIFs. NAS LIFs are created so that clients can access data from a specific SVM. They are multiprotocol and can be assigned NFS, CIFS, or both. When the LIF is created, you can manually assign an IP address or specify a subnet so that the address is automatically assigned. NAS LIFs can fail over or migrate to any node in the cluster.

Data LIFs that are assigned with SAN protocol access are often called SAN LIFs.
SAN LIFs are created so that a host can access LUNs from a specific SVM. SAN LIFs
are single-protocol and can be assigned either the FC or iSCSI protocol. When a LIF
is created that is assigned the FC protocol, a WWPN is automatically assigned. When
a LIF is created that is assigned the iSCSI protocol, you can either manually assign
an IP address or specify a subnet, and the address is automatically assigned.
Although SAN Data LIFs do not fail over, they can be migrated. However, there are
restrictions on migration.

For more information about migrating SAN LIFs, see the SAN Administration Guide.
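For example, a NAS data LIF could be created with a command like the following. The SVM, LIF, node, port, and subnet names are illustrative examples.

cluster1::> network interface create -vserver svm1 -lif svm1_data1 -role data -data-protocol nfs,cifs -home-node cluster1-01 -home-port e0d -subnet-name subnet_A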

141
LIF Movement
Migrate:
▪ The process of moving a LIF from one network port to another network port
▪ A nondisruptive operation (NDO) for:
  ▪ Maintenance
  ▪ Performance

Fail Over:
▪ The automatic migration of a LIF from one network port to another network port
▪ Link failures:
  ▪ Component failure
  ▪ Nondisruptive upgrade (NDU)
▪ Targets are based on the assigned failover group and failover policy.

Revert:
▪ Return of a failed-over or migrated LIF back to its home port
▪ Process:
  ▪ Manual
  ▪ Automatic, if configured to be automatic

Migration is the process of moving a LIF from one network port to another network
port. The destination depends on the role the LIF has been assigned or in the case of
data LIFs, the protocol. Migrating a LIF is considered a nondisruptive operation, or
NDO. Typically LIFs are migrated before maintenance is performed, for example to
replace a part. LIFs might also be migrated manually or automatically for performance
reasons, for example when a network port becomes congested with traffic.

A LIF failover is a migration that happens automatically due to a link failure.


Component failures can cause link failures, or link failures can occur during a system
software upgrade. During a nondisruptive upgrade, or NDU, LIFs automatically fail
over to a different node in the cluster while a node is being upgraded. When a LIF
fails over, the target of the LIF’s destination is based on the assigned failover group
and failover policy.

You can revert a LIF to its home port after the LIF fails over or is migrated to a
different network port. You can revert a LIF manually or automatically. If the home
port of a particular LIF is unavailable, the LIF remains at its current port and is not
reverted.
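For example, a LIF could be migrated and later reverted with commands like the following. The SVM, LIF, node, and port names are illustrative examples.

cluster1::> network interface migrate -vserver svm1 -lif svm1_data1 -destination-node cluster1-02 -destination-port e0d
cluster1::> network interface revert -vserver svm1 -lif svm1_data1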

142
LIF Failover

Failover groups: Cluster, Default, and user-defined (containing ports, interface groups, or VLANs)

Failover policies:
▪ Broadcast Domain–Wide: all ports from all nodes in the failover group. Default for the cluster management LIF.
▪ System-Defined: only ports in the failover group that are on the LIF's home node and on a non-HA partner node. Default for NAS data LIFs.
▪ Local Only: only ports in the failover group that are on the LIF's home node. Default for cluster LIFs and node management LIFs.
▪ Storage Failover Partner Only: only ports in the failover group that are on the LIF's home node and its HA partner node.
▪ Disabled: not configured for failover. Used for SAN data LIFs.

Configuring LIF failover involves creating the failover group, modifying the LIF to use the
failover group, and specifying a failover policy.

A failover group contains a set of network ports from one or more nodes in a cluster. The
network ports that are present in the failover group define the failover targets that are available
for the LIF. Failover groups are broadcast domain–based and are automatically created when
you create a broadcast domain. The ”Cluster” failover group contains only cluster LIFs. The
”Default” failover group can have cluster management LIFs, node management LIFs,
intercluster LIFs, and NAS data LIFs assigned to it. User-defined failover groups can be created
when the automatic failover groups do not meet your requirements. For example, a user-
defined failover group can define only a subset of the network ports that are available in the
broadcast domain.

LIF failover policies are used to restrict the list of network ports within a failover group that are
available as failover targets for a LIF. Usually, you should accept the default policy when you
create a LIF. For example, the cluster management LIF can use any node in the cluster to
perform management tasks, so the cluster management LIF is created by default with the
“broadcast-domain-wide” failover policy.
The node management LIFs and cluster LIFs are set to the “local-only” failover policy because
failover ports must be on the same local node.
NAS data LIFs are set to be system defined. This setting enables you to keep two active data
connections from two unique nodes when performing software updates. This setting also
enables rolling upgrades to be performed.
SAN data LIFs are configured as disabled. This configuration cannot be changed, so SAN data
LIFs do not fail over.
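For example, a user-defined failover group could be created and applied to a LIF with commands like the following. The failover group, SVM, LIF, node, and port names are illustrative examples.

cluster1::> network interface failover-groups create -vserver svm1 -failover-group fg_data -targets cluster1-01:e0d,cluster1-02:e0d
cluster1::> network interface modify -vserver svm1 -lif svm1_data1 -failover-group fg_data -failover-policy system-defined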

143
Knowledge Check
1. Which two items can a logical interface represent?
(Choose two.)
a) An IP address
b) A WWPN
c) A VLAN
d) An interface group

Which two items can a logical interface represent?

144
Knowledge Check
2. Match the LIF role with the default LIF failover policy.

Cluster LIF Local only

Cluster management LIF Broadcast domain-wide

NAS data LIF System-defined

SAN data LIF Disabled

Match the LIF role with the default LIF failover policy.

145
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com

When ready, click the Play button to continue.

Resources

146
ONTAP Cluster Fundamentals:
Storage Virtual Machines

© 2018 NetApp, Inc. All rights reserved. Legal Notices

Welcome to ONTAP Cluster Fundamentals: Storage Virtual Machines.

147
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
Course
5. Maintenance
Modules

The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.

This module was written for cluster administrators and provides an introduction to the
concept of storage virtual machines.

148
About This Module

This module focuses on enabling you to do the following:
▪ Describe the benefits, components, and features of storage virtual machines (SVMs)
▪ Describe FlexVol volumes and efficiency features
▪ Create and manage SVMs

In this module, you learn about the benefits, components, and features of storage
virtual machines (SVMs). You learn about FlexVol volumes and efficiency features.
You also learn how to create and manage SVMs.

149
Lesson 1
Storage Virtual Machines

Lesson 1, Storage Virtual Machines.

150

Data SVM
▪ Stored in data SVMs:
  ▪ Data volumes that serve client data
  ▪ Logical interfaces (LIFs) that serve client data
▪ Data SVM volume types:
  ▪ FlexVol volumes
  ▪ FlexGroup volumes
  ▪ Infinite volumes
(Diagram: a cluster that contains an SVM with FlexVol volumes and a data LIF for client access.)

This module examines the data storage virtual machine, or SVM.

A data SVM contains data volumes and logical interfaces, or LIFs, that serve data to
clients. Unless otherwise specified, the term SVM refers to data SVM. In the CLI,
SVMs are displayed as Vservers.

ONTAP software provides three types of volumes: FlexVol volumes, FlexGroup volumes, and Infinite volumes. In this module, we focus on the SVM with FlexVol volumes.

151
SVM Benefits
▪ Secure multitenancy:
  ▪ Partitioning of a storage system
  ▪ Isolation of data and management
  ▪ No data flow among SVMs in cluster
▪ Nondisruptive operations and upgrades:
  ▪ Resource migration
  ▪ Resource availability during hardware and software upgrades
▪ Scalability:
  ▪ Adding and removing SVMs as needed
  ▪ Modifying SVMs for data throughput and storage requirements on demand
▪ Unified storage:
  ▪ SVMs with FlexVol volumes
  ▪ NAS protocols: CIFS and NFS
  ▪ SAN protocols: iSCSI and FC (FCoE included)
▪ Delegation of management:
  ▪ User authentication and administrator authentication
  ▪ Access assigned by the cluster administrator

SVMs provide many benefits.


One benefit of SVMs is secure multitenancy. SVMs are the fundamental unit of secure
multitenancy. SVMs enable partitioning of the storage infrastructure so that it appears
as multiple independent storage systems. These partitions isolate data and
management. Each SVM appears as a single independent server, which enables
multiple SVMs to coexist in a cluster and ensures that no data flows among them.
Another benefit of SVMs is nondisruptive operations, or NDO. SVMs can operate
continuously and nondisruptively. By enabling resources such as volumes and logical
interfaces to move to other nodes, SVMs help clusters to operate continuously. The
clusters can operate continuously during software and hardware upgrades, the addition
and removal of nodes, and all administrative operations.
Another benefit of SVMs is scalability. SVMs can be added, removed, or given more
resources as the underlying physical storage grows. SVMs can be modified to meet on-
demand data throughput and the other storage requirements.
Another benefit of SVMs is unified storage. SVMs can serve data concurrently through
multiple data access protocols. SVMs with FlexVol volumes provide file-level data
access through NAS protocols, such as CIFS and NFS, and block-level data access
through SAN protocols, such as iSCSI and FC (FCoE included). SVMs with FlexVol
volumes can serve data to SAN and NAS clients independently at the same time.
Another benefit of SVMs is delegation of management. Each SVM can have its own
user authentication and administrator authentication. SVM administrators can manage
the SVMs that they are authorized to access. However, cluster administrators assign
privileges to SVM administrators.

152

SVM Considerations
SVM creation tools:
▪ System Manager
▪ The CLI

SVM use cases:
▪ Configuring secure multitenancy
▪ Separating resources and workloads

NOTE: Resources such as volumes and LIFs cannot be moved nondisruptively between different SVMs.

You must set up at least one data access SVM per cluster, which involves planning
the setup, understanding requirements, and creating and configuring the SVM.
NetApp recommends using OnCommand System Manager to create an SVM.

The reasons for creating an SVM depend on the use case or workload requirements.
Usually, only a single SVM is needed. Sometimes, for example when the customer is
a service provider, SVMs can be created for each tenant. Other use cases include
separating different storage domains, meeting network requirements, configuring data
protection domains, or managing different workloads.

When creating more than one SVM, you cannot move resources such as volumes or
LIFs between different SVMs nondisruptively.

153
SVM with FlexVol Volumes
▪ FlexVol volume:
▪ Representation of the file system in a NAS environment
▪ Container for LUNs in a SAN environment
▪ Qtree:
▪ Partitioning of FlexVol volumes into smaller segments
▪ Management of quotas, security style, and CIFS opportunistic lock (oplock) settings
▪ LUN: Logical unit that represents a SCSI disk

An SVM can contain one or more FlexVol volumes. In a NAS environment, volumes
represent the file system where clients store data. In a SAN environment, a LUN is
created in the volumes for a host to access.

Qtrees can be created to partition a FlexVol volume into smaller segments, much like
directories. Qtrees can also be used to manage quotas, security styles, and CIFS
opportunistic lock settings, or oplock settings.

A LUN is a logical unit that represents a SCSI disk. In a SAN environment, the host
operating system controls the reads and writes for the file system.

154
SVM Root Volume

Characteristics of an SVM root volume:
▪ Is created when the SVM is created
▪ Serves as the NAS clients’ entry point to the namespace provided by an SVM
▪ Should not be used to store user data

When the SVM is created, an SVM root volume is also created, which serves as the
NAS clients’ entry point to the namespace provided by an SVM. NAS clients' data
access depends on the health of the root volume in the namespace. In contrast, SAN
clients' data access is independent of the root volume's health in the namespace.

You should not store user data in the root volume of an SVM.

155
Data LIFs

▪ NAS data LIFs:
▪ Multiprotocol (NFS, CIFS, or both)
▪ Manually or automatically assigned IP addresses
▪ Failover or migration to any node in the cluster
▪ SAN data LIFs:
▪ Single-protocol (FC or iSCSI):
▪ FC LIF is assigned a WWPN when created.
▪ iSCSI LIF IP addresses can be manually or automatically assigned.
▪ No failover
▪ Restrictions on migration

Data LIFs that are assigned a NAS protocol follow slightly different rules than LIFs
that are assigned a SAN protocol.

Data LIFs that are assigned with NAS protocol access are often called NAS LIFs.
NAS LIFs are created so that clients can access data from a specific SVM. They are
multiprotocol and can be assigned NFS, CIFS, or both. When the LIF is created, you
can manually assign an IP address or specify a subnet so that the address is
automatically assigned. NAS LIFs can fail over or migrate to any node in the cluster.

Data LIFs that are assigned with SAN protocol access are often called SAN LIFs.
SAN LIFs are created so that a host can access LUNs from a specific SVM. SAN LIFs
are single-protocol and can be assigned either the FC or iSCSI protocol. When a LIF
is created that is assigned the FC protocol, a WWPN is automatically assigned. When
a LIF is created that is assigned the iSCSI protocol, you can either manually assign
an IP address or specify a subnet, and the address is automatically assigned.
Although SAN Data LIFs do not fail over, they can be migrated. However, there are
restrictions on migration.

For more information about migrating SAN LIFs, see the SAN Administration Guide.
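As a point of reference, the following clustershell sketch shows how a NAS data LIF and an iSCSI data LIF might be created. The SVM, node, port, and address values are placeholders, and the exact options can vary by ONTAP version.

::> network interface create -vserver svm1 -lif svm1_nas1 -role data -data-protocol nfs,cifs -home-node cluster1-01 -home-port e0c -address 192.168.0.101 -netmask 255.255.255.0
::> network interface create -vserver svm1 -lif svm1_iscsi1 -role data -data-protocol iscsi -home-node cluster1-01 -home-port e0d -address 192.168.0.102 -netmask 255.255.255.0

An FC LIF is created in a similar way with the fcp data protocol; it is assigned a WWPN automatically rather than an IP address.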

156
Administration
▪ Cluster administrator:
▪ Administer the entire cluster and the SVMs it contains
▪ Set up data SVMs and delegate SVM administration to SVM administrators
▪ Aggregates and network ports: Can perform all system administration tasks
▪ SVMs: Can create, view, modify, or delete
▪ Access-control: Can create, view, modify, or delete
▪ Volumes: Can create, view, modify, move, or delete
▪ LIFs: Can create, view, modify, migrate, or delete LIFs

▪ SVM administrator:
▪ Administer only their own data SVMs
▪ Set up storage and network resources, such as volumes, protocols, LIFs, and services
▪ Aggregates and network ports: Have a limited view
▪ SVMs: Are assigned to an SVM by the cluster administrator
▪ Access-control: Can manage their own user account local password and key information
▪ Volumes: Can create, view, modify, or delete
▪ LIFs: Can only view the LIFs associated with their assigned SVM

Note: SVM administrators cannot log in to System Manager.

Cluster administrators administer the entire cluster and the SVMs it contains. They can
also set up data SVMs and delegate SVM administration to SVM administrators. This
list is a list of common tasks, but the specific capabilities that cluster administrators
have depend on their access-control roles.

SVM administrators administer only their own data SVM’s storage and network
resources, such as volumes, protocols, LIFs, and services. This list is a list of common
tasks, but the specific capabilities that SVM administrators have depend on the access-
control roles that are assigned by cluster administrators.

Note that when the cluster administrator creates an SVM administrator, they must
also create a management LIF for the SVM. The SVM administrator or
management software uses this LIF to log in to the SVM. For example, SnapDrive data
management software would use this LIF. SVM administrators cannot log in to System
Manager. SVM administrators are required to manage the SVM by using the CLI.
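For illustration only, a cluster administrator might enable the default vsadmin account and create a dedicated SVM management LIF with commands similar to the following. The names and addresses are placeholders, and the options can differ between ONTAP versions.

::> security login password -vserver svm1 -username vsadmin
::> security login unlock -vserver svm1 -username vsadmin
::> network interface create -vserver svm1 -lif svm1_mgmt -role data -data-protocol none -firewall-policy mgmt -home-node cluster1-01 -home-port e0c -address 192.168.0.110 -netmask 255.255.255.0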

157
Knowledge Check
1. Match each term with the term’s function.

SVM Owns its logical storage and network resources

SVM’s root volume Serves as the NAS clients’ entry point to the namespace

Node root volume Contains cluster configuration data

FlexVol Volume Contains user data

Provides a network access point for clients or hosts to access


Data LIF
data in an SVM

Cluster management LIF Provides a network access point to manage an SVM

Match each term with the term’s function.

158
Knowledge Check
2. Using the default configuration, which items can an SVM
administrator create?
a. Aggregate
b. SVM
c. Volume
d. LIF

Using the default configuration, which items can an SVM administrator create?

159
Lesson 2
FlexVol Volumes

Lesson 2, FlexVol volumes.

160
FlexVol Volumes
Write Anywhere File Layout (WAFL) file system:
▪ Organizes blocks of data on disk into files
▪ FlexVol volumes represent the file system

FlexVol Volume
Inode file

Inode Inode

A B C D E

The Write Anywhere File Layout, or WAFL, file system organizes blocks of data on
disks into files. The logical container, which is a FlexVol volume, represents the file
system.

The WAFL file system stores metadata in inodes. The term “inode” refers to index
nodes. Inodes are pointers to the blocks on disk that hold the actual data. Every file
has an inode, and each volume has a hidden inode file, which is a collection of the
inodes in the volume.

161
Volumes in Aggregates
▪ Aggregate:
▪ 4KB blocks
▪ WAFL reserves 10%
▪ Volume:
▪ Provisioning types:
▪ Thick: volume guarantee = volume
▪ Thin: volume guarantee = none
▪ Dynamic mapping to physical space

One or more FlexVol volumes can be created in an aggregate. To understand how


space is managed, it is necessary to examine how space is reserved in the
aggregate.

The WAFL file system writes data in 4KB blocks that are contained in the aggregate.
When the aggregate is created, WAFL reserves 10 percent of capacity for overhead.
The remainder of the aggregate is available for volume creation.

A FlexVol volume is a collection of disk space that is provisioned from the available
space within an aggregate. FlexVol volumes are loosely tied to their aggregates.
FlexVol volumes are striped across all the disks of the aggregate, regardless of the
volume size. In this example, the blue block that is labeled “vol1” represents the inode
file for the volume, and the other blue blocks contain the user data.

When a volume is created, the volume guarantee setting must be configured. The
volume guarantee setting is the same as the space reservations. If space is reserved
for the volume, the volume is said to be thick-provisioned. If space is not reserved
during creation, the volume is said to be thin-provisioned. FlexVol volumes are
dynamically mapped to physical space. Whether the volume is thick-provisioned or
thin-provisioned, blocks are not consumed until data is written to the storage system.

A FlexVol volume can be as small as 20MB or as large as the controller model


supports. Also, the volume can grow or shrink, regardless of the provisioning type.
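To illustrate the difference between the two provisioning types, the following sketch creates one thick-provisioned and one thin-provisioned volume; the SVM, aggregate, and size values are placeholders.

::> volume create -vserver svm1 -volume vol_thick -aggregate aggr1 -size 100GB -space-guarantee volume
::> volume create -vserver svm1 -volume vol_thin -aggregate aggr1 -size 100GB -space-guarantee none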

162
Volume Footprint
▪ User data is written to a volume.
▪ Metadata is internal tracking for the file system, inodes, and features.
▪ The Snapshot reserve is counted as used space even if there are no Snapshot copies in the reserve.

(Diagram: the volume footprint inside the aggregate consists of volume metadata, file system metadata, user data, and Snapshot copies; with a guarantee of “volume,” the space reserved for the guarantee is also part of the footprint, and the remainder of the aggregate is free space.)

A volume footprint is the amount of space that a volume is using in the aggregate. The
volume footprint consists of the space that is used by user data, snapshot copies, and
metadata. The metadata includes metadata that resides in the aggregate rather than in
the volume itself. For this reason, a volume might take up more space in the aggregate
than ONTAP advertises to the client.

When a volume is created, the client sees the total volume size, regardless of the
volume guarantee settings. For example, if you create a 10GB volume, the client sees
the full 10GB, regardless of whether the space is available.

If the volume guarantee is set to “volume,” the volume footprint inside the aggregate
includes the total reserved space. If another thick provisioned volume is created, the
volume could only be the size of the remaining aggregate free space.

With a guarantee of “none,” the volume size is not limited by the aggregate size. In fact,
each volume could, if necessary, be larger than the containing aggregate. The storage
that is provided by the aggregate is used only as data is written to the volume.

Thin provisioning enables you to overcommit the storage object that supplies its storage.
A storage object is said to be overcommitted if the objects it supplies storage to are
collectively larger than the amount of physical storage it can currently supply.
Overcommitting a storage object can increase your storage efficiency. However,
overcommitting also requires that you take an active role in monitoring your free space
to prevent writes from failing due to lack of space.
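If you want to monitor how much space a volume actually consumes in its aggregate, commands similar to the following can help; the object names are placeholders, and the output columns vary by release.

::> volume show-footprint -vserver svm1 -volume vol_thin
::> storage aggregate show-space -aggregate aggr1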

163
Snapshot Copy Technology
Create Snapshot copy 1

Blocks on Create Snapshot copy 1:


Volume disk ▪ Pointers are copied.
▪ No data is moved.
A A
B B
C C

File or A
LUN B
C

Snapshot
Copy 1

Understanding the technology that is used to create a Snapshot copy helps you to
understand how space is utilized. Furthermore, understanding this technology also
helps you understand features such as FlexClone volumes, deduplication, and
compression.

A Snapshot copy is a local, read-only point-in-time image of data. Snapshot copy


technology is a built-in feature of WAFL storage virtualization technology that
provides easy access to old versions of files and LUNs.

When a Snapshot copy is created, ONTAP starts by creating pointers to physical


locations. The system preserves the inode map at a point in time and then continues
to change the inode map on the active file system. ONTAP then retains the old
version of the inode map. No data is moved when the Snapshot copy is created.

Snapshot technology is highly scalable. A Snapshot copy can be created in a few


seconds, regardless of the size of the volume or the level of activity on the storage
system. After the copy is created, changes to data objects are reflected in updates to
the current version of the objects, as if the copy did not exist. Meanwhile, the
Snapshot copy of the data remains stable. A Snapshot copy incurs no performance
overhead. Users can store up to 255 Snapshot copies per volume. All the Snapshot
copies are accessible as read-only and online versions of the data.

164
Snapshot Copy Technology
Continue writing data

Blocks on 1. Create Snapshot copy 1.


Volume disk
2. Continue writing data:
A A ▪ Data is written to a new
B B location on the disk.
▪ Pointers are updated.
D C C
D

A
B
C

Snapshot
Copy 1

When ONTAP writes changes to disk, the changed version of block C gets written to
a new location. In this example, D is written to a new location. ONTAP changes the
pointers rather than moving data.

In this way, the file system avoids the parity update changes that are required if new
data is written to the original location. If the WAFL file system updated the same
block, the system would have to perform multiple parity reads to be able to update
both parity disks. The WAFL file system writes the changed block to a new location,
again writing in complete stripes and without moving or changing the original data
blocks.

165
Snapshot Copy Technology
Create Snapshot copy 2

Blocks on 1. Create Snapshot copy 1.


Volume disk
2. Continue writing data.
A A
B B
3. Create Snapshot copy 2:
▪ Pointers are copied.
D C
▪ No data is moved.
D

A A
B B
C D

Snapshot Snapshot
Copy 1 Copy 2

When ONTAP creates another Snapshot copy, the new Snapshot copy points only to
the active file system blocks A, B, and D. Block D is the new location for the changed
contents of block C. ONTAP does not move any data; the system keeps building on
the original active file system. Because the method is simple, the method is good for
disk use. Only new and updated blocks use additional block space.

166
Snapshot Copy Technology
Restore from a Snapshot copy

Blocks on To restore a file or LUN, use


Volume disk SnapRestore to restore the file
A A
or LUN from Snapshot copy 1.
B B Snapshot copies that were
D
C C created after Snapshot copy 1
D are deleted.
Unused blocks on disk are
A A made available as free space.
B B
C D

Snapshot Snapshot
Copy 1 Copy 2

Assume that after the Snapshot copy was created, the file or LUN became corrupted,
which affected logical block D. If the block is physically bad, RAID can manage the
issue without recourse to the Snapshot copies. In this example, block D became
corrupted because part of the file was accidentally deleted and you want to restore
the file.
To easily restore data from a Snapshot copy, use the SnapRestore feature.
SnapRestore technology does not copy files; SnapRestore technology moves
pointers from files in the good Snapshot copy to the active file system. The pointers
from that Snapshot copy are promoted to become the active file system pointers.
When a Snapshot copy is restored, all Snapshot copies that were created after that
point in time are destroyed. The system tracks links to blocks on the WAFL system.
When no more links to a block exist, the block is available for overwrite and is
considered free space.

Because a SnapRestore operation affects only pointers, the operation is quick. No


data is updated, nothing is moved, and the file system frees any blocks that were
used after the selected Snapshot copy. SnapRestore operations generally require
less than a second. To recover a single file, the SnapRestore feature might require a
few seconds or a few minutes.
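As a simple illustration, a Snapshot copy can be created and later restored with commands along these lines; the volume, Snapshot, and file names are placeholders.

::> volume snapshot create -vserver svm1 -volume vol1 -snapshot before_change
::> volume snapshot restore -vserver svm1 -volume vol1 -snapshot before_change
::> volume snapshot restore-file -vserver svm1 -volume vol1 -snapshot before_change -path /data/file1

The restore command reverts the entire volume, whereas the restore-file command recovers only the specified file.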

167
Volume Efficiency
Deduplication:
▪ Elimination of duplicate data blocks
▪ Inline or postprocess
▪ Inline deduplication for All Flash FAS and Flash Pool systems to reduce the number of writes to the solid-state drives (SSDs)

Data Compression:
▪ Compression of redundant data blocks
▪ Inline or postprocess
▪ Two compression methods:
▪ Secondary: 32KB compression groups
▪ Adaptive: 8KB compression groups, which improves read performance

Data Compaction:
▪ Store more data in less space
▪ Inline
▪ Enabled by default on All Flash FAS systems (optional on FAS systems)

ONTAP provides three features that can increase volume efficiency: deduplication, data
compression, and data compaction. You can use these features together or
independently on a FlexVol volume to reduce the amount of physical storage that a
volume requires.

To reduce the amount of physical storage that is required, deduplication eliminates the
duplicate data blocks, data compression compresses redundant data blocks, and data
compaction increases storage efficiency by storing more data in less space. Depending
on the version of ONTAP and the type of disks that are used for the aggregate,
deduplication and data compression can be run inline or postprocess. Data compaction
is inline only.

Inline deduplication can reduce writes to solid-state drives (SSDs), and is enabled by
default on all new volumes that are created on the All Flash FAS systems. Inline
deduplication can also be enabled on new and existing Flash Pool volumes.

Data compression combines multiple 4KB [kilobytes] WAFL blocks into compression
groups before the compression process starts. There are two data compression
methods that can be used. The secondary method uses 32KB [kilobytes] compression
groups. The adaptive method uses 8KB compression groups, which helps to improve
the read performance of the storage system.

Inline data compaction stores multiple user data blocks and files within a single 4KB
block on a system that is running ONTAP software. Inline data compaction is enabled
by default on All Flash FAS systems, and you can optionally enable it on volumes on
FAS systems.
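For reference, a minimal sketch of enabling these efficiency features on an existing volume follows; the names are placeholders, and the available options depend on the platform and ONTAP version.

::> volume efficiency on -vserver svm1 -volume vol1
::> volume efficiency modify -vserver svm1 -volume vol1 -inline-dedupe true -compression true -inline-compression true
::> volume efficiency start -vserver svm1 -volume vol1 -scan-old-data true

The last command runs a postprocess scan against data that already existed before the features were enabled.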

168
Deduplication
▪ Deduplication:
▪ Elimination of duplicate data blocks to reduce the amount of physical storage
▪ Volume-level
▪ Postprocess example:
▪ File A is ~20KB, using five blocks
▪ File B is ~12KB, using three blocks
(Diagram: duplicate 4KB blocks across files A and B are freed in the aggregate after postprocess deduplication.)

Deduplication eliminates duplicate data blocks, at a volume level, to reduce the


amount of physical storage that is required. When inline deduplication is used,
duplicate blocks are eliminated while they are in main memory, before they are
written to disk. When postprocess is used, the blocks are written to disk first and
duplicates are later freed at a scheduled time.

In this example, postprocess deduplication has been enabled on a volume that


contains two files. File A is a document of approximately 20KB. This file uses five 4KB
[kilobytes] blocks. File B is another document of approximately 12KB [kilobytes]. This
file uses three 4KB [kilobytes] blocks. The data in the blocks has been simplified on
the slide, using four characters. The blocks have also been color coded on the slide
to easily identify the duplicate blocks.

In file A, the first and fourth blocks contain duplicate data, so one of the blocks can be
eliminated. The second block in file B also contains the same duplicate data, which
can be eliminated. Deduplication eliminates duplicate blocks within the volume,
regardless of the file.

169
Aggregate-Level Inline Deduplication
▪ Aggregate-level inline deduplication:
▪ Performs cross-volume sharing for volumes belonging to the same aggregate
▪ Is enabled by default on all newly created volumes on All Flash FAS systems that run ONTAP 9.2 or greater
▪ A cross-volume shared block is owned by the FlexVol volume that first wrote the block.
(Enhanced for ONTAP 9.3)

Beginning with ONTAP 9.2, you can perform cross-volume sharing in volumes that
belong to the same aggregate using aggregate-level inline deduplication. Aggregate-
level inline deduplication is enabled by default on all newly created volumes on All
Flash FAS (AFF) systems running ONTAP 9.2 or greater. Cross-volume sharing is
not supported on Flash Pool and HDD systems.

When cross-volume sharing is enabled on an aggregate, volumes that belong to the


same aggregate can share blocks and deduplication saving. A cross-volume shared
block is owned by the FlexVol volume that first wrote the block.

Beginning with ONTAP 9.3, you can schedule background cross-volume


deduplication jobs on AFF systems. Cross-volume background deduplication
provides additional incremental deduplication savings.

Additionally, you can automatically schedule background deduplication jobs with


Automatic Deduplication Schedule (ADS). ADS automatically schedules background
deduplication jobs for all newly created volumes with a new automatic policy that is
predefined on all AFF systems.

170
Data Compression
▪ Compression:
▪ Compression of redundant data blocks to reduce the amount of physical storage
▪ Volume-level
▪ Example:
▪ File A is ~20KB, using five blocks
▪ File B is ~12KB, using three blocks
(Diagram: eight 4KB blocks from files A and B are combined into a compression group and stored as fully compressed blocks in the aggregate.)

Data compression compresses redundant data blocks, at a volume level, to reduce


the amount of physical storage that is required. When inline data compression is
used, compression is done in main memory, before blocks are written to disk. When
postprocess is used, the blocks are written to disk first and data is compressed at a
scheduled time.

This example starts exactly where the previous example started, except postprocess
data compression is enabled.

Data compression first combines several blocks into compression groups. In this
example, the 32KB compression group is made up of these eight 4KB [kilobytes]
blocks. The data compression algorithm identifies redundant patterns, which can be
compressed. The algorithm continues to find redundancies and compress them. After
everything has been compressed, all that remains on disk are the fully compressed
blocks.

171
Inline Data Compaction
▪ Stores multiple logical I/Os or files in a single physical 4KB block
▪ For small I/Os or files, less than 4KB
▪ Increases efficiency of adaptive (8KB) compression
▪ Compresses 4KB I/Os
▪ Enabled by default on All Flash FAS systems
▪ Optional for FAS systems

Data compaction takes I/Os that normally consume a 4KB block on physical storage
and packs multiple such I/Os into one physical 4KB block.

This increases space savings for very small I/Os and files, less than 4KB, that have a
lot of free space.

To increase efficiency, data compaction is done after inline adaptive compression and
inline deduplication.

Compaction is enabled by default for All Flash FAS systems shipped with ONTAP 9.
Optionally, a policy can be configured for Flash Pool and HDD-only aggregates.

172
All Flash FAS Inline Storage Efficiency Workflow

1. Inline zero-block deduplication: Detects all-zero blocks; updates only metadata, not user data
2. Inline adaptive compression: Compresses 8KB blocks written to storage; is aligned with the I/O size used with most databases
3. Inline deduplication: Deduplicates incoming blocks against recently written blocks
4. Inline data compaction: Combines two or more small logical blocks into a single 4KB physical block; is used in conjunction with background (post-write) deduplication to achieve maximum space savings

Data compaction is an inline operation that occurs after inline compression and inline
deduplication. On an All Flash FAS system, the order of execution follows the steps
shown here.

In the first step, inline zero-block deduplication detects all-zero blocks. No user data is
written to physical storage during this step. Only metadata and reference counts are
updated.

In the second step, inline adaptive compression compresses 8KB logical blocks into
4KB physical blocks. Inline adaptive compression is very efficient in determining
compressibility of the data and doesn’t waste lot of CPU cycles trying to compress
incompressible data.

In the third step, inline deduplication opportunistically deduplicates incoming blocks to


already existing blocks on physical storage.

In the last step, inline adaptive data compaction combines multiple logical blocks that
are less than 4KB into a single 4KB physical block to maximize savings. It also tries to
compress any 4KB logical blocks that are skipped by inline compression to gain
additional compression savings.

173
All Flash FAS Storage Efficiency Example
Writes from hosts or clients:
▪ Vol A: three 8KB I/Os (one 50% compressible, two 80% compressible)
▪ Vol B: two 4KB I/Os (both 55% compressible)
▪ Vol C: three 1KB I/Os

▪ Without compression: 11 blocks
▪ After inline adaptive compression: 8 blocks
▪ After inline data compaction: 4 blocks

The example shows the I/O from three separate volumes:


Vol A consists of three 8KB I/Os, one of which is 50% compressible, and the other
two are 80% compressible.
Vol B consists of two 4KB I/Os, both of which are 55% compressible.
Vol C consists of three 1KB I/Os.

Without data compression or data compaction, the incoming I/Os would consume a
total of eleven 4KB blocks on physical storage. The 1KB I/Os from Vol C each require
a 4KB block because the minimum block size in WAFL is 4KB.

If inline adaptive compression is used, the 50% compressible 8KB I/O from Vol A is
compressed to 4KB. The two 80% compressible 8KB I/Os from Vol A and the three
1KB I/Os from Vol C also consume 4KB each on the physical storage because of the
WAFL 4K block size. The result totals eight 4KB blocks on physical storage.

If inline adaptive data compaction is used after the inline adaptive compression, the
two 80% compressible 8KB I/Os from Vol A are packed into a single 4KB block. The
two 55% compressible 4KB I/Os from Vol B are packed into another 4KB block. And
the three 1KB I/Os from Vol C are packed into another 4KB block. The result totals
four 4KB blocks on physical storage.

174
Moving Volumes
▪ Where and how volumes can be moved:
▪ To any aggregate in the cluster
▪ Only within the SVM
▪ Nondisruptively to the client
▪ Use cases:
▪ Capacity: Move a volume to an aggregate with more space
▪ Performance: Move a volume to an aggregate with different performance characteristics
▪ Servicing: Move volumes to newly added nodes or from nodes that are being retired

FlexVol volumes can be moved from one aggregate or node to another within the
same SVM. A volume move does not disrupt client access during the move.

You can move volumes for capacity use, for example when more space is needed.
You can move volumes to change performance characteristics, for example from a
controller with hard disks to one that uses SSDs. You can move volumes during
service periods, for example to a newly added controller or from a controller that is
being retired.
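A hypothetical move of vol1 to an aggregate named aggr_sas01 could look like the following sketch; both names are placeholders.

::> volume move start -vserver svm1 -volume vol1 -destination-aggregate aggr_sas01
::> volume move show -vserver svm1 -volume vol1

The second command reports the state and completion percentage of the move while it runs in the background.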

175
Cloning Volumes

(Diagram: a parent FlexVol volume and its FlexClone volume share the unchanged blocks of File A in the aggregate; new or changed blocks written to the clone, such as a new File C, consume additional space.)

A storage administrator uses the FlexClone feature to copy volumes. FlexClone


volumes are writable, point-in-time copies of a parent FlexVol volume. FlexClone
volumes are space-efficient because they share data blocks with their parent FlexVol
volumes for common data. Only when you write new data to a parent or clone does
the entity on which new data is written start occupying extra storage space. The client
or host can perform all operations on the files or LUNs in a FlexClone volume just as
they can on standard files or LUN.

A read/write FlexClone volume can be split from the parent volume, for example to
move the clone to a different aggregate. Splitting a read/write FlexClone volume from
its parent requires the duplication of the shared blocks and removes any space
optimizations that are currently used by the FlexClone volume. After the split, both the
FlexClone volume and the parent volume require the full space allocation determined
by their volume guarantees. The FlexClone volume becomes a normal FlexVol
volume.
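As an illustration, a FlexClone volume might be created from, and later split from, its parent with commands similar to the following; the volume names are placeholders.

::> volume clone create -vserver svm1 -flexclone vol1_clone -parent-volume vol1
::> volume clone split start -vserver svm1 -flexclone vol1_clone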

176
Knowledge Check
1. Which storage efficiency feature removes duplicate blocks?
a) Thin provisioning
b) Snapshot copy
c) Deduplication
d) Compression

Which storage efficiency feature removes duplicate blocks?

177
Knowledge Check
2. Data can be written to a Snapshot copy.
a) True
b) False

Data can be written to a Snapshot copy.

178
Knowledge Check
3. Data can be written to a FlexClone volume.
a) True
b) False

Data can be written to a FlexClone volume.

179
Lesson 3
Creating and Managing SVMs

Lesson 3, creating and managing SVMs.

180
SVM Setup Workflow
Step 1: SVM basic details

▪ SVM details:
▪ SVM name
▪ IPspace
▪ Volume Type
▪ Data Protocols
▪ Default Language
▪ Root volume security style
▪ Root aggregate (root
volume location)

▪ Domain Name Server (DNS) configuration

Creating SVMs by using OnCommand System Manager is wizard-based and simple


to use.

In the first step, you specify details about the SVM. Next you specify the Domain
Name Server, or DNS, configuration information.

The next steps depend on the protocols that you choose here. In this example, the
user has chosen CIFS, NFS and iSCSI, which require separate steps for NAS
protocols and SAN protocols.
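If you prefer the CLI to the wizard, the basic details from this first step map to a command roughly like the following sketch; all of the names are placeholders, and protocol configuration is still required afterward.

::> vserver create -vserver svm1 -rootvolume svm1_root -aggregate aggr1 -rootvolume-security-style unix -language C.UTF-8 -ipspace Default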

181
SVM Setup Workflow
Step 2: Configure NAS protocols

Configure CIFS or
NFS protocols:
▪ Configuration of data LIFs
▪ CIFS server configuration
▪ Network Information Service
(NIS) server configuration
(optional, for NFS)
▪ Provisioning (optional):
▪ Volume for CIFS storage
▪ Volume for NFS storage

If you choose either CIFS or NFS, you configure those protocols in Step 2. First, you
specify information about the data LIFs. If you choose the CIFS protocol, you specify
the CIFS server information. If you choose the NFS protocol, you might want to
specify the Network Information Service (NIS) server information if applicable.
Optionally, you can also have the wizard provision storage. You can specify those
details before continuing.

182
SVM Setup Workflow
Step 3: Configure SAN protocols

Configure iSCSI, FC, or


FCoE protocols:
▪ Configuration of data LIFs
▪ Provisioning (optional):
▪ Volume and LUN for iSCSI or
FC storage
▪ Initiator details

If you also choose either iSCSI or FC, you configure those protocols in Step 3. In the
example, the user chose iSCSI. If you choose FC, the steps are similar.

First, you specify information about the data LIFs. Optionally, you can also have the
wizard provision storage. You can specify those details before continuing.

183
SVM Setup Workflow
Step 4: Configure SVM administration

SVM administrator
details (optional):
▪ User name and password
▪ Configuration of
management LIF for SVM

In the final step, you are asked to optionally create an SVM administrator for use by
host-side applications like SnapDrive software and SnapManager software. Data LIFs
that are assigned the CIFS or NFS protocols enable management access by default.
For environments where only iSCSI or FC protocols are chosen and host-side
applications like SnapDrive and SnapManager are used, a dedicated SVM
management LIF is required.

184
Editing an SVM
Cluster administration

SVM properties that can be modified:


▪ Details: Data protocols
▪ Resource allocation: Delegate volume creation
▪ Services: Name service switch and
name mapping switch

After the SVM setup is complete, you can add or remove protocols, configure
resource allocation, or edit the name services properties.

By default, administrators can create a volume or move a volume within the SVM to
any aggregate in the cluster. To enable or prevent an SVM from using a particular
aggregate in the cluster, you edit the Resource Allocation properties. When the
“Delegate volume creation” option is selected, you can select aggregates to delegate
volume creation to those aggregates.
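A rough CLI equivalent of delegating aggregates to an SVM is shown below; svm1 and the aggregate names are placeholders.

::> vserver modify -vserver svm1 -aggr-list aggr1,aggr2
::> vserver show -vserver svm1 -fields aggr-list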

185
Volume Properties

Actions that can be taken on volumes:
▪ Create
▪ Edit
▪ Resize
▪ Delete
▪ Clone
▪ Move

Volume options:
▪ Storage efficiency
▪ Storage quality of service (QoS)

Tools to protect volumes:
▪ Snapshot copies
▪ Mirrors
▪ Vaults

Now that the SVM has been created, you can create, edit, resize, delete, clone, or
move volumes within the SVM. You can also configure efficiency features or
performance features, using storage quality of service, or QoS. Also, you can protect
volumes by using snapshot copies, mirrors, and vaults.

186
Configuring SVMs

Storage:
▪ Volumes
▪ Namespace
▪ Shares
▪ LUNs
▪ Qtrees
▪ Quotas

Policies:
▪ Export
▪ Efficiency
▪ Protection
▪ Snapshot
▪ Storage quality of service (QoS)

Protection:
▪ Mirror
▪ Vault

Configuration:
▪ Protocols
▪ Security
▪ Services
▪ Users and groups

In addition to volumes, you can allocate and configure other storage resources. You
can also create and apply policies and configure SVM data protection features. You
can also configure other settings, such as protocols, security,
services, users, and groups.

For more information about configuring SVMs, see the Logical Storage
Management Guide.

187
Policy-Based Management

(Diagram: a policy is a collection of rules and values; for example, a Snapshot policy defines schedules and the number of copies to retain, and an efficiency policy defines a schedule and a maximum run time.)

SVMs use policy-based management for many of their resources. A policy is a


collection of rules or properties that are created and managed by the cluster
administrator or sometimes by the SVM administrator. Policies are predefined as
defaults or policies can be created to manage the various resources. By default, the
policy applies to the current resources and to newly created resources, unless
otherwise specified.

For example, Snapshot policies can be used to schedule automatic controller-based


Snapshot copies. The policy includes such things as the schedule or schedules to
use and how many copies to retain. When a volume is created for the SVM, the policy
is automatically applied, but the policy can be modified later.

The efficiency policy is used to schedule postprocess deduplication operations. The


policy might include when and how long deduplication runs.

These examples are only two of the policies that you encounter in ONTAP. The
advantage of policy-based management is that when you create a policy, you can
apply the policy to any appropriate resource, either automatically or manually. Without
policy-based management, you would have to enter these settings for each individual
resource separately.
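To make the Snapshot policy example concrete, a minimal sketch follows; the policy name, schedule, and retention count are placeholders.

::> volume snapshot policy create -vserver svm1 -policy daily7 -enabled true -schedule1 daily -count1 7
::> volume modify -vserver svm1 -volume vol1 -snapshot-policy daily7

The first command defines the policy, and the second applies it to an existing volume.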

188
Knowledge Check
1. How can you change the configuration to prevent an SVM from
creating a volume on a particular aggregate?
a) Modify the aggregate settings
b) Modify the SVM settings
c) Modify the volume settings
d) Modify the user policy

How can you change the configuration to prevent an SVM from creating a volume on
a particular aggregate?

189
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com

When ready, click the Play button to continue.

Resources

190
ONTAP Cluster Fundamentals:
Maintenance

© 2018 NetApp, Inc. All rights reserved. Legal Notices

Welcome to ONTAP Cluster Fundamentals: Maintenance.

191
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
Course
5. Maintenance
Modules

The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.

This module was written for cluster administrators and provides an introduction to the
concept of servicing and maintaining clusters.

192
This module focuses on enabling you to do the following:
▪ Upgrade cluster hardware and software
▪ Describe the performance features and monitoring tools
▪ Describe the tools and features that are used to identify and
About This resolve cluster issues
Module

This module discusses how to maintain the health of a cluster. You learn about
hardware and software upgrades, performance maintenance, cluster issues, and the
tools that can be used to maintain clusters.

193
Lesson 1
Nondisruptive Upgrades

Lesson 1, nondisruptive upgrades.

194
Nondisruptive Upgrades and Operations

Nondisruptive Upgrades (NDU):
▪ Nondisruptive software upgrade types:
▪ Rolling upgrade
▪ Batch upgrade
▪ Automated upgrade
▪ Nondisruptive hardware maintenance:
▪ Adding, replacing, or upgrading hardware components on a node
▪ Adding nodes to a cluster

Nondisruptive Operations (NDO):
▪ Moving an aggregate between the nodes of a high-availability (HA) pair
▪ Moving volumes, LUNs, and logical interfaces (LIFs) within a storage virtual machine (SVM)
▪ Creating a FlexClone of a volume or LUN

Nondisruptive upgrades and operations require healthy HA pairs.

This module examines nondisruptive upgrades (NDUs) and nondisruptive operations


(NDOs).

Clusters can be upgraded nondisruptively by using the high-availability, or HA,


architecture of ONTAP. The three types of NDUs are rolling, batch, and automated
upgrades. The type of upgrade that you use depends on the version of ONTAP that
the cluster is running and the target version. Usually, hardware maintenance can be
performed nondisruptively also; for example, adding components to nodes, replacing
components, or adding new nodes.

Clusters also support nondisruptive operations, or NDO. Examples of NDO include


moving aggregates between the nodes of an HA pair and moving volumes, LUNs,
and logical interfaces within SVMs. Also, FlexClone volumes and FlexClone LUNs
can be created without disruption to the source volume or LUN.

HA pairs and the ONTAP architecture make many of these nondisruptive operations
possible.

195
Upgrade Advisor

List the serial numbers for each node in the cluster.

Upgrade Advisor, which is part of NetApp Active IQ, simplifies the process of planning
ONTAP upgrades. NetApp strongly recommends that you generate an upgrade plan
from Upgrade Advisor before upgrading your cluster.

When you submit your system identification and target release to Upgrade Advisor,
the tool compares AutoSupport data about your cluster to known requirements and
limitations of the target release. Upgrade Advisor then generates an upgrade plan
(and optionally a back-out plan) with recommended preparation and execution
procedures.

196
Rolling Upgrade
To perform a software upgrade in a cluster that consists of two or more nodes:
1. The HA partner takes over control of the storage resources.
2. The node that is being upgraded is taken offline.
3. The node is upgraded after a reboot.
4. When the upgrade is complete, the node gives back control to the original node.
5. The process is repeated on the other node of the HA pair.
6. The process is repeated on additional HA pairs.

Rolling upgrades can be performed on clusters of two or more nodes, but rolling
upgrades are run on one node of an HA pair at a time.

For a rolling upgrade, the partner node must first perform a storage takeover of the
node that is being upgraded. The node that is being upgraded is taken offline and
upgraded while its partner controls the storage resources. When the node upgrade is
complete, the partner node gives control back to the original owning node. The
process is repeated, this time on the partner node. Each additional HA pair is
upgraded in sequence until all HA pairs are running the target version.

197
Batch Upgrade

To perform a software upgrade in a cluster that consists of eight or more nodes:
1. The cluster is separated into two batches, each of which contains multiple HA pairs.
2. In the first batch, one node in each HA pair is taken offline and upgraded while their partner nodes take over their storage.
3. When upgrades are complete on the first nodes, the other node of the HA pair is upgraded.
4. The process is then repeated on the second batch.
(Diagram: an eight-node cluster divided into Batch 1 and Batch 2, each containing two HA pairs.)

Batch upgrades can be performed on clusters of eight or more nodes. Unlike rolling
upgrades, batch upgrades can be run on more than one HA pair at a time.

To perform a batch upgrade, the cluster is separated into two batches, each of which
contains multiple HA pairs. In the first batch, one node in each HA pair is taken offline
and upgraded while the partner nodes take over the storage. When the upgrade is
completed for the first half of all the HA pairs, the partner nodes give control back to
the original owning nodes. Then the process is repeated, this time on the partner
nodes. The process then begins on the second batch.

198
Software Upgrade with System Manager

If you are upgrading ONTAP and you prefer a UI, you can use OnCommand
System Manager to perform an automated, nondisruptive upgrade. Alternatively, you
can use the CLI to perform upgrades.

199
Automated Upgrade
Stage 1: Select ONTAP software image
▪ Display the current cluster version.
▪ Select a software image:
▪ Select from an available image.
▪ Download an image from the NetApp Support site.

Stage 2: View and validate cluster
▪ Validate the cluster update readiness.
▪ Display validation errors and warnings with corrective action.
▪ Update when validation is complete and successful.
▪ Enable update with warnings.

Stage 3: Update cluster
▪ Update all the nodes in the cluster or an HA pair in the cluster.
▪ Support a rolling or batch update.
▪ Default update type depends on the number of nodes in cluster.

The automated upgrades that are performed by using System Manager consist of
three stages. The stages are select, validate, and update.

In the first stage, you select the ONTAP software image. The current version details
are displayed for each of the nodes or HA pairs. System Manager enables you to
select an already available software image for the update or to download a software
image from the NetApp Support site and add the image for the update.

In the second stage, you view and validate the cluster against the software image
version for the update. A pre-update validation checks whether the cluster is in a state
that is ready for an update. If the validation is completed with errors, a table displays
the status of the various components and the required corrective action for the errors.
You can perform the update only when the validation is completed successfully.

In the third and final stage, you update all the nodes in the cluster, or an HA pair in
the cluster, to the selected version of the software image. The default upgrade type
can be rolling or batch. The upgrade type that is performed depends on the number of
nodes in the cluster. While the update is in progress, you can choose to pause and
then either cancel or resume the update. If an error occurs, the update is paused and
an error message is displayed with the remedial steps. You can choose to either
resume the update after performing the remedial steps or cancel the update. You can
view the table with the node name, uptime, state, and ONTAP version when the
update is successfully completed.
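From the CLI, the same automated workflow is driven by the cluster image commands, roughly as sketched below; the image URL and target version are placeholders.

::> cluster image package get -url http://webserver/ontap_image.tgz
::> cluster image validate -version 9.3
::> cluster image update -version 9.3
::> cluster image show-update-progress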

200
Nondisruptive Hardware Maintenance
To perform hardware maintenance in a cluster that consists of two or more nodes:
1. The HA partner takes over control of the storage resources.
2. The node that is being serviced is taken offline and powered off.
3. After the node has been serviced, the node is powered on.
4. When the node is back online, the partner node gives back control to the original node.

Examples of nondisruptive hardware maintenance include adding or replacing an


expansion card. Nondisruptive hardware maintenance is similar to a rolling upgrade.
Maintenance is performed on one node of an HA pair at a time.

For hardware maintenance, the partner node must first perform a storage takeover of
the node that will be serviced. The node can now be taken offline and powered off.
After the node has been serviced, the node is powered on. After the node has come
back online and is healthy, the partner node gives control back to the original owning
node. The process can be repeated, this time on the partner node, if necessary.
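A simplified CLI outline of this takeover and giveback cycle follows; the node name is a placeholder.

::> storage failover takeover -ofnode cluster1-01
::> storage failover show
(service the node, then power it back on)
::> storage failover giveback -ofnode cluster1-01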

201
Nondisruptive Addition of Nodes to a Cluster
To add nodes to a healthy multinode switched cluster:
1. Verify that the nodes are configured as HA pairs and connected to the cluster interconnect.
2. Power on both nodes of the HA pair.
3. Start the Cluster Setup wizard on one of the nodes.
4. Use the join command and follow the wizard.
5. Repeat Steps 3 and 4 on the partner node.

::> cluster setup
Welcome to the cluster setup wizard.
You can enter the following commands at any time:
"help" or "?" - if you want to have a question clarified,
"back" - if you want to change previously answered questions, and
"exit" or "quit" - if you want to quit the cluster setup wizard.
Any changes you made before quitting will be saved.
You can return to cluster setup at any time by typing "cluster setup".
To accept a default or omit a question, do not enter a value.
Do you want to create a new cluster or join an existing cluster?
{create, join}: join

You can expand an existing cluster by nondisruptively adding nodes to it.

Nodes must be added from HA pairs that are connected to the cluster interconnect.
Nodes are joined to the cluster one at a time. Power on both nodes of the HA pair that
you want to add to the cluster. After the nodes boot, use a console connection to start
the Cluster Setup wizard on one of the nodes. Use the join command and follow the
wizard. After the node has been joined to the cluster, repeat the steps for the partner
node and any additional nodes that you want to add.

202
Cluster Expansion
ONTAP 9.2 or greater

ONTAP 9.2 System Manager automatically detects the following:
▪ New compatible nodes
▪ Switchless cluster configurations
▪ Switched cluster configurations

Beginning with ONTAP 9.2, clusters can also be expanded nondisruptively using
System Manager. System Manager automatically detects any new compatible nodes,
whether the cluster configuration is switchless or switched.

203
Knowledge Check
1. Which two upgrade types can group HA pairs that are
upgraded together? (Choose two.)
a. Rolling upgrade
b. Batch upgrade
c. Automated upgrade
d. Hardware upgrade

Which two upgrade types can group HA pairs that are upgraded together?

204
Knowledge Check
2. What are the three phases of an automated upgrade?
(Choose three)
a. Select
b. Validate
c. Failover
d. Update

What are the three phases of an automated upgrade?

205
Lesson 2
Cluster Performance

Lesson 2, cluster performance.

206
Performance Considerations
▪ Workloads
▪ I/O operation types:
▪ Random
▪ Sequential

▪ Quality of service (QoS)



Storage system performance calculations vary widely based on the kind of


operations, or workloads, that are being managed.

The storage system sends and receives information in the form of I/O operations. I/O
operations can be categorized as either random or sequential. Random operations
are usually small. Random operations lack any pattern and happen quickly, for
example database operations. In contrast, sequential operations are large, with
multiple parts that must be accessed in a particular order, for example video files.

Some applications have more than one dataset. For example, a database
application’s data files and log files might have different requirements. Data
requirements might also change over time. For example, data might start with specific
requirements but as the data ages, those requirements might change.

Also, if more than one application is sharing the storage resources, each workload
might need to have quality of service, or QoS, restrictions imposed. The QoS
restrictions prevent applications or tenants from being either bullies or victims.

207
Analyzing I/O
IOPS

▪ I/O is measured in input/output operations per second (IOPS).


▪ IOPS measures how many requests can be managed in one second.
▪ IOPS data is most useful if I/O has any of these features:
▪ I/O request patterns are random.
▪ I/O requests are small.
▪ Multiple I/O sources must be managed.

Input/output operations per second (IOPS) is a measurement of how many requests


can be managed in one second. Factors that affect IOPS include the balance of read
and write operations in the system. IOPS is also affected by whether traffic is
sequential, random, or mixed. Other factors that affect IOPS are the type of
application; the operating system; background operations; and I/O size.

Applications with a random I/O profile, such as databases and email servers, usually
have requirements that are based on an IOPS value.

208
Analyzing I/O
Throughput

▪ Throughput is a measurement of how much data can be managed in


one second.
▪ Throughput is measured in megabytes per second (MBps).
▪ Throughput data is most useful when I/O
has any of these features:
▪ I/O request patterns are sequential.
▪ I/O requests are large.
▪ Storage is dedicated to one application.

Throughput is a measurement of the average amount of data, in megabytes, that can
be transferred within a period for a specific file size. Throughput is
measured in megabytes per second, or MBps.

Applications with a sequential I/O profile, such as video or audio streaming, file
servers, and disk backup targets, usually have requirements that are based on an
MBps value.

209
Analyzing I/O
Latency

▪ Latency is measured in milliseconds (ms).


▪ Latency is a measurement of how long data processing takes.
▪ Latency values are most useful when you are comparing flash performance.

Latency is the measurement of how long a storage system takes to process an I/O
task. Smaller latency time values are better.

Latency for hard disks is typically measured in milliseconds. Because solid-state


media is much faster than hard disks, the latency of the media is measured in
submilliseconds or microseconds.
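One quick way to observe these three metrics together is the QoS statistics output, which lists IOPS, throughput, and latency per workload; a sketch of the command follows, and the exact columns vary by release.

::> qos statistics volume performance show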

210
ONTAP Performance
You must balance the need for performance and the need for resilience:
▪ More disks per RAID group increase performance.
▪ Fewer disks per RAID group increase resilience.

Protect data. Use space efficiently. Always follow best practices.

ONTAP performance is measured at the aggregate level. To support the differing


security, backup, performance, and data sharing needs of your users, you can group
the physical data storage resources on your storage system into one or more
aggregates. You can then design and configure these aggregates to provide the
appropriate level of performance and redundancy.

When creating aggregates and the underlying RAID group, you must balance the
need for performance and the need for resilience. By adding more disks per RAID
group, you increase performance by spreading the workload across more disks, but at
the cost of resiliency. In contrast, adding fewer disks per RAID group increases the
resiliency because the parity has less data to protect, but at the cost of performance.

By following best practices when you add storage to an aggregate, you optimize
aggregate performance. Also, you should choose the right disk type for the workload
requirements.
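As a hypothetical example, an aggregate built from SAS disks might be created as follows; the aggregate name, node, and disk counts are placeholders, and RAID groups should be sized according to current best practices.

::> storage aggregate create -aggregate aggr_sas01 -node cluster1-01 -diskcount 20 -disktype SAS -raidtype raid_dp -maxraidsize 20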

211
Performance of Disk Types
(Diagram: disk types plotted by performance versus capacity)
▪ SSD: high IOPS and high cost per GB; use solid-state drives (SSDs) for ultra-performance
▪ SAS: use SAS for performance
▪ SATA: low IOPS and low cost per GB; use SATA for capacity
▪ Flash acceleration can improve the performance of capacity disks

The proper disk type depends on the performance or capacity requirements of the
workload.

When a workload requires the largest capacity at the lowest cost with lower
performance, SATA disks should be used.
When a workload requires the highest performance at the lowest cost with lower
capacity, solid-state drives (SSDs) should be used.
When a workload requires a balance of capacity and performance, SAS disks should
be used.

Sometimes, a workload might require large amounts of capacity at the lowest cost but
at a higher performance than SATA or SAS provides. To improve the performance of
high-capacity hard disks, Flash Cache or a Flash Pool can be used.

212
Virtual Storage Tier
Flash Cache:
▪ Controller-level cache
▪ Flash Cache modules in the expansion slots of a node
▪ Improved response time for repeated, random reads
▪ Simple use; no additional administration
▪ Cache for all volumes on the controller

Flash Pool:
▪ Storage-level cache
▪ Hybrid aggregates of hard disks and SSDs
▪ Improved response time for repeated, random reads and overwrites
▪ Consistent performance across storage failover events
▪ Cache for all volumes that are on the aggregate

The Virtual Storage Tier provides two flash acceleration methods to improve the
performance of FAS storage systems.

Flash Cache uses expansion modules to provide controller-level flash acceleration.


Flash Cache is an ideal option for multiple heterogeneous workloads that require
reduced storage latency for repeated random reads, for example file services. The
feature is simple to use, because all the volumes on the controller and on aggregates
that use hard disks are automatically accelerated.

Flash Pool uses both hard disks and SSDs in a hybrid aggregate to provide storage-
level flash acceleration. Flash Pool is an ideal option for workloads that require
acceleration of repeated random reads and random overwrites, for example database
and transactional applications. Because Flash Pool is at the storage level, rather than
in the expansion slot of a controller, the cache remains available even during storage
failover or giveback. Like Flash Cache, the Flash Pool feature is simple to use,
because acceleration is automatically provided to volumes that are on the Flash Pool
aggregate.

213
SSDs in Flash Pool
▪ SSDs can be added to a hybrid aggregate.
▪ SSDs can also be partitioned into storage pools.
(Diagram: a storage pool of SSDs in which each SSD is divided into four partitions; each row of partitions forms an allocation unit that can be assigned to Node1 or Node2 of the HA pair.)

When adding SSDs to a Flash Pool aggregate, you add the SSDs to form a RAID
group dedicated to caching. Alternatively, you can use Flash Pool SSD partitioning,
also known as Advanced Drive Partitioning. Flash Pool SSD partitioning enables you
to group SSDs together into an SSD storage pool from which partitions are allocated
to multiple Flash Pool aggregates. This grouping spreads the cost of the parity SSDs
over more aggregates, increases SSD allocation flexibility, and maximizes SSD
performance. The storage pool is associated with an HA pair, and can be composed
of SSDs owned by either node in the HA pair.

When you add an SSD to a storage pool, the SSD becomes a shared SSD, and the
SSD is divided into four partitions. The SSD storage pool is made up of rows of these
partitions, which are called allocation units. Each allocation unit represents 25 percent
of the total storage capacity of the storage pool. Each allocation unit contains one
partition from each SSD in the storage pool. Allocation units are added to a Flash
Pool cache as a single RAID group. By default, for storage pools associated with an
HA pair, two allocation units are assigned to each of the HA partners. However, you
can reassign the allocation units to the other HA partner if necessary.
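
For illustration only, with hypothetical names (storage pool sp1, aggregate aggr1_node1) and hypothetical disk IDs, creating an SSD storage pool and assigning one allocation unit to a Flash Pool aggregate might resemble the following; verify the exact syntax for your ONTAP release:

storage pool create -storage-pool sp1 -disk-list 1.0.20,1.0.21,1.0.22,1.0.23,1.0.24,1.0.25   (partition six SSDs into a shared storage pool)
storage pool show   (verify the pool and its allocation units)
storage aggregate add-disks -aggregate aggr1_node1 -storage-pool sp1 -allocation-units 1   (add one allocation unit to the Flash Pool cache)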

214
Cluster Performance
Adding and relocating resources
Relocating resources nondisruptively:
▪ Moving an aggregate between the nodes of an HA pair
▪ Moving volumes, LUNs, and LIFs within an SVM
▪ Creating a FlexClone of a volume or LUN


We have been discussing performance at the node level. We also need to discuss
performance at the cluster level.

In this example, an administrator creates some volumes on a two-node cluster that is


used for file services. The system is configured with SATA disks to meet the workload
requirements.

After some time, the administrator needs to add a volume for a database application.
The SATA disks do not meet the requirements for this new workload. The
administrator decides, for future growth, to nondisruptively add another HA pair with
SAS disks. With new nodes with SAS disks active in the cluster, the administrator can
nondisruptively move the volume to the faster disks.

The slide shows some other nondisruptive resource relocation actions that are
commonly performed in a cluster.
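
As a hedged example (the SVM, volume, and aggregate names are hypothetical), a nondisruptive volume move to the new SAS aggregate might be started and monitored like this:

volume move start -vserver svm1 -volume db_vol1 -destination-aggregate aggr_sas_node3   (begin the nondisruptive move)
volume move show -vserver svm1 -volume db_vol1   (monitor the progress of the move)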

215
Cluster Performance
All Flash FAS
FlashEssentials features:
▪ Coalesced writes to free blocks
▪ A random read I/O processing path
▪ A highly parallelized processing architecture
▪ Built-in quality of service (QoS)
▪ Inline data reduction and compression

The administrator has a new workload that requires high performance. For easier
management of the various workload types, the administrator decides to create a new
high-performance tier in the cluster that uses All Flash FAS controllers.

NetApp FlashEssentials is the power behind the performance and efficiency of All
Flash FAS. All Flash FAS uses high-end or enterprise-level controllers with an all-
flash personality, which supports SSDs only. The slide shows some of the
FlashEssentials features. For more information about All Flash FAS and
FlashEssentials, see Using All Flash FAS with ONTAP on the NetApp Support site. A
link is provided in the module resources.

216
Storage QoS
Storage QoS can deliver consistent performance for mixed workloads and mixed tenants.
Monitor, isolate, and limit workloads of storage objects:
▪ Volume
▪ LUN
▪ File
▪ SVM

Storage quality of service, or QoS, can be used to deliver consistent performance by


monitoring and managing application workloads.

The storage QoS feature can be configured to prevent user workloads or tenants from
affecting each other. The feature can be configured to isolate and throttle resource-
intensive workloads. The feature can also enable critical applications to achieve
consistent performance expectations. QoS policies are created to monitor, isolate,
and limit workloads of such storage objects as volumes, LUNs, files, and SVMs.

Policies are throughput limits that can be defined in terms of IOPS or megabytes per
second.
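
To make this concrete, a minimal sketch with hypothetical names (policy group pg_db, SVM svm2, volume db_vol1) might look like the following; confirm the syntax against the ONTAP documentation for your release:

qos policy-group create -policy-group pg_db -vserver svm2 -max-throughput 5000iops   (define a throughput limit of 5,000 IOPS)
volume modify -vserver svm2 -volume db_vol1 -qos-policy-group pg_db   (apply the limit to a volume)
qos statistics performance show   (monitor workload performance against the policy)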

217
Monitoring Cluster Performance
Using OnCommand System Manager

Cluster performance charts:
▪ Viewable items:
▪ Latency (ms/op)
▪ IOPS (Ops/s)
▪ Throughput (MBps)
▪ Performance sample every 15 seconds
▪ Point-in-time view of cluster performance

System Manager has built-in cluster performance monitoring from the main window.
The cluster performance charts enable you to view latency, IOPS, and throughput.

Performance is sampled every 15 seconds to provide a point-in-time view of cluster


performance.

218
Monitoring Cluster Performance
Using OnCommand Unified Manager

Click links for more details

System Manager provides simplified device-level management for a single cluster.


For larger environments, Unified Manager should be used to monitor, manage, and
report on cluster resources at scale.

The Overview Dashboard provides a high-level view of the performance of your
clusters, SVMs, and volumes to quickly identify any performance issues. Click the
links for more detailed information.

The Performance Dashboard provides various performance metrics for each cluster
that Unified Manager is monitoring.

219
OnCommand Portfolio
(Diagram: tools positioned by complexity of configuration, from basic to complex, and by storage support, from NetApp storage to multivendor)
▪ OnCommand Insight: Performance, Capacity, Configuration, and Strong ROI Story. Target Audience: Large Enterprises and Service Providers
▪ Unified Manager: Manage at Scale, Automate Storage Processes, and Data Protection. Target Audience: Midsize to Large Enterprise Customers
▪ System Manager: Simple, Web-Based, and No Storage Expertise Required. Target Audience: Small to Midsize Businesses

There are many management tools to choose from.

Although System Manager provides simplified device-level management and Unified


Manager can be used for monitoring cluster resources at scale, these products are
used to monitor only ONTAP storage systems. OnCommand Insight enables storage
resource management, including configuration and performance management and
capacity planning, along with advanced reporting for heterogeneous environments.

220
Knowledge Check
1. Match each term with the term’s function.
▪ Workload: The type of input and output operations
▪ IOPS: The number of input and output operations that can be managed per second
▪ Throughput: The number of megabytes that can be managed per second
▪ Latency: The number of milliseconds that it takes to process an operation
▪ Storage QoS: The management of restrictions imposed on input and output operations

Match each term with the term’s function.

221
Knowledge Check
2. When you create a Flash Pool, which two options are
supported? (Choose two.)
a. SATA disks with SSDs
b. SAS disks with SSDs
c. Array LUNs with SSDs on FAS only
d. Array LUNs with SSDs on All Flash FAS only

When you create a Flash Pool, which two options are supported?

222
Knowledge Check
3. When Flash Pool SSD partitioning is used, how many
partitions are created by default?
a. Two partitions; one per node
b. Three partitions; one per node plus a parity partition
c. Four partitions; two per node
d. Five partitions; two per node plus a parity partition

When Flash Pool SSD partitioning is used, how many partitions are created by
default?

223
Lesson 3
Identifying Issues

Lesson 3, identifying issues.

224
Common Issues

Alerts Disk Failure Performance

Component Failure Configuration Storage Utilization

Understanding the topics and best practices covered in the ONTAP Cluster
Fundamentals course is essential to keeping a cluster healthy and working
continuously without disruptions. But components can fail, configurations change, and
performance can suffer due to over-utilization or configuration issues.

Troubleshooting serious issues can be overwhelming, and troubleshooting is beyond


the scope of a fundamentals course. However, a cluster administrator has tools to
monitor, analyze, and possibly resolve some potential issues. This lesson discusses
the potential issues a cluster administrator might encounter.

225
Active IQ
▪ Dashboard
▪ Inventory of NetApp
systems
▪ Health summary and
trends
▪ Storage efficiency and
risk advisors

▪ Active IQ mobile app (iOS and Android)

Active IQ provides predictive analytics and proactive support for your hybrid cloud.
Along with an inventory of NetApp systems, you are provided with a predictive health
summary, trends, and a system risk profile.

You can access Active IQ from NetApp Support or through the Active IQ mobile app.
Alerts
Tools to monitor system:
▪ System Manager
▪ Unified Manager
▪ Event management
system (EMS)
▪ AutoSupport


Monitoring your system regularly is a best practice.

In the example, there is an alert from System Manager that needs to be diagnosed.
When there is an alert or event, first try the solution that the monitoring software
suggests.
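
For example (shown here only as an illustration; parameter values can vary by release), the event management system and AutoSupport can also be checked from the CLI with commands such as:

event log show -severity ERROR   (list recent EMS events at the ERROR severity)
system node autosupport show -node *   (confirm that AutoSupport is enabled and able to transmit)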

227
Component Failure
LEDs to observe:
▪ Controllers
▪ Drives
▪ Switches
▪ Ports
Items to inspect:
▪ Cables
▪ Connections
▪ Power
Common cluster CLI commands:
▪ cluster show
▪ system node show

There are a few basic actions that you can take to assess the situation. The actions
are not listed in any particular order on the slide.
Observe the LEDs on the controllers, drives, switches, and ports.
Inspect the cables, connections, and power.
Analyze the cluster, nodes, and resources by using common CLI commands such as
cluster show and system node show.
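
As a brief illustration, the following commands give a first view of cluster, node, port, and shelf health; the selection shown here is an example rather than a complete checklist:

cluster show   (verify that each node is healthy and eligible)
system node show   (check node health, uptime, and model)
network port show   (check the link status of the physical ports)
storage shelf show   (check the status of the attached shelves)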

228
Disk Failures
▪ ONTAP continually monitors disks.
▪ When a disk error is encountered:
▪ Disk is taken offline.
▪ Disk is placed in the maintenance center.
▪ ONTAP performs rapid RAID recovery.
Handling a suspect disk:
▪ Prefail: Place the suspect disk in prefail mode.
▪ Hot spare: Select a suitable hot spare replacement.
▪ Copy: Copy the suspect disk contents to the selected spare.
▪ Fix or fail: After the copy is complete, put the disk into the maintenance center to fix or fail the disk.

ONTAP continually monitors disks to assess their performance and health. This
monitoring is often called “predictive failure” in the storage industry.

When ONTAP encounters certain errors or behaviors from a disk, ONTAP takes the
disk offline temporarily or takes the disk out of service to run further tests. While the disk
is offline, ONTAP reads from other disks in the RAID group while writes are logged.
When the offline disk is ready to come back online, ONTAP resynchronizes the RAID
group and brings the disk online. This process generally takes a few minutes and incurs
a negligible performance effect.

Disks can sometimes display small problems that do not interfere with normal operation,
but the problems can be a sign that the disk might fail soon. The maintenance center
provides a way to put these disks under increased scrutiny. When a suspect disk is in
the maintenance center, the disk is subjected to several tests. If the disk passes all of
the tests, ONTAP redesignates the disk as a spare; if the disk fails any tests, ONTAP
fails the disk. By default, ONTAP puts a suspect disk into the maintenance center
automatically only if there are two or more spares available for that disk.

When ONTAP determines that a disk has exceeded error thresholds, ONTAP can
perform rapid RAID recovery. ONTAP removes the disk from its RAID group for testing
and, if necessary, fails the disk. Spotting disk errors quickly helps prevent multiple disk
failures and enables problem disks to be replaced. By performing the rapid RAID
recovery process on a suspect disk, ONTAP avoids long rebuilding time, performance
degradation, and potential data loss due to additional disk failure during reconstruction.
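
To see where disks stand, a sketch such as the following can help; the option names are given as an illustration and can differ slightly by ONTAP release:

storage disk show -broken   (list failed disks)
storage disk show -maintenance   (list disks currently in the maintenance center)
storage disk show -spare   (list available hot spares)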

229
Disk Failures
Spare disk selection

▪ Exact match
▪ Larger size: Unused capacity
▪ Different speed: Performance
▪ Degraded mode: No replacement
Do not run a RAID group in degraded mode for more than 24 hours.

ONTAP always tries to choose a hot spare that exactly matches the failed or failing
disk. If an exact match is not available, ONTAP uses the best available spare, or
ONTAP puts the RAID group into a degraded mode. Understanding how ONTAP
chooses an appropriate spare when there is no matching spare enables you to
optimize the spare allocation for your environment.

ONTAP uses specific criteria to choose a nonmatching hot spare.

First, if the available hot spares are not the correct size, ONTAP uses the hot spare
that is the next larger size, if there is one. The replacement disk is downsized to
match the size of the disk that it is replacing; the extra capacity is not available.

Next, if the available hot spares are not the correct speed, ONTAP uses a hot spare
that is a different speed. Using disks with different speeds in the same aggregate is
not optimal. Replacing a disk with a slower disk can cause performance degradation,
and replacing a disk with a faster disk is not cost-effective.

Finally, if no spare exists with an equivalent disk type or checksum type, the RAID
group that contains the failed disk enters degraded mode. ONTAP does not combine
effective disk types or checksum types within a RAID group.

Degraded mode is intended to be a temporary condition until an appropriate spare


disk can be added. Do not run a RAID group in degraded mode for more than 24
hours.
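
As an illustrative check (not a complete procedure), the following commands show whether enough matching spares exist and whether any RAID group is running degraded:

storage aggregate show-spare-disks   (list spare disks available to each node)
storage aggregate show -fields raidstatus   (look for aggregates reporting a degraded RAID status)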

230
Configuration
Config Advisor

▪ ONTAP features:
▪ Validation of shelf cabling
▪ Validation of ONTAP and switches setup
▪ Firmware revision checks
▪ Support for MetroCluster, FlexPod, and 7-Mode Transition Tool (7MTT) transitions
▪ Config Advisor AutoSupport
▪ Config Advisor components:
▪ Collect
▪ Analyze
▪ Present

Config Advisor contains more than 300 configuration checks that can be used to
validate setup or operational configuration. Config Advisor contains checks for
cabling, shelf setup, and the latest firmware validation. Config Advisor also contains
several checks to validate network switches and the setup of ONTAP.

Config Advisor AutoSupport is specific to Config Advisor and is independent of the


AutoSupport tool in ONTAP. The Config Advisor AutoSupport requires its own HTTPS
connection over the Internet to transmit data back to NetApp. Config Advisor
AutoSupport is enabled by default during installation but can be disabled by updating
a setting in Config Advisor.

Config Advisor has three major components that collect data, analyze data, and
present the findings. For consistency in the display of alerts, the results are shown in
a table format similar to My AutoSupport. There is also a visual depiction of the shelf
and storage layout to better emphasize connectivity issues.

231
Performance
Ways to minimize performance issues:
▪ Correctly size and follow best practices for the specific workload.
▪ Verify the supported minimums and maximums.
▪ Adhere to the ONTAP storage platform mixing rules.
▪ Check compatibility of components, host OS, applications, and ONTAP version.
Potential performance issues:
▪ Controller: Resource over-utilization, ONTAP version, offline, or rebooting
▪ Storage: Disk types, aggregate configuration, volume movement, and free space
▪ Networking: Configuration, LIF location, port saturation, port speeds, or indirect access
▪ Host or clients: Application, drivers, network adapter, or user knowledge


As the saying goes, prevention is the best medicine. Start with a properly sized
system and follow best practices for ONTAP, the host operating system, and the
application. Verify that the supported minimums, maximums, and mixing rules are
adhered to. Always use the NetApp Interoperability Matrix Tool (or IMT) to check
compatibility of components, host OS, applications, and ONTAP.

Things can change over time and issues can arise. Performance issues can occur for
many different reasons, and analysis can be complex. Performance analysis is
beyond the scope of a fundamentals course, but some components that might be
related to performance issues are listed here.
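
When a performance issue is suspected, a first look from the CLI might resemble the following sketch; depending on the ONTAP release, statistics commands might require the advanced privilege level:

statistics show-periodic -interval 5 -iterations 12   (sample cluster CPU, throughput, and latency over one minute)
qos statistics volume performance show   (identify the volumes generating the most IOPS and latency)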

232
Storage Utilization
Ways to minimize use issues:
▪ Use the appropriate volume and LUN
settings for the workload requirements.
▪ Monitor free space to prevent offline
volumes and LUNs.
▪ Monitor the number of Snapshot copies.
▪ Select the appropriate efficiency settings.

When you provision storage, use the appropriate volume and LUN settings for the
workload requirements. There are best practices guides for ONTAP, host operating
systems, and applications.

When a resource such as a volume or a LUN runs out of space, ONTAP protects the
currently stored data by taking the resource offline. To prevent resources from going
offline, you should monitor the free space in aggregates, volumes, and LUNs. You
also need to monitor the number of Snapshot copies and their retention period
because they share space with user data in the volume.

When using efficiency features such as thin provisioning, deduplication, and


compression, select the appropriate settings for the workload. Different workloads
experience more or less savings depending on the type of data that is being stored.
Also, when resources are moved, you might lose or change the amount of savings.
Verify that there is enough space at both the source and the destination before
moving a volume or LUN.
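
For example (the SVM and volume names are hypothetical, and field names can vary slightly by release), free space and Snapshot consumption can be checked with commands like these:

storage aggregate show -fields availsize,percent-used   (check free space in each aggregate)
volume show -vserver svm1 -fields used,available,percent-used,percent-snapshot-space   (check volume capacity and Snapshot reserve usage)
volume snapshot show -vserver svm1 -volume db_vol1   (review the number and age of Snapshot copies)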

233
NetApp Support
▪ NetApp Support:
mysupport.netapp.com
▪ Hardware Universe:
hwu.netapp.com
▪ NetApp Interoperability
Matrix Tool (IMT):
mysupport.netapp.com/
matrix

For support information, documentation, software downloads, and access to Active


IQ, see NetApp Support at mysupport.netapp.com.
For system configuration information, see the NetApp Hardware Universe at
hwu.netapp.com.
To determine the compatibility between various NetApp and third-party products that
are officially supported, see the NetApp Interoperability Matrix Tool (IMT) at
mysupport.netapp.com/matrix.

234
Knowledge Check
1. A disk has experienced errors. What does ONTAP do if at least two
matching spares are available?
a. Immediately halts I/O and takes the disk offline.
b. Immediately halts I/O and rebuilds the disk to a spare.
c. Places the disk in the maintenance center and assesses the disk.
d. Enters degraded mode for 24 hours while the disk is being repaired.

A disk has experienced errors. What does ONTAP do if at least two matching spares
are available?

235
Knowledge Check
2. You require more UTA ports on a controller. Where do you find the
correct UTA expansion card?
a. MyAutoSupport
b. NetApp Interoperability Matrix Tool (IMT)
c. Hardware Universe
d. The expansion card vendor’s website

You require more UTA ports on a controller. Where do you find the correct UTA
expansion card?

236
Knowledge Check
3. You require more CNA ports on your host. Where do you find a
supported CNA card?
a. MyAutoSupport
b. NetApp Interoperability Matrix Tool (IMT)
c. Hardware Universe
d. The expansion card vendor’s website

You require more CNA ports on your host. Where do you find a supported CNA card?

237
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com


Resources

238
Thank You!

© 2018 NetApp, Inc. All rights reserved. Legal Notices

Thank you.

239
