Exploiting IBM PowerHA SystemMirror V6.1 for AIX Enterprise Edition
Dino Quintero
Benny Abrar
Adriano Almeida
Michael Herrera
SoHee Kim
Bjorn Roden
Andrei Socoliuc
Tom Swart
ibm.com/redbooks
SG24-7841-01
Note: Before using this information and the product it supports, read the information in Notices on
page xi.
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
The team who wrote this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Chapter 11. Disaster recovery by using Hitachi TrueCopy and Universal Replicator 515
11.1 Planning for TrueCopy/HUR management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
11.1.1 Software prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
11.1.2 Minimum connectivity requirements for TrueCopy/HUR . . . . . . . . . . . . . . . . . . 516
11.1.3 Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
11.2 Overview of TrueCopy/HUR management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
11.2.1 Installing the Hitachi CCI software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
IBM Redbooks publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
Other publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
Help from IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring
any obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results
obtained in other operating environments may vary significantly. Some measurements may have been made
on development-level systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been estimated through
extrapolation. Actual results may vary. Users of this document should verify the applicable data for their
specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines
Corporation in the United States, other countries, or both. These and other IBM trademarked terms are
marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US
registered or common law trademarks owned by IBM at the time this information was published. Such
trademarks may also be registered or common law trademarks in other countries. A current list of IBM
trademarks is available on the web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
AIX
AIX 5L
DB2
DS4000
DS6000
DS8000
Enterprise Storage Server
ESCON
FlashCopy
Global Technology Services
GPFS
HACMP
IBM
Micro-Partitioning
NetView
Power Systems
Power Systems Software
POWER6
POWER7
PowerHA
PowerVM
Redbooks
Redbooks (logo)
System i
System p
System Storage
SystemMirror
Tivoli
Preface
This IBM Redbooks publication positions the IBM PowerHA SystemMirror V6.1 for
AIX Enterprise Edition as the cluster management solution for high availability. This solution
enables near-continuous application service and minimizes the impact of planned and
unplanned outages.
The primary goal of this high-availability solution is to recover operations at a remote location
after a system or data center failure, establish or strengthen a business recovery plan, and
provide a separate recovery location. The IBM PowerHA SystemMirror Enterprise Edition is
targeted at multisite high-availability disaster recovery.
The objective of this book is to help new and existing PowerHA customers to understand how
to plan to accomplish a successful installation and configuration of the PowerHA
SystemMirror for AIX Enterprise Edition.
This book emphasizes the IBM Power Systems strategy to deliver more advanced
functional capabilities for business resiliency and to enhance product usability and robustness
through deep integration with AIX, its affiliated software stack, and storage technologies.
PowerHA SystemMirror is designed, developed, integrated, tested, and supported by IBM
from top to bottom.
Shawn Bodily
Steven Finnes
Francisco Garcia
Nick Harris
Kevin Henson
Kam Lee
Glenn E. Miller
Rohit Krishna Prasad
Ravi Shankar
David Truong
Thomas Weaver
Joe Writz
IBM US
Claudio Marcantoni
IBM Italy
Tony Steel
IBM Australia
Michael Malicdem
IBM Philippines
Bernhard Buehler
IBM Germany
Joergen Berg, Christian Schmidt
IBM Denmark
Alex Abderrazag
Rosemary Killeen
IBM UK
Laszlo Niesz
IBM Hungary
Bernard Goelen
IBM Belgium
David Arrigo
Paul Bessa
James Murray
EMC Corporation
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Summary of changes
This section describes the technical changes made in this edition of the book and in previous
editions. This edition may also include minor corrections and editorial changes that are not
identified.
Summary of Changes
for SG24-7841-01
for Exploiting IBM PowerHA SystemMirror V6.1 for AIX Enterprise Edition
as created or updated on February 18, 2014.
New information
This publication contains the following additional chapters:
Chapter 10, Disaster recovery with DS8700 Global Mirror on page 469
Chapter 11, Disaster recovery by using Hitachi TrueCopy and Universal Replicator on
page 515
Part 1. Introduction
This part provides an overview of the IBM PowerHA SystemMirror Enterprise Edition solution
and details about the infrastructure considerations that clients must be aware of before
implementing the solution.
This part includes the following chapters:
Chapter 1, High-availability and disaster recovery overview on page 3
Chapter 2, Infrastructure considerations on page 39
Chapter 1. High-availability and disaster recovery overview
This chapter includes the following topics:
Introduction
Disaster recovery and PowerHA
Application data integrity
Selecting the correct solution
PowerHA enterprise logistics
What is new in PowerHA Enterprise Edition
1.1 Introduction
The IBM PowerHA clustering solution has undergone changes over the last couple of years.
The most obvious change is the rebranding from its former HACMP name to PowerHA in
2008 with the 5.5 release. The 6.1 release in late 2009 updated the name to PowerHA
SystemMirror for AIX. This change was a result of an IBM Power Systems Software
initiative to provide an array of IBM solutions, such as the following examples, under the
PowerHA name:
IBM PowerHA SystemMirror for AIX
IBM PowerHA SystemMirror for System i
IBM PowerHA DB2 PureScale
To simplify the offerings, the base clustering product is now sold as the Standard Edition and
provides local clustering functions. The former HACMP/XD or Extended Distance that was
intended for disaster recovery is now bundled into the PowerHA SystemMirror Enterprise
Edition. Table 1-1 compares the features that are offered by these editions.
Table 1-1 Standard Edition versus Enterprise Edition features comparison
The table compares the two editions across the following features: Smart Assists, multisite HA
management, PowerHA GLVM, the GLVM deployment wizard, IBM Metro Mirror support, IBM
Global Mirror support, and EMC SRDF/S and SRDF/A support.
Historically, the Enterprise Edition was mistaken for a single offering that addresses all areas
as they pertain to data replication and site failover. This extension to the base product is in
essence an umbrella of integrated solutions that can be fitted into various environments,
depending on a multitude of factors.
In early versions of the High Availability Cluster Multi-Processing (HACMP) software solution,
the mechanism that is used to replicate data between sites revolved around the Geographic
Remote Mirroring (GeoRM) product. The GeoRM product used the concept of a Geographic
Mirror Device (GMD). GMD was a logical device that the application wrote to. It pointed to a
logical volume on either site and would commit the following types of writes to them based on
the policy that is specified:
Synchronous
Synchronous with mirror write consistency
Asynchronous
Used in a stand-alone fashion, you could replicate in one direction and reverse the mirroring
flow manually if required. When integrated with the HACMP cluster software, GeoRM became
known as HAGEO. The HAGEO construct allowed for the automated control of the GMDs and
role reversal of the replication between the sites depending on where the resources were
being hosted. Although the GeoRM product does a good job of moving the data in a
consistent way between sites, it is more complex to implement, and there are considerations
to be made regarding dynamic operations within the cluster.
To address certain HAGEO limitations and to simplify the logical device implementation when
using IP replication, in the HACMP 5.3 release in 2005, the development team introduced
Geographic Logical Volume Mirroring (GLVM). Similar to the original GeoRM product, GLVM
can be installed from the base AIX 5.3 or AIX 6.1 media and used as a stand-alone product
for IP replication between sites. However, its integration with the IBM PowerHA Enterprise
Edition automates the management and role reversal of replicated resources if planned and
unplanned movements occur. The initial release of GLVM provided a synchronous-only mode
of IP replication as an alternative to HAGEO.
The GLVM product is much simpler to use because it is ultimately built on the concept of
logical volume mirroring. After you install the GLVM file sets and establish IP connectivity
between the sites, you can make LUNs that are in a remote site appear as though they are
local. By adding those LUNs into your volume groups and using them to create logical volume
mirrors, you can have copies of the data at the local sites and remote sites. Because of its
sync-only mode, at the time it was deemed necessary to continue to sustain the HAGEO
offering. However, the current PowerHA 5.5 and 6.1 releases that are running on IBM AIX 6.1
can now replicate synchronously and asynchronously by using GLVM.
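As a minimal sketch of this flow, assume that the RPV client device for a remote LUN already appears on the local node as hdisk8, and that the data resides in volume group datavg with logical volume datalv (these names are hypothetical). Standard AIX LVM commands then add and synchronize the remote copy: extendvg adds the remote physical volume to the volume group, mklvcopy creates a second copy of the logical volume on the remote disk, and syncvg synchronizes the new copy.
# extendvg datavg hdisk8
# mklvcopy datalv 2 hdisk8
# syncvg -v datavg
The Enterprise Edition integration then automates the activation and role reversal of these mirrored resources during planned and unplanned movements.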
End-of-support dates: The older HAGEO functions in the HACMP/XD releases are no
longer available as of the PowerHA 5.5 release. For information about the migration from
HAGEO to GLVM, see 8.11, Migration from HAGEO (AIX 5.3) to GLVM (AIX 6.1) on
page 423.
In its entirety, the IBM PowerHA SystemMirror Enterprise Edition provides IP and disk
replication alternatives, including integration with IBM System Storage SAN Volume
Controller Metro and Global Mirror and Metro Mirror with the IBM TotalStorage Enterprise
Storage Server 800, IBM System Storage DS6000, and IBM System Storage DS8000
storage subsystems. New in PowerHA 6.1 is the integration with EMC Symmetrix Remote
Data Facility (SRDF) disk replication technology. In addition, you can replicate data by using
the wide area network (WAN), which uses the IP replication functions of the GLVM product.
The GLVM product as a standalone product is available in the AIX 5.3 and 6.1 releases.
Important: Asynchronous GLVM is available only with PowerHA 5.5 SP1 and later or with
PowerHA 6.1 running on AIX 6.1. This availability is a result of the new mirror pool and
asynchronous I/O (aio) cache logical volume constructs introduced in the AIX 6.1 release.
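Before you plan for asynchronous GLVM, a quick way to confirm the prerequisites on a node is to check the AIX level and the GLVM filesets; the following is a minimal sketch (the exact fileset list on your system might differ):
# oslevel -s
# lslpp -L glvm.rpv.client glvm.rpv.server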
Uptime level    Availability    Maximum downtime per year
Five nines      99.999%         5 minutes 35 seconds
Four nines      99.99%          52 minutes 33 seconds
Three nines     99.9%           8 hours 46 minutes
Two nines       99.0%           87 hours 36 minutes
One nine        90.0%           36 days 12 hours
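As a rough illustration of how these figures are derived (a sketch that assumes a 365-day year), the allowed downtime is the unavailable fraction multiplied by the length of the year. For example, for three nines:
# echo "(1 - 0.999) * 365 * 24" | bc -l
The result is approximately 8.76 hours, or about 8 hours 46 minutes of downtime per year at 99.9% availability.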
Planning a disaster recovery solution involves key considerations, such as accounting for the
time that is needed for planned maintenance and whether that time is deducted from the maximum
downtime figures for the year. The ability to nondisruptively update a cluster is available as of
the PowerHA 5.4.0.1 release, and using this feature avoids the need for a restart when
you update a cluster or upgrade to a new release. However, you must still consider the impact of
other service interruptions in the environment, such as upgrades that involve the applications,
the AIX operating system, and the system firmware, which often require the system to cycle.
The IBM PowerHA SystemMirror Enterprise Edition solution can provide a valuable
proposition for reliably orchestrating the acquisition and release of cluster resources from one
site to another. It can also provide quick failovers if an outage or natural disaster occurs.
Figure 1-1 shows the various tiers of disaster recovery solutions and how the Enterprise
Edition is considered a tier 7 recovery solution. Solutions in the alternate tiers can all be used
to back up data and move it to a remote location, but they lack the automation that the
Enterprise Edition provides. Looking over the recovery time axis, you can see how meeting an
RTO of under four hours can be achieved with the implementation of an automated multisite
clustering solution.
Figure 1-1 Tiers of disaster recovery solutions - IBM PowerHA SystemMirror Enterprise Edition (tiers based on SHARE definitions; the PowerHA Enterprise Edition fits in tier 7)
Table 1-3 describes the tiers of disaster recovery in more detail and outlines the different
Power Systems solutions that are available in each tier.
Table 1-3 Disaster recovery solution tiers - IBM disaster recovery planning model
The table maps each tier of the IBM disaster recovery planning model to the corresponding Power Systems solutions, whether 100% recovery of application data is possible, whether site failure is detected automatically, the facility locations that are supported, and the communication modes or protocols.
Tier 7: Zero data loss. Recovery up to application restart in minutes. Power Systems solution: PowerHA Enterprise Edition (Cross Site LVM, GLVM, Metro Mirror, Global Mirror, SRDF/S). Failover and failback are automated. All facility locations are supported, over a metropolitan area network (MAN) or WAN (IP), or over SAN (dark fibre) disk replication.
Tier 6: Two-site two-phase commit. Recovery time varies from minutes to hours. PowerHA Standard Edition suffices for a campus-style disaster recovery solution.
Tier 5: Continuous electronic vaulting of backup data between active sites. Active data management at each site is provided.
Tier 4: Electronic vaulting of critical backup data to a hot site. The hot site is not activated until a disaster occurs. Tivoli Storage Manager with copy pool duplexing to the hot site, and a Tivoli Storage Manager server at the active site only.
Tier 3: Off-site vaulting with a hot site. Backup data is transported to the hot site manually. The hot site is staffed and equipped but not active until a disaster occurs. Tivoli Storage Manager with multiple local storage pools on disk and tape at the active site. Recovery in days or weeks.
Tier 2: Off-site vaulting of backup data by courier. A third-party vendor collects the data at regular intervals and stores it in its facility. When a disaster occurs, a hot site must be prepared and the backup data must be transported. Tivoli Storage Manager with multiple local storage pools on disk and tape at the active site. Recovery in days or weeks.
Tier 1: Pickup Truck Access Method with tape. A tape-backup-based solution. Recovery in days or weeks; data must be restored from backups.
Tier 0: No disaster recovery plan or protection. A local backup solution may be in place, but no offsite data storage. No disaster recovery; the site and data are lost.
High availability and disaster recovery are a balance between recovery time requirements
and cost. There are various external studies available that cover dollar loss estimates for
every bit of downtime that is experienced as a result of service disruptions and unexpected
outages. Therefore, key decisions must be made to determine what parts of the business are
important and must remain online to continue business operations.
Beyond the need for secondary servers, storage, and infrastructure to support the replication
bandwidth between two sites, items such as the following can easily be overlooked:
Where does the staff go if a disaster occurs?
What if the technical staff managing the environment is unavailable?
Are there facilities to accommodate the remaining staff, including desks, phones, printers,
and desktop PCs?
Is a disaster recovery plan documented that can be followed by nontechnical staff
if necessary?
For various infrastructure considerations, see Chapter 2, Infrastructure considerations on
page 39.
Restriction: Current PowerHA Enterprise Edition releases have a limitation of only two
sites and a maximum of eight nodes within a cluster.
We recognize that customers have service level agreements (SLAs) that require them to
maintain a highly available environment after a site failure, and in turn we try to use 4-node
cluster configurations for most of our test scenarios (Figure 1-2).
Figure 1-2 PowerHA Enterprise Edition with two sites that have a 4-node cluster (nodes svcxd_A1 and svcxd_A2 at SVC_Site A and nodes svcxd_B1 and svcxd_B2 at SVC_Site B, with PPRC links between the two SVC clusters)
The line between local availability and disaster recovery is often denoted by the existence of
multiple physical sites and distance between them, but with gray areas between the two. This
setup is especially prevalent in environments that have multiple server rooms across different
buildings within the same site. We typically consider these environments dispersed over a
relatively close area to be campus-style hybrid disaster recovery environments. Banking
institutions, universities, and hospitals are common examples, where an infrastructure is in
place between separate server rooms, allowing IP and SAN connectivity between the
different machines and independent storage subsystems (Figure 1-3).
Figure 1-3 Cross Site LVM - direct SAN versus DWDM extended SAN (LVM mirrors over FC switches with DWDM, CWDM, or other SAN extenders, for example 120 km; distance is limited by the latency effect on performance)
The PowerHA cluster nodes can be on different machine types and server classes if you
maintain common AIX and cluster file set levels. This setup can be valuable in a scenario
where a customer acquires newer machines and uses the older hardware to serve as the
failover targets. In a campus environment, creating this type of a stretch-cluster can serve two
roles:
Providing high availability
Providing a recovery site that contains a current second copy of the data
These environments often present various options for how to configure the cluster. For
example, they include a cross-site LVM mirrored configuration, one using disk-based
metro-mirroring, or scenarios that use SAN Volume Controller VDisk mirroring with a split I/O
group between two sites, new with the 5.1 SVC firmware release. Each option has its own
merits and corresponding considerations that are explained more in Chapter 5, Configuring
PowerHA SystemMirror Enterprise Edition with Metro Mirror and Global Mirror on page 153.
Being inherently synchronous, all of these solutions experience minimal to zero data loss,
similar to solutions in a local cluster that shares LUNs from the same storage subsystem.
Although every environment differs, more contention and disk latency are introduced the
farther the sites are from each other. However, no hard set considerations dictate whether you
resources at the backup location. Plan for multiple redundant heart-beating links between the
sites to avoid a separation. The PowerHA 5.5 release introduced the ability to disable the
automated failover function for specific failures, for example, the loss of the connectivity
between the storage units that manage the replication. With this option selected, the cluster
prevents a failover to retain data integrity, and the cluster processing requires manual
intervention. Figure 1-4 shows a sample of this option.
Figure 1-4 Change / Show SVC PPRC Resource SMIT panel (relationship svc_metro between SVC clusters B8_8G4 and B12_4F2, VDisk pairs svc_disk2 and svc_disk3, copy type METRO, recovery action MANUAL)
The status of the replicated resources determines whether the MANUAL recovery action
prevents a failover. The states vary between the replication types (Example 1-1).
Example 1-1 Replication types, depending on the status of the replicated resources
needs and you sized the data links appropriately, you quickly realize that the management is
simpler when you use the integration within the Enterprise Edition. Similarly, using the
integrated logic to pass instructions to the disk storage subsystems automatically based on
the various events detected by the cluster is also more efficient when left to the integrated
code.
A major benefit of PowerHA Enterprise Edition is that it has been comprehensively tested to
ensure that it works with the basic failover and failback scenarios and with rg_move and
selective failover inherent cluster mechanisms, for example. In addition, using the Enterprise
Edition automatically reverses the flow and restarts the replication after the original site is
restored. The integrated cluster clverify functions also help to identify and correct any
configuration errors. The cluster EVENT logging is appended into the existing PowerHA logs,
and the nightly verification checks identify whether any changes occurred to the configuration.
The replicated resource architecture in the Enterprise Edition allows finer control over the
status of the resources. Through features such as application monitoring or the pager
notification methods, you can receive updates any time that a critical cluster event occurs.
Enabling full integration can also facilitate the ability to gracefully move resources and the
testing of your scripts. By moving the resources from one site to the next, you can test the
application stop and start scripts and ensure that everything is in working order. Using some
of the more granular options within the cluster, such as resource group location
dependencies, can also facilitate the destaging of lower priority test or development
resources at the failover location whenever a production site failure occurs. By using the
site-specific dependencies, you can also specify a set of resource groups to always coexist
within the same site.
Support considerations
Customers and lab service staff have been customizing replication solutions within the
PowerHA software for a while now. If the integration for a replication technology is not
available, it can be automated by using custom events or within the application scripts. One
example of this customization is the custom integration of EMC SRDF replication before the
availability of PowerHA 6.1.
Many clients already replicate their clustered volumes between sites without cluster
automation. In these environments, the data exists at the remote site and its activation in a
failure scenario requires a manual or semi-scripted procedure. These clients are good
candidates for a customized environment if their storage replication devices are not already
integrated with the PowerHA Enterprise Edition. The major caveat with a customized
environment is that support for any of the custom scripts ultimately falls on the administrator.
Any of the custom logic that is embedded within custom events or application scripts is not
reviewed by the IBM support staff. Technically, it is not considered an unsupported
configuration, because the basic cluster functions operate independently of those custom scripts.
Summary
As mentioned in the previous sections, using the integrated replication solutions with the
Enterprise Edition can help simplify the management and shorten the amount of time to
recovery. The key here is that you must be confident in the clustering technology to handle all
aspects of the failover process. For tiered applications, it especially makes sense to use
automatic recovery to ensure the order in which the resources are brought back online.
As new technologies emerge and the complexity of the environment continues to increase,
the product simplifies the cluster interface and assists the user in hardening the environment
to build a robust disaster recovery solution.
The goal is to provide the most resilient solution that incorporates as much of the existing
components within the environment as possible. If the goal is to replicate the data in a
consistent way, each of the mutually exclusive replication technologies does so in an effective
manner. However, merging the clustering and replication technologies provides a much more
elegant and effective solution.
In an environment that replicates the volumes between storage subsystems, the flow has a
few more updates. In a metro-mirrored environment, the database transaction sequence for
updating DB objects normally has the following characteristics (Figure 1-6):
1a The update request is written to the database log on the primary volumes.
1b The DB log update is mirrored on the secondary volumes.
2a After the log is successfully updated, the database object is updated on the primary
volumes.
2b The database object update is mirrored on the secondary volume.
3a After the database object is successfully updated, the database log is updated to mark
that the transaction is completed.
3b The database log update is then mirrored on the secondary volume.
(Figure 1-6 shows the primary and secondary LOG and DATA volumes in this flow.)
One of the challenges in a replicated environment is that failures tend to be intermittent and
gradual. It is unlikely that all of the components within a complex would fail at the same exact
time. This type of event is called a rolling disaster. For example, consider a scenario where
the storage links are suspended for a short period.
In the scenario shown in Figure 1-7, consider the order of events:
1. The link between the DB data volume and its mirrored pair at the recovery site is lost.
2. The write sequence of (1) - (2) - (3) completes on the primary devices.
3. Log writes (1) and (3) are mirrored on the log-sec device; however, the corresponding DB
write (2) is not applied on db-sec.
(Figure 1-7 shows the primary and secondary LOG and DATA volumes in this scenario.)
With this type of failure, the database on the secondary site can end up having missing
updates that might potentially go unnoticed for a time. This situation is considered
unacceptable since there is no way to easily identify the problem. This area is one in which
integration with the PowerHA Enterprise Edition can also provide added value. Depending on
the storage type and replication technology that is implemented, separate methods are built
in for mitigating these risks.
The SNMP traps that the storage unit reports include the following conditions: link degraded, link down, link up, LSS pair consistency group error, session consistency group error, and LSS suspended.
PowerHA tests each SNMP trap that is received to ensure that it is from the following sources:
A valid storage unit (checked by storage unit ID)
A storage unit that is defined to PowerHA
A logical subsystem (LSS) that was previously configured into a Peer-to-Peer Remote
Copy (PPRC) resource group
If PowerHA receives an SNMP trap that does not meet these criteria, it is logged, but otherwise
ignored. To enable SNMP traps in PowerHA, you must start the Cluster Information Daemon
with SNMP traps enabled, which provides consistency group support.
Important: If clinfoES is already running, you must first stop and restart it with the
corresponding flag for consistency group support as follows:
# stopsrc -s clinfoES
# startsrc -s clinfoES -a "-a"
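As a quick follow-up check (a minimal sketch), confirm that the subsystem is active again and that the trap-enabled flag appears in its process arguments:
# lssrc -s clinfoES
# ps -ef | grep clinfoES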
The clinfoES subsystem listens on port 162/tcp and uses the ERROR log to record the valid
incoming trap. It then calls the resource_state_change script if the trap is for one of the online
resources. It sets the global environment variable SNMP_DEVICE_INFO, which is used by
the consistency group SNMP trap event script to process the trap.
Different SNMP traps trigger a cluster EVENT, depending on the condition that is reported.
Figure 1-9 shows the following flow of a failure with the SNMP trap monitoring enabled:
1. An error is detected on the storage subsystem and an SNMP trap notification is sent.
2. SNMP traps support (clinfoES) receives a trap and determines whether it is valid.
3. clinfoES writes the trap to the AIX errlog and sends the errlog ID and event type to the
cluster manager.
4. The cluster manager parses the errlog for the ID, extracting the trap information, sends the
event to the nodes in the cluster, targeting the node with the RG online, and sets the
environment variable SNMP_DEVICE_INFO with the source storage ID and LSS pair
information (Example 1-2).
Example 1-2 SNMP_DEVICE_INFO format
Storage DS8000
Server P780
In the scenario in Figure 1-10, a sent trap, which indicates that one of the links is down, is
received and processed on the cluster node that manages the resource group. After the
consistency group paths are evaluated and it is determined that for one of the paths the link is
NotEstablished, the cluster freezes the replication and suspends all PPRC pairs in each
consistency group.
(Figure 1-10: the primary and backup sites with resource group RG1, the DATABASE volumes, the HMC, and consistency groups CG1 and CG2.)
freezes CGs, links are remade so we can receive future link-related traps (that is,
LINK-UP).
InconsistentSuspended
This state indicates that certain volumes are in the suspended state, while other volumes
are in any other state (for example, full-duplex or copy pending). Certain volumes are
mirroring while others are suspended. This case is handled the same as the
ConsistentSuspended state.
VolumesInactive
This state indicates that the storage system/volume is unreachable. Either the storage
system is down or the volume failed. This state triggers an rg_move to the remote site.
After a new trap is sent indicating that the link is reestablished, the nodes reevaluate the
status by using the pprc_snmtrap_event and dynamically make the calls to the storage
enclosure to unfreeze the consistency groups. The cluster then allows for the replication for
all of the PPRC pairs to resume (Figure 1-11).
(Figure 1-11: the primary and backup sites with resource group RG1, the DATABASE volumes, and consistency groups CG1 and CG2 after replication resumes.)
When using one of the alternative solutions, such as the sync/async replication with the SAN
Volume Controller or the EMC subsystems, the built-in consistency group functions detect the
loss of a link and freeze the relationships automatically. The SAN Volume Controller logs
internal messages, indicating that the relationships are frozen. It is up to the client to configure
the appropriate alerts for notification about the failure. The cluster software does not poll the
paths to reactivate them. Clients can enable external monitoring to receive notification about
the failure and must manually reestablish the links after the connectivity is resumed.
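For example, assuming administrative SSH access to the SAN Volume Controller cluster (the address and user ID shown here are placeholders), the state of the consistency groups and remote copy relationships can be reviewed after a link event with commands such as the following:
# ssh admin@svc_cluster_ip svcinfo lsrcconsistgrp
# ssh admin@svc_cluster_ip svcinfo lsrcrelationship
Relationships that were frozen after the link loss typically show a stopped or disconnected state and must be restarted manually after connectivity is restored.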
For more information about setting up external notification, see the IBM System Storage SAN
Volume Controller Troubleshooting Guide, GC27-2227.
For more information about event monitoring for EMC Symmetrix, see the event daemon
(storevntd) configuration that is described in the EMC Solutions Enabler Version 7.0 Installation Guide, P/N 300-008-918.
Increasing performance: More processor and memory resources can help increase
the performance, but only if the application is designed to take advantage of them. An
uncapped partition can help in this situation (with a specific weight factor). Also, consider
the license fees when you use uncapped partitions (for example, with processor
license-based applications).
Therefore, when beginning to plan your environment, consider ratios such as current system
utilization and latency impact (Example 1-5).
Example 1-5 Utilization and latency impact ratio formulas
Current throughput
--------------------- = Current System Utilization
Max system throughput
Time to local disk
--------------------- = Latency Impact
Time to remote disk
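For example, with assumed values of 40 MBps of current throughput against a 100 MBps maximum, and an average write service time of 0.5 ms to the local disk against 2 ms to the remote disk, the ratios work out as follows (hypothetical numbers for illustration only):
40 MBps / 100 MBps = 0.40 current system utilization
0.5 ms / 2 ms = 0.25 latency impact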
Most readily available reference material generalizes about synchronous distance caps for
devices that extend the SAN at distances of 100 - 300 km. For more information about the
different considerations, see Chapter 2, Infrastructure considerations on page 39.
In synchronous mirroring, both the local and remote copies must be committed to their
respective subsystems before the acknowledgement is returned to the application. In
contrast, asynchronous transmission mode allows the data replication at the secondary site
to be decoupled so that primary site application response time is not impacted. Asynchronous
transmission is commonly selected, with the exposure that the secondary site's version of the
data might be out of sync with the primary site by a few minutes or more. This lag represents
data that is unrecoverable if a disaster occurs at the primary site. The remote copy can lag
behind in its updates. If a disaster strikes, it might never receive all of the updates that were
committed to the original copy.
Integrated asynchronous solutions: The following integrated asynchronous solutions
are the only ones that are available in PowerHA Enterprise Edition 6.1:
Asynchronous GLVM on AIX 6.1
SAN Volume Controller Global Mirror
EMC SRDF/A
For more information, see Chapter 5, Configuring PowerHA SystemMirror Enterprise Edition with Metro Mirror and Global Mirror
on page 153.
By using the EMC SRDF replication, you can make the change from the cluster panels, and
the change is reflected immediately on the storage side. After the change, a cluster
synchronization updates the cluster definitions across all of the nodes. For
more information, see Chapter 7, Configuring PowerHA SystemMirror Enterprise Edition with
SRDF replication on page 267.
Finally, the change in a GLVM environment is different because it depends on whether you
were already running with AIX 6.1 for asynchronous support and whether the scalable volume
groups and aio cache logical volumes existed. Therefore, going from synchronous to
asynchronous requires manual steps and is disruptive. However, when you go from an
asynchronous setup to a synchronous setup, a change to the mirror pool settings is dynamic.
For more information, see Chapter 8, Configuring PowerHA SystemMirror Enterprise Edition
with Geographic Logical Volume Manager on page 339.
(Figure: the primary and secondary sites connected through routers, with servers A through D and SVCs, showing GLVM sync/async replication, ESS/DS Metro-Mirroring, and EMC SRDF and SRDF/A.)
Table 1-4 highlights considerations for choosing whether to use AIX LVM or the GLVM
product.
Table 1-4 When to choose Cross-Site LVM versus GLVM (comparing cross-site Logical Volume Mirroring (X-LVM) with Geographic Logical Volume Mirroring)
Using disk-level replication might make more sense in your environment if you are already
using storage subsystems that have integrated support. For example, if you are already using
the SAN Volume Controller virtualization across your enterprise, then use its Metro Mirror and
Global Mirror functions.
Table 1-5 outlines considerations that are associated with each disk replication offering.
Table 1-5 When to choose SVC versus SRDF versus DS Metro/Global Mirroring
SAN Volume Controller Metro/Global Mirror: synchronous or asynchronous integration
EMC SRDF: synchronous or asynchronous integration
ESS/DS Metro Mirror: synchronous-only integration
Finally, when the environment calls for asynchronous replication, you have the following
support choices:
GLVM
SAN Volume Controller Global Mirror
SRDF/A
Table 1-6 shows the requirements for each asynchronous replication solution.
Table 1-6 When to choose async GLVM or async disk replication (comparing Geographic Logical Volume Mirroring with the disk-based asynchronous replication options)
Sample topologies and different cluster configurations are outlined in the test scenarios that
are described in this book.
Power Systems servers are grouped into pricing tiers: small tier, medium tier (mid-range Power servers), and large tier (enterprise servers).
For more specific pricing details based on the server models that you deployed, contact your
IBM sales representative.
Table 1-8 on page 27 outlines the orderable features codes for the PowerHA Standard and
Enterprise Editions.
Table 1-8 PowerHA Standard and Enterprise Edition product identification numbers
SW Maintenance Regist/Renewal, 1 Year: 5660-H23
SW Maintenance Registration, 3 Year: 5662-H23
SW Maintenance Renewal, 3 Year: 5663-H23
SW Maintenance After License, 3 Year: 5664-H23
SW Maintenance Regist/Renewal, 1 Year: 5660-H24
SW Maintenance After License, 1 Year: 5661-H24
SW Maintenance Registration, 3 Year: 5662-H24
SW Maintenance Renewal, 3 Year: 5663-H24
SW Maintenance After License, 3 Year: 5664-H24
The PowerHA SystemMirror for AIX Enterprise Edition includes all of the capabilities of the
Standard Edition and more. The Enterprise Edition package enables you to extend your data
center solution across multiple sites. It provides integrated support with the IBM TotalStorage
Enterprise Storage Server (ESS), IBM System Storage (DS6000 and DS8000), and SAN
Volume Controller replication types. In the 6.1 release, the Enterprise Edition expands HA and
disaster recovery support to integrate with EMC SRDF replication. It also provides
enhancements, including a wizard to configure and deploy host-based mirroring solutions
(GLVM configuration wizard).
The base cluster software in the Standard Edition allows the use of sites within a local cluster.
This is applicable to environments located within metro distances that want to implement a
campus-style disaster recovery solution. In an environment that uses AIX Logical Volume
Mirroring, licensing only the PowerHA SystemMirror Standard Edition suffices. Even though
the cluster would be configured across two data center sites, it would behave more like a
local stretch cluster. The use of site definitions within the cluster provides only extra cluster
measures to check for the consistency of the configuration.
For an example of a four-node Cross Site LVM cluster that uses the Mirror Pool functions in
AIX 6.1, see Chapter 4, Configuring PowerHA Standard Edition with cross-site logical
volume mirroring on page 113.
Figure 1-13 shows the changes to the Dynamic LPAR panel in the PowerHA 6.1 release.
Figure 1-13 Add Dynamic LPAR and CUoD Resources for Applications SMIT panel (application server oracle_db)
To grasp the licensing considerations, one must first have a fundamental understanding of
how Dynamic LPAR works in a PowerHA environment. The criteria for the minimum and
desired values that you typically associate with the server that hosts your application are not
bound to the individual machines. In PowerHA, these criteria are bound to the application
server definition within the cluster. The cluster tries to meet these criteria based on where the
application is being hosted. The Dynamic LPAR operations are invoked only during the
acquisition or release of your application server. This means that an LPAR does not self-tune
during normal operations. Its resource counts are only altered during the start and stop
phases of the cluster EVENT processing. The cluster only deallocates any resources if it
previously added them through the cluster functions. If you are trying to build a solution with
minimal PowerHA licenses, you can license the standby partitions for only a single processor
and lower your initial HA licensing costs.
To provide a better understanding of this concept, see the multisite scenario with two local
nodes and one remote node that is shown in Figure 1-14.
Figure 1-14 Multisite scenario with two local nodes, patsy and kaitlyn, at site A and one remote node, connor, at site B
These LPARs are configured with the partition profile settings shown in Table 1-9.
Table 1-9 Partition profiles for the dynamic LPAR example
LPAR name          Minimum        Desired        Maximum
Site 1 - patsy     1 processor    1 processor    4 processors
Site 1 - kaitlyn   1 processor    1 processor    4 processors
Site 2 - connor    1 processor    1 processor    4 processors
In Figure 1-14, all nodes are configured to come up with only one processor. During the start of
the cluster, the Dynamic LPAR settings that are bound to the application server are evaluated and
enforced. The desired resources for the application server are appended only if the shared pool
has the available resources. Table 1-10 lists the application server dynamic LPAR settings.
Table 1-10 Dynamic LPAR example - application server settings
Application server   Minimum value   Desired value
app1                 1 processor     4 processors
On cluster startup, the cluster evaluates the application server requirements and appends
three more processors to meet the desired amount, or four processors. Therefore, wherever
the resource group and corresponding application server are being hosted, the cluster
performs a +3 processor operation on acquisition or a -3 processor operation on the release.
The key from a licensing standpoint is that you must license the number of active processors
at any time. In this example, the number of processors that are licensed is six.
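Worked through with the partition profile values from Table 1-9 and the application server settings (a sketch of the count only):
Minimum profile values: patsy (1) + kaitlyn (1) + connor (1) = 3 active processors
Dynamic LPAR acquisition on the hosting node: +3 processors to reach the desired four for app1
Maximum active at any one time: 4 + 1 + 1 = 6 processors to license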
However, an exception to the rule is a scenario in which the production partition profile is set
up with a minimum value of one and a desired value of four, as in Table 1-11 on page 30. This
scenario can have different results.
Table 1-11 (columns: LPAR name, minimum value, desired value, and maximum value, for Site 1 - patsy, Site 1 - kaitlyn, and Site 2 - connor; the production partition profile is set with a minimum of one processor and a desired value of four processors)
Assuming the same application server processor values as in Table 1-10, after a resource
group move, you can end up with four active processors on the source and four on the target.
This is the only time in this type of configuration when you would run your system with
a total of nine active processors while licensed for only six. For example, you are licensed for
only four on production, one on the local backup node, and one on the disaster recovery node.
A different scenario is an environment in which the LPARs are running uncapped. In such an
environment, the partitions can exceed their desired partition processor counts and consume
more of the available resources within their shared processor pool. To remain compliant in
this situation, license the maximum number of processors that can ever be used by each
cluster partition. Alternatively, create a shared processor pool that caps the processor
resources that can be used by your cluster. Therefore, if a partition is set with a desired value
of 4, but running uncapped enables it to consume eight processors, the number of PowerHA
licenses is eight. In this model, you pay more for cluster licensing, but the LPARs would be
configured so that they can self-tune and truly share the processors with the other LPARs
within the same CEC.
If insufficient resources are available in the free pool that can be allocated through the
Dynamic LPAR, the Capacity on Demand (CoD) function can provide extra resources to the
node. The proper CoD licenses must be in place for this function to work. Licensing revolves
around the CoD terms and conditions. CoD is not available on all Power Systems.
Although the goal of the IBM clustering software team was never to tightly enforce processor
license counts, you must follow these guidelines to remain fully compliant if an audit occurs.
For firewall configurations, the HMC-to-HMC discovery uses UDP port 9900, and the
HMC-to-HMC commands use TCP port 9920. For firewall configuration, the HMC browser
interface uses TCP port 443, TCP port 8443, and TCP port 9960 (browser applet
communication).
For more information about HACMP configuration for dynamic LPAR and CoD, see the
HACMP for AIX: Administration Guide, SC23-4862, and the PowerHA for AIX Cookbook,
SG24-7739. For more information about HMC configurations, see the Power Systems:
Managing the Hardware Management Console manual at:
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/topic/iphai_p5/iphaibook.pdf
HACMP 5.4.1
Consistency groups
GLVM Monitoring enhancements
HACMP 5.4.0
Metro Mirror support for intermixed environments (DS6000, DS8000, Enterprise
Storage Server 800)
Multiple GLVM XD_Data networks
IPAT Across Sites
Site-specific service labels solve the problem of having different subnets at
different sites.
HACMP 5.3
Location dependencies (for example, online on same site)
Synchronous GLVM
Notes about the new PowerHA features are available in the PowerHA Enterprise Edition
release notes, which are installed in the /usr/es/sbin/cluster/release_notes_xd directory.
For more information, see the PowerHA website at:
http://www.ibm.com/systems/power/software/availability/
1.6.4 DSCLI-based PPRC support for multiple storage units per site
DSCLI-based PowerHA Enterprise Edition PPRC supports more than one storage system
per site. You can now configure and use more than one DSS on a single SPPRC site. Each
PPRC replicated resource still has only one primary and one auxiliary storage unit per site, but
you can use any configured DSS, provided that only a single one per site is used in each
PPRC replicated resource group.
basic SAN Volume Controller environment is configured, PPRC relationships are created
automatically. No additional access to the SAN Volume Controller interface is needed.
(SMIT panel: SVC PPRC replicated resource svc_global between SVC clusters B12_4F2 and B8_8G4, VDisk pairs svc_disk6 and svc_disk7, copy type GLOBAL, recovery action AUTO.)
The states vary based on the replication type that is being used between the sites.
Example 1-6 lists the states to which the MANUAL policy applies.
Example 1-6 Applicable states for the MANUAL policy
# lsdev -Cc disk
hdisk0 Available                 Virtual SCSI Disk Drive
hdisk1 Available                 Virtual SCSI Disk Drive
hdisk2 Available 33-T1-01        MPIO FC 2145
hdisk3 Available 33-T1-01        MPIO FC 2145
vscsi0 Available
vscsi1 Available
# lscfg -vl hdisk0
hdisk0           Virtual SCSI Disk Drive
# lscfg -vl hdisk2
hdisk2           MPIO FC 2145
  Manufacturer................IBM
  Machine Type and Model......2145
  ROS Level and ID............0000
  Device Specific.(Z0)........0000043268101002
  Device Specific.(Z1)........0200640
  Serial Number...............600507680190026C4000000000000000
If the base NPIV requirements are met in the environment, a configuration that uses disk
replication works the same as in one that uses dedicated host bus adapters (HBAs).
However, you must still confirm the NPIV qualification of the separate storage systems and
their corresponding replication mechanisms.
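For example, a quick way to identify the Fibre Channel client adapters on an LPAR and the worldwide port name (WWPN) that is used for NPIV zoning is shown in the following sketch (the adapter name fcs0 is an assumption):
# lsdev -Cc adapter | grep fcs
# lscfg -vpl fcs0 | grep "Network Address"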
Considerations
Consider the following items regarding the current PowerHA Enterprise Edition releases:
A single PowerHA cluster supports only two sites. A single node can be part of only one
PowerHA cluster site.
A single PowerHA Enterprise Edition cluster supports up to eight nodes.
Concurrent disk access within a cluster by using GLVM is supported only within sites, not
between sites.
PowerHA Version 6 base (cluster.es.server.rte 6.n.0.0) must be at the same release level
(n) as PowerHA/XD.
The GLVM two-site configuration wizard is not IPv6-enabled.
The GLVM two-site configuration wizard does not support asynchronous replication.
There is no support for enhanced concurrent mode volume groups when you use
asynchronous GLVM on AIX 6.1.
PPRC eRCMF support is no longer included with the Enterprise Edition.
SRDF support is officially direct attach only.
PowerHA for SRDF known considerations:
C-SPOC considerations
C-SPOC cannot perform the following LVM operations on nodes at the remote site (that
contain the target volumes):
Operations that require nodes at the target site to read from the target volumes
cause an error message in C-SPOC. This situation includes functions such as
changing file system size, changing mount point, and adding LVM mirrors. However,
nodes on the same site as the source volumes can successfully perform these
tasks, and the changes then are propagated to the other site by using lazy update.
For all other LVM operations to work through C-SPOC, perform the C-SPOC
operations with the SRDF pairs in synchronized or consistent states and with the cluster
ACTIVE on all nodes.
Inter-site failover considerations
The following functions that are provided by PowerHA for local failover are not yet
supported across sites:
For more information about these considerations, see the 6.1 Release Notes in the
/usr/es/sbin/cluster/release_notes_xd file.
Chapter 2.
Infrastructure considerations
The key to an effective high-availability solution goes well beyond the installation of the
cluster software. Thorough planning and careful consideration of all potential single points
of failure help minimize the risk of an unforeseen outage of critical business applications.
This chapter reviews many of the requirements and infrastructure considerations when you
plan a disaster recovery solution with the PowerHA SystemMirror for AIX Enterprise Edition
on Power Systems.
Infrastructure considerations for high availability include the user or client network, the
enterprise network, the local and global networks (cloud technology), and the security needs
for end-to-end operation of the business process.
This chapter includes the following sections:
Network considerations
Cluster topology considerations
Storage considerations
PowerVM virtualization considerations
Server considerations
2.1.1 Bandwidth
The performance of a geographically dispersed synchronized mirror storage system depends
on the throughput data rate (bandwidth) and the latency of the communication links between
the primary and the secondary sites. Use dedicated networks for synchronizing data.
For long-haul transmissions that extend storage networks, consider the following quality of
service and performance parameters from a storage network extension perspective:
Bandwidth
Latency: Can be viewed as the delay from the source node to the access network before transmission can begin using the bandwidth.
End-to-end delay: Can be viewed as the propagated latency from the source node to the target node with accumulated latency (end-to-end), which can vary from time to time and can include effects of congestion control.
Error rate
Availability: Can be viewed as the uptime of the local access network for storage network extension.
Reliability
The effective throughput depends on the bandwidth and all write I/O reductions. Examples
include latency, end-to-end delay, error rates that cause local I/O queuing, and availability and
reliability issues that can cause temporary synchronization suspension, depending on the
deployed synchronization method and write verification policy.
The gmdsizing tool monitors disk utilization over a period of time. When the command completes or is
interrupted, it produces a report on disk usage over the time the command was running.
The gmdsizing tool file set is available on the HACMP/XD for AIX installation media or can be
downloaded from:
http://www.ibm.com/systems/resources/systems_p_advantages_ha_downloads_gmdsizing.tar
The gmdsizing command requires that the interval (-i) and time (-t) flags are supplied. The
interval specifies how often disk activity is checked and the time specifies for how long the
program runs.
The time flag defaults to seconds, but can be changed by appending letters after the number:
d   Number of days
h   Number of hours
m   Number of minutes
s   Number of seconds
For example, to check over five days, you can use 5d, 120h, or 7200m as an argument to the
time flag.
In addition, at least one of either disk devices (-p) or volume groups (-v) must be specified. If
the volume group flag is specified, the command converts the volume group argument into
physical volume names. All data reported by gmdsizing is given in disk blocks.
If two-copy mirroring is enabled for the logical volumes, twice as many writes occur at the
physical disk level because one logical write from the application generates two physical
writes to the disk devices. Rather than selecting an entire volume group to be monitored,
select just those disks that contain one copy of the mirrors. If the volume group is laid out
such that selecting those disks is not feasible, the reported write activity can be up to twice
what the application actually generates. Therefore, keep this potential doubling in mind when
you analyze the data.
The verbose (-V) flag results in a summary that is written at the end. The file flag (-f) makes
the command write the output to the specified file. Always use both the -V and -f flags. The
following flags are also available:
-D   To change delimiter
-U   To change units
-w   For writes only
-A   For aggregate
-T   For time scale
The following command monitors the disks in volume group datavg01 over 24 hours with a
one-minute interval and saves the output to a file:
gmdsizing -i 1m -t 24h -v datavg01 -V -f /tmp/gmdout.24hx1m.$(date +%Y%m%d)
The following command monitors the disks in volume group datavg01 over 72 hours with a
10-minute interval and saves the output to a file:
gmdsizing -i 10m -t 72h -v datavg01 -V -f /tmp/gmdout.72hx10m.$(date +%Y%m%d)
The following command monitors the disks in volume group datavg01 over 60 days with a
one-hour interval and saves the output to a file:
gmdsizing -i 1h -t 60d -v datavg01 -V -f /tmp/gmdout.60dx1h.$(date +%Y%m%d)
For sizing the networks, we are interested only in write traffic, but the read columns are useful
to help determine the read:write ratio for the workload. Knowing how long the disk activity was
measured (the value that is passed with the -t flag) allows the determination of an average
write rate. Because all data reported by gmdsizing is given in disk blocks, it is necessary to
convert it to bytes, which can be done by multiplying the total of all the write values by the
block size of the device. This value can then be divided by the total time over which the
measurement was taken. See the following example:
n-blocks-written x bytes-per-disk-block = total-volume-of-data-written
Then, divide the total volume of data that is written by the time for the measurement period:
total-volume-of-data-written / measurement-time = bytes/time-scale
See the following example:
130000 blocks x 512 bytes/block = 66560000 bytes
If the measurement period was 30 minutes, you make the following calculation:
66560000 / 1800 seconds = 36977 bytes/second
This value is an average, assuming that data is written at a constant rate, which is unlikely. To
understand how accurate this average is, compare this value with the theoretical worst-case
scenario. This value is the maximum volume of data that is written in one measurement
interval (defined by the -i flag). Dividing this figure by the measurement interval gives us a
worst-case rate. In this example:
4388 blocks x 512 bytes/block = 2246656 bytes / 60 seconds = 37444 bytes/second
Compare these two values (36977 and 37444). If they are relatively close (as in this
example), the disk activity and the network bandwidth requirements are fairly uniform. These
values probably can be used to get a good estimate for the network requirements. If these
values differ widely, which is more likely (having a worst-case value 7 - 10 times that of the
average), a more detailed analysis of the gmdsizing data is needed to determine the
bandwidth requirements. More complex analysis of the data might be performed by using
statistical techniques such as calculating the max, average, median, first, and third quartiles,
and standard deviation based on a more detailed sample.
The network bandwidth determined by these methods does not take into account networking
latency. Allowing a 20 - 30% overhead for the networking protocols can be sufficient for most
networks. See the following example:
bytes/second / 0.75 = bandwidth requirement
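The following minimal sketch shows the same conversion and protocol overhead allowance as shell arithmetic; it reuses the assumed numbers from the example above:
# Convert the gmdsizing block counts to an average write rate and add
# an assumed 25% allowance for networking protocol overhead
BLOCKS_WRITTEN=130000      # total blocks written during the measurement
BLOCK_SIZE=512             # bytes per disk block
PERIOD=1800                # measurement period in seconds (30 minutes)
BYTES=$(( BLOCKS_WRITTEN * BLOCK_SIZE ))     # 66560000 bytes
RATE=$(( BYTES / PERIOD ))                   # about 36977 bytes/second
echo "Average write rate:    $RATE bytes/second"
echo "Bandwidth requirement: $(( RATE * 100 / 75 )) bytes/second"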
In addition to the gmdsizing tool, the AIX filemon tool can be used to measure more granular
metrics, such as average write size and disk write times for each logical volume. Use caution
with the filemon tool because it can collect a huge amount of data during a relatively
short period.
After you determine the bandwidth requirements for the current storage network utilization,
consider the latency, end-to-end delay, error rate, and reliability (assuming that the local
availability is near 100%). A mapping can be made to a network technology and to a network
service provider.
Consider the protocol stack that is involved in long-haul transmissions to extend storage
networks, and whether it is from host nodes or storage nodes. The following list is a simplified
transmission stack from a source host node to a target host node:
1. SCSI layer
2. FCP/IBA
3. IP transport layer
4. IP network layer
5. Network interface
6. Local network
7. Wide area network interface
8. Wide area network
9. Wide area network interface
10. Local network
11. Network interface
12. IP network layer
13. IP transport layer
14. FCP/IBA
15. SCSI layer
Transmission media and the problem of signal degradation, Encyclopedia Britannica Online, 2010,
http://www.britannica.com/EBchecked/topic/585825/telecommunications-media
Table 2-1 Network technologies bandwidth examples

Network technology               Protocol    Bandwidth Mbps   Bandwidth Gbps
10 Mbps Ethernet                 TCP/IP      10               0.01
100 Mbps Ethernet                TCP/IP      100              0.1
1 Gbps Ethernet                  TCP/IP      1,000            1
10 Gbps Ethernet                 TCP/IP      10,000           10
FC                               FCP         1,000            1
FC                               FCP         2,000            2
FC                               FCP         4,000            4
FC                               FCP         8,000            8
FC                               FCP         10,000           10
FC                               FCP         20,000           20
T1/DS1                           TCP/IP      1.544            0.0015
E1                               TCP/IP      2.048            0.002
T2                               TCP/IP      6.312            0.006
E2                               TCP/IP      8.448            0.008
T3/DS3                           TCP/IP      44.736           0.045
E3                               TCP/IP      34.368           0.034
OC-1 (SONET)                     ATM         51.84            0.05
OC-3 (SONET)/STM-1(SDH)          ATM         155.52           0.16
OC-12 (SONET)/STM-4(SDH)         ATM         622              0.62
OC-24 (SONET)                    ATM         1,244            1.24
OC-48 (SONET)/STM-16(SDH)        ATM         2,488            2.49
OC-192 (SONET)/STM-64(SDH)       ATM         10,000           10
OC-256 (SONET)                   ATM         13,271           13.27
OC-768 (SONET)/STM-256(SDH)      ATM         40,000           40
OC-1536 (SONET)/STM-512(SDH)     ATM         80,000           80.00
OC-3072 (SONET)/STM-1024(SDH)    ATM         160,000          160.00
InfiniBand 1xSDR                 SVP/IPoIB   2,000            2
InfiniBand 1xDDR                 SVP/IPoIB   4,000            4
InfiniBand 1xQDR                 SVP/IPoIB   8,000            8
InfiniBand 4xSDR                 SVP/IPoIB   8,000            8
InfiniBand 4xDDR                 SVP/IPoIB   16,000           16
InfiniBand 4xQDR                 SVP/IPoIB   32,000           32
InfiniBand 12xSDR                SVP/IPoIB   24,000           24
InfiniBand 12xDDR                SVP/IPoIB   48,000           48
InfiniBand 12xQDR                SVP/IPoIB   96,000           96
SONET and SDH: Synchronous Optical Network (SONET) is the American National
Standards Institute (ANSI) standard for synchronous data transmission on optical media.
Synchronous Digital Hierarchy (SDH) is the international standard for synchronous data
transmission on optical media.
InfiniBand speed and data rate (bandwidth) differ because of the 8/10 bit encoding ratio.
Therefore, a 40 Gbps 4xQDR adapter has a data rate of 32 Gbps. Multiply the data rate by
1.25 to obtain the transmission speed, and multiply the transmission speed by 0.8 to obtain
the data rate.
aggregation, you can use switches that support the IEEE 802.3ad standard but do not
support EtherChannel.
In IEEE 802.3ad, the Link Aggregation Control Protocol (LACP) automatically tells the switch
which ports should be aggregated. When an IEEE 802.3ad aggregation is configured, Link
Aggregation Control Protocol Data Units (LACPDUs) are exchanged between the server
machine and the switch. LACP helps the switch detect that the adapters that are configured
in the aggregation should be treated as a single link, without further user intervention.
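As a quick way to confirm that the aggregation negotiated correctly, the detailed adapter statistics can be inspected; in the following sketch the aggregate adapter name ent3 is illustrative:
# Show detailed statistics for the link aggregation device, including the
# IEEE 802.3ad/LACP negotiation state of the member adapters
entstat -d ent3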
Table 2-2 provides a brief comparison between the two modes of link aggregation.
Table 2-2 EtherChannel and IEEE 802.3ad differences
EtherChannel
IEEE 802.3ad
Details on the implementation and requirements for link aggregation mode can be found at
the following website:
http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.commadmn/doc/commadmndita/etherchannel_intro.htm
adapter does not experience any interruption if the physical adapters on one of the Virtual I/O
Servers (VIOS) are compromised or if only a single VIOS is taken offline.
Finally, since the introduction of the Dynamic Adapter Membership (DAM) functionality in
AIX 5.2, if an adapter is created as an EtherChannel from the start, you can dynamically
append extra links to it without disruption. The replacement of a failed adapter is also
non-disruptive, which makes link aggregation, assuming that the environment is supported,
even more appealing.
spans between two system or network devices where repeater/amplifier spacing must
be maximized.
Multi-mode (MM) fiber allows more than one mode of light and is suited for
shorter-distance applications. Common core sizes are 50 micron and 62.5 micron.
The Fibre Channel architecture supports short-wave and long-wave optical transmitter
technologies. The distance limitations for optical fiber depend on both the cable type
(multi-mode or single-mode) and the light wavelength. The adapter, the equipment, and the
service provider delivery specifications can override standardized distance limitations.
Specifications: International multimode and single-mode cables are specified by the
following organizations:
ISO/IEC IS11801 for multimode cables
http://www.iso.org
ITU-T for single-mode cables
http://www.itu.int/ITU-T
For more information about FC and SAN, see the Introduction to Storage Area Networks and
System Networking, SG24-5470.
2.1.7 DWDM
The expansion of IP-based technology continues to accelerate. Many devices and
applications that previously did not use an IP-based layer for their data transmission, such as
TV, video, telephone (voice and data), and multiplayer gaming, have now moved to an
IP-based layer.
Most of those devices and applications are bandwidth-intensive and real time, which results in
the need for special methods and devices to support their requirements. Dense Wavelength
Division Multiplexing (DWDM) is one of the solutions that can support that requirement,
especially for data transmission. DWDM is the process of multiplexing signals of different
wavelengths onto a single fiber. Through this operation, it creates many virtual fibers, each
capable of carrying a different signal. At its simplest, a DWDM system can be viewed as a
parallel set of optical channels, each using a slightly different light wavelength, but all sharing
a single transmission medium. This new technical solution can increase the capacity of
existing networks without the need for expensive recabling and can tremendously reduce the
cost of network upgrades.
Internet Protocol (IP) over DWDM is the concept of sending data packets over an optical
layer by using DWDM for its capacity and other operations. The optical network provides
end-to-end services completely in the optical domain without having to convert the signal to
the electrical domain during transit. Transmitting IP directly over DWDM supports bit rates of
OC-192 (transmitting speed of up to 9953.28 Mbps). For more information about network
technologies bandwidth examples, see Table 2-1 on page 44.
Over the past five years, many service providers have deployed DWDM networks as the
underlying and enabling layer 0/1 (physical layer and data link layer) optical technology. This
technology easily supports new protocols, such as GigE or 10 GigE and Fibre Channel, in
their basic formats.
How does DWDM work? Figure 2-1 illustrates a simple explanation: at the sending terminal (Terminal A), transponder interfaces provide direct connections to the transmitters, the individual signals are combined onto a single fiber, boosted by a post amplifier, carried across the fiber through line amplifiers, and then separated again at the receiving side, where transponder interfaces provide direct connections to the receivers.
2.1.8 Firewalls
A firewall is a device that protects private local area networks (LANs) from intrusion by
computers on the Internet. It can be a hardware device or a software program that runs on a
secure host and provides a gateway between two networks. The device has two network
interfaces, one in the protected zone and one in the exposed zone. The firewall examines all
inbound and outbound traffic and filters it based on the specified criteria.
Firewalls can filter packets based on their source and destination addresses and different port
numbers, which is known as address filtering. Firewalls can also filter specific types of
network traffic, which is known as protocol filtering. The decision to forward or reject traffic
depends on the protocol that is used, for example, HTTP, FTP, or telnet. Firewalls can also
filter traffic by packet attribute or state.
Network architecture is designed around a seven-layer model. In a network, a single protocol
can travel over more than one physical support (layer one) because the physical layer is
dissociated from the protocol layers (layers three to seven). Similarly, a single physical cable
can carry more than one protocol. The TCP/IP model is older than the OSI industry standard
model, which is why it does not comply in every respect. The first four layers are so closely
analogous to OSI layers, however, that interoperability is a day-to-day reality.
Figure 2-3 shows the contrast between the OSI and Internet Protocol network models.
OSI Model:    7 Application, 6 Presentation, 5 Session, 4 Transport, 3 Network, 2 Data Link, 1 Physical
TCP/IP Model: 5 Application, 4 Transport, 3 Network, 2 Data Link, 1 Physical
Firewalls operate at different layers and can use different criteria to restrict traffic. The
lowest layer at which a firewall can work is layer three. In the OSI model, this layer is the
network layer. In TCP/IP, it is the Internet Protocol layer. This layer is concerned with routing
packets to their destinations. At this layer, a firewall can determine whether a packet is from a
trusted source, but it cannot be concerned with what it contains or what other packets it is
associated with. Firewalls that operate at the transport layer know a little more about a packet
and can grant or deny access, depending on more sophisticated criteria. At the application
level, firewalls know a great deal about what is going on and can be selective in
granting access.
Clustered environments might or might not have firewalls in place between their servers.
Even when they do, they might not implement them in the same way. We provide details about
what we used in our environment. Our test environments were configured to replicate
between buildings in our complex. A firewall was in place between the buildings, and we
opened ports and enabled rules for our cluster communication to work.
The first step was to identify the cluster network topology. When we determined the routable
IPs required at each site, we placed a request to the network team to allow communication
between the corresponding pairs. Figure 2-4 shows a logical view of the firewalls in place
between the sites that were used for our test scenarios.
Figure 2-4 Logical view of the firewalls between the sites: rules open all ports, with no authentication, in both directions between the BLDG 8 node addresses (10.12.5.36 and 10.12.5.37 on the P770 and P575) and the BLDG 12 node addresses (10.114.124.40 and 10.114.124.44 on the P770 and P575)
We began our testing with all ports open to avoid any issues and later restricted it to only the
required ports.
Required ports
For the ports that are used by Resource Monitoring and Control (RMC), Hardware
Management Console (HMC), and Dynamic Logical Partition (DLPAR) operations, see More
Dynamic LPAR considerations on page 30.
The /etc/services file defines the sockets and protocols that are used for network services
on a system. The ports and protocols that are used by the PowerHA components are defined
as follows:
clinfo_deadman 6176/tcp
clinfo_client 6174/tcp
clsmuxpd 6270/tcp
clm_lkm 6150/tcp
clm_smux 6175/tcp
godm 6177/tcp
topsvcs 6178/udp
grpsvcs 6179/udp
emsvcs 6180/udp
clcomd 6191/tcp
In addition, when the PowerHA Enterprise Edition is installed, the following entry for the port
number and connection protocol is automatically added to the /etc/services file:
rpv 6192/tcp
The entry is added on each node on the local and remote sites on which you installed the
software. This default value enables the RPV server and RPV client to start immediately after
they are configured (that is, to be in the available state). For more information, see the
HACMP for AIX 6.1 Geographic LVM: Planning and Administration Guide, SA23-1338.
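As a quick sanity check, a minimal sketch such as the following can confirm that the entries are present on each node; the pattern list simply matches the services named above:
# List the PowerHA-related service definitions on this node
grep -E "clinfo|clsmuxpd|clm_|godm|topsvcs|grpsvcs|emsvcs|clcomd|rpv" /etc/services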
For more information, see the HACMP for AIX Geographic LVM: Planning and Administration
Guide, SA23-1338. See also the selected and relevant storage solutions planning and
administration guides.
As a base reference for topology, consider the basic dual-node PowerHA cluster that is
illustrated in Figure 2-6.
Figure 2-6 Basic PowerHA dual node cluster
Between the sites, the storage synchronization is performed from the active node at one site
to the secondary site for that service. Therefore, if both sites have both active and passive
services, the storage synchronization goes in both directions.
All component failures are handled locally at each site. Only site failures or planned site
maintenance result in failover across sites. This configuration is preferred if you have
applications active at both sites to maximize the utilization of the nodes. We used manual
failback after the primary site failover to the remote or recovery site.
Two nodes at the primary site and one node at the recovery site
In the configuration with two nodes at the primary site, shown in Figure 2-8, the primary site
has a working PowerHA cluster, and both nodes can run services, with one node as the
primary node and one node as the secondary node locally. The secondary-site node is
commonly dedicated for recovery if the primary site fails. This node usually does not provide
online services until that time.
Figure 2-8 Two nodes at the primary site and one node at the recovery site
Between the sites, the storage synchronization is performed from the active node at the
primary site to the secondary site. All component failures are handled locally only at the
primary site. The secondary site elevates any node failure to a site failure.
This configuration is supported. However, if maintenance is being performed on the backup
site, the nodes at the backup site might be down. Also the data that is updated at the primary
site cannot be synchronized to the backup site if host synchronization is used. When the node
at the backup site comes up, the primary site must synchronize all changes that occurred
while the backup site was down.
Between the sites, the storage synchronization is performed from the active node at the
primary site to the secondary site. This design is more restrictive. It can handle disk or
adapter failures locally, but node failures are propagated to site failures because no local peer
nodes are available in this model.
The most commonly used PowerHA network types are ether for IP networks and rs232 and
diskhb for non-IP heartbeat traffic. Other network types are available for configuring clusters
that span multiple sites. Other than when using an implementation with GLVM, the only
difference between an ether type network and an XD_ip network is the amount of time that
the cluster waits before declaring an interface failure and initiating an adapter swap.
Table 2-3 contrasts the failure detection rate (FDR) between the IP network types. The FDR
that is used by a cluster network interface module defaults to the predefined Normal value, as
shown in Table 2-3.
Table 2-3 Failure detection rate - IP network types

ether network setting   Seconds between heartbeats   Failure cycle   Failure detection time (seconds)
Slow                    2.0                          12              48
Normal                  1.0                          10              20
Fast                    1.0                          5               10

XD_ip network setting   Seconds between heartbeats   Failure cycle   Failure detection time (seconds)
Slow                    3.0                          16              96
Normal                  2.5                          12              60
Fast                    2.0                          12              48
Table 2-4 contrasts the detection rate between serial non-IP network types commonly used in
a local cluster and an XD_rs232 network that can be implemented between two sites.
Table 2-4 Failure detection rate for non-IP networks

rs232 and diskhb network setting   Seconds between heartbeats   Failure cycle   Failure detection time (seconds)
Slow                               3.0                          8               48
Normal                             2.0                          5               20
Fast                               1.0                          5               10

XD_rs232 network setting           Seconds between heartbeats   Failure cycle   Failure detection time (seconds)
Slow                               3.0                          10              60
Normal                             2.5                          6               30
Fast                               2.0                          6               24
The values that are documented in Table 2-3 and Table 2-4 correspond to the three
predefined FDR values within the cluster software, where the default value is Normal.
Although the default values are typically sufficient, if different values are desired, the setting
for each network type can be customized to slower or faster settings.
Figure: a single network (LAN #1) spanning both sites, with the network devices interconnected through an Inter-Switch Link (ISL) and the server nodes attached with their IP interfaces.
This method is the most common deployment in single-site PowerHA clusters. It can
sometimes be extended between sites by using virtual LAN (VLAN) technology (VLANs are
often associated with IP subnetworks), such as the VLAN trunking protocol. This capability is
usually made up of one or more network devices that are interconnected with trunks, such as
Inter-Switch Link (ISL) trunks. Because the server nodes are all in the same subnet (VLAN),
an IP service label can be moved when requested or required, and client nodes outside of the
cluster can still use this service IP label to access the resource group service.
An IP address at one site might not be valid at the other site because of subnet issues.
Therefore, it might not be possible or desirable to interconnect the network devices, although
it is possible to configure site-specific service IP labels to handle this situation.
Figure: site-specific service IP labels, with a site service IP label active on the server nodes at each site over the local networks (LAN #1 and LAN #2).
Apart from the site-specific limitation and the inability to work with NFS cross-mounts between
the sites, these service IP labels have the same functions as regular service IP labels.
The panel in Figure 2-12 shows the additional attribute that is available when you define
service IP labels in PowerHA.
Change/Show a Service IP Label/Address (extended)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                       [Entry Fields]
* IP Label/Address                                     svcxd_a1_sv
  New IP Label/Address                                 []                  +
  Netmask(IPv4)/Prefix Length(IPv6)                    [24]
* Network Name                                         [net_ether_01]      +
  Alternate HW Address to accompany IP Label/Address   []
  Associated Site                                      svc_sitea           +
On the same cluster, Example 2-1 shows how the resource group service IP definition might
look.
Example 2-1 Resource group service IP definition
# clshowres
Resource Group Name                     RG_sitea
Participating Node Name(s)              svcxd_a1 svcxd_a2 svcxd_b2 svcxd_b1
Startup Policy                          Online On First Available Node
Failover Policy                         Failover To Next Priority Node In The List
Failback Policy                         Never Failback
Site Relationship                       Prefer Primary Site
Node Priority
Service IP Label                        svcxd_a1_sv svcxd_b1_sv
..............
For more information about setting up site-specific service IP labels, see the HACMP for AIX
6.1 Administration Guide, SC23-4862, at:
http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.hacmp.admngd/hacmpadmngd_pdf.pdf
With site-specific service IP labels, consider how the client nodes outside of the cluster
become aware that the service is provided over another IP address. The most efficient and
reliable solutions are when the client node application that uses the cluster node service is
cluster aware and can detect and reroute communication from the failing node or site to the
active node or site as required.
There are several ways to inform client nodes of IP service label changes. However, examine
how and when the client application handles IP address resolution: at startup or on demand.
If resolution happens at startup, a change usually requires a restart of the application. If it
happens on demand, it might introduce name resolution delays, which in turn can be handled
outside of the application by using either static host tables (/etc/hosts) or DNS caching only
on the local client node. Dynamic alteration of DNS records is another possibility, but avoid
manipulating DNS structures if possible.
Time settings: DNS changes are not propagated in real time throughout a segmented
DNS structure. Individual records either have the Start of Authority (SOA) time settings, or
might have individual record-specific time settings. In either case, they are not real time.
Example 2-2 shows a simple nslookup of the ibm.com domain. The expire attribute in this
case is seven days. It indicates how long the information can be held before it is no longer
considered authoritative, such as with a secondary or cache-only domain name server. In this
case however, the refresh is once every hour, which is the time interval between
polling checks from the secondary to the primary for record changes (indicated by a higher
serial number).
Example 2-2 Using nslookup to view domain name SOA record
bjro@paco24:/Users/bjro: nslookup
> set querytype=soa
> ibm.com
Server:         192.168.0.254
Address:        192.168.0.254#53

Non-authoritative answer:
ibm.com
        origin = ns.watson.ibm.com
        mail addr = dnstech.us.ibm.com
        serial = 2010040100
        refresh = 3600
        retry = 1800
        expire = 604800
        minimum = 10800
Authoritative answers can be found from:
ibm.com  nameserver = internet-server.zurich.ibm.com.
ibm.com  nameserver = ns.watson.ibm.com.
ibm.com  nameserver = ns.austin.ibm.com.
ibm.com  nameserver = ns.almaden.ibm.com.
ns.almaden.ibm.com              internet address = 198.4.83.35
internet-server.zurich.ibm.com  has AAAA address 2001:620:20:fe01:100::2000
ns.austin.ibm.com               internet address = 192.35.232.34
ns.watson.ibm.com               internet address = 129.34.20.80
If the client nodes can use a domain name server with support for the RFC 2782, you can use
the DNS Resource Record (RR) SRV (SeRVice) to move a service from host to host and to
designate some hosts as primary servers for a service and others as backups. For more
information, go to RFC 2782 at:
http://tools.ietf.org/html/rfc2782
The SRV RR has the following format:
_Service._Proto.Name TTL Class SRV Priority Weight Port Target
As stated in RFC 2782, "A client MUST attempt to contact the target host with the
lowest-numbered priority it can reach."
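As an illustrative sketch only (the _ldap service and the example.com domain are hypothetical placeholders, not part of this configuration), an SRV record can be queried in the same interactive nslookup style as Example 2-2:
> set querytype=srv
> _ldap._tcp.example.com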
Figure: cross-site LVM mirroring over a TCP/IP network, with Node A, SAN-A devices, and Storage A at Site A, and Node B, SAN-B devices, and Storage B at Site B, and an LVM mirror spanning Storage A and Storage B.
Independent storage subsystems are on each site. Storage subsystems might be of different
models and providers. Although it is not a requirement, for performance and functionality
considerations, we implemented storage environments with similar characteristics at both sites.
SAN zoning and volume mapping on each storage environment must be configured to enable
each node in the cluster to access both local and remote storage volumes.
The SAN can be expanded beyond one site by using communication-extending technologies
that also determine the maximum distance of the FC connections between the sites. The
following list provides examples of the types of technologies that might be used:
Direct FC links between FC switches that use long-wave gigabit interface converter
(GBIC) modules in switches at both sides and single-mode interconnect optical links
Fibre Channel over IP (FCIP) connections that use independent modules, or specific
modules on SAN switches, to transport the FC frames over TCP/IP communication links
Wavelength division multiplexing (WDM) devices, which include coarse wavelength division
multiplexing (CWDM) and dense wavelength division multiplexing (DWDM)
For more information about WDM devices, see 2.1.7, DWDM on page 49.
When you use cross-site LVM, consider changing the default values of the following attributes
for the FC SCSI protocol devices (fscsi) on all host bus adapter (HBA) ports:
fc_err_recov = fast_fail to enable fast I/O failure recovery
dyntrk = yes to enable the FC device drivers to reroute the traffic to the target device in
case the SCSI ID has changed
To check the current attributes of the HBA ports, see the lsattr command that is shown in
Example 2-3.
Example 2-3 State of the FC SCSI protocol devices
# lsattr -El fscsi0
attach        switch
dyntrk        yes
fc_err_recov  fast_fail
scsi_id       0x70400
sw_fc_class   3
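A minimal sketch of setting these attributes follows; it assumes the fscsi0 device from Example 2-3, and the -P flag is used so that the change is recorded in the ODM and applied at the next reboot when the adapter cannot be taken offline:
# Enable fast I/O failure and dynamic tracking on the FC SCSI protocol device
chdev -l fscsi0 -a fc_err_recov=fast_fail -a dyntrk=yes -P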
For more information about the two attributes and their interaction, see Fast I/O Failure and
Dynamic Tracking interaction in the System p and AIX Information Center at:
http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/fast_fail_dynamic_interaction.htm
The shared data in the cluster is mirrored between the storage volumes that are placed in
different sites by the AIX LVM. LVM data mirroring keeps both copies of the data in sync.
PowerHA drives automatic LVM mirroring synchronization, and after the failed site joins the
cluster, it automatically fixes removed and missing volumes (physical volumes, or PV, states
removed and missing) and synchronizes the data. Automatic synchronization is not possible
for all cases, but you can use C-SPOC to synchronize the data from the surviving mirrors to
stale mirrors after a disk or site failure.
Figure 2-14 Typical PowerHA Enterprise Edition environment using storage replication
Nodes in each site have access to their local storage by using either SCSI or FC links. No
path is defined between a node at one site and the storage at the remote site; the data
volumes on the remote storage are accessed only through a failover or failback of the
replicated volume pairs.
In general, storage subsystems from both sites are connected for data replication by using
dedicated communication links. Depending on the storage type, there are a couple of
technologies available for the replication links (see 2.1.6, Fibre Channel principles of
distance on page 48, and 2.1.7, DWDM on page 49):
Fibre Channel (FC) and FCIP (transports the FC frames over an Internet Protocol network)
IBM ESCON
Consult the storage product and the PowerHA Enterprise Edition documentation for the
supported types of technologies available for a specific product and PowerHA Enterprise
Edition version.
The cluster nodes need to access the storage subsystems at both sites to manage the
replicated pairs. Depending on the storage subsystem and the management software that is
used with PowerHA Enterprise Edition, this type of connection can be supported in the
following ways:
An Internet Protocol network that uses one or two IP segments that are routed between
sites with PowerHA Enterprise Edition that uses IBM storage subsystems
FC/SCSI links between the nodes and the storage in each site, and the storage replication
links for EMC Symmetrix replication (SRDF)
PowerHA Enterprise Edition 6.1 supports the PowerHA Enterprise Edition for Metro Mirror
Software environment for storage-based replication (as per the 6.1 Release Notes, only ESS
Model 800 is supported). This solution is available for the following IBM storage systems:
Hardware type                                                                     PPRC is managed by                                                                Replication type
ESS 800                                                                           Copy Services Server (a) (CSS, on storage controller)                             Synchronous
ESS (800) or DS (8000, 6000) or intermix of any of these                          DSCLI (c) management, via ESSNI (b) Server on either storage controller or HMC    Synchronous
SAN Volume Controller (hardware as supported by SAN Volume Controller services)   SVC management of Copy Services on SVC-specific hardware                          Synchronous/asynchronous
a. Copy Services Server (CSS) represents the service running on the ESS controller or SVC node, managing the
copy services on the storage.
b. ESSNI stands for Enterprise Storage Server Network Interface. The ESSNI server runs on HMC (DS6000/8000)
or storage controller (ESS).
c. DSCLI represents the command-line interface used to manage the storage configuration and copy services
operations.
PowerHA Enterprise Edition with different PPRC management types can coexist on the same
PowerHA cluster only if the PPRC pairs are managed by one of the PPRC solutions at a time.
See the latest support information for which PPRC solutions can successfully coexist on a
single PowerHA cluster.
The data replication between storage subsystems can be performed on FC or ESCON links.
PowerHA Enterprise Edition requires that inter-storage links be available to carry data in both
directions at the same time. If ESCON links are used, a minimum of two ESCON links is
required because each link can carry data in only one direction at a time. Have at least four
links to improve throughput and to provide redundancy of the ESCON cables and adapters.
PowerHA Enterprise Edition supports ESS/DS Metro Mirror replication in the following
configurations:
Direct management PPRC (only for ESS systems)
DSCLI PPRC management (for a cluster that uses DS6000, DS8000, or ESS Model 800)
ESS storage can also be used in DSCLI management configurations. In this case, you also
must install the ESS CLI software on the cluster nodes. Cluster software for DSCLI
configurations expects the ESS CLI file sets to be installed in the /opt/ibm/ibm2105cli
non-default directory.
DSCLI PPRC resources: ESS storage resources (LSS and LUNs) are considered DSCLI
PPRC resources in this type of configuration because they are managed by using the
DSCLI interface and not the ESS CLI.
The XD Release Notes always have the most up-to-date information about the required
software levels. They are available in the /usr/es/sbin/cluster/release_notes_xd file.
SAN Volume Controller hardware supports only FC protocol for data traffic inside and
between sites. It requires a SAN switched environment. FCIP routers can also be used to
transport the FC data frames over an Internet Protocol network between the sites.
Management of the SAN Volume Controller PPRC pairs is performed by using SSH over an
Internet Protocol network. Each cluster node must have the openssh package installed and
configured to access the SAN Volume Controllers in both sites.
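As a minimal sketch (the admin user ID and the svc_cluster_a host name are illustrative), key-based SSH access from a cluster node to each SAN Volume Controller cluster can be verified with a simple query:
# Verify SSH access to the SVC cluster and display the cluster identity
ssh admin@svc_cluster_a svcinfo lscluster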
PowerHA Enterprise Edition with SAN Volume Controller replication supports the following
options for data replication between the SAN Volume Controller clusters:
Metro Mirror providing synchronous remote copy. Changes are sent to both primary and
secondary copies, and the write confirmation is received only after the operations are
complete at both sites.
Global Mirror providing asynchronous replication. Global Mirror periodically starts a
point-in-time copy at the primary site without impacting the I/O to the source volumes. This
feature was introduced in SAN Volume Controller Version 4.1.
Global Mirror is used for greater distances. Such factors as distance, bandwidth, and
latency of the communication links between the sites can affect application performance
when using synchronous mirroring. For these considerations, Global Mirror can be an
alternative replication solution to SAN Volume Controller Metro Mirror. According to the
latest flash, SAN Volume Controller 5.1 support is now available.
In the current integration release, the following Symmetrix models are supported:
DMX-3
DMX-4
V-MAX
For more information about the Symmetrix models and firmware versions that are supported
for the PowerHA Enterprise Edition integration, contact the EMC support representative. Also
consult the available bulletins for RFA. For more documentation resources, see the EMC
Powerlink website at:
https://powerlink.emc.com
You can achieve EMC SRDF storage management with the EMC-supplied Symmetrix
command-line interface (SYMCLI) from the PowerHA environment. Using SYMCLI allows
PowerHA software to automatically manage SRDF links and manage switching the direction
of the SRDF relationships when a site failure occurs. If a primary site failure does occur, the
target (secondary) site can take control of the managed resource groups that contain SRDF
replicated resources from the primary site. For more information about the prerequisites and
configuration of PowerHA Enterprise Edition cluster that uses EMC SRDF, see Chapter 7,
Configuring PowerHA SystemMirror Enterprise Edition with SRDF replication on page 267.
The XD Release Notes have the most up-to-date information about the required software
levels. They are available in the /usr/es/sbin/cluster/release_notes_xd file.
For a SAN Volume Controller environment, specific zoning requirements apply because of the
storage virtualization role of the SAN Volume Controller. Figure 2-15 shows a typical
configuration that uses SAN Volume Controller inter-cluster communication between two sites
for remote copy services.
Figure 2-15 Configuration example with SAN Volume Controller inter-cluster communication
Our example has two SAN Volume Controller clusters, each with two internal nodes (one I/O
group). Four FC switches are grouped in two pairs. Each pair contains a switch from site A
connected to a switch at site B in the same fabric. Each SAN Volume Controller internal node
uses four ports to connect to the SAN switches:
T1 and T3 from each node are connected to the fabric 1 (SWA1-SWB1).
T2 and T4 ports are connected to the fabric 2 (SWA2-SWB2).
The following examples show the relevant part of the cfgshow command output that is used in a
Brocade SAN environment. The zone definitions use aliases that are associated with the WWPN
of the HBAs on the nodes, DS8000 storage I/O ports, and SAN Volume Controller ports.
The following types of zones are defined:
Host-to-SVC. Each host HBA is zoned with a port on each internal node of the SAN Volume
Controller. Example 2-4 shows the zone configuration that is defined on each fabric for
node A and SVC A (8G4). A similar configuration applies for node B accessing SVC B.
Example 2-4 Zoning configurations for node access to SAN Volume Controller (site A)
Fabric 1:
zone:   nodeA_fcs0__svc_8g4_T1
        nodeA_fcs0; svc_8g4_n1_T1; svc_8g4_n2_T1
Fabric 2:
zone:   nodeA_fcs1__svc_8g4_T2
        nodeA_fcs1; svc_8g4_n1_T2; svc_8g4_n2_T2
SVC-to-storage. All ports of the SAN Volume Controller cluster nodes and storage within
the same fabric are part of the same zone. Example 2-5 shows the zoning configuration
between the SAN Volume Controller ports and DS8K storage at site A. A similar
configuration applies to site B.
Example 2-5 Zoning configuration for SAN Volume Controller to storage access (site A)
Fabric 1:
zone:
svc_8g4__ds8ka_fabric1
svc_8g4_n1_T1; svc_8g4_n1_T3;
svc_8g4_n2_T1; svc_8g4_n2_T3;
ds8ka_I0300; ds8ka_I0600
Fabric 2:
zone:
svc_8g4__ds8ka_fabric2
svc_8g4_n1_T2; svc_8g4_n1_T4;
svc_8g4_n2_T2; svc_8g4_n2_T4;
ds8ka_I0301; ds8ka_I0601
SVC-to-SVC for the inter-cluster communication relationship and remote copy services:
We define all SAN Volume Controller ports from both sites, part of the same fabric in the
same zone on each fabric (Example 2-6).
Example 2-6 Zoning configuration for SAN Volume Controller inter-cluster relationship
Fabric 1:
zone:
svc_8g4__svc_4f2_fabric1
svc_8g4_n1_T1; svc_8g4_n1_T3;
svc_8g4_n2_T1; svc_8g4_n2_T3;
svc_4f2_n1_T1; svc_4f2_n1_T3;
svc_4f2_n2_T1; svc_4f2_n2_T3
Fabric 2:
zone:
svc_8g4__svc_4f2_fabric2
svc_8g4_n1_T2; svc_8g4_n1_T4;
svc_8g4_n2_T2; svc_8g4_n2_T4;
svc_4f2_n1_T2; svc_4f2_n1_T4;
svc_4f2_n2_T2; svc_4f2_n2_T4
networks. GLVM can use synchronous or asynchronous modes to replicate the data between
the local and the remote copy of the data.
Storage volumes in a GLVM environment might be allocated to a single host or shared
between multiple cluster nodes within the same site. The data replication layer is integrated
with AIX LVM mirroring functions to provide flexible and simplified management. For more
information about the GLVM configuration, see Chapter 8, Configuring PowerHA
SystemMirror Enterprise Edition with Geographic Logical Volume Manager on page 339.
Figure: virtualized PowerHA nodes (LPARs X, Y, and Z) with rootvg and data volumes, each using virtual Ethernet (en0) and virtual FC adapters (vfc0, vfc1) through dual VIO servers to reach the LAN and the SAN, with an external storage enclosure that holds the rootvg and data volumes and provides data replication.
Designing virtualized cluster solutions with best practices in mind ensures that the
environments have the same or better availability as ones that use dedicated adapters. This
approach includes practices such as the use of dual VIO servers and Shared Ethernet
Adapter (SEA) failover.
For more information and considerations about the IBM PowerVM Virtual I/O Server, see
Power Systems: Virtual I/O Server at:
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/topic/iphb1/iphb1.pdf
Figure 2-17 shows an example of how the environment backing the PowerHA cluster nodes
can look in a virtualized environment.
Figure 2-17: on each frame, dual Virtual I/O Servers aggregate their physical adapters (ent0 and ent1) into ent3 (LA) and bridge it through ent4 (SEA) to the virtual adapter ent2, with ent5 (virt) serving as the SEA control channel, connecting through the hypervisor to the external Ethernet switches.
The following rules apply for Ethernet configurations that use the Standard and Enterprise
Editions of the IBM PowerHA solution:
IPAT by way of aliasing must be used.
All virtual Ethernet interfaces that are defined to HACMP should be treated as
single-adapter networks and use the ping_client_list attribute to monitor and detect failure
of the network interfaces (/usr/es/sbin/cluster/netmon.cf file).
The Shared Ethernet Adapter bridge is active on one VIOS until failover. The VIOS with
the lowest priority value is used as the primary link bridge.
Link aggregation or EtherChannel configurations on VIOS: Link aggregation or
EtherChannel configurations on the VIOS have been implemented for extra bandwidth.
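As a minimal sketch of the SEA failover setup described above (the adapter numbering follows the layout described for Figure 2-17 and, like the default PVID, is illustrative), an SEA might be created on each VIOS as follows:
# As padmin: bridge the link aggregation (ent3) to the virtual adapter (ent2),
# using ent5 as the control channel for SEA failover between the two VIOS
mkvdev -sea ent3 -vadapter ent2 -default ent2 -defaultid 1 -attr ha_mode=auto ctl_chan=ent5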
In a PowerHA environment that uses Live Partition Mobility between two frames, there are
times when the primary and failover LPARs might end up on the same side. In this scenario, if
IP connectivity to the outside world is lost, heartbeats between the nodes can continue to
pass through the hypervisor unless entries with the appropriate formatting were implemented
in the /usr/es/sbin/cluster/netmon.cf file.
Figure 2-18 shows how this situation can occur if maintenance was being performed on the
primary machine. In this example, both the primary and secondary cluster nodes are
temporarily hosted on the same physical machine. While in this state, the physical target
server introduces a temporary single point of failure.
Figure 2-18 Live Partition Mobility operation - both cluster nodes on same frame (an LPM operation moves PowerHA Node 1 from Frame 1 to Frame 2, where PowerHA Node 2 already runs; both frames use dual VIOS, and rootvg and datavg reside on SAN storage)
The new format for entries within the netmon.cf file helps the netmon logic probe IPs and
interfaces outside of the machine to establish an accurate status.
Format: For single-adapter virtualized networks in a PowerHA cluster, verify that you are
using the new /usr/es/sbin/cluster/netmon.cf format, as explained in APAR IZ01332:
NEW NETMON FUNCTIONALITY TO SUPPORT HACMP ON VIO at:
http://www.ibm.com/support/docview.wss?uid=isg1IZ01332
See the following sample format:
10.12.4.11
10.12.4.13
!REQD en2 100.12.7.9
!REQD en2 100.12.7.10
!REQD host1.ibm 100.12.7.9
!REQD host1.ibm host4.ibm
!REQD 100.12.7.20 100.12.7.10
In a multisite cluster, the use of virtual interfaces can be advantageous. Because the two sites
cannot be on the same network segment, the use of a single interface on the clients can help
reduce the number of IPs required to fulfill the PowerHA subnetting requirements.
IP requirement: Another requirement for IPs is that they must be pingable from the Boot
IP. For more information, see Choosing IP addresses for the netmon.cf file in the AIX 6.1
Information Center at:
http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.
hacmp.plangd/ha_plan_netmon_cf_file.htm
With the use of a single static address at each site, the cluster can establish a network
heartbeat ring while still providing adapter redundancy and load balancing.
If only a single virtual interface was defined on the client, load balancing can be achieved by
aggregating multiple interfaces for the SEA defined on the VIOS. The simple choice is to
configure the link aggregate in a network interface backup (NIB) fashion and connect each
adapter to a different switch. This approach provides switch redundancy, but no load
balancing. To set the round-robin policy and load balance within that link aggregate, you must
connect the active links to the same switch. Therefore, when you evaluate this method,
consider a minimum of three adapters for the link aggregate. The first two active links connect
to the first switch, and the third backup adapter connects to a second switch. Load balancing
occurs between the active links, and the backup adapter is not used until either the first switch
fails or all of the active adapters in the link aggregate go offline.
Tip: There are no limitations in PowerHA SystemMirror regarding whether the network
interfaces are virtual or dedicated at each site. Therefore, one site can use dedicated
adapters, and the other site can use virtual ones.
Figure: an NPIV-enabled SAN in which multiple VIO clients behind each VIOS present their own WWPNs through NPIV-capable adapters.
The physical adapter is allocated to the VIO server but, in contrast to the VSCSI model, after
the virtual fiber adapters are defined to the clients and the new WWPNs are included in the
zoning and mapping definitions, the LUNs are assigned to the clients in the same way as with
dedicated adapters. Therefore, the LUN mappings bypass the VIOS, and the physical
volumes are detected on the client as though it were in a dedicated environment.
Figure 2-20 shows a sample virtualization configuration that contrasts VSCSI volumes with
virtual fiber volumes. In this example, the first three volumes, which include the rootvg, are
attached by using VSCSI server (vhost) and client definitions (vscsi). The next two volumes
(hdisk3 and hdisk4) in the npiv_vg volume group are presented through virtual fiber adapters.
The disk characteristics for these two LUNs show up on the host in the same fashion as when
using dedicated HBAs.
Figure 2-20: on the Mercury and Zeus nodes, hdisk0, hdisk1, and hdisk2 (rootvg and vscsi_vg) are served as VSCSI volumes through vhost/vscsi pairs on dual VIOS with MPIO, while hdisk3 and hdisk4 (npiv_vg) are presented through NPIV virtual FC adapters (fcs0 and fcs1) with MPIO from the storage subsystem LUNs.
PowerHA Enterprise Edition is supported on IBM Virtual I/O clients for SAN Volume
Controller, DS/ESS PPRC, and GLVM environments. The VIOS allows a machine to be
divided into LPARs, with each LPAR running a separate OS image, which allows the sharing
of physical resources between the LPARs, including virtual SCSI and virtual Ethernet. The
VIOS can also support the NPIV feature, which provides a virtual Fibre Channel adapter to
the client partition, allowing transparent access to external storage subsystems. PowerHA
Enterprise Edition 6.1 supports both VSCSI and NPIV storage configurations with VIOS 2.1.
Figure 2-21 shows an example of using Virtual SCSI for a SAN Volume Controller replication
environment.
Figure 2-21 Using virtual SCSI in a SAN Volume Controller PPRC environment
The VIOS has a few disks that can be SCSI or Fibre Channel SAN disks. The VIO clients use
the VIO client device driver just as they would with a regular local device disk to communicate
with the matching server VIO device driver. Then, the VIOS does the disk transfers on behalf
of the VIO client. Because the SAN Volume Controller devices are not directly attached to the
VIO clients, normal query commands, such as lscfg, lsvpcfg, and datapath query device,
cannot be used to extract the necessary SAN Volume Controller vdisk information.
The PowerHA Enterprise Edition solution has the support for using the disk resources that are
defined on the VIO clients in the cluster environment, for example:
It allows disks that are defined as HACMP resources to be traced back to the source
physical disk on the SAN Volume Controller.
When you define the SAN Volume Controller PPRC resources, you use VDisks from the
SAN Volume Controller configuration.
Cluster events run SAN Volume Controller commands against VDisks, mapping the hdisk on
the VIO client to the corresponding VDisk on the SAN Volume Controller.
The disk definition in VIOS to the client partition is one-to-one between the VDisk that is
assigned to the VIOS and the hdisk that is defined in the VIO client. Configurations that use
logical volumes or file-backed devices that are mapped to the VIO clients are not allowed.
No special configuration steps are required for defining the disks to the client partitions. Use
the following steps to configure the disks on the client partition. We assume that the SAN
Volume Controller configuration is prepared and that the mdisks are already defined:
1. On the SAN Volume Controller clusters:
a. Identify the managed disks (MDisks):
svcinfo lsmdisk
b. Identify or create the managed disk groups (MDisk groups) by using one of the following
commands:
svcinfo lsmdiskgrp
svctask mkmdiskgrp
c. Identify or create the virtual disks by using one of the following commands:
svcinfo lsvdisk
svctask mkvdisk
d. Map the VDisks to the VIO servers as hosts:
svctask mkvdiskhostmap
2. On the VIOS:
a. Access the regular AIX command-line interface on the VIO server:
oem_setup_env
b. Run cfgmgr or, to speed up the process, run it with the -vl flag against the specific
virtual adapter (vio#):
# cfgmgr -vl vio0
c. Identify the hdisks/vpaths mapped to the SAN Volume Controller VDisks on the
servers:
odmget -q "id=unique_id" CuAt
d. Select the disk to export by running lsdev to show the virtual SCSI server adapters that
can be used for mapping with a physical disk.
e. Run the mkvdev command by using the appropriate hdisk numbers to create the virtual
target device. (This command maps the LUNs to the virtual I/O clients.)
$ mkvdev -vdev hdisk# -vadapter vhost# -dev vhdisk#
f. To bring the vtd and vhost devices from the available state to the defined state, run the
following commands:
# rmdev -l <vtd device>
# rmdev -l vhost#
NPIV/Live Partition Mobility: The SAN Volume Controller Global Mirror and DS Metro
Mirror scenarios in this book were built by using NPIV/Live Partition Mobility capable
configurations at the primary site that replicated to a non-virtualized remote site.
From a positioning standpoint, Live Partition Mobility is designed for planned maintenance.
For the movement to work, all the resources must be virtualized. The minimum HMC
requirement for Live Partition Mobility was V7 R3.2, which required all of the systems to be
visible by the same HMC. Inter-HMC Live Partition Mobility operations became available in
the V7 R3.4 HMC update where that requirement was lifted and each system can now be
managed by its own HMC. If ssh authentication is enabled between the HMCs, then during
the Live Partition Mobility selections, there is an option to specify the target HMC and server
to which to migrate. When the move is initialized, a validation process must complete and
then the servers memory is cached and moved over when fully synchronized. The cutover
results in approximately a 2-second pause in the processing.
In contrast, PowerHA is designed for both planned and unplanned outages. The cluster
software can move the application by initiating a graceful stop of the resources and by
invoking the start scripts at the failover target node. If an unexpected outage occurs, the
cluster software detects the failure, initiates a takeover, and manages the activation of
resources automatically.
One of the assumptions for Live Partition Mobility to function is that all nodes start from the
SAN. You can begin to see where NPIV function is really beneficial in DR implementations.
Such an environment has the same look and feel as an environment that uses dedicated
adapters, and provides a view of the cluster LUNs that is not affected by the VSCSI limitations.
To better understand this concept, we must first identify the various components that back the
environment. Figure 2-22 shows a common Live Partition Mobility capable clustered
environment.
Figure 2-22 Live Partition Mobility with SAN booting points of failure (Cluster 1, network net_ether0: Node A on Frame A and Node B on Frame B, each behind dual VIO servers and HBAs, connected through redundant SAN switches and storage controllers to the storage subsystem that holds rootvg on hdisk0 and sharedvg on hdisk1 and hdisk2)
Multitude of clients: Even though Figure 2-22 only shows one client LPAR behind the VIO
servers, typical environments back a multitude of clients.
Figure 2-22 highlights how many layers of redundancy are overlooked by simply pulling all of
the fiber interconnects from one side. Working our way from the storage subsystem up, we
can make the following assumptions:
LUNs presented from SAN-attached enclosures are typically RAID-protected on the back
end and are therefore protected from physical disk drive failures.
The use of multiple storage controller connections also provides protection against
internal component failures.
The use of a redundant fabric provides protection against the loss of a SAN switch.
The use of dual VIOS provides multipathing and protects against a VIO or physical HBA
failure.
Virtualization and Live Partition Mobility can provide significant benefits in most
environments. This configuration helps you understand the significance of implementing a
redundant infrastructure and certain new complexities about component failure testing in a
virtualized environment.
Test scenarios: All of the test scenarios in this book were performed using PowerHA
Enterprise Edition 6.1 SP1 on AIX 6.1 TL2 - SP3, with the exception of the GLVM
migration section.
The Enterprise Edition requires the installation and acceptance of license agreements for
both the Standard Edition cluster.license file set and the Enterprise Edition
cluster.xd.license file set, as shown in Table 2-6, for the remainder of the file sets to install.
Table 2-6 PowerHA Enterprise Edition - required file set
Required package:  cluster.xd.license
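A minimal installation sketch follows (the installation source directory is illustrative); the installp -Y flag accepts the license agreements for both license file sets:
# Install and accept the Standard and Enterprise Edition license file sets
installp -acgXY -d /mnt/powerha_ee cluster.license cluster.xd.license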
The base file sets in the Standard Edition are required to install the Enterprise Edition file
sets. The Enterprise package levels must match those of the base runtime level
(cluster.es.server.rte). Table 2-7 displays the itemized list of file sets for each of the
integrated offerings.
Table 2-7 PowerHA Enterprise Edition - integrated offering solution file sets

Replication management type   File sets
Direct management             cluster.es.pprc.rte
                              cluster.es.pprc.cmds
                              cluster.msg.en_US.pprc
DSCLI                         cluster.es.spprc.cmds
                              cluster.es.spprc.rte
                              cluster.es.pprc.rte
                              cluster.es.pprc.cmds
                              cluster.msg.en_US.pprc
SAN Volume Controller         cluster.es.svcpprc.cmds
                              cluster.es.svcpprc.rte
                              cluster.msg.en_US.svcpprc
Geographic                    glvm.rpv.client
                              glvm.rpv.server
                              glvm.rpv.util
                              glvm.rpv.man.en_US
                              glvm.rpv.msg.en_US
                              cluster.msg.en_US.glvm
EMC SRDF                      cluster.es.sr.cmds
                              cluster.es.sr.rte
                              cluster.msg.en_US.sr
Installation notes
If you are using IPv6 networking:
Contact IBM support to have the latest version of the HMC microcode (bundle 64.30.78.0
or later) installed.
For customers who are using SNMP trap notification (for asynchronous alerts of changes
in the PPRC and consistency group status), a known problem exists with the handling of
traps in a network that uses only IPv6. IBM intends to remove this restriction in a future
service pack. See APAR IZ60486.
The accompanying support matrix lists, for each offering, the minimum PowerHA version and whether AIX 5.3, AIX 6.1, and VIOS/NPIV are supported, together with the required technology levels and RSCT levels (for example, AIX 5.3 TL9 with RSCT 2.4.10.0 and AIX 6.1 TL2 with RSCT 2.5.2.0). Synchronous GLVM requires PowerHA 5.2 plus APAR IY66555 or later, asynchronous GLVM requires PowerHA 5.5 SP1 or later and is supported on AIX 6.1 but not on AIX 5.3, and EMC SRDF support for the DMX-3, DMX-4, and V-MAX arrays requires PowerHA 6.1 with AIX 5.3 TL9 and RSCT 2.4.12.0, or AIX 6.1. Minimum PowerHA levels for the remaining offerings in the matrix range from 5.1 to 5.3 with specific APARs (IY73937 and IY74112).
        ESS              DS8000           DS6000           SVC
6.1     2.2.0.0 (2, 3)   2.2.0.4 (2)      2.2.0.0 (2)      2.2.0.3 (2, 3)
5.3     2.2.0.0 (1, 2)   2.2.0.4 (1, 2)   2.2.0.0 (1, 2)   2.2.0.3 (2, 3)
1. Requires AIX 5.3 TL04 (with all the latest PTFs) or later or AIX 5.3 TL6 SP1 (5300-06-01) or
with VIOS V1.4 or later.
2. Persistent Reservation with PowerHA is not supported. Shared volume groups that are
managed by HACMP and accessed through SDDPCM must be in enhanced concurrent mode.
3. If you are running the SAN Volume Controller release that includes APAR IC55826 (SAN
Volume Controller V4.2.1.6 and later) or the SAN Volume Controller release before V4.2.1.6
with an interim fix on LONG BUSY, the following AIX APARs and AIX interim fix are required
for your AIX TL:
AIX53 TL06: APAR IZ06622 IZ20198 IZ28285 (IZ28285.080728.epkg.Z)
AIX53 TL07: APAR IZ06490 IZ19199 IZ26561 (IZ26561.080728.epkg.Z)
AIX53 TL08: APAR IZ07063 IZ20199 IZ26655 (IZ26655.080728.epkg.Z)
AIX61 TL00: APAR IZ09534 IZ20201 IZ26657 (IZ26657.080728.epkg.Z)
AIX61 TL01: APAR IZ06905 IZ20202 IZ26658 (IZ26658.080728.epkg.Z)
The third column of APARs might not be published yet. However, they are available in the
interim fix format. You can download these interim fixes from the AIX interim fix website, which
is an anonymous FTP site.
Before you install SDDPCM, you must uninstall the ibm2105.rte (Version 32.6.100.XX) and
devices.fcp.disk.ibm.rte (Version 1.0.0.XX) SDD host attachment packages along with the
SDD package. The SDD driver and SDDPCM module cannot coexist on one server.
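A hedged sketch of checking for and removing the conflicting packages follows. The SDD file set name varies with the AIX release, so verify it with lslpp first:

# List any installed SDD host attachment and SDD driver file sets
lslpp -l ibm2105.rte devices.fcp.disk.ibm.rte "devices.sdd.*"
# Remove them before you install SDDPCM (stop applications and remove vpath devices first);
# devices.sdd.61.rte is the assumed package name for AIX 6.1 - use the name that lslpp reports
installp -u ibm2105.rte devices.fcp.disk.ibm.rte devices.sdd.61.rte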
You must install the ESS, DS8000, DS6000, or SAN Volume Controller host attachment for
SDDPCM:
devices.fcp.disk.ibm.mpio.rte (version 1.0.0.12)
This version of the SDDPCM package requires the host to run at least AIX 5.3 TL04 with the latest PTFs. SVC attachment can be achieved with AIX 5.3 TL03 with the following APARs installed:
AIX53 TL03 APAR: IY79165, IY81545
Important: ESS SCSI devices are not supported by SDDPCM, nor is an ESS host
attachment for SCSI MPIO.
If you installed an earlier version (earlier than 33.6.100.9) of the SDDPCM host attachment and are updating to SDDPCM host attachment Version 1.0.0.0 or later, you must remove all SDDPCM MPIO devices before you start to update this package. The reason is that the ESS device type changed, starting from version 33.6.100.9.
        ESS          DS8000       DS6000       SVC
6.1     1.7.1.0      1.7.2.0      1.7.0.0      1.7.1.0
5.3     1.7.1.0 *    1.7.2.0 *    1.7.0.0 *    1.7.1.0 a
For the most current information about the latest release of SDD, see the Troubleshoot page at:
http://www.ibm.com/servers/storage/support/software/sdd
CLI location: We assume that the command-line interface is installed in its old default
location, /opt/ibm/ibm2105cli. You might be required to create a link from the current
default installation directory to this location.
IBM ESS host attachment script disk device driver: ibm2105.rte version 32.6.100.25 or
later. This driver supports all DS and ESS storage units.
For DS8000 when used with PowerHA/XD for Metro Mirror with DSCLI Management:
DSCLI Version 5.3.1.236 or later on AIX 6.1
DSCLI Version 5.1.730.216 or later on AIX 5.3
For DS6000 when used with PowerHA/XD for Metro Mirror with DSCLI Management:
DSCLI Version 5.3.1.236 or later on AIX 6.1
DSCLI Version 5.1.730.216 or later on AIX 5.3
For ESS 800 when used with PowerHA/XD for Metro Mirror with DSCLI Management:
ESSCLI Version 2.4.4.129 or later when running on AIX 6.1
ESSCLI Version 2.4.4.129 or later when running on AIX 5.3
Before you install or migrate to PowerHA/XD for Metro Mirror with DSCLI, ensure that you
installed the following prerequisite software:
PowerHA/XD for Metro Mirror with DSCLI requires AIX 5.3 or later.
PowerHA/XD for Metro Mirror with DSCLI requires HACMP Version 5.4 or later.
When you combine heterogeneous storage units (ESS800, DS6000, and DS8000):
The ESS CLI version as stated for ESS800 support must be installed on all nodes in
the cluster.
DSCLI versions must be installed as stated for each storage type.
SDD or SDDPCM drivers must be installed on the nodes local to each storage type.
2.5.5 Requirements for PowerHA Enterprise Edition for Metro Mirror with SAN
Volume Controller
The openssh Version 3.6.1 or later software (for access to the SAN Volume Controller interfaces) and the corresponding microcode levels are required.
When using SDDPCM:
Subsystem Device Driver Path Control Module (SDDPCM): Version 2.2.0.0 or later
IBM Host attachment scripts:
Part 2
Chapter 3.
Figure 3-1 Cross-site LVM mirroring environment with node Zhifa at site POK and node Rebecca at site NY (IBM Power servers with AIX 6.1 and PowerHA 6.1), connected through a WAN and a SAN to an IBM DS8000 and an IBM DS4800, with the volume groups pokvg and nyvg each keeping LVM copy 1 and LVM copy 2 on hdisks from both storage subsystems
Figure 3-1 shows that cross-site LVM mirroring consists of at least two sites with one node
server that is connected to an external storage at each site. These sites are connected by
using an IP network and a SAN network. The data availability is ensured through the LVM
mirroring between the volumes that are on separate storage subsystems on separate sites.
These remote disks can be combined into a volume group by using the AIX Logical Volume
Manager. This volume group can be imported to the nodes at separate sites. You can create
logical volumes and set up an LVM mirror with a copy at each site. The number of active sites that is supported in a cross-site LVM mirroring PowerHA environment is limited to two.
If a site failure occurs, PowerHA SystemMirror performs a takeover of the resources to the
secondary site according to the cluster policy configuration. It activates all defined volume
groups from the surviving mirrored copy. If one storage subsystem fails, data access is not
interrupted and applications can access data from the active mirroring copy on the surviving
disk subsystem.
PowerHA SystemMirror also drives automatic LVM mirroring synchronization after site or
storage failure. When the failed site joins the cluster, it automatically fixes removed and
missing volumes (PV states removed and missing) and synchronizes the data. Automatic
synchronization is not possible for all cases, but we can use the C-SPOC menu to
synchronize the data from the surviving mirrors to stale mirrors after a disk or site failure.
The limitation of this solution is the distance between the primary site and the secondary site.
The distance is closely related to the SAN technology because it uses synchronous mirroring
that requires high bandwidth. The following protocol technologies are usually used for
connection between sites in a cross-site LVM mirroring:
Fibre Channel over IP (FCIP)
FCIP is a protocol specification that was developed by the Internet Engineering Task
Force (IETF). It allows a device to transparently tunnel Fibre Channel frames over an IP
network. An FCIP gateway or edge device attaches to a Fibre Channel switch and
provides an interface to the IP network. At the remote SAN, another FCIP device receives
incoming FCIP traffic and places Fibre Channel frames back onto the SAN. FCIP devices
provide Fibre Channel expansion port connectivity, creating a single Fibre Channel fabric.
Wave division multiplexing (WDM) devices
This technology includes the following multiplexing types:
Coarse wavelength division multiplexing (CWDM), which is the less expensive of the WDM technologies
Dense wavelength division multiplexing (DWDM), which is explained in 2.1.7, DWDM on page 49.
Cross-site LVM mirroring has the following advantages:
It reduces license costs because it requires only the IBM PowerHA SystemMirror
standard license.
Easy setup and maintenance, especially for a customer who has good AIX LVM or
PowerHA skills.
Cross-site LVM mirroring has the following considerations:
It supports only synchronous replication.
You must provide SAN or Fibre Channel connectivity between the sites.
The new SAN Volume Controller offering provides support for a split I/O group (Figure 3-2)
configuration. It allows the SAN Volume Controller cluster nodes in the same I/O group to be
split across buildings within a complex. This offering was formerly available through RPQ with
older SAN Volume Controller versions. It is now considered a supported configuration in the
SAN Volume Controller 5.1 code update.
Figure 3-2 SAN Volume Controller split I/O group configuration across failure domains 1, 2, and 3, with SVC node 1 and node 2 connected by ISL 1 and ISL 2, VDisk mirroring between the domains, and SVC quorum disks, including an active quorum on a disk system that supports Extended Quorum
This solution assumes the use of the SAN Volume Controller as the storage front end within
the environment and is comparable to a configuration that uses AIX logical volume mirroring.
The difference is that, instead of detecting two physical volumes, only a single VDisk is visible
at the host level. The mirroring is not visible. To the cluster, it looks like a traditional
shared-disk environment. However, SAN Volume Controller provides protection against the
loss of a storage enclosure, and, if a failure occurs, any stale partitions are not an issue for
the host.
In implementations that use AIX 6.1, use mirror pools to harden logical volume mirroring and
prevent the accidental spanning of logical partitions onto the wrong mirror copy whenever
extending logical volumes through the command line. A solution that uses VDisk mirroring
avoids this extra step and can simplify the management of the environment at the host level. It
also avoids the extra processor utilization at the host level to maintain the mirrors and
manage tasks such as mirror write consistency (MWC).
This approach has the following advantages:
As data is synchronously transferred, the distance between the local and the remote disk
subsystems determines the effect on application response time. Figure 3-3 illustrates the
sequence of a write update with Metro Mirror.
Figure 3-3 Metro Mirror write sequence: (1) the server writes to the primary (source) LUN or volume, (2) the local DS8000 writes to the secondary (target) LUN or volume on the remote DS8000, (3) the secondary returns a write complete acknowledgment, and (4) the write is acknowledged to the server
When the application performs a write update operation to a source volume, the process has the following flow:
1. The application writes to the primary (source) volume, and the write is stored in the cache of the local disk subsystem.
2. The local disk subsystem sends the write to the secondary (target) volume on the remote disk subsystem.
3. The remote disk subsystem returns a write complete acknowledgment to the local disk subsystem.
4. The local disk subsystem acknowledges the write to the application, which can then continue with the next operation.
The Fibre Channel connection between the local and the remote disk subsystems can be
direct, through a switch, or through other supported distance solutions such as DWDM.
The PowerHA SystemMirror with Metro Mirror provides automated copy split in case of
primary site failure and automated reintegration when the primary site becomes available.
Figure 3-4 shows a typical configuration for PowerHA SystemMirror with Metro Mirror.
Figure 3-4 Typical PowerHA SystemMirror with Metro Mirror configuration: DS/ESS 1 at Location 1 and DS/ESS 2 at Location 2 connected over a WAN (WDM, OC3, or IP), with DS/ESS Copy Services managed through the DS/ESS Specialist over the DS/ESS network
Figure 3-5 illustrates the logical diagram of a PowerHA SystemMirror solution with GLVM.
Figure 3-5 Logical diagram of a PowerHA SystemMirror solution with GLVM: Power 550 nodes (AIX 6.1, PowerHA 6.1, GLVM 6.1) at two sites, one of them site UK, connected through XD_ip and XD_data networks over the LAN and WAN, with hdisk1 - hdisk8 replicated between the sites
GLVM performs the remote mirroring of AIX logical volumes by using the basic AIX LVM
functions for optimal performance and ease of configuration and maintenance.
PowerHA/XD GLVM provides two essential functions:
Remote data mirroring
Automated failover and failback
Together these functions provide high-availability support for applications and data across a
standard Internet Protocol network to a remote site.
PowerHA SystemMirror with GLVM provides the following capabilities:
It allows automatic detection of and response to site and network failures in the
geographic cluster without user intervention.
It performs automatic site takeover and recovery and keeps mission-critical applications
highly available through application failover and monitoring.
It allows for simplified configuration of volume groups, logical volumes, and
resource groups.
It uses the Internet Protocol network for remote mirroring over an unlimited distance.
It supports maximum sized logical volumes.
Figure 3-6 shows a diagram of how PowerHA SystemMirror with GLVM works.
Figure 3-6 PowerHA SystemMirror with GLVM work process when node 1 is active
Figure 3-6 shows an example of two sites, with one node at each site. Node 1 has one physical volume, hdisk4, and the remote node also has one physical volume, hdisk4. These disks are configured into one volume group and mirror each other. Viewing the replication from node 1, we see that the destination physical volume hdisk4 on node 2 is presented on node 1 as hdisk8. Currently, node 2 functions as an RPV server, and node 1 functions as an RPV client.
The reverse condition happens when node 1 is down or the application moves to node 2
(Figure 3-7). The replication occurs from node 2 to node 1 and the physical volume hdisk4
from node 1 is presented in node 2 as hdisk8.
Figure 3-7 PowerHA SystemMirror with GLVM work process when application moves to node 2
PowerHA SystemMirror Enterprise Edition with GLVM has the following key components,
among others:
Remote physical volume (RPV)
RPV is a pseudo device driver that provides access to the remote disks as though they are locally attached. The remote system must be connected by using an Internet Protocol network. The distance between the sites is limited by the latency and bandwidth of the connecting networks.
The RPV consists of two parts:
RPV client
This pseudo device driver runs on the local machine and allows the AIX LVM to access
remote physical volumes as though they are local. The RPV clients are seen as hdisk
devices, which are logical representations of the remote physical volume.
The RPV client device driver appears as an ordinary disk device (for example, RPV
client device hdisk8) and has all its I/O directed to the remote RPV server. It is unaware
of the nodes and networks.
In PowerHA/XD, concurrent access is not supported for GLVM. Therefore, when you
are accessing the RPV clients, the local equivalent RPV servers and remote RPV
clients must be in a defined state.
RPV server
The RPV server runs on the remote machine, one for each physical volume that is being
replicated. The RPV server can listen to a number of remote RPV clients on separate
hosts to handle failover. The RPV server is an instance of the kernel extension of the
RPV device driver with names such as rpvserver0 and is not an actual physical device.
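As an illustration only, a minimal sequence to define such a pair typically looks like the following sketch. The SMIT fastpath names and the device grep are assumptions here; the detailed GLVM configuration is covered later in this book:

# On the remote site node, define an RPV server for each physical volume to be replicated
smitty rpvserver     # assumed fastpath: Add Remote Physical Volume Servers
# On the local node, define the matching RPV client; it appears as a new hdisk (hdisk8 in Figure 3-6)
smitty rpvclient     # assumed fastpath: Add Remote Physical Volume Clients
# Verify that the RPV pseudo devices exist on both nodes
lsdev -C | grep -i rpv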
Example 3-2 shows the results from a 5-minute run of the same rand.ksh script. The output
shows the total IOPS and the overall throughput that is driven through the LUN that is hosting
the unmirrored logical volume.
Example 3-2 Output from ndisk64 test that is run against unmirrored volume
# ./rand.ksh 300
# cat rsync5.03.20.2010
Command: All Writes -R -r 0 -M 30 -s 12G -b 4k -t 300 -f /dev/rmikeylv
Synchronous Disk test (regular read/write)
No. of processes = 30
I/O type         = Random
Block size       = 4096
Read-Write       = Write Only
Sync type: none  = just close the file
Number of files  = 1
File size        = 12884901888 bytes = 12582912 KB = 12288 MB
Run time         = 300 seconds
Snooze %         = 0 percent
----> Running test with block Size=4096 (4KB) ..............................
 Proc - <-----Disk IO----> | <-----Throughput------> RunTime
 Num    TOTAL     IO/sec   |   MB/sec      KB/sec    Seconds
   1   330990     1103.3   |     4.31     4413.22     300.00
   2   334098     1113.7   |     4.35     4454.62     300.00
   ...............
   ...............
  29   334247     1114.2   |     4.35     4456.62     300.00
  30   333953     1113.2   |     4.35     4452.66     300.00
TOTALS 9982634    33275.5  |   129.98   Rand procs= 30 read= 0% bs= 4KB
The total IOPS show a value of 33275.5, which resulted from a test that was run by using 30
threads against an unmirrored SAN-attached SAN Volume Controller LUN. In our scenarios,
more tests were performed by using different thread counts and different local logical
volumes, which had mirroring in place. The summary of the results is described in the next
section.
(The chart plots IOPS for the unmirrored, local mirror, and remote mirror logical volumes at thread counts of 1, 10, and 30.)
Figure 3-8 Cross-site LVM mirroring impact
In the second cluster, we tested SAN-attached volumes from a SAN Volume Controller backed
by DS8000 storage. The mirrored LUNs used for our testing were being replicated by using
Metro Mirror and Global Mirror Copy Service functions. The distance between our two sites was
the same as in our other test scenarios, but we had no control over the additional load that was
being imposed on the SAN Volume Controller by other servers that were attached.
Figure 3-9 shows the mirroring penalty and how the Global Mirror replication provides better
throughput than the Metro Mirrored volumes.
Figure 3-9 IOPS for unmirrored, Metro Mirror, and Global Mirror SAN Volume Controller volumes at thread counts of 1, 10, and 30
In our last clustered environment, we tested DS8000 mirrored and unmirrored LUNs that
replicated between separate storage units, which were at each site. The distance and SAN
infrastructure were also the same as in the previous tests.
Figure 3-10 shows the effect of the Metro Mirror replication as it pertains to the number of
IOPS. We elected not to test DS Global Mirror because it currently has no integration with the
PowerHA Enterprise Edition.
Figure 3-10 IOPS for unmirrored and Metro Mirror DS8000 volumes at thread counts of 1, 10, and 30
servers running at 60% utilization and 100% utilization, highlighting the difference between
the two mirroring solutions.
Ultimately, the study determined that writes ended up taking less time in the cross-site LVM
configuration than in the Metro-Mirrored environment, and slightly longer for reads
(Figure 3-11).
Figure 3-11 Remote mirroring performance results from customer proof of concept
The study explains that this is a result of AIX LVM Mirroring submitting both write requests at
nearly the same time. The write acknowledgement is sent back from both disk subsystems as
soon as they both have the data in the cache. When both acknowledgements are received by
the host, the application is signaled to go ahead to the next transaction. In Metro Mirroring,
the host system sends a write request to local storage and the enclosure then sends a
second write request to the remote subsystem. Acknowledgement must be received back at
the host before the application can proceed.
3.5.5 Summary
There are different solutions available to replicate the data in a campus-style disaster
recovery environment. Evaluating the I/O characteristics in the environment and considering
the performance impact that is imposed by the replication should all be part of the design
planning. Our test results seem to fall in line with the results documented in the case study.
Although we did not focus on the read aspect, the write characteristics showed AIX Logical Volume Mirroring providing more consistent results. Because logical volume mirroring is a function that is specific to AIX and the individual host, consider this point further when you select a technology for data replication. The storage-based mirroring functions are not platform-specific and can be used to provide a common replication method between servers in the environment. The use of IBM resources to assist with the design planning and performance evaluation is encouraged.
Chapter 4. Configuring PowerHA Standard Edition with cross-site logical volume mirroring
Figure 4-1 Test environment: nodes Zhifa and Rebecca on Power 550 servers (AIX 6.1, PowerHA 6.1) at two sites connected by a WAN, with SAN switches and DWDM links to an IBM DS8000 and an IBM DS4800
Creating a cluster
Create a cluster and name it XLVM_cluster by using the SMIT menus (Figure 4-2). We enter the smitty hacmp command. Then, select Extended Configuration → Extended Topology Configuration → Configure an HACMP Cluster → Add/Change/Show an HACMP Cluster.
Add/Change/Show an HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
* Cluster Name                          [XLVM_Cluster]

NOTE: HACMP must be RESTARTED
on all nodes in order for change to take effect

Figure 4-2 SMIT menu for creating the cluster
Next, add the first node, Zhifa, and its communication path:

                                        [Entry Fields]
* Node Name                             [Zhifa]
  Communication Path to Node            [192.168.8.101]
Run the smitty hacmp command once again to add the second node. Then, add the cluster site by running the following SMIT command. Enter the smitty hacmp command, and then select Extended Configuration → Extended Topology Configuration → Configure HACMP Sites → Add a Site. Figure 4-4 shows the Add Site panel.
Add Site

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
* Site Name                             [POK]
* Site Nodes                            Zhifa
Site Nodes: For each site, define at least one node that is in the site. You can add multiple nodes by leaving a blank space between the names. Each node can only belong to one site.
Run the smitty hacmp command again to create site NY.
IP network
The IP network uses TCP/IP communication and has the following functions:
IP services
IP base
IP persistent
In this example, we use the configuration that is shown in Table 4-1 for the IP network.
Table 4-1 List of IP addresses in our environment

                Node Zhifa        Node Rebecca
IP base         192.168.8.101     192.168.8.102
IP services     192.168.100.53    192.168.100.92
IP persistent   192.168.100.171   192.168.100.171
Register the IP addresses from Table 4-1 in the /etc/hosts file (Example 4-1).
Example 4-1 List of IP addresses and host names in the /etc/hosts file
# service addresses
192.168.100.53  zhifa_svc     XLVM_550_1_A_SVC
192.168.100.92  rebecca_svc   XLVM_550_2_B_SVC
# boot addresses
192.168.8.101   zhifa_boot    XLVM_550_1_A_boot
192.168.8.102   rebecca_boot  XLVM_550_2_B_boot
# persistent addresses
192.168.100.171 rebecca_per   XLVM_550_2_B
192.168.100.171 zhifa_per     XLVM_550_1_A Zhifa
To configure the IP network, run the SMIT command (Figure 4-5). Enter the smitty hacmp command. Then, select Extended Configuration → Extended Topology Configuration → Configure HACMP Networks → Add a Network to the HACMP Cluster.
Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Network Name                                          [net_ether_01]
* Network Type                                          ether
* Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Enable IP Address Takeover via IP Aliases             [Yes]
  IP Address Offset for Heartbeating over IP Aliases    []
Then, configure the communication network for node Zhifa and node Rebecca by running the smitty hacmp SMIT command. Then, select Extended Configuration → Extended Topology Configuration → Configure HACMP Communication Interfaces/Devices → Add Communication Interfaces/Devices (Figure 4-6). As shown in Figure 4-6, we registered the IP base of the communication interfaces.
Add a Communication Interface

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
* IP Label/Address                      [zhifa_boot]
* Network Type                          ether
* Network Name                          net_ether_01
* Node Name                             [Zhifa]
  Network Interface                     []
Add the IP persistent address during the network configuration by running the smitty hacmp SMIT command. Then, select Extended Configuration → Extended Topology Configuration → Configure HACMP Persistent Node IP Label/Addresses → Add a Persistent Node IP Label/Address. Figure 4-7 shows the SMIT menu for adding the IP persistent address.
Add a Persistent Node IP Label/Address

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
* Node Name                             Zhifa
* Network Name                          [net_ether_01]       +
* Node IP Label/Address                 [zhifa_per]          +
  Netmask(IPv4)/Prefix Length(IPv6)     [255.255.255.0]
Run the smitty hacmp command to create the IP persistent address for node Rebecca. After
you create the IP base and IP persistent addresses, create the non-IP network.
Non-IP network
A non-IP network is used for cluster heartbeat to prevent the split-brain condition in a cluster
environment. Certain devices can be used as a non-IP network. In this case, we use a hard
disk as a cluster non-IP network.
Two kinds of disk heartbeats are possible:
Traditional disk heartbeat
Disk heartbeat is a form of non-IP heartbeat that uses the existing shared disks of any disk
type. This feature, which was introduced in HACMP 5.1, is the most common and
preferred method of non-IP heartbeat. It eliminates the need for serial cables or 8-port
asynchronous adapters. Also, it can easily accommodate greater distances between
nodes when using a SAN environment.
This feature requires usage of enhanced concurrent volume groups to allow access to the
disk by each node. It uses a special reserved area on the disks to read and write the
heartbeat data. Because it uses a reserved area, it allows the use of existing data volume
groups without losing any additional storage space.
It is possible to use a dedicated disk or LUN for disk heartbeat. Because disk heartbeat uses only the reserved area, the remaining data storage on a dedicated disk goes unused; the bigger the disk or LUN that you dedicate solely to this purpose, the more space goes unused. However, you can use that space later for extra storage if needed.
A traditional disk heartbeat network is a point-to-point network. If more than two nodes
exist in your cluster, you need a minimum of N number of non-IP heartbeat networks,
where N represents the number of nodes in the cluster. For example, a three-node cluster
requires at least three non-IP heartbeat networks.
Multinode disk heartbeat
HACMP 5.4.1 introduced another form of disk heartbeating called multi-node disk
heartbeat (mndhb). Unlike a traditional disk heartbeat network, it is not a single point-to-point
network. Instead, as its name implies, it allows multiple nodes to use the same disk.
However, it requires configuring a logical volume on an enhanced concurrent volume group.
Although this form of heartbeat can reduce the total number of disks that are required for non-IP heartbeating, we configured it with multiple disks to eliminate a single point of failure.
Multinode disk heartbeating also offers the ability to start one of the following actions:
Halt
Fence
Shutdown
Takeover
In our environment, we create traditional disk heartbeat networks by using two disks, one from each storage subsystem, for redundancy purposes. This redundancy keeps the cluster heartbeat alive if one disk subsystem fails. The first network uses disks from the IBM DS8000 storage, and the
second network uses disks from the IBM DS4800 storage. The first network uses physical
volume hdisk1, and the second network uses physical volume hdisk5. These disks are
assigned into a volume group, pokvg, that is created in 4.1.2, Configuring the cross-site LVM
disk mirroring dependency on page 121.
To create a disk heartbeat network:
1. Add a disk heartbeat network. Run smitty hacmp.
2. Select Extended Configuration → Extended Topology Configuration → Configure HACMP Networks → Add a Network to the HACMP Cluster.
3. Enter the Network Name net_diskhb_01, select the Network Type diskhb, and press
Enter (Figure 4-8).
Add a Serial Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
* Network Name                          [net_diskhb_01]
* Network Type                          diskhb
We create net_diskhb_01 for the heartbeat network by using the DS8000, and then create
net_diskhb_02 for the heartbeat network by using the DS4800.
4. To add the heartbeat devices, run the smitty hacmp SMIT command. Then, select Extended Configuration → Extended Topology Configuration → Configure HACMP Communication Interfaces/Devices → Add Communication Interfaces/Devices.
Figure 4-9 on page 120 shows the creation of a communication device in the DS8000 for
node Zhifa. Continue this process to create a communication device in the DS8000 for
node Rebecca. Then, create a communication device in the DS4800 for nodes Zhifa and
Rebecca.
                                        [Entry Fields]
* Device Name                           [ds8k_hb]
* Network Type                          diskhb
* Network Name                          net_diskhb_01
* Device Path                           [/dev/hdisk1]
* Node Name                             [Zhifa]

The dhb_read command supports the following actions:
dhb_read -p devicename        Dumps the diskhb sector contents
dhb_read -p devicename -r     Receives data over the diskhb network
dhb_read -p devicename -t     Transmits data over the diskhb network
To test the diskhb network connectivity, we set node Zhifa as the receiver, and set node
Rebecca as the transmitter:
1. On node Zhifa, enter:
dhb_read -p hdisk1 -r
2. On node Rebecca, enter:
dhb_read -p hdisk1 -t
If the link between the nodes is operational, both nodes show Link operating normally
(Example 4-2).
Example 4-2 Disk heartbeat link testing
Node Zhifa:
root@Zhifa / > dhb_read -p hdisk1 -r
DHB CLASSIC MODE
First node byte offset: 61440
Second node byte offset: 62976
Handshaking byte offset: 65024
Test byte offset: 64512
Receive Mode:
Waiting for response . . .
Magic number = 0x87654321
Magic number = 0x87654321
Link operating normally

Node Rebecca:
root@Rebecca / > dhb_read -p hdisk1 -t
DHB CLASSIC MODE
First node byte offset: 61440
Second node byte offset: 62976
Handshaking byte offset: 65024
Test byte offset: 64512
Transmit Mode:
Magic number = 0x87654321
Detected remote utility in receive mode.
Magic number = 0x87654321
Magic number = 0x87654321
Link operating normally
The volume groups that are associated with the disks used for the disk heartbeating network do not have to be defined as resources within a resource group. However, if the disk heartbeat uses a shared volume group, as in our case, that volume group can be defined as a resource in the resource group.
Disk heartbeat testing: Disk heartbeat testing can be done only when PowerHA is not
running on the nodes.
The following MPIO file sets are installed on the nodes:
devices.common.IBM.mpio.rte
devices.fcp.disk.ibm.mpio.rte
Example 4-4 shows the disk configuration for each node after we configure the storage and install the MPIO device driver. hdisk1 - hdisk4 are on IBM DS8000 storage, and hdisk5 - hdisk8 are on IBM DS4800 storage.
Example 4-4 Disk configuration on node Zhifa and node Rebecca
Node Zhifa:
root@Zhifa / > lsdev -Cc disk
hdisk0 Available
Virtual SCSI Disk Drive
hdisk1 Available 24-T1-01 IBM MPIO FC 2107
hdisk2 Available 24-T1-01 IBM MPIO FC 2107
hdisk3 Available 24-T1-01 IBM MPIO FC 2107
hdisk4 Available 24-T1-01 IBM MPIO FC 2107
hdisk5 Available 24-T1-01 IBM MPIO DS4800 Array Disk
hdisk6 Available 24-T1-01 IBM MPIO DS4800 Array Disk
hdisk7 Available 24-T1-01 IBM MPIO DS4800 Array Disk
hdisk8 Available 24-T1-01 IBM MPIO DS4800 Array Disk
hdisk9 Available 24-T1-01 IBM MPIO FC 2107
Node Rebecca:
root@Rebecca / > lsdev -Cc disk
hdisk0 Available
hdisk1 Available
hdisk2 Available
hdisk3 Available
hdisk4 Available
hdisk5 Available
hdisk6 Available
hdisk7 Available
hdisk8 Available
We must also check the PVID of each disk to match each disk in node Zhifa with its corresponding disk in node Rebecca. We need this information later when we configure the disk and site dependency. Example 4-5 shows our disk PVID configuration.
hdisk1 - hdisk4 in nodes Zhifa and Rebecca are the same disks, which come from the IBM DS8000 storage, and hdisk5 - hdisk8 are the same disks in nodes Zhifa and Rebecca, which come from the IBM DS4800 storage. In the disk/site definition for cross-site LVM mirroring, we select hdisk1 - hdisk4 owned by site POK and hdisk5 - hdisk8 owned by site NY.
Example 4-5 Disk PVID in node Zhifa and node Rebecca
Node Zhifa:
root@Zhifa / > lspv
hdisk0          000fe411492d4136
hdisk1          000fe4110742ad37
hdisk2          000fe4110742ad9f
hdisk3          000fe4110742ae0f
hdisk4          000fe4110742ae6f
hdisk5          000fe41107345f77
hdisk6          000fe41107346f8a
hdisk7          000fe41107347b1e
hdisk8          000fe4110734851b
hdisk9          000fe41164116629
Node Rebecca:
As shown in Figure 4-10, we select hdisk1 - hdisk4 in node Zhifa and node Rebecca owned
by site POK. Then, continue configuring the cross-site LVM mirroring disk definition for node
Rebecca.
Add Disk/Site Definition for Cross-Site LVM Mirroring

  Disks PVID

  Move cursor to desired item and press F7.
  ONE OR MORE items can be selected.
  Press Enter AFTER making all selections.

  > 000fe4110742ad37   hdisk1 Rebecca
  > 000fe4110742ad37   hdisk1 Zhifa
  > 000fe4110742ad9f   hdisk2 Rebecca
  > 000fe4110742ad9f   hdisk2 Zhifa
  > 000fe4110742ae0f   hdisk3 Rebecca
  > 000fe4110742ae0f   hdisk3 Zhifa
  > 000fe4110742ae6f   hdisk4 Zhifa
  > 000fe4110742ae6f   hdisk4 Rebecca
    000fe41107345f77   hdisk5 Zhifa
    000fe41107346f8a   hdisk6 Zhifa
    000fe41107345f77   hdisk5 Rebecca
    000fe41107347b1e   hdisk7 Zhifa
    000fe4110734851b   hdisk8 Zhifa
    000fe41107346f8a   hdisk6 Rebecca
    000fe41107347b1e   hdisk7 Rebecca
    000fe4110734851b   hdisk8 Rebecca
Figure 4-10 Disk selection for cross-site LVM disk/site definition
Current disk configuration: The Configure Disk/Site Locations for Cross-Site LVM
Mirroring menu selection functions correctly only if the disk discovery file reflects the
current disk configuration.
We can change the disk/site dependency later by entering the smitty cl_xslvmm command
and selecting Change/Show Disk/Site Definition for Cross-Site LVM Mirroring. We can
also remove the site and disk dependencies later by using the smitty cl_xslvmm command
and selecting Remove Disk/Site Definition for Cross-Site LVM Mirroring.
Figure 4-11 Mirror pool diagram
Figure 4-11 shows a geographically mirrored volume group where the disks at the production
site are placed into mirror pool #1, and the disks at the disaster recovery site are placed into
mirror pool #2. Both mirror pools are configured to become one volume group that is used for
synchronous cross-site LVM mirroring.
To support a mirror pool configuration, certain rules must be followed when you create a volume group. Mirror pools require that the volume group has the super strict allocation policy enabled. The super strict function performs the following tasks:
Ensures that local and remote physical volumes cannot belong to the same mirror pool.
Allows no more than three mirror pools per volume group.
Ensures that each mirror pool contains at least one copy of each logical volume:
When we create a logical volume, we must configure it so that each mirror pool gets a
copy. However, if we create a mirror pool in a volume group where logical volumes
exist, logical volume copies are not automatically created in the new mirror pool. We
must create them by running the mirrorvg or mklvcopy commands.
Asynchronous GLVM mirroring requires a new type of logical volume for caching of
asynchronous write requests. This logical volume should not be mirrored across sites.
Super strict mirror pools handle this new aio_cache logical volume type as a special
case.
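After a volume group with mirror pools exists, the definitions can be checked from the command line. The following is a sketch only; the lsmp and lspv options should be confirmed against the AIX 6.1 documentation:

# List the mirror pools that are defined in volume group pokvg, including their mirroring mode
lsmp -A pokvg
# Show the mirror pool that each physical volume belongs to
lspv -P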
Figure 4-12 Logical volumes datalv1, datalv2, datalv3, and loglv4 with copies on PV1 - PV3 in one mirror pool and on PV4 - PV6 in the other mirror pool
The volume group has a total of five logical volumes. The user data is stored in three logical
volumes. Logical volumes datalv1, datalv2, and datalv3 contain file systems, and logical
volume loglv4 contains the file system log. These four logical volumes are mirrored across
both sites because they have copies in both mirror pools.
In another case, such as in GLVM asynchronous mirroring, we need to create a cache logical
volume, aiocache. We can create logical volume aiocachelv1 in a local site and logical volume
aiocachelv2 in a remote site. Both are used to cache asynchronous write requests. They are
not mirrored across both sites.
In Figure 4-12, the volume group is varied online at the production site. Writes to the local
disks in mirror pool #1 and mirror pool #2 occur synchronously.
Select both nodes for creating volume group pokvg. Then, select the physical volume for this
volume group (Figure 4-14).
Volume Groups

Move cursor to desired item and press Enter.

  List All Volume Groups
  Create a Volume Group
  Create a Volume Group with Data Path Devices
  Set Characteristics of a Volume Group
  Enable a Volume Group for Fast Disk Takeover or Concurrent Access

  Physical Volume Names

  Move cursor to desired item and press F7.
  ONE OR MORE items can be selected.
  Press Enter AFTER making all selections.

  > 000fe41107345f77 ( hdisk5 on all selected nodes )
  > 000fe41107346f8a ( hdisk6 on all selected nodes )
    000fe41107347b1e ( hdisk7 on all selected nodes )
    000fe4110734851b ( hdisk8 on all selected nodes )
  > 000fe4110742ad37 ( hdisk1 on all selected nodes )
  > 000fe4110742ad9f ( hdisk2 on all selected nodes )
    000fe4110742ae0f ( hdisk3 on all selected nodes )
    000fe4110742ae6f ( hdisk4 on all selected nodes )
Figure 4-14 Disk selection for creating a volume group
Press F7 to select the appropriate disks. The volume group pokvg uses hdisk1 and hdisk2 from the DS8000 storage and is mirrored to hdisk5 and hdisk6 from the DS4800 storage; the same disks are seen by both node Zhifa and node Rebecca.
After you select the physical volume, select the volume group type. Four options are available,
and, in our case, we choose a scalable volume group. The scalable volume group is a
prerequisite for creating a mirror pool, which we use in our scenario (Figure 4-15).
Volume Groups

Move cursor to desired item and press Enter.

  List All Volume Groups
  Create a Volume Group
  Create a Volume Group with Data Path Devices
  Set Characteristics of a Volume Group
  Enable a Volume Group for Fast Disk Takeover or Concurrent Access
  Import a Volume Group
  Mirror a Volume Group
  Unmirror a Volume Group
  Manage Concurrent Access Volume Groups for Multi-Node Disk Heartbeat

  Volume Group Type

  Move cursor to desired item and press Enter.

    Legacy
    Original
    Big
    Scalable
Figure 4-15 Selecting volume group type
We choose Scalable. If we had instead created a big volume group, as shown in Figure 4-15, we could change it to a scalable volume group by using the following AIX command:
# chvg -G pokvg
You can also use the SMIT menu to change the volume group to the scalable volume group.
Run smitty chvg, choose the volume group name, and change the Change to scalable VG
format? parameter to yes.
chvg man page: The volume group must be varied offline before you run the chvg
command. Other considerations are described in the chvg man page. For more information
about converting your existing volume groups to scalable VG format, see the AIX
documentation.
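A minimal conversion sequence from the command line might look like the following sketch. Run it while the volume group is not under cluster control, and substitute your own volume group name:

# The volume group must be varied offline before the conversion
varyoffvg pokvg
# Convert the big volume group to scalable VG format
chvg -G pokvg
# Vary the volume group back online and confirm the new format
varyonvg pokvg
lsvg pokvg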
After you select the scalable volume group, complete the volume group parameters
(Figure 4-16).
Create a Big Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                    [Entry Fields]
  Node Names                                        Rebecca,Zhifa
  Resource Group Name                               []                       +
  PVID                                              000fe41107345f77 000f>
  VOLUME GROUP name                                 [pokvg]
  Physical partition SIZE in megabytes              64                       +
  Volume group MAJOR NUMBER                         [34]                     #
  Enable Cross-Site LVM Mirroring Verification      false                    +
  Enable Fast Disk Takeover or Concurrent Access    Fast Disk Takeover or>   +
  Volume Group Type                                 Big
Set the mirror pool strictness volume group parameter to superstrict. If you have not set it yet, you can use the SMIT menu to enable it. Run smitty chvg, and then choose the name of the volume group that must have superstrict enabled (Figure 4-17).
Figure 4-17 shows the resulting Change a Volume Group panel for volume group pokvg, with the Mirror Pool Strictness field set to Superstrict.
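The same strictness setting can also be applied from the AIX command line. The following is a sketch; the -M flag value for superstrict should be verified against the chvg man page for your level:

# Set mirror pool strictness on the volume group (s is assumed to select superstrict)
chvg -M s pokvg
# Confirm the setting
lsvg pokvg | grep -i "MIRROR POOL"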
When you are finished creating the volume group pokvg, continue by creating the second
volume group nyvg.
Example 4-7 shows that hdisk1 and hdisk2 are configured as mirror pool mp_pok, hdisk5 and
hdisk6 are configured as mirror pool mp_ny, and the replication method that is used is
synchronous replication.
To create a jfs2 logical volume with mirror pool configuration, run the smitty mklv SMIT
command (Figure 4-18).
Add a Logical Volume

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[TOP]                                               [Entry Fields]
  Logical volume NAME                               [lv1_mp]
* VOLUME GROUP name                                 pokvg
* Number of LOGICAL PARTITIONS                      [10]              #
  PHYSICAL VOLUME names                             []                +
  Logical volume TYPE                               [jfs2]            +
  POSITION on physical volume                       outer_middle      +
  RANGE of physical volumes                         minimum           +
  MAXIMUM NUMBER of PHYSICAL VOLUMES                []                #
    to use for allocation
  Number of COPIES of each logical                  2                 +
    partition
  Mirror Write Consistency?                         active            +
  Allocate each logical partition copy              yes               +
    on a SEPARATE physical volume?
  RELOCATE the logical volume during reorganization? yes              +
  Logical volume LABEL                              []
  MAXIMUM NUMBER of LOGICAL PARTITIONS              [512]             #
  Enable BAD BLOCK relocation?                      no                +
  SCHEDULING POLICY for writing/reading             parallel          +
    logical partition copies
  Enable WRITE VERIFY?                              no                +
  File containing ALLOCATION MAP                    []
  Stripe Size?                                      [Not Striped]     +
  Serialize IO?                                     no                +
  Mirror Pool for First Copy                        mp_pok            +
  Mirror Pool for Second Copy                       mp_ny             +
  Mirror Pool for Third Copy                                          +
Consider the following parameters (Figure 4-18) when you create a logical volume with the
mirror pool configuration:
Logical volume type
This parameter relates to the type of logical volume. We can choose jfs, jfs2, sysdump, paging, jfslog, jfs2log, boot, or aio_cache by pressing F4.
Number of COPIES of each logical partition
This parameter must be completed with the number of mirror pools on that volume group
because each mirror pool has a copy of every logical volume on that volume group, except
for the aio_cache logical volume.
Enable BAD BLOCK relocation?
This parameter must be set to no. AIX normally detects bad blocks and relocates the data on the disk drives automatically for data protection. In a mirror pool environment, however, this feature must be set to no, because each group of disks holds its own copy of every logical volume and the copies cannot be mixed with each other.
Mirror Pool for First Copy
This parameter indicates the mirror pool name where the first copy of this logical volume
is.
Mirror Pool for Second Copy
This parameter indicates the mirror pool name where the second copy of this logical
volume is.
Mirror Pool for Third Copy
This parameter indicates the mirror pool name where the third copy of this logical volume
is. You can leave it blank if you do not have the third mirror pool.
However, if you create a mirror pool in a volume group where logical volumes exist, the logical
volume copies are not automatically created in the new mirror pool. You must copy them by
running the mirrorvg or mklvcopy commands.
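For illustration, assuming that mirror pool mp_ny was added to an existing volume group, the copies could be created as in the following sketch. The -p copy syntax is an assumption to verify against your AIX level:

# Add a second copy of an existing logical volume and place it in the new mirror pool
mklvcopy -p copy2=mp_ny lv1_mp 2
# Or mirror the whole volume group into the new mirror pool, then synchronize the new copies
mirrorvg -p copy2=mp_ny pokvg
syncvg -v pokvg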
Logical volume with mirror pool configuration: Creating a logical volume with a mirror
pool configuration in PowerHA 6.1 can be performed by using the AIX command line or
smitty lv. The C-SPOC menu does not support this feature yet.
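As a reference, a command-line equivalent of the SMIT panel in Figure 4-18 might look like the following sketch. Flag usage, in particular -p copyn=mirrorpool, is an assumption to verify against the mklv man page:

# Create a 10-LP jfs2 logical volume with two copies, one copy in each mirror pool,
# and bad block relocation disabled as required for mirror pools
mklv -y lv1_mp -t jfs2 -c 2 -b n -p copy1=mp_pok -p copy2=mp_ny pokvg 10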
Choose the volume group where the logical volume resides, and then choose the file system
type. In this environment, we choose Enhanced Journal File Systems (Figure 4-20).
Add an Enhanced Journaled File System on a Previously Defined Logical Volume

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
  Resource Group                        pokrg
* Node Names                            Rebecca,Zhifa
  Logical Volume name                   lv1_mp
  Volume Group                          pokvg
* MOUNT POINT                           [/data_zhifa]      /
  PERMISSIONS                           read/write         +
  Mount OPTIONS                         []                 +
  Block Size (bytes)                    4096               +
  Inline Log?                           yes                +
  Inline Log size (MBytes)              []                 #
  Logical Volume for Log                                   +
  Extended Attribute Format             Version 1          +
  Enable Quota Management?              no                 +
Set the inline log parameter to yes. With an inline log, the file system log is placed inside the logical volume of the file system itself, so the log is mirrored across the same disks and mirror pools as the data. Apply this parameter in a mirror pool configuration.
After you create the file system, check whether the file system is already distributed within the
mirror pools with the lsvg command. Figure 4-21 shows that the file system data_zhifa
resides in separate mirror pools and in separate physical volumes.
root@Zhifa / > lsvg -M pokvg
pokvg
hdisk5:1-637
hdisk6:1-134
hdisk6:135      lv1_mp:1:2
hdisk6:136      lv1_mp:2:2
hdisk6:137      lv1_mp:3:2
hdisk6:138      lv1_mp:4:2
hdisk6:139      lv1_mp:5:2
hdisk6:140      lv1_mp:6:2
hdisk6:141      lv1_mp:7:2
hdisk6:142      lv1_mp:8:2
hdisk6:143      lv1_mp:9:2
hdisk6:144      lv1_mp:10:2
hdisk6:145-637
hdisk1:1-637
hdisk2:1-134
hdisk2:135      lv1_mp:1:1
hdisk2:136      lv1_mp:2:1
hdisk2:137      lv1_mp:3:1
hdisk2:138      lv1_mp:4:1
hdisk2:139      lv1_mp:5:1
hdisk2:140      lv1_mp:6:1
hdisk2:141      lv1_mp:7:1
hdisk2:142      lv1_mp:8:1
hdisk2:143      lv1_mp:9:1
hdisk2:144      lv1_mp:10:1

root@Zhifa / > lsvg -l pokvg
pokvg:
LV NAME     TYPE    LPs   PPs   PVs   LV STATE      MOUNT POINT
lv1_mp      jfs2    10    20    2     open/syncd    /data_zhifa
Complete the parameters that are shown in Figure 4-22. Certain parameters are the same parameters that are used in the local cluster configuration. The additional parameter for the cross-site LVM mirroring cluster is the Inter-Site Management Policy. This parameter relates to the resource group recovery policy that allows or disallows the cluster manager to move a resource group to another site if the resource group goes into an error state.
Add a Resource Group (extended)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
* Resource Group Name                   [pokrg]
  Startup Policy                                           +
  Failover Policy                                          +
  Failback Policy                                          +
The Inter-Site Management Policy offers the following selections:
Ignore
This is the default selection; it ignores the site dependency settings for the resource group.
Online On Either Site
The resource group may be acquired by any site in its resource chain. When a site failure occurs, the resource group is acquired by the highest priority standby site. When the failed site rejoins, the resource group remains with its new owner.
Online On Both Sites
The resource group is acquired by both sites. This selection defines a concurrent capable resource group.
After you create the resource group pokrg, create the second resource group nyrg.
Create the cluster application server by using the SMIT menu (Figure 4-23). Enter the smitty hacmp command. Then, select Extended Configuration → Extended Resource Configuration → HACMP Extended Resources Configuration → Configure HACMP Applications → Configure HACMP Application Servers → Add an Application Server (Figure 4-23).
Enter the application server name and the link to the start script file. Then, continue with the
creation of the second application server, ny_app.
Add Application Server

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                        [Entry Fields]
* Server Name                           [pok_app]
* Start Script                          [/IBM/start_pok.sc]
* Stop Script                           [/IBM/stop_pok.sc]
  Application Monitor Name(s)
Figure 4-24 shows the SMIT menu after we choose the service IP label/address mode.
Add a Service IP Label/Address configurable on Multiple Nodes (extended)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* IP Label/Address                                      zhifa_svc
  Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Network Name                                          net_ether_01
  Alternate Hardware Address to accompany IP Label/Address  []
  Associated Site                                       ignore
Then, continue with creating the second service IP label/address for nyrg.
Figure 4-25 shows the parameters to enter for the integration, such as service IP
labels/addresses, application servers, and volume groups.
Change/Show All Resources and Attributes for a Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[TOP]                                               [Entry Fields]
  Resource Group Name                               pokrg
  Inter-site Management Policy                      Prefer Primary Site
  Participating Nodes from Primary Site             Zhifa
  Participating Nodes from Secondary Site           Rebecca

  Startup Policy
  Failover Policy
  Failback Policy
  Failback Timer Policy (empty is immediate)                             +

  Service IP Labels/Addresses                       [zhifa_svc]          +
  Application Servers                               [pok_app]            +

  Volume Groups                                     [pokvg ]             +
  Use forced varyon of volume groups, if necessary  true                 +
  Automatically Import Volume Groups                false                +
  Default choice for data divergence recovery       ignore               +
    (Asynchronous GLVM Mirroring Only)
  Allow varyon with missing data updates?           true                 +
After you change the property of the first resource group, continue with the second resource
group, nyrg.
                                        [Entry Fields]
  Resource Group
* Node Names
  Volume group name
  SIZE of file system
    Unit Size                           M                  +
*   Number of units                     [10]               #
* MOUNT POINT                           [/test]            /
  PERMISSIONS                           read/write         +
  Mount OPTIONS                         []                 +
  Block Size (bytes)                    4096               +
  Inline Log?                           yes                +
  Inline Log size (MBytes)              []                 #
  Logical Volume for Log                                   +
  Extended Attribute Format             Version 1          +
  Enable Quota Management?              no                 +
As shown in Figure 4-26, enter the necessary parameters, and then press Enter. In this case,
the command fails (Figure 4-27).
COMMAND STATUS

Command: failed        stdout: yes        stderr: no
Also when trying to create a file system by using the smitty jfs2 command with Add an
Enhanced Journaled File System, the result fails with the message shown in Figure 4-28.
COMMAND STATUS

Command: failed        stdout: yes        stderr: no
Finally, we found that we must create the logical volume first by using AIX commands or the
smitty mklv menu. Then, we can continue with the task of creating file systems by using the
cluster C-SPOC menu.
Mirror pool configuration: In a mirror pool configuration, create the logical volume first by using an AIX command or smitty mklv. Then, continue by creating file systems by using the cluster C-SPOC menu. We cannot create a file system directly by using the C-SPOC menu.
Check the distribution of the new file systems by using the lsvg -m pokvg AIX command. The result shows that the new file system is created on hdisk1 and hdisk5, where these two disks are in separate mirror pools, which means that the mirror pool configuration works well.
PowerHA 6.1 SP1 APAR: This APAR is a known PowerHA 6.1 SP1 APAR:
IZ70894 C-SPOC CHANGE/SHOW JFS2 FAILS WITH "CL_CHFS: _GET_RGNODES"
Then, change the file system size by using the smitty chjfs2 command, and add the capacity. Run this command on the node where the corresponding file system is mounted at the time. The result is that the file system size increased and the new space also expanded into the other mirror pool. By using this method, the other node in the cluster recognizes that the file system has changed, and we do not need to resynchronize the cluster to update the other node.
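From the command line, the same growth can be performed with chfs. The following is a sketch that assumes a JFS2 file system and uses /data_zhifa as an example mount point with a 1 GB increase:

# Grow the file system on the node where it is currently mounted
chfs -a size=+1G /data_zhifa
# Confirm that the new partitions were allocated in both mirror pools
lsvg -M pokvg | grep lv1_mp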
(Two clRGinfo listings here show the resource group state on Zhifa@POK and Rebecca@NY before the node failure, with the group ONLINE at Zhifa@POK, and after the failure, with the group ONLINE at Rebecca@NY and OFFLINE at Zhifa@POK.)
You can check the status of the resource group before and after node failure by entering the
/usr/es/sbin/cluster/utilities/clRGinfo command.
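An illustrative sketch of the command and the kind of output to expect follows; the exact columns and states vary with the PowerHA level, and the group and node names are those of this scenario:

root@Zhifa / > /usr/es/sbin/cluster/utilities/clRGinfo
-----------------------------------------------------------------------------
Group Name          State            Node
-----------------------------------------------------------------------------
pokrg               ONLINE           Zhifa@POK
                    OFFLINE          Rebecca@NY

nyrg                ONLINE           Rebecca@NY
                    OFFLINE          Zhifa@POK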
applications continue to work without interruption and the volume groups and file systems
remain available. After the failure, check the availability of the disks and the status of the
logical volume copy synchronization. The hdisk1 and hdisk2 from the DS8000 storage are
marked as missing. The status of the logical volume that uses these disks is stale
(Example 4-8).
Example 4-8 Status of hdisk after one storage failure
root@Zhifa / > lsvg -p pokvg
pokvg:
PV_NAME    PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk5     active     637         226        128..00..00..00..98
hdisk6     active     637         631        128..121..127..127..128
hdisk1     missing    637         226        128..00..00..00..98
hdisk2     missing    637         631        128..121..127..127..128

root@Zhifa / > lsvg -l pokvg
pokvg:
LV NAME     TYPE    LPs   PPs   PVs   LV STATE       MOUNT POINT
lv_1        jfs2    6     12    2     open/stale     /pok_data1
mplv2       jfs2    6     12    2     open/stale     /data2
mikeylvm    jfs     400   800   2     closed/syncd   N/A
test_lv     jfs2    5     10    2     open/stale     /testfs
After this test, rezone the SAN switch to make the DS8000 storage available again. Then, use C-SPOC to synchronize and make all hdisk devices available. Enter the smitty hacmp command. Select System Management (C-SPOC) → Storage → Volume Groups → Synchronize LVM Mirrors → Synchronize by Volume Group. Then, select the appropriate VG.
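Outside of C-SPOC, an equivalent resynchronization can be done with standard AIX commands on the node that has the volume group varied on. The following is a sketch only; adjust the volume group name and check the state with lsvg afterward:

# Re-run vary-on processing so that the returning physical volumes leave the missing state
varyonvg pokvg
# Synchronize the stale partitions from the surviving mirror copies
syncvg -v pokvg
# Confirm that all logical volumes are back in the syncd state
lsvg -l pokvg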
When you check the status of the logical volume copy in all volume groups after
synchronization, the result is in the syncd status (Example 4-9).
Example 4-9 Synchronization and disk status after the storage synchronization
COMMAND STATUS

Command: OK        stdout: yes        stderr: no

root@Zhifa / > lsvg -p pokvg
pokvg:
PV_NAME    PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk5     active     637         226        128..00..00..00..98
hdisk6     active     637         631        128..121..127..127..128
hdisk1     active     637         226        128..00..00..00..98
hdisk2     active     637         631        128..121..127..127..128

root@Zhifa / > lsvg -l pokvg
pokvg:
LV NAME     TYPE    LPs   PPs   PVs   LV STATE       MOUNT POINT
lv_1        jfs2    6     12    2     open/syncd     /pok_data1
mplv2       jfs2    6     12    2     open/syncd     /data2
mikeylvm    jfs     400   800   2     closed/syncd   N/A
test_lv     jfs2    5     10    2     open/syncd     /testfs
root@Zhifa / >
After you choose the appropriate disks, verify the fields in the final menu, and press Enter to add the additional disk to the volume group. To maintain the mirroring policy, you must also add a matching disk from the storage at the other site. In this case, we add hdisk8 from the DS4800 storage at site NY as the pair of hdisk4.
To remove a volume from a volume group, enter smitty hacmp. Select System Management (C-SPOC) → Storage → Volume Groups → Set Characteristics of a Volume Group → Remove a Volume from a Volume Group. Then, select the appropriate volume group.
Then, choose the appropriate disk to remove (Figure 4-32). Verify the fields in the final menu,
and press Enter to remove the disks from a volume group.
Set Characteristics of a Volume Group

Move cursor to desired item and press Enter.

  Add a Volume to a Volume Group
  Change/Show characteristics of a Volume Group
  Remove a Volume from a Volume Group
  Enable/Disable a Volume Group for Cross-Site LVM Mirroring Verification

  Physical Volume Names

  Move cursor to desired item and press Enter.

    Zhifa hdisk3
    Zhifa hdisk4
    Zhifa hdisk7
    Zhifa hdisk8
Figure 4-32 Removing a disk from a shared volume group
Repeat this process to remove the disk pair that is in another site.
Choose the appropriate volume group and logical volume in the list that is displayed. Finally,
increase the size of a shared logical volume (Figure 4-33).
The Increase the Size of a Logical Volume panel (Figure 4-33) shows volume group pokvg, resource group pokrg, and logical volume test_lv, with three additional logical partitions requested.
After you add the space, verify that the partition mapping is correct by running the lslv -m
lvname AIX command. Example 4-10 shows the output of this command.
Example 4-10 Checking the logical volume disk mapping in a shared volume group
Part 3. Extended distance disaster recovery short overview
This part describes storage integration features and implementation options available with the
IBM PowerHA SystemMirror Enterprise Edition for disaster recovery across sites.
This part includes the following chapters:
Chapter 5, Configuring PowerHA SystemMirror Enterprise Edition with Metro Mirror and
Global Mirror on page 153
Chapter 6, Configuring PowerHA SystemMirror Enterprise Edition with ESS/DS Metro
Mirror on page 237
Chapter 7, Configuring PowerHA SystemMirror Enterprise Edition with SRDF replication
on page 267
Chapter 8, Configuring PowerHA SystemMirror Enterprise Edition with Geographic
Logical Volume Manager on page 339
Chapter 5.
Configuring PowerHA
SystemMirror Enterprise Edition
with Metro Mirror and Global
Mirror
PowerHA Enterprise Edition can use the copy services that the SAN Volume Controller provides.
This chapter explains the steps to plan, configure, and test this disaster recovery solution.
This chapter includes the following sections:
Scenario description
Planning and prerequisites overview
Installing and configuring PowerHA Enterprise Edition for SAN Volume Controller
Adding and removing disks to PowerHA Enterprise Edition for SAN Volume Controller
Testing PowerHA Enterprise Edition with SAN Volume Controller
Troubleshooting PowerHA Enterprise Edition for SAN Volume Controller
Figure 5-1 PowerHA Enterprise Edition for the SAN Volume Controller scenario (four nodes, svcxd_A1, svcxd_A2, svcxd_B1, and svcxd_B2, on the net_ether_01, xd_ip, net_diskhb_01, and net_diskhb_02 networks; SAN Volume Controller clusters B8_8G4 and B12_4F2 connected by PPRC links to DS4000 and DS8000 storage)
In this scenario, each site has seven disks that are defined through each SAN Volume
Controller cluster. The svc_sitea site uses four disks for a Metro Mirror PPRC configuration,
and the svc_siteb site uses three disks for a Global Mirror PPRC configuration (Figure 5-2).
Figure 5-2 Disk layout: Metro Mirror relationships for hdisk2 - hdisk5 (SITEAMETROVG) and Global Mirror relationships for hdisk6 - hdisk8 (SITEBGLOBALVG) between the B8_8G4 and B12_4F2 SAN Volume Controller clusters
We configure two Virtual I/O Servers (VIOS). In each VIOS, we configure two Shared Ethernet Adapters in failover mode (SEA failover), one for net_XD_ip_01 and another for net_ether_01 (Figure 5-3). For virtual disk storage, we use NPIV in each VIOS, which provides virtual Fibre Channel adapters to the partitions.
Figure 5-3 Virtual I/O configuration of node SVCXD_A1: en0 (net_ether_01) and en1 (xd_ip) served by SEA failover pairs (sea0, sea1) on Vio1 and Vio2, and fcs0 and fcs1 NPIV virtual Fibre Channel adapters to the SAN (DS4000 and DS8000 storage, including rootvg)
References: For more information about Power Systems virtualization features, see the
IBM PowerVM Virtualization Introduction and Configuration, SG24-7940.
For information about NPIV configuration, see also N_Port ID virtualization, in the IBM
PowerVM Virtualization Managing and Monitoring, SG24-7590.
5.2.1 Planning
Before you configure PowerHA Enterprise Edition for SAN Volume Controller, check the
following areas:
PowerHA Enterprise Edition sites and nodes are identified.
SAN Volume Controller cluster and PPRC licenses are configured.
SAN Volume Controller virtual disks (VDisk), relationships, and consistency groups are
identified.
The resource groups that will contain the SAN Volume Controller-managed PPRC
resources are planned.
Software requirement
To implement PowerHA Enterprise Edition with SAN Volume Controller, you must ensure that
you have the required software.
You must install the following file sets:
cluster.es.svcpprc.cmds
cluster.es.svcpprc.rte
Optional: cluster.msg.en_US.svcpprc
The required software and microcode levels are openssh Version 3.6.1 or later for access to
SAN Volume Controller interfaces.
When you run SAN Volume Controller Version 4.x, you must install the following components:
Storage microcode/LIC versions as per SAN Volume Controller support requirements
Subsystem Device Driver (SDD) V1.6.3.0 or later
IBM Host attachment scripts:
devices.fcp.disk.ibm.rte 1.0.0.9 or later
ibm2105.rte 32.6.100.25 or later (as specified by SAN Volume Controller support)
When you use SDDPCM, you must install the following components:
Subsystem Device Driver Path Control Module (SDDPCM): V2.2.0.0 or later
IBM Host attachment scripts:
devices.fcp.disk.ibm.mpio.rte 1.0.0.10 or later
ibm2105.rte 32.6.100.25 or later (as specified by SAN Volume Controller support)
When you use Virtual I/O Server 1.5.1.x, you must install the following components:
Subsystem Device Driver Path Control Module (SDDPCM): V2.2.0.0 or later
IBM Host attachment scripts: devices.fcp.disk.ibm.mpio.rte 1.0.0.10 or later
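A quick way to confirm these levels is to query the installed file sets. The following lines are only a sketch; the SDDPCM file set name varies with the AIX release, so the wildcard is an assumption:

  lslpp -L "cluster.es.svcpprc.*" cluster.xd.license
  lslpp -L "devices.sddpcm.*" devices.fcp.disk.ibm.mpio.rte ibm2105.rte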
You can also check each UID by using the command line. Assuming that ssh access from the
client to the SAN Volume Controller is configured, run:
ssh admin@svc_cluster_ip svcinfo lshostvdiskmap |more
You can also grep on the host alias name to narrow the list (Example 5-1).
Example 5-1 Using grep to narrow the VDisk list
[svcxd_a1][/]> ssh admin@B8_8G4 svcinfo lshostvdiskmap | grep svc_haxd0001
0
SVC_550_1_A1
0
0
svc_haxd0001
600507680190026C4000000000000000
C05076004FAA0027
Collect this information from each VDisk to be used for the PPRC relationship from the SAN
Volume Controller cluster in both sites.
On the AIX clients, the UID is stored in the ODM. To retrieve it, run the following command:
odmget -q "attribute=unique_id" CuAt
The VDisk UID is contained in this attribute at the fifth numeric position. Example 5-2 shows
the VDisk UID in bold. In this example, hdisk2 matches VDisk svc_haxd0001. Repeat the
command to match and record to create proper replicated relationships.
Example 5-2 VDisk UID in ODM attribute
CuAt:
name = "hdisk2"
attribute = "unique_id"
value = "33213600507680190026C400000000000000004214503IBMfcp"
type = "R"
generic = "D"
rep = "nl"
nls_index = 42
Because the nodes from site svc_sitea share disks, you need only to get the VDisk UID from
one of these nodes. You can do the same on a node from site svc_siteb.
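To correlate every hdisk with its VDisk UID in one pass, the ODM query can be wrapped in a small loop. This is only a sketch, not part of the original configuration: it assumes that the UID always starts after a five-character prefix and is 32 characters long, as in the value shown in Example 5-2.

  for d in $(lsdev -Cc disk -F name); do
    uid=$(odmget -q "name=$d and attribute=unique_id" CuAt | \
          sed -n 's/.*value = "\(.*\)".*/\1/p' | cut -c6-37)
    echo "$d  $uid"
  done

The resulting list can be compared directly with the output of svcinfo lshostvdiskmap on each SAN Volume Controller cluster.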
Install the following file sets:
cluster.es.svcpprc.cmds
cluster.es.svcpprc.rte
cluster.msg.en_US.svcpprc
cluster.xd.license
After you install the required file sets, you see the cluster file sets that are installed (Example 5-3).
Example 5-3 Cluster file sets listing
cluster.es.svcpprc.rte    6.1.0.1  COMMITTED
cluster.license           6.1.0.0  COMMITTED
cluster.xd.license        6.1.0.0  COMMITTED
cluster.es.client.clcomd  6.1.0.1  COMMITTED
cluster.es.client.lib     6.1.0.1  COMMITTED
cluster.es.client.rte     6.1.0.1  COMMITTED
cluster.es.client.wsm     6.1.0.0  COMMITTED
cluster.es.cspoc.rte      6.1.0.0  COMMITTED
cluster.es.server.diag    6.1.0.0  COMMITTED
cluster.es.server.events  6.1.0.0  COMMITTED
cluster.es.server.rte     6.1.0.1  COMMITTED
cluster.es.server.utils   6.1.0.1  COMMITTED
cluster.es.svcpprc.rte    6.1.0.0  COMMITTED
The remainder of this section explains how to configure PowerHA Enterprise Edition for SAN Volume Controller.
5.3.1 Topology
In this scenario, we configure a four-node cluster (two nodes at each site) by using two networks (net_ether_01 and net_XD_ip_01). Because svc_sitea and svc_siteb are in different network segments, we use site-specific service IP labels for the service IP addresses. Table 5-1 lists the IP address topology.
Table 5-1 IP address topology
Site        LPAR       Boot             Persistent        Service           XD_IP
svc_sitea   svcxd_a1   192.168.8.103    192.168.100.173   192.168.100.54    10.12.5.36
svc_sitea   svcxd_a2   192.168.8.104    192.168.100.174   192.168.100.94    10.12.5.37
svc_siteb   svcxd_b1   192.168.12.103   10.10.12.103      10.10.12.113      10.114.124.40
svc_siteb   svcxd_b2   192.168.12.104   10.10.12.104      10.10.12.114      10.114.124.44
Netmask                255.255.255.0    255.255.255.0     255.255.255.0     255.255.252.0
# persistent addresses
192.168.100.173  svcxd_a1
192.168.100.174  svcxd_a2
10.10.12.103     svcxd_b1
10.10.12.104     svcxd_b2
# service addresses
192.168.100.54   svcxd_a1_sv
192.168.100.94   svcxd_a2_sv
10.10.12.113     svcxd_b1_sv
10.10.12.114     svcxd_b2_sv
# boot addresses
192.168.8.103    svcxd_a1_boot
192.168.8.104    svcxd_a2_boot
192.168.12.103   svcxd_b1_boot
192.168.12.104   svcxd_b2_boot
#XD_IP network
10.12.5.36       svcxd_a1_xdip
10.12.5.37       svcxd_a2_xdip
10.114.124.40    svcxd_b1_xdip
10.114.124.44    svcxd_b2_xdip
#SVC_CLUSTER
10.12.5.55       B8_8G4
10.114.63.250    B12_4F2
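Before you continue, it is worth confirming that every label resolves to the same address on all four nodes. The following loop is only a sketch; the label list is taken from the /etc/hosts entries above:

  for h in svcxd_a1 svcxd_a2 svcxd_b1 svcxd_b2 \
           svcxd_a1_sv svcxd_a2_sv svcxd_b1_sv svcxd_b2_sv \
           svcxd_a1_boot svcxd_a2_boot svcxd_b1_boot svcxd_b2_boot \
           svcxd_a1_xdip svcxd_a2_xdip svcxd_b1_xdip svcxd_b2_xdip \
           B8_8G4 B12_4F2; do
    echo "$h -> $(host $h)"
  done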
We use the disk with the PVID 000fe4112579ef78 on the nodes svcxd_a1 and svcxd_a2 to configure net_diskhb_01. We use the disk with the PVID 00ca02ef25b24924 on the nodes svcxd_b1 and svcxd_b2 to configure net_diskhb_02 (Example 5-5).
Example 5-5 Disks that are used for the heartbeat configuration
[svcxd_a1][/]>
hdisk9           vg_hba
[svcxd_a2][/]>
hdisk9           vg_hba
[svcxd_b1][/]>
hdisk9           vg_hbb
[svcxd_b2][/]>
hdisk9           vg_hbb
Important: These disks are used for heartbeating only between nodes that belong to the
same site.
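You can verify each disk heartbeat path before cluster services are started by using the dhb_read utility. The following lines are a sketch for the site A pair; run the receiver on one node first and then the transmitter on its partner (the same test applies to the site B pair with its own hdisk9):

  # On svcxd_a1 (receiver):
  /usr/sbin/rsct/bin/dhb_read -p hdisk9 -r
  # On svcxd_a2 (transmitter):
  /usr/sbin/rsct/bin/dhb_read -p hdisk9 -t

If the path is healthy, both commands report that the link operates normally.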
                                                   [Entry Fields]
* Cluster Name                                     [svcxd]

  NOTE: HACMP must be RESTARTED
  on all nodes in order for change to take effect
Figure 5-5 Adding the cluster to the topology configuration

                                                   [Entry Fields]
* Node Name                                        [svcxd_a1]
  Communication Path to Node                       [svcxd_a1_xdip]
Figure 5-6 Adding a node and its communication path to the cluster
Because svc_sitea and svc_siteb are in separate network segments, we use the IP address
that is used in XD_IP network as the communication path. Repeat the procedure that is
shown in Figure 5-6 to add each node to the cluster. After you complete the task, you see the
nodes as shown in Example 5-6.
Example 5-6 Node names
[svcxd_a1][/]> clnodename
svcxd_a1
svcxd_a2
svcxd_b1
svcxd_b2
                                                   [Entry Fields]
* Site Name                                        [svc_sitea]
* Site Nodes                                       svcxd_a1 svcxd_a2          +
Figure 5-7 Adding a site definition
In this scenario, two sites, svc_sitea and svc_siteb, are added. Nodes svcxd_a1 and
svcxd_a2 are part of svc_sitea, and svcxd_b1 and svcxd_b2 are part of svc_siteb. Repeat
the procedure that is shown in Figure 5-7 for each site in the cluster. After you complete the
definition of the sites, they are listed as shown in Example 5-7.
Example 5-7 Sites listing
[svcxd_a1][/]> cllssite
---------------------------------------------------------------------
Sitename     Site Nodes            Dominance   Protection Type
---------------------------------------------------------------------
svc_sitea    svcxd_a1 svcxd_a2     yes         NONE
svc_siteb    svcxd_b1 svcxd_b2     no          NONE
3. Change the Enable IP Address Takeover via Alias to YES (Figure 5-8).
Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* Network Name                                     [net_ether_01]
* Network Type                                     ether
* Netmask(IPv4)/Prefix Length(IPv6)                [255.255.255.0]
* Enable IP Address Takeover via IP Aliases        [Yes]            +
Figure 5-8 Adding the net_ether_01 network
4. After you add the network net_ether_01, add the network interfaces by entering the SMIT smitty hacmp command. Then, select Extended Configuration → Extended Topology Configuration → Configure HACMP Communication Interfaces/Devices → Add Communication Interfaces/Devices → Add Pre-defined Communication Interfaces and Devices → Communication Interfaces → net_ether_01.
5. In the Add a Communication Interface panel (Figure 5-9), for IP Label/Address, select
each correspondent boot address for each node.
6. Repeat the procedure that is shown in Figure 5-9 for each node in the cluster.
Add a Communication Interface

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* IP Label/Address                                 [svcxd_a1_boot]
* Network Type                                     ether
* Network Name                                     net_ether_01
* Node Name                                        [svcxd_a1]
  Network Interface                                []
Figure 5-9 Adding a communication interface on the net_ether_01 network
[svcxd_a1][/]> cltopinfo
Cluster Name: svcxd
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 4 node(s) and 1 network(s) defined
NODE svcxd_a1:
    Network net_ether_01
        svcxd_a1_boot   192.168.8.103
NODE svcxd_a2:
    Network net_ether_01
        svcxd_a2_boot   192.168.8.104
NODE svcxd_b1:
    Network net_ether_01
        svcxd_b1_boot   192.168.12.103
NODE svcxd_b2:
    Network net_ether_01
        svcxd_b2_boot   192.168.12.104
7. Add the persistent IP address by entering the SMIT smitty hacmp command. Then, select Extended Configuration → Extended Topology Configuration → Configure HACMP Persistent Node IP Label/Addresses → Add a Persistent Node IP Label/Address → Select a Node (Figure 5-10).

Add a Persistent Node IP Label/Address

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* Node Name                                        svcxd_a1
* Network Name                                     [net_ether_01]   +
* Node IP Label/Address                            [svcxd_a1]       +
  Netmask(IPv4)/Prefix Length(IPv6)                []
Figure 5-10 Adding a persistent node IP label
8. Repeat the procedure that is shown in Figure 5-10 for each node in the cluster.
4. After you create the net_XD_ip_01 network, add the network interfaces. Enter the smitty hacmp command. Then, select Extended Configuration → Extended Topology Configuration → Configure HACMP Communication Interfaces/Devices → Add Communication Interfaces/Devices → Add Pre-defined Communication Interfaces and Devices → Communication Interfaces → net_XD_ip_01 (Figure 5-12).
5. Repeat the procedure that is shown in Figure 5-12 for each node in the cluster.
Add a Communication Interface

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* IP Label/Address                                 [svcxd_a1_xdip]  +
* Network Type                                     XD_ip
* Network Name                                     net_XD_ip_01
* Node Name                                        [svcxd_a1]       +
  Network Interface                                []
Figure 5-12 Adding a communication interface on the net_XD_ip_01 network
6. After you complete this procedure, synchronize the cluster. Example 5-9 shows the
configuration.
Example 5-9 Cluster topology
[svcxd_a1][/]> cltopinfo
Cluster Name: svcxd
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 4 node(s) and 2 network(s) defined
NODE svcxd_a1:
    Network net_XD_ip_01
        svcxd_a1_xdip   10.12.5.36
    Network net_ether_01
        svcxd_a1_boot   192.168.8.103
NODE svcxd_a2:
    Network net_XD_ip_01
        svcxd_a2_xdip   10.12.5.37
    Network net_ether_01
        svcxd_a2_boot   192.168.8.104
NODE svcxd_b1:
    Network net_XD_ip_01
        svcxd_b1_xdip   10.114.124.40
    Network net_ether_01
        svcxd_b1_boot   192.168.12.103
NODE svcxd_b2:
    Network net_XD_ip_01
        svcxd_b2_xdip   10.114.124.44
    Network net_ether_01
        svcxd_b2_boot   192.168.12.104
svcxd_a1    hdisk9    000fe4112579ef78
svcxd_a2    hdisk9    000fe4112579ef78
svcxd_b1    hdisk9    00ca02ef25b24924
svcxd_b2    hdisk9    00ca02ef25b24924
[svcxd_a1][/]> cltopinfo
Cluster Name: svcxd
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 4 node(s) and 4 network(s) defined
NODE svcxd_a1:
    Network net_XD_ip_01
        svcxd_a1_xdip   10.12.5.36
    Network net_diskhb_01
        svcxd_a1_hdisk9_01   /dev/hdisk9
    Network net_diskhb_02
    Network net_ether_01
        svcxd_a1_boot   192.168.8.103
NODE svcxd_a2:
    Network net_XD_ip_01
        svcxd_a2_xdip   10.12.5.37
    Network net_diskhb_01
        svcxd_a2_hdisk9_01   /dev/hdisk9
    Network net_diskhb_02
    Network net_ether_01
        svcxd_a2_boot   192.168.8.104
NODE svcxd_b1:
    Network net_XD_ip_01
        svcxd_b1_xdip   10.114.124.40
    Network net_diskhb_01
    Network net_diskhb_02
        svcxd_b1_hdisk9_01   /dev/hdisk9
    Network net_ether_01
        svcxd_b1_boot   192.168.12.103
NODE svcxd_b2:
    Network net_XD_ip_01
        svcxd_b2_xdip   10.114.124.44
    Network net_diskhb_01
    Network net_diskhb_02
        svcxd_b2_hdisk9_01   /dev/hdisk9
    Network net_ether_01
        svcxd_b2_boot   192.168.12.104
[svcxd_a1][/]> cltopinfo
Cluster Name: svcxd
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 4 node(s) and 4 network(s) defined
NODE svcxd_a1:
    Network net_XD_ip_01
        svcxd_a1_xdip   10.12.5.36
    Network net_diskhb_01
        svcxd_a1_hdisk9_01   /dev/hdisk9
    Network net_diskhb_02
    Network net_ether_01
        svcxd_b1_sv     10.10.12.113
        svcxd_a2_sv     192.168.100.94
        svcxd_a1_sv     192.168.100.54
        svcxd_b2_sv     10.10.12.114
        svcxd_a1_boot   192.168.8.103
NODE svcxd_a2:
    Network net_XD_ip_01
        svcxd_a2_xdip   10.12.5.37
    Network net_diskhb_01
        svcxd_a2_hdisk9_01   /dev/hdisk9
    Network net_diskhb_02
    Network net_ether_01
        svcxd_b1_sv     10.10.12.113
        svcxd_a2_sv     192.168.100.94
        svcxd_a1_sv     192.168.100.54
        svcxd_b2_sv     10.10.12.114
        svcxd_a2_boot   192.168.8.104
NODE svcxd_b1:
    Network net_XD_ip_01
        svcxd_b1_xdip   10.114.124.40
    Network net_diskhb_01
    Network net_diskhb_02
        svcxd_b1_hdisk9_01   /dev/hdisk9
    Network net_ether_01
        svcxd_b1_sv     10.10.12.113
        svcxd_a2_sv     192.168.100.94
        svcxd_a1_sv     192.168.100.54
        svcxd_b2_sv     10.10.12.114
        svcxd_b1_boot   192.168.12.103
NODE svcxd_b2:
    Network net_XD_ip_01
        svcxd_b2_xdip   10.114.124.44
    Network net_diskhb_01
    Network net_diskhb_02
        svcxd_b2_hdisk9_01   /dev/hdisk9
    Network net_ether_01
        svcxd_b1_sv     10.10.12.113
        svcxd_a2_sv     192.168.100.94
        svcxd_a1_sv     192.168.100.54
        svcxd_b2_sv     10.10.12.114
        svcxd_b2_boot   192.168.12.104
                                                   [Entry Fields]
* SVC Cluster Name                                 [B8_8G4]
* SVC Cluster Role                                 [Master]        +
* HACMP Site                                       svc_sitea       +
* SVC Cluster IP Address                           [10.12.5.55]
  SVC Cluster Second IP Address                    []
* Remote SVC Partner                               [B12_4F2]
Figure 5-16 Adding an SVC cluster definition
In this scenario, both SAN Volume Controller clusters have master roles. The same procedure
that is shown in Figure 5-16 is repeated for the SAN Volume Controller cluster B12_4F2 at
site svc_siteb. After you complete the SAN Volume Controller cluster definition, Example 5-12
shows its configuration.
Example 5-12 SAN Volume Controller cluster definition
[svcxd_a1][/]> cllssvc -a
#SVCNAME ROLE SITENAME IPADDR IPADDR2 RPARTNER
B8_8G4 Master svc_sitea 10.12.5.55 B12_4F2
B12_4F2 Master svc_siteb 10.114.63.250 B8_8G4
                                                   [Entry Fields]
* Relationship Name                                [svc_disk2]
* Master VDisk Info                                [svc_haxd0001@B8_8G4]
* Auxiliary VDisk Info                             [haxd_svc_v0001@B12_4F2]
After you complete the procedure, the SVC PPRC relationships are displayed as shown in
Example 5-13.
Example 5-13 SVC PPRC relationship
[svcxd_a1][/]> cllsrelationship -a
relationship_name  MasterVdisk_info         AuxiliaryVdisk_info
svc_disk2          svc_haxd0001@B8_8G4      haxd_svc_v0001@B12_4F2
svc_disk3          svc_haxd0002@B8_8G4      haxd_svc_v0002@B12_4F2
svc_disk4          svc_haxd0003@B8_8G4      haxd_svc_v0003@B12_4F2
svc_disk5          svc_haxd0004@B8_8G4      haxd_svc_v0004@B12_4F2
svc_disk6          haxd_svc_v0005@B12_4F2   svc_haxd0005@B8_8G4
svc_disk7          haxd_svc_v0006@B12_4F2   svc_haxd0006@B8_8G4
svc_disk8          haxd_svc_v0007@B12_4F2   svc_haxd0007@B8_8G4
                                                   [Entry Fields]
* SVC PPRC Consistency Group Name                  [svc_metro]
* Master SVC Cluster Name                          [B8_8G4]    +
* Auxiliary SVC Cluster Name                       [B12_4F2]   +
* List of Relationships                            [svc_disk2 svc_disk3 svc_disk4 svc_disk5]  +
* Copy Type                                        [METRO]     +
* HACMP Recovery Action                            [AUTO]      +
After you complete the parameters, you see the SVC PPRC Resource listing as shown in
Example 5-14.
Example 5-14 SVC PPRC resources
[svcxd_a1][/]> cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                             CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5   METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8             GLOBAL    AUTO
[svcxd_a1][/]> lspv
hdisk0   000fe4110889e1a9   rootvg          active
hdisk1   000fe4112579eb7c   rootvg          active
hdisk2   000fe4112579ec10   siteametrovg
hdisk3   000fe4112579ee11   siteametrovg
hdisk4   000fe4112579ee4c   siteametrovg
hdisk5   000fe4112579ee89   siteametrovg
hdisk6   000fe4112579eec4   sitebglobalvg
hdisk7   000fe4112579eefe   sitebglobalvg
hdisk8   000fe4112579ef3d   sitebglobalvg
hdisk9   000fe4112579ef78   vg_hba
[svcxd_a1][/]> lsvg -l siteametrovg
siteametrovg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE      MOUNT POINT
loglv00          jfs2log   1     1     1     open/syncd    N/A
lvsiteametro1    jfs2      4     4     1     open/syncd    /dev/siteametro1
lvsiteametro2    jfs2      4     4     1     open/syncd    /dev/siteametro2
lvsiteametro3    jfs2      4     4     1     open/syncd    /dev/siteametro3
[svcxd_a1][/]> lsvg -l sitebglobalvg
sitebglobalvg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
loglv01          jfs2log   1     1     1     closed/syncd   N/A
lvsitebglobal1   jfs2      4     4     1     closed/syncd   /dev/sitebglobal1
lvsitebglobal2   jfs2      4     4     1     closed/syncd   /dev/sitebglobal2
lvsitebglobal3   jfs2      4     4     1     closed/syncd   /dev/sitebglobal3
Major numbers: Determining a major number is required only when you use NFS. However, as a general practice for clusters, keep the major numbers the same on all nodes.
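A minimal sketch of this check (the command names are standard AIX; the volume group names are the ones from this scenario, and the major number 53 and the hdisk2 device in the commented importvg line are placeholders):

  # On the node that already owns the volume group: show its major number
  ls -l /dev/siteametrovg
  # On the node that will import it: list the free major numbers
  lvlstmajor
  # If a specific major number must be preserved, pass it to importvg, for example:
  # importvg -V 53 -y siteametrovg hdisk2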
On the svcxd_a2 node, run the commands to import the volume groups (Example 5-16).
Example 5-16 Importing the volume groups
[svcxd_a2][/]> lspv
hdisk0   000fe401088fc86b   rootvg          active
hdisk1   00c7cd9e84341284   rootvg          active
hdisk2   000fe4112579ec10   siteametrovg
hdisk3   000fe4112579ee11   siteametrovg
hdisk4   000fe4112579ee4c   siteametrovg
hdisk5   000fe4112579ee89   siteametrovg
hdisk6   000fe4112579eec4   sitebglobalvg
hdisk7   000fe4112579eefe   sitebglobalvg
hdisk8   000fe4112579ef3d   sitebglobalvg
hdisk9   000fe4112579ef78   vg_hba
Ensure that AUTO VARYON on the volume groups is disabled. When you use enhanced concurrent volume groups, this is the default setting. If it is not set, run the chvg -a n <vgname> command for each volume group. Run the varyonvg <vgname> command to verify that all logical volumes and file systems exist and can be mounted (Example 5-18).
Example 5-18 Volume groups
siteametrovg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE      MOUNT POINT
loglv00          jfs2log   1     1     1     open/syncd    N/A
lvsiteametro1    jfs2      4     4     1     open/syncd    /dev/siteametro1
lvsiteametro2    jfs2      4     4     1     open/syncd    /dev/siteametro2
lvsiteametro3    jfs2      4     4     1     open/syncd    /dev/siteametro3
sitebglobalvg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
loglv01          jfs2log   1     1     1     closed/syncd   N/A
lvsitebglobal1   jfs2      4     4     1     closed/syncd   /dev/sitebglobal1
lvsitebglobal2   jfs2      4     4     1     closed/syncd   /dev/sitebglobal2
lvsitebglobal3   jfs2      4     4     1     closed/syncd   /dev/sitebglobal3
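The check that is described before Example 5-18 can also be scripted. The following loop is only a sketch, using the two volume group names from this scenario:

  for vg in siteametrovg sitebglobalvg; do
    chvg -a n $vg        # make sure automatic varyon stays disabled
    varyonvg $vg         # bring the volume group online
    lsvg -l $vg          # confirm that all logical volumes and file systems are present
    varyoffvg $vg        # release it again before cluster services take over
  done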
state inconsistent_copying
bg_copy_priority 50
progress
freeze_time
status online
sync
copy_type metro
[svcxd_a1][/]> ssh admin@B8_8G4 svcinfo lsrcrelationship temp1
id 0
name svc_disk2
master_cluster_id 0000020064009B10
master_cluster_name B8_8G4
master_vdisk_id 0
master_vdisk_name svc_haxd0001
aux_cluster_id 0000020060A0469E
aux_cluster_name B12_4F2
aux_vdisk_id 0
aux_vdisk_name haxd_svc_v0001
primary master
consistency_group_id 0
consistency_group_name svc_metro
state consistent_synchronized
bg_copy_priority 50
progress
freeze_time
status online
sync
copy_type metro
After the copy completes successfully, remove the SVC PPRC relationship (Example 5-22).
Example 5-22 Removing an SVC PPRC relationship
Important: Before you import the volume groups, make sure that the PVID is present on the remote disks by running the chdev -l hdisk# -a pv=yes command. The PVID must match the PVID of the corresponding hdisk at the opposite site, because the disk is a true, complete copy of the source disk, not a shared disk.
Example 5-24 checks the PVID on the svcxd_b1 node.
Example 5-24 PVID
[svcxd_b1][/]> lspv
hdisk0   000fe401088fc86b   rootvg   active
hdisk1   00c7cd9e84341284   rootvg   active
hdisk2   000fe4112579ec10   None
hdisk3   000fe4112579ee11   None
hdisk4   000fe4112579ee4c   None
hdisk5   000fe4112579ee89   None
hdisk6   000fe4112579eec4   None
hdisk7   000fe4112579eefe   None
hdisk8   000fe4112579ef3d   None
hdisk9   00ca02ef25b24924   vg_hbb
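The import commands themselves are not shown in the surviving output. A minimal sketch for svcxd_b1, assuming that hdisk2 and hdisk6 are the first members of each replicated volume group, as in the listing above (add -V <major> if you must preserve the major numbers from site A):

  importvg -y siteametrovg hdisk2
  importvg -y sitebglobalvg hdisk6
  chvg -a n siteametrovg       # keep automatic varyon disabled
  chvg -a n sitebglobalvg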
[svcxd_b1][/]> lspv
hdisk0   00ca02ef74a67ade   rootvg          active
hdisk1   00ca02ef751884e8   rootvg          active
hdisk2   000fe4112579ec10   siteametrovg
hdisk3   000fe4112579ee11   siteametrovg
hdisk4   000fe4112579ee4c   siteametrovg
hdisk5   000fe4112579ee89   siteametrovg
hdisk6   000fe4112579eec4   sitebglobalvg
hdisk7   000fe4112579eefe   sitebglobalvg
hdisk8   000fe4112579ef3d   sitebglobalvg
hdisk9   00ca02ef25b24924   vg_hbb
Run the lsvg command on each volume group to check whether you can read them
(Example 5-27).
Example 5-27 Volume groups read
siteametrovg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
loglv00          jfs2log   1     1     1     closed/syncd   N/A
lvsiteametro1    jfs2      4     4     1     closed/syncd   /dev/siteametro1
lvsiteametro2    jfs2      4     4     1     closed/syncd   /dev/siteametro2
lvsiteametro3    jfs2      4     4     1     closed/syncd   /dev/siteametro3
[svcxd_b1][/]> lsvg -l sitebglobalvg
sitebglobalvg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE     MOUNT POINT
loglv01          jfs2log   1     1     1     open/syncd   N/A
lvsitebglobal1   jfs2      4     4     1     open/syncd   /dev/sitebglobal1
lvsitebglobal2   jfs2      4     4     1     open/syncd   /dev/sitebglobal2
lvsitebglobal3   jfs2      4     4     1     open/syncd   /dev/sitebglobal3
Before you import these volume groups to the node svcxd_b2, vary off all of the volume groups that you imported in the last procedure. On node svcxd_b2, we checked the PVID and imported the volume groups as shown in Example 5-28.
Example 5-28 Volume groups
siteametrovg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
loglv00          jfs2log   1     1     1     closed/syncd   N/A
lvsiteametro1    jfs2      4     4     1     closed/syncd   /dev/siteametro1
lvsiteametro2    jfs2      4     4     1     closed/syncd   /dev/siteametro2
lvsiteametro3    jfs2      4     4     1     closed/syncd   /dev/siteametro3

lsvg -l sitebglobalvg
sitebglobalvg:
LV NAME          TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
loglv01          jfs2log   1     1     1     closed/syncd   N/A
lvsitebglobal1   jfs2      4     4     1     closed/syncd   /dev/sitebglobal1
lvsitebglobal2   jfs2      4     4     1     closed/syncd   /dev/sitebglobal2
lvsitebglobal3   jfs2      4     4     1     closed/syncd   /dev/sitebglobal3
                                                   [Entry Fields]
  Resource Group Name                              [RG_sitea]
  Startup Policy
  Failover Policy
  Failback Policy
Ignore                 If you select this option, the resource group will not have ONLINE
                       SECONDARY instances. Use this option if you use cross-site LVM
                       mirroring. You can also use it with PowerHA Enterprise Edition for
                       Metro Mirror.
Online on Either Site  During startup, the primary instance of the resource group is brought
                       ONLINE on the first node that meets the node policy criteria (either
                       site). The secondary instance is started on the other site. The primary
                       instance does not fall back when the original site rejoins.
Online on Both Sites   During startup, the resource group (node policy must be defined as
                       online on all available nodes) is brought ONLINE on both sites. There
                       is no failover or failback.
In the Participating Nodes from Primary Site parameter, specify the nodes in order for the
primary site.
In the Participating Nodes from Secondary Site parameter, specify the nodes in order for the
secondary site.
Important: The resource groups that include PPRC replicated resources have the following considerations:
An inter-site management policy of Online on Both Sites is not supported.
Startup policies of Online Using Distribution Policy and Online on All Available Nodes are not supported.
A failover policy of Failover Using Dynamic Node Priority is not supported.
Complete the fields as desired. Repeat this action for any additional resource groups that you
need to add. In this scenario, we create two resource groups:
One with nodes svcxd_a1 and svcxd_a2 as the primary site (svc_sitea) for Metro Mirror
One with nodes svcxd_b1 and svcxd_b2 as the secondary site (svc_siteb) for Global Mirror
Example 5-29 shows these two resource groups. Only the relevant fields are shown.
Example 5-29 Resource groups
RG_sitea
svcxd_a1 svcxd_a2 svcxd_b2 svcxd_b1
Online On Home Node Only
Failover To Next Priority Node In The List
Never Failback
Prefer Primary Site
RG_siteb
svcxd_b1 svcxd_b2 svcxd_a2 svcxd_a1
Online On Home Node Only
Failover To Next Priority Node In The List
Never Failback
Prefer Primary Site
After you complete this panel, for RG_sitea or RG_siteb resource group in this example, the
SMIT panel (Figure 5-20) is displayed.
Change/Show All Resources and Attributes for a Custom Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[TOP]                                              [Entry Fields]
  Resource Group Name                              RG_sitea
  Inter-site Management Policy                     Prefer Primary Site
  Participating Nodes from Primary Site            svcxd_a1 svcxd_a2
  Participating Nodes from Secondary Site          svcxd_b2 svcxd_b1

  Startup Policy
  Failover Policy                                  Failover To Next Priority Node In The List
  Failback Policy

  Service IP Labels/Addresses                      [svcxd_a1_sv svcxd_b1_sv]  +
  Application Servers                              []                         +
  Volume Groups                                    [siteametrovg ]            +
  SVC PPRC Replicated Resources                    [svc_metro]                +
Figure 5-20 Add SAN Volume Controller replicated resources into resource group
Example 5-30 shows the details of the resource groups that are used in this scenario.
Example 5-30 Resource groups
RG_sitea
svcxd_a1 svcxd_a2 svcxd_b2
Online On First Available Node
Failover To Next Priority Node
Never Failback
Prefer Primary Site
svcxd_a1_sv svcxd_b1_sv
siteametrovg
svc_metro
RG_siteb
svcxd_b1 svcxd_b2 svcxd_a2
Online On Home Node Only
Failover To Next Priority Node
Never Failback
Prefer Primary Site
svcxd_a2_sv svcxd_b2_sv
sitebglobalvg
svc_global
5.4.1 Adding disks to PowerHA Enterprise Edition with SAN Volume Controller
In this scenario, you create one new VDisk in the SAN Volume Controller cluster B8_8G4 to be the master VDisk and associate it with the nodes from site svc_sitea (svcxd_a1 and svcxd_a2). You also create one new VDisk in the SAN Volume Controller cluster B12_4F2 to be the auxiliary VDisk and associate it with the nodes from site svc_siteb (svcxd_b1 and svcxd_b2), as shown in Example 5-31.
Example 5-31 New disks that are created from SAN Volume Controller cluster
[svcxd_a1][/]> lspv
hdisk0    000fe4110889e1a9   rootvg          active
hdisk1    000fe4112579eb7c   rootvg          active
hdisk2    000fe4112579ec10   siteametrovg    concurrent
hdisk3    000fe4112579ee11   siteametrovg    concurrent
hdisk4    000fe4112579ee4c   siteametrovg    concurrent
hdisk5    000fe4112579ee89   siteametrovg    concurrent
hdisk6    000fe4112579eec4   sitebglobalvg   concurrent
hdisk7    000fe4112579eefe   sitebglobalvg   concurrent
hdisk8    000fe4112579ef3d   sitebglobalvg   concurrent
hdisk9    000fe4112579ef78   vg_hba
hdisk10   none               None
[svcxd_b1][/]> lspv
hdisk0    00ca02ef74a67ade   rootvg          active
hdisk1    00ca02ef751884e8   rootvg          active
hdisk2    000fe4112579ec10   siteametrovg    concurrent
hdisk3    000fe4112579ee11   siteametrovg    concurrent
hdisk4    000fe4112579ee4c   siteametrovg    concurrent
hdisk5    000fe4112579ee89   siteametrovg    concurrent
hdisk6    000fe4112579eec4   sitebglobalvg   concurrent
hdisk7    000fe4112579eefe   sitebglobalvg   concurrent
hdisk8    000fe4112579ef3d   sitebglobalvg   concurrent
hdisk9    00ca02ef25b24924   vg_hbb
hdisk10   none               None
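The SAN Volume Controller side of this step is not reproduced in the example. The following commands show its general shape only as a sketch: the storage pool names MDG_A and MDG_B, the 10 GB size, and the site B host aliases are hypothetical placeholders, while the VDisk names match the ones that are used later in this section:

  # On the master cluster (site A)
  ssh admin@B8_8G4  svctask mkvdisk -mdiskgrp MDG_A -iogrp 0 -size 10 -unit gb -name svc_haxd0009
  ssh admin@B8_8G4  svctask mkvdiskhostmap -host SVC_550_1_A1 svc_haxd0009
  ssh admin@B8_8G4  svctask mkvdiskhostmap -host SVC_550_2_A2 svc_haxd0009
  # On the auxiliary cluster (site B); the host aliases are placeholders
  ssh admin@B12_4F2 svctask mkvdisk -mdiskgrp MDG_B -iogrp 0 -size 10 -unit gb -name haxd_svc_v0009
  ssh admin@B12_4F2 svctask mkvdiskhostmap -host SVC_575_1_B1 haxd_svc_v0009
  ssh admin@B12_4F2 svctask mkvdiskhostmap -host SVC_575_2_B2 haxd_svc_v0009
  # Then run cfgmgr on each node so that AIX discovers the new hdisk10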
This scenario simulates how to add one more disk to the siteametrovg volume group and then how to add it to the replicated resource svc_metro in SVC PPRC (Example 5-32).
Example 5-32 SVC PPRC replicated resources
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                             CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5   METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8             GLOBAL    AUTO
First, place the PVID in the disk that was created to be the master VDisk in the nodes from
the svc_sitea (Example 5-33).
Example 5-33 Adding the PVID to the new disk
[svcxd_a1][/]> lspv
hdisk0    000fe4110889e1a9   rootvg          active
hdisk1    000fe4112579eb7c   rootvg          active
hdisk2    000fe4112579ec10   siteametrovg    concurrent
hdisk3    000fe4112579ee11   siteametrovg    concurrent
hdisk4    000fe4112579ee4c   siteametrovg    concurrent
hdisk5    000fe4112579ee89   siteametrovg    concurrent
hdisk6    000fe4112579eec4   sitebglobalvg   concurrent
hdisk7    000fe4112579eefe   sitebglobalvg   concurrent
hdisk8    000fe4112579ef3d   sitebglobalvg   concurrent
hdisk9    000fe4112579ef78   vg_hba
hdisk10   none               None
[svcxd_a1][/]> chdev -l hdisk10 -a pv=yes
hdisk10 changed
[svcxd_a1][/]> lspv
hdisk0    000fe4110889e1a9   rootvg          active
hdisk1    000fe4112579eb7c   rootvg          active
hdisk2    000fe4112579ec10   siteametrovg    concurrent
hdisk3    000fe4112579ee11   siteametrovg    concurrent
hdisk4    000fe4112579ee4c   siteametrovg    concurrent
hdisk5    000fe4112579ee89   siteametrovg    concurrent
hdisk6    000fe4112579eec4   sitebglobalvg   concurrent
hdisk7    000fe4112579eefe   sitebglobalvg   concurrent
hdisk8    000fe4112579ef3d   sitebglobalvg   concurrent
hdisk9    000fe4112579ef78   vg_hba
hdisk10   000fe41163c82dff   None
You must use the same procedure that is shown in Example 5-33 for node svcxd_a2 because it shares the disks from svc_sitea.
RC_rel_name svc_disk3
RC_rel_id 2
RC_rel_name svc_disk4
RC_rel_id 3
RC_rel_name svc_disk5
RC_rel_id 24
RC_rel_name svc_disk10
Now, add the PVID to the new disk on the nodes at svc_siteb (Example 5-38).
[svcxd_b1][/]> lspv
hdisk0    00ca02ef74a67ade   rootvg          active
hdisk1    00ca02ef751884e8   rootvg          active
hdisk2    000fe4112579ec10   siteametrovg    concurrent
hdisk3    000fe4112579ee11   siteametrovg    concurrent
hdisk4    000fe4112579ee4c   siteametrovg    concurrent
hdisk5    000fe4112579ee89   siteametrovg    concurrent
hdisk6    000fe4112579eec4   sitebglobalvg   concurrent
hdisk7    000fe4112579eefe   sitebglobalvg   concurrent
hdisk8    000fe4112579ef3d   sitebglobalvg   concurrent
hdisk9    00ca02ef25b24924   vg_hbb
hdisk10   none               None
[svcxd_b1][/]> chdev -l hdisk10 -a pv=yes
hdisk10 changed
[svcxd_b1][/]> lspv
hdisk0    00ca02ef74a67ade   rootvg          active
hdisk1    00ca02ef751884e8   rootvg          active
hdisk2    000fe4112579ec10   siteametrovg    concurrent
hdisk3    000fe4112579ee11   siteametrovg    concurrent
hdisk4    000fe4112579ee4c   siteametrovg    concurrent
hdisk5    000fe4112579ee89   siteametrovg    concurrent
hdisk6    000fe4112579eec4   sitebglobalvg   concurrent
hdisk7    000fe4112579eefe   sitebglobalvg   concurrent
hdisk8    000fe4112579ef3d   sitebglobalvg   concurrent
hdisk9    00ca02ef25b24924   vg_hbb
hdisk10   000fe41163c82dff   None
You must repeat this procedure for the svcxd_b2 node because it shares the disks from
svc_siteb.
In the next SMIT screen (Figure 5-21), select the volume group that you want to expand, which is siteametrovg in this case.
Select the Volume Group that will hold the new Logical Volume

Move cursor to desired item and press Enter. Use arrow keys to scroll.

  #Volume Group    Resource Group              Node List
  siteametrovg     RG_sitea                    svcxd_a1,svcxd_a2,svcxd_b1
  sitebglobalvg    RG_siteb                    svcxd_a1,svcxd_a2,svcxd_b1
  siteametrovg     RG_sitea                    svcxd_a2,svcxd_b1,svcxd_b2
  sitebglobalvg    RG_siteb                    svcxd_a2,svcxd_b1,svcxd_b2
  vg_hbb           <Not in a Resource Group>   svcxd_b1,svcxd_b2
Figure 5-21 Selecting the volume group to expand
In the Physical Volume Names panel (Figure 5-22), select the desired disk.
+--------------------------------------------------------------------------+
|                          Physical Volume Names                           |
|                                                                          |
| Move cursor to desired item and press Enter.                             |
|                                                                          |
|   000fe41163c82dff ( hdisk10 on all selected nodes )                     |
|                                                                          |
| F1=Help     F2=Refresh     F3=Cancel                                     |
| F8=Image    F10=Exit       Enter=Do                                      |
| /=Find      n=Find Next                                                  |
+--------------------------------------------------------------------------+
Figure 5-22 Selecting the desired physical volume
Check the options that are selected and press Enter (Figure 5-23).
Add a Volume to a Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
  VOLUME GROUP name                                siteametrovg
  Resource Group Name                              RG_sitea
  Reference node                                   svcxd_a1
  VOLUME names                                     hdisk10
Figure 5-23 Adding the new volume to the siteametrovg volume group
After you complete the addition of the disk to the volume group, you see the result in both
nodes (Example 5-39).
Example 5-39 Checking the disks in the siteametrovg volume group
TOTAL PPs   FREE PPs   FREE DISTRIBUTION
1437        1036       288..00..173..287..288
1437        1433       288..283..287..287..288
1437        1433       288..283..287..287..288
1437        1433       288..283..287..287..288
637         637        128..127..127..127..128

TOTAL PPs   FREE PPs   FREE DISTRIBUTION
1437        1036       288..00..173..287..288
1437        1433       288..283..287..287..288
1437        1433       288..283..287..287..288
1437        1433       288..283..287..287..288
637         637        128..127..127..127..128
                                                   [Entry Fields]
* Relationship Name                                [svc_disk10]
* Master VDisk Info                                [svc_haxd0009@B8_8G4]
* Auxiliary VDisk Info                             [haxd_svc_v0009@B12_4F2]
4. Add this new SVC PPRC relationship to the SVC PPRC replicated resources. In this case,
we add it to the replicated resource svc_metro, which we used for the disks from the
siteametrovg volume group.
d. In the Change/Show SVC PPRC Resource panel (Figure 5-26), add the svc_disk10
relationship.
Change / Show SVC PPRC Resource

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* SVC PPRC Consistency Group Name                  svc_metro
  New SVC PPRC Consistency Group Name              []
* Master SVC Cluster Name                          [B8_8G4]   +
* Auxiliary SVC Cluster Name                       [B12_4F2]  +
* List of Relationships                            [svc_disk2 svc_disk3 svc_disk4 svc_disk5 svc_disk10]
* Copy Type                                        METRO      +
* HACMP Recovery Action                            MANUAL     +
Figure 5-26 Adding the svc_disk10 relationship to the svc_metro resource
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllsrelationship -a
relationship_name  MasterVdisk_info         AuxiliaryVdisk_info
svc_disk2          svc_haxd0001@B8_8G4      haxd_svc_v0001@B12_4F2
svc_disk3          svc_haxd0002@B8_8G4      haxd_svc_v0002@B12_4F2
svc_disk4          svc_haxd0003@B8_8G4      haxd_svc_v0003@B12_4F2
svc_disk5          svc_haxd0004@B8_8G4      haxd_svc_v0004@B12_4F2
svc_disk6          haxd_svc_v0005@B12_4F2   svc_haxd0005@B8_8G4
svc_disk7          haxd_svc_v0006@B12_4F2   svc_haxd0006@B8_8G4
svc_disk8          haxd_svc_v0007@B12_4F2   svc_haxd0007@B8_8G4
svc_disk10         svc_haxd0009@B8_8G4      haxd_svc_v0009@B12_4F2
Upon completion, you see the SVC PPRC resources as shown in Example 5-41.
Example 5-41 SVC PPRC resources
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                                        CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5 svc_disk10   METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8                        GLOBAL    AUTO
5.4.2 Removing disks from PowerHA Enterprise Edition with SAN Volume
Controller
To remove a disk in PowerHA with SAN Volume Controller, complete the following steps (a command-line sketch of the SAN Volume Controller part follows this list):
1. In PowerHA
a. Remove the disk from the volume group.
b. Remove the SVC PPRC relationship definition in the SVC PPRC Resource.
c. Remove the SVC PPRC relationship.
d. Synchronize the cluster.
2. In the SAN Volume Controller Cluster
a. Remove the SVC PPRC relationship definition from the consistent group.
b. Remove the SVC PPRC relationship definition.
c. Remove the VDisks definitions from both SAN Volume Controller clusters.
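A sketch of the SAN Volume Controller part of this cleanup, reusing the names from the previous section (svc_disk10, svc_haxd0009, and haxd_svc_v0009) and assuming that the relationship was already taken out of the consistency group through the PowerHA panels:

  ssh admin@B8_8G4  svctask rmrcrelationship svc_disk10
  ssh admin@B8_8G4  svctask rmvdiskhostmap -host SVC_550_1_A1 svc_haxd0009
  ssh admin@B8_8G4  svctask rmvdiskhostmap -host SVC_550_2_A2 svc_haxd0009
  ssh admin@B8_8G4  svctask rmvdisk svc_haxd0009
  # Repeat the equivalent host unmappings on the auxiliary cluster, then:
  ssh admin@B12_4F2 svctask rmvdisk haxd_svc_v0009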
cluster B8_8G4. The consistency group svc_global in the Primary column is marked as
Master. In this case, SVC PPRC replication goes from SAN Volume Controller cluster
B12_4F2 to the B8_8G4 SAN Volume Controller cluster because their master cluster is the
SAN Volume Controller cluster B12_4F2. In the State column, you can check the status of the
copy between the SAN Volume Controller clusters.
Example 5-42 shows how to check the status of the svc_disk2 relationship by using the
command line.
Example 5-42 SVC PPRC relationship of svc_disk2
Verify the status of the SVC PPRC relationship by using the information in the following fields:
master_cluster_name     The name of the master SAN Volume Controller cluster (B8_8G4).
aux_cluster_name        The name of the auxiliary SAN Volume Controller cluster (B12_4F2).
primary                 The cluster (master or aux) that currently acts as the copy source.
consistency_group_name  The name of the consistency group that the relationship is included in.
state                   The status of the copy between both SAN Volume Controller clusters.
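A sketch that pulls only these fields for one relationship (the relationship and cluster names are the ones from this scenario):

  ssh admin@B8_8G4 svcinfo lsrcrelationship svc_disk2 | \
    egrep 'master_cluster_name|aux_cluster_name|^primary |consistency_group_name|^state '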
Soft test
Test the PowerHA Enterprise Edition with SAN Volume Controller by using the C-SPOC
operation. As shown in Example 5-43, resource group RG_sitea is online on node svcxd_a1
on site svc_sitea. Resource group RG_siteb is online on node svcxd_b1 on site svc_siteb.
Example 5-43 Resource group status
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Resource group RG_sitea manages the SVC PPRC replicated resource svc_metro.
Resource group RG_siteb manages the SVC PPRC replicated resource svc_global.
Example 5-44 shows the SVC PPRC replicated resources.
Example 5-44 SVC PPRC replicated resource
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                             CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5   METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8             GLOBAL    AUTO
In the first test, we move the resource group RG_sitea to the node svcxd_a2 on the same site
svc_sitea. To move the resource group:
1. Enter the smitty cl_admin command.
2. Select Resource Groups and Applications → Move a Resource Group to Another Node / Site → Move Resource Groups to Another Node.
3. For Resource Group(s) to be Moved, select RG_sitea, and for Destination Node select
svcxd_a2 (Figure 5-28).
Move Resource Group(s) to Another Node

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
  Resource Group(s) to be Moved                    RG_sitea
  Destination Node                                 svcxd_a2
Figure 5-28 Moving RG_sitea to node svcxd_a2
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       OFFLINE              svcxd_a1@svc_s
               ONLINE               svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-46 shows that the status of the relationship did not change because the node
from svc_sitea shares the disks.
Example 5-46 SVC PPRC relationship status
consistency_group_name svc_metro
state consistent_synchronized
bg_copy_priority 50
progress
freeze_time
status online
sync
copy_type metro
5. Move RG_sitea resource group to the svcxd_b1 node on the svc_siteb site (Figure 5-29).
Move Resource Group(s) to Another Node

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
  Resource Group(s) to be Moved                    RG_sitea
  Destination Node                                 svcxd_b1
Figure 5-29 Moving RG_sitea to node svcxd_b1
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE SECONDARY     svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE               svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-48 shows that the status of the relationship changed. Now the SAN Volume
Controller cluster B8_8G4 works as auxiliary for B12_4F2 because the Primary field is
now aux. The field state is still in the consistent_synchronized state.
Example 5-48 SVC PPRC relationship status
aux_vdisk_name haxd_svc_v0001
primary aux
consistency_group_id 0
consistency_group_name svc_metro
state consistent_synchronized
bg_copy_priority 50
progress
freeze_time
status online
sync
copy_type metro
7. Move the resource group RG_sitea to node svcxd_b2 on site svc_siteb (Figure 5-30).
Move Secondary Instance(s) of Resource Group(s) to Another Node

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
  Resource Group(s) to be Moved                    RG_sitea
  Destination Node                                 svcxd_b2
Figure 5-30 Moving RG_sitea to node svcxd_b2
Example 5-49 shows that RG_sitea is online in the node svcxd_b2 on site svc_siteb.
Example 5-49 Resource group status
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE SECONDARY     svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE               svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-50 shows that the status of the relationship did not change, as the nodes from
svc_siteb share disks.
Example 5-50 SVC PPRC relationship status
aux_cluster_name B12_4F2
aux_vdisk_id 0
aux_vdisk_name haxd_svc_v0001
primary aux
consistency_group_id 0
consistency_group_name svc_metro
state consistent_synchronized
bg_copy_priority 50
progress
freeze_time
status online
sync
copy_type metro
8. Return the resource group RG_sitea to node svcxd_a1 on site svc_sitea (Figure 5-31).
Move Resource Group(s) to Another Node

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
  Resource Group(s) to be Moved                    RG_sitea
  Destination Node                                 svcxd_a1
Figure 5-31 Returning RG_sitea to node svcxd_a1
Example 5-51 shows that RG_sitea is online in node svcxd_a1 on site svc_sitea.
Example 5-51 Resource group status
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-52 shows that the status of the relationship changed. Now the SAN Volume
Controller cluster B12_4F2 works as an auxiliary of B8_8G4 because the Primary field is now
master. The State field still is in the consistent_synchronized state.
Example 5-52 SVC PPRC relationship status
master_vdisk_id 0
master_vdisk_name svc_haxd0001
aux_cluster_id 0000020060A0469E
aux_cluster_name B12_4F2
aux_vdisk_id 0
aux_vdisk_name haxd_svc_v0001
primary master
consistency_group_id 0
consistency_group_name svc_metro
state consistent_synchronized
bg_copy_priority 50
progress
freeze_time
status online
sync
copy_type metro
Hard test
Test PowerHA for SAN Volume Controller by using the halt -q command to simulate node
and site problems. Example 5-53 shows that the resource group RG_siteb is online on node
svcxd_b1 on site svc_siteb. This resource group manages the svc_global replicated resource
for SVC PPRC.
Example 5-53 Resource group status
[svcxd_b1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-54 shows the status of the svc_global consistency group that is managed from
RG_siteb.
Example 5-54 SVC PPRC consistency group status
status
sync
copy_type global
RC_rel_id 4
RC_rel_name svc_disk6
RC_rel_id 5
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
Status: You can check the status of the relationship by using the command line of the SVC
PPRC consistency group or by using the SVC PPRC relationship. For more information,
see the SVC CLI Guide, SC26-7903.
In the first test, issue the halt -q command in node svcxd_b1 (Example 5-55).
Example 5-55 Simulating failure of a node
[svcxd_b1][/]> halt -q
....Halt completed....
Example 5-56 shows that RG_siteb is online in the svcxd_b2 node on the svc_siteb site.
Example 5-56 Resource group status
[svcxd_b2][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       OFFLINE              svcxd_b1@svc_s
               ONLINE               svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-57 shows that the status of the consistency group did not change because the
nodes from svc_siteb share disks.
Example 5-57 SVC PPRC consistency group status
sync
copy_type global
RC_rel_id 4
RC_rel_name svc_disk6
RC_rel_id 5
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
Enter the halt -q command on the svcxd_b2 node, and simulate a failure of the svc_siteb
site (Example 5-58).
Example 5-58 Simulating a failure of a node
[svcxd_b2][/]> halt -q
....Halt completed....
Example 5-59 shows that RG_siteb is online on the svcxd_a2 node on the svc_sitea site.
Example 5-59 Resource group status
[svcxd_a2][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       OFFLINE              svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE               svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-60 shows that the status of the consistency group changed. Now the SAN Volume
Controller cluster B12_4F2 works as an auxiliary of B8_8G4 because the primary field is now
aux. The state field still is in the consistent_synchronized state.
Example 5-60 SVC PPRC consistency group status
RC_rel_name svc_disk6
RC_rel_id 5
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
Enter the halt -q command on the svcxd_a2 node, and simulate a failure of this node
(Example 5-61).
Example 5-61 Simulating failure of a node
[svcxd_a2][/]> halt -q
....Halt completed....
Example 5-62 shows that RG_siteb is online on the svcxd_a1 node on the svc_sitea site.
Example 5-62 Resource group status
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       OFFLINE              svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE               svcxd_a1@svc_s
Example 5-63 shows that the status of the consistency group did not change because the
nodes from svc_sitea share disks.
Example 5-63 SVC PPRC consistency group status
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
Enter the halt -q command on the svcxd_a1 node, and simulate a failure of the svc_sitea
site (Example 5-64). In this scenario, we test both resource groups.
Example 5-64 Simulating the failure of a node
[svcxd_a1][/]> halt -q
....Halt completed....
Example 5-65 shows that RG_siteb is online on the svcxd_b1 node on the svc_siteb site, and
RG_sitea is online on the svcxd_b2 node on the svc_siteb site.
Example 5-65 Resource group status
[svcxd_b1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       OFFLINE              svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE               svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-66 shows that the status of the consistency group changed. Now the SAN Volume Controller cluster B8_8G4 works as an auxiliary of B12_4F2 because the primary field is now master. The state field is still in the consistent_synchronized state.
Example 5-66 SVC PPRC consistency group status
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
Example 5-67 shows that the status of the consistency group changed. Now the SAN Volume Controller cluster B8_8G4 works as an auxiliary of B12_4F2 because the primary field is now master. The state field is still in the consistent_synchronized state.
Example 5-67 SVC PPRC consistency group status
TOTAL PPs   FREE PPs   FREE DISTRIBUTION
1437        1036       288..00..173..287..288
1437        1433       288..283..287..287..288
1437        1433       288..283..287..287..288
1437        1433       288..283..287..287..288
Example 5-69 shows that RG_sitea is online on the svcxd_a1 node on the svc_sitea site.
Example 5-69 Resource group status
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
As shown in Example 5-70, remove the VDisk maps from the disks that are used in the
siteametrovg volume group from the svcxd_a1 and svcxd_b1 nodes.
Example 5-70 Removing the VDisk mapping
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_1_A1 svc_haxd0001
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_2_A2 svc_haxd0001
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_1_A1 svc_haxd0002
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_2_A2 svc_haxd0002
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_1_A1 svc_haxd0003
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_2_A2 svc_haxd0003
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_1_A1 svc_haxd0004
[svcxd_a1][/]> ssh admin@B8_8G4 svctask rmvdiskhostmap -host SVC_550_2_A2 svc_haxd0005
As shown in Example 5-71, you now see the AIX errorlog messages about a physical disk
that is missing, an lvm I/O error, and quorum lost. The hacmp.out file shows messages about
a volume group failure.
Example 5-71 AIX errorlog and hacmp.log outputs
From hacmp.out:
HACMP Event Preamble
----------------------------------------------------------------------------
Enqueued rg_move release event for resource group 'RG_sitea'.
Reason for recovery of Primary instance of Resource group 'RG_sitea' from
TEMP_ERROR state on node 'svcxd_a1' was 'Volume group failure'.
Enqueued rg_move acquire event for resource group 'RG_sitea'.
Cluster Resource State Change Complete Event has been enqueued.
----------------------------------------------------------------------------
Mar 18 10:53:49 EVENT START: resource_state_change svcxd_a1

From errorlog:
LABEL:          LVM_IO_FAIL
Date/Time:      Thu Mar 18 10:53:46 2010

LABEL:          LVM_SA_QUORCLOSE
Date/Time:      Thu Mar 18 10:53:46 2010

LABEL:          LVM_SA_PVMISS
Date/Time:      Thu Mar 18 10:53:46 2010
Example 5-72 shows that RG_sitea has moved and is now online on the svcxd_b1 node on
the svc_siteb site.
Example 5-72 Resource group status
[svcxd_b1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE SECONDARY     svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE               svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Example 5-73 shows that the status of the consistency group changed. Now the SAN Volume Controller cluster B8_8G4 works as an auxiliary of B12_4F2 because the primary field is now aux. The state field is still in a consistent_synchronized state because, in this test, we removed only the mapping from the VDisks to the hosts, and the relationship still exists.
Example 5-73 SVC PPRC consistency group status
As shown in Example 5-74, add the VDisk maps to the disks used in the siteametrovg volume
group on the svcxd_a1 and svcxd_b1 nodes.
Example 5-74 Adding VDisk mapping
Example 5-75 shows that, after you run the cfgmgr command, you can access the disk again
from the SAN Volume Controller Cluster in the node on the svc_sitea site.
Example 5-75 Disks
[svcxd_a1][/]> cfgmgr
[svcxd_a1][/]> lsdev -Cc disk
hdisk0 Available 23-T1-01 MPIO FC 2145
hdisk1 Available 23-T1-01 MPIO FC 2145
hdisk2 Available 23-T1-01 MPIO FC 2145
hdisk3 Available 23-T1-01 MPIO FC 2145
hdisk4 Available 33-T1-01 MPIO FC 2145
hdisk5 Available 33-T1-01 MPIO FC 2145
hdisk6 Available 33-T1-01 MPIO FC 2145
hdisk7 Available 33-T1-01 MPIO FC 2145
hdisk8 Available 33-T1-01 MPIO FC 2145
hdisk9 Available 23-T1-01 MPIO FC 2145
[svcxd_a2][/]> cfgmgr
[svcxd_a2][/]> lsdev -Cc disk
hdisk0 Available 14-T1-01 MPIO FC 2145
hdisk1 Available 14-T1-01 MPIO FC 2145
hdisk2 Available 14-T1-01 MPIO FC 2145
hdisk3 Available 14-T1-01 MPIO FC 2145
hdisk4 Available 14-T1-01 MPIO FC 2145
hdisk5 Available 14-T1-01 MPIO FC 2145
hdisk6 Available 14-T1-01 MPIO FC 2145
hdisk7 Available 14-T1-01 MPIO FC 2145
hdisk8 Available 14-T1-01 MPIO FC 2145
hdisk9 Available 14-T1-01 MPIO FC 2145
Example 5-76 shows the status of the resource group after we move RG_sitea to the
svcxd_a1 node on the svc_sitea site.
Example 5-76 Resource group status
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE SECONDARY     svcxd_a1@svc_s
Example 5-77 shows that the status of the consistency group changed. Now the SAN Volume Controller cluster B12_4F2 works as an auxiliary of B8_8G4 because the primary field is now master. The state field is still in the consistent_synchronized state.
Example 5-77 SVC PPRC consistency group status
We use the svc_global SVC PPRC Resource as an example in this test scenario to convert
the SVC PPRC replication mode. Example 5-78 lists the SVC PPRC resources.
Example 5-78 SVC PPRC resources
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                             CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5   METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8             GLOBAL    AUTO
To change an SVC PPRC resource, enter the smitty hacmp command. Select Extended
Configuration → Extended Resource Configuration → HACMP Extended Resources
Configuration → Configure SVC PPRC-Replicated Resources → SVC PPRC-Replicated
Resource Configuration → Change/Show an SVC PPRC Resource, and then select the SVC
PPRC resource (Figure 5-32).
  +--------------------------------------------------------------------------+
  |            Select the SVC PPRC Resource Name to Change/Show              |
  |                                                                          |
  |  Move cursor to desired item and press Enter.                            |
  |                                                                          |
  |    svc_metro                                                             |
  |    svc_global                                                            |
  |                                                                          |
  |  F1=Help          F2=Refresh          F3=Cancel                          |
  |  F8=Image         F10=Exit            Enter=Do                           |
  |  /=Find           n=Find Next                                            |
  +--------------------------------------------------------------------------+
Figure 5-32 SVC PPRC resources
In the Change/Show SVC PPRC Resource panel (Figure 5-33), change the Copy Type field
to METRO.
                         Change / Show SVC PPRC Resource

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                    [Entry Fields]
* SVC PPRC Consistency Group Name                   svc_global
  New SVC PPRC Consistency Group Name               []
* Master Cluster                                    [B12_4F2]                        +
* Auxiliary Cluster                                 [B8_8G4]                         +
* Relationships                                     [svc_disk6 svc_disk7 svc_disk8]  +
* Copy Type                                         METRO                            +
* Recovery Action                                   AUTO                             +
Figure 5-33 Change/Show SVC PPRC Resource panel
After you change the SVC PPRC resource, list the resource (Example 5-80).
Example 5-80 SVC PPRC resources
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                            CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5  METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8            METRO     AUTO
After you synchronize the cluster, check the status of the svc_global consistency group
(Example 5-82).
Example 5-82 The consistency group
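The status can be queried with the same lsrcconsistgrp syntax that is used in Example 5-84; a minimal sketch:

ssh admin@B8_8G4 svcinfo lsrcconsistgrp svc_global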
After the copy process finishes, check the status of the svc_global consistency group
(Example 5-84).
Example 5-84 Consistency group status
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/utils]> ssh admin@B8_8G4 svcinfo lsrcconsistgrp svc_global
id 1
name svc_global
master_cluster_id 0000020060A0469E
master_cluster_name B12_4F2
aux_cluster_id 0000020064009B10
aux_cluster_name B8_8G4
primary master
state consistent_synchronized
relationship_count 3
freeze_time
status
sync
copy_type metro
RC_rel_id 4
RC_rel_name svc_disk6
RC_rel_id 5
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
To convert the svc_global resource back to Global Mirror, open the same Change/Show SVC
PPRC Resource panel and change the Copy Type field back to GLOBAL:

                                                    [Entry Fields]
* SVC PPRC Consistency Group Name                   svc_global
  New SVC PPRC Consistency Group Name               []
* Master Cluster                                    [B12_4F2]                        +
* Auxiliary Cluster                                 [B8_8G4]                         +
* Relationships                                     [svc_disk6 svc_disk7 svc_disk8]  +
* Copy Type                                         GLOBAL                           +
* Recovery Action                                   AUTO                             +
After you change the SVC PPRC resource, list the resource as shown in Example 5-85.
Example 5-85 SVC PPRC resources
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                            CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5  METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8            GLOBAL    AUTO
Synchronize the cluster, and then check the status of the svc_global consistency group
(Example 5-86).
Example 5-86 Consistency group status
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
After you complete the copy, check the status of the svc_global consistency group
(Example 5-88).
Example 5-88 Consistency group status after the copy
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                            CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5  METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8            GLOBAL    AUTO
               ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE SECONDARY     svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Before you start this test, check that the svc_metro consistency group is in the
consistent_synchronized state between the SAN Volume Controller clusters B8_8G4 and
B12_4F2 (Figure 5-36).
After stopping the replication links between the SAN Volume Controller clusters B8_8G4 and
B12_4F2, the state of the consistency group changed. Example 5-90 shows the
idling_disconnected state of the svc_metro consistency group on the B8_8G4 cluster. In this
state, the VDisks in this half of the consistency group are all operating in the primary role
(on B8_8G4) and can accept read and write I/O operations.
Example 5-90 SVC PPRC consistency group
aux_cluster_name
primary master
state idling_disconnected
relationship_count 4
freeze_time
status
sync
copy_type metro
RC_rel_id 0
RC_rel_name svc_disk2
RC_rel_id 1
RC_rel_name svc_disk3
RC_rel_id 2
RC_rel_name svc_disk4
RC_rel_id 3
RC_rel_name svc_disk5
Example 5-91 shows the consistent_disconnected state of the svc_metro consistency group
on the B12_4F2 cluster. In this state, the VDisks in this half of the consistency group are all
operating in the secondary role (on B12_4F2) and can accept read I/O operations, but not
write I/O operations.
Example 5-91 SVC PPRC consistency group
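A sketch of the query on the B12_4F2 cluster, using the command pattern from the hacmp.out instructions and the B12_4F2 address 10.114.63.250 that appears in the cllssvc output:

ssh admin@10.114.63.250 svcinfo lsrcconsistgrp -delim : svc_metro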
If you prefer, you can check the state of the consistency group in the SAN Volume Controller
console. Figure 5-37 shows the svc_metro consistency group in the idling_disconnected
state on the SAN Volume Controller cluster B8_8G4. Figure 5-38 shows the same consistency
group in the consistent_disconnected state on the SAN Volume Controller cluster B12_4F2.
Now, to simulate a site failure on nodes from site svc_sitea, run the reboot -q command
(Example 5-92).
Example 5-92 Simulating the failure of a node
[svcxd_a1][/]> reboot -q
[svcxd_a2][/]> reboot -q
After the failure, the RG_sitea resource group is in the ERROR state on the svcxd_b1 node at
the svc_siteb site (Figure 5-39).
[svcxd_b2][/var/hacmp/log]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       OFFLINE              svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_b2@svc_s
               ERROR                svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
The RG_sitea resource group cannot be brought ONLINE on the svcxd_b1 node because of the
consistent_disconnected state of the svc_metro consistency group in the SAN Volume
Controller cluster B12_4F2.
In the hacmp.out file on the svcxd_b1 node, you can find the following instructions for an
SVC PPRC resource with the MANUAL recovery action (Example 5-93).
Example 5-93 The hacmp.out file from the svcxd_b1 node
STEP 3: On the SVC GUI, check the states of the SVC consistency groups
If the consistency groups at the remote site are in "consistent_disconnected" state,
Select the consistency groups and run the "stoprcconsistgrp -access" command
against them to enable I/O to the backup VDisks.
ssh admin@10.12.5.55 svctask stoprcconsistgrp -access svc_metro
NOTE:If the consistency groups are in any other state please consult the
IBM Storage Systems SVC documentation for what further instructions are needed
STEP 4: Wait until the consistency groups state is "idling_disconnected"
STEP 5: Using smitty hacmp select the node you want the RG to be online at.
smitty hacmp -> System Management (C-SPOC) -> HACMP Resource Group and Application
Management -> Bring a Resource Group Online
for the node where you want the RG to come online
Once this completes the RG should be online on the selected site.
Case 2) If Production site SVC cluster and HACMP nodes are both UP.
STEP 3:
On the SVC GUI, check the states of the SVC consistency groups
If the consistency groups at the remote site are in "consistent_disconnected" state,
Select the consistency groups and run the "stoprcconsistgrp -access" command.
ssh admin@10.12.5.55 svctask stoprcconsistgrp -access svc_metro
NOTE: If the consistency groups are in any other state, please consult
the IBM Storage Systems SVC documentation for what further actions
are needed
Wait until the consistency groups state is "idling_disconnected" you can check them with
the following command.
ssh admin@10.12.5.55 svcinfo lsrcconsistgrp -delim : svc_metro
ssh admin@10.114.63.250 svcinfo lsrcconsistgrp -delim : svc_metro
STEP 4: Check and re-connect all physical links (HACMP and SVC links).
On the SVC GUI, check the states of the SVC consistency groups
If the consistency groups are in "idling" state,
determine which HACMP site that the RG should be coming online. Once you do that run
/usr/es/sbin/cluster/svcpprc/cmds/cllssvc -ah
B8_8G4 Master svc_sitea 10.12.5.55 B12_4F2
B12_4F2 Master svc_siteb 10.114.63.250 B8_8G4
so if you want the RG's online on the secondary site you would pick 10.114.63.250 which
is the auxiliary
So next run this command for all the consistency groups so we restart them in the
correct direction. This example is to use the aux site.. however if you wanted to use the
master site you should change -primary aux to be -primary master.
ssh admin@10.12.5.55 svctask startrcconsistgrp -force -primary [ aux | master ]
svc_metro
If the cluster links are not up we will get the error:
CMMVC5975E The operation was not performed because the cluster partnership is not
connected.
NOTE: If the consistency groups are in any other state please consult the
IBM Storage Systems SVC documentation for further instructions.
STEP 5:
freeze_time:
status:online
sync:
copy_type:global
RC_rel_id:98
RC_rel_name:FVT_REL1
Note:
Be sure before you fix the connection from the primary HACMP to secondary HACMP site
nodes that you make sure only one site is running HACMP with the resource groups online. If
they both have the resources when the network connection is repaired one of the 2 sites will be
halted.
STEP 6:
Using smitty hacmp select the node you want the RG to be online at.
smitty hacmp -> System Management (C-SPOC) -> HACMP Resource Group and Application
Management -> Bring a Resource Group Online
for the node where you want the RG to come online
Once this completes the RG should be online on the selected site.
Now, run the stoprcconsistgrp -access command to enable I/O to the backup VDisks in the
SAN Volume Controller Cluster B12_4F2. Then, you can see the idling_disconnected state
for the svc_metro consistency group (Example 5-94).
Example 5-94 Enabling I/O to back up SAN Volume Controller cluster VDisk
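A sketch of the commands, following the pattern from the hacmp.out instructions and assuming the B12_4F2 cluster address 10.114.63.250:

ssh admin@10.114.63.250 svctask stoprcconsistgrp -access svc_metro
ssh admin@10.114.63.250 svcinfo lsrcconsistgrp -delim : svc_metro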
After you change the svc_metro consistency group to the idling_disconnected state, bring
the resource group online manually by using C-SPOC. In this scenario, after we changed the
state of the consistency group, the resource group was brought online automatically (Figure 5-40).
[svcxd_b2][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       OFFLINE              svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               ONLINE               svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
After you reconnect all physical PPRC links, the state of the svc_metro consistency group
changes to idling in both SAN Volume Controller clusters (Example 5-95). In this state, both
the master disks and the auxiliary disks are operating in the primary role, so both are
accessible for write I/O. In this state, the relationship or consistency group accepts a Start
command.
Example 5-95 SVC PPRC consistency group
master_cluster_name B8_8G4
aux_cluster_id 0000020060A0469E
aux_cluster_name B12_4F2
primary
state idling
relationship_count 4
freeze_time
status
sync out_of_sync
copy_type metro
RC_rel_id 0
RC_rel_name svc_disk2
RC_rel_id 1
RC_rel_name svc_disk3
RC_rel_id 2
RC_rel_name svc_disk4
RC_rel_id 3
RC_rel_name svc_disk5
Next, restart the svc_metro consistency group now that the RG_sitea resource group is
online on the svc_siteb site and accessing VDisks from the SAN Volume Controller cluster B12_4F2.
For the svc_metro consistency group, the SAN Volume Controller cluster B12_4F2 is the auxiliary,
and the SAN Volume Controller cluster B8_8G4 is the master. Restart the copy from the SAN Volume
Controller cluster B12_4F2 to the SAN Volume Controller cluster B8_8G4 (Example 5-96).
Example 5-96 Restarting the consistency group svc_metro
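A sketch of the restart, following the startrcconsistgrp pattern from the hacmp.out instructions; because the resource group is now served from the auxiliary half, the copy is restarted with -primary aux (10.12.5.55 is the B8_8G4 address from the cllssvc output):

ssh admin@10.12.5.55 svctask startrcconsistgrp -force -primary aux svc_metro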
Now, after you confirm that the svc_metro consistency group is in the consistent_synchronized
state and that cluster services are running on the nodes at the svc_sitea site, use C-SPOC to
move the RG_sitea resource group to the svcxd_a1 node (Figure 5-41).
[svcxd_b1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
Now the resource group is moved to the svc_sitea site, and the svc_metro consistency group
is changed from primary aux to primary master (that is, the copy direction is now from
B8_8G4 to B12_4F2) (Example 5-97).
Example 5-97 SVC PPRC consistency group
               ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
After stopping the replication links between the SAN Volume Controller clusters B8_8G4 and
B12_4F2, the consistency group state changed. Example 5-98 shows the idling_disconnected
state of the svc_global consistency group on the SAN Volume Controller cluster B12_4F2.
In this state, the VDisks in this half of the consistency group are all operating in the
primary role (on B12_4F2) and can accept read and write I/O operations.
Example 5-98 SVC PPRC consistency group
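A sketch of the query that returns this status, using the same lsrcconsistgrp syntax as the other examples (assuming that the cluster name B12_4F2 resolves as a host name, as B8_8G4 does in Example 5-84):

ssh admin@B12_4F2 svcinfo lsrcconsistgrp svc_global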
To simulate a site failure in the nodes from the svc_siteb site, run the reboot -q command
(Example 5-100).
Example 5-100 Simulating failure of a node
[svcxd_b1][/]> reboot -q
[svcxd_b2][/]> reboot -q
You can see that the RG_siteb resource group is in the ONLINE state on the svcxd_a2 node at the
svc_sitea site (Figure 5-44).
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       OFFLINE              svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               ONLINE               svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
In the SAN Volume Controller cluster B8_8G4, the svc_global consistency group automatically
changed state from consistent_disconnected to idling_disconnected (Example 5-101).
Example 5-101 SVC PPRC consistency group
name svc_global
master_cluster_id 0000020060A0469E
master_cluster_name B12_4F2
aux_cluster_id 0000020064009B10
aux_cluster_name
primary master
state idling_disconnected
relationship_count 3
freeze_time
status
sync
copy_type global
RC_rel_id 4
RC_rel_name svc_disk6
RC_rel_id 5
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
After reconnecting all the physical PPRC links, the state of the svc_global consistency group
changes to idling in both SAN Volume Controller clusters (Example 5-102).
Example 5-102 SVC PPRC consistency group status
status
sync out_of_sync
copy_type global
RC_rel_id 4
RC_rel_name svc_disk6
RC_rel_id 5
RC_rel_name svc_disk7
RC_rel_id 6
RC_rel_name svc_disk8
Next, restart the svc_global consistency group in the correct direction now that the RG_siteb
resource group is online on svc_sitea and accessing VDisks from the SAN Volume Controller
cluster B8_8G4.
For the svc_global consistency group, the SAN Volume Controller cluster B8_8G4 is the auxiliary
and the SAN Volume Controller cluster B12_4F2 is the master. Restart the copy from the SAN Volume
Controller cluster B8_8G4 to the SAN Volume Controller cluster B12_4F2 (Example 5-103).
Example 5-103 Restarting the consistency group svc_global
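A sketch of the restart command, following the same startrcconsistgrp pattern; the copy is restarted with -primary aux because the resource group is running on the auxiliary half (B8_8G4), and 10.114.63.250 is the B12_4F2 address from the cllssvc output:

ssh admin@10.114.63.250 svctask startrcconsistgrp -force -primary aux svc_global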
After you confirm that the svc_global consistency group is in the consistent_synchronized state
and that cluster services are running on the nodes at the svc_siteb site, use C-SPOC to
move the RG_siteb resource group to the svcxd_b1 node (Figure 5-45).
[svcxd_a1][/]> clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               svcxd_a1@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_b1@svc_s

RG_siteb       ONLINE               svcxd_b1@svc_s
               OFFLINE              svcxd_b2@svc_s
               OFFLINE              svcxd_a2@svc_s
               OFFLINE              svcxd_a1@svc_s
After the resource group moves to the svc_siteb site, check that the svc_global consistency
group changes from primary aux to primary master (that is, the copy direction is now from
B12_4F2 to B8_8G4) (Example 5-104).
Example 5-104 SVC PPRC consistency group status
empty
This state applies only to consistency groups. A consistency group has no relationship
and no other state information to show. It is entered when a consistency group is first
created. It is exited when the first relationship is added to the consistency group, at which
point the state of the relationship becomes the state of the consistency group.
To view the state of a specific relationship that PowerHA manages through the configured
resource group, run the command that is shown in Example 5-106.
Example 5-106 SVC PPRC relationship
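A sketch of such a query for one of the managed relationships (svc_disk2 is used only as an illustration), using the svcinfo lsrcrelationship command:

ssh admin@B8_8G4 svcinfo lsrcrelationship svc_disk2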
primary master
consistency_group_id 0
consistency_group_name svc_metro
state consistent_synchronized
bg_copy_priority 50
progress
freeze_time
status online
sync
To view the state of all consistency groups that PowerHA manages through the configured
resource groups, run the command that is shown in Example 5-107.
Example 5-107 SVC PPRC consistency group status
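A sketch of the corresponding queries for the two managed consistency groups:

ssh admin@B8_8G4 svcinfo lsrcconsistgrp svc_metro
ssh admin@B8_8G4 svcinfo lsrcconsistgrp svc_global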
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvc
B8_8G4
B12_4F2
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvc -a
#SVCNAME ROLE SITENAME IPADDR IPADDR2 RPARTNER
B8_8G4 Master svc_sitea 10.12.5.55 B12_4F2
B12_4F2 Master svc_siteb 10.114.63.250 B8_8G4
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvc -n B8_8G4
#SVCNAME ROLE SITENAME IPADDR IPADDR2 RPARTNER
B8_8G4 Master svc_sitea 10.12.5.55 B12_4F2
The cllsrelationship command shows information about all SVC PPRC relationships or a
specific PPRC relationship.
cllsrelationship [-n <relationship_name>] [-c] [-a] [-h]
If no resource name is specified, the names of all PPRC resources that are defined are listed.
If the -a flag is provided, full information about all PPRC relationships is displayed. If a
specific relationship is provided by using the -n flag, only information about this relationship is
displayed. The -c flag displays information in a colon-delimited format. The -h flag turns off
the display of column headers. Example 5-110 shows the results of running the
cllsrelationship command.
Example 5-110 Results of the cllsrelationship command
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllsrelationship
svc_disk2
svc_disk3
svc_disk4
svc_disk5
svc_disk6
svc_disk7
svc_disk8
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllsrelationship -a
relationship_name   MasterVdisk_info           AuxiliaryVdisk_info
svc_disk2           svc_haxd0001@B8_8G4        haxd_svc_v0001@B12_4F2
svc_disk3           svc_haxd0002@B8_8G4        haxd_svc_v0002@B12_4F2
svc_disk4           svc_haxd0003@B8_8G4        haxd_svc_v0003@B12_4F2
svc_disk5           svc_haxd0004@B8_8G4        haxd_svc_v0004@B12_4F2
svc_disk6           haxd_svc_v0005@B12_4F2     svc_haxd0005@B8_8G4
svc_disk7           haxd_svc_v0006@B12_4F2     svc_haxd0006@B8_8G4
svc_disk8           haxd_svc_v0007@B12_4F2     svc_haxd0007@B8_8G4
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> ./cllssvcpprc
svc_metro
svc_global
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> cllssvcpprc -a
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                            CopyType  RecoveryAction
svc_metro               B8_8G4         B12_4F2           svc_disk2 svc_disk3 svc_disk4 svc_disk5  METRO     MANUAL
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8            GLOBAL    AUTO
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/cmds]> cllssvcpprc -n svc_global
svcpprc_consistencygrp  MasterCluster  AuxiliaryCluster  relationships                            CopyType  RecoveryAction
svc_global              B12_4F2        B8_8G4            svc_disk6 svc_disk7 svc_disk8            GLOBAL    AUTO
The cl_verify_svcpprc_config command verifies the SAN Volume Controller definitions in the
PowerHA configuration. This command is stored in the
/usr/es/sbin/cluster/svcpprc/utils directory (Example 5-112).
Example 5-112 Results of the cl_verify_svcpprc_config command
[svcxd_a1][/usr/es/sbin/cluster/svcpprc/utils]> ./cl_verify_svcpprc_config
Verifying HACMP-SVCPPRC configuration...
cl_verify_svcpprc_config: Checking available nodes
cl_verify_svcpprc_config: Retrieving disk information from node svcxd_a1
cl_verify_svcpprc_config: Retrieving disk information from node svcxd_a2
cl_verify_svcpprc_config: Retrieving disk information from node svcxd_b1
cl_verify_svcpprc_config: Retrieving disk information from node svcxd_b2
cl_verify_svcpprc_config: Checking available SVCs
cl_verify_svcpprc_config: Checking license, release level and disk map for SVC B8_8G4 at 10.12.5.55
cl_verify_svcpprc_config: Checking license, release level and disk map for SVC B12_4F2 at 10.114.63.250
cl_verify_svcpprc_config: Checking consistency groups
cl_verify_svcpprc_config: Checking consistency group svc_metro
cl_verify_svcpprc_config: Checking consistency group svc_global
cl_verify_svcpprc_config: Checking resource groups.
cl_verify_svcpprc_config: Checking resource group RG_sitea
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_a1 for resource group RG_sitea
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_a2 for resource group RG_sitea
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_b2 for resource group RG_sitea
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_b1 for resource group RG_sitea
cl_verify_svcpprc_config: Checking volume group siteametrovg in group RG_sitea on site svc_sitea
cl_verify_svcpprc_config: Checking volume group siteametrovg in group RG_sitea on site svc_siteb
cl_verify_svcpprc_config: Checking resource group RG_siteb
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_b1 for resource group RG_siteb
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_b2 for resource group RG_siteb
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_a2 for resource group RG_siteb
cl_verify_svcpprc_config: Checking SVC virtual disks on node svcxd_a1 for resource group RG_siteb
cl_verify_svcpprc_config: Checking volume group sitebglobalvg in group RG_siteb on site svc_sitea
cl_verify_svcpprc_config: Checking volume group sitebglobalvg in group RG_siteb on site svc_siteb
cl_verify_svcpprc_config: Verifying consistency groups against the SVC configuration
cl_verify_svcpprc_config: Establishing consistency group svc_metro
cl_verify_svcpprc_config: WARNING: Consistency Group svc_metro already exists
cl_verify_svcpprc_config: Verifying relationships for consistency group svc_metro
cl_verify_svcpprc_config: Verifying relationship svc_disk2 in consistency group svc_metro
cl_verify_svcpprc_config: Relationship svc_disk2 already exists for consistency group svc_metro
cl_verify_svcpprc_config: Verifying relationship svc_disk3 in consistency group svc_metro
cl_verify_svcpprc_config: Relationship svc_disk3 already exists for consistency group svc_metro
cl_verify_svcpprc_config: Verifying relationship svc_disk4 in consistency group svc_metro
cl_verify_svcpprc_config: Relationship svc_disk4 already exists for consistency group svc_metro
cl_verify_svcpprc_config: Verifying relationship svc_disk5 in consistency group svc_metro
cl_verify_svcpprc_config: Relationship svc_disk5 already exists for consistency group svc_metro
cl_verify_svcpprc_config: Establishing consistency group svc_global
cl_verify_svcpprc_config: WARNING: Consistency Group svc_global already exists
cl_verify_svcpprc_config: Verifying relationships for consistency group svc_global
cl_verify_svcpprc_config: Verifying relationship svc_disk6 in consistency group svc_global
cl_verify_svcpprc_config: Relationship svc_disk6 already exists for consistency group svc_global
cl_verify_svcpprc_config: Verifying relationship svc_disk7 in consistency group svc_global
cl_verify_svcpprc_config: Relationship svc_disk7 already exists for consistency group svc_global
cl_verify_svcpprc_config: Verifying relationship svc_disk8 in consistency group svc_global
cl_verify_svcpprc_config: Relationship svc_disk8 already exists for consistency group svc_global
HACMP-SVCPPRC configuration verified successfully. Status=0
After the successful verification of the SAN Volume Controller configuration, the command
establishes all the SAN Volume Controller relationships that are defined to PowerHA on the
SAN Volume Controller clusters and adds them to the corresponding consistency groups.
Chapter 6.
Configuring PowerHA
SystemMirror Enterprise Edition
with ESS/DS Metro Mirror
This chapter explains how to install, configure, and use the IBM PowerHA SystemMirror
Enterprise Edition Metro Mirroring with disk system command-line interface (DSCLI)
management. This procedure is accomplished by using Peer-to-Peer Remote Copy (PPRC)
to maintain copies of data between sites on a set of paired disks. The PowerHA resource
group contains information about the volume pairs and the paths to communicate between
sites. PowerHA uses dscli commands to associate the PPRC-mirrored volumes to the active
site. Metro Mirroring supports only synchronous writes between the sites.
This chapter provides examples of implementing the setup, testing various failover scenarios,
and the addition and removal of disks from the cluster.
The chapter includes the following sections:
Planning
Software requirements
Considerations and restrictions
Environment example
Installing and configuring Metro Mirroring
Test scenarios
Adding and removing LUNs
Commands for troubleshooting or gathering information
PowerHA Enterprise Edition: SPPRC DSCLI security enhancements
6.1 Planning
Before you configure the DSCLI metro mirroring environment, you must decide the following
items:
Nodes and sites to be used
Networks, including XD_ip networks between sites and local networks
The Copy Services Servers (CSS) to use from both sites
The disks to use
The vpaths to use, including volume IDs
The PPRC replicated resources
The port pairs for the PPRC paths
The volume pairs
The volume groups that are managed by the PPRC replicated resources and which
resource groups they will be a part of
Reference: For more planning information, see the HACMP for AIX 6.1 Planning and
Administration Guide, SC23-4863.
Figure 6-1 shows the test environment: nodes DS_A1 and DS_A2 at Site A and DS_B1 and DS_B2 at
Site B are connected over a WAN by XD_DATA networks and by local disk heartbeat (disk_hb)
networks. The DS8000 storage units DS8K_B8 (SAN_B8) and DS8K_B12 (SAN_B12) replicate over FCP
PPRC links from the source LSS to the target LSS, and each site has a CSS server (Storage HMC),
HMC_B8 and HMC_B12.
CSSs per site: The environment in Figure 6-1 shows one CSS (Storage HMC) per site.
However, you can also configure two CSSs per site for redundancy.
The cluster topology diagram shows nodes DS_A1 and DS_A2 at Site A and DS_B1 and DS_B2 at
Site B. The nodes are connected by the net_xd_ip_01 XD_IP network (DS_A1_xdip 10.12.5.38,
DS_A2_xdip 10.12.5.39, DS_B1_xdip 10.114.124.75, DS_B2_xdip 10.114.124.76), by the local
net_ether_01 networks with boot and service IP labels (DS_A1_boot 192.168.8.105, DS_A2_boot
192.168.8.106, DS_B1_boot 192.168.12.105, DS_B2_boot 192.168.12.106), and by the disk
heartbeat networks net_diskhb_01 (nancy_A) and net_diskhb_02 (nancy_B). The resource groups
Zachary, Gabriel, and Benjamin run on these nodes.
The following configuration information is used when testing our scenarios for this chapter.
DS_A1_service: 192.168.100.55
DS_B2_service: 10.10.12.116
DS_B1_service: 10.10.12.115
DS_A2_service: 192.168.100.95
DS_A1_boot: 192.168.8.105
NODE DS_A2:
Network net_XD_ip_01
DS_A2_xdip: 10.12.5.39
Network net_diskhb_01
hbdiska2: /dev/hdisk8
Network net_diskhb_02
Network net_ether_01
DS_A1_service: 192.168.100.55
DS_B2_service: 10.10.12.116
DS_B1_service: 10.10.12.115
DS_A2_service: 192.168.100.95
DS_A2_boot: 192.168.8.106
NODE DS_B1:
Network net_XD_ip_01
DS_B1_xdip: 10.114.124.75
Network net_diskhb_01
Network net_diskhb_02
hbdiskb1: /dev/hdisk9
Network net_ether_01
DS_A1_service: 192.168.100.55
DS_B2_service: 10.10.12.116
DS_B1_service: 10.10.12.115
DS_A2_service: 192.168.100.95
DS_B1_boot: 192.168.12.105
NODE DS_B2:
Network net_XD_ip_01
DS_B2_xdip: 10.114.124.76
Network net_diskhb_01
Network net_diskhb_02
hbdiskb2: /dev/hdisk9
Network net_ether_01
DS_A1_service: 192.168.100.55
DS_B2_service: 10.10.12.116
DS_B1_service: 10.10.12.115
DS_A2_service: 192.168.100.95
DS_B2_boot: 192.168.12.106
Site A
path0   21-T1-01[FC]   fscsi0
path0   21-T1-01[FC]   fscsi0
path0   21-T1-01[FC]   fscsi0
path3   31-T1-01[FC]   fscsi1

Site B
hdisk2  path3   03-08-02[FC]   fscsi0   75854612001   IBM 2107-900
hdisk3  path0   03-08-02[FC]   fscsi0   75854612002   IBM 2107-900
hdisk4  path3   03-08-02[FC]   fscsi0   75854612003   IBM 2107-900
hdisk9  path3   03-08-02[FC]   fscsi0   75854613004   IBM 2107-900
Alternatively, you can also use the lscfg -vl <hdisk> command to see this information
(Figure 6-4).
Site A
> lscfg -vl hdisk3 |grep Serial
Serial Number...............75BALB18002
Site B
> lscfg -vl hdisk3 | grep Serial
Serial Number...............75854612002
Figure 6-4 Results of the lscfg command to see the hdisk serial number
2. Modify the default dscli.profile file on both sites to simplify the dscli
commands.
Attention: If the default dscli.profile file has the user name and password
uncommented, PPRC verification fails with a message similar to the following example:
spprc_verify_config[1621] dspmsg -s 7 pprc.cat 999 spprc_verify_config:
ERROR 10.12.6.17~~IBM.2107-75BALB1 does not match the HMC storage id for
PPRC replicated resource.
Rename the default /opt/ibm/dscli/profile/dscli.profile file if you entered a user
name and a password after you run the rmpprc command.
a. Edit the /opt/ibm/dscli/profile/dscli.profile file or create your own. If you create
your own profile, when you run the dscli command, use the -cfg <profilename>
option.
b. Uncomment and change the hmc1, hmc2, username, password, devid, and
remotedevid (Example 6-1).
Example 6-1 Changing the dscli profile fields
hmc1: 10.12.6.17
username: dinoadm
password: ds8kitso
devid: IBM.2107-75BALB1
remotedevid: IBM.2107-7585461
3. Get the worldwide node name on the remote site:
dscli lssi
Figure 6-5 shows the results of running this command.
root@DS_575_1_B1 /opt/ibm/dscli/profile > dscli -cfg default.dscli.profile
dscli> lssi
Name   ID                Storage Unit      Model  WWNN              State   ESSNet
=============================================================================
ess11  IBM.2107-7585461  IBM.2107-7585460  922    5005076303FFC4D2  Online  Enabled
dscli>
Figure 6-5 Results of the dscli lssi command
4. Get the local and attached ports associated with the WWNN. Use the dscli
lsavailpprcport command on the local site. Use the remote WWNN and the targetLSS
and sourceLSS (Figure 6-6).
root@DS_550_1_A1 /opt/ibm/dscli/profile > dscli -cfg dscli.profile.admin
dscli> lsavailpprcport -remotewwnn 5005076303FFC4D2 80:20
Local Port Attached Port Type
=============================
I0002       I0102          FCP
I0002       I0132          FCP
I0133       I0102          FCP
I0133       I0132          FCP
Figure 6-6 dscli lsavailpprcport by using dscli.profile.admin
5. Create a temporary PPRC relationship between the sites to mirror the data to the remote
secondary site.
a. Use the mkpprcpath command to create the path (Figure 6-7).
/opt/ibm/dscli/dscli mkpprcpath -srclss 80 -tgtlss 20 -remotewwnn 5005076303FFC4D2 I0002:I0102
/opt/ibm/dscli/dscli mkpprcpath -srclss 81 -tgtlss 30 -remotewwnn 5005076303FFC4D2 I0133:I0132
Figure 6-7 Results of the dscli command mkpprcpath by using dscli.profile.
b. Use the mkpprc command to map the PPRC. In our example in Figure 6-8, we have
eight disks.
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8001:2001
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8002:2002
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8003:2003
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8004:2004
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8101:3001
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8102:3002
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8103:3003
/opt/ibm/dscli/dscli mkpprc -type mmir -mode full 8104:3004
c. Use the lspprc command to monitor the replication process and status. The Full
Duplex state indicates that it is replicated. Copy Pending means that it is still mirroring
data. The dscli lspprc command with the -l option shows the tracks that are not in
sync (Figure 6-9).
/opt/ibm/dscli/dscli lspprc -l 8001-8004:2001-2004 8101-8104:3001-3004
ID         State
8001:2001  Full Duplex
8002:2002  Copy Pending
8003:2003  Full Duplex
8004:2004  Full Duplex
8101:3001  Copy Pending
8102:3002  Copy Pending
8103:3003  Copy Pending
8104:3004  Copy Pending
The dscli lspprc command: The dscli lspprc command in Figure 6-9 shows Full
Duplex when the PPRC instance is accessible for read and write operations.
6. Create on the primary site the volume groups, logical volumes, and file systems to be
mirrored on the PPRC disks. All volume information is copied between the PPRC disk
pairs, including the pvid. The disks must have the same volume group name on both sites.
Run the following command on the other node at the primary site:
importvg -y <volumegroupname> <hdisk>
7. Remove the PPRC relationship after replication completes according to the lspprc
command shown in Figure 6-10. If the dscli.profile file was modified with a user name
and password, comment them or rename the profile. Type y when prompted to remove the
remote copy and volume pair relationship.
/opt/ibm/dscli/dscli rmpprc 8001:2001
/opt/ibm/dscli/dscli rmpprc 8002:2002
/opt/ibm/dscli/dscli rmpprc 8003:2003
/opt/ibm/dscli/dscli rmpprc 8004:2004
/opt/ibm/dscli/dscli rmpprc 8101:3001
/opt/ibm/dscli/dscli rmpprc 8102:3002
/opt/ibm/dscli/dscli rmpprc 8103:3003
/opt/ibm/dscli/dscli rmpprc 8104:3004
Figure 6-10 Results of the dscli command rmpprc with dscli.profile to remove the pprc relationship
8. On the remote site, import the volume groups on both nodes. Remember to use the same
volume group name:
importvg -y <volumegroupname> <hdisk>
3. In SMIT, select Copy Services Server Configuration → Add a Copy Services Server
(Figure 6-12 and Figure 6-13 on page 248).
Hint: We configured two Copy Services Servers (CSSs) per site for redundancy.
                                                    [Entry Fields]
* CSS Subsystem Name                                [HMC_DS8K_B8]
* CSS Site Name                                     Asite                 +
* CSS IP Address                                    [10.12.6.17]
* CSS User ID                                       [dinoadm]
* CSS Password                                      [ds8kitso]
Figure 6-12 Adding the Copy Services Server for Asite

                                                    [Entry Fields]
* CSS Subsystem Name                                [HMC_DS8K_B12]
* CSS Site Name                                     Bsite                 +
* CSS IP Address                                    [10.114.232.150]
* CSS User ID                                       [itso_user]
* CSS Password                                      [ds8kitso]
Figure 6-13 Adding the Copy Services Server for Bsite
4. In SMIT, select DSS Disk Subsystem Configuration → Add an ESS Disk Subsystem
(Figure 6-14 and Figure 6-15 on page 249).
Figure 6-14 shows the Add an ESS Disk Subsystem panel for site A; its entry fields contain
HMC_DS8K_B8, Asite, 10.12.6.17, dinoadm, ds8kitso, IBM.2107-75BALB1, and the associated CSS
HMC_DS8K_B8.
IP address: IP address 10.12.6.17 is the IP address of the CSS that was configured in
Figure 6-12 on page 248.
                                                    [Entry Fields]
* PPRC Resource Name                                [Zachary_pprc]
* PPRC Site List                                    [Asite Bsite]             +
* Volume Pairs                                      [8001->2001]              +
* LSS Pair                                          [80 20]                   +
* PPRC Type                                         mmir                      +
* PriSec Port IDs                                   [I0002->I0102]
* SecPri Port IDs                                   [I0102->I0002]
* Link Type                                         FCP                       +
* Crit Mode                                         OFF                       +
* Recovery Action                                   MANUAL                    +
* Volume Group                                      [zachary]
                                                    [Entry Fields]
* PPRC Resource Name                                [Gabriel_pprc]
* PPRC Site List                                    [Asite Bsite]             +
* Volume Pairs                                      [8003->2003 8004->2004]
* LSS Pair                                          [80 20]                   +
* PPRC Type                                         mmir                      +
* PriSec Port IDs                                   [I0002->I0102]
* SecPri Port IDs                                   [I0102->I0002]
* Link Type                                         FCP                       +
* Crit Mode                                         OFF                       +
* Recovery Action                                   AUTOMATED                 +
* Volume Group                                      [gabriel]
                                                    [Entry Fields]
* PPRC Resource Name                                [Benjamin_pprc]
* PPRC Site List                                    [Asite Bsite]             +
* Volume Pairs                                      [8101->3001 8102->3002]
* LSS Pair                                          [81 30]                   +
* PPRC Type                                         mmir                      +
* PriSec Port IDs                                   [I0133->I0132]
* SecPri Port IDs                                   [I0132->I0133]
* Link Type                                         FCP                       +
* Crit Mode                                         OFF                       +
* Recovery Action                                   AUTOMATED                 +
* Volume Group                                      [benjamin]
  Resource Group Name                               Zachary
  Inter-site Management Policy                      Online On Either Site     +
* Participating Nodes from Primary Site             [DS_A1 DS_A2]             +
  Participating Nodes from Secondary Site           [DS_B1 DS_B2]             +
  Failback Policy                                   Never Failback            +
The Gabriel resource contains the PPRC resource Gabriel_pprc. The volume group
gabriel is to start with the primary site Asite on node DS_A2 (Figure 6-20).
  Resource Group Name                               Gabriel
  Inter-site Management Policy                      Online On Either Site     +
* Participating Nodes from Primary Site             [DS_A2 DS_A1]             +
  Participating Nodes from Secondary Site           [DS_B2 DS_B1]             +
  Startup Policy                                    Online On Home Node Only  +
  Failover Policy                                   Failover To Next Priority Node > +
  Failback Policy                                   Never Failback            +
The Benjamin resource contains the PPRC resource Benjamin_pprc. The volume group
benjamin is to start with the primary site Bsite on node DS_B2 (Figure 6-21).
  Resource Group Name                               Benjamin
  New Resource Group Name                           []
  Inter-site Management Policy                      Prefer Primary Site       +
* Participating Nodes from Primary Site             [DS_B2 DS_B1]             +
  Participating Nodes from Secondary Site           [DS_A2 DS_A1]             +
  Startup Policy                                    Online On Home Node Only  +
  Failover Policy                                   Failover To Next Priority Node > +
  Failback Policy                                   Never Failback            +
6.5.6 Configuring the resources and attributes for the resource group
Run the smitty hacmp command. Then, select Extended Configuration → Extended
Resource Configuration → HACMP Extended Resource Group Configuration →
Change/Show Resources and Attributes for a Resource Group to update the resource
groups with the extra required information, such as volume groups, service IP addresses, and
PPRC resources (Figure 6-22, Figure 6-23, and Figure 6-24).
  Service IP Labels/Addresses                       [DS_A1_service DS_B1_service]  +
  Volume Groups                                     [zachary]                      +
  PPRC Replicated Resources                         [Zachary_pprc]                 +
Figure 6-22 Changes for the Zachary resource group
  Volume Groups                                     [gabriel]                      +
  PPRC Replicated Resources                         [Gabriel_pprc]                 +
Figure 6-23 Changes for the Gabriel resource group
Moving a resource group from one site to another by using the C-SPOC commands
Halting both nodes on one site
The loss of local disk storage on one site
The loss of the PPRC connection between sites
The loss of the XD_ip network
Total site failure, loss of PPRC, and XD_ip networks
/ > clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
Zachary        ONLINE               DS_A1@Asite
               OFFLINE              DS_A2@Asite
               OFFLINE              DS_B1@Bsite
               ONLINE SECONDARY     DS_B2@Bsite

Gabriel        ONLINE               DS_A2@Asite
               OFFLINE              DS_A1@Asite
               ONLINE SECONDARY     DS_B2@Bsite
               OFFLINE              DS_B1@Bsite

Benjamin       ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite
               ONLINE SECONDARY     DS_A2@Asite
               OFFLINE              DS_A1@Asite
2. Use SMIT C-SPOC to move the resource. Enter smitty cspoc. Then, select Resource
Groups and Applications → Move a Resource Group to Another Node / Site →
Move Resource Groups to Another Site. Select the ONLINE resource that you want
to move. In our example, Zachary is seen ONLINE on DS_A1 at the Asite and the
ONLINE SECONDARY is DS_B2 at the Bsite (Example 6-3).
Example 6-3 Moving a resource group to another site
Node(s) / Site
  DS_A1 / Asite
  DS_B2 / Bsite
  DS_A2 / Asite
  DS_B2 / Bsite
  DS_B2 / Bsite
  DS_A2 / Asite
While the move is running, the clRGinfo command shows an intermediate state of
ACQUIRING (Example 6-4).
Example 6-4 Results of the clRGinfo command, intermediate state when moving a resource group
Zachary        ONLINE SECONDARY     DS_A1@Asite
               OFFLINE              DS_A2@Asite
               OFFLINE              DS_B1@Bsite
               ACQUIRING            DS_B2@Bsite
Upon completion of the resource move, the clRGinfo command shows that the resource is now
ONLINE on the Bsite and that a node at the Asite is now ONLINE SECONDARY (Example 6-5).
Example 6-5 Results of the clRGinfo command, resource moved to another site
Zachary        ONLINE SECONDARY     DS_A1@Asite
               OFFLINE              DS_A2@Asite
               OFFLINE              DS_B1@Bsite
               ONLINE               DS_B2@Bsite
clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
Zachary        ONLINE               DS_A1@Asite
               OFFLINE              DS_A2@Asite
               ONLINE SECONDARY     DS_B1@Bsite
               OFFLINE              DS_B2@Bsite

Gabriel        ONLINE               DS_A2@Asite
               OFFLINE              DS_A1@Asite
               ONLINE SECONDARY     DS_B2@Bsite
               OFFLINE              DS_B1@Bsite

Benjamin       ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite
               ONLINE SECONDARY     DS_A2@Asite
               OFFLINE              DS_A1@Asite
2. Enter the halt -q command on both nodes DS_A1 and DS_A2 simultaneously.
The resource groups move to the ONLINE SECONDARY nodes when the nodes at the
Asite are unresponsive. The PPRC disks are placed into a suspended state on the remote
site that is taking over the resources (Example 6-7).
Example 6-7 Results of the dscli lspprc 2001-30ff command on Bsite
2001:8001  Suspended  Host Source  Metro Mirror  20  60  Disabled  Invalid
2003:8003  Suspended  Host Source  Metro Mirror  20  60  Disabled  Invalid
2004:8004  Suspended  Host Source  Metro Mirror  20  60  Disabled  Invalid
When the Asite nodes are restarted and PowerHA is started, the Bsite can take ownership
of the disks, as shown in Example 6-8 by the output of the dscli lspprc 2001-30ff
command.
Example 6-8 Results of the dscli lspprc command on Bsite
               OFFLINE              DS_A2@Asite
               ONLINE SECONDARY     DS_A1@Asite
               ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite

Benjamin       ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite
               OFFLINE              DS_A2@Asite
               ONLINE SECONDARY     DS_A1@Asite

Zachary        ONLINE               DS_A1@Asite
               OFFLINE              DS_A2@Asite
               ONLINE SECONDARY     DS_B1@Bsite
               OFFLINE              DS_B2@Bsite

Gabriel        ONLINE               DS_A2@Asite
               OFFLINE              DS_A1@Asite
               OFFLINE              DS_B2@Bsite
               ONLINE SECONDARY     DS_B1@Bsite

Benjamin       ONLINE SECONDARY     DS_B2@Bsite
               OFFLINE              DS_B1@Bsite
               OFFLINE              DS_A2@Asite
               ONLINE               DS_A1@Asite
dscli> lsvolgrp
haxd_ds8k_a1   V6   SCSI Mask
haxd_ds8k_a2   V7   SCSI Mask
dscli> showvolgrp V6
Name haxd_ds8k_a1
ID   V6
Type SCSI Mask
Vols 8001 8002 8003 8004 8101 8102 8103 8104 A002 A102
dscli> showvolgrp V7
Name haxd_ds8k_a2
ID   V7
Type SCSI Mask
Vols 8001 8002 8003 8004 8101 8102 8103 8104 A003 A103
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
Zachary        ONLINE SECONDARY     DS_A1@Asite
               OFFLINE              DS_A2@Asite
               ONLINE               DS_B1@Bsite
               OFFLINE              DS_B2@Bsite

Gabriel        ONLINE SECONDARY     DS_A2@Asite
               OFFLINE              DS_A1@Asite
               ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite

Benjamin       ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite
               ONLINE SECONDARY     DS_A2@Asite
               OFFLINE              DS_A1@Asite
Failover: In Example 6-11 on page 256, if PowerHA is unable to read the disks (the
disks are missing), it performs a failover.
From the Bsite, the disks are now available for writes with a Full Duplex state
(Example 6-12).
Example 6-12 The dscli lspprc on Bsite
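A sketch of the query that is run from a Bsite node to confirm the state, using the same lspprc syntax as Example 6-7:

/opt/ibm/dscli/dscli lspprc -l 2001-30ff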
1. Observe where the resource groups are currently ONLINE. In Example 6-14, Zachary is
on DS_A1, Gabriel is on DS_A1, and Benjamin is on DS_A1. All resources are on Asite
DS_A1 for the simplicity of this test.
Example 6-14 clRGinfo before stopping XD_ip networks
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
Zachary        ONLINE               DS_A1@Asite
               OFFLINE              DS_A2@Asite
               ONLINE SECONDARY     DS_B1@Bsite
               OFFLINE              DS_B2@Bsite

Gabriel        OFFLINE              DS_A2@Asite
               ONLINE               DS_A1@Asite
               OFFLINE              DS_B2@Bsite
               ONLINE SECONDARY     DS_B1@Bsite

Benjamin       OFFLINE              DS_B2@Bsite
               ONLINE SECONDARY     DS_B1@Bsite
               OFFLINE              DS_A2@Asite
               ONLINE               DS_A1@Asite
2. With en1 being the XD_ip interface on both DS_A1 and DS_A2, enter the
ifconfig en1 down detach command on both nodes.
3. Observe that from Asite the resources are all ONLINE on Asite and the disks are in a
suspended state (Example 6-15 and Example 6-16 on page 258).
Example 6-15 clRGinfo command issued on DS_A1
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
Zachary        ONLINE               DS_A1@Asite
               OFFLINE              DS_A2@Asite
               OFFLINE              DS_B1@Bsite
               OFFLINE              DS_B2@Bsite

Gabriel        OFFLINE              DS_A2@Asite
               ONLINE               DS_A1@Asite
               OFFLINE              DS_B2@Bsite
               OFFLINE              DS_B1@Bsite

Benjamin       OFFLINE              DS_B2@Bsite
               OFFLINE              DS_B1@Bsite
               OFFLINE              DS_A2@Asite
               ONLINE               DS_A1@Asite

8004:2004  Suspended   60
8101:3001  Suspended   60
4. Observe that on the Bsite, the resources are all on the Bsite and the disks are all
suspended (Example 6-17 and Example 6-18).
Example 6-17 clRGinfo command issued on DS_B1
-----------------------------------------------------------------------------
Group Name     Group State          Node
-----------------------------------------------------------------------------
Zachary        OFFLINE              DS_A1@Asite
               OFFLINE              DS_A2@Asite
               ONLINE               DS_B1@Bsite
               OFFLINE              DS_B2@Bsite

Gabriel        OFFLINE              DS_A2@Asite
               OFFLINE              DS_A1@Asite
               ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite

Benjamin       ONLINE               DS_B2@Bsite
               OFFLINE              DS_B1@Bsite
               OFFLINE              DS_A2@Asite
               OFFLINE              DS_A1@Asite
> lspv
hdisk0          000fe41108faf388          rootvg          active
hdisk1          000fe41168e0be14          None
hdisk2          000fe4112498bb13          zachary         active
hdisk3          000fe4112498bb7a          None

> pcmpath query essmap | grep hdisk3
hdisk3 path0 21-T1-01[FC] fscsi0 75BALB18002 IBM 2107-900 51.2GB 80 2 0000 0e Y R1-B1-H1-ZC   2 RAID5
hdisk3 path1 31-T1-01[FC] fscsi1 75BALB18002 IBM 2107-900 51.2GB 80 2 0000 0e Y R1-B2-H3-ZD 133 RAID5
2. Ensure that the resource group is ONLINE on the primary site.
3. Add the volume pair of disks to the PPRC resource. To change the PPRC resource, run
the smitty hacmp command. Select Extended Configuration → Extended Resource
Configuration → HACMP Extended Resources Configuration → PPRC-Managed
Replicated Resources Configuration → DSCLI-Managed PPRC Replicated Resource
Configuration → Change/Show a PPRC Resource.
                                                    [Entry Fields]
  PPRC Resource Name                                Zachary_pprc
  New PPRC Resource Name                            []
* PPRC Site List                                    [Asite Bsite]               +
* Volume Pairs                                      [8001->2001 8002->2002]
* ESS Pair                                          [HMC_DS8K_B8 HMC_DS8K_B12]  +
* LSS Pair                                          [80 20]                     +
* PPRC Type                                         mmir                        +
* PriSec Port IDs                                   [I0002->I0102]
* SecPri Port IDs                                   [I0102->I0002]
* Link Type                                         FCP                         +
* Crit Mode                                         OFF                         +
* Recovery Action                                   MANUAL                      +
* Volume Group                                      [zachary]
Full Duplex   Metro Mirror  80  60
Copy Pending  Metro Mirror  80  60
Full Duplex   Metro Mirror  80  60
Full Duplex   Metro Mirror  80  60
Full Duplex   Metro Mirror  81  60
Full Duplex   Metro Mirror  81  60
10. Add the physical volume to the cluster. Run the smitty cspoc command. Select
Storage → Volume Groups → Set Characteristics of a Volume Group.
11.In the Set Characteristics of Volume Group panel (Figure 6-27), select Add a Volume to a
Volume Group.
                     Set Characteristics of a Volume Group

Move cursor to desired item and press Enter.

  Add a Volume to a Volume Group
  Change/Show characteristics of a Volume Group
  Remove a Volume from a Volume Group
  Enable/Disable a Volume Group for Cross-Site LVM Mirroring Verification

  +--------------------------------------------------------------------------+
  |                          Physical Volume Names                           |
  |                                                                          |
  |  Move cursor to desired item and press Enter.                            |
  |                                                                          |
  |    000fe4112498bb7a ( hdisk3 on all selected nodes )                     |
  |    000fe41124997862 ( hdisk7 on all selected nodes )                     |
  |                                                                          |
  |  F1=Help          F2=Refresh          F3=Cancel                          |
  |  F8=Image         F10=Exit            Enter=Do                           |
  |  /=Find           n=Find Next                                            |
  +--------------------------------------------------------------------------+
Figure 6-27 Adding hdisk3 to volume group zachary
12.In the Add a Volume to a Volume Group panel (Figure 6-28), select the volume group to
which to add the disk. Then, select the hdisk to add to the volume group.
                        Add a Volume to a Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                    [Entry Fields]
  Volume Group Name                                 zachary
  Resource Group Name                               Zachary
  Node List                                         DS_A1,DS_A2,DS_B1,DS_>
  Reference node                                    DS_A1
  Volume names                                      hdisk3
Attention: When you use CSPOC to modify a volume group, expect to see errors when
you contact the remote site. From the output of the SMIT command in Figure 6-28, we
notice the following messages:
cl_extendvg: Error executing clupdatevg zachary 000fe4112498bb13 on node DS_B1.
cl_extendvg: Error executing clupdatevg zachary 000fe4112498bb13 on node DS_B2.
13. Move the resource to the other site to do a lazy update of the volume group. Run the
smitty cspoc command. Select Resource Groups and Applications → Move a
Resource Group to Another Node / Site → Move Resource Groups to Another Site.
Select the resource group to move to the other site.
                                                    [Entry Fields]
  Volume Group Name                                 zachary
  Resource Group Name                               Zachary
  Node List                                         DS_A1,DS_A2,DS_B1,DS_B2
  Volume names                                      hdisk3
  Reference node                                    DS_A1
  FORCE deallocation of all partitions              no                        +
Attention: Expect an error from the execution of the cspoc command in Figure 6-29
when it tries to reduce the volume group on the other site:
cl_reducevg: Error executing clupdatevg zachary 000fe4112498bb13 on node DS_B2.
cl_reducevg: Error executing clupdatevg zachary 000fe4112498bb13 on node DS_B1.
This behavior is normal and expected.
4. Move the resource to the other site so that the volume group is updated. Move the
resource group to the other node at the remote site to see the change.
5. Determine the volume pair to delete from the PPRC resource. The pcmpath query essmap
command is used to see the disk-to-volume mapping (Example 6-22).
Example 6-22 pcmpath query output to find volume
75BALB18002
The above output has been truncated and the 8002 is highlighted to show the
volume to be removed from the pprc resource
6. Remove the PPRC relationship. Enter the dscli rmpprc command.
7. Change the PPRC resource. Run the smitty hacmp command. Select Extended
Configuration → Extended Resource Configuration → HACMP Extended Resources
Configuration → PPRC-Managed Replicated Resources Configuration →
DSCLI-Managed PPRC Replicated Resource Configuration. Select Change/Show a
PPRC Resource.
8. In the Change/Show PPRC Resource panel, select the PPRC to change (Figure 6-30).
Change / Show PPRC Resource
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                    [Entry Fields]
  PPRC Resource Name                                Zachary_pprc
  New PPRC Resource Name                            []
* PPRC Site List                                    [Asite Bsite]               +
* Volume Pairs                                      [8001->2001 8002->2002]
* ESS Pair                                          [HMC_DS8K_B8 HMC_DS8K_B12]  +
* LSS Pair                                          [80 20]                     +
* PPRC Type                                         mmir                        +
* PriSec Port IDs                                   [I0002->I0102]
* SecPri Port IDs                                   [I0102->I0002]
* Link Type                                         FCP                         +
* Crit Mode                                         OFF                         +
* Recovery Action                                   MANUAL                      +
* Volume Group                                      [zachary]
9. Verify the PPRC replicated resource. Run the smitty hacmp command. Select
Extended Configuration → Extended Resource Configuration → HACMP
Extended Resources Configuration → PPRC-Managed Replicated Resources
Configuration → DSCLI-Managed PPRC Replicated Resource Configuration →
Verify PPRC Configuration.
10. Remove the PPRC pair. On the node where the resource group is ONLINE, use the dscli
rmpprc command (for example, dscli rmpprc 8002:2002).
11. Verify and synchronize the cluster. Run the smitty hacmp command. Then, select
Extended Configuration → Extended Verification and Synchronization.
lspprc
lskey
lssi
lsavailpprcport
lspprcpath
mkesconpprcpath
mkpprcpath
rmpprcpath
mkpprc
rmpprc
failbackpprc
failoverpprc
freezepprc
pausepprc
resumepprc
unfreezepprc
/ > /usr/es/sbin/cluster/pprc/spprc/cmds/cllsspprc -a
Resource       Sitelist     Volume Pairs            ESS Pair                  LSS Pair  PPRC Type  PriSec Port IDs  SecPri Port IDs  Link Type  Crit Mode  Recovery Action  Volume Group
Zachary_pprc   Asite Bsite  8001->2001 8002->2002   HMC_DS8K_B8 HMC_DS8K_B12  80 20     mmir       I0002->I0102     I0102->I0002     FCP        OFF        AUTOMATED        zachary
Gabriel_pprc   Asite Bsite  8003->2003 8004->2004   HMC_DS8K_B8 HMC_DS8K_B12  80 20     mmir       I0002->I0102     I0102->I0002     FCP        OFF        AUTOMATED        gabriel
Benjamin_pprc  Asite Bsite  8101->3001              HMC_DS8K_B8 HMC_DS8K_B12  81 30     mmir       I0133->I0132     I0132->I0133     FCP        OFF        AUTOMATED        benjamin

/ > /usr/es/sbin/cluster/pprc/spprc/cmds/cllscss -a
CSS Name      CLI Type  Sitename  CSS IP Address  Username   Password
HMC_DS8K_B8   DSCLI     Asite     10.12.6.17      dinoadm    ds8kitso
HMC_DS8K_B12  DSCLI     Bsite     10.114.232.150  itso_user  ds8kitso
essmap:
adapter  LUN SN       Type          Size    LSS  Vol  Rank  C/A  S
fscsi0   75BALB1A002  IBM 2107-900  10.2GB  a0   2    0000  0e   Y
fscsi1   75BALB1A002  IBM 2107-900  10.2GB  a0   2    0000  0e   Y
fscsi0   75BALB1A102  IBM 2107-900  10.2GB  a1   2    ffff  17   Y
fscsi1   75BALB1A102  IBM 2107-900  10.2GB  a1   2    ffff  17   Y
fscsi0   75BALB18001  IBM 2107-900  51.2GB  80   1    0000  0e   Y
fscsi1   75BALB18001  IBM 2107-900  51.2GB  80   1    0000  0e   Y
fscsi0   75BALB18002  IBM 2107-900  51.2GB  80   2    0000  0e   Y
fscsi1   75BALB18002  IBM 2107-900  51.2GB  80   2    0000  0e   Y
fscsi0   75BALB18003  IBM 2107-900  51.2GB  80   3    0000  0e   Y
fscsi1   75BALB18003  IBM 2107-900  51.2GB  80   3    0000  0e   Y
fscsi0   75BALB18004  IBM 2107-900  51.2GB  80   4    0000  0e   Y
fscsi1   75BALB18004  IBM 2107-900  51.2GB  80   4    0000  0e   Y
fscsi0   75BALB18101  IBM 2107-900  51.2GB  81   1    ffff  17   Y
fscsi1   75BALB18101  IBM 2107-900  51.2GB  81   1    ffff  17   Y
fscsi0   75BALB18102  IBM 2107-900  51.2GB  81   2    ffff  17   Y
fscsi1   75BALB18102  IBM 2107-900  51.2GB  81   2    ffff  17   Y
fscsi0   75BALB18103  IBM 2107-900  51.2GB  81   3    ffff  17   Y
fscsi1   75BALB18103  IBM 2107-900  51.2GB  81   3    ffff  17   Y
fscsi0   75BALB18104  IBM 2107-900  51.2GB  81   4    ffff  17   Y
fscsi1   75BALB18104  IBM 2107-900  51.2GB  81   4    ffff  17   Y
Chapter 7.
Configuring PowerHA
SystemMirror Enterprise Edition
with SRDF replication
This chapter provides a practical description of the PowerHA Enterprise Edition configuration
in an environment that uses EMC Symmetrix Remote Data Facility (SRDF) replication.
The chapter includes the following sections:
General considerations
Planning
Environment description
Installation and configuration
Test scenarios
Maintaining the cluster configuration with SRDF replicated resources
Troubleshooting PowerHA Enterprise Edition SRDF managed replicated resources
Commands for managing the SRDF environment
Resource groups that are managed by PowerHA cannot contain volume groups that mix
SRDF-protected disks with disks that are not SRDF-protected.
C-SPOC cannot perform the following LVM operations on nodes at the remote site (that
contain the target volumes):
The creation of a volume group.
Operations that require nodes at the target site to read from the target volumes. Such
operations cause an error message in CSPOC and include such functions as changing
file system size, changing the mount point, and adding LVM mirrors. However, nodes
on the same site as the source volumes can successfully perform these tasks, and the
changes are then propagated to the other site by using a lazy update.
C-SPOC operations: For C-SPOC operations to work on all other LVM operations,
perform all C-SPOC operations with the SRDF pairs in synchronized or consistent
states, and the cluster must be ACTIVE on all nodes.
SRDF has the following functional considerations:
Multihop configurations are not supported.
Mirroring to BCV devices is not supported.
Concurrent RDF configurations are not supported.
7.2 Planning
This section describes the planning considerations for integrating IBM PowerHA with
EMC SRDF.
Site A hosts the nodes xdemca1 and xdemca2 with a DMX-3 storage subsystem; Site B hosts the
nodes xdemcb1 and xdemcb2 with a DMX-4. Two XD_ip networks cross the WAN between the sites,
and the storage subsystems are connected by RDF FC links.
Figure 7-1 General overview of the PowerHA environment with SRDF replication
Each site has an EMC storage subsystem that is attached to the local nodes. Direct Matrix
Architecture (DMX) storage subsystems are connected to each other using redundant Fibre
Channel (FC) link connections.
Two TCP/IP links (networks) are dedicated to XD heartbeat between the sites. One Ethernet
adapter per node is dedicated to an XD_ip network. In our case, the same IP subnet is used
for the XD_ip network, but in a production environment, separate IP segments that are routed
between each other might be used at each site. For a production environment, configure
multiple communication paths between the sites, including an additional non-IP network in
case of routing or other TCP/IP issues between the sites. In addition to the XD_ip networks,
each site has local subnets for client access.
For our scenario, we first create a resource group with primary site A that contains a group
of SRDF relationships operating in synchronous mode (SRDF/S). Later, we add a second
resource group to the cluster configuration, with primary site B and a group of SRDF
relationships that operate in asynchronous mode (SRDF/A). We use the capability of
PowerHA Enterprise Edition to dynamically integrate a second site-related resource group in
an SRDF environment without disrupting the existing one. PowerHA Enterprise Edition
supports coexistence between SRDF replicated resources that operate in different modes.
Consider that our configuration is used only for describing the installation and configuration
steps and for testing several scenarios. In a production environment, whether to use
synchronous or asynchronous replication is determined by several factors such as distance
between sites, communication bandwidth, and application I/O pattern. For general
considerations that are related to replication options, see 1.4.1, Synchronous versus
asynchronous replication on page 22.
For illustration purposes, the resource groups that we create contain only IP labels, volume
groups, file systems, and SRDF replicated resources. Further details are provided within the
next sections.
In our environment, we use the following versions of the software on the cluster nodes:
Two EMC Symmetrix storage subsystems are used for the SRDF relationships:
DMX-3, mcode Version 5772.103.93
DMX-4, mcode Version 5773.134.94
EMCpower.base
EMCpower.consistency_grp
EMCpower.encryption
EMCpower.migration_enabler
EMCpower.mpx
Register the PowerPath software license by applying the available license keys with the
emcpreg -install command. Use the powermt check_registration command to
verify the installed keys. For information about the licensing features of the PowerPath
software, contact your EMC representative.
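As a brief sketch of this step (the license key values are omitted, and the interactive prompting behavior is an assumption; see the PowerPath documentation for details):
emcpreg -install
# enter the license keys when prompted
powermt check_registration
# displays the registered PowerPath license capabilities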
If the Symmetrix disks are already configured on the system:
a. Remove their existing definition:
rmdev -dl <hdisk#>
b. Reconfigure the AIX disk definition in the Object Data Manager (ODM):
/usr/lpp/Symmetrix/bin/emc_cfgmgr
c. Initialize the PowerPath devices:
powermt config
For more information about PowerPath installation and configuration, see EMC PowerPath
for AIX - Installation and Administration Guide, P/N 300-008-341.
2. Install the Solutions Enabler Software (SYMCLI). In our environment, we use a shared
Network File System (NFS) directory that is mounted on all hosts, containing the SYMCLI
7.0 installation media, and we run the installer script in silent mode on each cluster node
(Example 7-1).
Example 7-1 Installing the SYMCLI packages in silent mode
Verify that a license key for the BASE/Symmetrix feature is available before you run the discovery
by listing the contents of the license file. For our installation, the default location is
/var/symapi/config/symapi_licenses.dat.
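For example, you can display the installed keys directly from that file:
cat /var/symapi/config/symapi_licenses.dat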
Our PowerHA configuration with SRDF uses the following licensed features:
BASE / Symmetrix
SRDF / Symmetrix
SRDFA / Symmetrix
SRDF/CG / Symmetrix
For more information about the Symmetrix license features, contact an EMC
representative.
Run symcfg discover to start the discovery process. At the end of the process, verify the
discovered devices by running the symcfg list command (Example 7-2).
Example 7-2 Symmetrix arrays (site A view)
root@xdemca1:/>symcfg list
                          S Y M M E T R I X

                                        Mcode    Cache      Num Phys  Num Symm
SymmID        Attachment  Model         Version  Size (MB)  Devices   Devices

000190100304  Local       DMX3-24       5772     32768           294      4063
000190101983  Remote      DMX4-24       5773     65536             0      1743
Running the symcfg list command from local node xdemca1 shows that the Symmetrix
system model DMX3-24 is locally attached to our node and that the Symmetrix system
DMX4-24 is remotely attached to the DMX3. Note the Symmetrix ID for both local and
remote systems. Repeat the discovery process on each node of the cluster. Example 7-3
shows a similar view of the storage devices from site B.
Example 7-3 Symmetrix arrays (site B view)
root@xdemcb1:/>symcfg list
                          S Y M M E T R I X

                                        Mcode    Cache      Num Phys  Num Symm
SymmID        Attachment  Model         Version  Size (MB)  Devices   Devices

000190101983  Local       DMX4-24       5773     65536           259      1743
000190100304  Remote      DMX3-24       5772     32768             0      4063
In the output of Example 7-3, note the change between the local and the remote storage
subsystem when you run the list command from site B.
(Tail of the FC adapter attribute listing:)
dyntrk        no
fc_err_recov  fast_fail
scsi_id       0x70400
sw_fc_class   3
Configure the gkselect file after you install the SYMCLI package to limit the use of
gatekeepers and the number of communication paths. Use the symdev list pd command
and choose five or six gatekeepers. Limiting the number of GK paths reduces the time that is
spent querying paths when communication is lost; otherwise, the long wait times can lead to
cluster config_too_long errors. Add the selected devices to the
/var/symapi/config/gkselect configuration file on each cluster node (Example 7-5).
Example 7-5 A gkselect file
# cat /var/symapi/config/gkselect
/dev/rhdiskpower10
/dev/rhdiskpower11
/dev/rhdiskpower12
/dev/rhdiskpower13
/dev/rhdiskpower14
/dev/rhdiskpower15
To automate the lock recovery for SYMCLI if a node crash occurs, consider adding the
following options in the /var/symapi/config/daemon_options configuration file on all of the
cluster nodes:
storapid:internode_lock_recovery=enable
storapid:internode_lock_recovery_heartbeat_interval=10
storapid:internode_lock_information_export=enable
Also enable the RDF daemon on all cluster nodes by setting the following option in the
/var/symapi/config/options configuration file:
SYMAPI_USE_RDFD=ENABLE
After you update the option files, restart all daemons for the changes to take effect:
/usr/symcli/storbin/stordaemon shutdown all
/usr/symcli/storbin/stordaemon start storapid
/usr/symcli/storbin/stordaemon start storrdfd
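To confirm that the daemons are running again, you can list them. The following is a sketch that assumes the stordaemon list action is available at your Solutions Enabler level:
/usr/symcli/storbin/stordaemon list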
cluster.adt.es
cluster.doc.en_US.es
cluster.es.client
cluster.es.cspoc
cluster.es.server
cluster.es.sr
cluster.es.worksheets
cluster.license
cluster.man.en_US.es
cluster.xd.license
Figure 7-2 shows the cluster network configuration: the public network net_public_01 with the boot subnets 10.100.101/24 (segment A) and 10.100.102/24 (segment B), the XD_ip networks net_XD_ip_01 (207.18.1/24) and net_XD_ip_02 (207.18.2/24) that span the sites, and the site-local disk heartbeat networks net_diskhb_01 (hdiskpower0) and net_diskhb_02 (hdiskpower41) across nodes xdemca1, xdemca2, xdemcb1, and xdemcb2.
The IP interface names that are used for the cluster communication interfaces and service IP
labels are defined in the /etc/hosts file on each node (Example 7-6).
Example 7-6 IP interface names and service IP labels
# XD_IP SiteA
207.18.1.1    xdemca1_xdip1
207.18.2.1    xdemca1_xdip2
207.18.1.2    xdemca2_xdip1
207.18.2.2    xdemca2_xdip2
# XD_IP SiteB
207.18.1.3    xdemcb1_xdip1
207.18.2.3    xdemcb1_xdip2
207.18.1.4    xdemcb2_xdip1
207.18.2.4    xdemcb2_xdip2
#Boot IP addresses
10.100.101.1 xdemca1_boot
10.100.101.2 xdemca2_boot
10.100.102.1 xdemcb1_boot
10.100.102.2 xdemcb2_boot
# Service IP addresses
10.10.10.1 xdemcaa_sv
10.10.11.1 xdemcab_sv
10.10.11.2 xdemcbb_sv
10.10.10.2 xdemcba_sv
In addition to the IP networks, there are two disk heartbeat networks local in each site:
net_diskhb_01
net_diskhb_02
To define the cluster topology:
1. Define the cluster, which is xdemc in this example (Figure 7-3). By using SMIT menus,
enter the smitty hacmp command. Select Extended Configuration → Extended
Topology Configuration → Configure an HACMP Cluster → Add/Change/Show an
HACMP Cluster.
Add/Change/Show an HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Cluster Name                                          [xdemc]
2. Add the nodes to the cluster definition. Add both the local and the remote nodes. Ensure
that there is a communication path between all nodes in both sites (Figure 7-4). We use IP
addresses of the interfaces for inter-site communication.
Add a Node to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Node Name                                             [xdemca1]
  Communication Path to Node                            [xdemca1_xdip1]
3. Define the sites. In our environment, we use the site that is named siteA, which contains
the xdemca1 and xdemca2 nodes (Figure 7-5).
Add a Site

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Site Name                                             [siteA]
* Site Nodes                                            xdemca1 xdemca2        +
A similar operation is performed for siteB, which contains the xdemcb1 and xdemcb2
nodes (Figure 7-6).
Add a Site

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Site Name                                             [siteB]
* Site Nodes                                            xdemcb1 xdemcb2        +
Run the cluster discovery process to collect information, including the IP and disk
configuration of the nodes, which can be used later to facilitate the topology definition.
Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Network Name                                          [net_public_01]
* Network Type                                          ether
* Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Enable IP Address Takeover via IP Aliases             [Yes]
  IP Address Offset for Heartbeating over IP Aliases    []
We enable IP address takeover (IPAT) by using aliasing for this network. Later, we
define the service IP labels configurable on multiple nodes with two IP addresses, each
bound to a site.
Two XD_ip networks that are used for PowerHA heartbeating between the sites. Figure 7-8
shows an example definition of an XD_ip network that is used for our environment.
Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Network Name                                          [net_XD_ip_01]
* Network Type                                          XD_ip
* Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Enable IP Address Takeover via IP Aliases             [No]
  IP Address Offset for Heartbeating over IP Aliases    []
We disable IPAT on this network (XD_ip). Later, we define node-bound service IP labels
for this network.
+--------------------------------------------------------------------------+
|                      IP Label            IP Address                       |
|                                                                           |
| > # net_public_01 / xdemca1                                               |
|     en0              xdemca1_boot        10.100.101                       |
| > # net_public_01 / xdemca2                                               |
|     en0              xdemca2_boot        10.100.101                       |
| > # net_public_01 / xdemcb1                                               |
|     en0              xdemcb1_boot        10.100.102                       |
| > # net_public_01 / xdemcb2                                               |
|     en0              xdemcb2_boot        10.100.102                       |
+--------------------------------------------------------------------------+
Figure 7-9 Adding the communication interfaces for the public network
Now, add both IP segments in sites A and B (10.100.101/24 and 10.100.102/24) to the
same cluster network.
The XD_ip networks. We define the associated communication interface on each node.
See Figure 7-10 for an example of adding a communication interface for a node to an
XD_ip network. We apply the same operation for the XD_ip interfaces of all cluster nodes.
Add a Communication Interface

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* IP Label/Address                                      [xdemca1_xdip1]
* Network Type                                          XD_ip
* Network Name                                          net_XD_ip_01
* Node Name                                             [xdemca1]
  Network Interface                                     []
Figure 7-10 Adding the communication interfaces for the XD_ip network
For each communication interface in the XD_ip networks, define a node-bound service
IP address. Run the smitty hacmp command. Select Extended Configuration → Extended
Resource Configuration → HACMP Extended Resources Configuration → Configure
HACMP Service IP Labels/Addresses → Add a Service IP Label/Address Bound to a
Single Node. Select the node name and the IP interface name on that node for the
XD_ip network.
6. Add the non-IP communication devices. We use a disk heartbeat network at each site
(Figure 7-2 on page 276). We use a dedicated heartbeat disk, part of an
enhanced-concurrent volume group, defined locally in each site. To facilitate adding
communication devices into the cluster definition, we run the discovery process on each
node of the cluster. Figure 7-11 shows how we add the communication devices that are
associated with hdiskpower0 on nodes xdemca1 and xdemca2 at site A.
Configure HACMP Communication Interfaces/Devices
+--------------------------------------------------------------------------+
| Select Point-to-Point Pair of Discovered Communication Devices to Add     |
|                                                                           |
| Move cursor to desired item and press F7.                                 |
| ONE OR MORE items can be selected.                                        |
| Press Enter AFTER making all selections.                                  |
|                                                                           |
| [TOP]                                                                     |
|     # Node        Device         Pvid                                     |
| >     xdemca1     hdiskpower0    00cfb52d62799c35                         |
| >     xdemca2     hdiskpower0    00cfb52d62799c35                         |
|       xdemcb1     hdiskpower41   00cfb52d62799c35                         |
|       xdemcb2     hdiskpower41   00cfb52d62799c35                         |
|                                                                           |
| F1=Help      F2=Refresh     F3=Cancel                                     |
| F7=Select    F8=Image       F10=Exit                                      |
| Enter=Do     /=Find         n=Find Next                                   |
+--------------------------------------------------------------------------+
Figure 7-11 Adding the communication devices for nodes at site A
At the end of the task the cluster network definition for disk heartbeat is automatically
created.
7. Verify and synchronize the cluster definition. Example 7-7 shows the cluster topology that
is defined up to this point after you synchronize the cluster definitions across the nodes by
running the cltopinfo command.
Example 7-7 Cluster topology definition
root@xdemca1:/>cltopinfo
Cluster Name: xdemc
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 4 node(s) and 5 network(s) defined
NODE xdemca1:
    Network net_XD_ip_01
        xdemca1_xdip1    207.18.1.1
    Network net_XD_ip_02
        xdemca1_xdip2    207.18.2.1
    Network net_diskhb_01
        xdemca1_hdiskpower0_01    /dev/hdiskpower0
    Network net_diskhb_02
    Network net_public_01
        xdemca1_boot     10.100.101.1
NODE xdemca2:
    Network net_XD_ip_01
        xdemca2_xdip1    207.18.1.2
    Network net_XD_ip_02
        xdemca2_xdip2    207.18.2.2
    Network net_diskhb_01
        xdemca2_hdiskpower0_01    /dev/hdiskpower0
    Network net_diskhb_02
    Network net_public_01
        xdemca2_boot     10.100.101.2
NODE xdemcb1:
    Network net_XD_ip_01
        xdemcb1_xdip1    207.18.1.3
    Network net_XD_ip_02
        xdemcb1_xdip2    207.18.2.3
    Network net_diskhb_01
    Network net_diskhb_02
        xdemcb1_hdiskpower41_01    /dev/hdiskpower41
    Network net_public_01
        xdemcb1_boot     10.100.102.1
NODE xdemcb2:
    Network net_XD_ip_01
        xdemcb2_xdip1    207.18.1.4
    Network net_XD_ip_02
        xdemcb2_xdip2    207.18.2.4
    Network net_diskhb_01
    Network net_diskhb_02
        xdemcb2_hdiskpower41_01    /dev/hdiskpower41
    Network net_public_01
        xdemcb2_boot     10.100.102.2
The SRDF configuration diagram for our environment shows the DMX-3 at site A (SymmID 0304) and the DMX-4 at site B (SymmID 1983) connected through the SRDF physical links (directors RF-4C/RF-13C and RF-3C/RF-14D). The SRDF/S pairs 0F53-00BF and 0F54-00C0 (device group haxd_siteA, composite group haxdcg_siteA, AIX volume group srdf_s_vg) have their R1 volumes at site A. The SRDF/A pairs 0F55-00C1 and 0F56-00C2 (device group haxd1_siteB, srdf_a_vg1) and 0F57-00C3 (device group haxd2_siteB, srdf_a_vg2), grouped in composite group haxdcg_siteB, have their R1 volumes at site B.
Table 7-1 shows the disk association with the volume groups for our environment. We define
the following volume groups:
srdf_s_vg, enhanced concurrent capable, which is associated with the resource group
with primary site A
srdf_a_vg1, srdf_a_vg2: standard AIX volume groups that are associated with the
resource group with primary site B
Table 7-1 Volumes and volume groups that are defined on the Symmetrix storage subsystems

Site   Node              SymmID  Volume ID (Sym)  AIX Disk                    Volume group
siteA  xdemca1/xdemca2   0304    0F52             hdiskpower0                 diskhb_a_vg
                                 0F53, 0F54       hdiskpower1, hdiskpower2    srdf_s_vg
                                 0F55, 0F56       hdiskpower3, hdiskpower4    srdf_a_vg1
                                 0F57             hdiskpower5                 srdf_a_vg2
siteB  xdemcb1/xdemcb2   1983    00BE             hdiskpower41                diskhb_b_vg
                                 00BF, 00C0       hdiskpower42, hdiskpower43  srdf_s_vg
                                 00C1, 00C2       hdiskpower44, hdiskpower45  srdf_a_vg1
                                 00C3             hdiskpower46                srdf_a_vg2
AIX definitions practice: In our environment, the nodes in the same site have the same
mapping of Symmetrix devices to the PowerPath hdiskpower# devices in AIX. Although
this configuration is not required, keep the AIX disk definitions the same on all nodes of the
cluster to facilitate the configuration and management of the disk volumes within the cluster.
Table 7-2 details the configuration for the device groups and the composite groups that are
used in our environment.
Table 7-2 Definitions of SRDF pairs, device, and composite groups

Composite group  Device group  RDF1 members  RDF1 group name (no.)  RDF2 members  RDF2 group name (no.)
haxdcg_siteA     haxd_siteA    0F53          haxd1_rdfg (41)        00BF          haxd1_rdfg (41)
                               0F54          haxd1_rdfg (41)        00C0          haxd1_rdfg (41)
haxdcg_siteB     haxd1_siteB   00C1          haxd2_rdfg (42)        0F55          haxd2_rdfg (42)
                               00C2          haxd2_rdfg (42)        0F56          haxd2_rdfg (42)
                 haxd2_siteB   00C3          haxd2_rdfg (42)        0F57          haxd2_rdfg (42)
To create the first SRDF replicated resource with primary site A associated with the
composite group haxdcg_siteA:
1. Create an RDF group that contains the ports in the local and remote storage subsystems
that are used for data replication. To list all defined RDF groups in your configuration, enter
the symcfg list -ra all command. For our configuration, we allocate a new RDF group
haxd1_rdfg with number 41, not used until now (Example 7-8).
Example 7-8 Creating a dynamic RDF group
root@xdemca1:/> symrdf addgrp -label haxd1_rdfg -rdfg 41 -sid 0304 -dir 04C,13C
-remote_rdfg 41 -remote_sid 1983 -remote_dir 03C,14D
Execute a Dynamic RDF Addgrp operation for group
'haxd1_rdfg' on Symm: 000190100304 (y/[n]) ? y
Successfully Added Dynamic RDF Group 'haxd1_rdfg' for Symm: 000190100304
2. Verify the RDF group definition by using the symcfg list -ra all command. You can
reduce the scope of the list command to a single Symmetrix storage by using the -sid
<SymmID> option.
3. Create the replication relationship between the disk pairs. First, create a file that contains
the disk pairs. The pairs are delimited by blank spaces (Example 7-9).
Example 7-9 File that contains the SRDF disk pairs
root@xdemca1:/>cat /tmp/sitea_disks.txt
0F53 00BF
0F54 00C0
4. Establish the SRDF relationship with synchronous operating mode by using the file that
was previously created as input (Example 7-10).
Example 7-10 Creating the SRDF relationships
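As a hedged sketch of this step (the exact flags that are used in our environment may differ), the pairs are created from the file and established in synchronous mode with a command of this general form:
symrdf createpair -file /tmp/sitea_disks.txt -sid 0304 -rdfg 41 -type RDF1 -establish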
5. Verify the status of the synchronization process for the SRDF pairs by using the verify
command:
symrdf -file /tmp/sitea_disks.txt -sid 0304 -rdfg 41 verify -synchronized -i 10
6. Using the interval flag (-i), continue to run the command at the specified interval
(seconds) until all the devices in the file list are synchronized.
7. Check the status of the SRDF pairs by using the symrdf list pd command.
Example 7-11 shows the status of all defined SRDF relationships that contain any of the
disks that are mapped on host xdemca1 by running the specified command. You can see
in the example that the devices on the storage at site A are type R1, corresponding to their
source roles in the SRDF/S relationships.
Example 7-11 List of the SRDF relationships (site A view)
root@xdemca1:/>symrdf list pd
Symmetrix ID: 000190100304
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
0F53  00BF  R1:41       RW RW RW   S..1-         0       0   RW  WD    Synchronized
0F54  00C0  R1:41       RW RW RW   S..1-         0       0   RW  WD    Synchronized

Total                                       ------  ------
  Track(s)                                       0       0
  MB(s)                                        0.0     0.0
Enabled, .
8. For a similar view, run the same command on a node at site B (Example 7-12). You can
see in the output example that the type of the devices on the storage at site B is R2,
corresponding to their target states in the SRDF/S relationships.
Example 7-12 List of the SRDF relationships (site B view)
root@xdemcb1:/>symrdf list pd
Symmetrix ID: 000190101983
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
00C0  0F54  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized

Total                                       ------  ------
  Track(s)                                       0       0
  MB(s)                                        0.0     0.0
Tip: You can use the symrdf -sid <SymmID> -rdfg <rdfg_ID> list command to list
the status of the SRDF pairs that are part of a specific RDF(RA) group.
9. Create the device group definition on a node. In Example 7-13, we create the device group
haxd_siteA.
Example 7-13 Defining the device group haxd_siteA on node xdemca1
                              G R O U P S

                                                      Number of
Name         Type   Valid   Symmetrix ID   Devs  GKs  BCVs  VDEVs  TGTs
haxd_siteA   RDF1   N/A     N/A               0
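The device group itself is created with the symdg command. By analogy with the site B device groups that are created later in this chapter, a sketch of the commands behind this example is:
symdg create haxd_siteA -type RDF1
symdg list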
10.Add the volumes on the storage at site A to the device group definition (Example 7-14).
Example 7-14 Associating volumes to the device group haxd_siteA
                              G R O U P S

                                                      Number of
Name         Type   Valid   Symmetrix ID   Devs  GKs  BCVs  VDEVs  TGTs
haxd_siteA   RDF1   Yes     000190100304      2                       0
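A sketch of the commands behind this step, by analogy with the symld commands that are used later for the site B device groups:
symld -sid 0304 -g haxd_siteA add dev 0F53
symld -sid 0304 -g haxd_siteA add dev 0F54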
11.Export the device definition from the xdemca1 node and import the definitions on
all nodes.
The symcfg sync command: The storage device roles change when the SRDF
relationships are created (from REGULAR to RDF1 or RDF2). Therefore, run the symcfg
sync command before the import command on all cluster nodes other than the initiating
node of the SRDF relationships to update the local database with the current device status.
12.To import the definition file on a node other than the one where it was created, copy the
file to the target node. We create two files:
The import file for the nodes in the same site
In our environment, we use this file to import the definition of the device group on
the xdemca2 node. Example 7-15 shows how we use export/import DG commands in
our environment.
Example 7-15 Exporting and importing the DG definitions for a node in the same site
On node xdemca1:
root@xdemca1:/>symdg exportall -f /tmp/dg_local.cfg
root@xdemca1:/>cat /tmp/dg_local.cfg
<haxd_siteA>
1 000190100304
S F53 DEV001
S F54 DEV002
We copy the /tmp/dg_local.cfg on node xdemca2 in the same directory and run
the import command
On xdemca2:
root@xdemca2:/>symdg importall -f /tmp/dg_local.cfg
Creating device group 'haxd_siteA'
Adding STD device F53 as DEV001...
Adding STD device F54 as DEV002...
The Import file for the nodes in the remote site
In Example 7-16, we perform the export/import operations by using the xdemcb1 node
at the remote site as the import target. The same import operation command applies
for the xdemcb2 node. The export command uses the -rdf flag to generate a file that
contains the corresponding R2 devices on the storage in the remote site.
Example 7-16 Exporting and importing the DG definitions for a node in the remote site
On node xdemca1:
root@xdemca1:/>symdg exportall -f /tmp/dg_remote.cfg -rdf
root@xdemca1:/>cat /tmp/dg_remote.cfg
<haxd_siteA>
2 000190101983
S 0BF DEV001
S 0C0 DEV002
Copy the file /tmp/dg_remote.cfg to the target node xdemcb1 and import the DG
definition on the node.
On node xdemcb1:
root@xdemcb1:/>cat /tmp/dg_remote.cfg
<haxd_siteA>
2 000190101983
S 0BF DEV001
S 0C0 DEV002
root@xdemcb1:/>symdg importall -f /tmp/dg_remote.cfg
Creating device group 'haxd_siteA'
Adding STD device 0BF as DEV001...
Adding STD device 0C0 as DEV002...
13.Verify the DG definition on a node by using the symdg list or symdg show <DG name>
commands. See an output example of the symdg show command in 7.8.1, SYMCLI
commands for SRDF environment on page 330.
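For example, the following commands display the group summary and the detailed group definition:
symdg list
symdg show haxd_siteA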
14.Add the SRDF replicated resource definition in the cluster configuration. The cluster script
now creates a CG and synchronizes the definition across all cluster nodes. To succeed,
you must have propagated the device group definition on all cluster nodes.
To create the PowerHA SRDF replicated resource, run the smitty hacmp command. Then,
select Extended Configuration → Extended Resource Configuration → HACMP
Extended Resources Configuration → Configure EMC SRDF Replicated
Resources → Add EMC SRDF Replicated Resource (Figure 7-13).
Add an EMC SRDF Replicated Resource

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* EMC SRDF Composite Group Name                         [haxdcg_siteA]
* EMC SRDF Mode                                         SYNC                   +
  Device Groups                                         haxd_siteA             +
* Recovery Action                                       AUTO                   +
* Consistency Enabled                                   YES                    +
15.In the Add an EMC SRDF Replicated Resource panel, define the following options in the
menu:
EMC SRDF Composite Group Name
Specify a name for the composite group to be defined. In our scenario, we use
haxdcg_siteA for the CG name that is associated with the resource group with the
primary site A.
EMC SRDF Mode
Choose between SYNC and ASYNC, depending on the SRDF type of relationship that you
created. In our case, we set this parameter to SYNC, which corresponds with the
synchronous relationships that were already defined in Example 7-8 on page 285.
Device Groups
Specify the device groups that are associated with the composite group definition. In
our case, we include the DG haxd_siteA, which was already defined on all nodes of the
cluster.
Recovery Action
Specify whether the resource will be automatically processed or manually processed if
a failover condition across sites applies. Consider that the MANUAL option for
processing the replicated resource is also correlated with the state of the SRDF
relations at the time of failover of the resource group that contains the replicated
resource to the secondary site. For more information about auto versus manual option
settings, see 5.5.5, Lost of the replication links (auto versus manual) on page 214.
Consistency Enabled
Specify whether consistency will be applied for the relationships in the composite
group. Use of this option requires an SRDF/CG license feature that is applied on all
cluster nodes. When you enable consistency, the SRDF/CG feature prevents a
dependent-write I/O operation from reaching the secondary site if a previous I/O write
was not completed on both sides. For more information about the SRDF/CG feature,
see EMC Symmetrix Remote Data Facility (SRDF) - Product Guide, P/N 300-001-165.
In our scenario, we enable the consistency for the current composite group.
SRDF: When you define the SRDF replicated resource, the cluster script verifies the
SRDF environment and propagates the CG definitions across the cluster nodes
before running the cluster synchronization. You must run the cluster verification and
synchronization task to synchronize the SRDF cluster resource definition across the
nodes.
Later, we add the second replicated resource to the cluster configuration on top of the existing
configuration.
root@xdemca1:/>lsvg -l srdf_s_vg
srdf_s_vg:
LV NAME        TYPE     LPs   PPs   PVs  LV STATE     MOUNT POINT
srdf_s_loglv   jfs2log  1     1     1    open/syncd   N/A
srdf_s_lv      jfs2     1000  1000  2    open/syncd   /data1
Now, the definition of the volume group exists only on the nodes at site A, which are the
xdemca1 node and the xdemca2 node. To integrate the volume group in the cluster
configuration, import the volume group on the nodes at the remote site. To import the volume
group, split the SRDF relationships. We perform this task on the CG level (Example 7-18).
Example 7-18 Splitting the SRDF relations
root@xdemcb2:/>lspv
hdiskpower42    none                None
hdiskpower43    none                None
root@xdemcb2:/>lspv
hdiskpower42    00cfb52d62712021    None
hdiskpower43    00cfb52d6278a02c    None
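A hedged sketch of the sequence that is used for this site A volume group, analogous to the commands that are shown later in this chapter for the site B volume groups (the target disk name on the site B nodes is taken from Table 7-1):
symrdf -cg haxdcg_siteA split -force -nop     # make the R2 disks at site B read/write
importvg -y srdf_s_vg hdiskpower42            # run on the site B nodes
symrdf -cg haxdcg_siteA establish -nop        # resume the SRDF/S replication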
A diagram of the target configuration shows the RG_sitea and RG_siteb resource groups with their service IP labels (xdemcaa and xdemcbb) on the net_public_01 network across nodes xdemca1, xdemca2, xdemcb1, and xdemcb2, with SRDF replication (including composite group haxdcg_siteB) between the sites.
Figure 7-15 shows an example of defining the service IP label configurable on multiple
nodes that are used at site A.
Add a Service IP Label/Address configurable on Multiple Nodes (extended)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* IP Label/Address                                      xdemcaa_sv
  Netmask(IPv4)/Prefix Length(IPv6)                     []
* Network Name                                          net_public_01
  Alternate HW Address to accompany IP Label/Address    []
  Associated Site                                       siteA
A similar operation is performed to define the service IP label xdemcab specific to site B.
The srdf_s_vg volume group and the associated /data1 file system
These resources are added in the resource group definition after the RG_sitea resource
group is created.
The SRDF replicated resource, previously created in Figure 7-13 on page 289
Create the RG_sitea resource group with primary site A and the default values of the
failover/failback policy parameters (Figure 7-16).
Add a Resource Group (extended)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Resource Group Name                                   [RG_sitea]
  Startup Policy                                                               +
  Failover Policy                                                              +
  Failback Policy                                                              +
Add the resources to the resource group definition. Figure 7-17 shows the SRDF replicated
resource option in the configuration menu.
Change/Show All Resources and Attributes for a Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
  Resource Group Name                              RG_sitea
  Inter-site Management Policy                     Prefer Primary Site
  Participating Nodes from Primary Site            xdemca1 xdemca2
  Participating Nodes from Secondary Site          xdemcb1 xdemcb2

  Startup Policy                                   Online On Home Node Only
  Failover Policy                                  Failover To Next Priority Node In The List
  Failback Policy                                  Failback To Higher Priority Node In The List
  Failback Timer Policy (empty is immediate)       []

  Service IP Labels/Addresses                      [xdemcaa_sv xdemcab_sv]
  Application Servers                              []

  Volume Groups                                    [srdf_s_vg ]
  Use forced varyon of volume groups, if necessary false
  Automatically Import Volume Groups               false
  Filesystems (empty is ALL for VGs specified)     [ ]
  Filesystems Consistency Check                    fsck
  Filesystems Recovery Method                      sequential
  Filesystems mounted before IP configured         false
  Filesystems/Directories to Export (NFSv2/3)      []
  Filesystems/Directories to NFS Mount             []
  Network For NFS Mount                            []

  Tape Resources                                   []
  Raw Disk PVIDs                                   []

  Miscellaneous Data                               []
  WPAR Name                                        []
  EMC SRDF Replicated Resources                    [haxdcg_siteA]
Figure 7-17 Adding the SRDF replicated resource to the resource group
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               ONLINE SECONDARY             xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
Verify the states of the resource group on both sites in the clRGinfo output:
ONLINE: It is associated with the active state of the resource group on a node at a site.
That node activates the resources in the resource group and starts any defined application
servers. Example 7-22 shows the status of the IP addresses and file systems on the
xdemca1 node, currently the node with primary online state of the RG_sitea
resource group.
Example 7-22 Status of the resources on node xdemca1
IP Addresses:
root@xdemca1:/>netstat -i
Name  Mtu    Network      Address               Opkts  Oerrs  Coll
en0   1500   link#2       0.9.6b.dd.e.e4      1986826      3     0
en0   1500   10.100.101   xdemca1_boot        1986826      3     0
en0   1500   10.10.10     xdemcaa_sv          1986826      3     0
en1   1500   link#3       0.9.6b.dd.e.e5       947163      3     0
en1   1500   207.18.1     xdemca1_xdip1        947163      3     0
en2   1500   link#4       0.11.25.cd.36.60     380680      3     0
en2   1500   207.18.2     xdemca1_xdip2        380680      3     0
lo0   16896  link#1                           3240273      0     0
lo0   16896  127          loopback            3240273      0     0
lo0   16896  ::1                              3240273      0     0
Volume group srdf_s_vg (lsvg output; header lines truncated):
PP SIZE:             8 megabyte(s)
TOTAL PPs:           1078 (8624 megabytes)
FREE PPs:            77 (616 megabytes)
USED PPs:            1001 (8008 megabytes)
LVs:                 2
OPEN LVs:            2
TOTAL PVs:           2
STALE PVs:           0
STALE PPs:           0
ACTIVE PVs:          2
QUORUM:              2 (Enabled)
VG DESCRIPTORS:      3
AUTO ON:             no
Concurrent:          Enhanced-Capable
Auto-Concurrent:     Disabled
VG Mode:             Concurrent
Node ID:             1
Active Nodes:        2
MAX PPs per VG:      32512
MAX PPs per PV:      1016
MAX PVs:             32
LTG size (Dynamic):  256 kilobyte(s)
AUTO SYNC:           no
HOT SPARE:           no
BB POLICY:           relocatable
(The example continues with the file system usage listing, Free and %Used columns, for the mounted file systems.)
ONLINE SECONDARY: This state is associated with the resource group status at the
secondary site. When the cluster services are started on the secondary nodes, the
resource group is acquired in this state, indicating that the remote site is up and ready to
acquire the resource group in a failover case.
Compare the resource group status with the actual status of the SRDF relationships by
running the symrdf list pd command (Example 7-23).
Example 7-23 Status of the SRDF relationships
root@xdemca1:/>symrdf list pd
Symmetrix ID: 000190100304
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
0F53  00BF  R1:41       RW RW RW   S..1-         0       0   RW  WD    Synchronized
0F54  00C0  R1:41       RW RW RW   S..1-         0       0   RW  WD    Synchronized
...........
In the output of Example 7-23 observe that during normal operation, the RG_sitea is acquired
at site A. The SRDF status of the pairs shows the volumes at site A as primary volumes (R1
type), while a view from site B shows the volumes at site B as secondary (R2 type).
root@xdemcb1:/>symrdf list pd
Symmetrix ID: 000190101983
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
00C0  0F54  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
00C1  0F55  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
00C2  0F56  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
00C3  0F57  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
........
d. Define the new device groups for the resource group at site B and add the volumes to
the device groups.
Group haxd1_siteB:
symdg create haxd1_siteB -type RDF1
symld -sid 1983 -g haxd1_siteB addall -RANGE 00C1:00C2
Group haxd2_siteB:
symdg create haxd2_siteB -type RDF1
symld -sid 1983 -g haxd2_siteB add dev 00C3
e. Propagate the DG definitions across the cluster nodes. Since the SRDF configuration
already contains the haxd_siteA device group, we use the export and import
commands that are applied only to the newly created device groups:
For the nodes on the same site, on xdemcb1 node export the DGs definitions:
symdg export haxd1_siteB -f /tmp/haxd1_siteB.cfg
symdg export haxd2_siteB -f /tmp/haxd2_siteB.cfg
Copy the files on the xdemcb2 node and run on the target node:
symcfg sync
symdg import haxd1_siteB -f /tmp/haxd1_siteB.cfg
symdg import haxd2_siteB -f /tmp/haxd2_siteB.cfg
For the nodes in the remote site (xdemca1 and xdemca2): On node xdemcb1
export the DGs definitions for the remote site by using the -rdf option:
symdg export haxd1_siteB -f /tmp/haxd1_siteB_remote.cfg -rdf
symdg export haxd2_siteB -f /tmp/haxd2_siteB_remote.cfg -rdf
Copy the files on the remote nodes xdemca1 and xdemca2 and run on each
of them:
symcfg sync
symdg import haxd1_siteB -f /tmp/haxd1_siteB_remote.cfg
symdg import haxd2_siteB -f /tmp/haxd2_siteB_remote.cfg
An alternative method is available for propagating the device group definitions across the
local and remote nodes. That is, run the device group creation and the disk association
commands that are used on the initiating node (in our case xdemcb1) on each node of the
cluster by using the corresponding devices at each site. For more information about the
device group and composite group operations, see EMC Solutions Enabler Symmetrix
SRDF Family CLI - Product Guide, P/N 300-000-877.
2. Define the SRDF replicated resource in the PowerHA Enterprise Edition configuration.
Now, we create the haxdcg_siteB composite group that contains both device groups
haxd1_siteB and haxd2_siteB using the SMIT menus. On the xdemcb1 node, we access
the SRDF replicated resource main menu by using the smitty cl_srdf_def fast path
command and choose the Add EMC SRDF Replicated Resource option (Figure 7-18).
Add an EMC SRDF Replicated Resource

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* EMC SRDF Composite Group Name                         [haxdcg_siteB]
* EMC SRDF Mode                                         ASYNC                  +
  Device Groups                                         haxd1_siteB haxd2_siteB +
* Recovery Action                                       MANUAL                 +
* Consistency Enabled                                   YES                    +
The SRDF replicated resource that is associated with the resource group of site B has the
following specific settings:
EMC SRDF Composite Group Name: haxdcg_siteB
This parameter represents the name of the composite group that is being created.
EMC SRDF Mode: ASYNC
This parameter is correlated with the asynchronous mode of operation of the disk pairs
in the haxdcg_siteB composite group.
Device groups: haxd1_siteB and haxd2_siteB
Here the device groups are specified as part of the haxdcg_siteB composite group.
Recovery action: MANUAL
When the resource recovery action is set to manual, if a site failover occurs, user
intervention is required to manually fail over the SRDF relationships to the secondary
location and make the storage volumes available to the recovery nodes.
Consistency enabled: YES
When consistency is enabled on the composite group, the data on the remote volumes
(part of the CG) is protected against a partial failure of the SRDF relationships.
Therefore, the data consistency is ensured on the recovery site.
Consistency: Consistency must be enabled when you use the ASYNC mode of
operation.
3. Define the service IP labels for the resource group of site B. In our environment, we use
two service labels on the existing net_public_01 cluster network: xdemcbb when the
resource group is ONLINE at site B, and xdemcba when the resource group is ONLINE at
site A. For defining the service labels, we use the same SMIT menu as when we defined
the service IP labels for RG_sitea (Figure 7-15 on page 293).
4. Import the volume group definitions of site B on the nodes at site A. At site B, we defined
two standard volume groups, srdf_a_vg1 and srdf_a_vg2, each containing a jfs2 file
system (Example 7-25).
Example 7-25 Volume group configuration on the nodes at site B
root@xdemcb1:/>lsvg -l srdf_a_vg1
srdf_a_vg1:
LV NAME       TYPE     LPs   PPs   PVs  LV STATE     MOUNT POINT
srdfa1loglv   jfs2log  1     1     1    open/syncd   N/A
srdfa1lv      jfs2     1000  1000  2    open/syncd   /data21
root@xdemcb1:/>lsvg -l srdf_a_vg2
srdf_a_vg2:
LV NAME       TYPE     LPs   PPs   PVs  LV STATE     MOUNT POINT
srdfa2loglv   jfs2log  1     1     1    open/syncd   N/A
srdfa2lv      jfs2     500   500   1    open/syncd   /data22
For importing the volume group definitions on the remote site, we use the same method
that is described in 7.4.5, Importing the volume group definition in the remote site on
page 290.
To split the relationships, enter:
symrdf -cg haxdcg_siteB split -force -nop
Import the volume groups on the target nodes at site A (see also the hdiskpower#
mapping to the volume groups in the shaded box on page 284):
importvg -y srdf_a_vg1 hdiskpower3
importvg -y srdf_a_vg2 hdiskpower5
Re-establish the relationships:
symrdf -cg haxdcg_siteB establish -nop
The -nop option (no prompt) is used in this case to skip the confirmation prompt when the
command is run.
5. Add the resource group definition in the cluster configuration. For site B, we use the
default inter-site policy: prefer primary site (Figure 7-19).
Add a Resource Group (extended)

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Resource Group Name                                   [RG_siteb]
  Inter-Site Management Policy                          [ignore]               +
  Participating Nodes from Primary Site                 [xdemcb1 xdemcb2]      +
  Participating Nodes from Secondary Site               [xdemca1 xdemca2]      +
  Startup Policy                                                               +
  Failover Policy
  Failback Policy
6. Change the attributes for the resource group to include the IP addresses, volume groups,
and SRDF replicated resources (Figure 7-20).
Change/Show All Resources and Attributes for a Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
  Resource Group Name                              RG_siteb
  Inter-site Management Policy                     Prefer Primary Site
  Participating Nodes from Primary Site            xdemcb1 xdemcb2
  Participating Nodes from Secondary Site          xdemca1 xdemca2

  Startup Policy                                   Online On Home Node Only
  Failover Policy                                  Failover To Next Priority Node In The List
  Failback Policy                                  Failback To Higher Priority Node In The List
  Failback Timer Policy (empty is immediate)       []

  Service IP Labels/Addresses                      [xdemcbb_sv xdemcba_sv]
  Application Servers                              []

  Volume Groups                                    [srdf_a_vg1 srdf_a_vg2 ]
  Use forced varyon of volume groups, if necessary false
  Automatically Import Volume Groups               false
  Filesystems (empty is ALL for VGs specified)     [ ]
  Filesystems Consistency Check                    fsck
  Filesystems Recovery Method                      sequential
  Filesystems mounted before IP configured         false
  Filesystems/Directories to Export (NFSv2/3)      []
  Filesystems/Directories to NFS Mount             []
  Network For NFS Mount                            []

  Tape Resources                                   []
  Raw Disk PVIDs                                   []

  Miscellaneous Data                               []
  WPAR Name                                        []
  EMC SRDF Replicated Resources                    [haxdcg_siteB]
7. Synchronize the cluster configuration while the cluster services are started on all nodes.
We check the resource groups status by using clRGinfo. See the final state of the
resource groups in the cluster (Example 7-26).
Example 7-26 Resource group status
root@xdemca2:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               ONLINE SECONDARY             xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
You can compare the status of the resource group with the status of the SRDF
relationships by querying the actual state of the SRDF relationships. We use the symrdf
list pd command to show all the existing relationships for the volumes that are attached
on a host (Example 7-27).
Example 7-27 Listing the SRDF relations for the volumes that are attached on a host
root@xdemcb1:/>symrdf list pd
Symmetrix ID: 000190101983
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
00C0  0F54  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
00C1  0F55  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
00C2  0F56  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
00C3  0F57  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
.........
The command was issued from a node at site B. When the RG_siteb resource group is
active on site B, the 00C1-00C3 devices that are part of the resource group have the R1
role (source volumes). For the RG_sitea resource group, which is active at site A, the
00BF-00C0 volumes on the storage at site B have the R2 role (target volumes).
availability at both sites. In many environments, you can consider periodic site failover tests as
a good practice to verify the readiness for a real disaster case.
Beyond the local high-availability configuration, PowerHA Enterprise Edition scenarios involve
new factors:
Sites and the node associations with sites
IP and non-IP communication links between sites
Data replication between storage subsystems
For our scenario, we do not perform local HA tests or redundancy tests, such as bringing
down one of the two XD_ip networks because we focus on an extended distance scenario.
Losing the XD_ip communication links can be considered a particular failure case. Configure
redundant IP/non-IP communication paths to avoid isolation of the sites. Losing all the
communication paths between sites leads to a partitioned state of the cluster and to data
divergence between sites if the replication links are also unavailable. For considerations
related to the partitioned cluster case, see 9.3, Partitioned cluster considerations on
page 457.
Another particular case is the loss of the replication paths between the storage subsystems
while the cluster is running on both sites. To avoid this situation, configure redundant
communication links for the SRDF replication. You must manually recover the status of the
SRDF pairs after the storage links are operational.
Important: The PowerHA software does not monitor the SRDF link status. If the SRDF link
goes down when the cluster is up, and later the link is repaired, you must manually
resynchronize the pairs.
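For example, after the replication links are repaired and the pair states are verified, the pairs in a composite group can typically be resumed with a command of this form:
symrdf -cg haxdcg_siteA establish -nop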
In the following sections, we describe the following test cases:
Graceful failover scenario
Total site failure in two scenarios:
Fully automated
With manual intervention
Loss of storage access in a site
For all the tests in our environment, we start from the same initial state. Each resource group
is active in the associated primary site, RG_sitea at site A and RG_siteb at site B
(Example 7-28).
Example 7-28 Resource group states on each node
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               ONLINE SECONDARY             xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
We run the RG move command by using SMIT menus. Run the smitty hacmp command.
Then, select System Management (C-SPOC) → Resource Group and Applications →
Move a Resource Group to Another Node / Site → Move Resource Groups to Another
Site. Select the ONLINE instance of RG_sitea to be moved to the remote site.
Figure 7-21 shows the SMIT panel for the move operation.
Move Resource Group(s) to Another Site

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
  Resource Group(s) to be Moved                         RG_sitea
  Destination Site                                      siteB
The resource group RG_sitea is moved to site B, and Example 7-29 shows the final status.
Example 7-29 Final state of the resource group in the cluster
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               OFFLINE                      xdemcb1@siteB
               ONLINE                       xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
In Example 7-30, you can see the status of the SRDF relationships from both sites after the
failover operation is completed.
Example 7-30 Status of SRDF relations after resource group movement
Site A view:
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
0F53  00BF  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
0F54  00C0  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
..

Site B view:

Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R1:41       RW RW RW   S..1-         0       0   RW  WD    Synchronized
00C0  0F54  R1:41       RW RW RW   S..1-         0       0   RW  WD    Synchronized
....
In the output in Example 7-30, the R1/R2 roles of the volumes are swapped after the resource
group movement, compared with their initial values. The replication is now established from
site B to site A. In this SRDF state, the volumes at site B (R1 type) can be used for read/write
operations. Their corresponding targets at site A (R2 type) cannot be updated (WD state, write
disabled) and can be opened only for read operations.
To fail back the resource group to the original site A, perform the same RG move operation on
the RG_sitea resource group, which is now active at site B, back to site A.
on the highest priority node in the list (xdemcb1). Example 7-31 shows the clRGinfo
command output. As shown in Example 7-31, there is no ONLINE SECONDARY state for the
RG_sitea and RG_siteb resource groups because there are no available nodes at site A.
Example 7-31 Status of the resource group after failover
root@xdemcb1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       OFFLINE                      xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               OFFLINE                      xdemca1@siteA
               OFFLINE                      xdemca2@siteA
Example 7-32 shows the SRDF relationship status after the failover.
Example 7-32 SRDF relationship status after site failover event
root@xdemcb1:/>symrdf list pd
Symmetrix ID: 000190101983
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW RW NR   S..2-     13474       0   RW  NA    Partitioned
00C0  0F54  R2:41       RW RW NR   S..2-     66574       0   RW  NA    Partitioned
00C1  0F55  R1:42       RW RW RW   A..1-         0       0   RW  NA    TransIdle
00C2  0F56  R1:42       RW RW RW   A..1-         0       0   RW  NA    TransIdle
00C3  0F57  R1:42       RW RW RW   A..1-         0       0   RW  NA    TransIdle

Total                                       ------  ------
  Track(s)                                   80048       0
  MB(s)                                     5003.0     0.0
Because the storage connections are lost, the status of the SRDF/S relations became
Partitioned, whereas the status of the SRDF/A relations became TransIdle. For the volume
pairs that are associated with the RG_sitea resource group, the R1 and R2 roles did not swap,
but the devices at site B are now enabled for read/write operations.
root@xdemcb1:/>symrdf list pd
Symmetrix ID: 000190101983
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW RW NR   S..2-     13474       0   RW  RW    Split
00C0  0F54  R2:41       RW RW NR   S..2-     66574       0   RW  RW    Split
00C1  0F55  R1:42       RW RW NR   A..1-         0       1   RW  WD    Suspended
00C2  0F56  R1:42       RW RW NR   A..1-         0   16388   RW  WD    Suspended
00C3  0F57  R1:42       RW RW NR   A..1-         0   16389   RW  WD    Suspended

Total                                       ------  ------
  Track(s)                                   80048   32778
  MB(s)                                     5003.0  2048.6
In Example 7-33, observe that SRDF/S pairs changed to Split state, while the SRDF/A pairs
changed to Suspended.
Important: Do not initiate the failback of the resource group after a failover until you check
that the RDF links are up and verify the state of the SRDF pairs.
At this time, you need to update the R1 volumes at site A that are associated with the RG_sitea
resource group with the R2 data at site B, which was modified while RG_sitea was active at
site B. You must perform this operation manually before the failback operation is initiated by
the cluster. If you start the cluster failback operations before you update the R1 volumes while
the SRDF pairs are in this state, you establish the normal R1 to R2 replication and overwrite
the R2 volumes at site B with the existing image at site A.
We perform the following operations on a node at site B to update the R1 volumes with the
latest data version on the R2 volumes. (For each change in the SRDF state of the CG pairs, a
list of the status is provided by using the symrdf list pd command.)
1. Fail over the SRDF relationship on the haxdcg_siteA composite group (Example 7-34).
symrdf -cg haxdcg_siteA failover -force
Example 7-34 Failover output of the SRDF relationship
Sym   RDF               STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Dev   RDev  Typ:G       SA RA LNK  MDATE    Tracks  Tracks   Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW RW NR   S..2-         3       0   RW  WD    Failed Over
00C0  0F54  R2:41       RW RW NR   S..2-     16390       0   RW  WD    Failed Over
00C1  0F55  R1:42       RW RW NR   A..1-         0       1   RW  WD    Suspended
00C2  0F56  R1:42       RW RW NR   A..1-         0       0   RW  WD    Suspended
00C3  0F57  R1:42       RW RW NR   A..1-         0       1   RW  WD    Suspended

After the R1 devices at site A are updated, the same listing shows the R1 Updated state for the SRDF/S pairs:

Sym   RDF               STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Dev   RDev  Typ:G       SA RA LNK  MDATE    Tracks  Tracks   Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW RW RW   S..2-         1       0   RW  WD    R1 Updated
00C0  0F54  R2:41       RW RW RW   S..2-         0       0   RW  WD    R1 Updated
00C1  0F55  R1:42       RW RW NR   A..1-         0       1   RW  WD    Suspended
00C2  0F56  R1:42       RW RW NR   A..1-         0       0   RW  WD    Suspended
00C3  0F57  R1:42       RW RW NR   A..1-         0       1   RW  WD    Suspended
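The update of the R1 devices at site A is done with the update action on the composite group, analogous to the command that is shown later for site B; a sketch:
symrdf -cg haxdcg_siteA update -force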
We finally start the cluster services on the xdemca1 and xdemca2 nodes by using smitty
clstart. Figure 7-22 shows the SMIT panel.
Start Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
* Start now, on system restart or both                  now                    +
  Start Cluster Services on these nodes                 [xdemca1,xdemca2]      +
* Manage Resource Groups                                Automatically          +
  BROADCAST message at startup?                         false                  +
  Startup Cluster Information Daemon?                   true                   +
  Ignore verification errors?                           false                  +
  Automatically correct errors found during             Interactively          +
  cluster start?
After the failback operation is completed, the cluster re-establishes the R1 to R2 relationships
to their original state. The RG_sitea resource group is acquired on the highest priority node at
site A (xdemca1). Example 7-36 shows the clRGinfo command output.
Example 7-36 Resource group status after failback to primary site A
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               ONLINE SECONDARY             xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
When the primary site comes back online, both RG_sitea and RG_siteb resource group
states are changed. RG_sitea is primary online at site A and secondary online at site B,
whereas RG_siteb, previously primary online at site B, acquires the secondary online state at
site A.
Both SRDF relationships are recovered from their previous states (Example 7-36 on
page 309). Their final status is Synchronized for SRDF/S pairs and Consistent for SRDF/A
pairs. Example 7-37 shows the new status of the pairs that are listed from site B.
Example 7-37 SRDF pair status after starting the cluster services at site A and failback of RG_sitea
root@xdemcb1:/>symrdf list pd
Symmetrix ID: 000190101983
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
00BF  0F53  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
00C0  0F54  R2:41       RW WD RW   S..2-         0       0   WD  RW    Synchronized
00C1  0F55  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
00C2  0F56  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
00C3  0F57  R1:42       RW RW RW   A..1-         0       0   RW  WD    Consistent
.........

root@xdemcb1:/>/usr/es/sbin/cluster/sr/cmds/cllssr -a
SRDFCgName    SRDFMode  DeviceGroups              RecoveryAction  ConsistencyEnabled
haxdcg_siteA  SYNC      haxd_siteA                AUTO            YES
haxdcg_siteB  ASYNC     haxd1_siteB haxd2_siteB   MANUAL          YES
To simulate a manual failover of the SRDF resources:
1. Simulate the total site failure condition in the same way as in the fully automated scenario
that was applied for the nodes and the storage at site B. See Fully automated scenario
on page 305.
After the cluster detects no available nodes at site B to acquire the resource group
RG_siteb, a site failover event takes place. Now, the cluster tries to acquire the RG_siteb
resource group at site A.
2. The MANUAL option is set on the SRDF replicated resource. Therefore, the cluster
notifies the user about the required actions that must be performed on the SRDF replicated
resource and does not activate the RG_siteb resource group at the secondary site A.
Example 7-39 shows the message that is posted in the hacmp.out file.
Example 7-39 hacmp.out file message
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               OFFLINE                      xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       OFFLINE                      xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               ERROR                        xdemca2@siteA
3. Check the status of the SRDF pairs. We observed that part of the RG_siteb resource
group pairs are in TransIdle state with the WD access mode (Example 7-41).
Example 7-41 SRDF pair status after site B failure (site A view)
root@xdemca1:/>symrdf list pd
Symmetrix ID: 000190100304
Local Device View
-----------------------------------------------------------------------------
                        STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Sym   RDF               ---------  -----    Tracks  Tracks   --------------------
Dev   RDev  Typ:G       SA RA LNK  MDATE                     Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
0F53  00BF  R1:41       RW RW NR   S..1-         0       1   RW  NA    Partitioned
0F54  00C0  R1:41       RW RW NR   S..1-         0       0   RW  NA    Partitioned
0F55  00C1  R2:42       RW WD RW   A..2-         0       0   WD  NA    TransIdle
0F56  00C2  R2:42       RW WD RW   A..2-         0       0   WD  NA    TransIdle
0F57  00C3  R2:42       RW WD RW   A..2-         0       0   WD  NA    TransIdle
....
4. Follow the user actions that are provided in the hacmp.out file. Because site B failed, there
are no active links between the DMX storage subsystems now. Perform a manual failover
of the SRDF pairs in the composite group to activate the disks in the secondary site.
Example 7-42 shows the details of performing these operations in our environment.
Example 7-42 Failover of the SRDF/A pairs on the secondary site A
Sym   RDF               STATUS     MODES    R1 Inv  R2 Inv   RDF S T A T E S
Dev   RDev  Typ:G       SA RA LNK  MDATE    Tracks  Tracks   Dev RDev  Pair
----  ----  ---------   ---------  -----    ------  ------   --- ----  ------------
0F53  00BF  R1:41       RW RW NR   S..1-         0       2   RW  NA    Partitioned
0F54  00C0  R1:41       RW RW NR   S..1-         0       0   RW  NA    Partitioned
0F55  00C1  R2:42       RW RW NR   S..2-         0       0   RW  NA    Partitioned
0F56  00C2  R2:42       RW RW NR   S..2-         0       0   RW  NA    Partitioned
0F57  00C3  R2:42       RW RW NR   S..2-         0       0   RW  NA    Partitioned
5. After changing the state of the disks in the secondary site, bring the resource group online.
6. Run the varyonvg command on the xdemca1 node for the volume groups that are
   associated with the RG_siteb resource group, srdf_a_vg1 and srdf_a_vg2:
   varyonvg srdf_a_vg1
   varyonvg srdf_a_vg2
7. Activate the RG_siteb resource group at site A by using the SMIT menu. Run the
   smitty hacmp command and select System Management (C-SPOC) → Resource
   Group and Applications → Bring a Resource Group Online. Select the RG_siteb
   target resource group that is associated with the primary online state from the list and then
   the target node for the resource group activation, xdemca1.
Figure 7-23 Bring online resource group RG_siteb in the secondary site A
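The same activation can also be driven from the command line. As a hedged sketch only (the
clRGmove options that are shown are an assumption and must be verified against your PowerHA
level), bringing RG_siteb online on the xdemca1 node might look like this:

   /usr/es/sbin/cluster/utilities/clRGmove -g RG_siteb -n xdemca1 -u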
root@xdemca1:/var/hacmp/log>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               OFFLINE                      xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       OFFLINE                      xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
0F53  R2:41   RW WD NR   S..2-      0       0 WD  RW   Suspended
0F54  R2:41   RW WD NR   S..2-      0       0 WD  RW   Suspended
0F55  R1:42   RW RW NR   A..1-      0       0 RW  RW   Split
0F56  R1:42   RW RW NR   A..1-      0       0 RW  RW   Split
0F57  R1:42   RW RW NR   A..1-      0       0 RW  RW   Split
After the RDF links became operational, the status of the pairs changed as follows:
The SRDF/S pairs are in the Suspended state. The primary online instance of the RG_sitea
resource group was active at site A during the test, and the pairs are re-established after
the RG_sitea resource group is acquired at site B in the secondary online state.
The SRDF/A pairs are in the Split state. Read/write operations are allowed at both ends.
You can now re-establish the SRDF/A pairs, updating the storage at site B with the content of
the volumes at site A (site A → site B replication). We update the R1 devices (site B) from
the R2 devices (site A) and re-establish the original state of the CG SRDF pairs with the
following sequence of tasks. Unless specified, the operations are run on the xdemca1 node.
1. Run the failover operation on the haxdcg_siteB composite group:
symrdf -cg haxdcg_siteB failover -immediate -force -symforce
Example 7-45 shows the state of the pairs after the failover command.
Example 7-45 Pairs status after failover
00BF  R1:41   RW RW RW   S..1-      0       0 RW  WD   Suspended
00C0  R1:41   RW RW RW   S..1-      0       0 RW  WD   Suspended
00C1  R2:42   RW RW NR   A..2-      3       0 RW  WD   Failed Over
00C2  R2:42   RW RW NR   A..2-      6       0 RW  WD   Failed Over
00C3  R2:42   RW RW NR   A..2-      6       0 RW  WD   Failed Over
2. Update the R1 volumes at site B with the latest version of the data on the R2 volumes at
site A:
symrdf -cg haxdcg_siteB update -force
Example 7-46 shows the state of the pairs after you run the update command and the R1
update is completed.
Example 7-46 Output of the state of the pairs after running the update command
00BF  R1:41   RW RW RW   S..1-      0       0 RW  WD   Suspended
00C0  R1:41   RW RW RW   S..1-      0       0 RW  WD   Suspended
00C1  R2:42   RW RW RW   A..2-      0       0 RW  WD   R1 Updated
00C2  R2:42   RW RW RW   A..2-      0       0 RW  WD   R1 Updated
00C3  R2:42   RW RW RW   A..2-      0       0 RW  WD   R1 Updated
The volumes of site B that are part of RG_siteb are now updated from their corresponding
pairs at site A.
3. Bring the RG_siteb resource group offline at site A to fail back the SRDF relationship to
   its original state from site B to site A. Use the cluster C-SPOC SMIT menus. Run the smitty
   hacmp command. Then, select System Management (C-SPOC) → Resource Group and
   Applications → Bring a Resource Group Offline.
4. After you bring the RG_siteb resource group offline at site A, fail back the SRDF pairs to
   their original state (R1 → R2):
symrdf -cg haxdcg_siteB failback -force
Example 7-47 shows the status of the disk pairs after the failback operation.
Example 7-47 Status of the pairs after failback
root@xdemcb2:/>symrdf list pd
Symmetrix ID: 000190101983
                               Local Device View
-------------------------------------------------------------------------------
                          STATUS     MODES          RDF  S T A T E S
Sym   RDF                ---------   ----- R1 Inv  R2 Inv ---------------------
Dev   RDev  Typ:G        SA RA LNK   MDATE Tracks  Tracks Dev RDev  Pair
----  ----  --------     ---------   ----- ------- ------ --- ----  ------------
00BF  0F53  R2:41        RW WD RW    S..2-      0       0 WD  RW    Suspended
00C0  0F54  R2:41        RW WD RW    S..2-      0       0 WD  RW    Suspended
00C1  0F55  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
00C2  0F56  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
00C3  0F57  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
5. Start the cluster services on the xdemcb1 and xdemcb2 nodes. When you start the cluster
services on the nodes at site B, the RG_sitea is acquired in the secondary online state. As
a result, SRDF/S pairs are automatically re-established.
Because of the previous manual activation/deactivation of the RG_siteb resource group,
the cluster does not automatically acquire the resource group. Manually bring the resource
group back online on a node at site B.
6. Activate the RG_siteb at site B by using the SMIT C-SPOC menus. Run the smitty hacmp
   command. Select System Management (C-SPOC) → Resource Group and
   Applications → Bring a Resource Group Online. Activate the RG_siteb resource group
   on the xdemcb1 node. Example 7-48 shows the final resource group status.
Example 7-48 Final status of the resource groups after bringing the resource group online
root@xdemcb2:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               OFFLINE                      xdemcb1@siteB
               ONLINE SECONDARY             xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA

root@xdemcb1:/>symrdf list pd

Symmetrix ID: 000190101983

                               Local Device View
-------------------------------------------------------------------------------
                          STATUS     MODES          RDF  S T A T E S
Sym   RDF                ---------   ----- R1 Inv  R2 Inv ---------------------
Dev   RDev  Typ:G        SA RA LNK   MDATE Tracks  Tracks Dev RDev  Pair
----  ----  --------     ---------   ----- ------- ------ --- ----  ------------
00BF  0F53  R2:41        RW WD RW    S..2-      0       0 WD  RW    Synchronized
00C0  0F54  R2:41        RW WD RW    S..2-      0       0 WD  RW    Synchronized
00C1  0F55  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
00C2  0F56  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
00C3  0F57  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       OFFLINE                      xdemca1@siteA
               ACQUIRING                    xdemca2@siteA
               ONLINE SECONDARY             xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
........
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       OFFLINE                      xdemca1@siteA
               TEMPORARY ERROR              xdemca2@siteA
               ONLINE SECONDARY             xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
The RG_sitea resource group finally fails over to the secondary site. After the failover,
Example 7-51 shows the final state of the resource group.
Example 7-51 The state of the resource groups after the failover
root@xdemcb1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
Example 7-52 shows the state of the SRDF relationship. We run the command to check the
SRDF state of the volume on a host at site B. Only the storage connection at site B is
available now.
Example 7-52 State of SRDF relationship after failover
root@xdemcb1:/>symrdf list pd
Symmetrix ID: 000190101983
                               Local Device View
-------------------------------------------------------------------------------
                          STATUS     MODES          RDF  S T A T E S
Sym   RDF                ---------   ----- R1 Inv  R2 Inv ---------------------
Dev   RDev  Typ:G        SA RA LNK   MDATE Tracks  Tracks Dev RDev  Pair
----  ----  --------     ---------   ----- ------- ------ --- ----  ------------
00BF  0F53  R1:41        RW RW RW    S..1-      0       0 RW  WD    Synchronized
00C0  0F54  R1:41        RW RW RW    S..1-      0       0 RW  WD    Synchronized
00C1  0F55  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
00C2  0F56  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
00C3  0F57  R1:42        RW RW RW    A..1-      0       0 RW  WD    Consistent
Observe again the swap of R1/R2 roles for the volume pairs that are associated with the
RG_sitea resource group because of the failover process to site B, while the RDF links are
operational.
root@xdemca1:/>powermt display
Symmetrix logical device count=58
CLARiiON logical device count=0
Hitachi logical device count=0
Invista logical device count=0
HP xp logical device count=0
Ess logical device count=0
HP HSx logical device count=0
==============================================================================
----- Host Bus Adapters ---------  ------ I/O Paths -----  ------ Stats ------
###  HW Path                       Summary   Total   Dead  IO/Sec Q-IOs Errors
==============================================================================
   0 fscsi0                        optimal     116      0             0      0
   1 fscsi1                        optimal     116      0             0      0
Both adapters are now enabled and paths to the storage volumes are available again. The
resource group is now active at site B. To fall back the resource group to the primary site A,
perform a resource group move operation from site B to site A. We perform the resource
group movement on a cluster node using the SMIT C-SPOC menu (Figure 7-24).
                      Move Resource Group(s) to Another Site

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
                                                         RG_sitea
                                                         siteA
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE                       xdemca1@siteA
               OFFLINE                      xdemca2@siteA
               ONLINE SECONDARY             xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB

RG_siteb       ONLINE                       xdemcb1@siteB
               OFFLINE                      xdemcb2@siteB
               ONLINE SECONDARY             xdemca1@siteA
               OFFLINE                      xdemca2@siteA
                                                        [Entry Fields]
  EMC SRDF Composite Group Name                          haxdcg_siteA      +
  New EMC SRDF Composite Group Name                     []
  EMC SRDF Mode                                          SYNC              +
  Device Groups                                          haxd_siteA        +
  Recovery Action                                        AUTO              +
  Consistency Enabled                                    YES               +

You can dynamically modify the following attributes of the SRDF replicated resource:
New EMC SRDF Composite Group Name   You can specify a new name for the CG that you
                                    already selected.
EMC SRDF Mode                       Specify SYNC or ASYNC mode of operation for the
                                    pairs in the CG.
Device Groups
Recovery Action                     You can specify AUTO or MANUAL for the recovery
                                    action on SRDF volumes during the failover process.
Consistency Enabled                 You can select YES or NO to enable or disable the
                                    consistency on the specified CG.
Consistency: When you use an asynchronous mode of operation with the SRDF
relationships, you must enable consistency on the composite group. We also implemented
consistency for the synchronous operating mode.

When you change the composite group options by using the SMIT menus, the cluster scripts
perform the required modifications on the composite group configuration and propagate the
new composite group definition to the cluster nodes. You must synchronize the cluster after
making these changes to distribute the cluster definition of the SRDF replicated resource
to all cluster nodes.
Alternatively, to manually synchronize the composite group definition on the cluster nodes:
1. Export the CG definition from the initiating node. We consider the node that is performing
the initial CG change to be the initiating node:
For the local import on nodes at the same site:
symcg export <CG_name> -f <file_import_local>
For import on nodes at a remote site:
symcg export <CG_name> -f <file_import_remote> -rdf
2. Delete the existing definition on the target node, if applicable:
symcg delete <CG_name> -force
3. Import the definition on the rest of the nodes in the cluster:
Nodes at the same site
symcg import <CG_name> -f <file_import_local> -rdf_consistency
Nodes at remote site
symcg import <CG_name> -f <file_import_remote> -rdf_consistency
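As a worked sketch of this sequence for the haxdcg_siteA composite group, propagating the
definition to the nodes at the remote site (the file name is a placeholder of our own and not
output captured from the test cluster; for nodes at the same site, export the file without the
-rdf flag as shown in step 1):

   On the initiating node:
   symcg export haxdcg_siteA -f /tmp/haxdcg_siteA.rdf -rdf

   On each remote-site node:
   symcg delete haxdcg_siteA -force
   symcg import haxdcg_siteA -f /tmp/haxdcg_siteA.rdf -rdf_consistency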
To access the SMIT menu for removing an SRDF-replicated resource, run the smitty hacmp
command. Select Extended Configuration → Extended Resource Configuration →
HACMP Extended Resources Configuration → Configure EMC SRDF Replicated
Resources → Remove EMC SRDF Replicated Resource. Select the replicated resource
that you want to delete (Figure 7-26).
Configure EMC SRDF Replicated Resources

Move cursor to desired item and press Enter.

  Add EMC SRDF Replicated Resource
  Change/Show EMC SRDF Replicated Resource
  Remove EMC SRDF Replicated Resource

  +--------------------------------------------------------------------------+
  |               Select the EMC SRDF Resource Name to Remove                 |
  |                                                                           |
  | Move cursor to desired item and press Enter.                              |
  |                                                                           |
  |   haxdcg_siteA                                                            |
  |   haxdcg_siteB                                                            |
  |                                                                           |
  | F1=Help             F2=Refresh            F3=Cancel                       |
  | F8=Image            F10=Exit              Enter=Do                        |
  | /=Find              n=Find Next                                           |
  +--------------------------------------------------------------------------+
Figure 7-26 Deleting an SRDF replicated resource
Definitions: When you delete the SRDF replicated resource, only the cluster definition is
removed. The device group and composite group definitions remain.

See 7.8.2, Deleting existing device group and composite group definitions on page 338, for
an example of how to remove the CG and DG definitions from the cluster nodes.
In our example, we change the SRDF replicated resource of the RG_sitea resource group from
synchronous mode to asynchronous mode. Use smitty cl_srdf_def to access the SMIT SRDF
replicated resource menu. On the Change/Show EMC SRDF Replicated Resource menu, select the
CG name, and change the EMC SRDF Mode option from SYNC to ASYNC or vice versa
(Figure 7-27).
Change/Show EMC SRDF Replicated Resource

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  EMC SRDF Composite Group Name                          haxdcg_siteA      +
  New EMC SRDF Composite Group Name                     []
  EMC SRDF Mode                                          ASYNC             +
  Device Groups                                          haxd_siteA        +
  Recovery Action                                        AUTO              +
  Consistency Enabled                                    YES               +
Figure 7-27 Changing the SRDF operation mode on CG haxdcg_siteA from SYNC to ASYNC
When you change the SRDF resource, the cluster runs the SRDF configuration scripts, which
convert the relationships in the composite group from SYNC mode to ASYNC mode of
operation. Example 7-55 shows the final status of the SRDF relations.
Example 7-55 Status of the SRDF relations after changing from SYNC to ASYNC mode of operation
Composite Group Name           : haxdcg_siteA
Composite Group Type           : RDF1
Number of Symmetrix Units      : 1
Number of RDF (RA) Groups      : 1
RDF Consistency Mode           : MSC

Symmetrix ID                   : 000190100304
Remote Symmetrix ID            : 000190101983
RDF (RA) Group Number          : 41 (28)

             Source (R1) View                 MODES  STATES
-------------------------------------         ------ ------------
               ST                             C S
 Standard      A                              o u
 Logical  Sym  T  R1 Inv   R2 Inv             n s    RDF Pair
 Device   Dev  E  Tracks   Tracks    MDAE     s p    STATE
-------------------------------------         ------ ------------
 DEV001   0F53 RW      0        0    A..-     X -    Consistent
 DEV002   0F54 RW      0        0    A..-     X -    Consistent

 Total              ------- -------
   Track(s)               0       0
   MBs                  0.0     0.0

Legend:
  M(ode of Operation)    : A = Async, S = ...
  D(omino)               : X = Enabled, . = ...
  A(daptive Copy)        : D = Disk Mode, ...
  (Consistency) E(xempt) : X = Enabled, . = ...
Run the cluster synchronization operation from the node where you performed the SRDF
resource change to distribute the configuration of the SRDF replicated resource to all nodes
of the cluster.
...                                             RAID-5        N/Grp'd    RW   4314

Device Name                  Directors          Device
---------------------------  -----------------  --------------------------------------
                                                                                   Cap
Physical                     Sym  SA :P DA :IT   Config        Attribute   Sts    (MB)
---------------------------  -----------------  --------------------------------------
...
/dev/rhdiskpower47           00C4 13B:1 01B:CB   RAID-5        N/Grp'd     RW     4314
...
To extend the srdf_s_vg volume group and the /data1 file system from this volume group, and
then to synchronize the volume group definition in the cluster:
1. Add the volumes by using C-SPOC. Disks need a PVID defined on all hosts. To create a
PVID for the new volume on all hosts at site A (Example 7-57):
chdev -l <dev> -a pv=yes
Example 7-57 Creating a PVID on the new disk at site A
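The captured output of Example 7-57 did not survive in this copy. As a hedged illustration of
this step only (hdiskpower6 is the device name that the new volume receives later in this
scenario; substitute your own device name):

   chdev -l hdiskpower6 -a pv=yes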
2. Create the SRDF relationship between the volumes. In Example 7-58, we use the RDF
   group 41, which contains the volumes of the target volume group for the storage at site A
   (SymID=0304). Use the symdg show command on the device group where you want to add
   the new disk to find the RDF group ID during disk pair creation. We add a new pair of
   disks in an SRDF/S relationship.
Example 7-58 Establish the SRDF relationship between the disk pairs
root@xdemca1:/>cat /tmp/diskadd.txt
0F58 00C4
root@xdemca1:/>symrdf -file /tmp/diskadd.txt createpair -type R1 -sid 0304
-rdfg 41 -establish -rdf_mode sync -nop
An RDF 'Create Pair' operation execution is in progress for device
file '/tmp/diskadd.txt'. Please wait...
Create RDF Pair in (0304,041)....................................Started.
Create RDF Pair in (0304,041)....................................Done.
Mark target device(s) in (0304,041) for full copy from source....Started.
Devices: 0F58-0F58 in (0304,041)................................ Marked.
Mark target device(s) in (0304,041) for full copy from source....Done.
Merge track tables between source and target in (0304,041).......Started.
Devices: 0F58-0F58 in (0304,041)................................ Merged.
Merge track tables between source and target in (0304,041).......Done.
Resume RDF link(s) for device(s) in (0304,041)...................Started.
Resume RDF link(s) for device(s) in (0304,041)...................Done.
The RDF 'Create Pair' operation successfully executed for device
file '/tmp/diskadd.txt'.
3. Check the status of the replication by using the symrdf list pd command.
4. Activate the PVID on the disk at the remote site. We use the chdev command to
accomplish this task (Example 7-59). We can run the command without splitting the SRDF
pair because the target disk at site B can be opened for read operations.
Example 7-59 Activate the PVID of the disk at the remote site
The PVID of the disk is the same as the PVID of the source disk at site A.
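The captured output of Example 7-59 is not reproduced here. As a hedged illustration of this
step on the site B nodes (hdiskpower47 is the PowerPath device that maps to Symmetrix device
00C4 in our environment, as shown in the earlier sympd listing; substitute your own device
name):

   chdev -l hdiskpower47 -a pv=yes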
5. Add the device in the device group that contains the srdf_s_vg target volume group. In
Example 7-60, we add the 0F58 (hdiskpower6) volume to the DG haxd_siteA already
defined.
Example 7-60 Adding a disk to the device group on the xdemca1 node
root@xdemca1:/>symdg list
D E V I C E
G R O U P S
Name
Type
Valid
Symmetrix ID
Devs
GKs
haxd_siteA
haxd1_siteB
haxd2_siteB
RDF1
RDF2
RDF2
Yes
Yes
Yes
000190100304
000190100304
000190100304
2
2
1
1
1
1
Number of
BCVs VDEVs
0
0
0
TGTs
0
0
0
0
0
0
Number of
BCVs VDEVs
TGTs
G R O U P S
Name
Type
Valid
Symmetrix ID
Devs
GKs
haxd_siteA
haxd1_siteB
haxd2_siteB
RDF1
RDF2
RDF2
Yes
Yes
Yes
000190100304
000190100304
000190100304
3
2
1
1
1
1
0
0
0
0
0
0
0
0
0
6. Propagate the DG update on the cluster nodes by running the appropriate command at
   each site (Example 7-61).
Important: Before you add the disk to the device group definition on a node in the
cluster, run the symcfg sync command to update the SYMAPI local database with the
newly discovered devices and their SRDF status.
Example 7-61 Adding the disk definition in the DGs on all cluster nodes
On nodes at site A:
symld -sid 0304 -g haxd_siteA add dev 0F58
On nodes at site B:
symld -sid 1983 -g haxd_siteA add dev 00C4
7. Add the disk to the CG defined at site A. On each node of the cluster, add the disk in the
haxdcg_siteA composite group (Example 7-62).
Example 7-62 Modifying the composite group definitions on all nodes
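The captured commands of Example 7-62 are not reproduced in this copy. As a hedged sketch
only (verify the exact symcg syntax against your Solutions Enabler level before using it), the
addition of the new device pair to the composite group might look like this:

   On nodes at site A:
   symcg -cg haxdcg_siteA add dev 0F58 -sid 0304

   On nodes at site B:
   symcg -cg haxdcg_siteA add dev 00C4 -sid 1983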
8. Add the volume to the volume group on the xdemca1 node where the RG_sitea resource
   group is active, and extend the existing /data1 file system by 4 GB:
   extendvg srdf_s_vg hdiskpower6
   Check the srdf_s_vg volume group physical volumes (Example 7-63).
Example 7-63 srdf_s_vg volume group physical volumes
root@xdemca1:/>lsvg -p srdf_s_vg
srdf_s_vg:
PV_NAME
PV STATE
hdiskpower1
active
hdiskpower2
active
hdiskpower6
active
108..108..107..108..108
TOTAL PPs
539
539
539
FREE PPs
77
0
539
FREE DISTRIBUTION
00..00..00..00..77
00..00..00..00..00
After you extend the /data1 file system by using the chfs -a size=+4G /data1 command,
physical partitions are allocated from the new hdiskpower6 disk (Example 7-64).
Example 7-64 Physical partitions allocated from the new hdiskpower6 disk
root@xdemca1:/>lsvg -p srdf_s_vg
srdf_s_vg:
PV_NAME           PV STATE    TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdiskpower1       active      539         0           00..00..00..00..00
hdiskpower2       active      539         0           00..00..00..00..00
hdiskpower6       active      539         104         00..00..00..00..104
9. Synchronize the volume group definition across the cluster nodes by using C-SPOC. Run the
smitty cl_vg command, and select Synchronize a Volume Group Definition (Figure 7-28).
Important: Perform the C-SPOC operation only when the cluster is active on all
PowerHA nodes and the underlying SRDF pairs are in a synchronized state.
                                 Volume Groups

Move cursor to desired item and press Enter.

  List All Volume Groups
  Create a Volume Group
  Create a Volume Group with Data Path Devices
  Set Characteristics of a Volume Group
  Enable a Volume Group for Fast Disk Takeover or Concurrent Access
  Import a Volume Group
  Mirror a Volume Group

  +----------------------------------------------------------------------------+
  |                    Select the Volume Group to Update                        |
  |                                                                             |
  |  Move cursor to desired item and press Enter. Use arrow keys to scroll.     |
  |                                                                             |
  |    #Volume Group      Resource Group             Node List                  |
  |    srdf_a_vg1         RG_siteb                   xdemca1,xdemca2,xdemcb1,xd |
  |    srdf_a_vg2         RG_siteb                   xdemca1,xdemca2,xdemcb1,xd |
  |    srdf_s_vg          RG_sitea                   xdemca1,xdemca2,xdemcb1,xd |
  |    diskhb_b_vg        <Not in a Resource Group>  xdemcb1,xdemcb2            |
  |    srdf_a_vg1         RG_siteb                   xdemcb1,xdemcb2            |
  |    srdf_a_vg2         RG_siteb                   xdemcb1,xdemcb2            |
  |    srdf_s_vg          RG_sitea                   xdemcb1,xdemcb2            |
  |                                                                             |
  |  F1=Help              F2=Refresh                 F3=Cancel                  |
  |  F8=Image             F10=Exit                   Enter=Do                   |
  |  /=Find               n=Find Next                                           |
  +----------------------------------------------------------------------------+
Figure 7-28 Synchronizing the volume group definition by using C-SPOC
Expected error: When using C-SPOC to modify a volume group that contains an SRDF
replicated resource, expect to see the following error:
cl_extendvg: Error executing clupdatevg srdf_s_vg 00cfb52d62712021 on node
xdemcb1
cl_extendvg: Error executing clupdatevg srdf_s_vg 00cfb52d62712021 on node
xdemcb2
The volume group definition on the nodes at the remote site can be updated later by the
cluster lazy update process during a site failover, or you can perform a resource group move
to the other site.
State           Description                                           Action
SyncInProg
Synchronized
Split           The R1 and the R2 are currently Ready to their
                hosts, but the link is Not Ready or Write Disabled.
FailedOver
R1 Updated
R1 UpdInProg    The R1 is Not Ready or Write Disabled to the host,
                there are invalid local (R1) tracks on the source
                side, and the link is Ready or Write Disabled.
Suspended       The RDF links are suspended and are Not Ready or
                Write Disabled. If the R1 is Ready while the links
                are suspended, any I/O accumulates as invalid
                tracks owed to the R2.
Partitioned
Mixed
Invalid         This is the default state when no other SRDF state
                applies. The combination of R1, R2, and RDF link
                states and statuses do not match any other pair
                state. This state can occur if there is a problem
                at the disk director level.
Consistent                                                            No action required.
Transmit idle   The SRDF/A session cannot push data in the transmit
                cycle across the link because the link is down. This
                state is applicable only to asynchronous mirroring.
# symcfg list

                                S Y M M E T R I X

                                        Mcode     Cache      Num Phys  Num Symm
    SymmID        Attachment  Model     Version   Size (MB)  Devices   Devices

    000190100304  Local       DMX3-24   5772          32768       294      4063
    000190101983  Remote      DMX4-24   5773          65536         0
[RDF director listing: the local Symmetrix 000190100304 presents RDF-BI-DIR directors RF-4C,
RF-13C, RF-4D, and RF-13D, and the remote Symmetrix 000190101983 presents RDF-BI-DIR
directors RF-3C, RF-14C, RF-3D, and RF-14D. For each director, the listing shows the remote
SymmID, the local and remote RA group numbers, and the Online or Offline status.]
Example 7-67 shows I/O statistics for the RDF ports on the local storage (for 5-second
time intervals).
Example 7-67 IO statistics for the RDF ports
[RDF port statistics sampled at 22:05:39 and 22:05:44 for directors RF-4C and RF-13C and
their totals: I/O per second (READ, WRITE, RW, % RW), remote hits, cache requests per
second, and KB/sec received plus sent.]
Device group and composite group example outputs: Example 7-68 shows detailed
information about a device group.
Example 7-68 Sample symdg show output
Group Name:  haxd_siteA

    Group Type                                   : RDF1     (RDFA)
    Device Group in GNS                          : No
    Valid                                        : Yes
    Symmetrix ID                                 : 000190100304
    Group Creation Time                          : Tue Mar 9 09:28:26 2010
    Vendor ID                                    : EMC Corp
    Application ID                               : SYMCLI

    Number of STD Devices in Group               : 2
    Number of Associated GK's                    : 1
    (All other associated device counts are 0.)

    Standard (STD) Devices (2):
        {
        --------------------------------------------------------------------
                                                      Sym               Cap
        LdevName              PdevName                Dev  Att. Sts     (MB)
        --------------------------------------------------------------------
        DEV001                /dev/rhdiskpower1       0F53      RW      4314
        DEV002                /dev/rhdiskpower2       0F54      RW      4314
        }

    Associated GateKeeper Devices (1):
        {
        --------------------------------------------------------------------
                                                      Sym               Cap
        LdevName              PdevName                Dev       Sts     (MB)
        --------------------------------------------------------------------
        N/A                   /dev/rhdiskpower10      0F9A      RW         6
        }

    Device Group RDF Information
        {
        RDF Type                                 : R1
        RDF (RA) Group Number                    : 41              (28)
        Remote Symmetrix ID                      : 000190101983

        RDF Mode                                 : Synchronous
        RDF Adaptive Copy                        : Disabled
        RDF Adaptive Copy Write Pending State    : N/A
        RDF Adaptive Copy Skew (Tracks)          : 32767
        ...

        Device RA Status                         : Ready           (RW)
        Device Link Status                       : Ready           (RW)
        Time of Last Device Link Status Change   : N/A

        Device RDF State                         : Ready           (RW)
        Remote Device RDF State                  : Write Disabled  (WD)
        RDF Pair State (  R1 <===> R2 )          : Synchronized
        Number of R1 Invalid Tracks              : 0
        Number of R2 Invalid Tracks              : 0

        RDFA Information:
            {
            Session Number                       : 40
            Cycle Number                         : 0
            Number of Devices in the Session     : 2
            Session Status                       : Inactive
            Consistency Exempt Devices           : No
            ...
            }
        }
Example 7-69 shows detailed information about a CG.
Example 7-69 List the detailed configuration of a composite group
Composite Group Name                         : haxdcg_siteA

    Composite Group Type                     : RDF1
    Valid                                    : Yes
    RDF Consistency Mode                     : NONE
    ...

    Number of BCV's (Locally-associated)                : 0
    Number of VDEV's (Locally-associated)               : 0
    Number of TGT's Locally-associated                  : 0
    Number of CRDF TGT Devices                          : 0
    Number of RVDEV's (Remotely-associated VDEV)        : 0
    Number of RBCV's (Remotely-associated STD-RDF)      : 0
    Number of BRBCV's (Remotely-associated BCV-RDF)     : 0
    Number of RRBCV's (Remotely-associated RBCV)        : 0
    Number of RTGT's (Remotely-associated)              : 0
    Number of Hop2 BCV's (Remotely-assoc'ed Hop2 BCV)   : 0
    Number of Hop2 VDEV's (Remotely-assoc'ed Hop2 VDEV) : 0
    Number of Hop2 TGT's (Remotely-assoc'ed Hop2 TGT)   : 0

    ID                                       : 000190100304
    Version                                  : 5772
    STD Devices                              : 2
    CRDF STD Devices                         : 0
    BCV's (Locally-associated)               : 0
    VDEV's (Locally-associated)              : 0
    TGT's Locally-associated                 : 0
    CRDF TGT Devices                         : 0
    RVDEV's (Remotely-associated VDEV)       : 0
    RBCV's (Remotely-associated STD_RDF)     : 0
    BRBCV's (Remotely-associated BCV-RDF)    : 0
    RTGT's (Remotely-associated)             : 0
    RRBCV's (Remotely-associated RBCV)       : 0
    Hop2BCV's (Remotely-assoc'ed Hop2BCV)    : 0
    Hop2VDEVs (Remotely-assoc'ed Hop2VDEV)   : 0
    Hop2TGT's (Remotely-assoc'ed Hop2TGT)    : 0

    RDF (RA) Group Number                    : 41              (28)
    Remote Symmetrix ID                      : 000190101983
    Microcode Version                        : 5773
    ...

    DEV002   .--1   /dev/rhdiskpower2        0F54 RDF1+R-5    RW   4314
    ...

    Legend:
        RDFA Flags:
            C(onsistency)       : X = ...
            (RDFA) S(tatus)     : A = ...
            R(DFA Mode)         : S = ...
            (Mirror) T(ype)     : 1 = ...
Chapter 8.
Configuring PowerHA
SystemMirror Enterprise Edition
with Geographic Logical Volume
Manager
The IBM PowerHA SystemMirror Enterprise Edition with GLVM provides disaster recovery
and data mirroring capability for data at geographically separated sites. It protects the
data against total site failure by remote mirroring and supports unlimited distance between
participating sites.
GLVM is a base technology available with AIX that uses IP-based data mirroring between sites
and is integrated with the standard AIX Logical Volume Manager (LVM). GLVM was introduced
with the HACMP/XD 5.2 release and is integrated with it for automated high availability.
This solution increases data availability by providing continuing service during hardware or
software outages (or both), planned or unplanned, for a two-site cluster. The distance
between sites can be unlimited, and both sites can access the mirrored volume groups
serially over IP-based networks.
By using this solution, your business application can continue running at the takeover system at
a remote site while the failed system is being recovered from a disaster or a planned outage.
The software takes advantage of the following software components to reduce downtime and
recovery time during disaster recovery:
AIX LVM and GLVM
TCP/IP subsystem
PowerHA SystemMirror Enterprise Edition for AIX
The chapter includes the following sections:
Planning the implementation of PowerHA Enterprise Edition with GLVM
Installing and configuring PowerHA for GLVM
Configuration wizard for GLVM
Use non-concurrent or enhanced concurrent mode volume groups (only for sync mode).
For enhanced concurrent volume groups that you also want to make geographically
mirrored, ensure that PowerHA cluster services are running before you add and manage
RPVs.
For asynchronous mirroring, volume groups must be in a scalable format, and enhanced
concurrent mode volume groups cannot be used.
You must disable the auto-on and bad-block relocation options of the volume group
(see the sketch after this list).
Create mirror pools for use with asynchronous mirroring. Mirror pools are required for
using asynchronous mirroring but are optional when using synchronous mirroring.
Set the inter-disk allocation policy for logical volumes in AIX to super strict.
Carefully consider quorum and forced varyon issues when you plan the geographically
mirrored volume groups. For more information, see the Quorum and forced varyon and
Data divergence topics in the HACMP for AIX 6.1 Geographic LVM: Planning and
Administration Guide, SA23-1338.
In PowerHA Enterprise Edition for GLVM, the C-SPOC utility does not allow managing the
geographically mirrored volume groups from a single node.
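As a hedged sketch of these volume group and logical volume settings (datavg and datalv are
placeholder names of our own, and the flags should be verified against your AIX level):

   chvg -a n datavg        # do not vary on the volume group automatically at boot
   chlv -b n datalv        # turn off bad-block relocation for a logical volume
   chlv -s s datalv        # set the inter-disk allocation policy to super strict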
Software requirements
The IBM PowerHA Enterprise Edition for GLVM requires specific versions of AIX and RSCT.
The RSCT file sets are shipped with the PowerHA installation media and are installed
automatically. The PowerHA Enterprise Edition software uses 1 MB of disk space. Ensure
that the /usr file system has 1 MB of free disk space for the upgrade.
Reference: AIX 6.1 Technology Level 2 SP3 must be installed for asynchronous mirroring
to use mirror pools. For more information see Chapter 2, Infrastructure considerations on
page 39, or the support flash site at:
http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10673
Required PTFs: Remember to install the latest PTFs of PowerHA for GLVM. At the time this
book was written, the following PTFs were required to allow GLVM to function correctly:
IZ69484: HA/XD GLVM does not function with two nodes at a site.
IZ69964: POWERHA/XD with GLVM single-adapter network problems.
IZ69945: POWERHA/XD with GLVM single-adapter network rg move problems.
[Figure: GLVM mirroring architecture. The RPV client device driver under AIX LVM on one node
(Node A) communicates over the network with the RPV server kernel extension on the node at
the D/R site (Node B). The physical volumes PV1 through PV6 are grouped into Mirror Pool #1
and Mirror Pool #2.]
Additionally, mirror pools provide extra benefits to the asynchronous mirroring function:
Synchronous or asynchronous mirroring is an attribute of a mirror pool. Rather than having to
configure individual RPV devices, mirror pools provide a convenient way for users to
manage asynchronous mirroring at a higher level.
The decision of whether to mirror synchronously or asynchronously is made at the mirror
pool level.
Therefore, you can decide to mirror from the production site to the disaster recovery site
asynchronously and then to mirror from the disaster recovery site back to the production site
synchronously. You can accomplish this task by configuring the mirror pool that contains the
disaster recovery site disks as asynchronous and the mirror pool that contains the
production site disks as synchronous.
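A hedged sketch of that split configuration, assuming a volume group named datavg with mirror
pools named drpool (disaster recovery site disks) and prodpool (production site disks);
verify the chmp options against your AIX level:

   chmp -A -m drpool datavg      # asynchronous mirroring into the disaster recovery pool
   chmp -S -m prodpool datavg    # synchronous mirroring into the production pool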
Installation components
The PowerHA Enterprise Edition software for GLVM comprises several components. The file
sets listed in Table 8-1 are required to configure PowerHA for GLVM.
Table 8-1 Installation components
Component                               File sets
GLVM                                    glvm.rpv.client
                                        glvm.rpv.server
                                        glvm.rpv.util
                                        glvm.rpv.msg.en_US
PowerHA Enterprise Edition for GLVM     cluster.xd.glvm
                                        cluster.xd.license
                                        cluster.doc.en_US.glvm
Installation prerequisites
Before you install the PowerHA Enterprise Edition for GLVM, remember to install the
necessary software in the cluster nodes. The software has the following prerequisites:
The latest versions of PowerHA, AIX, and RSCT.
The PowerHA Enterprise Edition software, which uses 1 MB of disk space. Ensure that the
/usr file system has 1 MB of free disk space for installation.
Configuration prerequisites
Before you configure a GLVM environment, complete these steps:
1. Install AIX on all nodes in the cluster.
2. Select site names for each site.
3. Consult the planning worksheets for the service IP addresses on the local site. These
addresses serve as the service IP addresses for the RPV clients (that reside on the local
site), and you will make the addresses known to the remote site.
4. Similarly, consult the planning worksheets for the service IP addresses on the remote site.
These addresses serve as the service IP labels/addresses for the RPV servers (that
reside on the remote site). You make the addresses known to the local site.
5. By using the standard AIX LVM SMIT panels, configure volume groups, logical volumes,
and file systems for which you plan to configure geographic mirror with the GLVM utilities.
Ensure that either standard or enhanced concurrent mode LVM volume groups exist for
the data that you plan to be geographically mirrored.
6. For all logical volumes that are planned to be geographically mirrored, ensure that the
inter-disk allocation policy is set to super strict.
7. Create volume groups and mirror pools for use with asynchronous mirroring. Mirror pools
are required for using asynchronous mirroring but are optional when you use synchronous
mirroring.
Cluster services: If you want to turn your existing enhanced volume groups into
geographically mirrored volume groups by adding RPVs, PowerHA cluster services must
be running on the nodes. Although you can create standard volume groups in AIX and add
RPVs to these groups by using the GLVM utilities in SMIT, to add RPVs to enhanced
concurrent volume groups, PowerHA cluster services must be running.
3. To make the mirroring function work both ways in a PowerHA cluster, configure the RPV
client on each node at the remote site and configure an RPV server on each node at the local
site. You must understand that RPV servers and clients must be configured at both sites.
4. Vary off the volume groups and update the volume group definitions on the nodes at the
other site by running the importvg command.
For more information, see the PowerHA for AIX 6.1 Geographic LVM: Planning and
Administration Guide, SA23-1338.
3. Add RPVs to allow the disks at the local site to be accessed from the remote site.
4. Import the volume group to all of the nodes at the remote site.
5. Vary off the volume group on all nodes at the remote site.
6. Add the enhanced concurrent geographically mirrored volume group to a PowerHA
resource group.
6. Configure your existing logical volumes to belong to the mirror pool that you just created.
   You also need to turn off bad block relocation for each logical volume:
   chlv -m copy1=Poughkeepsie -b n dataloglv
   chlv -m copy1=Poughkeepsie -b n datafslv
7. Confirm that you have an ordinary volume group that is configured for super strict mirror
   pools and that all local disks at the production site belong to the Poughkeepsie mirror pool:
   lsmp -A datavg
8. Define the RPV clients and RPV servers that are required for your cluster. On each node,
define an RPV server for each local disk that belongs to the volume group, and define an
RPV client for each remote disk that belongs to the volume group. Then, the RPV
client/server pairs that you create enable LVM to access remote physical volumes as
though they were ordinary local disks.
9. After you create the RPV clients and servers, add the remote physical volumes, which are
   identified by their RPV client names, to the volume group. You can define the mirror pool
   for the remote physical volumes in the same step by using the extendvg command:
extendvg -p Austin datavg hdisk31 hdisk32
In this example command, the remote physical volumes hdisk31 and hdisk32 belong to a
mirror pool named Austin.
10.After you add the remote physical volumes to your volume group, add mirror copies of your
logical volumes to them. You can use the GLVM utilities in SMIT or the mirrorvg
command:
mirrorvg -c 2 -p copy2=Austin datavg
In this example command, a second mirror copy of each logical volume is created on the
disks that reside in the Austin mirror pool.
11.Because asynchronous mirroring requires a logical volume of type aio_cache to serve as
   the cache device, create this logical volume. Use the usual steps to create a logical
   volume, except specify aio_cache as the logical volume type. Also, the logical volume
   must be on the disks in the opposite site's mirror pool. You can perform this step with the
   AIX LVM SMIT utilities or by using the mklv command:
   mklv -y datacachelv1 -t aio_cache -p copy1=Poughkeepsie -b n datavg 100
   In this example command, the cache logical volume is in the Poughkeepsie mirror pool. It
   is used for caching during asynchronous mirroring to the disks in the Austin mirror pool.
12.Configure a mirror pool to use asynchronous mirroring. You can use the GLVM utilities in
   SMIT or the chmp command:
   chmp -A -m Austin datavg
   In this example command, the Austin mirror pool is configured to use asynchronous
   mirroring. The chmp command automatically determines that the datacachelv1 logical
   volume is the cache device to use because it resides in the opposite site's mirror pool.
13.Optional: Configure asynchronous mirroring for the Poughkeepsie mirror pool by creating
a logical volume in the Austin mirror pool to serve as the cache device:
mklv -y datacachelv2 -t aio_cache -p copy1=Austin -b n datavg 100
14.Configure the Poughkeepsie mirror pool to use asynchronous mirroring by using the same
procedure that you followed for the Austin mirror pool:
chmp -A -m Poughkeepsie datavg
15.Confirm that asynchronous mirroring is configured by running the lsmp command again:
lsmp -A datavg
Now when the volume group is varied online at the Poughkeepsie site, the local disks in the
Poughkeepsie mirror pool are updated synchronously, just as though they were an ordinary
volume group. The remote disks in the Austin mirror pool are updated asynchronously by using
the cache device on the local Poughkeepsie disks. Likewise, when the volume group is varied
online at the Austin site, the local disks in the Austin mirror pool are updated synchronously,
and the remote disks in the Poughkeepsie mirror pool are updated asynchronously by using
the cache device on the local Austin disks.
If you already have a GLVM volume group that is configured for synchronous mirroring, you
might decide to reconfigure it for asynchronous mirroring, too. To reconfigure a GLVM volume
group for asynchronous mirroring, see 8.10.1, Converting synchronous GMVGs to
asynchronous GMVGs on page 412.
8.3.1 Prerequisites
Note the following prerequisites:
Install the following file sets on the local node and remote node in addition to base
PowerHA file sets:
The volume group must be varied on the local node and set to be the scalable type.
The volume group must not already include any RPV disks.
One or more logical volumes must be already defined.
All IP network interfaces must be connected and configured.
The application must be installed and configured.
The application's service IP label must be added to /etc/hosts on all nodes.
Persistent IP addresses are required if the XD_data network has multiple interfaces.
Note: Historically, PowerHA required an XD_data network with a persistent IP for mirroring.
Starting with PowerHA 6.1, GLVM can use the boot IP for mirroring, and the XD_data network
can be configured with a single interface. A persistent IP is automatically configured if the
local and remote nodes have preconfigured IP aliases and the user has not supplied an
explicit IP through SMIT.
8.3.2 Considerations
The GLVM wizard has a few considerations:
The GLVM wizard cannot be used where a pre-existing GLVM configuration is incomplete
or broken.
The GLVM wizard requires one-to-one mapping of disks for mirroring. For example, for
mirroring a 50-GB disk on the local node, the remote node must have a disk that is 50 GB
or more.
Although GLVM and PowerHA support IPv6, IPv6 cannot be configured directly with
the wizard.
The GLVM wizard creates a synchronous geographically mirrored volume group by
default. If you want to create an asynchronous geographically mirrored volume group, you
can convert the volume group after the cluster configuration. For more information, see
8.10.1, Converting synchronous GMVGs to asynchronous GMVGs on page 412.
The GLVM Cluster Configuration Assistant creates the following PowerHA configuration:
Cluster: <user supplied application name>_cluster
Nodes: one per site, use host name for node name
Sites: siteA and siteB
If the RPV server site name is already configured at each site, the GLVM wizard
configures the PowerHA sites as the defined RPV server site name.
XD_data network: single XD_data network
An IP-Alias is enabled and it includes all inter-connected network interfaces.
Application server
One or more service IPs when provided
One Resource Group: <user supplied application name>_group
Inter-site management policy is set to prefer primary site.
Persistent IP address for each node (optional for single interface networks).
Create the RPV server, clients, and associated GMVGs with a PowerHA resource group.
Geographically mirrored volume groups will be set as synchronous mirroring by default.
Note: After an intermediate failure of the GLVM wizard, you can remove the PowerHA
configuration, but the GMVG/VG configuration remains. Therefore, you must manually remove
the GLVM/VG configuration before performing a basic GLVM and PowerHA configuration
with the GLVM Cluster Configuration Assistant.
All output from the GLVM Configuration Assistant is logged to /var/hacmp/log/
cl2siteconfig_assist.log (default). The log file can be redirected by using the standard SMIT
panels (smitty cm_hacmp_log_viewing_and_management_menu_dmn).
[GLVM wizard SMIT entry fields: service IP label Yuri_svc, persistent IP labels GLVM_A1_per
and GLVM_B1_per]
Figure 8-4 shows the result of the PowerHA configuration using the GLVM wizard with the
following information:
Cluster: YuriApp_cluster
Nodes: GLVM_A1 and GLVM_B1
Sites: siteA and siteB
XD_data network: net_XD_data_01
Application server: YuriApp
Service IP address: Yuri_svc
One resource group: YuriApp_group
Persistent IP address for each node: GLVM_A1_XDT_per and GLVM_B1_XDT_per
RPV server and client on each node, configured and added to the geographically
mirrored volume group yurivg
Mirror pools for each site, also created for the geographically mirrored volume group yurivg
[Figure 8-4: The YuriApp_cluster configuration. Node GLVM_A1 at Asite and node GLVM_B1 at
Bsite are connected over a TCP/IP WAN through the net_XD_data_01 network (GLVM_A1_XDT
10.10.101.107 / GLVM_B1_XDT 10.10.201.107, persistent labels GLVM_A1_XDT_per 10.10.102.107 /
GLVM_B1_XDT_per 10.10.202.107, service IP Yuri_svc 10.10.100.107). On each node, the yurivg
volume group contains a local PV (hdisk1) and an RPV (hdisk5).]
[Figure 8-5: Hardware topology of the test environment. Nodes GLVM_A1 and GLVM_A2 at Asite
and GLVM_B1 and GLVM_B2 at Bsite (hosted on POWER 550 and POWER 570 servers) are connected
through the xd_data and xd_ip networks over a TCP/IP WAN. Each site has SAN_B8 switches and
SVC clusters in front of DS4000 and DS8000 storage units.]
Although not shown in Figure 8-5, each site has its own local Ethernet type network,
non-routed between sites for the client access to the cluster nodes. Have non-IP networks
between the nodes in one site to prevent false failover.
For the TCP/IP WAN, we installed a Linux system and added a routing table to enable the
network connection between nodes in each site. In addition, to establish latency between
sites, we installed a WAN simulator.
In our scenario in Figure 8-6, we create two private communication lines for the XD_data
networks between sites and two Ethernet interfaces on each node for connection to each
private network. These networks are used for the geographically mirrored volume groups.
[Figure 8-6: GLVM_cluster network topology. Two XD_data networks (xd_data_01 and xd_data_02)
and one XD_ip network (xd_ip_01) connect Asite and Bsite over the TCP/IP WAN. Interface
labels and addresses: GLVM_A1_XDIP 10.10.100.107 / GLVM_B1_XDIP 10.10.200.107, GLVM_A1_XDT1
10.10.101.107 / GLVM_B1_XDT1 10.10.201.107, GLVM_A1_XDT2 10.10.102.107 / GLVM_B1_XDT2
10.10.202.107, GLVM_A1_BOOT 192.168.8.107 / GLVM_B1_BOOT 192.168.9.107, with the
corresponding .108 addresses for GLVM_A2 and GLVM_B2. Each site also has local Ethernet
networks (ether_01, ether_02) and rs232 non-IP links, and the yurivg and yunavg volume groups
span local PVs and RPVs.]
We define one XD_ip network for the communication links between sites in a separate IP
segment from the XD_data networks. Each site has its own local network, non-routed
between sites, for client access to the cluster nodes. Although we did not configure non-IP
networks within a site, in a production environment you should have non-IP networks. For more
information, see the HACMP for AIX: Planning and Administration Guide, SC23-4862.
We first create a resource group, primary at Asite, that contains a GMVG, yurivg. Later, we
add a second resource group to the cluster configuration, primary at Bsite, with another
GMVG, yunavg. The cluster has the following information:
Cluster name: GLVM_cluster
Site and participating nodes:
Asite: GLVM_A1 and GLVM_A2
Bsite: GLVM_B1 and GLVM_B2
Topology
net_XD_data_01: GLVM_A1_XDT1, GLVM_A2_XDT1, GLVM_B1_XDT1, and
GLVM_B2_XDT1
net_XD_data_02: GLVM_A1_XDT2, GLVM_A2_XDT2, GLVM_B1_XDT2, and
GLVM_B2_XDT2
net_XD_ip_01: GLVM_A1_XDIP, GLVM_A2_XDIP, GLVM_B1_XDIP, and
GLVM_B2_XDIP
Resource group   Application server   Service IP     Node       Volume group
YuriRG           YuriApp              GLVM_A1_RG     GLVM_A1    yurivg
                                      GLVM_A2_RG     GLVM_A2
YunaRG           YunaApp              GLVM_B1_RG     GLVM_B1    yunavg
                                      GLVM_B2_RG     GLVM_B2
[Figure shows the RPV data flow while yurivg is active at Asite: on GLVM_A1, AIX LVM writes
go through the RPV client device driver to the RPV server kernel extension on GLVM_B1 over
the XD_data addresses (10.10.101.107/10.10.201.107 and 10.10.102.107/10.10.202.107; the .108
addresses serve GLVM_A2 and GLVM_B2). The yurivg volume group has local PVs (hdisk1, hdisk2)
and RPVs (hdisk5, hdisk6) on the Asite side, backed by local PVs (hdisk3, hdisk4) on the
Bsite side.]
Figure 8-7 Remote physical volume configuration for yurivg activated in GLVM_A1 (Asite)
2. Configure an RPV server site name Bsite on the GLVM_B1 and GLVM_B2 nodes at the
   remote site (Figure 8-8). Run the smitty rpvserver command. Then, select Remote
   Physical Volume Server Site Name Configuration → Define / Change / Show Remote
   Physical Volume Server Site Name.
Define / Change / Show Remote Physical Volume Server Site Name
Type or select values in entry fields.
Press Enter AFTER making all desired changes. [Entry Fields]
* Remote Physical Volume Server Site Name
[Bsite]
Figure 8-8 Define RPV server site name on Bsite
3. Check the available physical volumes and configure the RPV servers. Run the smitty
   rpvserver command and select Remote Physical Volume Server Site Name
   Configuration → Add Remote Physical Volume Servers.
If you have multiple RPV client IP addresses, place the IP addresses, separated by
commas, in the IP address of the remote physical volume client. For example, the RPV
client IP address is 10.10.101.107, 10.10.102.107 for GLVM_B1 node.
Specify that the RPV server start immediately, which indicates that the RPV server is in
the available state. Set the Configure Automatically at System Restart field to No
(Example 8-5 and Figure 8-9 on page 358).
Example 8-5 RPV server configuration in GLVM_B1 node for yurivg
[lspv output on GLVM_B1: the rootvg disk is active and four other disks are not assigned to
any volume group (None).]
4. Configure RPV clients on GLVM_A1, which creates RPV clients hdisk5 and hdisk6 on the
local node. Run the smitty rpvclient command. Select Add Remote Physical Volume
Clients. Then, enter the remote physical volume server internet address of GLVM_B1 and
the remote physical volume local internet address of GLVM_A1.
Specify that the RPV client start immediately, which puts the RPV client in the
available state (Figure 8-10 and Example 8-6).
Add Remote Physical Volume Clients
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Remote Physical Volume Server Internet Address
10.10.201.107
Remote Physical Volume Local Internet Address
10.10.101.107
Physical Volume Identifiers
00c0f6a02fae31720000000000000000 00c0f6a02fae31dd0000000000000000
I/O Timeout Interval (Seconds)
[180] #
Start New Devices Immediately?
[yes] +
Figure 8-10 Add Remote Physical Volume Clients panel
Example 8-6 RPV client configuration in GLVM_A1 node for yurivg
[Output showing the new RPV client devices on GLVM_A1 in the active state.]
Multiple RPV server-client networks: If there are multiple RPV server-client networks
(Figure 8-11 and Example 8-7), add them through the GLVM utilities or by using the chdev
command. Run the smitty glvm_utils command, and select Remote Physical
Volume Clients → Change Multiple Remote Physical Volume Clients.
/usr/sbin/chdev -l hdiskX -a server_addr=server_ip -a local_addr=local_ip \
  -a server_addr2=server_ip2 -a local_addr2=local_ip2
[SMIT panel fields for the additional RPV client network on GLVM_A1: server addresses
10.10.201.107 and 10.10.202.107, local addresses 10.10.101.107 and 10.10.102.107. The
accompanying lspv output shows the rootvg disk active and the remaining disks unassigned
(None).]
[SMIT panel fields on GLVM_A2: additional server address 10.10.202.108 and local address
10.10.102.108]
Figure 8-13 Configure RPV clients on the GLVM_A2 node and add more networks
7. Add remote physical volumes to the volume group (Figure 8-14). You can use the
   GLVM utilities or the extendvg command. Run the smitty glvm_utils command and
   select Geographically Mirrored Volume Groups → Manage Legacy Geographically
   Mirrored Volume Groups → Add Remote Physical Volumes to a Volume Group.
Add Remote Physical Volumes to a Volume Group
Type or select values in entry fields.
Press Enter AFTER making all desired changes. [Entry Fields]
* VOLUME GROUP name
yurivg
Force
[no] +
* REMOTE PHYSICAL VOLUMES Name
hdisk5 hdisk6
Figure 8-14 Add RPVs on the volume group
8. Add a remote site mirror copy to the volume group (Figure 8-15). You can use the
   GLVM utilities or the mirrorvg command. Run the smitty glvm_utils command and
   select Geographically Mirrored Volume Groups → Manage Legacy Geographically
   Mirrored Volume Groups → Add a Remote Site Mirror Copy to a Volume Group.
Add a Remote Site Mirror Copy to a Volume Group
Type or select values in entry fields.
Press Enter AFTER making all desired changes. [Entry Fields]
* VOLUME GROUP name
yurivg
Mirror Sync Mode
[Foreground] +
* REMOTE PHYSICAL VOLUME names
hdisk5 hdisk6
Number of COPIES of each logical partition
2 +
Keep Quorum Checking On?
no +
Figure 8-15 Add a remote site mirror copy to the yurivg volume group
9. When volume group mirroring is finished, vary off the volume group on the GLVM_A1
node and import it to the GLVM_A2 node (Example 8-9). For the shared volume group,
have the same volume group major number within the cluster. For more information, see
HACMP for AIX 6.1 Geographic LVM: Planning and Administration Guide, SA23-1338.
Example 8-9 Import volume group yurivg in GLVM_A2
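The captured output of Example 8-9 is not reproduced in this copy. As a hedged sketch of
step 9 (the disk name and major number are illustrative assumptions only):

   varyoffvg yurivg                     # on GLVM_A1: deactivate the mirrored volume group
   importvg -y yurivg -V 100 hdisk1     # on GLVM_A2: import it, keeping the same major number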
[Figure shows the RPV data flow while yurivg is active at Bsite: on GLVM_B1, AIX LVM writes
go through the RPV client device driver to the RPV server kernel extension on GLVM_A1 over
the XD_data addresses (10.10.101.107/10.10.201.107 and 10.10.102.107/10.10.202.107; the .108
addresses serve GLVM_A2 and GLVM_B2). At Bsite, yurivg contains local PVs (hdisk3, hdisk4)
and RPVs (hdisk5, hdisk6) that map to the local PVs (hdisk1, hdisk2) at Asite.]
Figure 8-16 Remote physical volume configuration for yurivg activated in GLVM_B1(Bsite)
To configure RPV servers and clients at both sites for the mirroring function to work both ways
in the cluster:
1. Configure RPV servers in the defined state (Figure 8-17) on the GLVM_B1 and GLVM_B2
nodes at the remote site by using GLVM utilities or the rmdev command. Run the smitty
rpvserver command, and select Remove Remote Physical Volume Servers.
Remove Remote Physical Volume Servers

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Remote Physical Volume Servers                        rpvserver0 rpvserver1
  Keep definitions in database?                         [yes]           +

or

root@GLVM_B1 / > rmdev -l rpvserver0
rpvserver0 Defined
root@GLVM_B1 / > rmdev -l rpvserver1
rpvserver1 Defined
root@GLVM_B2 / > rmdev -l rpvserver0
rpvserver0 Defined
root@GLVM_B2 / > rmdev -l rpvserver1
rpvserver1 Defined

Figure 8-17 Configure RPV servers in the defined state
2. Configure RPV clients in the defined state on the GLVM_A1 and GLVM_A2 nodes at the
local site by using GLVM utilities or the rmdev command (Figure 8-18). Run the smitty
rpvclient command, and select Remove Remote Physical Volume Clients.
Remove Remote Physical Volume Clients

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Remote Physical Volume Clients                        hdisk5 hdisk6
  Keep definitions in database?                         [yes]           +

or

root@GLVM_A1 / > rmdev -l hdisk5
hdisk5 Defined
root@GLVM_A1 / > rmdev -l hdisk6
hdisk6 Defined
root@GLVM_A2 / > rmdev -l hdisk5
hdisk5 Defined
root@GLVM_A2 / > rmdev -l hdisk6
hdisk6 Defined

Figure 8-18 Configure RPV clients in the defined state
3. Configure an RPV server site named Asite on the GLVM_A1 and GLVM_A2 nodes at the
local site (Figure 8-19).
Define / Change / Show Remote Physical Volume Server Site Name

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
* Remote Physical Volume Server Site Name               [Asite]

Figure 8-19 Define RPV server site name on Asite
4. Configure RPV servers on GLVM_A1 and GLVM_A2 at the local site (Example 8-10,
Figure 8-20 on page 364, and Figure 8-21 on page 364).
Example 8-10 Configuring RPV servers on GLVM_A1
(Listing residue: lspv output on GLVM_A1 showing rootvg and the yurivg disks for which the RPV server instances are created on the local nodes.)
5. Configure RPV clients on the GLVM_B1 and GLVM_B2 nodes at the remote site and
configure multiple networks on the RPV clients by using the chdev command or the
GLVM utilities (Figure 8-22 and Figure 8-23 on page 365).
Add Remote Physical Volume Clients

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Remote Physical Volume Server Internet Address        10.10.101.107
  Remote Physical Volume Local Internet Address         10.10.201.107
  Physical Volume Identifiers                           000fe4112f99817c0000000000000000
                                                        000fe4112f9982350000000000000000
  I/O Timeout Interval (Seconds)                        [180]           #
  Start New Devices Immediately?                        [yes]           +

root@GLVM_B1 / > chdev -l hdisk5 -a local_addr=10.10.201.107 -a local_addr2=10.10.202.107 \
  -a server_addr=10.10.101.107 -a server_addr2=10.10.102.107
root@GLVM_B1 / > chdev -l hdisk6 -a local_addr=10.10.201.107 -a local_addr2=10.10.202.107 \
  -a server_addr=10.10.101.107 -a server_addr2=10.10.102.107

Figure 8-22 Configure RPV clients on GLVM_B1
6. Vary off the volume group at the local site (Asite) and update the volume group
definitions on the nodes at the remote site (Bsite) by running the importvg command
(Example 8-11).
Example 8-11 Import volume group yurivg in Bsite
We have another volume group, yunavg, at the remote site (Bsite), which mutually takes
over from the local site (Asite) in case of a site failure (Figure 8-24). To configure the volume
group as a geographically mirrored volume group, go back to 8.2.2, Configuring geographically
mirrored volume groups on page 344, and repeat step 1 and step 2.
(Figure residue: the same topology as Figure 8-16, now showing yunavg mirrored between its local physical volumes at Bsite and the remote physical volumes at Asite over the two XD_data networks.)
Figure 8-24 Remote physical volume configuration for yunavg activated in GLVM_B1 (Bsite)
#XD_IP Network
10.10.100.107   GLVM_A1_XDIP
10.10.100.108   GLVM_A2_XDIP
10.10.200.107   GLVM_B1_XDIP
10.10.200.108   GLVM_B2_XDIP
#XD_IP Data1
10.10.101.107   GLVM_A1_XDT1
10.10.101.108   GLVM_A2_XDT1
10.10.201.107   GLVM_B1_XDT1
10.10.201.108   GLVM_B2_XDT1
#XD_IP Data2
10.10.102.107   GLVM_A1_XDT2
10.10.102.108   GLVM_A2_XDT2
10.10.202.107   GLVM_B1_XDT2
10.10.202.108   GLVM_B2_XDT2

Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
* Network Name                                          [net_XD_data_02]
* Network Type                                          XD_data
* Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Enable IP Address Takeover via IP Aliases             [Yes]           +
  IP Address Offset for Heartbeating over IP Aliases    []

3. Return to step 2, select the XD_ip network type, and enter the information (Figure 8-27).

Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
* Network Name                                          [net_XD_ip_01]
* Network Type                                          XD_ip
* Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Enable IP Address Takeover via IP Aliases             [Yes]           +
  IP Address Offset for Heartbeating over IP Aliases    []

4. Repeat these steps for all nodes and for each network. You can verify the configuration in SMIT by selecting Extended Configuration → Extended Topology Configuration → Configure HACMP Communication Interfaces/Devices → Change/Show Communication Interfaces/Devices, or by using the cllsif cluster utility (Example 8-13).
Example 8-13 Configured XD-type networks on the GLVM_cluster
root@GLVM_A1 / > cllsif
Adapter        Type   Network          Net Type   Attribute   Node      IP Address      Interface   Netmask         Prefix Length
GLVM_A1_XDT1   boot   net_XD_data_01   XD_data    public      GLVM_A1   10.10.101.107   en1         255.255.255.0   24
GLVM_A1_XDT2   boot   net_XD_data_02   XD_data    public      GLVM_A1   10.10.102.107   en2         255.255.255.0   24
GLVM_A1_XDIP   boot   net_XD_ip_01     XD_ip      public      GLVM_A1   10.10.100.107   en0         255.255.255.0   24
GLVM_A2_XDT1   boot   net_XD_data_01   XD_data    public      GLVM_A2   10.10.101.108   en1         255.255.255.0   24
GLVM_A2_XDT2   boot   net_XD_data_02   XD_data    public      GLVM_A2   10.10.102.108   en3         255.255.255.0   24
GLVM_A2_XDIP   boot   net_XD_ip_01     XD_ip      public      GLVM_A2   10.10.100.108   en0         255.255.255.0   24
GLVM_B1_XDT1   boot   net_XD_data_01   XD_data    public      GLVM_B1   10.10.201.107   en2         255.255.255.0   24
GLVM_B1_XDT2   boot   net_XD_data_02   XD_data    public      GLVM_B1   10.10.202.107   en3         255.255.255.0   24
GLVM_B1_XDIP   boot   net_XD_ip_01     XD_ip      public      GLVM_B1   10.10.200.107   en1         255.255.255.0   24
GLVM_B2_XDT1   boot   net_XD_data_01   XD_data    public      GLVM_B2   10.10.201.108   en2         255.255.255.0   24
GLVM_B2_XDT2   boot   net_XD_data_02   XD_data    public      GLVM_B2   10.10.202.108   en3         255.255.255.0   24
GLVM_B2_XDIP   boot   net_XD_ip_01     XD_ip      public      GLVM_B2   10.10.200.108   en1         255.255.255.0   24
                                                       [Entry Fields]
* Network Name                                          [net_ether_01]
* Network Type                                          ether
* Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Enable IP Address Takeover via IP Aliases             [Yes]           +
  IP Address Offset for Heartbeating over IP Aliases    []

GLVM_A1_boot   boot   net_ether_01   ether   public   GLVM_A1   192.168.8.107   en3   255.255.255.0   24
GLVM_A2_boot   boot   net_ether_01   ether   public   GLVM_A2   192.168.8.108   en2   255.255.255.0   24
GLVM_B1_boot   boot   net_ether_02   ether   public   GLVM_B1   192.168.9.107   en0   255.255.255.0   24
GLVM_B2_boot   boot   net_ether_02   ether   public   GLVM_B2   192.168.9.108   en0   255.255.255.0   24
                                                       [Entry Fields]
* Resource Group Name                                   [GLVM_A1_RG]
  Inter-site Management Policy                          [ignore]        +
* Participating Nodes from Primary Site                 [GLVM_A1 GLVM_A2] +
  Participating Nodes from Secondary Site               []              +

2. Repeat step 1 to add more resource groups (Table 8-3 on page 372). YuriRG and YunaRG
are the resource groups that take over at the remote site in case of a site failure. For these
two resource groups, set the inter-site policy and select the participating nodes from the
secondary site.

Resource group   Inter-site policy       Participating nodes     Participating nodes
                                         from primary site       from secondary site
GLVM_A1_RG       Ignore                  GLVM_A1 GLVM_A2         NA
GLVM_A2_RG       Ignore                  GLVM_A2 GLVM_A1         NA
GLVM_B1_RG       Ignore                  GLVM_B1 GLVM_B2         NA
GLVM_B2_RG       Ignore                  GLVM_B2 GLVM_B1         NA
YuriRG           Prefer Primary Site     GLVM_A1 GLVM_A2         GLVM_B1 GLVM_B2
YunaRG           Prefer Primary Site     GLVM_B1 GLVM_B2         GLVM_A1 GLVM_A2

For all resource groups, the startup policy is Online On Home Node Only, the failover policy is
Failover To Next Priority Node In The List, and the failback policy is Failback To Higher Priority
Node In The List. The inter-site policy for the resource groups that contain the geographically
mirrored volume groups is set to Prefer Primary Site; it must not be set to Ignore or to Online
On Both Sites.
Add resources to the resource groups to integrate the geographically mirrored volume groups
into the PowerHA resource groups:
1. Run smitty hacmp. Select Extended Configuration → Extended Resource
Configuration → HACMP Extended Resource Group Configuration →
Change/Show Resources and Attributes for a Resource Group.
2. Select the resource group and enter the information that is listed in Table 8-4.
Table 8-4 List of resource group names mapped to application servers and service IP addresses
Resource group name   Application server   Service IP   Volume group
GLVM_A1_RG            GLVM_A1
GLVM_A2_RG            GLVM_A2
GLVM_B1_RG            GLVM_B1
GLVM_B2_RG            GLVM_B2
YuriRG                YuriApp                           yurivg
YunaRG                YunaApp                           yunavg

3. Set the Use forced varyon of volume groups, if necessary field to true (Figure 8-33); the
default is false. This setting allows PowerHA Enterprise Edition for GLVM to vary on the
volume group when quorum is disabled and the remote site fails.

Change/Show All Resources and Attributes for a Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Resource Group Name                                   YuriRG
  Inter-site Management Policy                          Prefer Primary Site
  Participating Nodes from Primary Site                 GLVM_A1 GLVM_A2
  Participating Nodes from Secondary Site               GLVM_B1 GLVM_B2

  Startup Policy                                        Online On Home Node Only
  Failover Policy                                       Failover To Next Priority Node In The List
  Failback Policy                                       Failback To Higher Priority Node In The List
  Failback Timer Policy (empty is immediate)            []              +

  Service IP Labels/Addresses                           []              +
  Application Servers                                   [YuriApp]       +
  Volume Groups                                         [yurivg]        +
  Use forced varyon of volume groups, if necessary      true            +
  Automatically Import Volume Groups                    false           +
  Allow varyon with missing data updates?               true            +
   (Asynchronous GLVM Mirroring Only)
  Default choice for data divergence recovery           []              +
   (Asynchronous GLVM Mirroring Only)

Figure 8-33 Integrating the geographically mirrored volume yurivg into the resource group

Note: If you configured an asynchronous geographically mirrored volume group, use the
Default choice for data divergence recovery field to select which site's copy to preserve
when recovering from data divergence, and use the Allow varyon with missing data
updates field to allow or prevent data divergence when a non-graceful failover occurs.

4. Repeat the necessary steps to complete each resource group configuration.
(SMIT panel residue: the cluster verification and synchronization panel, run with its default options.)
(Figure residue: three-node test cluster with Node A1 and Node A2 at the NY site and Node B1 at Site UK, each a Power 550 running AIX 6.1, PowerHA 6.1, and GLVM 6.1, connected through XD_IP and XD_Data networks over the LAN and WAN, with disks hdisk1 - hdisk8.)

Node          Network    IP address
GLVM_NY_A1    XD_data    10.12.5.50
GLVM_NY_A2    XD_data    10.12.5.53
GLVM_UK_B1    XD_data    10.137.62.112
aixod17.lpar.co.uk:/ # cltopinfo
Cluster Name: GLVM_NY_UK
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
There are 3 node(s) and 3 network(s) defined
NODE GLVM_NY_A1:
Network net_XD_data_01
GLVMUK_A1_XDIP 10.12.5.50
Network net_XD_ip_01
Network net_ether_02
NODE GLVM_NY_A2:
Network net_XD_data_01
GLVMUK_A2_XDIP 10.12.5.53
Network net_XD_ip_01
Network net_ether_02
NODE GLVM_UK_B1:
Network net_XD_data_01
GLVMUK_B1_XDIP 10.137.62.212
Network net_XD_ip_01
Network net_ether_02
(Listing residue: two resource groups are defined, Zhifa_rg and Rebecca_rg, each with its startup, failover, and failback policies.)
We now need to configure GLVM, which will be integrated with our cluster resource group.
Configuring GLVM
In this section, we briefly describe the steps for configuring GLVM in a three-node cluster
configuration. For information about how to configure GLVM, see 8.3.4, Configuring GLVM
and PowerHA by using the GLVM wizard on page 352.
In our environment, node1 and node2 have four units of shared disk that reside in an IBM
DS8000, whereas node3 at the UK site only has internal disks. Example 8-16 shows disk
configuration for these nodes before configuring GLVM.
Example 8-16 Disk configuration before configuring GLVM
(Listing residue: lspv output on GLVM_NY_A1 and GLVM_NY_A2 shows rootvg plus four shared disks that are not yet assigned to a volume group.)
aixod17.lpar.co.uk:/ # lspv
hdisk0          0003ee875d250553    rootvg    active
hdisk1          0003ee876c83ee54    None
hdisk2          0003ee876c83fa0e    None
hdisk3          0003ee876c840e63    None
hdisk4          0003ee876c841919    None
Example 8-16 on page 376 shows that hdisk1 - hdisk4 on the GLVM_NY_A1 node are the
same disks as hdisk2 - hdisk5 on the GLVM_NY_A2 node. These four disks are configured as
GLVM disks at the NY site. They have corresponding GLVM disks, hdisk1 - hdisk4, on the
GLVM_UK_B1 node.
To configure GLVM:
1. Create the RPV server at the UK site.
These steps make the hard disk from the GLVM_UK_B1 node accessible in the
GLVM_NY_A1 and GLVM_NY_A2 nodes. We create the access by using smitty
glvm_utils, and Example 8-17 shows the result.
Example 8-17 RPV server in the GLVM_UK_B1 node
(Listing residue: the rpvserver instances for the four UK disks are created and listed in the Available state on GLVM_UK_B1.)

2. Create the RPV clients at the NY site. On the GLVM_NY_A1 node, the four remote UK
disks appear as additional hdisks (Example 8-18).
Example 8-18 RPV clients in the GLVM_NY_A1 node
root@GLVMUK_570_1_A1 / > lsdev -t rpvclient
hdisk5 Available  Remote Physical Volume Client
hdisk6 Available  Remote Physical Volume Client
hdisk7 Available  Remote Physical Volume Client
hdisk8 Available  Remote Physical Volume Client
Check also the configuration in the GLVM_NY_A2 node, and the output must also have
four rpvclients.
3. Create the RPV server at the NY site.
Create the RPV server at the NY site in both nodes, beginning with creating it in the
GLVM_NY_A1 node. Example 8-19 shows the result.
Example 8-19 RPV server configuration in the GLVM_NY_A1 node
(Listing residue: lsdev and lspv output on GLVM_NY_A1 showing the new rpvserver devices in the Available state and the disks that belong to the ny1vg and ny2vg volume groups.)
Vary off the volume group in the GLVM_UK_A1 node and then, import it in all other nodes.
root@GLVMUK_570_1_A1 / > lsmp -A ny1vg
VOLUME GROUP:   ny1vg
MIRROR POOL:    ny1_mp1          Mirroring Mode:  SYNC
MIRROR POOL:    ny1_mp2          Mirroring Mode:  SYNC
root@GLVMUK_570_1_A1 / > lsmp -A ny2vg
VOLUME GROUP:   ny2vg
MIRROR POOL:    ny2_mp1          Mirroring Mode:  SYNC
MIRROR POOL:    ny2_mp2          Mirroring Mode:  SYNC
(Listing residue: lsvg -l output for ny1vg and ny2vg; each volume group contains four logical volumes, all in the closed/syncd state.)
We convert all the mirror pools to asynchronous mirroring. Example 8-25 shows the status
of the mirror pools after the conversion.
Example 8-25 Mirror pool status after converting to asynchronous
(Listing residue: lsmp -A output for ny1vg and ny2vg; all four mirror pools now report Mirroring Mode ASYNC with their aio_cache logical volumes ny1_aiocache, ny1_2_aiocache, ny2_aiocache, and ny2_2_aiocache, a valid cache, no data divergence, and a high water mark of 100.)
We intend to include the ny1vg and ny2vg volume groups in these resource groups. We run
the smitty hacmp command, and then integrate the volume group (Figure 8-38).
Change/Show All Resources and Attributes for a Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[TOP]                                                  [Entry Fields]
  Resource Group Name                                   Zhifa_rg
  Inter-site Management Policy                          Prefer Primary Site
  Participating Nodes from Primary Site                 GLVM_NY_A1 GLVM_NY_A2
  Participating Nodes from Secondary Site               GLVM_UK_B1

  Startup Policy
  Failover Policy
  Failback Policy
  Failback Timer Policy (empty is immediate)            []
  Service IP Labels/Addresses                           []              +
  Application Servers                                   [Zhifa_app]     +
  Volume Groups                                         [ny1vg]         +
  Use forced varyon of volume groups, if necessary      false           +
  Automatically Import Volume Groups                    false           +
  Default choice for data divergence recovery           ignore          +

We use the same panel (Figure 8-38) to include the ny2vg volume group in the Rebecca_rg
resource group.
The status of the resource group before and after the node failure can be checked by using
the /usr/es/sbin/cluster/utilities/clRGinfo command. We check before and after the
testing. Example 8-26 shows the result.
Example 8-26 Resource group status before and after the testing node failure
Before testing:
aixod17.lpar.co.uk:/ # clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State       Node
-----------------------------------------------------------------------------
Zhifa_rg       ONLINE            GLVM_NY_A1@NY
               OFFLINE           GLVM_NY_A2@NY
               ONLINE SECONDARY  GLVM_UK_B1@UK

Rebecca_rg     ONLINE            GLVM_NY_A2@NY
               OFFLINE           GLVM_NY_A1@NY
               ONLINE SECONDARY  GLVM_UK_B1@UK

After testing:
aixod17.lpar.co.uk:/ # clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State       Node
-----------------------------------------------------------------------------
Zhifa_rg       OFFLINE           GLVM_NY_A1@NY
               ONLINE            GLVM_NY_A2@NY
               ONLINE SECONDARY  GLVM_UK_B1@UK

Rebecca_rg     ONLINE            GLVM_NY_A2@NY
               OFFLINE           GLVM_NY_A1@NY
               ONLINE SECONDARY  GLVM_UK_B1@UK
Example 8-26 shows that when the GLVM_NY_A1 node failed, the resource group Zhifa_rg is
automatically taken over by node GLVM_NY_A2. When the failed node, GLVM_NY_A1, is up
and joins the cluster again, the resource group Zhifa_rg is reacquired by node GLVM_NY_A1.
Before testing:
aixod17.lpar.co.uk:/ # clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State       Node
-----------------------------------------------------------------------------
Zhifa_rg       ONLINE            GLVM_NY_A1@NY
               OFFLINE           GLVM_NY_A2@NY
               ONLINE SECONDARY  GLVM_UK_B1@UK

Rebecca_rg     ONLINE            GLVM_NY_A2@NY
               OFFLINE           GLVM_NY_A1@NY
               ONLINE SECONDARY  GLVM_UK_B1@UK

After testing:
aixod17.lpar.co.uk:/ # clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State       Node
-----------------------------------------------------------------------------
Zhifa_rg       OFFLINE           GLVM_NY_A1@NY
               OFFLINE           GLVM_NY_A2@NY
               ONLINE            GLVM_UK_B1@UK

Rebecca_rg     OFFLINE           GLVM_NY_A2@NY
               OFFLINE           GLVM_NY_A1@NY
               ONLINE            GLVM_UK_B1@UK

Example 8-27 on page 384 shows that when both the GLVM_NY_A1 and GLVM_NY_A2 nodes
are down, the Zhifa_rg and Rebecca_rg resource groups are taken over automatically by the
GLVM_UK_B1 node. It takes around 5 minutes for the GLVM_UK_B1 node to detect the site
failure at the NY site.

(Listing residue: asynchronous I/O cache statistics showing 0.00% total cache wait, 0 maximum cache wait, and 326655 KB of free cache space.)

The chmp command can be used to raise or lower the high water mark, which is the percentage
of the aio_cache logical volume that is used. The default is 100%. To change the percentage of
the I/O cache size that is used, enter:
chmp -h <percent> -m <mirrorpool> <volumegroup>

You must specify the -h option when you make the volume group asynchronous if you want a
high water mark lower than 100%, because the value cannot be decreased after it is set; only a
dynamic increase is supported (Figure 8-39). The RPV client detects the high water mark
change under these two conditions:
The mirror pool is changed from asynchronous to synchronous and back to asynchronous.
The RPV clients are stopped and started, which is accomplished by bringing the resource
group offline and then online.
root@GLVM_A1 / > lsmp -A yurivg
VOLUME GROUP:       yurivg
MIRROR POOL:        Asite            Mirroring Mode:      ASYNC
ASYNC MIRROR STATE: inactive         ASYNC CACHE LV:      datacachelv2
ASYNC CACHE VALID:  yes              ASYNC CACHE EMPTY:   yes
ASYNC CACHE HWM:    100              ASYNC DATA DIVERGED: no

MIRROR POOL:        Bsite            Mirroring Mode:      ASYNC
ASYNC MIRROR STATE: active           ASYNC CACHE LV:      datacachelv1
ASYNC CACHE VALID:  yes              ASYNC CACHE EMPTY:   no
ASYNC CACHE HWM:    100              ASYNC DATA DIVERGED: no

root@GLVM_A1 / > lsmp -A yurivg
VOLUME GROUP:       yurivg
MIRROR POOL:        Asite            Mirroring Mode:      ASYNC
ASYNC MIRROR STATE: inactive         ASYNC CACHE LV:      datacachelv2
ASYNC CACHE VALID:  yes              ASYNC CACHE EMPTY:   yes
ASYNC CACHE HWM:    100              ASYNC DATA DIVERGED: no

MIRROR POOL:        Bsite            Mirroring Mode:      ASYNC
ASYNC MIRROR STATE: active           ASYNC CACHE LV:      datacachelv1
ASYNC CACHE VALID:  yes              ASYNC CACHE EMPTY:   no
ASYNC CACHE HWM:    3                ASYNC DATA DIVERGED: no
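For example, to lower the high water mark of the Bsite mirror pool of yurivg to 80% of the aio_cache size (the value 80 is only illustrative):

root@GLVM_A1 / > chmp -h 80 -m Bsite yurivg    # set the cache high water mark to 80%
root@GLVM_A1 / > lsmp -A yurivg                # the new ASYNC CACHE HWM value appears after the RPV clients restart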
(Listing residue: asynchronous I/O cache statistics before and after the change, showing 0.00% total cache wait, 0 maximum cache wait, and 8703 KB and 8694 KB of free cache space.)
8.8 Monitoring
To assist you in monitoring the state of your GLVM environment, including the RPVs and
GMVGs that you have configured, GLVM includes two tools:
rpvstat
gmvgstat
These commands provide real-time status information about RPVs and GMVGs.
The rpvstat command can optionally display its I/O-related statistics on a per-network
basis. The network summary option shows more information about network throughput
in KBps.
The rpvstat command can also display the highest recorded values for the pending statistics,
which includes the following historical high water mark numbers:
Maximum number of pending reads per device and network (high water mark)
Maximum number of pending KBs to read per device and network (high water mark)
Maximum number of pending writes per device and network (high water mark)
Maximum number of pending KBs to write per device and network (high water mark)
These statistics are reported on a separate display and include the additional statistic of
the number of I/O operations that were tried again (combination of both reads and writes).
The rpvstat command allows information to be displayed for all RPV clients on the system or
for a subset of RPV clients that is specified by the RPV client name on the command line. The
rpvstat command also allows the information to be monitored (redisplayed at user-specified
intervals).
The rpvstat command interacts with the RPV client pseudo device driver to retrieve the
information that the client displays.
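For instance, to watch one RPV client and the per-network summary at a regular interval (the device name and interval are only illustrative):

rpvstat -t -i 10 -c 6 hdisk5    # per-device statistics every 10 seconds, six times, with timestamps
rpvstat -N -i 10 -c 6           # per-network summary over the same period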
Syntax
The rpvstat command has the following syntax:
rpvstat -h
rpvstat [-n] [-t] [-i Interval [-c Count] [-d]] [rpvclient_name . . .]
rpvstat -N [-t] [-i Interval [-c Count] [-d]]
rpvstat -m [-n] [-t] [rpvclient_name . . .]
rpvstat -R [-r] [rpvclient_name . . .]
rpvstat -r [-R] [rpv-device(s)...]
rpvstat -A [-t] [-i Interval [-d] [-c Count]] [rpv-device(s)...]
rpvstat -C [-t] [-i Interval [-d] [-c Count]] [rpv-device(s)...]
Description
The rpvstat command displays statistical information that is available from the RPV
client devices. The read and write errors are displayed together; these counters indicate the
number of I/O errors that are returned to the application.
The rpvstat command can optionally display its I/O-related statistics on a per-network basis.
A network summary option of the command displays more information about the network
throughput in KBps. The throughput is calculated per interval time that is specified by the user
while in monitoring mode.
The rpvstat command can also display the highest recorded values for the pending statistics
(historical high water mark numbers). These statistics are reported on a separate display and
include the additional statistic of the number of I/O operations that are tried again (both reads
and writes). This count records the number of retried I/Os that occurred on a network or device
and can be used as an indicator of a marginal or failing network.
You can also display the statistics for asynchronous mirroring. The rpvstat command prints
overall asynchronous statistics when you use the -A option. To display statistics per device,
specify the list of devices. You can display the asynchronous I/O cache information by using
the -C option (Table 8-6).
Table 8-6 Flags
Flag          Description
-h            Displays the command usage (help).
-R            Resets the statistical counters in the indicated RPV clients (requires root privilege).
-t            Includes the date and time in the display.
-n            Displays I/O statistics for each individual mirroring network.
-N            Displays summary statistics by mirroring network, including the throughput rate for each network.
-i Interval   Automatically shows the status every <Interval> seconds. The value of the <Interval> parameter must be an integer greater than zero and less than or equal to 3600. If the <Interval> parameter is not specified, the status information is displayed once.
-c Count      Shows information at the indicated interval <Count> times. The value of the <Count> parameter must be an integer greater than zero and less than or equal to 999999. If the <Interval> parameter is specified but the <Count> parameter is not, the display repeats indefinitely.
-m            Displays historical maximum pending values (high water mark values) and the accumulated retry count.
-d            Displays applicable monitored statistics as delta values from the previous interval.
-A            Displays statistics for asynchronous mirroring.
-C            Displays asynchronous I/O cache statistics.
-r            Resets counters for the asynchronous I/O cache information. You can specify the -R and -r options together to reset all counters (requires root privilege).
The -A option prints the following statistical information for one or more asynchronous
devices:
Asynchronous device name.
Asynchronous status, which is printed as a single character:
A    The device is fully configured for asynchronous I/O and can accept asynchronous I/Os.
I
U    The device is not configured for asynchronous I/O and therefore acts as a synchronous device. All statistics are printed as zero.
X    The device status cannot be retrieved. All the remaining statistics are printed as zero.
Total asynchronous writes.
Maximum cache utilization in percent.
Number of pending asynchronous writes that are waiting for the cache flush after the cache
hits the high water mark.
Percentage of writes that are waiting for the cache flush after the cache hits the high water
mark limit.
Maximum time waited after the cache hits the high water mark, in seconds.
Current free space in the cache, in KB.
Notes
Note the following information:
The count of reads and writes is accumulated on a per-buffer basis. For example,
an application I/O passes a vector of buffers in a single read or write call. Instead of counting
that read or write as a single I/O, it is counted as the number of buffers in the vector.
The count of completed and pending I/O KB is truncated. Any fractional amount of a KB is
dropped in the output display.
The cx field in the display output shows the connection status and can have the following
values:
A number    The count of active network connections between the RPV client and its RPV server.
Y           Indicates that the required information cannot be retrieved from the device driver.
Exit status
This command returns the following exit values:
0     No errors.
>0    An error occurred.
Examples
This section lists sample commands that are used to display the information that is requested:
To display statistical information for all RPV clients, use the following command:
rpvstat
To display statistical information for RPV client hdisk14, use the following command:
rpvstat hdisk14
To reset the statistical counters in RPV client hdisk23, use the following command:
rpvstat -R hdisk23
To display statistical information for RPV client hdisk14 and repeat the display every
30 seconds for 12 times, use the following command:
rpvstat hdisk14 -i 30 -c 12
To display statistical information for all RPV clients and include information by mirroring
network, enter the following command:
rpvstat -n
To display statistical information for all mirroring networks, use the following command:
rpvstat -N
To display statistical information about maximum pending values for all RPV clients, use
the following command:
rpvstat -m
Files
The rpvstat command is located in /usr/sbin/rpvstat.
Related information
See the HACMP for AIX 6.1 Geographic LVM: Planning and Administration Guide, SA23-1338.
Example 2: Run the rpvstat command with no options, but specify a single RPV client to
display accumulated statistics for that particular RPV client only (Figure 8-41).
rpvstat hdisk158
Example 3: Run the rpvstat command with the -n option to show accumulated RPV client
statistics for each currently defined network (Figure 8-42).
rpvstat -n
(Figure 8-42 residue: accumulated per-network RPV client statistics for the mirroring networks 103.17.133.102, 103.17.133.104, 103.17.133.202, and 103.17.133.204, showing the read, write, KB, and error counters for each network.)
Example 4: Run the rpvstat command specifying a single RPV client, a monitor interval of
30 seconds with 3 repeats, and a display of the date and time for each interval. When
running in monitor mode with the -d option, several of the repeated statistics show only the
delta from their previous value, as indicated by a preceding plus sign (+) (Figure 8-43):
-i    Interval.
-c    Repeat count.
-d    Deltas, indicated by a plus sign.
-t    Displays the date and time for each interval.
rpvstat -t -i 30 -c 3 -d hdisk158
Remote Physical Volume Statistics
Example 5: Run the rpvstat command with the -N option to display summary statistics for
each mirroring network. Monitor every 30 seconds for a total of two repeats. This can be
used to detect errors on a particular network (Figure 8-44).
rpvstat -N -i 30 -c 2 -d
Remote Physical Volume Statistics
RPV Client Network
-----------------103.17.133.102
103.17.133.104
Example 6: Run the rpvstat command with the -m option to display the maximum pending
statistics (high water marks). This option shows the high water mark statistics first by RPV
device (for all networks), and then by network (for all devices) (Figure 8-45).
rpvstat -m
RPV Client          cx  Max Pend Reads  Max Pend Writes  Max Pend KBRead  Max Pend KBWrite  Total Retries
------------------  --  --------------  ---------------  ---------------  ----------------  -------------
hdisk144             4              28               56              568              9154             38
hdisk158             3              20              131               98              4817             14
hdisk173             0               9               27               31               990            212
hdisk202             X               0                0                0                 0              0

Network Summary:
103.17.133.102                       10               71              220              3312             10
103.17.133.104                       15               51              382              6231            201
103.17.133.202                       14               23               98               754             34
103.17.133.204                        9               11               31               321             12
Example 7: The rpvstat command with the -A option displays asynchronous I/O statistics
(Figure 8-46).
rpvstat -A
Remote Physical Volume Statistics:
RPV Client    ax  Completed   Completed   Cached     Cached     Pending    Pending
                  Async       Async       Async      Async      Async      Async
                  Writes      KB Writes   Writes     KB Writes  Writes     KB Writes
------------  --  ----------  ----------  ---------  ---------  ---------  ---------
hdisk10       A            0           0          0          0          0          0
hdisk9        A          230         115         10          5          0          0
hdisk8        A            2           8          0          0          0          0
hdisk7        A           29         116          0          0          0          0
hdisk6        A            0           0          0          0          0          0
Syntax
This command has the following syntax:
gmvgstat [-h] | [-r] [-t] [-i Interval [-c Count] [-w]]
[gmvg_name . . .]
Description
The gmvgstat command shows status information for one or more GMVGs: the number of
physical volumes and RPVs, the total and stale volumes, the total and stale physical
partitions, and the synchronization percentage.
If one or more GMVG names are supplied on the command line, the gmvgstat command
verifies that each listed GMVG name is a valid, available, online GMVG. In monitor mode, the
user-supplied list of GMVGs is verified during each loop.
If no GMVG names are supplied on the command line, the gmvgstat command reports
information about all valid, available, online GMVGs. In monitor mode, the list of GMVGs to
report on is regenerated during each loop. The gmvgstat command has the following flags:
-h
-r
Includes information for each individual RPV client that is associated with the
displayed GMVGs.
-t
-i interval Automatically shows the status every <Interval> seconds. The value of the
<Interval> parameter must be an integer greater than zero and less than or equal
to 3600. If the <Interval> parameter is not specified, this parameter shows the
status information once.
The -i interval is the time, in seconds, between each successive gathering and
display of GMVG statistics in monitor mode. This interval is not a precise measure
of the elapsed time between each successive updated display. The gmvgstat
command obtains certain information that it displays by calling other commands
and has no control over the amount of time that these commands take to
complete their processing. Larger numbers of GMVGs result in the gmvgtstat
command taking longer to gather information and elongates the time between
successive displays in monitor mode. In some cases, an underlying command
can take excessively long to complete and results in the gmvgstat command
taking much longer than the -i interval between displays.
-c Count
Shows information at the indicated interval <Count> times. The value of the
<Count> parameter must be an integer greater than zero and less than or equal
to 999999. If the <Interval> parameter is specified, but the <Count> parameter is
not, it is shown indefinitely.
-w          Clears the screen between each redisplay.
Operands
gmvg_name   The name of one or more GMVGs for which to display information. If no GMVG
            names are specified, information for all valid, available, online GMVGs is displayed.
Exit status
This command returns the following exit values:
0     No errors.
>0    An error occurred.
Examples
This section lists sample commands used to display the information requested:
To display statistical information for all GMVGs use the command:
gmvgstat
To display statistical information for the GMVG named red_gmvg7 use the command:
gmvgstat red_gmvg7
To display statistical information for the GMVG named red_gmvg7 with statistics for all the
RPVs associated with that volume group, use the command:
gmvgstat -r red_gmvg7
To display information for GMVG red_gmvg7 that is automatically redisplayed every 10
seconds, enter the following command:
gmvgstat red_gmvg7 -i 10
To display information for GMVG red_gmvg7 that is automatically redisplayed every 10
seconds for 20 intervals and clears the screen between each redisplay, enter the following
command:
gmvgstat red_gmvg7 -i 10 -c 20 -w
Files
The gmvgstat command is located in /usr/sbin/gmvgstat.
Related information
See the HACMP for AIX 6.1 Geographic LVM: Planning and Administration Guide,
SA23-1338.
(Listing residue: gmvgstat output for three GMVGs, showing for each one the number of PVs and RPVs, the total and stale volumes, the total and stale PPs, and sync percentages of 100%, 93%, and 50%.)
Example 2: Run the gmvgstat command with no options, but specifying the GMVG
blue_gmvg23 to display statistics for that particular GMVG only (Figure 8-48).
gmvgstat blue_gmvg23
GMVG Name        PVs RPVs Tot Vols  St Vols  Total PPs  Stale PPs  Sync
--------------- ---- ---- -------- -------- ---------- ---------- ----
blue_gmvg23        5    5       10        1        235         10  95%
Example 3: Run the gmvgstat command with the -t and -r options, specifying the GMVG
blue_gmvg23 to display statistics for the specified GMVG followed by statistics for each
RPV included in blue_gmvg23 (from the rpvstat command) (Figure 8-49).
gmvgstat -t -r blue_gmvg23
Geographically Mirrored Volume Group Information
------------------------------------------------
Selecting Status Monitors brings up the next panel (fastpath: glvmmonitors) (Figure 8-51).
Status Monitors
Move cursor to desired item and press Enter.
Remote Physical Volume Client Statistics
Geographically Mirrored Volume Group Status
Asynchronous Mirroring Statistics
Figure 8-51 Status Monitors panel
Select Remote Physical Volume Client Statistics to display the SMIT interface for the
rpvstat command for physical volumes. Select Geographically Mirrored Volume Group
Status to display the SMIT interface for the gmvgstat command. Select Asynchronous
Mirroring Statistics to display the SMIT interface for the rpvstat command for
asynchronous monitoring.
Select Display Remote Physical Volume Client Statistics to display the following panel
(fastpath: rpvstat_dialog) (Figure 8-53).
Display Remote Physical Volume Client Statistics

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Specific RPV(s) to display (leave blank for all)      []              +
  Display Statistics for Individual Mirroring Networks? no              +
  Include date and time in display?                     no              +
  Monitor interval in seconds                           []              #
  Number of times to monitor                            []              #
  Display applicable monitored statistics as delta      no              +
   values?
You can select from the following options for the field values to display RPV client statistics by
network:
Include date and time in display?
   You can alternately select no or yes.
Monitor interval in seconds
   This field is optional. It requires an integer value greater than zero and less than or equal
   to 3600.
Number of times to monitor
   This field is optional, but requires that the Monitor interval in seconds field has a value. It
   requires an integer value greater than 0 and less than or equal to 999999.
You can choose from the following options to display RPV client maximum pending statistics:
Specific RPVs to display (Leave blank for all.)
You can leave this field blank to display statistics for all RPV clients.
Pressing PF4 provides a list from which you can use PF7 to select one
or more RPV clients. You can also manually enter the names of one or
more RPV clients in this field.
Display statistics for individual mirroring networks
You can alternately select no or yes.
Include date and time in display?
You can alternately select no or yes.
After the fields are completed, press Enter to run the rpvstat -m command to display high
water mark statistical information for pending statistics.
Select Reset RPV Client statistics on the previous Remote Physical Volume Client Statistics
panel to display the panel (fastpath: rpvstat_reset_dialog) (Figure 8-56).
Reset Remote Physical Volume Client Statistics
Type or select values in entry fields.
Press Enter AFTER making all desired changes. [Entry Fields]
Specific RPV(s) to reset (leave blank for all)
[]
Figure 8-56 Reset Remote Physical Volume Client Statistics panel
Run the rpvstat -R command to reset the statistical counters in the indicated RPV clients.
Select Display Asynchronous I/O Progress Statistics, and the Display Asynchronous I/O
Progress Statistics panel (Figure 8-58) opens.
Display Asynchronous I/O Progress Statistics

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Specific RPV(s) to display (leave blank for all)      []              +
  Include date and time in display?                     no              +
  Monitor interval in seconds                           []              #
  Number of times to monitor                            []              #
  Display applicable monitored statistics as delta      no              +
   values?
You can choose from the following options to display asynchronous I/O progress statistics:
Specific RPVs to display (Leave blank for all.)
You can leave this field empty to display information for all RPV clients.
Pressing PF4 provides a list from which you can use PF7 to select one
or more RPV clients for which to display information. Or, you can
manually enter the names of one or more RPV clients in the entry
field.
Include date and time in display?
You can alternately select no or yes.
Monitor interval in seconds
This field is optional and requires an integer value greater than zero
and less than or equal to 3600.
Number of times to monitor
This field is optional, but requires that the Monitor interval in seconds
field has a value. The Number of times to monitor field requires an
integer value greater than 0 and less than or equal to 999999.
Display applicable monitored statistics as delta values?
You can alternately select no or yes. This field applies only if the
Monitor interval in seconds field has a value.
Selecting Display Asynchronous I/O Cache Statistics brings up the panel in Figure 8-59.
Display Asynchronous I/O Cache Statistics

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Specific RPV(s) to display (leave blank for all)      []              +
  Include date and time in display?                     no              +
  Monitor interval in seconds                           []              #
  Number of times to monitor                            []              #
  Display applicable monitored statistics as delta      no              +
   values?

Figure 8-59 Display Asynchronous I/O Cache Statistics panel
You can choose from the following options to display asynchronous I/O cache statistics:
Include date and time in display?
You can alternately select no or yes.
Monitor interval in seconds
This field is optional and requires an integer value greater than zero
and less than or equal to 3600.
Number of times to monitor
This field is optional, but requires that the Monitor interval in seconds
field has a value. The Number of times to monitor field requires an
integer value greater than 0 and less than or equal to 999999.
Display applicable monitored statistics as delta values?
You can alternately select no or yes. This field applies only if the
Monitor interval in seconds field has a value.
Selecting Reset Asynchronous IO Statistics runs the rpvstat -A -R and rpvstat -C -R
commands to reset all statistics.
                                                       [Entry Fields]
  Specific GMVG(s) to display (leave blank for all)     []              +
  Include associated RPV Client statistics?             no              +
  Include header with display?                          no              +
  Monitor interval in seconds                           []              #
  Number of times to monitor                            []              #

The following options are available to display the status of a geographically mirrored volume
group:
Specific GMVG(s) to display (leave blank for all)
   You can leave this field empty to display information for all valid, available, online GMVGs.
   Pressing PF4 provides a list from which you can use PF7 to select one or more GMVGs for
   which to display information. You can also manually enter the names of one or more
   GMVGs in this field.
Include associated RPV Client statistics?
You can alternately select no or yes.
Include header with display?
You can alternately select no or yes.
Monitor interval in seconds
This field is optional and requires an integer value greater than zero
and less than or equal to 3600.
Number of times to monitor
This field is optional, but requires a value in the Monitor interval in
seconds field. The Number of times to monitor field requires an integer
value greater than 0 and less than or equal to 999999.
After the fields are complete, run the gmvgstat command to display statistical information for
all of the indicated GMVGs.
GLVM_A2_RG     ONLINE            GLVM_A2
               OFFLINE           GLVM_A1
GLVM_B1_RG     ONLINE            GLVM_B1
               OFFLINE           GLVM_B2
GLVM_B2_RG     ONLINE            GLVM_B2
               OFFLINE           GLVM_B1
YuriRG         ONLINE            GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
               ONLINE SECONDARY  GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
YunaRG         ONLINE            GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
               ONLINE SECONDARY  GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
Run the smitty hacmp command. Then, select System Management (C-SPOC) →
Resource Groups and Applications → Move a Resource Group to Another Node /
Site → Move Resource Group(s) to Another Site (Example 8-30).
Example 8-30 Move YuriRG to Bsite
                                                       [Entry Fields]
  Resource Group(s) to be Moved                         YuriRG
  Destination Site                                      Bsite

(Listing residue: the clRGinfo output after the move shows the per-node resource groups unchanged, YuriRG now ONLINE at Bsite and ONLINE SECONDARY at Asite, and YunaRG still ONLINE at Bsite.)
(Figure residue: the GLVM_cluster hardware layout with the GLVM_A1 and GLVM_A2 nodes on Power 550 servers at Asite, the GLVM_B1 and GLVM_B2 nodes on Power 570 servers at Bsite, SAN and SVC fabrics, DS4000 and DS8000 storage, and the xd_data and xd_ip networks across the TCP/IP WAN.)
Example 8-32 shows the current state of the cluster resource groups. We run the halt
command in each node, GLVM_A1 and GLVM_A2.
Example 8-32 Initial resource group status in the GLVM cluster
GLVM_A2_RG     ONLINE            GLVM_A2
               OFFLINE           GLVM_A1
GLVM_B1_RG     ONLINE            GLVM_B1
               OFFLINE           GLVM_B2
GLVM_B2_RG     ONLINE            GLVM_B2
               OFFLINE           GLVM_B1
YuriRG         ONLINE            GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
               ONLINE SECONDARY  GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
YunaRG         ONLINE            GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
               ONLINE SECONDARY  GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
When the cluster manager detects a site failure, it performs the following procedure:
1. Detects that the primary site is down.
2. Releases the secondary online resource group at the secondary site.
3. Acquires the resource group in the online primary state at the secondary site.
After a site failure, YuriRG is in the ONLINE state on the GLVM_B1 node. GLVM_A1_RG and
GLVM_A2_RG are in the OFFLINE state because these resource groups use the ignore
inter-site management policy, which means that they exist only at their own site and do not
take over at the remote site (Example 8-33).
Example 8-33 Resource group status after total site loss in Asite
GLVM_A2_RG     OFFLINE           GLVM_A2
               OFFLINE           GLVM_A1
GLVM_B1_RG     ONLINE            GLVM_B1
               OFFLINE           GLVM_B2
GLVM_B2_RG     ONLINE            GLVM_B2
               OFFLINE           GLVM_B1
YuriRG         OFFLINE           GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
               ONLINE            GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
YunaRG         ONLINE            GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
               OFFLINE           GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
(Figure residue: the same cluster layout, here illustrating the loss of all xd_data networks between the sites.)
When we lose all XD_data networks, the cluster manager detects a site isolation. You can see
the error messages in the errpt log, as shown in Example 8-34. The RPV disks are in the
missing state. However, because the XD_ip network is still alive, the resource groups and
applications remain ONLINE at both sites.
Example 8-34 errpt after losing the XD_data networks
(Listing residue: errpt entries report the RPV disks, for example hdisk6, in the missing state.)
root@GLVM_A1 > lsvg -l yurivg
yurivg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE     MOUNT POINT
lvyuri     jfs2      10    20    2     open/stale   /yuri
loglv00    jfs2log   1     2     2     open/stale   N/A
....
root@GLVM_A1 / > clRGinfo
-----------------------------------------------------------------------------
Group Name     Group State       Node
-----------------------------------------------------------------------------
GLVM_A1_RG     ONLINE            GLVM_A1
               OFFLINE           GLVM_A2
GLVM_A2_RG     ONLINE            GLVM_A2
               OFFLINE           GLVM_A1
GLVM_B1_RG     ONLINE            GLVM_B1
               OFFLINE           GLVM_B2
GLVM_B2_RG     ONLINE            GLVM_B2
               OFFLINE           GLVM_B1
YuriRG         ONLINE            GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
               ONLINE SECONDARY  GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
YunaRG         ONLINE            GLVM_B1@Bsite
               OFFLINE           GLVM_B2@Bsite
               ONLINE SECONDARY  GLVM_A1@Asite
               OFFLINE           GLVM_A2@Asite
When the XD_data network is up again, PowerHA merges the sites automatically, brings the
RPV disks to the active state, and synchronizes the data.
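A quick way to confirm the recovery, assuming the geographically mirrored volume group is yurivg:

root@GLVM_A1 / > lsvg -p yurivg    # all local PVs and RPVs should return to the active state
root@GLVM_A1 / > lsvg -l yurivg    # the logical volumes should return to open/syncd after resynchronization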
(Figure residue: the same cluster layout, here illustrating the loss of the local storage at Asite.)
After the local storage loss at Asite (Example 8-35), the local PVs are in the missing state and
the logical volumes are now stale. At the Bsite, the RPVs are in the missing state, and the
logical volumes are in the stale state.
Example 8-35 VG status after local storage loss in Asite
(Listing residue: on GLVM_A1, the lsvg -p and lsvg -l output for yurivg shows the two local physical volumes in the missing state and the lvyuri and loglv00 logical volumes open/stale.)
root@GLVM_B1 / > lsvg -p yunavg
yunavg:
PV_NAME   PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk1    active     1275        1265       255..245..255..255..255
hdisk2    active     1275        1274       255..254..255..255..255
hdisk7    missing    1275        1265       255..245..255..255..255
hdisk8    missing    1275        1274       255..254..255..255..255
root@GLVM_B1 / > lsvg -l yunavg
yunavg:
LV NAME    TYPE      LPs   PPs   PVs   LV STATE     MOUNT POINT
lvyuna     jfs2      10    20    2     open/stale   /yuna
loglv01    jfs2log   1     2     2     open/stale   N/A
When the local storage is brought back online, run the varyonvg command to return the
local physical volumes to the active state, and run the syncvg command to synchronize the
data (Example 8-36).
Example 8-36 Recovering from the local storage loss
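A minimal sketch of the recovery, assuming the affected volume group is yurivg on node GLVM_A1:

root@GLVM_A1 / > varyonvg yurivg    # return the recovered local physical volumes to the active state
root@GLVM_A1 / > syncvg -v yurivg   # synchronize the stale partitions
root@GLVM_A1 / > lsvg -l yurivg     # the logical volumes should show open/syncd when synchronization completes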
Example 8-38 Configuring RPV clients and their servers for yurivg
5. Configure the volume group to use super strict mirror pools. Asynchronous mirroring
requires the volume group to be configured to use super strict mirror pools. You can
configure super strict mirror pools by using the chvg -M s command:
root@GLVM_A1 / > chvg -M s yurivg
6. Configure a mirror pool of disks at each site. Asynchronous mirroring also requires the
local and remote disks to belong to separate mirror pools. Set a mirror pool for the local
disks and remote disks:
root@GLVM_A1 / > lsvg -p yurivg
yurivg:
PV_NAME   PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
hdisk1    active     1275        1265       255..245..255..255..255
hdisk2    active     1275        1274       255..254..255..255..255
hdisk5    active     1275        1265       255..245..255..255..255
hdisk6    active     1275        1274       255..254..255..255..255
root@GLVM_A1 / > chpv -p Asite hdisk1 hdisk2
root@GLVM_A1 / > chpv -p Bsite hdisk5 hdisk6
7. Configure the logical volumes to belong to the new mirror pool with bad block relocation
turned off. A second mirror copy of the logical volume is on the Bsite mirror pool:
root@GLVM_A1 / > chlv -m copy1=Asite -m copy2=Bsite -b n lvyuri
root@GLVM_A1 / > chlv -m copy1=Asite -m copy2=Bsite -b n loglv00
8. Create a logical volume of type aio_cache for each mirror pool. Asynchronous mirroring
requires a logical volume of type aio_cache to serve as the cache device.
root@GLVM_A1 /> mklv -y datacachelv1 -t aio_cache -p copy1=Asite -s s -u 1024
-b n yurivg 10
datacachelv1
root@GLVM_A1 /> mklv -y datacachelv2 -t aio_cache -p copy1=Bsite -s s -u 1024
-b n yurivg 10
datacachelv2
The datacachelv1 that is in the Asite mirror pool is used for cache during asynchronous
mirroring to the disks in the Bsite mirror pool. The datacachelv2 that is in the Bsite mirror
pool is used for cache during asynchronous mirroring to the disks in the Asite mirror pool.
9. Convert to asynchronous mirroring for mirror pools:
root@GLVM_A1 / > chmp -A -m Asite yurivg
root@GLVM_A1 / > chmp -A -m Bsite yurivg
In this example, the Asite and Bsite mirror pools are configured to use asynchronous
mirroring. The chmp command automatically selects the datacachelv1 logical volume as the
cache device for Bsite because it is in the opposite site's mirror pool. You can verify the
asynchronous mirroring configuration with the lsmp command (Example 8-39).
Example 8-39 Asynchronous mirroring configuration in yurivg
root@GLVM_A1 / > lsmp -A yurivg
VOLUME GROUP:       yurivg
MIRROR POOL:        Asite            Mirroring Mode:      ASYNC
ASYNC MIRROR STATE: inactive         ASYNC CACHE LV:      datacachelv2
ASYNC CACHE VALID:  yes              ASYNC CACHE EMPTY:   yes
ASYNC CACHE HWM:    100              ASYNC DATA DIVERGED: no

MIRROR POOL:        Bsite            Mirroring Mode:      ASYNC
ASYNC MIRROR STATE: active           ASYNC CACHE LV:      datacachelv1
ASYNC CACHE VALID:  yes              ASYNC CACHE EMPTY:   no
ASYNC CACHE HWM:    100              ASYNC DATA DIVERGED: no
10. For all other nodes, export the existing yurivg definition and import it again
(Example 8-40). All RPV servers and clients must be in the Available state.
Example 8-40 Importing the asynchronous volume group on the other nodes
(Listing residue: yurivg is exported and imported again on the other cluster nodes; the lsvg -l output on GLVM_B1 then shows the following logical volumes.)
root@GLVM_B1 / > lsvg -l yurivg
yurivg:
LV NAME        TYPE       LPs  PPs  PVs  LV STATE      MOUNT POINT
lvyuri         jfs2       10   20   2    closed/syncd  /yuri
loglv00        jfs2log    1    2    2    closed/syncd  N/A
datacachelv1   aio_cache  10   10   1    closed/syncd  N/A
datacachelv2   aio_cache  10   10   1    open/syncd    N/A
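A minimal sketch of the re-import on one of the other nodes, assuming the volume group is reachable there through hdisk5 and keeps major number 100 (both values are illustrative):

root@GLVM_B1 / > exportvg yurivg                   # remove the old volume group definition from the ODM
root@GLVM_B1 / > importvg -V 100 -y yurivg hdisk5  # re-import so that the mirror pool and aio_cache definitions are learned
root@GLVM_B1 / > varyoffvg yurivg                  # leave the volume group varied off for the cluster to manage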
11. Change the attributes of the resource group to protect the asynchronous geographically
mirrored volume group from data divergence. If this value is not set, you must manually
select a site for recovery after data divergence occurs (Figure 8-64).

Change/Show All Resources and Attributes for a Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                       [Entry Fields]
  Resource Group Name                                   YuriRG
  Inter-site Management Policy                          Prefer Primary Site
  Participating Nodes from Primary Site                 GLVM_A1 GLVM_A2
  Participating Nodes from Secondary Site               GLVM_B1 GLVM_B2

  Startup Policy                                        Online On Home Node Only
  Failover Policy                                       Failover To Next Priority Node In The List
  Failback Policy                                       Failback To Higher Priority Node In The List
  Failback Timer Policy (empty is immediate)            []              +

  Service IP Labels/Addresses                           []              +
  Application Servers                                   [YuriApp]       +
  Volume Groups                                         [yurivg]        +
  Use forced varyon of volume groups, if necessary      true            +
  Automatically Import Volume Groups                    false           +
  Allow varyon with missing data updates?               true            +
   (Asynchronous GLVM Mirroring Only)
  Default choice for data divergence recovery           Bsite           +
   (Asynchronous GLVM Mirroring Only)

Data divergence: The Allow varyon with missing data updates field determines whether
HACMP event processing automatically allows data divergence to occur during a
non-graceful site failover to the remote site. By default, it is set to enable a fully automatic
site failover. The Default choice for data divergence recovery entry field provides an
opportunity to configure your most likely decision for data divergence recovery in advance.
For more information, see 8.12, Data divergence in PowerHA for GLVM on page 429.
12.Run verification and synchronization and bring the resource group ONLINE at the primary
site and ONLINE SECONDARY at the secondary site (Example 8-41).
Example 8-41 Bringing the resource group YuriRG online
Procedure 1
To add disks at the primary site (where the volume group is online and the applications are
running):
1. On the node that has the volume group varied on, identify a new physical disk to add to
the volume group. This disk must be accessible by all the nodes at that site. Check that all
disks have PVIDs assigned; if not, assign them by using chdev -a pv=yes -l (for example,
chdev -a pv=yes -l hdisk3, where hdisk3 is the new disk that was added to the system).
In Example 8-42, we add a local physical volume hdisk2 (00c1f1702fab9da9) on Bsite to
yunavg and configure a remote physical volume from hdisk4 (00c1f1702fab9da9) on Asite
and add it to the volume group for geographic mirroring.
Example 8-42 Check physical disks to add into yunavg in Bsite
rootvg
yunavg
None
yurivg
yurivg
yunavg
active
active
rootvg
yunavg
None
yurivg
yurivg
yunavg
active
active
2. On the node that has the volume group varied on (holding the primary instance of the
resource group), create an RPV server instance for this disk (Figure 8-65).
Add Remote Physical Volume Servers
Type or select values in entry fields.
Press Enter AFTER making all desired changes. [Entry Fields]
Physical Volume Identifiers
00c1f1702fab9da9
* Remote Physical Volume Client Internet Address
[10.10.101.107,10.10.102.108] +
Configure Automatically at System Restart?
[no] +
Start New Devices Immediately?
[yes] +
Figure 8-65 Create RPV server for hdisk2 00c1f1702fab9da9 in GLVM_B1 node
3. Create an RPV server instance on all remaining nodes of this site (if any) by using the disk
that is identified in Figure 8-66.
Add Remote Physical Volume Servers
Type or select values in entry fields.
Press Enter AFTER making all desired changes. [Entry Fields]
Physical Volume Identifiers
00c1f1702fab9da9
* Remote Physical Volume Client Internet Address
[10.10.101.108,10.10.102.108] +
Configure Automatically at System Restart?
[no] +
Start New Devices Immediately?
[yes] +
Figure 8-66 Creating an RPV server for hdisk2 00c1f1702fab9da9in GLVM_B2 node
4. On the remote site node (holding the secondary instance of the RG), create an RPV client
for this disk (Figure 8-67).
Add Remote Physical Volume Clients
Type or select values in entry fields.
Press Enter AFTER making all desired changes. [Entry Fields]
Remote Physical Volume Server Internet Address
10.10.201.107
Remote Physical Volume Local Internet Address
10.10.101.107
Physical Volume Identifiers 00c1f1702fab9da90000000000000000
I/O Timeout Interval (Seconds)
[180] #
Start New Devices Immediately?
[yes] +
root@GLVM_A1 / > chdev -l hdisk8 -a local_addr2=10.10.102.107 -a
server_addr2=10.10.202.107
hdisk8 changed
Figure 8-67 Create RPV client in GLVM_A1 node
5. Similarly, create an RPV client instance on all remaining nodes on the remote site (if any)
(Figure 8-68).
Add Remote Physical Volume Clients

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  Remote Physical Volume Server Internet Address         10.10.201.108
  Remote Physical Volume Local Internet Address          10.10.101.108
  Physical Volume Identifiers                            00c1f1702fab9da90000000000000000
  I/O Timeout Interval (Seconds)                        [180]  #
  Start New Devices Immediately?                        [yes]  +

root@GLVM_A2 / > chdev -l hdisk8 -a local_addr2=10.10.102.108 -a server_addr2=10.10.202.108
hdisk8 changed
Figure 8-68 Create RPV client in GLVM_A2
6. On the node with the volume group varied on, run the extendvg command from the
command line to include the local physical volume in the volume group, or use the AIX
SMIT menu for volume groups to include the local physical disk (Example 8-43).
Example 8-43 Extend volume group in GLVM_B1
(Output after the extendvg: lspv on GLVM_B1 now lists the new disk as a member of yunavg,
alongside the existing rootvg, yurivg, and yunavg disks, all active.)
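The extendvg invocation that Example 8-43 illustrates is not fully captured; a minimal sketch, assuming the local disk is hdisk2 on GLVM_B1:

# extendvg yunavg hdisk2
# lsvg -p yunavg          (verify that the new disk is now a member of the volume group)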
Procedure 2
To add disks to the backup site (where the volume group is offline and data is being mirrored
remotely):
1. On one of the remote site nodes (holding the secondary instance of the resource group),
identify a new physical disk to be added to the volume group. This disk must be accessible
by all the nodes at that site. For example, on the remote node, GLVM_A1, a new physical
disk, hdisk4, has PVID 000fe4012f9a9fcf, and the same disk is visible in GLVM_A2 as well
(Example 8-44).
Example 8-44 Check physical disks to add into yunavg in Asite
(lspv output on GLVM_A1 and GLVM_A2: on both Asite nodes, the rootvg, yurivg, and yunavg
disks are listed, and the new disk shows a volume group of None.)
2. On the remote site node that holds the secondary instance of the resource group, create an
RPV server instance for this disk, as in Procedure 1.
3. Similarly, create an RPV server instance on all other nodes of the remote site
(Figure 8-70).
Add Remote Physical Volume Servers

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  Physical Volume Identifiers                            000fe4012f9a9fcf
* Remote Physical Volume Client Internet Address        [10.10.201.108,10.10.202.108]  +
  Configure Automatically at System Restart?            [no]   +
  Start New Devices Immediately?                        [yes]  +
Figure 8-70 Create RPV server for hdisk4 000fe4012f9a9fcf in GLVM_A2
4. On the node that has the volume group varied on (primary instance of the resource
group), create an RPV client instance for this disk (Figure 8-71).
Add Remote Physical Volume Clients

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  Remote Physical Volume Server Internet Address         10.10.101.107
  Remote Physical Volume Local Internet Address          10.10.201.107
  Physical Volume Identifiers                            000fe4012f9a9fcf0000000000000000
  I/O Timeout Interval (Seconds)                        [180]  #
  Start New Devices Immediately?                        [yes]  +

root@GLVM_B1 / > chdev -l hdisk8 -a local_addr2=10.10.202.107 -a server_addr2=10.10.102.107
hdisk8 changed
Figure 8-71 Creating an RPV client in GLVM_B1 node
5. Similarly, create an RPV client on all other nodes (if any) on this site (Figure 8-72).
Add Remote Physical Volume Clients

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  Remote Physical Volume Server Internet Address         10.10.101.108
  Remote Physical Volume Local Internet Address          10.10.201.108
  Physical Volume Identifiers                            000fe4012f9a9fcf0000000000000000
  I/O Timeout Interval (Seconds)                        [180]  #
  Start New Devices Immediately?                        [yes]  +

root@GLVM_B2 / > chdev -l hdisk8 -a local_addr2=10.10.202.108 -a server_addr2=10.10.102.108
hdisk8 changed
Figure 8-72 Create RPV client in GLVM_B2
6. On the node that has the volume group varied on (holding the primary instance of the RG),
extend the volume group to add the remote physical volume or disk to the volume group.
Either use the extendvg command from the command line or use GLVM SMIT by running
the smitty glvm_utils command and selecting Geographic Logical Volume Groups →
Add Remote Physical Volumes to a Volume Group (Figure 8-73).
Add Remote Physical Volumes to a Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* VOLUME GROUP name                                      yunavg
  Force                                                 [no]  +
* REMOTE PHYSICAL VOLUMES Name                           hdisk8
Figure 8-73 Extend volume group in GLVM_B1
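The same result can be achieved from the command line; a sketch, assuming hdisk8 is the RPV client that represents the remote disk, as shown in Figure 8-73:

# extendvg yunavg hdisk8
# lsvg -p yunavg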
physical volume information in GLVM_A1 node after synchronization
root@GLVM_A1 / > lspv
hdisk0   000fe4111f25a1d1   rootvg
hdisk1   000fe4112f99817c   yurivg
hdisk2   000fe4112f998235   yurivg
hdisk3   000fe4012f9a9f43   yunavg
hdisk4   000fe4012f9a9fcf   yunavg
hdisk5   00c0f6a02fae3172   yurivg
hdisk6   00c0f6a02fae31dd   yurivg
hdisk7   00c1f1702fab9d25   yunavg
hdisk8   00c1f1702fab9da9   yunavg
(each yunavg disk: 1275 total PPs, 1265 free, distribution 255..245..255..255..255)
(Device listings on GLVM_B1 and GLVM_A1 show the RPV clients and RPV servers in the
Available state.)
In the node that has the resource group YuriRG in the ONLINE state, all RPV clients are
already in the available state. For example, GLVM_B1 has yurivg activated, and its RPV
clients are active, which enables mirroring to the remote copy.
In the node that has YuriRG in the ONLINE SECONDARY state, all RPV servers are already
in the available state. For example, GLVM_A1 has all RPV servers in the available state.
2. Remove a remote physical volume by using the SMIT GLVM utilities (Figure 8-74).
Remove Remote Physical Volumes From a Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* VOLUME GROUP name                                      yunavg
  FORCE deallocation of all partitions on this          [no]  +
  physical volume?
* Select One or more Remote Physical Volumes             hdisk8
Figure 8-74 Remove remote physical volume in GLVM_B1
physical volume information in GLVM_A1 node after synchronization
root@GLVM_A1 / > lspv
hdisk0   000fe4111f25a1d1   rootvg
hdisk1   000fe4112f99817c   yurivg
hdisk2   000fe4112f998235   yurivg
hdisk3   000fe4012f9a9f43   yunavg
hdisk4   000fe4012f9a9fcf   yunavg
hdisk5   00c0f6a02fae3172   yurivg
hdisk6   00c0f6a02fae31dd   yurivg
hdisk7   00c1f1702fab9d25   yunavg
hdisk8   00c1f1702fab9da9   None
A white paper explains the steps to migrate from HAGEO to GLVM. It addresses both a
synchronous and an asynchronous HAGEO environment that is migrated to a synchronous or
asynchronous GLVM environment. To download and read this paper, go to:
http://www.ibm.com/systems/power/software/availability/whitepapers/glvm.html
End of service: HAGEO has reached end of service. For more information, go to:
http://www.ibm.com/software/support/systemsp/lifecycle/
The objective of the paper states:
The white paper provides an overview of the HAGEO and GLVM technologies and
compares their technical workings and their external interfaces. The requirements for both
HAGEO and GLVM are reviewed. The different HAGEO mirroring configurations (sync,
mwc, and async) are briefly reviewed and mapped to the mirroring configurations available
in GLVM (sync and async). Migration prerequisites are presented along with the steps
required before actually starting the migration process. The actual migration steps for two
sample configurations (one with synchronous mirroring and another with asynchronous
mirroring) are presented in detail.
The paper makes the following statements under the topic migration prerequisites:
Migration from an HAGEO-based cluster to a GLVM-based cluster requires, for each
PowerHA resource group, the removal of HAGEO and its artifacts followed by the
configuration of GLVM. The application data can be retained at either site before
introducing GLVM and remirroring the data to the other site. However, retain the data at
the primary site (the production site where the application runs and is available most of the
time) and remove it from the backup (recovery) site. PowerHA SystemMirror Enterprise
Edition does not allow a resource group to simultaneously contain both HAGEO GMDs
and GLVM GMVGs.
This means that the migration requires removing the HAGEO subsystem and then adding the
new GLVM rpvservers and rpvclients. This method has the following drawbacks:
When removing HAGEO, there is no longer a disaster recovery site in place.
Adding GLVM requires a full synchronization of all the data across the XD_data network.
If something goes wrong with the migration to GLVM, the back-out plan is to set up the
HAGEO environment again, requiring full synchronization of the GMDS across the
XD_data network.
An alternate solution is to set up GLVM while HAGEO is still in place. This method has its
considerations as well:
Since HAGEO is not supported on AIX 6.1, any asynchronous GMDS must first be
migrated to synchronous GLVM.
To have both GMDS and an rpvserver on the remote node, a second set of disks is
required to contain the rpvservers. After the rpvservers are set up, the GMDS can be
removed and the GMD disk space can be reused.
3. Synchronize the logical volumes one at a time to minimize extra load on the
XD_data network.
4. Complete the migration by removing the GMDS from the resource group, creating the
rpvservers on the local site, and creating the rpvclients on the remote site.
5. Test the failover of the new GLVM environment.
6. Remove the GMD statemap logical volumes.
7. Follow PowerHA documentation to migrate to AIX 6.1 and asynchronous GLVM if required.
Tip: If a failure occurs and the resource group moves to the remote site, the GMDs and file
systems work as before. When it is time to move the resource group back to the primary
site, make sure that the rpvclient disks are available by using the mkdev -l <rpvdisk>
command. If you do not run this command, you see errors in errpt about stale partitions
(LVM_SA_STALEPP).
You can still proceed if the rpvdisks are missing by issuing mkdev -l <rpvdisk> and then
varyonvg -n <vg>. This varies on the volume group, but it does not synchronize the stale
logical volumes. At a convenient time, continue the synchronization of the stale logical
volumes with the syncvg -l command, and monitor the stale partitions with the
lspv -p <rpvdisk> command.
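The recovery sequence that the tip describes can be summarized as follows; the disk, volume group, and logical volume names are illustrative only:

# mkdev -l hdisk8          (make the rpvclient disk available again)
# varyonvg -n geovg        (vary on without synchronizing the stale partitions)
# syncvg -l datlv_data1    (later, synchronize one stale logical volume at a time)
# lspv -p hdisk8           (monitor for remaining stale partitions)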
3. Define a remote physical volume server site name on one node on the remote site.
Run the smitty rpvserver command and select Remote Physical Volume Server
Site Name Configuration → Define / Change / Show Remote Physical Volume
Server Site Name.
4. Create an rpvserver on one node on the remote site. Run the smitty rpvserver
command, select Remote Physical Volume Server Site Name Configuration → Add
Remote Physical Volume Servers, and select an unused disk that is not associated with
any volume groups. This disk must be comparable to the disks used for GMD.
5. Create an rpvclient on a node on the local site. Run the smitty rpvclient
command, select Add Remote Physical Volume Clients, and enter the XD_data IP
address of the rpvserver.
6. Add the RPV to the local volume group. Run the smitty glvm_utils command and select
Geographically Mirrored Volume Groups → Add Remote Physical Volumes to a
Volume Group.
7. Make sure that the logical volumes to be mirrored have the superstrict flag set
(Example 8-49). This must also be done for the statemap logical volumes to get past
error checking later when doing an HACMP verification and synchronization.
Example 8-49 Setting superstrict for logical volumes
> lslv datsm_data1
LOGICAL VOLUME:     datsm_data1            VOLUME GROUP:   geovg
LV IDENTIFIER:      0000858400004c0000000126e2b26aaf.3 PERMISSION: read/write
VG STATE:           active/complete        LV STATE:       closed/syncd
TYPE:               statemap               WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        16 megabyte(s)
COPIES:             2                      SCHED POLICY:   parallel
LPs:                1                      PPs:            2
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       middle                 UPPER BOUND:    2
MOUNT POINT:        N/A                    LABEL:          None
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes (superstrict)
Serialize IO ?:     NO
If superstrict is not on or the Upper Bound is not at least 2, change it by entering the
following command:
chlv -s s -u 2 datsm_data1
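One way to confirm the change, assuming the lslv field labels shown in Example 8-49:

# lslv datsm_data1 | grep -E 'UPPER BOUND|SEPARATE PV'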
8. Add a remote site mirror copy to the logical volume. HACMP requires all logical volumes
in a volume group that is part of GLVM to be mirrored, which includes the statemap
logical volumes. The statemap logical volumes are removed later in the process. Run the
smitty glvm_utils command, select Geographically Mirrored Logical Volumes →
Add a Remote Site Mirror Copy to a Logical Volume, and then select the logical volume
to be mirrored. Leave the SYNCHRONIZE option set to no so that you can run the
synchronization manually at a time when it does not affect the XD_data network.
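If you prefer the command line, the operation is essentially an mklvcopy of the logical volume onto the RPV client disk; a hypothetical sketch (the disk name is an assumption, and the SMIT path remains the documented method):

# mklvcopy datsm_data1 2 hdisk10      (add the second copy on the RPV client disk; it stays stale)
# lslv datsm_data1 | grep 'STALE PPs'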
The lspv -p <rpvdisk> command shows where the stale partitions are (Example 8-50).
Example 8-50 Checking for the stale partitions
> lspv -p hdisk9
hdisk9:
PP RANGE  STATE   REGION        LV NAME        TYPE      MOUNT POINT
  1-109   free    outer edge
110-129   used    outer middle  datlv_data1    jfs       N/A
130-130   used    outer middle  loglv_data1    jfslog    N/A
131-150   used    outer middle  datlv_data2    jfs       N/A
151-151   stale   outer middle  loglv_data2    jfslog    N/A
152-152   stale   outer middle  loglv00        jfs2log   N/A
153-154   stale   outer middle  fslv00         jfs2      N/A
155-217   free    outer middle
218-325   free    center
326-433   free    inner middle
434-542   free    inner edge
9. Synchronize the logical volumes one at a time to minimize the data replication load on the
XD_data network. Consider doing this step at off-peak times, depending on your network
bandwidth. By using the syncvg -l <logical volume> command, you can synchronize
just one logical volume copy (Example 8-51).
Example 8-51 Sync one logical volume to the rpvserver
hdisk9:
PP RANGE  STATE   REGION        LV NAME        TYPE      MOUNT POINT
  1-109   free    outer edge
110-129   used    outer middle  datlv_data1    jfs       N/A
130-130   used    outer middle  loglv_data1    jfslog    N/A
131-150   used    outer middle  datlv_data2    jfs       N/A
151-151   used    outer middle  loglv_data2    jfslog    N/A
152-152   stale   outer middle  loglv00        jfs2log   N/A
153-154   stale   outer middle  fslv00         jfs2      N/A
155-217   free    outer middle
218-325   free    center
326-433   free    inner middle
434-542   free    inner edge
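The syncvg commands themselves are not shown in the captured listings; a minimal sketch, assuming loglv_data2 is the copy being brought up to date (as the before and after output suggests):

# syncvg -l loglv_data2     (synchronize one logical volume copy at a time)
# lspv -p hdisk9            (re-check which partitions remain stale)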
10.Stop the cluster services on all nodes when all the logical volumes are synchronized on
the rpvclient.
11.Change /etc/filesystems to use the logical volumes instead of GMDS on all nodes in the
cluster (Example 8-52).
Example 8-52 The /etc/filesystem using gmds and logical volumes
(The /etc/filesystems stanzas are changed so that the dev and log attributes reference the
jfs and jfslog logical volumes directly instead of the GMD devices.)
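The captured stanza content in Example 8-52 is incomplete. The following is a hypothetical stanza, using the logical volume names from the earlier examples and an assumed mount point, to illustrate the shape of the change:

/data1:
        dev             = /dev/datlv_data1
        vfs             = jfs
        log             = /dev/loglv_data1
        mount           = false
        check           = false
        account         = false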
> lspv
hdisk0    none                None
hdisk1    0000849208566aca    rootvg    active
hdisk2    000105534fb8c100    None
hdisk3    00011774996d81bb    geovg     active
hdisk4    00011774996d9259    None
hdisk5    00011774996da2f0    None
hdisk6    00011774996db38c    None
hdisk7    00011774996dd4bf    None
hdisk8    00011774996de51f    None
hdisk9    00011774996df55d    None
hdisk10   00011774996d0c2f    None
> lsrpvserver
rpvserver0    00011774996d9259    hdisk4
> lsrpvclient
hdisk10       00011774996d0c2f    siteA
16.On the remote node, vary off (varyoffvg) the volume groups that are currently mirrored
with GMDs.
17.Export (exportvg) the GMD-mirrored volume group (Example 8-55).
Example 8-55 geovg using gmd is exported
(After the exportvg, lspv on the remote node shows the former geovg disks with a volume
group of None; only rootvg remains active.)
18.On all nodes, import the volume group by using the rpvserver and rpvclient disks
(Example 8-56).
Example 8-56 The importvg and the vg varied on using the new disks
> lspv
hdisk0    none                None
hdisk1    0000849208566aca    rootvg    active
hdisk2    000105534fb8c100    None
hdisk3    00011774996d81bb    None
hdisk4    00011774996d9259    geovg     active
hdisk5    00011774996da2f0    None
hdisk6    00011774996db38c    None
hdisk7    00011774996dd4bf    None
hdisk8    00011774996de51f    None
hdisk9    00011774996df55d    None
hdisk10   00011774996d0c2f    geovg     active
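Step 18 can be expressed as a short command sequence; a sketch, assuming geovg and the disk names shown in Example 8-56:

# importvg -y geovg hdisk4      (import by using one of the disks that back the new rpvserver and rpvclient pair)
# varyonvg geovg
# lspv | grep geovg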
If one or more XD_data networks are available, PowerHA brings the resource group online
and the geographically mirrored volume group is automatically synchronized from the copy on
the remote site.
Note: If the forced varyon option is not set for the volume group and quorum is enabled,
PowerHA for GLVM moves the resource group into an ERROR state. It does not attempt to
activate it on a node where access to a single copy of the data exists. The cluster is
protected from data divergence, and data integrity is secure. However, data availability is
not always achievable automatically.
For more information, see the HACMP for AIX 6.1 Geographic LVM Planning and
Administration Guide, SA23-1338.
systems. The decision of which copy of the data to keep must be made based on the
circumstances, and only you can make this decision.
PowerHA for GLVM provides a resource group attribute for this purpose. The Default choice
for data divergence recovery field allows you to configure the name of the site whose version
of the data you are most likely to choose to preserve when recovering from data divergence.
Otherwise, this field can be left blank to indicate that you do not want to choose a default. For
example, if Asite is the production site and Bsite is the remote site, you might want to
configure Bsite as the default choice. If a production site failure occurs and you want to keep
the data at Bsite after the production site is recovered, you do not need to do anything to tell
HACMP to carry out your decision (Figure 8-75).
Change/Show All Resources and Attributes for a Resource Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  Resource Group Name                                    YuriRG
  Inter-site Management Policy                           Prefer Primary Site
  Participating Nodes from Primary Site                  GLVM_A1 GLVM_A2
  Participating Nodes from Secondary Site                GLVM_B1 GLVM_B2

  Startup Policy                                         Online On Home Node Only
  Failover Policy                                        Failover To Next Priority Node In The List
  Failback Policy                                        Failback To Higher Priority Node In The List
  Failback Timer Policy (empty is immediate)            []          +

  Service IP Labels/Addresses                           []          +
  Application Servers                                   [YuriApp]   +
  Volume Groups                                         [yurivg ]   +
  Use forced varyon of volume groups, if necessary       true       +
  Automatically Import Volume Groups                     false      +

  Allow varyon with missing data updates?                true       +
  (Asynchronous GLVM Mirroring Only)
  Default choice for data divergence recovery            Bsite      +
  (Asynchronous GLVM Mirroring Only)
Figure 8-75 Change/Show All Resources and Attributes for a Resource Group
The secondary instance of the resource group must be online before the primary instance is
brought online, if the secondary instance is the one that contains the data to be preserved.
First, start the cluster without allowing the joining nodes to acquire any resources. Select
Manually for the Manage Resource Groups field (Figure 8-76).
Start Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Start now, on system restart or both                   now              +
  Start Cluster Services on these nodes                 [GLVM_A1]         +
* Manage Resource Groups                                 Manually         +
  BROADCAST message at startup?                          true             +
  Startup Cluster Information Daemon?                    false            +
  Ignore verification errors?                            false            +
  Automatically correct errors found during              Interactively    +
  cluster start?
Figure 8-76 Start Cluster Services
Next, after the production site nodes join the cluster, you must manage the resource groups
manually. Most likely the primary instance of the resource group is already running at the
disaster recovery site. You have a few choices:
Move the primary instance of the resource group back to the production site. This step
performs what is known as a site failback to return the cluster to how it was before the site
failure, with the production site asynchronously mirroring to the disaster recovery site.
PowerHA automatically takes the primary resource group instance offline and then brings
the secondary instance online at the disaster recovery site before bringing the primary
instance online at the production site.
Keep the primary instance at the disaster recovery site and bring the secondary instance
of the resource group online at the production site. Then, asynchronous data mirroring
occurs in the opposite direction, from the disaster recovery site back to the production site.
If you want to switch back to the production site's version of the data while continuing to
run at the disaster recovery site, take the resource group offline at the disaster recovery
site. Then, you can start the secondary instance at the production site, followed by the
primary instance at the disaster recovery site (Figure 8-77).
Bring a Resource Group Online

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  Resource Group to Bring Online                         YunaRG
  Node on Which to Bring Resource Group Online           GLVM_A1
  Name of site containing data to be preserved          [Asite]  +
  by data divergence recovery processing
  (Asynchronous GLVM Mirroring only)
Figure 8-77 Overriding default data divergence recovery (Part 2 of 2)
You can either enter a site name in the Name of site containing data to be preserved by data
divergence recovery processing field or leave it blank. If you leave the field blank, this
operation fails.
Part 4. Maintenance, management, and disaster recovery
This part provides information about the management and maintenance of the cluster nodes.
This part includes the following chapters:
Chapter 9, Maintenance and management on page 435
Chapter 10, Disaster recovery with DS8700 Global Mirror on page 469
Chapter 11, Disaster recovery by using Hitachi TrueCopy and Universal Replicator on
page 515
Chapter 9. Maintenance and management
To perform a non-disruptive update, you must stop cluster services on the nodes with the
UNMANAGE option. This option leaves the applications and resources in the resource group
online. It is available from the clstop SMIT panels and replaces the Forced Stop option that
was available in older versions.
To stop cluster services without stopping your applications:
1.
2.
3.
4.
To avoid the loss of an accurate cluster status when performing updates, stop the cluster by
using the UNMANAGE option on one cluster node at a time. Figure 9-1 shows a sample of
the unmanage option.
Stop Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Stop now, on system restart or both                    now                     +
  Stop Cluster Services on these nodes                  [svcxd_a1]               +
  BROADCAST cluster shutdown?                            true                    +
* Select an Action on Resource Groups                    Bring Resource Groups>  +
  +--------------------------------------------------------------------------+
  |                  Select an Action on Resource Groups                      |
  |                                                                           |
  | Move cursor to desired item and press Enter.                              |
  |                                                                           |
  |   Bring Resource Groups Offline                                           |
  |   Move Resource Groups                                                    |
  |   Unmanage Resource Groups                                                |
  |                                                                           |
  | F1=Help        F2=Refresh      F3=Cancel                                  |
  | F8=Image       F10=Exit        Enter=Do                                   |
  | /=Find         n=Find Next                                                |
  +--------------------------------------------------------------------------+
Figure 9-1 RG UNMANAGE option for updating the cluster
Regardless of the type of resource group you have, if you stop cluster services on the node
on which this group is active and do not stop the application that belongs to the resource
group, PowerHA puts the group into an UNMANAGED state and keeps the application
running according to your request.
The resource group that contains the application remains in the UNMANAGED state (until
you instruct PowerHA to start managing it again) and the application continues to run. While
in this condition, PowerHA and the RSCT services continue to run, providing services to ECM
VGs that the application servers might be using.
You can instruct PowerHA to start managing the resources again either by restarting cluster
services on the node or by using SMIT to move the resource group to a node that is actively
managing its resource groups.
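A minimal sketch of verifying an unmanaged stop on one node; the application process name is a placeholder:

# smitty clstop                      (select Unmanage Resource Groups for the action)
# clRGinfo                           (the group now reports UNMANAGED on this node)
# ps -ef | grep <application>        (confirm that the application itself is still running)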
If you have instances of replicated resource groups that use the extended distance
capabilities of the PowerHA Enterprise Edition, the UNMANAGED SECONDARY state is
used for resource groups that were previously in the ONLINE SECONDARY state.
Attention: When you stop cluster services on a node and place resource groups in an
UNMANAGED state, the cluster stops managing the resources on that node. PowerHA
does not react to individual resource failures, application failures, or even if the
node crashes.
Similar to how AIX operating system levels must be the same across the cluster nodes, the
PowerHA levels must also be the same. Running in a mixed mode is acceptable for short
periods of time while performing cluster updates.
Command           Description
clRGinfo          Displays the resource group status and location
cltopinfo -m      Displays the cluster uptime and any missed heartbeats for any interfaces
cldump            Provides a snapshot of the cluster status
clshowres         Displays the resource group configuration
cllsnw            Lists the cluster networks
clshowsrv -v      Displays the status of the cluster subsystems
clstat            Cluster status
The information that is displayed by using the general commands does not show information
specific to the replicated resources. The naming schemes that are used in the environment
can help determine the type of replication that is being used. However, reviewing the
PowerHA Enterprise Edition packages that are installed and running commands specific to
each replication type displays the information specific to the replicated resources.
The following tables identify several of the new commands that get added to the cluster when
the replication packages for each solution are installed. Several of these commands are used
by the cluster processing and might not be supported as stand-alone commands, but they are
useful to quickly identify how the replicated resources are configured.
Table 9-2 displays several new commands that are appended by the SAN Volume Controller
cluster file sets. For detailed results, see Chapter 5, Configuring PowerHA SystemMirror
Enterprise Edition with Metro Mirror and Global Mirror on page 153.
Table 9-2 SVC replication cluster reference commands
Command                                               Description
/usr/es/sbin/cluster/svcpprc/cmds/cllssvc             Lists the SVC clusters that are defined to the cluster configuration
/usr/es/sbin/cluster/svcpprc/cmds/cllssvcpprc         Lists the SVC PPRC replicated resources
/usr/es/sbin/cluster/svcpprc/cmds/cllsrelationship    Lists the SVC PPRC relationships
Table 9-3 lists several new commands that are available to display the devices that are used
in the Metro Mirror replication. For detailed results, see Chapter 6, Configuring PowerHA
SystemMirror Enterprise Edition with ESS/DS Metro Mirror on page 237.
Table 9-3 DS replication cluster reference commands
Command                                               Description
/usr/es/sbin/cluster/pprc/spprc/cmds/cllscss          Lists the Copy Services Server (CSS) definitions
/usr/es/sbin/cluster/pprc/spprc/cmds/cllsspprc        Lists the DSCLI-managed PPRC replicated resources
/usr/es/sbin/cluster/pprc/spprc/cmds/cllsdss          Lists the disk subsystem (DSS) definitions
Table 9-4 shows the additional EMC cluster list command that displays the configuration for
the replicated resources. For detailed results, see Chapter 7, Configuring PowerHA
SystemMirror Enterprise Edition with SRDF replication on page 267.
Table 9-4 EMC SRDF replication cluster reference commands
Command                                               Description
/usr/es/sbin/cluster/sr/cmds/cllssr                   Lists the SRDF replicated resources that are defined to the cluster
Table 9-5 list GLVM-specific commands that display the replicated resources that are part of
the cluster configuration. For examples, see Chapter 8, Configuring PowerHA SystemMirror
Enterprise Edition with Geographic Logical Volume Manager on page 339.
Table 9-5 GLVM replication cluster reference commands
Command               Description
/usr/sbin/rpvstat     Provides the status of one or more local Remote Physical Volume (RPV) clients.
/usr/sbin/gmvgstat    Provides the status of one or more geographically mirrored volume groups (GMVGs).
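A quick way to review the GLVM replication state from either site is to run the two commands with no arguments; a sketch (the output format varies by level):

# /usr/sbin/rpvstat
# /usr/sbin/gmvgstat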
The <snapshot>.odm file that also gets generated is what the cluster uses to restore a
configuration or to update during a snapshot migration. It provides a collection of cluster
stanzas that contain all of the cluster definitions. Figure 9-3 shows partial output from a
sample from one of our test clusters.
# cat /usr/es/sbin/cluster/snapshots/Prod_SVC_Cluster_03_22_2010.odm
.....
HACMPsvc:
svccluster_name = "B8_8G4"
svccluster_role = "Master"
sitename = "svc_sitea"
cluster_ip = "10.12.5.55"
cluster_2nd_ip = ""
r_partner = "B12_4F2"
version = ""
reserved = ""
HACMPsvc:
svccluster_name = "B12_4F2"
svccluster_role = "Master"
sitename = "svc_siteb"
cluster_ip = "10.114.63.250"
cluster_2nd_ip = ""
r_partner = "B8_8G4"
version = ""
reserved = ""
HACMPsvcpprc:
svcpprc_consistencygrp = "svc_metro"
MasterCluster = "B8_8G4"
AuxiliaryCluster = "B12_4F2"
relationships = "svc_disk2 svc_disk3 svc_disk4 svc_disk5"
CopyType = "METRO"
RecoveryAction = "MANUAL"
HACMPsvcpprc:
svcpprc_consistencygrp = "svc_global"
.....
Figure 9-3 Sample cluster .odm configuration file
The cluster report file that is generated by importing a cluster definition file into the Online
Planning Worksheets on your desktop can also provide a useful cluster reference. This same
report is automatically generated when you use the WebSMIT PowerHA GUI management
console. Figure 9-4 shows a sample html report file from one of our four node test clusters.
Replicated resources: You can perform general cluster testing for clusters that include
replicated resources, but not testing specific to replicated resources or any of the
PowerHA Enterprise Edition products such as testing on storage replication relationships.
Site-specific tests
If sites are present in the cluster, the test tool runs the tests for them. The automated testing
sequence that the cluster test tool uses contains two site-specific tests:
auto_site
This sequence of tests runs if you have any cluster configuration with sites. For example,
this sequence is used for clusters with cross-site LVM mirroring configured that does not
use XD_data networks. This sequence includes the following tests:
SITE_DOWN_GRACEFUL
Stops the cluster services on all nodes in a site while taking
resources offline
SITE_UP
SITE_DOWN_TAKEOVER
Stops the cluster services on all nodes in a site and moves the
resources to nodes at another site
SITE_UP
RG_MOVE_SITE
auto_site_isolation
This sequence of tests runs only if you configured sites and an XD-type network. This
sequence includes the following tests:
SITE_ISOLATION
SITE_MERGE
The support for the cluster test tool is included in the base PowerHA product. The following
tests are available for the cluster test tool specific to PowerHA Enterprise Edition
environments:
RG_MOVE_SITE
SITE_ISOLATION
Brings down all XD_data networks in the cluster at which the tool is
running, causing a site isolation.
SITE_MERGE
SITE_UP
Starts cluster services on all nodes at the specified site that are
currently stopped.
SITE_DOWN_TAKEOVER
Stops cluster services on all nodes at the specified site and moves the
resources to nodes at another site by starting automatic rg_move
events.
SITE_DOWN_GRACEFUL
Stops cluster services on all nodes at the specified site and takes the
resources offline.
For more information about the cluster test tool, see the HACMP for AIX 6.1 Administration
Guide, SC23-4862, which you can find in the Cluster Products Information Center at:
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.
cluster.hacmp.doc/hacmpbooks.html
9.2.1 Checking the status of the RGs with the clRGinfo utility
Identifying the state of the resource groups in the cluster typically provides a good indication
of the state of the cluster resources. The status, however, does not indicate whether the
application is truly running, but rather it tells only where it is hosted.
The clRGinfo command shows you the state of the resource groups. You can use the
/usr/es/sbin/cluster/utilities/clRGinfo command to monitor the resource group status
and its location. This command also reports whether a node temporarily has the highest
priority for this instance.
Alternative: You can use the clfindres command instead of clRGinfo. The clfindres
command is a link to clRGinfo. Only the root user can run the clRGinfo utility.
There are various flags available for the clRGinfo command (Example 9-1).
Example 9-1 clRGinfo command options
# clRGinfo -?
clRGinfo: illegal option -- ?
Usage: clRGinfo [-h] [-v] [-a] [-s|-c] [-t] [-p] [group name(s)]
For more information about each option, see the HACMP for AIX 6.1 Administration Guide,
SC23-4862, in each corresponding release.
The sample output in Example 9-2 shows the status of the resources in the cluster that is
used for the SAN Volume Controller Metro and Global Mirror test scenarios. An important
observation from the sample output is how the use of the -p option provides details about the
node with the highest priority and a timestamp from the most recent operation.
Example 9-2 clRGinfo -p sample output
# clRGinfo -p
Cluster Name: svcxd
Resource Group Name: RG_sitea
Secondary instance(s):
The following node temporarily has the highest priority for this instance:
svcxd_b2, user-requested rg_move performed on Fri Mar 19 19:47:57 2010
Node                         Primary State       Secondary State
---------------------------- ------------------- -------------------
svcxd_a1@svc_sitea           ONLINE              OFFLINE
svcxd_a2@svc_sitea           OFFLINE             OFFLINE
svcxd_b2@svc_siteb           OFFLINE             ONLINE SECONDARY
svcxd_b1@svc_siteb           OFFLINE             OFFLINE
...
The output in Example 9-2 also shows the site-specific states for the two resource groups in
the cluster. The secondary states are designed to attempt to identify where the resources are
moved if a failover occurs across the sites.
Site failure: The cluster tries to use the location that is identified in the ONLINE
SECONDARY output, but if a site failure occurs, the resources might not always end up on
the expected target.
When sites are used, a resource group instance can be in one of the following states: ONLINE,
ONLINE SECONDARY, OFFLINE, OFFLINE SECONDARY, ERROR, ERROR SECONDARY,
UNMANAGED, or UNMANAGED SECONDARY.
  +--------------------------------------------------------------------------+
  |                       Inter-Site Management Policy                        |
  |                                                                           |
  | Move cursor to desired item and press Enter.                              |
  |                                                                           |
  |   ignore                                                                  |
  |   Prefer Primary Site                                                     |
  |   Online On Either Site                                                   |
  |   Online On Both Sites                                                    |
  |                                                                           |
  | F1=Help        F2=Refresh      F3=Cancel                                  |
  | F8=Image       F10=Exit        Enter=Do                                   |
  | /=Find         n=Find Next                                                |
  +--------------------------------------------------------------------------+
Figure 9-5 RG site-specific options
The default option is set to Ignore, but the following options are available:
Ignore: If you select this option, the resource group does not have ONLINE SECONDARY
instances. Use this option if you use cross-site LVM mirroring. You can also use it with
HACMP/XD for Metro Mirror.
Prefer Primary Site: The primary instance of the resource group is brought ONLINE on the
primary site at startup. The secondary instance is started on the other site. The primary
instance falls back when the primary site rejoins.
Online on Either Site: During startup the primary instance of the resource group is brought
ONLINE on the first node that meets the node policy criteria (either site). The secondary
instance is started on the other site. The primary instance does not fall back when the
original site rejoins.
Online on Both Sites: During startup the resource group (node policy must be defined
as online on all available nodes) is brought ONLINE on both sites. There is no failover
or failback.
The resource group moves to another site only if no node or condition exists under which it
can be brought or kept ONLINE on the site where it is located. The site that owns the active
resource group is called the primary site.
9.2.2 RG states
A resource group can be in one of many states. Table 9-7 lists the possible states of a
resource group. The RG state numbers are maintained internally and are not displayed by the
clRGinfo command.
Table 9-7 PowerHA resource group states
The table pairs internal RG state numbers (powers of two, from 1 through 262144) with
descriptions such as RG Invalid, RG Online State, RG Offline State, RG Unknown State,
RG Acquiring, RG Releasing, RG Error State, RG Online Secondary, RG Offline Secondary,
RG Acquiring Secondary, RG Releasing Secondary, and the parent/child, online on same
node, online on different nodes, and online on same site dependency states.
The following rules and restrictions are applicable to the online on same site dependency set
of resource groups:
All resource groups in a same site dependency set must have the same intersite
management policy, but might have different startup, failover, and failback policies. If
failback timers are used, they must be identical for all resource groups in the set.
All resource groups in the same site dependency set must be configured so that the nodes
that can own the resource groups are assigned to the same primary and secondary sites.
The online using node distribution policy startup policy is supported.
Both concurrent and non-concurrent resource groups are allowed.
You can have more than one same site dependency set in the cluster.
All resource groups in the same site dependency set that are active (ONLINE) are
required to be ONLINE on the same site, even though certain resource groups in the set
may be OFFLINE or in an ERROR state.
If you add a resource group included in a same node dependency set to a same site
dependency set, you must add all the other resource groups in the same node
dependency set to the same site dependency set.
# clRGinfo -v
Cluster Name: xdemc

Resource Group Name: mikeRG1
Startup Policy: Online On Home Node Only
Failover Policy: Failover To Next Priority Node In The List
Failback Policy: Never Failback
Site Policy: Prefer Primary Site
Node                         Primary State       Secondary State
---------------------------- ------------------- -------------------
xdemca1@siteA                ONLINE              OFFLINE
xdemca2@siteA                OFFLINE             OFFLINE
xdemcb1@siteB                OFFLINE             ONLINE SECONDARY
xdemcb2@siteB                OFFLINE             OFFLINE

Resource Group Name: mikeRG2
Startup Policy: Online On Home Node Only
Failover Policy: Failover To Next Priority Node In The List
Failback Policy: Never Failback
Site Policy: Prefer Primary Site
Node                         Primary State       Secondary State
---------------------------- ------------------- -------------------
xdemca1@siteA                ONLINE              OFFLINE
xdemca2@siteA                OFFLINE             OFFLINE
xdemcb1@siteB                OFFLINE             ONLINE SECONDARY
xdemcb2@siteB                OFFLINE             OFFLINE
Next, we defined a site dependency between the two resource groups through the PowerHA
panels. Figure 9-6 shows the PowerHA panel to define the dependency. Run the smitty
hacmp command. Select Extended Configuration → Extended Resource Configuration →
Configure Resource Group Run-Time Policies → Configure Dependencies between
Resource Groups → Configure Online on the Same Site Dependency → Add Online on
the Same Site Dependency Between Resource Groups. Then, press Enter.
Add Online on the Same Site Dependency Between Resource Groups

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Resource Groups to be Online on the same site         []
  +--------------------------------------------------------------------------+
  |              Resource Groups to be Online on the same site                |
  |                                                                           |
  | Move cursor to desired item and press F7.                                 |
  |   ONE OR MORE items can be selected.                                      |
  | Press Enter AFTER making all selections.                                  |
  |                                                                           |
  |   mikeRG1                                                                 |
  |   mikeRG2                                                                 |
  |                                                                           |
  | F1=Help        F2=Refresh      F3=Cancel                                  |
  | F7=Select      F8=Image        F10=Exit                                   |
  | Enter=Do       /=Find          n=Find Next                                |
  +--------------------------------------------------------------------------+
Figure 9-6 Defining online on same site dependency SMIT panel
After you define the dependency, you push the update across all nodes by running a cluster
synchronization. You confirm the dependency by running the following command:
# odmget HACMPrg_loc_dependency
HACMPrg_loc_dependency:
id = 1
set_id = 1
group_name = "mikeRG1"
priority = 0
loc_dep_type = "SITECOLLOCATION"
loc_dep_sub_type = "STRICT"
HACMPrg_loc_dependency:
id = 2
set_id = 1
group_name = "mikeRG2"
priority = 0
loc_dep_type = "SITECOLLOCATION"
loc_dep_sub_type = "STRICT"
With the relationship established, the cluster automatically groups them and requires you to
perform administrative tasks against all RGs in the dependency set. Figure 9-7 shows the
sample output when we tried to move only one of the resource groups to the remote site. With
the dependency defined, the cluster provides only an option to move the set.
Move a Resource Group to Another Node / Site

  +--------------------------------------------------------------------------+
  |                         Select a Resource Group                           |
  |                                                                           |
  | Move cursor to desired item and press Enter.                              |
  |                                                                           |
  | [MORE...5]                                                                |
  |   RG_siteb            ONLINE               xdemcb1 / siteB                |
  |   RG_siteb            ONLINE SECONDARY     xdemca1 / siteA                |
  |   mikeRG1             ONLINE SECONDARY     xdemcb1 / siteB                |
  |   mikeRG2             ONLINE SECONDARY     xdemcb1 / siteB                |
  |                                                                           |
  |   #                                                                       |
  |   # Resource groups in node or site collocation configuration:            |
  |   # Resource Group(s)                  State       Node / Site            |
  |   #                                                                       |
  |   mikeRG1,mikeRG2                      ONLINE      xdemca1 / siteA        |
  +--------------------------------------------------------------------------+
Figure 9-7 rg_move of RGs using an online on same site dependency
Similarly, when we tried to remove only one of the resource groups, we received a message
that indicates that the dependency must be removed before the resource group can be
deleted (Figure 9-8).
clrmgrp: ERROR: The resource group mikeRG2 is configured in the resource group
dependency configuration. Please delete the group from the dependency
configuration prior to removing the group.
Figure 9-8 Message when trying to remove one of the resource groups
If quorum loss is triggered and the resource group has its secondary instance online
on the affected node, PowerHA tries to recover the secondary instance on another
available node.
If a local network_down occurs on an XD_data network, PowerHA moves the replicated
resource groups that are ONLINE on the particular node to another available node on that
site. This function of the primary instance is mirrored to the secondary instance so that
secondary instances may be recovered via selective failover.
You can disable this default behavior by using the customized resource group recovery policy.
Figure 9-9 shows the panel to disable the selective failover behavior between sites. Run the
smitty hacmp command. Select Extended Configuration → Extended Resource
Configuration → HACMP Extended Resources Configuration → Customize Resource
Group and Resource Recovery → Customize Inter-Site Resource Group Recovery.
Then, press Enter.

Customize Inter-Site Resource Group Recovery

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  failover
  []
If the notify method is specified, you can achieve similar behavior as when you define an
HACMPpager event notification. The pager event notification uses sendmail to forward an
email to the account specified.
Even if the selective failover function is disabled, PowerHA still moves the resource group if a
node_down or node_up event occurs. Also, a user-designated rg_move operation continues
to behave as expected with the setting disabled.
Tip: The PowerHA software documentation states that when you select this option, you
can specify the notify method for the individual resource groups:
The resource groups that contain nodes from more than one site are listed. The
resource groups include those with a site management policy of Ignore. These
resource groups are not affected by this function even if you select one of them.
In our PowerHA 6.1 clusters, when this option is selected, it applies to all resource groups
in the cluster. The granularity is not available for individual resource groups.
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               xdemca1@siteA
               OFFLINE              xdemca2@siteA
               ONLINE SECONDARY     xdemcb1@siteB
               OFFLINE              xdemcb2@siteB
For our first test, we failed all connections to the storage from each of the nodes at site A and
watched the cluster lose quorum and attempt a move from node A1 to node A2 (Example 9-5).
Example 9-5 Failed connections to storage
# errpt
CAD234BE   0322123010 U H LVDD
52715FA5   0322123010 U H LVDD
E86653C3   0322123010 P H LVDD
The cluster first released the resource group on node A1 and attempted to acquire it on
node A2. After it detected that all connections were unavailable, it moved the resource group
to the second site as expected (Example 9-6).
Example 9-6 Resource group moved to second site
root@xdemca1:/>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE SECONDARY     xdemca1@siteA
               OFFLINE              xdemca2@siteA
               ONLINE               xdemcb1@siteB
               OFFLINE              xdemcb2@siteB
This test proved that selective failover on volume group loss worked. Next, we ran the same
test with the selective failover option disabled:
1. Disable the selective failover setting by changing the default behavior to notify
(Figure 9-10).
Customize Inter-Site Resource Group Recovery

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  notify                                                 +
  [/usr/local/scripts/no>
As shown in Example 9-7, the script was configured only to validate that the failover was
started.
Example 9-7 Script validating the failover
root@xdemca1:/usr/local/scripts>ls -l
total 8
-rwxr-----    1 root     system          ...          notify_failover.sh
root@xdemca1:/usr/local/scripts>cat notify_failover.sh
#!/bin/ksh
touch /usr/local/scripts/ran.sh
exit 0
2. Synchronize the change across all cluster nodes.
Attention: In our scenario, we only defined the notify_failover.sh script on the first
node in the cluster. The cluster verification did not check or complain that it did not exist
on all cluster nodes.
3. Disable all paths to the disks on both nodes at site A (Example 9-8).
Example 9-8 Disabling all paths to the disks on both nodes at site A
On Node A1
# powermt disable hba=0
# powermt disable hba=1
On Node A2
# powermt disable hba=0
# powermt disable hba=1
root@xdemca1:/usr/local/scripts>powermt display
Symmetrix logical device count=58
CLARiiON logical device count=0
Hitachi logical device count=0
Invista logical device count=0
HP xp logical device count=0
Ess logical device count=0
HP HSx logical device count=0
==============================================================================
----- Host Bus Adapters ---------  ------ I/O Paths -----  ------ Stats ------
###  HW Path                       Summary   Total   Dead  IO/Sec Q-IOs Errors
==============================================================================
   0 fscsi0                        failed      140    140      -     0      0
   1 fscsi1                        failed      140    140      -     0      0
4. Monitor the status of the resource group by using the repeated clRGinfo commands.
The results of our second test showed that with the selective failover option disabled the
cluster first attempted an rg_move between the local nodes. After it attempted to acquire
and release the resources on the second local node, it left the resources in the state that is
shown in Example 9-9.
Example 9-9 Testing the selective failover option
root@xdemca1:/usr/local/scripts>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                Node
-----------------------------------------------------------------------------
RG_sitea       ERROR                xdemca1@siteA
               ERROR                xdemca2@siteA
               ONLINE SECONDARY     xdemcb1@siteB
               ERROR                xdemcb2@siteB
While in the state shown in Example 9-9, we confirmed the status of the replicated LUNs.
Since we only altered access to the storage from the host, the replicated LUNs remained
in a consistent state (Example 9-10).
Example 9-10 State of the LUNs
root@xdemca1:/usr/local/scripts>symrdf list pd

Symmetrix ID: 000190100304

                             Local Device View
----------------------------------------------------------------------------
                      STATUS     MODES           RDF  S T A T E S
Sym   RDF             ---------  -----  R1 Inv   R2 Inv ---------------------
Dev   RDev  Typ:G     SA RA LNK  MDATE  Tracks   Tracks Dev  RDev  Pair
----  ----  --------  ---------  -----  -------  ------ ---  ----  -----------
0F53  00BF  R1:41     RW RW RW   S..1-        0       0 RW   WD    Synchronized
0F54  00C0  R1:41     RW RW RW   S..1-        0       0 RW   WD    Synchronized

Total                                   -------  ------
  Track(s)                                    0       0
  MB(s)                                     0.0     0.0
...
From the results of our tests, we concluded that disabling the selective failover function
through the Inter-Site Resource Group Recovery panel works as expected.
To recover our environment, we brought the paths to the storage back by re-enabling the
HBAs as follows:
# powermt enable hba=0
# powermt enable hba=1
We followed this step with a resource group operation to bring the resource group back
online. After processing the resources, our resources were activated once again
(Example 9-11).
Example 9-11 Activated resources
root@xdemca1:/usr/local/scripts>clRGinfo
-----------------------------------------------------------------------------
Group Name     State                Node
-----------------------------------------------------------------------------
RG_sitea       ONLINE               xdemca1@siteA
               OFFLINE              xdemca2@siteA
               ONLINE SECONDARY     xdemcb1@siteB
               OFFLINE              xdemcb2@siteB
(Figure 9-11: two nodes, node1 and node2, each with en0 and en1 network interfaces.)
In Figure 9-11, RSCT performs the following failure detection and diagnosis:
1. RSCT on node1 notices that heartbeat packets are no longer arriving from en1 and
notifies node2 (which has also noticed that heartbeat packets are no longer arriving from
en1).
2. RSCT on both nodes sends diagnostic packets between various combinations of NICs
(including out through one NIC and back in by using another NIC on the same node).
3. The nodes realize that all packets that involve node1's en1 are vanishing, but packets that
involve node2's en1 are being received.
4. The diagnosis is that node1's en1 has failed.
5. After it determines that it is unable to communicate with the other node, RSCT concludes
that the other node has failed and must be taken over.
XD_data
A network that can be used for data replication only. This network
supports adapter swap, but not failover to another node. RSCT
heartbeat packets are sent on this network. PowerHA 5.4.0 and
later versions support EtherChannel links and up to four
XD_data networks.
XD_ip
An IP-based network that is used for heartbeating and that can
also carry client traffic.
XD_rs232
A low-bandwidth, point-to-point serial network between the sites
that is used for heartbeating only.
Considering the limitations of an environment's infrastructure can help to identify single points
of failure. For example, consider an environment where both network and I/O traffic pass over
the same infrastructure (Figure 9-12).
(Figure 9-12 shows Site A and Site B, each with a node, LAN, SAN, and DWDM device; a
single xd_ip1 network and the eight 50 GB SITEAMETROVG LUNs are carried across the
DWDM link.)
Figure 9-12 Single XD_IP network 4-node SAN Volume Controller Metro Mirror environment
In Figure 9-12, both Ethernet and fiber communications pass between the DWDM devices at
each site. If these interconnects are the only communication links between the sites, a
disruption to them causes cluster partitioning. If that situation occurs, the cluster nodes at
each site assume primary roles for the applications.
Figure 9-13 shows the same environment, but includes additional networks based on the
recommendations described previously.
(Figure 9-13 shows the same environment with an added xd_rs232 network over a WAN, a
second xd_ip2 network, and a disk heartbeat enhanced concurrent volume group, diskhb_vg1
(hdisk2, 000fe4111f25a1d1, two 1 GB LUNs), in addition to SITEAMETROVG.)
In addition to the XD_ip network that communicates over the DWDM infrastructure, we
included a second XD_ip network that uses a separate network backbone. We also added an
XD_rs232 network and two dedicated shared LUNs for heartbeat networks between the sites.
Each one assumes that the requirements and infrastructure are available to configure it.
Each one provides protection against separate component failures and individually hardens
the environment.
The second XD_ip network assumes that there are alternate switches that provide
connectivity between the sites. The XD_rs232 network assumes that you also have the
required hardware and infrastructure to support it. If neither option is available, multiple
networks within the DWDM interconnects might be defined with the assumption that the
physical links can be the single point of failure.
Sharing dedicated LUNs strictly for heartbeating is only possible if the inter-site connection
provides an extended view of the SAN. If so, dedicated LUNs could be shared between the
systems at each site. In our example, we used a LUN from each storage enclosure and
defined two separate disk heartbeat networks. To accommodate this setup, the appropriate
zoning and storage mappings must be in place. The heartbeating takes place independently
from the replication of the LUNs set to copy data between the sites.
The heartbeat traffic for the disk heartbeat networks also passes over the DWDM
connections. However, the added value is that there is still a heartbeating link if the physical
interfaces that support the XD_ip network are offline. Our diagram only shows a single node
in each site. Multisite implementations with additional nodes at each site need to balance
between designing a ring topology for the disk heartbeat networks and whether to map a LUN
from each storage subsystem to have even more redundant networks.
Environments that are dispersed in such a way that the SAN could not be extended ultimately
rely on redundant IP-based networks on separately leased lines and XD_rs232 extended
serial networks.
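To review which heartbeat paths a cluster actually has, the cluster network list and the RSCT Topology Services view can be compared; a short sketch:

# /usr/es/sbin/cluster/utilities/cllsnw      (list the cluster networks, including XD and diskhb networks)
# lssrc -ls topsvcs                          (Topology Services view of the heartbeat rings and their state)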
However, in the processing of the resources on site B, the cluster redirected the relationship
between the source and target LUNs. The replicated LUNs at the second site were changed
to a master role and the original site was changed to an auxiliary role. This change directly
impacted access to the LUNs on the primary site. In our tests, we lost write access to the
disks and several of our commands hung.
Example 9-12 shows output collected from the tests that were performed on our DS cluster
while in a partitioned state.
Example 9-12 Output from the DS cluster in a partitioned state
Figure 9-14 shows the SAN Volume Controller resources online on both sites with the role of
the source and target copies reversed.
(Figure 9-14 shows the xd_ip links between Site A and Site B broken. The svcxd_A1/svcxd_A2
and svcxd_B1/svcxd_B2 nodes are active at both sites, the Metro Mirror master and auxiliary
roles are reversed, and the original site, served by the B8_8G4 SVC cluster, has no write
access to the LUNs.)
Figure 9-14 Partitioned SAN Volume Controller cluster scenario with resources active on site B
In a split scenario, we can actively run on the second site. Each of our sites is configured with
the designated IPs in separate network segments. Therefore, upon failover, we did not
experience duplicate IP address errors in the error log, which might be another consideration
in a real customer scenario, depending on the setup.
One of the considerations for our testing was the effect of an intermittent problem between
the sites. Therefore, the next step was to start the IP links again. After the IP links were
started, the RSCT communication was re-established and the nodes on the remote site (site
B) were halted. The halt was a result of the cluster identifying that the resources were hosted
on both sites. The RSCT group services experience a domain merge and log a
GS_DOM_MERGE_ER entry in the error report, which leads to the clstrmgrES core dumping
and halts the nodes (Example 9-13 on page 464).
LABEL:           GS_DOM_MERGE_ER
IDENTIFIER:      9DEC29E1

Date/Time:       Wed Mar 24 19:48:06 2010
Sequence Number: 206835
Machine Id:      00CA02EF4C00
Node Id:         svcxd_b1
Class:           O
Type:            PERM
WPAR:            Global
Resource Name:   grpsvcs

Description
Group Services daemon exit to merge domains

Probable Causes
Network between two node groups has repaired

Failure Causes
Network communication has been blocked.
Topology Services has been partitioned.

Recommended Actions
Check the network connection.
Check the Topology Services.
Verify that Group Services daemon has been restarted
Call IBM Service if problem persists

Detail Data
DETECTING MODULE
RSCT,NS.C,1.107.1.49,4461
ERROR ID
6Vb0vR0qGee9/BU4/ckbQ7....................
REFERENCE CODE

DIAGNOSTIC EXPLANATION
NS::Ack(): The master requests to dissolve my domain because of the merge with
other domain 1.24
These results are expected because the cluster protects data integrity by never allowing
non-concurrent resources to be active on both sites at the same time. However, a word of
caution: we were left with the active nodes on site A with no write access to the LUNs.
Effectively, the application services were now offline, and we were left with no choice but to
recover the environment manually.
To evaluate the situation and assess the damage, identify the status of the following items:
The role of the planning team and the system architect is to design the environment to
minimize the risk of an intermittent network failure like the one described in the previous
scenario. If the cluster becomes partitioned, you must quickly identify the component that is
responsible for the failure. If network access is lost, you might need to reach the servers
through other interfaces, or log on to the management consoles (HMCs) to determine whether
the LPARs are still online. Bringing down all of the nodes in one of the sites might be the best
plan to avoid a halt of certain servers if the heartbeat communication suddenly resumes.
A normal cluster shutdown can fail if access to the volumes is not available. A cluster stop
starts the application stop scripts, which attempt to end the corresponding processes.
However, cluster processing also attempts to unmount and vary off the resources, which
might fail if access to the volumes is unavailable. In such a situation, performing a hard reset
of the nodes is the best approach.
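If the operating system itself no longer responds, the reset typically must be driven from the HMC command line. The following is a minimal sketch, assuming illustrative managed system and LPAR names:
# Hard stop of a hung cluster LPAR from the HMC (names are illustrative)
chsysstate -m P550_siteA -r lpar -n svcxd_A1 -o shutdown --immed
# Confirm the partition state afterward
lssyscfg -r lpar -m P550_siteA -F name,state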
When access to the volumes is lost, the AIX error report also typically fills with LVDD entries,
for example:
0322123010 U H LVDD
0322123010 U H LVDD
0322123010 P H LVDD
Polling the status of the cluster resources shows whether the cluster considers the resource
group to be online, even if the application is not operational. To avoid this situation, have
application monitors in place. If none are configured, polling the process table and checking
the status of the specific applications also gives an indication of the responsiveness of the
system.
The system administrator typically already has a list of checks for the individual applications
that are hosted on the servers. Certain commands might hang when you try to poll the status
if some of the backing resources are unavailable:
# ps -ef | grep ora_pmon_hatest
  oracle  548872       1   0 17:19:23      -  0:00 ora_pmon_hatest
It may still be difficult to predict which servers to bring offline. Therefore, performing such
actions as querying the disks and checking the status of the replicated resources is another
critical part of troubleshooting (Example 9-14).
Example 9-14 Status of the replicated resources
# lquerypv -h /dev/hdisk4
00000000   C9C2D4C1 00000000 00000000 00000000  |................|
00000010   00000000 00000000 00000000 00000000  |................|
00000020   00000000 00000000 00000000 00000000  |................|
00000030   00000000 00000000 00000000 00000000  |................|
00000040   00000000 00000000 00000000 00000000  |................|
00000050   00000000 00000000 00000000 00000000  |................|
00000060   00000000 00000000 00000000 00000000  |................|
00000070   00000000 00000000 00000000 00000000  |................|
00000080   000FE411 2579EE4C 00000000 00000000  |....%y.L........|
00000090   00000000 00000000 00000000 00000000  |................|
000000A0   00000000 00000000 00000000 00000000  |................|
000000B0   00000000 00000000 00000000 00000000  |................|
000000C0   00000000 00000000 00000000 00000000  |................|
000000D0   00000000 00000000 00000000 00000000  |................|
000000E0   00000000 00000000 00000000 00000000  |................|
000000F0   00000000 00000000 00000000 00000000  |................|
When the disk is not responsive, the command might hang or return no information:
# lquerypv -h /dev/hdisk
....
The command may hang and eventually time out if access to the disks is not available. The
method that is used to check the status of the replicated resources varies depending on the
replication type that is being used.
Example 9-15 shows the status from the primary node of a SAN Volume Controller cluster in
a partitioned scenario. In this output, the A1 node originally hosts the resources. After the
remote site activates the resources, it changes the role of the source volumes from master to
auxiliary. When this situation occurs, we are unable to query the disks.
Example 9-15 Primary node status on a SAN Volume Controller cluster for a partitioned scenario
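A minimal sketch of how the relationship roles can be queried from the SAN Volume Controller CLI over SSH (the cluster address and user are illustrative):
# List the Metro Mirror relationships and their current master and auxiliary roles
ssh admin@svc_cluster_siteA "svcinfo lsrcrelationship -delim :"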
Example 9-16 shows the adapter status when the adapters are active; the output looks
different when a problem occurs. When you use virtual Ethernet adapters from the VIOS,
troubleshooting might require a deeper investigation to identify the reason for the network loss.
Example 9-16 Adapter status on the host
# ifconfig -a
en0:
flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT
,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
inet 192.168.8.103 netmask 0xffffff00 broadcast 192.168.8.255
inet 192.168.100.173 netmask 0xffffff00 broadcast 192.168.100.255
inet 192.168.100.54 netmask 0xffffff00 broadcast 192.168.100.255
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1:
flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT
,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
inet 10.12.5.36 netmask 0xfffffc00 broadcast 10.12.7.255
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
lo0: flags=e08084b<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT>
inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
inet6 ::1/0
tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
If a site split occurs, there might be situations in which a client considers bringing down the
entire environment and manually activating the resources. The client might then perform
manual integrity checks to better assess the situation. When the consistency of the data is
confirmed, the cluster might be restarted in a phased approach. First, reactivate the cluster
on the most appropriate site, and then integrate the remaining cluster nodes based on
feedback from the network, storage, and application administrators.
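A minimal sketch of such a phased restart, assuming the site that held the consistent data is started first:
# On the nodes of the chosen site only, start cluster services (SMIT fast path)
smitty clstart
# Verify the resource group placement before integrating the nodes at the other site
/usr/es/sbin/cluster/utilities/clRGinfo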
Summary
In conclusion, the use of the PowerHA Enterprise Edition with the various replication methods
only provides as much resiliency as the infrastructure that supports it. Therefore, the
significance of addressing potential single points of failure in a multisite cluster is even more
apparent. Evaluate and plan the overall infrastructure in such a fashion that the risk of a
partitioned cluster is minimized. The cluster software attempts to always protect the integrity
of the data and efficiently manage the resources. Use such features as the custom
application monitoring and pager notification methods if no other notification is already
provided by separate external monitoring software. Going forward, IBM plans to deliver
additional enhancements to PowerHA SystemMirror Enterprise Edition, which will continue
to serve as the premier high availability and disaster recovery solution for AIX on Power Systems.
Chapter 10. Disaster recovery by using DS8700 Global Mirror
cluster.es.pprc.cmds
cluster.es.pprc.rte
cluster.es.spprc.cmds
cluster.es.spprc.rte
cluster.msg.en_US.pprc
PPRC and SPPRC file sets: The PPRC and SPPRC file sets are not required for
Global Mirror support on PowerHA.
The following additional file sets are included in SP3. They must be installed separately and
require the acceptance of licenses during the installation:
cluster.es.genxd
cluster.es.genxd.cmds
cluster.es.genxd.rte
cluster.msg.en_US.genxd
Verify that all the data volumes that must be mirrored are visible to all relevant AIX hosts.
Verify that the DS8700 volumes are appropriately zoned so that the IBM FlashCopy
volumes are not visible to the PowerHA SystemMirror nodes.
Ensure all Hardware Management Consoles (HMCs) are accessible by using the Internet
Protocol network for all PowerHA SystemMirror nodes where you want to run Global
Mirror.
10.1.3 Considerations
The PowerHA SystemMirror Enterprise Edition with DS8700 Global Mirror has the following
considerations:
The AIX Virtual SCSI is not supported in this initial release.
No auto-recovery is available from a PPRC path or link failure.
If the PPRC path or link between Global Mirror volumes breaks down, the PowerHA
Enterprise Edition is unaware of it. (PowerHA does not process Simple Network
Management Protocol (SNMP) for volumes that use DS8K Global Mirror technology for
mirroring). In this case, the user must identify and correct the PPRC path failure.
Depending on timing conditions, such an event can cause the corresponding Global Mirror
session to go into an unrecoverable state. If this situation occurs, the user must manually
stop and restart the corresponding Global Mirror session (by using the rmgmir and mkgmir
DSCLI commands) or an equivalent DS8700 interface.
Cluster Single Point Of Control (C-SPOC) cannot perform some Logical Volume Manager
(LVM) operations on nodes at the remote site that contain the target volumes.
Operations that require nodes at the target site to read from the target volumes result in an
error message in C-SPOC. Such operations include such functions as changing the file
system size, changing the mount point, and adding LVM mirrors. However, nodes on the
same site as the source volumes can successfully perform these tasks, and the changes
can be propagated later to the other site by using a lazy update.
Attention: For C-SPOC operations to work on all other LVM operations, you must
perform all C-SPOC operations with the DS8700 Global Mirror volume pairs in a
synchronized or consistent state. Alternatively, you must perform them in the active
cluster on all nodes.
The volume group names must be listed in the same order as the DS8700 mirror group
names in the resource group.
For this test, the resources are limited. Each system has a single IP address, an XD_ip
network, and a single Fibre Channel (FC) host adapter. Ideally, redundancy would exist
throughout the environment, including in the local Ethernet networks, the cross-site XD_ip
networks, and the FC connectivity. This scenario has a single resource group, ds8kgmrg,
which consists of a service IP address (service_1), a volume group (txvg), and a DS8000
Global Mirror replicated resource (texasmg). To configure the cluster, see 10.6, Configuring
the cluster on page 483.
For information about how to set up the storage units, see IBM System Storage DS8700
Architecture and Implementation, SG24-8786.
DS:
username:      redbook
password:      r3dbook
hmc1:          9.3.207.122
devid:         IBM.2107-75DC890
remotedevid:   IBM.2107-75DC980
Table 10-1 shows the association between the source and target volumes of the replication
relationship and between their logical subsystems (LSS, the two most significant digits of a
volume identifier that are highlighted in bold in the table). Table 10-1 also indicates the
mapping between the volumes in the DS8000 units and their disk names on the attached AIX
hosts.
Table 10-1 AIX hdisk to DS8000 volume mapping
Site Texas                    Site Romania
AIX disk     LSS/VOL ID       LSS/VOL ID     AIX disk
hdisk10      2E00             2800           hdisk2
hdisk6       2600             2C00           hdisk6
You can easily obtain this mapping by using the lscfg -vl hdiskX | grep Serial command
as shown in Example 10-3. The hdisk serial number is a concatenation of the storage image
serial number and the ID of the volume at the storage level.
Example 10-3 The hdisk serial number in the lscfg command output
3. Among the multiple displayed links, choose two that have their ports on different adapters.
Use them to create the PPRC path for the 2e:28 LSS pair (see Example 10-5).
Example 10-5 Creating pprc paths
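A minimal sketch of such a DSCLI invocation, assuming a hypothetical port pair (I0040:I0141); the device IDs and the remote WWNN are the ones listed for the Texas and Romania units in this scenario:
dscli> mkpprcpath -dev IBM.2107-75DC890 -remotedev IBM.2107-75DC980 -remotewwnn 5005076308FFC804 -srclss 2e -tgtlss 28 I0040:I0141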
Open Global Mirror session 03 on each of the LSSs at both storage units by using the mksession command:
dscli> mksession -lss 2e 03
2010 6:11:07 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
Session 03 opened successfully.
dscli> mksession -lss 26 03
2010 6:11:25 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
Session 03 opened successfully.
dscli> mksession -lss 28 03
2010 5:39:02 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
Session 03 opened successfully.
dscli> mksession -lss 2c 03
2010 5:39:15 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
Session 03 opened successfully.
Including all the source and target volumes in the Global Mirror session
Add the volumes in the Global Mirror sessions and verify their status by using the commands
that are shown in Example 10-11.
Example 10-11 Adding source and target volumes to the Global Mirror sessions
dscli> chsession -lss 26 -action add -volume 2600 03
Date/Time: October 5, 2010 6:15:17 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
CMUC00147I chsession: Session 03 successfully modified.
dscli> chsession -lss 2e -action add -volume 2e00 03
Date/Time: October 5, 2010 6:15:56 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
CMUC00147I chsession: Session 03 successfully modified.
dscli> lssession 26 2e
Date/Time: October 5, 2010 6:16:21 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
===================================================================================================================
26     03      Normal 2600   Join Pending Primary Copy Pending Secondary Simplex True              Disable
2E     03      Normal 2E00   Join Pending Primary Copy Pending Secondary Simplex True              Disable
dscli>
dscli> chsession -lss 28 -action add -volume 2800 03
Date/Time: October 6, ...
CMUC00147I chsession: Session 03 successfully modified.
dscli> chsession -lss 2c -action add -volume 2c00 03
Date/Time: October 6, ...
root@leeann: lvlstmajor
50...
root@robert: lvlstmajor
44..54,56...
root@jordan: # lvlstmajor
50...
2. Create a volume group, called txvg, and a file system, called /txro. These volumes are
already identified in 10.4.2, Identifying the source and target volumes on page 473. They
are hdisk6 and hdisk10 on the jordan node. Example 10-13 shows a list of commands to
run on the jordan node.
Example 10-13 Creating txvg volume group on jordan
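A minimal sketch of the kind of commands involved, assuming a scalable volume group, an illustrative common free major number (50), and an illustrative logical volume size:
# Create the volume group on the two replicated source disks (jordan)
mkvg -S -y txvg -V 50 hdisk6 hdisk10
# Create a jfs2 logical volume and the /txro file system on it
mklv -y txlv -t jfs2 txvg 250
crfs -v jfs2 -d txlv -m /txro -A no
mount /txro
# Vary the volume group offline so that it can be imported on the other nodes
umount /txro
varyoffvg txvg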
3. Import the volume group on the other local node, leeann, as follows:
Verify that the shared disks have the same PVID on both nodes.
Run the rmdev -dl command for each hdisk.
Run the cfgmgr program.
Run the importvg command.
Example 10-14 Importing the txvg volume group on the leeann node
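A minimal sketch of that sequence on leeann, assuming the same hdisk numbering and major number as on jordan:
# Refresh the disk definitions so that the PVIDs written on jordan become visible
rmdev -dl hdisk6
rmdev -dl hdisk10
cfgmgr
lspv
# Import the volume group with the same major number, then leave it varied off
importvg -y txvg -V 50 hdisk6
varyoffvg txvg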
txvg
txvg
        PVs   LV STATE     MOUNT POINT
        2     open/syncd   /txro
        1     open/syncd   N/A
4. To make the target volumes available to the attached hosts, use the failoverpprc
command on the secondary site as shown in Example 10-16.
Example 10-16 The failoverpprc command on the secondary site storage unit
dscli> failoverpprc -type gcp 2C00:2600 2800:2E00
Date/Time: October 6, 2010 3:55:19 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2C00:2600 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2800:2E00 successfully reversed.
dscli> lspprc 2C00:2600 2800:2E00
Date/Time: October 6, 2010 3:55:35 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
ID        State     Reason      Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
2800:2E00 Suspended Host Source Global Copy 28        60             Disabled      True
2C00:2600 Suspended Host Source Global Copy 2C        60             Disabled      True
dscli>
5. Refresh and check the PVIDs. Then, import and vary off the volume group as shown in
Example 10-17.
Example 10-17 Importing the volume group txvg on the secondary site node, robert
txvg
txvg
        PVs   LV STATE       MOUNT POINT
        2     closed/syncd   /txro
        1     closed/syncd   N/A
Configuring the basic cluster topology consists of the following tasks:
Adding a cluster
Adding nodes
Adding sites
Adding networks
Adding communication interfaces
Adding a cluster
To add a cluster:
1. From the command line, type the smitty hacmp command.
2. In SMIT, select Extended Configuration → Extended Topology Configuration →
Configure an HACMP Cluster → Add/Change/Show an HACMP Cluster.
3. Enter the cluster name, which is Txrmnia in this scenario, as shown in Figure 10-3. Press
Enter.
Add/Change/Show an HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Cluster Name                                          [Txrmnia]
Figure 10-3 Adding a cluster in the SMIT menu
Adding nodes
To add the nodes:
1. From the command line, type the smitty hacmp command.
2. In SMIT, select the path Extended Configuration → Extended Topology Configuration →
Configure HACMP Nodes → Add a Node to the HACMP Cluster.
3. Enter the desired node name, which is jordan in this case, as shown in Figure 10-4. Press
Enter. The SMIT Command Status window shows the output.
Add a Node to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Node Name                                             [jordan]
  Communication Path to Node                            []
4. In this scenario, repeat these steps two more times to add the additional nodes of leeann
and robert.
Adding sites
To add the sites:
1. From the command line, type the smitty hacmp command.
2. In SMIT, select the path Extended Configuration → Extended Topology Configuration →
Configure HACMP Sites → Add a Site.
3. Enter the desired site name, which in this scenario is the Texas site with the nodes jordan
and leeann, as shown in Figure 10-5. Press Enter. The SMIT Command Status window
shows the output.
Add a Site

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Site Name                                             [Texas]
* Site Nodes                                            jordan leeann
4. In this scenario, repeat these steps to add the Romania site with the robert node.
Example 10-19 shows the site definitions. The dominance information is displayed, but not
relevant until a resource group is defined later by using the nodes.
Example 10-19 cllssite information about site definitions
./cllssite
---------------------------------------------------------------------
Sitename            Site Nodes          Dominance      Protection Type
---------------------------------------------------------------------
Texas               jordan leeann                      NONE
Romania             robert                             NONE
Adding networks
To add the networks:
1. From the command line, type the smitty hacmp command.
2. In SMIT, select the path Extended Configuration → Extended Topology Configuration →
Configure HACMP Networks → Add a Network to the HACMP Cluster.
3. Choose the network type, which in this scenario is XD_ip.
4. Keep the default network name and press Enter (Figure 10-6).
Add an IP-Based Network to the HACMP Cluster

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Network Name                                          [net_XD_ip_01]
* Network Type                                          XD_ip
* Netmask(IPv4)/Prefix Length(IPv6)                     [255.255.255.0]
* Enable IP Address Takeover via IP Aliases             [Yes]
  IP Address Offset for Heartbeating over IP Aliases    []
5. Repeat these steps but select a network type of diskhb for the disk heartbeat network and
keep the default network name of net_diskhb_01.
4. Complete the SMIT menu fields. The first interface in this scenario, for the jordan node,
is shown in Figure 10-7. Press Enter. The SMIT Command Status window shows the output.
Add a Communication Interface

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* IP Label/Address                                      [jordan_base]
* Network Type                                          XD_ip
* Network Name                                          net_XD_ip_01
* Node Name                                             [jordan]
5. Repeat these steps, and select Communication Devices to complete the disk heartbeat
network.
The topology is now configured. Also, you can see all the interfaces and devices from the
cllsif command output that is shown in Figure 10-8.
Adapter        Type      Network         Net Type   Attribute   Node     IP Address
jordan_base    boot      net_XD_ip_01    XD_ip      public      jordan   9.3.207.209
jordandhb      service   net_diskhb_01   diskhb     serial      jordan   /dev/hdisk8
leeann_base    boot      net_XD_ip_01    XD_ip      public      leeann   9.3.207.208
leeanndhb      service   net_diskhb_01   diskhb     serial      leeann   /dev/hdisk8
robert_base    boot      net_XD_ip_01    XD_ip      public      robert   9.3.207.207
In most real multisite scenarios, where each site is on a different network segment, it is
common to create at least two service IP labels, one for each site, by using the Associated
Site option, which indicates the desire for site-specific service IP labels. With this option, you
can have a unique service IP label at each site. However, we do not use them in this test
because both sites are on the same network segment.
The DS8000 Global Mirror configuration itself consists of defining a storage agent, the
storage systems, and a mirror group.
3. Complete the menu appropriately and press Enter. Figure 10-10 shows the configuration
for this scenario. The SMIT Command Status window shows the output.

Add a Storage Agent

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Storage Agent Name                                    [ds8khmc]
* IP Addresses                                          [9.3.207.122]
* User ID                                               [redbook]
* Password                                              [r3dbook]
It is possible to have multiple storage agents. However, this test scenario has only one
storage agent that manages both storage units.
Important: The user ID and password are stored as flat text in the
HACMPxd_storage_agent.odm file.
The storage system at the Texas site is defined as follows:

Add a Storage System

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Storage System Name                                   [texasds8k]
* Storage Agent Name                                    ds8kmainhmc            +
* Site Association                                      Texas                  +
* Vendor Specific Identifier                            [IBM.2107-75DC890]
* WWNN                                                  [5005076308FFC004]
4. Repeat these steps for the storage system at Romania site, and name it romaniads8k.
Example 10-20 shows the configuration.
Example 10-20 Storage systems definitions
texasds8k      ds8kmainhmc    Texas      IBM.2107-75DC890    5005076308FFC004
romaniads8k    ds8kmainhmc    Romania    IBM.2107-75DC980    5005076308FFC804
The mirror group is defined as follows:

Add a Mirror Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Mirror Group Name                                     [texasmg]
* Storage System Names                                  texasds8k romaniads8k  +
* Vendor Specific Identifier                            [03]
  Recovery Action                                       automatic              +
  Maximum Coordination Time                             [50]
  Maximum CG Drain Time                                 [30]
  Consistency Group Interval Time                       [0]
Vendor Specific Identifier field: For the Vendor Specific Identifier field, provide only the
Global Mirror session number.
The resource group ds8kgmrg is defined with the following attributes:

Resource Group Name                          ds8kgmrg
Inter-Site Management Policy                 Prefer Primary Site
Participating Nodes from Primary Site        jordan leeann
Participating Nodes from Secondary Site      robert
Startup Policy                               Online On Home Node Only
Fallover Policy                              Fallover To Next Priority Node
Fallback Policy                              Never Fallback
Service IP Label                             serviceip_2
Volume Groups                                txvg
GENXD Replicated Resources                   texasmg
DS8000 Global Mirror Replicated Resources field: In the SMIT menu for adding
resources to the resource group, notice that the appropriate field is named DS8000 Global
Mirror Replicated Resources. However, when you view the menu by using the clshowres
command (Example 10-21 on page 490), the field is called GENXD Replicated Resources.
You can now synchronize the cluster, start the cluster, and begin testing it.
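A minimal sketch of that final sequence, run from one node (the verification step uses the SMIT menus shown earlier in this chapter):
# Verify and synchronize the cluster definition:
#   smitty hacmp -> Extended Configuration -> Extended Verification and Synchronization
# Then start cluster services on the nodes (SMIT fast path)
smitty clstart
# Confirm that the ds8kgmrg resource group comes online on the primary node
/usr/es/sbin/cluster/utilities/clRGinfo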
jordan# clRGinfo
-----------------------------------------------------------------------------
Group Name     State                    Node
-----------------------------------------------------------------------------
ds8kgmrg       ONLINE                   jordan@Texas
               OFFLINE                  leeann@Texas
               ONLINE SECONDARY         robert@Romania
After each test, we show the Global Mirror states. Example 10-23 shows the normal running
production status of the Global Mirror pairs from each site.
Example 10-23 Beginning states of the Global Mirror pairs
4. Select the Romania site from the next menu as shown in Figure 10-15.
+--------------------------------------------------------------------------+
|                        Select a Destination Site                         |
|                                                                          |
| Move cursor to desired item and press Enter.                             |
|                                                                          |
|   # *Denotes Originally Configured Primary Site                          |
|     Romania                                                              |
|                                                                          |
| F1=Help          F2=Refresh          F3=Cancel                           |
| F8=Image         F10=Exit            Enter=Do                            |
| /=Find           n=Find Next                                             |
+--------------------------------------------------------------------------+
Figure 10-15 Selecting a site for a resource group move
Upon completion of the move, ds8kgmrg is online on the node robert as shown
Example 10-24.
Attention: During our testing, we encountered a problem. After we performed the first
resource group move between sites, we were unable to move it back because the pick
list for the destination site was empty. We could move it back by node. Later in our
testing, the by-site option started working. However, it moved the resource group to the
standby node at the primary site instead of the original primary node. If you encounter
similar problems, contact IBM support.
Example 10-24 Resource group status after the site move to Romania
-----------------------------------------------------------------------------
Group Name     State                    Node
-----------------------------------------------------------------------------
ds8kgmrg       ONLINE SECONDARY         jordan@Texas
               OFFLINE                  leeann@Texas
               ONLINE                   robert@Romania
6. Repeat the resource group move to move the resource group back to its original primary
site, Texas, and node, jordan, to return to the original starting state. However, instead of
using the option to move it to another site, use the option to move it to another node.
Example 10-25 shows that the Global Mirror statuses are now swapped, and the local site is
showing the LUNs now as the target volumes.
Example 10-25 Global Mirror status after the resource group move
*******************From node jordan at site Texas***************************
dscli> lssession 26 2E
Date/Time: October 10, 2010 4:04:44 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
LSS ID Session Status Volume VolumeStatus PrimaryStatus   SecondaryStatus        FirstPassComplete AllowCascading
==================================================================================================================
26     03      Normal 2600   Active       Primary Simplex Secondary Copy Pending True              Disable
2E     03      Normal 2E00   Active       Primary Simplex Secondary Copy Pending True              Disable
dscli> lspprc 2600 2E00
Date/Time: October 10, 2010 4:05:26 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
ID        State               Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================
2800:2E00 Target Copy Pending -      Global Copy 28        unknown        Disabled      Invalid
2C00:2600 Target Copy Pending -      Global Copy 2C        unknown        Disabled      Invalid
*******************From remote node robert at site Romania***************************
dscli> lssession 28 2C
Date/Time: October 10, 2010 3:59:25 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
LSS ID Session Status         Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
==========================================================================================================================
28     03      CG In Progress 2800   Active       Primary Copy Pending Secondary Simplex True              Disable
2C     03      CG In Progress 2C00   Active       Primary Copy Pending Secondary Simplex True              Disable
Begin with all three nodes active in the cluster and the resource group online on the primary
node as shown in Example 10-22 on page 492.
On the node jordan, we run the reboot -q command. The node leeann acquires the
ds8kgmrg resource group as shown in Example 10-26.
Example 10-26 Local node failover within the site Texas
root@leeann: clRGinfo
-----------------------------------------------------------------------------
Group Name     State                    Node
-----------------------------------------------------------------------------
ds8kgmrg       OFFLINE                  jordan@Texas
               ONLINE                   leeann@Texas
               ONLINE SECONDARY         robert@Romania
Example 10-27 shows that the statuses are the same as when we started.
Example 10-27 Global Mirror pair status after a local failover
After the cluster stabilizes, we run the reboot -q command on the leeann node to trigger a
site_down event. The robert node at the Romania site acquires the ds8kgmrg resource group
as shown in Example 10-28.
Example 10-28 Hard failover between sites
root@robert: clRGinfo
-----------------------------------------------------------------------------
Group Name     State                    Node
-----------------------------------------------------------------------------
ds8kgmrg       OFFLINE                  jordan@Texas
               OFFLINE                  leeann@Texas
               ONLINE                   robert@Romania
You can also see that the replicated pairs are now in the suspended state at the remote site as
shown in Example 10-29.
Example 10-29 Global Mirror pair status after site failover
*******************From remote node robert at site Romania***************************
dscli> lssession 28 2c
Date/Time: October 10, 2010 4:17:28 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
LSS ID Session Status Volume VolumeStatus PrimaryStatus     SecondaryStatus   FirstPassComplete AllowCascading
===============================================================================================================
28     03      Normal 2800   Join Pending Primary Suspended Secondary Simplex False             Disable
2C     03      Normal 2C00   Join Pending Primary Suspended Secondary Simplex False             Disable
dscli> lspprc 2800 2c00
Date/Time: October 10, 2010 4:17:55 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
ID        State     Reason      Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
2800:2E00 Suspended Host Source Global Copy 28        60             Disabled      False
2C00:2600 Suspended Host Source Global Copy 2C        60             Disabled      False
Important: Although the testing resulted in a site_down event, we never lost access to the
primary storage subsystem. PowerHA does not check storage connectivity back to the
primary site during this event. Before you move back to the primary site, re-establish the
replicated pairs and get them all back in sync. If you replace the storage, you might also
have to change the storage agent, storage subsystem, and mirror groups to ensure that
the new configuration is correct for the cluster.
To return production to the primary site in a controlled manner, the general sequence is as follows:
Verify that the Global Mirror statuses at the primary site are suspended.
Fail back the PPRC from the secondary site.
Verify that the Global Mirror status at the primary site shows the target status.
Verify that the out-of-sync tracks are 0.
Stop the cluster to ensure that the volume group I/O is stopped.
Fail over the PPRC on the primary site.
Fail back the PPRC on the primary site.
Start the cluster.
2. On the remote node robert, fail back the PPRC pairs as shown in Example 10-31.
Example 10-31 Failing back PPRC pairs at the remote site
*******************From node robert at site Romania***************************
dscli> failbackpprc -type gcp 2C00:2600 2800:2E00
Date/Time: October 10, 2010 4:22:09 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2C00:2600 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2800:2E00 successfully failed back.
3. After you run the fallback, check the status again of the pairs from the primary site to
ensure that they are now shown as Target (Example 10-32).
Example 10-32 Verifying that the primary site LUNs are now target LUNs
3. Again verify that the status is in the suspended state on the primary site and that the
remote site shows the copy state as shown in Example 10-35.
Example 10-35 Global Mirror pairs that are suspended on the primary site
Verify the status of the pairs at each site as shown in Example 10-37.
Example 10-37 Global Mirror pairs failed back to the primary site
*******************From node jordan at site Texas***************************
dscli> lspprc 2600 2e00
Date/Time: October 10, 2010 4:47:04 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
===================================================================================================
2600:2C00 Copy Pending -      Global Copy 26        60             Disabled      True
2E00:2800 Copy Pending -      Global Copy 2E        60             Disabled      True
******************From node robert at site Romania***************************
dscli> lspprc 2800 2c00
Date/Time: October 10, 2010 4:40:44 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC980
ID        State               Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================
2600:2C00 Target Copy Pending -      Global Copy 26        unknown        Disabled      Invalid
2E00:2800 Target Copy Pending -      Global Copy 2E        unknown        Disabled      Invalid
1. Start cluster services by using the smitty clstart fast path:

Start Cluster Services

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
* Start now, on system restart or both                  now                        +
  Start Cluster Services on these nodes                 [jordan,leeann,robert]     +
* Manage Resource Groups                                Automatically              +
* BROADCAST message at startup?                         true                       +
* Startup Cluster Information Daemon?                   true                       +
  Ignore verification errors?                           false                      +
  Automatically correct errors found during             Interactively              +
    cluster start?
Upon startup of the primary node jordan, the resource group is automatically started on
jordan and back to the original starting point as shown in Example 10-38.
Example 10-38 Resource group status after restart
-----------------------------------------------------------------------------
Group Name     State                    Node
-----------------------------------------------------------------------------
ds8kgmrg       ONLINE                   jordan@Texas
               OFFLINE                  leeann@Texas
               ONLINE SECONDARY         robert@Romania
2. Verify the pair and session status on each site as shown in Example 10-39.
Example 10-39 Global Mirror pairs back to normal
*******************From node jordan at site Texas***************************
dscli> lssession 26 2e
Date/Time: October 10, 2010 5:02:11 PM CDT IBM DSCLI Version: 6.5.15.19 DS: IBM.2107-75DC890
LSS ID Session Status         Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
==========================================================================================================================
26     03      CG In Progress 2600   Active       Primary Copy Pending Secondary Simplex True              Disable
2E     03      CG In Progress 2E00   Active       Primary Copy Pending Secondary Simplex True              Disable
Site Texas                     Site Romania
AIX disk     LSS/VOL ID        AIX disk     LSS/VOL ID
hdisk11      2605              hdisk10      2C06
3. Add the new disk into the volume group by using C-SPOC as follows:
Important: C-SPOC cannot perform certain LVM operations on nodes at the remote
site (that contain the target volumes). Such operations include operations that require
nodes at the target site to read from the target volumes. These operations cause an
error message in C-SPOC. These operations include such functions as changing file
system size, changing mount point, and adding LVM mirrors. However, nodes on the
same site as the source volumes can successfully perform these tasks. The changes
can be propagated later to the other site by using a lazy update.
For C-SPOC operations to work on all other LVM operations, perform all C-SPOC
operations with the Global Mirror volume pairs in a synchronized or consistent state, or
with the cluster active on all nodes.
a. From the command line, type the smitty cl_admin command.
b. In SMIT, select the path System Management (C-SPOC) → Storage → Volume
Groups → Add a Volume to a Volume Group.
c. Select the txvg volume group from the menu.
d. Select the disk or disks by PVID as shown in Figure 10-17.
Set Characteristics of a Volume Group

Move cursor to desired item and press Enter.

  Add a Volume to a Volume Group
  Change/Show characteristics of a Volume Group
  Remove a Volume from a Volume Group
  Enable/Disable a Volume Group for Cross-Site LVM Mirroring Verification

  +--------------------------------------------------------------------------+
  |                          Physical Volume Names                           |
  |                                                                          |
  | Move cursor to desired item and press Enter.                             |
  |                                                                          |
  |   000a624a987825c8 ( hdisk10 on node robert )                            |
  |   000a624a987825c8 ( hdisk11 on nodes jordan,leeann )                    |
  |                                                                          |
  | F1=Help          F2=Refresh          F3=Cancel                           |
F1| F8=Image         F10=Exit            Enter=Do                            |
F9| /=Find           n=Find Next                                             |
  +--------------------------------------------------------------------------+
Figure 10-17 Disk selection to add to the volume group
e. Verify the menu information, as shown in Figure 10-18, and press Enter.
Add a Volume to a Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  VOLUME GROUP name                                     txvg
  Resource Group Name                                   ds8kgmrg
  Node List                                             jordan,leeann,robert
  Reference node                                        robert
* VOLUME names                                          hdisk10
Upon completion of the C-SPOC operation, the local nodes were updated, but the remote
node was not updated as shown in Example 10-40. This node was not updated because the
target volumes are not readable until the relationship is swapped. You receive an error
message from C-SPOC, as shown in the note after Example 10-40. However, the lazy update
procedure at the time of failover pulls in the remaining volume group information.
Example 10-40 New disk added to volume group on all nodes
txvg
txvg
txvg
txvg
txvg
txvg
root@robert: lspv
hdisk2          000a624a833e440f        txvg
hdisk6          000a625afe2a4958        txvg
hdisk10         000a624a987825c8        none
Attention: When you use C-SPOC to modify a volume group that contains a Global Mirror
replicated resource, you can expect to see the following error message:
cl_extendvg: Error executing clupdatevg txvg 000a624a833e440f on node robert
You do not need to synchronize the cluster because all of these changes are made to an
existing volume group. However, consider running a verification.
5. Complete the information in the final menu (Figure 10-20), and press Enter.
We added a new logical volume, named pattilv, which consists of 100 logical partitions
(LPs), and selected raw for the type. We left all other values at their defaults.
Add a Logical Volume

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[TOP]                                                   [Entry Fields]
  Resource Group Name                                   ds8kgmrg
  VOLUME GROUP name                                     txvg
  Node List                                             jordan,leeann,robert
  Reference node                                        jordan
* Number of LOGICAL PARTITIONS                          [100]                   #
  PHYSICAL VOLUME names                                 hdisk11                 +
  Logical volume NAME                                   [pattilv]
  Logical volume TYPE                                   [raw]                   +
  POSITION on physical volume                           outer_middle            +
  RANGE of physical volumes                             minimum                 +
  MAXIMUM NUMBER of PHYSICAL VOLUMES                    []                      #
    to use for allocation
  Number of COPIES of each logical                      1                       +
[MORE...15]
6. Upon completion of the C-SPOC operation, verify that the new logical volume is created
locally on node jordan as shown in Example 10-41.
Example 10-41 Newly created logical volume
root@jordan:lsvg -l txvg
txvg:
LV NAME     TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
txlv        jfs2      250   250   3     open/syncd     /txro
txloglv     jfs2log   1     1     1     open/syncd     N/A
pattilv     raw       100   100   1     closed/syncd   N/A
Similar to when you create the volume group, you see an error message (Figure 10-21) about
being unable to update the remote node.
COMMAND STATUS
Command: OK
stdout: yes
stderr: no
                                                        [Entry Fields]
  Volume group name                                     txvg
  Resource Group Name                                   ds8kgmrg
* Node Names                                            robert,leeann,jordan
* File system name                                      /txro
  NEW mount point                                       [/txro]
  SIZE of file system
          Unit Size                                     Megabytes               +
*         Number of units                               [1250]                  #
  Mount GROUP                                           []
  Mount AUTOMATICALLY at system restart?                no                      +
  PERMISSIONS                                           read/write              +
  Mount OPTIONS                                         []                      +
Figure 10-22 Changing the file system size on the final C-SPOC menu
5. Upon completion of the C-SPOC operation, verify that the file system size on node
jordan increased from 250 LPs, as shown in Example 10-41 on page 506, to 313 LPs, as
shown in Example 10-42.
Example 10-42 Newly increased file system size
root@jordan:lsvg -l txvg
txvg:
LV NAME     TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
txlv        jfs2      313   313   3     open/syncd     /txro
txloglv     jfs2log   1     1     1     open/syncd     N/A
pattilv     raw       100   100   1     closed/syncd   N/A
A cluster synchronization is not required because technically the resources did not change.
All of the changes were made to an existing volume group that is already a resource in the
resource group.
Site Texas                     Site Romania
AIX disk     LSS/VOL ID        AIX disk     LSS/VOL ID
hdisk11      2605              hdisk10      2C06
Now continue with the following steps, which are the same as those steps for defining new
LUNs:
1. Run the cfgmgr command on the primary node jordan.
2. Assign the PVID on the node jordan:
chdev -l hdisk11 -a pv=yes
3. Configure the disk and PVID on the local node leeann by using the cfgmgr command.
4. Verify that PVID shows up by using the lspv command.
5. Pause the PPRC on the primary site.
6. Fail over the PPRC to the secondary site.
7. Fail back the PPRC to the secondary site.
8. Configure the disk and PVID on the remote node robert by using the cfgmgr command.
9. Verify that PVID shows up by using the lspv command.
10.Pause the PPRC on the secondary site.
11.Fail over the PPRC to the primary site.
12.Fail back the PPRC to the primary site.
The main difference between adding a volume group and extending an existing one is that
when you add a volume group, you must swap the pairs twice, whereas when you extend an
existing volume group, only one swap is needed. This is similar to the original setup, where
we created all LVM components on the primary site, swapped the PPRC pairs to the remote
site to import the volume group, and then swapped them back.
You can avoid performing two swaps, as we showed, by not choosing to include the third node
when creating the volume group. Then, you can swap the pairs, run cfgmgr on the new disk
with the PVID, import the volume group, and swap the pairs back.
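A minimal sketch of that alternative, using the same DSCLI and AIX commands shown earlier in this chapter (the volume group name is illustrative; the pair IDs are those of the new LUNs):
# 1. Create the volume group and its logical volumes on the primary site nodes only
# 2. On the secondary site storage unit, swap the direction of the new pair
dscli> failoverpprc -type gcp 2C06:2605
dscli> failbackpprc -type gcp 2C06:2605
# 3. On the remote node robert, configure the disk and import the volume group
cfgmgr
importvg -y newvg hdisk10
varyoffvg newvg
# 4. Swap the pair back from the primary site storage unit
dscli> failoverpprc -type gcp 2605:2C06
dscli> failbackpprc -type gcp 2605:2C06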
5. Select the volume group type. In this scenario, we select scalable as shown in
Figure 10-25.
Volume Groups

Move cursor to desired item and press Enter.

  List All Volume Groups
  Create a Volume Group
  Create a Volume Group with Data Path Devices
  Set Characteristics of a Volume Group

  +--------------------------------------------------------------------------+
  |                            Volume Group Type                             |
  |                                                                          |
  | Move cursor to desired item and press Enter.                             |
  |                                                                          |
  |   Legacy                                                                 |
  |   Original                                                               |
  |   Big                                                                    |
  |   Scalable                                                               |
  |                                                                          |
  | F1=Help          F2=Refresh          F3=Cancel                           |
F1| F8=Image         F10=Exit            Enter=Do                            |
F9| /=Find           n=Find Next                                             |
  +--------------------------------------------------------------------------+
Figure 10-25 Choosing the volume group type for the new volume group pick list
Complete the remaining fields for the new volume group and press Enter:

Create a Scalable Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                        [Entry Fields]
  Node Names                                            jordan,leeann,robert
  Resource Group Name                                   [ds8kgmrg]               +
  PVID                                                  000a624a9bb74ac3
  VOLUME GROUP name                                     [princessvg]
  Physical partition SIZE in megabytes                  4                        +
  Volume group MAJOR NUMBER                             [51]                     #
  Enable Cross-Site LVM Mirroring Verification          false                    +
  Enable Fast Disk Takeover or Concurrent Access        Fast Disk Takeover or>   +
  Volume Group Type                                     Scalable
  Maximum Physical Partitions in units of 1024          32
  Maximum Number of Logical Volumes                     256
Instead of using C-SPOC, you can perform the steps manually and then import the volume
groups on each node as needed. However, remember to add the volume group into the
resource group after you create it. With C-SPOC, you can automatically add it to the resource
group while you are creating the volume group.
You can also use the C-SPOC CLI commands (Example 10-43). These commands are in the
/usr/es/sbin/cluster/cspoc directory, and all begin with the cli_ prefix. Similar to the SMIT
menus, their operation output is also saved in the cspoc.log file.
Example 10-43 C-SPOC CLI commands
root@jordan: ls cli_*
cli_assign_pvids   cli_extendlv    cli_mkvg        cli_rmlv
cli_chfs           cli_extendvg    cli_on_cluster  cli_rmlvcopy
cli_chlv           cli_importvg    cli_on_node     cli_syncvg
cli_chvg           cli_mirrorvg    cli_reducevg    cli_unmirrorvg
cli_crfs           cli_mklv        cli_replacepv   cli_updatevg
cli_crlvfs         cli_mklvcopy    cli_rmfs
Upon completion of the C-SPOC operation, the local nodes are updated, but the remote node
is not as shown in Example 10-44. The remote nodes are not updated because the target
volumes are not readable until the relationship is swapped. You see an error message from
C-SPOC as shown in the note after Example 10-44. After you create all LVM structures, you
swap the pairs back to the remote node and import the new volume group and logical volume.
Example 10-44 New disk added to volume group on all nodes
princessvg
princessvg
-V 51 -c -y princessvg -Q
This message is normal. If you omit the remote nodes from the selection, you do not see the
error message; omitting them is acceptable because you manually import the volume group
on the remote site anyway.
When you create the volume group, it is usually added to the resource group automatically,
as shown in Example 10-45 on page 512. However, if the error message indicated in the
previous attention box occurs, it might not be added automatically. Therefore, double-check
that the volume group is in the resource group before continuing. No other changes to the
resource group are needed: the new LUN pairs are added to the same storage subsystems
and the same session (3) that is already defined in the mirror group texasmg.
Example 10-45 New volume group added to existing resource group
ds8kgmrg
Prefer Primary Site
jordan leeann
robert
Online On Home Node Only
Fallover To Next Priority Node
Never Fallback
serviceip_2
txvg princessvg +
texasmg
PPs   PVs   LV STATE       MOUNT POINT
38    1     closed/syncd   N/A
Chapter 11. Disaster recovery by using Hitachi TrueCopy and Universal Replicator
11.1.3 Considerations
Keep in mind the following considerations for mirroring PowerHA SystemMirror Enterprise
Edition with TrueCopy/HUR:
AIX Virtual SCSI is not supported in this initial release.
Logical Unit Size Expansion (LUSE) for Hitachi is not supported.
Only fence-level NEVER is supported for synchronous mirroring.
Only HUR is supported for asynchronous mirroring.
The dev_name must map to a logical device, and the dev_group must be defined in the
HORCM_LDEV section of the horcm.conf file.
The PowerHA SystemMirror Enterprise Edition TrueCopy/HUR solution uses dev_group
for any basic operation, such as the pairresync, pairevtwait, or horctakeover operation.
If several dev_names are in a dev_group, the dev_group must be enabled for consistency.
PowerHA SystemMirror Enterprise Edition does not trap Simple Network Management
Protocol (SNMP) notification events for TrueCopy/HUR storage. If a TrueCopy link goes
down while the cluster is up and the link is later repaired, you must manually
resynchronize the pairs (see the sketch after these considerations).
The creation of pairs is done outside the cluster control. You must create the pairs before
you start the cluster services.
Resource groups that are managed by PowerHA SystemMirror Enterprise Edition cannot
contain volume groups with both TrueCopy/HUR-protected and
non-TrueCopy/HUR-protected disks.
All nodes in the PowerHA SystemMirror Enterprise Edition cluster must use the same
horcm instance.
You cannot use Cluster Single Point Of Control (C-SPOC) for the following Logical Volume
Manager (LVM) operations to configure nodes at the remote site that contain the target
volume:
Creating a volume group
Operations that require nodes at the target site to write to the target volumes
For example, changing the file system size, changing the mount point, or adding LVM
mirrors cause an error message in C-SPOC. However, nodes on the same site as the
source volumes can successfully perform these tasks. The changes are then
propagated to the other site by using a lazy update.
C-SPOC on other LVM operations: For C-SPOC operations to work on all other LVM
operations, perform all C-SPOC operations when the cluster is active on all PowerHA
SystemMirror Enterprise Edition nodes and the underlying TrueCopy/HUR PAIRs are in
a PAIR state.
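A minimal sketch of such a manual resynchronization with the CCI, using the device group and HORCM instance that are defined later in this chapter:
# Resynchronize the suspended pairs in the device group and check their status
pairresync -g htcdg01 -IH2
pairdisplay -g htcdg01 -fcx -IH2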
If you are installing CCI from a CD, use the RMinstsh and RMuninst scripts on the CD to
automatically install and uninstall the CCI software.
Important: You must install the Hitachi CCI software into the /HORCM/usr/bin directory.
Otherwise, you must create a symbolic link to this directory.
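For example, if the CCI binaries were installed under a different base directory, a symbolic link similar to the following satisfies that requirement (the installation path is hypothetical):
# Make /HORCM/usr/bin resolve to the actual CCI installation directory
ln -s /opt/hitachi/HORCM /HORCM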
For other media, use the instructions in the following sections.
HORCM_CMD
#dev_name => hdisk of Command Device
#UnitID 0 (Serial# eg. 45306)
/dev/hdisk19

HORCM_DEV
#Map dev_grp to LDEV#
#dev_group   dev_name   port#   TargetID   LU#   MU#
VG01         test01     CL1-B   1          5     0
VG01         work01     CL1-B   1          21    0
VG01         work02     CL1-B   1          22    0

HORCM_INST
#dev_group   ip_address      service
VG01         10.15.11.194    horcm1

NOTE 1: For the horcm instance to use any available command device, in case one of
them fails, it is RECOMMENDED that, in your horcm file, under the HORCM_CMD
section, the command device is presented in the format below,
where 10133 is the serial # of the array:
\\.\CMD-10133:/dev/hdisk/
For example:
\\.\CMD-10133:/dev/rhdisk19 /dev/rhdisk20

NOTE 2: The Device_File will show "-----" for the "pairdisplay -fd" command,
which will also cause verification to fail, if the ShadowImage license
has not been activated on the storage system and the MU# column is not
empty. It is therefore recommended that the MU# column be left blank if the
ShadowImage license is NOT activated on the storage system.
Each site has two Ethernet networks. In this case, both networks are used for a public
Ethernet and for cross-site communication. Usually the cross-site network is on a separate
segment and is defined as an XD_ip network. It is also common to use site-specific service
IP labels. Example 11-2 shows the interface list from the cluster topology.
Example 11-2 Test topology information
root@jessica: cllsif
Adapter       Type      Network        Net Type   Attribute   Node      IP Address
jessica       boot      net_ether_02   ether      public      jessica   9.3.207.24
jessicaalt    boot      net_ether_03   ether      public      jessica   207.24.1.1
service_1     service   net_ether_03   ether      public      jessica   1.2.3.4
service_2     service   net_ether_03   ether      public      jessica   1.2.3.5
bina          boot      net_ether_02   ether      public      bina      9.3.207.77
bina alt      boot      net_ether_03   ether      public      bina      207.24.1.2
service_1     service   net_ether_03   ether      public      bina      1.2.3.4
service_2     service   net_ether_03   ether      public      bina      1.2.3.5
krod          boot      net_ether_02   ether      public      krod      9.3.207.79
krod alt      boot      net_ether_03   ether      public      krod      207.24.1.3
service_1     service   net_ether_03   ether      public      krod      1.2.3.4
service_2     service   net_ether_03   ether      public      krod      1.2.3.5
maddi         boot      net_ether_02   ether      public      maddi     9.3.207.78
maddi alt     boot      net_ether_03   ether      public      maddi     207.24.1.4
service_1     service   net_ether_03   ether      public      maddi     1.2.3.4
service_2     service   net_ether_03   ether      public      maddi     1.2.3.5
In this scenario, each node has four unique disks that are defined through the two separate
Hitachi storage units. The jessica and bina nodes at the Austin site have two disks, hdisk38
and hdisk39, that are the primary source volumes that use TrueCopy synchronous replication
for the truesyncvg volume group. Their other two disks, hdisk40 and hdisk41, are used as the
target secondary volumes for the ursasyncvg volume group, which is replicated
asynchronously with HUR from the Miami site.
The krod and maddi nodes at the Miami site have two disks, hdisk38 and hdisk39, that are
the secondary target volumes for the TrueCopy synchronous replication of the truesyncvg
volume group from the Austin site. Their other two disks, hdisk40 and hdisk41, are used as
the primary source volumes for the ursasyncvg volume group that uses HUR for
asynchronous replication.
2. In the path verification window (Figure 11-3), check the information and record the LUN
number and LDEV numbers. You use this information later. However, you can also retrieve
this information from the AIX system after the devices are configured by the host. Click
OK.
3. Back on the LUN Manager tab (Figure 11-4), click Apply for these paths to become active
and the assignment to be completed.
You have completed assigning four more LUNs for the nodes at the Austin site. However, the
lab environment already had several LUNs, including both command and journaling LUNs in
the cluster nodes. These LUNs were added solely for this test scenario.
Important: If these LUNs are the first ones to be allocated to the hosts, you must also
assign the command LUNs. See the appropriate Hitachi documentation as needed.
For the storage unit at the Miami site, repeat the steps that you performed for the Austin site.
The host group, KrodMaddi, is assigned to port CL1-B on the Hitachi UPS-VM storage unit
with the serial number 35764. Usually the host group is assigned to multiple ports for full
multipath redundancy. Figure 11-5 on page 528 shows the result of these steps.
Again, record both the LUN numbers and LDEV numbers so that you can easily refer to them
as needed when you create the replicated pairs. The numbers are also required when you
add the LUNs into device groups in the appropriate horcm.conf file.
Table 11-1 translates the LDEV hex values of each LUN and its corresponding decimal value.
Table 11-1 LUN number to LDEV number comparison
Austin - 45306                                   Miami - 35764
LUN number   LDEV-HEX   LDEV-DEC number          LUN number   LDEV-HEX   LDEV-DEC number
000A         00:01:10   272                      001C         00:01:0C   268
000B         00:01:11   273                      001D         00:01:0D   269
000C         00:01:12   274                      001E         00:01:0E   270
000D         00:01:13   275                      001F         00:01:0F   271
Although the pairing can be done by using the CCI, the example in this section shows how to
create the replicated pairs through the Hitachi Storage Navigator. The appropriate commands
are in the /HORCM/usr/bin directory. In this scenario, none of the devices were configured to
the AIX cluster nodes.
2. In the TrueCopy Pair Operation window (Figure 11-7), select the appropriate port, CL-1E,
and find the specific LUNs to use (00-00A and 00-00B).
In this scenario, we predetermined that we want to pair these LUNs with 00-01C and
00-01D from the Miami Hitachi storage unit on port CL1-B. Notice in the occurrence of
SMPL in the Status column next to the LUNs. SMPL indicates simplex, meaning that no
mirroring is being used with that LUN.
3. Right-click the first Austin LUN (00-00A), and select Paircreate Synchronize
(Figure 11-7).
4. In the full synchronous Paircreate menu (Figure 11-8), select the port and LUN that you
previously created and recorded. Click Set.
Because we have only one extra remote storage unit, the RCU field already shows the
proper one for Miami.
5. Repeat step 4 for the second LUN pairing. Figure 11-8 shows details of the two pairings.
6. After you complete the pairing selections, on the Pair Operation tab, verify that the
information is correct, and click Apply to apply them all at one time.
Figure 11-9 shows both of the source LUNs in the middle of the pane. It also shows an
overview of which remote LUNs they are to be paired with.
This step automatically starts copying the LUNs from the local Austin primary source to the
remote Miami secondary source LUNs. You can also right-click a LUN and select Detailed
Information as shown in Figure 11-10.
After the copy is done, the status is displayed as PAIR as shown in Figure 11-11. You can also
view this status from the management interface of either one of the storage units.
2. In the Universal Replicator Pair Operation window (Figure 11-13), select the appropriate
port, CL1-B, and find the specific LUNs that you want to use, which are 00-01E and
00-01F in this example. We already predetermined that we want to pair these LUNs with
00-00C and 00-00D from the Austin Hitachi storage unit on port CL1-E.
Right-click one of the desired LUNs and select Paircreate.
4. After you complete the pairing selections, on the Pair Operation tab, verify that the
information is correct and click Apply to apply them all at one time.
When the pairing is established, the copy automatically begins to synchronize with the
remote LUNs at the Austin site. The status changes to COPY, as shown in Figure 11-15,
until the pairs are in sync. After the pairs are synchronized, their status changes to PAIR.
5. Upon completion of the synchronization of the LUNs, configure the LUNs into the AIX
cluster nodes. Figure 11-16 shows an overview of the Hitachi replicated environment.
root@jessica:
hdisk38         none          None
hdisk39         none          None
hdisk40         none          None
hdisk41         none          None
Although the LUN and LDEV numbers were written down during the initial LUN assignments,
you must identify the correct LDEV numbers of the Hitachi disks and the corresponding AIX
hdisks by performing the following steps:
1. On the PowerHA SystemMirror Enterprise Edition nodes, identify the Hitachi disks that
will be used in the TrueCopy/HUR relationships by running the inqraid command.
Example 11-4 shows hdisk38 through hdisk41, which are the Hitachi disks that we just
added.
Example 11-4 Hitachi disks added
root@jessica:
# lsdev -Cc disk|grep hdisk|/HORCM/usr/bin/inqraid
hdisk38 -> [SQ] CL1-E Ser =  45306 LDEV = 272 [HITACHI ] [OPEN-V          ]
           HORC = P-VOL HOMRCF[MU#0 = SMPL MU#1 = SMPL MU#2 = SMPL]
           RAID5[Group 1- 2] SSID = 0x0005
hdisk39 -> [SQ] CL1-E Ser =  45306 LDEV = 273 [HITACHI ] [OPEN-V          ]
           HORC = P-VOL HOMRCF[MU#0 = SMPL MU#1 = SMPL MU#2 = SMPL]
           RAID5[Group 1- 2] SSID = 0x0005
hdisk40 -> [SQ] CL1-E Ser =  45306 LDEV = 274 [HITACHI ] [OPEN-V          ]
           HORC = S-VOL HOMRCF[MU#0 = SMPL MU#1 = SMPL MU#2 = SMPL]
           RAID5[Group 1- 2] SSID = 0x0005 CTGID = 10
hdisk41 -> [SQ] CL1-E Ser =  45306 LDEV = 275 [HITACHI ] [OPEN-V          ]
           HORC = S-VOL HOMRCF[MU#0 = SMPL MU#1 = SMPL MU#2 = SMPL]
           RAID5[Group 1- 2] SSID = 0x0005 CTGID = 10
2. Edit the HORCM LDEV section in the horcm#.conf file to identify the dev_group that will be
managed by PowerHA SystemMirror Enterprise Edition. In this example, we use the
horcm2.conf file.
Hdisk38 (LDEV 272) and hdisk39 (LDEV 273) are the pair for the synchronous replicated
resource, which is primary at the Austin site. Hdisk40 (LDEV 274) and hdisk41
(LDEV 275) are the pair for an asynchronous replicated resource, which is primary at the
Miami site.
Specify the device groups (dev_group) in the horcm#.conf file. We use dev_group
htcdg01 with dev_names htcd01 and htcd02 for the synchronous replicated pairs. For the
asynchronous pairs, we use dev_group hurdg01 with dev_names hurd01 and hurd02.
The device group names are needed later when you check the status of the replicated
pairs and when you define the replicated pairs as a resource for PowerHA Enterprise
Edition to control.
Important: Do not edit the configuration definition file while HORCM is running. Shut
down HORCM, edit the configuration file as needed, and then restart HORCM.
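A minimal sketch of that cycle for the HORCM instance that is used in this scenario (instance 2):
# Stop instance 2, edit its configuration file, and start the instance again
/HORCM/usr/bin/horcmshutdown.sh 2
vi /etc/horcm2.conf
/HORCM/usr/bin/horcmstart.sh 2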
Example 11-5 shows the horcm2.conf file from the jessica node at the Austin site.
Because two nodes are at the Austin site, the same updates were performed to the
/etc/horcm2.conf file on the bina node. Notice that you can use either the decimal value
of the LDEV or the hexadecimal value; we specified one pair each way to demonstrate
that both forms work. Although several groups were already defined, only the groups that
are relevant to this scenario are shown.
Example 11-5 Horcm2.conf file used for the Austin site nodes
root@jessica: /etc/horcm2.conf
HORCM_MON
#Address of local node...
#ip_address              service    poll(10ms)    timeout(10ms)
r9r3m11.austin.ibm.com   52323      1000          3000

HORCM_CMD
#hdisk of Command Device...
#dev_name                               dev_name
#UnitID 0 (Serial# 45306)
#/dev/rhdisk10
\\.\CMD-45306:/dev/rhdisk10 /dev/rhdisk14

HORCM_LDEV
#Map dev_grp to LDEV#...
#dev_group    dev_name    Serial#    CU:LDEV(LDEV#)    MU#
#---------    --------    -------    --------------    ---
htcdg01       htcd01      45306      272
htcdg01       htcd02      45306      273
hurdg01       hurd01      45306      01:12
hurdg01       hurd02      45306      01:13
root@krod: horcm2.conf
HORCM_MON
#Address of local node...
#ip_address              service    poll(10ms)    timeout(10ms)
r9r3m13.austin.ibm.com   52323      1000          3000

HORCM_CMD
#hdisk of Command Device...
#dev_name                               dev_name
#UnitID 0 (Serial# 35764)
#/dev/rhdisk10
# /dev/hdisk19
\\.\CMD-45306:/dev/rhdisk11 /dev/rhdisk19

HORCM_LDEV
#dev_group    dev_name       Serial#    CU:LDEV(LDEV#)    MU#
#HUR_GROUP    HUR_103_153    45306      01:53
htcdg01       htcd01         35764      268
htcdg01       htcd02         35764      269
hurdg01       hurd01         35764      01:0E
hurdg01       hurd02         35764      01:0F

# Address of remote node for each dev_grp...
HORCM_INST
#dev_group    ip_address             service
htcdg01       bina.austin.ibm.com    52323
hurdg01       bina.austin.ibm.com    52323
3. Map the hdisks that are protected by TrueCopy to the TrueCopy device groups by using
the raidscan command. In the following example, 2 is the HORCM instance number:
lsdev -Cc disk|grep hdisk | /HORCM/usr/bin/raidscan -IH2 -find inst
The -find inst option of the raidscan command registers the device file name (hdisk) to
all mirror descriptors of the LDEV map table for HORCM. This option also allows volumes
to be matched against the horcm.conf file in protection mode. Because it is run
automatically by the /etc/horcmgr command, you do not normally need to run it yourself.
To avoid wasteful scanning, the option stops when the registration is finished; if HORCM
no longer needs the registration, no further action is taken and the command exits. You
can use the -find inst option with the -fx option to view LDEV numbers in the
hexadecimal format.
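For example, a form similar to the following can be used to register the hdisks and display the LDEV numbers in hexadecimal (the instance number 2 is from our environment):
lsdev -Cc disk | grep hdisk | /HORCM/usr/bin/raidscan -IH2 -fx -find inst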
4. Verify that the pairs are established by running either the pairdisplay command or the
pairvolchk command against the device groups htcdg01 and hurdg01.
Example 11-7 shows how we use the pairdisplay command. For device group htcdg01,
the status of PAIR and fence of NEVER indicates a synchronous pair. For device group
hurdg01, the ASYNC fence option clearly indicates an asynchronous pair. Also, notice
that the CTG field shows the consistency group number for the asynchronous pair that is
managed by HUR.
Example 11-7 The pairdisplay command to verify that the pair status is synchronized
(pairdisplay output for device groups htcdg01 and hurdg01: the htcdg01 pairs show a fence of NEVER, and the hurdg01 pairs show a fence of ASYNC with consistency group CTG 10)
To shorten the output in Example 11-7, we removed the last three columns because they
are not relevant to what we are checking.
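The output in Example 11-7 is produced by commands similar to the following sketch. The device group names and HORCM instance number are from our configuration, and the -fcx option (copy progress plus hexadecimal LDEV numbers) is optional:
pairdisplay -g htcdg01 -IH2 -fcx      # synchronous TrueCopy device group
pairdisplay -g hurdg01 -IH2 -fcx      # asynchronous HUR device group
pairvolchk  -g hurdg01 -IH2 -s        # returns the overall status of the device group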
Unestablished pairs: If pairs are not yet established, the status is displayed as SMPL. To
continue, you must create the pairs. For instructions about creating pairs from the
command line, see the Hitachi Command Control Interface (CCI) User and Reference
Guide, MK-90RD011, which you can download from:
http://communities.vmware.com/servlet/JiveServlet/download/1183307-19474
Otherwise, if you are using Storage Navigator, see 11.4.2, Creating replicated pairs on
page 528.
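If you create the pairs from the CCI command line instead of Storage Navigator, the commands take a form similar to the following sketch. The fence levels, journal IDs, and instance number shown here are assumptions based on our configuration; check the CCI guide for the values that apply to your environment.
# Synchronous TrueCopy pair; run where the htcdg01 P-VOLs are local
paircreate -g htcdg01 -vl -f never -IH2
# Asynchronous HUR pair; run where the hurdg01 P-VOLs are local (journal IDs are placeholders)
paircreate -g hurdg01 -vl -f async -jp 3 -js 3 -IH2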
root@jessica:lsvg truesyncvg
VOLUME GROUP:       truesyncvg               VG IDENTIFIER:  00cb14ce00004c000000012b564c41b9
VG STATE:           active                   PP SIZE:        4 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      988 (3952 megabytes)
MAX LVs:            256                      FREE PPs:       737 (2948 megabytes)
LVs:                3                        USED PPs:       251 (1004 megabytes)
OPEN LVs:           3                        QUORUM:         2 (Enabled)
TOTAL PVs:          2                        VG DESCRIPTORS: 3
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         2                        AUTO ON:        no
MAX PPs per VG:     32768                    MAX PVs:        1024
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
PV RESTRICTION:     none

root@jessica:lsvg -l truesyncvg
truesyncvg:
LV NAME        TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
oreolv         jfs2      125   125   1     closed/syncd   /oreofs
majorlv        jfs2      125   125   1     closed/syncd   /majorfs
truefsloglv    jfs2log   1     1     1     closed/syncd   N/A
We create the ursasyncvg big volume group on the krod node where the primary LUNs
are located. We also create the logical volumes, jfslog, and file systems as shown in
Example 11-9.
Example 11-9 Ursasyncvg volume group information
root@krod:lspv
hdisk40         00cb14ce5676ad24          ursasyncvg      active
hdisk41         00cb14ce5676afcf          ursasyncvg      active

root@krod:lsvg ursasyncvg
VOLUME GROUP:       ursasyncvg               VG IDENTIFIER:  00cb14ce00004c000000012b5676b11e
VG STATE:           active                   PP SIZE:        4 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      1018 (4072 megabytes)
MAX LVs:            512                      FREE PPs:       596 (2384 megabytes)
LVs:                3                        USED PPs:       422 (1688 megabytes)
OPEN LVs:           3                        QUORUM:         2 (Enabled)
TOTAL PVs:          2                        VG DESCRIPTORS: 3
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         2                        AUTO ON:        no
MAX PPs per VG:     130048
MAX PPs per PV:     1016                     MAX PVs:        128
LTG size (Dynamic): 256 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable

root@krod:lsvg -l ursasyncvg
ursasyncvg:
LV NAME        TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
ursfsloglv     jfs2log   2     2     1     closed/syncd   N/A
hannahlv       jfs2      200   200   1     closed/syncd   /hannahfs
julielv        jfs2      220   220   1     closed/syncd   /juliefs
2. Vary off the newly created volume groups by running the varyoffvg command. To import
the volume groups onto the other three systems, the pairs must be in sync.
We run the pairresync command as shown in Example 11-10 on the local disks and make
sure that they are in the PAIR state. This process verifies that the local disk information
was copied to the remote storage. Notice that the command is being run on the respective
node that contains the primary source LUNs and where the volume groups are created.
Example 11-10 Pairresync command
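The pairresync invocations take a form similar to the following sketch; the output is not reproduced here, and the HORCM instance number is from our environment:
root@jessica: pairresync -g htcdg01 -IH2     # synchronous pairs, primary at the Austin site
root@krod: pairresync -g hurdg01 -IH2        # asynchronous pairs, primary at the Miami site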
3. Split the pair relationship so that the remote systems can import the volume groups as
needed on each node. Run the pairsplit command against the device group as shown in
Example 11-11.
Example 11-11 The pairsplit command to suspend replication
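The pairsplit invocations take a form similar to the following sketch. The -rw option, which leaves the S-VOLs write-enabled so that the remote nodes can import the volume groups, is an assumption; verify the option against the CCI guide for your environment.
root@jessica: pairsplit -g htcdg01 -rw -IH2
root@krod: pairsplit -g hurdg01 -rw -IH2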
chdev -l hdisk38 -a pv=yes
chdev -l hdisk39 -a pv=yes
chdev -l hdisk40 -a pv=yes
chdev -l hdisk41 -a pv=yes

chdev -l hdisk38 -a pv=yes
chdev -l hdisk39 -a pv=yes
chdev -l hdisk40 -a pv=yes
chdev -l hdisk41 -a pv=yes
5. Verify that the PVIDs are correctly showing on each system by running the lspv command
as shown in Example 11-14. Because all four nodes have the same hdisk numbering, we
show the output from only one node, the bina node.
Example 11-14 LSPV listing to verify PVIDs are present
bina@root: lspv
hdisk38         00cb14ce564c3f44          none
hdisk39         00cb14ce564c40fb          none
hdisk40         00cb14ce5676ad24          none
hdisk41         00cb14ce5676afcf          none
6. Import the volume groups on each node as needed by using the importvg command.
Specify the major number that you used earlier.
7. Disable both the auto varyon and quorum settings of the volume groups by using the chvg
command.
8. Vary off the volume group as shown in Example 11-15.
Attention: PowerHA SystemMirror Enterprise Edition attempts to automatically set the
AUTO VARYON to NO during verification, except in the case of remote TrueCopy/HUR.
Example 11-15 Importing the replicated volume groups
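A sketch of the commands for steps 6 through 8 follows. The major number 57 is a placeholder; use the value that you noted when the volume group was created, and repeat the sequence for ursasyncvg with its own major number and hdisk:
importvg -V 57 -y truesyncvg hdisk38     # import by using the same major number on each node
chvg -a n -Q n truesyncvg                # disable automatic varyon and quorum
varyoffvg truesyncvg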
9. Re-establish the pairs that you split in step 3 on page 545 by running the pairresync
command again as shown in Example 11-10 on page 544.
10.Verify again that they are in sync by using the pairdisplay command as shown in
Example 11-7 on page 542.
In these steps, the cluster topology was configured, including all four nodes, both sites, and
networks.
The SMIT entry-field values that we used to define the two TrueCopy/HUR replicated resources are as follows:
TrueCopy/HUR resource truelee: copy mode SYNC, device group htcdg01, recovery action AUTO,
horcm instance horcm2, horctakeover timeout 300, pairevtwait timeout 3600
TrueCopy/HUR resource ursasyncRR: copy mode ASYNC, device group hurdg01, recovery action AUTO,
horcm instance horcm2, horctakeover timeout 300, pairevtwait timeout 3600
For a complete list of all the defined TrueCopy/HUR replicated resources, run the cllstc
command, which is in the /usr/es/sbin/cluster/tc/cmds directory. Example 11-16 shows
the output of the cllstc command.
Example 11-16 The cllstc command to list the TrueCopy/HUR replicated resources
root@jessica: cllstc -a
Name        CopyMode  DeviceGrps  RecoveryAction  HorcmInstance  HorcTimeOut  PairevtTimeout
truelee     SYNC      htcdg01     AUTO            horcm2         300          3600
ursasyncRR  ASYNC     hurdg01     AUTO            horcm2         300          3600
Resource Group Name                      emlecRG
Participating Nodes                      jessica bina maddi
Startup Policy                           Online On Home Node Only
Fallover Policy                          Fallover To Next Priority Node
Fallback Policy                          Never Fallback
Inter-site Management Policy             Prefer Primary Site
Service IP Label                         service_1
Volume Groups                            truesyncvg
Hitachi TrueCopy Replicated Resources    truelee

Resource Group Name                      valhallaRG
Participating Nodes                      krod maddi bina
Startup Policy                           Online On Home Node Only
Fallover Policy                          Fallover To Next Priority Node
Fallback Policy                          Never Fallback
Inter-site Management Policy             Prefer Primary Site
Service IP Label                         service_2
Volume Groups                            ursasyncvg
Hitachi TrueCopy Replicated Resources    ursasyncRR
Figure 11-18 Error messages that are found during TrueCopy/HUR replicated resource verification
Each test, except for the last reintegration test, begins in the same initial state, with each
site hosting its own production resource group on the primary node as shown in
Example 11-18.
Example 11-18 Beginning of test cluster resource group states
clRGinfo
-----------------------------------------------------------------------------
Group Name          Group State                    Node
-----------------------------------------------------------------------------
emlecRG             ONLINE                         jessica@Austin
                    OFFLINE                        bina@Austin
                    ONLINE SECONDARY               maddi@Miami

valhallaRG          ONLINE                         krod@Miami
                    OFFLINE                        maddi@Miami
                    ONLINE SECONDARY               bina@Austin
Before each test, we start copying data from another file system to the replicated file systems.
After each test, we verify that the site service IP address is online and new data is in the file
systems. We also had a script that inserts the current time and date into a file on each file
system. Because of the small amounts of I/O in our environment, we were unable to
determine whether we lost any data in the asynchronous replication.
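The timestamp script is a simple loop similar to the following sketch; the file name heartbeat.txt and the 10-second interval are placeholders, and the file system names are from our configuration:
#!/bin/ksh
# Append the current date and time to a marker file in each replicated file system
while true
do
  for fs in /oreofs /majorfs /hannahfs /juliefs
  do
    date >> $fs/heartbeat.txt
  done
  sleep 10
done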
3. In the Move a Resource Group to Another Node / Site panel (Figure 11-19), select the
ONLINE instance of the emlecRG resource group to be moved.
                     Move a Resource Group to Another Node / Site

Move cursor to desired item and press Enter.

  +--------------------------------------------------------------------------+
  |                          Select Resource Group(s)                         |
  |                                                                           |
  | Move cursor to desired item and press Enter. Use arrow keys to scroll.    |
  |                                                                           |
  | # Resource Group                State                  Node(s) / Site     |
  |   emlecRG                       ONLINE                 jessica / Austi    |
  |   emlecRG                       ONLINE SECONDARY       maddi / Miami      |
  |   valhallarg                    ONLINE                 krod / Miami       |
  |                                                                           |
  | # Resource groups in node or site collocation configuration:              |
  | # Resource Group(s)             State                  Node / Site        |
  |                                                                           |
  | F1=Help                 F2=Refresh              F3=Cancel                 |
  | F8=Image                F10=Exit                Enter=Do                  |
F1| /=Find                  n=Find Next                                       |
F9+--------------------------------------------------------------------------+
Figure 11-19 Moving the Austin resource group across to site Miami
4. In the Select a Destination Site panel, select the Miami site as shown in Figure 11-20.
  +--------------------------------------------------------------------------+
  |                          Select a Destination Site                        |
  |                                                                           |
  | Move cursor to desired item and press Enter.                              |
  |                                                                           |
  | # *Denotes Originally Configured Primary Site                             |
  |     Miami                                                                 |
  |                                                                           |
  | F1=Help                 F2=Refresh              F3=Cancel                 |
  | F8=Image                F10=Exit                Enter=Do                  |
F1| /=Find                  n=Find Next                                       |
F9+--------------------------------------------------------------------------+
Figure 11-20 Selecting the site for resource group move
root@maddi# clRGinfo
-----------------------------------------------------------------------------
Group Name          Group State                    Node
-----------------------------------------------------------------------------
emlecRG             ONLINE SECONDARY               jessica@Austin
                    OFFLINE                        bina@Austin
                    ONLINE                         maddi@Miami

valhallarg          ONLINE                         krod@Miami
                    OFFLINE                        maddi@Miami
                    OFFLINE                        bina@Austin
6. Repeat the resource group move to move it back to its original primary site and node to
return to the original starting state.
Attention: In our environment, after the first resource group move between sites, we were
unable to move the resource group back without leaving the pick list for the destination site
empty. However, we were able to move it back by node, instead of by site. Later in our
testing, the by-site option started working, but it moved it to the standby node at the
primary site instead of the original primary node. If you encounter similar problems, contact
IBM support.
To begin, all four nodes are active in the cluster and the resource groups are online on the
primary node as shown in Example 11-18 on page 550.
1. On the jessica node, run the reboot -q command. The bina node acquires the emlecRG
resource group as shown in Example 11-20.
Example 11-20 Local node failover within the Austin site
root@bina: clRGinfo
-----------------------------------------------------------------------------
Group Name          Group State                    Node
-----------------------------------------------------------------------------
emlecRG             OFFLINE                        jessica@Austin
                    ONLINE                         bina@Austin
                    OFFLINE                        maddi@Miami

valhallarg          ONLINE                         krod@Miami
                    OFFLINE                        maddi@Miami
                    ONLINE SECONDARY               bina@Austin
2. Run the pairdisplay command (as shown in Example 11-21) to verify that the pairs are
still established because the volume group is still active on the primary site.
Example 11-21 Pairdisplay status after a local site failover
          PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,Fence,Seq#,P-LDEV# M CTG JID AP
htcd01(L) (CL1-E-0, 0, 10)45306   272.P-VOL PAIR   NEVER ,35764    268          -   -   -  1
htcd01(R) (CL1-B-0, 0, 28)35764   268.S-VOL PAIR   NEVER ,-----    272          -   -   -  -
htcd02(L) (CL1-E-0, 0, 11)45306   273.P-VOL PAIR   NEVER ,35764    269          -   -   -  1
htcd02(R) (CL1-B-0, 0, 29)35764   269.S-VOL PAIR   NEVER ,-----    273          -   -   -  -
3. Upon cluster stabilization, run the reboot -q command on the bina node. The maddi node
at the Miami site acquires the emlecRG resource group as shown in Example 11-22.
Example 11-22 Hard failover between sites
root@maddi: clRGinfo
-----------------------------------------------------------------------------
Group Name          Group State                    Node
-----------------------------------------------------------------------------
emlecRG             OFFLINE                        jessica@Austin
                    OFFLINE                        bina@Austin
                    ONLINE                         maddi@Miami

valhallarg          ONLINE                         krod@Miami
                    OFFLINE                        maddi@Miami
                    OFFLINE                        bina@Austin
4. Verify that the replicated pairs are now in the suspended state from the command line as
shown in Example 11-23.
Example 11-23 Pairdisplay status after a hard site failover
(pairdisplay output for device group htcdg01 showing the htcd01 and htcd02 pairs in a suspended state)
You can also verify that the replicated pairs are in the suspended state by using the
Storage Navigator (Figure 11-21).
Important: Although our testing resulted in a site_down event, we never lost access to
the primary storage subsystem. In a true site failure, including loss of storage,
re-establish the replicated pairs, and synchronize them before you move them back to
the primary site. If you must change the storage LUNs, modify the horcm.conf file, and
use the same device group and device names. You do not have to change the cluster
resource configuration.
Important: The resource group setting of the Inter-site Management Policy, also known
as the site relationship, dictates what occurs upon reintegration of the primary node.
Because we chose Prefer Primary Site, the automatic fallback occurred.
Initially, we are unable to restart the cluster on the jessica node because of verification errors
at startup, which are similar to the errors shown in Figure 11-18 on page 549. These errors
have two possible causes. The first is that we failed to include starting the horcm instance on
bootup. The second is that we also had to remap the copy-protected device groups by
running the raidscan command again.
Important: Always ensure that the horcm instance is running before you rejoin a node into
the cluster. In some cases, if all instances, cluster nodes, or both are down, you might need
to run the raidscan command again.
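One way to start the horcm instance at boot time is to add an entry to /etc/inittab with the mkitab command, as in the following sketch (the entry label and output redirection are assumptions):
mkitab "horcm2:2:once:/HORCM/usr/bin/horcmstart.sh 2 > /dev/console 2>&1"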
root@bina: clRGinfo
-----------------------------------------------------------------------------
Group Name          Group State                    Node
-----------------------------------------------------------------------------
emlecRG             ONLINE                         jessica@Austin
                    OFFLINE                        bina@Austin
                    ONLINE SECONDARY               maddi@Miami

valhallarg          ONLINE SECONDARY               krod@Miami
                    OFFLINE                        maddi@Miami
                    ONLINE                         bina@Austin
6. Repeat these steps to move a resource group back to the original primary krod node at
the Miami site.
Attention: In our environment, after the first resource group move between sites, we were
unable to move the resource group back without leaving the pick list for the destination site
empty. However, we were able to move it back by node, instead of by site. Later in our
testing, the by-site option started working, but it moved it to the standby node at the
primary site instead of the original primary node. If you encounter similar problems, contact
IBM support.
To begin, all four nodes are active in the cluster, and the resource groups are online on the
primary node as shown in Example 11-18 on page 550. Follow these steps:
1. On the krod node, run the reboot -q command. The maddi node brings the valhallaRG
resource group online, and the remote bina node maintains the online secondary status as
shown in Example 11-25. This time the failover time was noticeably longer, specifically in
the fsck portion. The longer amount of time is most likely a symptom of the asynchronous
replication.
Example 11-25 Local node fallover within the Miami site
root@maddi: clRGinfo
-----------------------------------------------------------------------------
Group Name          Group State                    Node
-----------------------------------------------------------------------------
emlecRG             ONLINE                         jessica@Austin
                    OFFLINE                        bina@Austin
                    ONLINE SECONDARY               maddi@Miami

valhallarg          OFFLINE                        krod@Miami
                    ONLINE                         maddi@Miami
                    ONLINE SECONDARY               bina@Austin
2. Run the pairdisplay command as shown in Example 11-26 to verify that the pairs are still
established because the volume group is still active on the primary site.
Example 11-26 Status using the pairdisplay command after the local Miami site fallover
(pairdisplay output for device group hurdg01 showing the pairs still in the PAIR state)
3. Upon cluster stabilization, run the reboot -q command on the maddi node. The bina node
at the Austin site acquires the valhallaRG resource group as shown in Example 11-27.
Example 11-27 Hard failover from Miami site to Austin site
root@bina: clRGinfo
-----------------------------------------------------------------------------
Group Name          Group State                    Node
-----------------------------------------------------------------------------
emlecRG             ONLINE                         jessica@Austin
                    OFFLINE                        bina@Austin
                    OFFLINE                        maddi@Miami

valhallarg          OFFLINE                        krod@Miami
                    OFFLINE                        maddi@Miami
                    ONLINE                         bina@Austin
Important: Although our testing resulted in a site_down event, we never lost access to
the primary storage subsystem. In a true site failure, including loss of storage,
re-establish the replicated pairs, and synchronize them before you move them back to
the primary site. If you must change the storage LUNs, modify the horcm.conf file, and
use the same device group and device names. You do not have to change the cluster
resource configuration.
Important: Always ensure that the horcm instance is running before you rejoin a node to
the cluster. In some cases, if all instances, cluster nodes, or both are down, you might need
to run the raidscan command again.
                  Port     CU    LUN     LDEV
Austin (45306)    CL1-E    01    000E    01:14
Miami (35764)     CL-1B    01    001B    01:1F

jessica hdisk#    krod hdisk#    bina hdisk#    maddi hdisk#
hdisk42           hdisk42        hdisk42        hdisk42
The krod and maddi nodes on the Miami site added the following new line to the
horcm2.conf file:
htcdg01      htcd03     35764     01:1F
The jessica and bina nodes on the Austin site added the following new line:
htcdg01      htcd03     45306     01:14
You are now ready to use C-SPOC to add the new disk into the volume group:
Important: You cannot use C-SPOC for the following LVM operations to configure nodes at
the remote site that contain the target volume:
Creating a volume group
Operations that require nodes at the target site to write to the target volumes
For example, changing the file system size, changing the mount point, or adding LVM
mirrors cause an error message in C-SPOC. However, nodes on the same site as the
source volumes can successfully perform these tasks. The changes are then
propagated to the other site by using a lazy update.
For C-SPOC operations to work on all other LVM operations, perform all C-SPOC
operations with the (TrueCopy/HUR) volume pairs in the Synchronized or Consistent states
or the cluster ACTIVE on all nodes.
1. From the command line, type the smitty cl_admin command.
2. In SMIT, select the path System Management (C-SPOC) → Storage → Volume
Groups → Add a Volume to a Volume Group.
3. Select the volume group truesyncvg from the menu.
5. Verify the menu information, as shown in Figure 11-23, and press Enter.
                          Add a Volume to a Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                        [Entry Fields]
  VOLUME GROUP name                                     truesyncvg
  Resource Group Name                                   emlecRG
  Node List                                             bina,jessica,krod,mad>
  Reference node                                        bina
  VOLUME names                                          hdisk42
The krod node does not need the volume group because it is not a member of the resource
group. However, we started with all four nodes seeing all volume groups and decided to leave
the configuration that way. This way, we have more flexibility later if we need to change the
cluster configuration to allow the krod node to take over as a last resort.
Upon completion of the C-SPOC operation, all four nodes now have the new disk as a
member of the volume group as shown in Example 11-29.
Example 11-29 New disk added to the volume group on all nodes
(lspv output from all four nodes showing hdisk42, with PVID 00cb14ce74090ef3, as a member of the truesyncvg volume group)
We do not need to synchronize the cluster because all of these changes are made to an
existing volume group. However, you might want to run the cl_verify_tc_config command
to verify that the replicated resources are configured correctly.
(SMIT entry fields for the C-SPOC Add a Logical Volume panel: resource group emlecRG, volume group truesyncvg, reference node jessica, 50 logical partitions on hdisk42, logical volume name micah, type raw, position outer_middle, range minimum, one copy of each logical partition)
6. Upon completion of the C-SPOC operation, verify that the new logical volume was created
locally on the jessica node as shown in Example 11-30.
Example 11-30 Newly created logical volume
root@jessica: lsvg -l truesyncvg
truesyncvg:
LV NAME        TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
oreolv         jfs2      125   125   1     closed/syncd   /oreofs
majorlv        jfs2      125   125   1     closed/syncd   /majorfs
truefsloglv    jfs2log   1     1     1     closed/syncd   N/A
micah          raw       50    50    1     closed/syncd   N/A
(SMIT entry fields for the C-SPOC file system change: volume group truesyncvg, resource group emlecRG, file system /oreofs, new size 1536 MB, permissions read/write)
5. Upon completion of the C-SPOC operation, verify the new file system size locally on the
jessica node as shown in Example 11-31.
Example 11-31 Newly increased file system size
root@jessica: lsvg -l truesyncvg
truesyncvg:
LV NAME        TYPE      LPs   PPs   PVs   LV STATE       MOUNT POINT
oreolv         jfs2      384   384   1     closed/syncd   /oreofs
majorlv        jfs2      125   125   1     closed/syncd   /majorfs
truefsloglv    jfs2log   1     1     1     closed/syncd   N/A
micah          raw       50    50    1     closed/syncd   N/A
You do not need to synchronize the cluster because all of these changes are made to an
existing volume group. However, you might want to make sure that the replicated resources
verify correctly. Use the cl_verify_tc_config command first to isolate the replicated
resources specifically.
On the Austin site, the jessica and bina nodes added the following new line to the
horcm2.conf file:
htcdg01      htcd04     45306     00:20
On the Miami site, the krod and maddi nodes added the following new line:
htcdg01      htcd04     35764     00:0A
10.Map the devices and device group on any node. We ran the raidscan command on the
jessica node. See Table 11-3 for more configuration details.
lsdev -Cc disk|grep hdisk|/HORCM/usr/bin/raidscan -IH2 -find inst
Table 11-3 Details on the Austin and Miami LUNs
                                  Port     CU    LUN     LDEV
Austin - Hitachi USPV - 45306     CL1-E    00    000F    00:20
Miami (35764)                     CL-1B    00    0021    00:0A

jessica hdisk#    krod hdisk#    bina hdisk#    maddi hdisk#
hdisk43           hdisk43        hdisk43        hdisk43
11.Verify that the htcdg01 device group pairs now show the new pairs that consist of
hdisk43 on each system as shown in Example 11-32.
Example 11-32 New LUN pairs added to the htcdg01 device group
(pairdisplay output for device group htcdg01 that includes the newly added pair)
3. In the Node Names panel, select the specific nodes. We chose all four as shown in
Figure 11-27.
                                 Volume Groups

Move cursor to desired item and press Enter.

  List All Volume Groups
  Create a Volume Group
  Create a Volume Group with Data Path Devices

  +--------------------------------------------------------------------------+
  |                                 Node Names                                |
  |                                                                           |
  | Move cursor to desired item and press F7.                                 |
  |     ONE OR MORE items can be selected.                                    |
  | Press Enter AFTER making all selections.                                  |
  |                                                                           |
  | > bina                                                                    |
  | > jessica                                                                 |
  | > krod                                                                    |
  | > maddi                                                                   |
  |                                                                           |
  | F1=Help                 F2=Refresh              F3=Cancel                 |
  | F7=Select               F8=Image                F10=Exit                  |
F1| Enter=Do                /=Find                  n=Find Next               |
F9+--------------------------------------------------------------------------+
Figure 11-27 Selecting a volume group node
5. In the Volume Group Type panel, select the volume group type. We chose Scalable as
shown in Figure 11-29.
                                 Volume Groups

Move cursor to desired item and press Enter.

  List All Volume Groups
  Create a Volume Group
  Create a Volume Group with Data Path Devices
  Set Characteristics of a Volume Group

  +--------------------------------------------------------------------------+
  |                              Volume Group Type                            |
  |                                                                           |
  | Move cursor to desired item and press Enter.                              |
  |                                                                           |
  |   Legacy                                                                  |
  |   Original                                                                |
  |   Big                                                                     |
  |   Scalable                                                                |
  |                                                                           |
  | F1=Help                 F2=Refresh              F3=Cancel                 |
  | F8=Image                F10=Exit                Enter=Do                  |
F1| /=Find                  n=Find Next                                       |
F9+--------------------------------------------------------------------------+
Figure 11-29 Selecting the volume group type for a new volume group
6. In the Create a Scalable Volume Group panel, select the resource group. We chose
emlecRG as shown in Figure 11-30.
                          Create a Scalable Volume Group

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[TOP]                                                   [Entry Fields]
  Node Names                                            bina,jessica,krod,mad>
  Resource Group Name                                   [emlecRG]
  PVID                                                  00cb14ce75bab41a
  VOLUME GROUP name                                     [truetarahvg]
  Physical partition SIZE in megabytes                  4
  Volume group MAJOR NUMBER                             [58]
  Enable Cross-Site LVM Mirroring Verification          false
  Enable Fast Disk Takeover or Concurrent Access        no
  Volume Group Type                                     Scalable
  Maximum Physical Partitions in units of 1024          32
  Maximum Number of Logical Volumes                     256
8. Verify that the volume group is successfully created, which we do on all four nodes as
shown in Example 11-33.
Example 11-33 Newly created volume group on all nodes
(output from each of the four nodes showing the newly created truetarahvg volume group)
When you create the volume group, the volume group is automatically added to the
resource group as shown in Example 11-34. However, we do not have to change the
resource group any further, because the new disk and device are added to the same
device group and TrueCopy/HUR replicated resource.
Example 11-34 Newly added volume group also added to the resource group
Resource Group Name                      emlecRG
Participating Nodes                      jessica bina maddi
Startup Policy                           Online On Home Node Only
Fallover Policy                          Fallover To Next Priority Node
Fallback Policy                          Never Fallback
Inter-site Management Policy             Prefer Primary Site
Service IP Label                         service_1
Volume Groups                            truesyncvg truetarahvg
Hitachi TrueCopy Replicated Resources    truelee
9. Repeat the steps in 11.6.2, Adding a new logical volume on page 562, to create a logical
volume named tarahlv on the newly created truetarahvg volume group. Example 11-35
shows the new logical volume.
Example 11-35 New logical volume on newly added volume group
LV NAME        PPs   PVs   LV STATE       MOUNT POINT
tarahlv        25    1     closed/syncd   N/A
10.Manually run the cl_verify_tc_config command to verify that the new addition of the
replicated resources is complete.
Important: During our testing, we encountered a defect after the second volume group
was added to the resource group. The cl_verify_tc_config command produced the
following error messages:
cl_verify_tc_config: ERROR - Disk hdisk38 included in Device Group htcdg01 does
not match any hdisk in Volume Group truetarahvg.
cl_verify_tc_config: ERROR - Disk hdisk39 included in Device Group htcdg01 does
not match any hdisk in Volume Group truetarahvg.
cl_verify_tc_config: ERROR - Disk hdisk42 included in Device Group htcdg01 does
not match any hdisk in Volume Group truetarahvg.
Errors found verifying the HACMP TRUECOPY/HUR configuration. Status=3
These results incorrectly imply a one-to-one relationship between the device
group/replicated resource and the volume group, which is not intended. To work around
this problem, ensure that the cluster is down, perform a forced synchronization, and then
start the cluster, ignoring the verification errors. Under normal circumstances, you should
not force a synchronization or ignore verification errors; we do so here only because of
this defect. Contact IBM support to see whether a fix is available.
Synchronize the resource group change to include the new volume that you just added.
Usually you can perform this task within a running cluster. However, because of the defect
that is mentioned in the previous Important box, we had to have the cluster down to
synchronize it. To perform this task:
1. From the command line, type the smitty hacmp command.
2. In SMIT, select the path Extended Configuration → Extended Verification and
Synchronization.
3. In the HACMP Verification and Synchronization display (Figure 11-31), for Force
synchronization if verification fails, select Yes.
                      HACMP Verification and Synchronization

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                          [Entry Fields]
  Verify, Synchronize or Both                             [Both]          +
  Automatically correct errors found during verification? [No]            +
  Force synchronization if verification fails?            [Yes]           +
  Verify changes only?                                     [No]            +
  Logging                                                  [Standard]      +

F1=Help       F2=Refresh      F3=Cancel      F4=List
F5=Reset      F6=Command      F7=Edit        F8=Image
4. Verify that the information is correct, and press Enter. Upon completion, the cluster
configuration is in sync and can now be tested.
5. Repeat the steps for a rolling system failure as explained in 11.5.2, Rolling site failure of
the Austin site on page 553. In this scenario, the tests are successful.
Related publications
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this book.
Other publications
These publications are also relevant as further information sources:
EMC PowerPath for AIX Version 5.3 Installation and Administration Guide, P/N 300-008-341
EMC Solutions Enabler Symmetrix SRDF Family CLI Version 7.0 Product Guide, P/N
300-000-877
EMC Solutions Enabler Version 7.0 Installation Guide, P/N 300-008-918
EMC Symmetrix Remote Data Facility (SRDF) Connectivity Guide, P/N 300-003-885
EMC Symmetrix Remote Data Facility (SRDF) Product Guide, P/N 300-001-165
HACMP for AIX 6.1 Administration Guide, SC23-4862
HACMP for AIX 6.1 Concepts and Facilities Guide, SC23-4864
HACMP for AIX 6.1 Geographic LVM: Planning and Administration Guide, SA23-1338
HACMP for AIX 6.1 Installation Guide, SC23-5209
HACMP for AIX 6.1 Metro Mirror: Planning and Administration Guide, SC23-4863
HACMP for AIX 6.1 Planning Guide, SC23-4861
Online resources
These websites are also relevant as further information sources:
IBM PowerHA SystemMirror for AIX
http://www.ibm.com/systems/power/software/availability/aix/index.html
IBM PowerHA High Availability wiki
http://www.ibm.com/developerworks/wikis/display/WikiPtype/High%20Availability
Implementation Services for Power Systems for PowerHA for AIX
http://www-935.ibm.com/services/us/index.wss/offering/its/a1000032
Online PowerHA manuals (still called HACMP)
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/
com.ibm.cluster.hacmp.doc/hacmpbooks.html
GEO to GLVM Migration white paper
http://www.ibm.com/systems/resources/systems_p_os_aix_whitepapers_pdf_aix_glvm.pdf
IBM PowerHA SystemMirror for AIX page
http://www.ibm.com/systems/power/software/availability/aix/index.html
IBM PowerHA High Availability Wiki
http://en.wikipedia.org/wiki/IBM_High_Availability_Cluster_Multiprocessing
Yahoo PowerHA User Forum
http://tech.groups.yahoo.com/group/hacmp/
External site where you can download PowerHA PTFs
http://www.software.ibm.com/webapp/set2/sas/f/hacmp/home.html
Disaster Recovery Solutions for System p Servers and AIX 5L
http://www.ibm.com/systems/resources/systems_p_software_whitepapers_disaster_
recovery.pdf
Documentation for the PowerHA Enterprise Edition is supplied in HTML and PDF formats
http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.
doc/doc/base/hacmp.htm
Back cover

Includes multiple implementation scenarios

Describes networking planning and design

INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

IBM Redbooks are developed by the IBM International Technical Support Organization.
Experts from IBM, Customers and Partners from around the world create timely technical
information based on realistic scenarios. Specific recommendations are provided to help
you implement IT solutions more effectively in your environment.

ISBN 0738437999