Datastorage2 2
Foreword
This course introduces traditional RAID and RAID 2.0+ technologies. RAID
technologies have evolved to improve data protection and performance.
1 Huawei Confidential
Objectives
Contents
1. Traditional RAID
2. RAID 2.0+
Background
Problems in traditional computer systems must be addressed.
CPU: more than 1 million instructions processed per second.
Disks become the system performance bottleneck.
What Is RAID?
Redundant Array of Independent Disks (RAID) combines multiple physical disks into one
logical disk in different ways to improve read/write performance and data security.
How large is a logical disk?
Logical disk
Data Organization Forms
Disk striping: Space in each disk is divided into multiple strips of a specified size. Data is
also divided into blocks based on strip size when data is being written.
Strip: A strip consists of one or more consecutive sectors in a disk, and multiple strips
form a stripe.
Stripe: A stripe consists of strips of the same location or ID on multiple disks in the
same array.
D3   D4   D5   Stripe 1
D0   D1   D2   Stripe 0
(Each column shows the data strips in one disk.)
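The strip and stripe layout above can be sketched as a simple address calculation. This is a minimal illustration, assuming one data block per strip and round-robin placement across disks; the function name `locate_block` is hypothetical and not taken from the course material.

```python
def locate_block(block_index: int, num_disks: int) -> tuple[int, int]:
    """Return (disk_index, stripe_index) for a logical block.

    Assumption: one block per strip, blocks placed round-robin across disks.
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    if block_index < 0:
        raise ValueError("block_index must be non-negative")
    stripe_index, disk_index = divmod(block_index, num_disks)
    return disk_index, stripe_index


# Reproduce the three-disk layout shown above (D0 to D5).
for i in range(6):
    disk, stripe = locate_block(i, num_disks=3)
    print(f"D{i}: disk {disk}, stripe {stripe}")
```

With three disks, D3, D4, and D5 land in stripe 1, which agrees with the figure.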
Data Protection Techniques
Mirroring: Data copies are stored on another redundant disk.
Exclusive or (XOR)
XOR is widely used in digital electronics and computer science.
XOR is a logical operation that outputs true only when inputs differ (one is true, the other is
false).
0 ⊕ 0 = 0, 0 ⊕ 1 = 1, 1 ⊕ 0 = 1, 1 ⊕ 1 = 0
A   B   A ⊕ B
1   1   0
0   1   1
0   0   0
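The property that makes XOR useful for data protection is that XORing the parity with the surviving blocks returns the missing block. The sketch below illustrates this with arbitrary sample bytes; the helper name `xor_blocks` is an assumption made for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized byte strings."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]  # sample data blocks
parity = xor_blocks(data)

# Pretend data[1] is lost: XOR the survivors with the parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```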
Common RAID Levels and Classification Criteria
RAID levels use different combinations of data organization forms and data
protection techniques.
Distributed parity: In this method, parity information is spread across all the
disks in the array. Each disk contains a portion of the parity information.
Parity in data storage is a method of error detection and correction that adds extra bits to data. In RAID
configurations, parity is used to ensure data integrity. A parity value is calculated from the bits of the stored data.
For example, in a simple even parity scheme, the parity bit is set to 1 if the number of 1s in a set of bits is odd,
which makes the total even, and to 0 otherwise. If a drive fails, the lost data can be reconstructed by performing
bitwise calculations on the parity information and the remaining data. This maintains data redundancy and
reliability.
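The even parity rule described above can be written out in a few lines. This is only an illustration of the single-bit scheme; the function names are hypothetical, and RAID arrays apply the same idea to whole strips rather than to individual words.

```python
def even_parity_bit(bits):
    """Return the parity bit that makes the total number of 1s even."""
    return sum(bits) % 2


def passes_even_parity(bits, parity_bit):
    """Return True if the data bits plus the parity bit contain an even number of 1s."""
    return (sum(bits) + parity_bit) % 2 == 0


word = [1, 0, 1, 1]            # three 1s, so the parity bit must be 1
parity = even_parity_bit(word)
corrupted = [1, 0, 0, 1]       # one bit flipped in transit

print(parity)
print(passes_even_parity(word, parity))
print(passes_even_parity(corrupted, parity))
```

A single parity bit detects the flipped bit but cannot say which bit changed. RAID can rebuild data because the failed disk is already known.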
How Does RAID 0 Work?
Logical disk data: D0, D1, D2, D3, D4, D5
Physical disk 1   Physical disk 2
D4                D5                Stripe 2
D2                D3                Stripe 1
D0                D1                Stripe 0
How Does RAID 1 Work?
Logical disk data: D0, D1
Each block is written to both disks:
D1   D1
D0   D0
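Mirroring can be modeled as writing every block to all copies and reading from any copy that is still healthy. The class below is a toy in-memory sketch under that assumption; `Raid1Mirror` and its methods are illustrative names, not a storage system API.

```python
class Raid1Mirror:
    """Toy RAID 1 model: each disk is a dict mapping block number to data."""

    def __init__(self, num_copies=2):
        self.disks = [dict() for _ in range(num_copies)]
        self.failed = set()

    def write(self, block, data):
        # Every healthy disk receives an identical copy of the block.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def fail(self, index):
        # Simulate a disk failure: its contents are no longer available.
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block):
        # Serve the read from the first healthy copy.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all mirror copies have failed")


mirror = Raid1Mirror()
mirror.write(0, b"D0")
mirror.write(1, b"D1")
mirror.fail(0)
print(mirror.read(1))
```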
How Does RAID 3 Work?
Operations shown: write data to A, write data to B, write data to C, read data.
Logical disk data: A0, A1, A2, B0, B1, B2, C0, C1, C2
C0   C1   C2   P3
B0   B1   B2   P2
A0   A1   A2   P1
Note: A write penalty occurs when just a small amount of new data needs to be written to one or two disks.
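One common way to see the write penalty is the read-modify-write sequence: updating a single strip requires reading the old data and old parity, then writing the new data and new parity. The sketch below assumes that sequence and XOR parity; the function names and the I/O count it returns are illustrative, and a real array may choose a different update method.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe, parity, index, new_data):
    """Update one strip in place and return (new_parity, io_count).

    Assumed sequence: read old data, read old parity,
    write new data, write new parity (four I/Os in total).
    """
    old_data = stripe[index]                       # read 1
    old_parity = parity                            # read 2
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    stripe[index] = new_data                       # write 1
    return new_parity, 4                           # write 2 stores new_parity


stripe = [b"\x01", b"\x02", b"\x04"]
parity = b"\x07"                                   # 0x01 ^ 0x02 ^ 0x04
parity, ios = small_write(stripe, parity, 1, b"\x08")
print(parity.hex(), ios)
```

One logical write therefore costs four disk I/Os in this model, which is the penalty the note refers to.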
How Does RAID 5 Work?
Operations shown: write data, read data.
Logical disk data: D0, D1, D2, D3, D4, D5
Physical disk 1   Physical disk 2   Physical disk 3
P2                D4                D5
D2                P1                D3
D0                D1                P0
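The rotating parity placement in the RAID 5 figure can be generated programmatically. This sketch assumes the rotation seen in the figure, where parity starts on the last disk and moves one disk to the left per stripe; other RAID 5 implementations rotate differently, and `raid5_layout` is a hypothetical helper.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row of strip labels per stripe, with parity rotating per stripe."""
    rows = []
    next_data = 0
    for stripe in range(num_stripes):
        # Assumed rotation: last disk for stripe 0, then one disk left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{next_data}")
                next_data += 1
        rows.append(row)
    return rows


# Print top-down like the figure (highest stripe first).
for row in reversed(raid5_layout(num_disks=3, num_stripes=3)):
    print("  ".join(row))
```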
RAID 6
RAID 6 requires at least N + 2 (N > 2) disks and provides extremely high data reliability and
availability.
RAID 6 DP
In this context, P parity is calculated using the bitwise XOR operation on all data blocks (D0, D1, D2, and so on), producing a single parity
block. Q parity is generated similarly, but each data block is first multiplied by a weighting coefficient (α, β, γ, and so on). Using P and Q
together allows recovery of up to two lost data blocks.
In summary, P and Q parity data provide redundancy for data recovery, with P covering all data blocks directly and Q covering them with
weightings for enhanced recovery capabilities.
P = D0 ⊕ D1 ⊕ D2 ...
Q = (α * D0) ⊕ (β * D1) ⊕ (γ * D2) ...
Physical disk 1   Physical disk 2   Physical disk 3   Physical disk 4   Physical disk 5
P1                Q1                D0                D1                D2                Stripe 0
D3                P2                Q2                D4                D5                Stripe 1
D6                D7                P3                Q3                D8                Stripe 2
D9                D10               D11               P4                Q4                Stripe 3
Q5                D12               D13               D14               P5                Stripe 4
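The P and Q formulas can be made concrete with a small sketch. It assumes the common choice of performing the multiplication in the Galois field GF(2^8) with the polynomial 0x11D, and of using powers of 2 as the weights (α = 1, β = 2, γ = 4). The course material does not specify these values, so treat them and all function names as assumptions.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    # Every nonzero element satisfies a^255 = 1, so a^254 is its inverse.
    return gf_pow(a, 254)


def compute_pq(data):
    """Return (P, Q) for a list of equally sized data blocks."""
    size = len(data[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(data):
        weight = gf_pow(2, i)                 # assumed weight for block i
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(weight, byte)
    return bytes(p), bytes(q)


def recover_two(data, x, y, p, q):
    """Rebuild lost data blocks x and y from the surviving blocks plus P and Q."""
    zero = bytes(len(p))
    survivors = [zero if i in (x, y) else block for i, block in enumerate(data)]
    p_part, q_part = compute_pq(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(), bytearray()
    for j in range(len(p)):
        pxy = p[j] ^ p_part[j]                # equals Dx ^ Dy
        qxy = q[j] ^ q_part[j]                # equals gx*Dx ^ gy*Dy
        vx = gf_mul(denom_inv, qxy ^ gf_mul(gy, pxy))
        dx.append(vx)
        dy.append(pxy ^ vx)
    return bytes(dx), bytes(dy)


data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
p, q = compute_pq(data)
d0, d2 = recover_two([None, data[1], None], 0, 2, p, q)
print(d0 == data[0], d2 == data[2])
```

Because P and Q give two independent equations per byte, any two missing data blocks can be solved for, which is why RAID 6 tolerates two simultaneous disk failures.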
How Does RAID 6 DP Work?
Double parity (DP) adds another disk in addition to the horizontal XOR parity disk used in RAID 4 to store
diagonal XOR parity data.
P0 to P3 in the horizontal parity disk represent the horizontal parity data for respective disks.
DP 0 to DP 3 in the diagonal parity disk represent the diagonal parity data for respective data disks and the
horizontal parity disk.
D8    D9    D10   D11   P2   DP 2   Stripe 2
D12   D13   D14   D15   P3   DP 3   Stripe 3
How Does RAID 10 Work?
RAID 10 consists of nested RAID 1 + RAID 0 levels and allows disks to be mirrored
(RAID 1) and then striped (RAID 0). RAID 10 is also a widely used RAID level.
User data: D0, D1, D2, D3, D4, D5
D4   D4   D5   D5
D2   D2   D3   D3
D0   D0   D1   D1
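The RAID 10 figure can be read as "stripe across mirror pairs, then write to both disks of the chosen pair". The sketch below assumes that layout with zero-based disk numbers; `raid10_targets` is a hypothetical helper name.

```python
def raid10_targets(block_index, num_pairs):
    """Return (stripe_index, (disk_a, disk_b)) for a logical block.

    Assumption: blocks are striped round-robin across mirror pairs,
    and pair k consists of disks 2k and 2k + 1 (zero-based).
    """
    if num_pairs <= 0:
        raise ValueError("num_pairs must be positive")
    stripe_index, pair = divmod(block_index, num_pairs)
    return stripe_index, (2 * pair, 2 * pair + 1)


for i in range(6):
    stripe, disks = raid10_targets(i, num_pairs=2)
    print(f"D{i}: stripe {stripe}, disks {disks}")
```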
How Does RAID 50 Work?
RAID 50 consists of nested RAID 5 + RAID 0 levels. RAID 0 is implemented after RAID 5
is implemented.
User data: D0, D1, D2, D3, D4, D5, D6, D7...
First RAID 5 group: D0, D1, D4, D5, D8, D9
Second RAID 5 group: D2, D3, D6, D7, D10, D11
Physical   Physical   Physical   Physical   Physical   Physical
disk 1     disk 2     disk 3     disk 4     disk 5     disk 6
P4         D8         D9         P5         D10        D11        Stripe 2
D4         P2         D5         D6         P3         D7         Stripe 1
D0         D1         P0         D2         D3         P1         Stripe 0
RAID 5 (disks 1 to 3)            RAID 5 (disks 4 to 6)
RAID 0 across the two RAID 5 groups
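The way RAID 0 distributes user data across the two RAID 5 groups in the figure can be expressed as a short calculation. The sketch assumes two data strips per group per stripe, as drawn; `raid50_place` and its parameters are illustrative names.

```python
def raid50_place(block_index, num_groups=2, data_strips_per_group=2):
    """Return (group, stripe, offset) for a logical block in a RAID 50 layout.

    Assumption: RAID 0 hands consecutive runs of `data_strips_per_group`
    blocks to the RAID 5 groups in round-robin order.
    """
    chunk, offset = divmod(block_index, data_strips_per_group)
    stripe, group = divmod(chunk, num_groups)
    return group, stripe, offset


first_group = [f"D{i}" for i in range(12) if raid50_place(i)[0] == 0]
second_group = [f"D{i}" for i in range(12) if raid50_place(i)[0] == 1]
print(first_group)
print(second_group)
```

The two printed lists match the per-group data shown in the figure.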
Contents
1. Traditional RAID
2. RAID 2.0+
RAID Evolution
Hot spare
How Does RAID 2.0+ Work?
CK (chunk): a disk space of a specified size. CKs are the basic units from which RAID arrays are built in RAID 2.0+.
CKG (chunk group): a group of CKs taken from different disks and combined according to a RAID algorithm. In the many-to-many
architecture of RAID 2.0+, many CKGs exist across the same set of disks, and each CKG holds the data and parity for its own member CKs.
Figure: a LUN is made up of extents, which are backed by space on Disk 0, Disk 1, ..., Disk k, ..., Disk n.
RAID 2.0+ uses block-level virtualization rather than whole-disk RAID groups. The space on each disk is divided into
fine-grained chunks, chunks from different disks form CKGs protected by a RAID algorithm, and each CKG is divided into
extents that make up LUNs. Because data and hot spare space are spread across many disks, reconstruction after a disk
failure is many-to-many rather than many-to-one.
Reconstruction
Hot spare space: capacity reserved to take over when an active disk fails, so that recovery starts without manual intervention and the
system stays available while the disk is replaced. The figure shows it as a dedicated hot spare disk in traditional RAID and as hot spare
block space in RAID 2.0+.
Figure: Traditional RAID (many-to-one) compared with RAID 2.0+ (many-to-many).
- Traditional RAID: HDD 0 to HDD 4 and one hot spare disk.
- RAID 2.0+: HDD 0 to HDD 9 hold numbered chunks, from which CKG 0, CKG 1, and CKG 2 (all RAID 5) are formed. Legend: CKG parity,
  unused CK, hot spare block space.
In traditional RAID configurations, particularly those using
Logical Objects
A storage pool consists of physical disks whose space is divided into fine-grained chunks. Chunks from different disks form chunk groups
(CKGs), and each CKG is divided into extents, which are smaller units of storage. Multiple types of disks can be added to a pool, so a pool
can be tiered or not tiered. Several extents form one volume, which is presented to hosts as a LUN, so LUNs can be created quickly.
Storage pool consisting of physical disks -> Chunk -> CKG -> Extent -> Volume -> LUNs that can be viewed on the host
1. Multiple types of disks are added to a storage pool (tiered or not tiered).
2. Space provided by each disk is divided into fine-grained chunks.
3. Chunks from different disks form a CKG.
4. A CKG is divided into spaces of a smaller granularity (extents).
5. Several extents form one volume.
6. LUNs can be created quickly.
Disk Domain
A disk domain is a combination of disks (which can be all disks in the array). After the disks are
combined and hot spare capacity is reserved, the disk domain provides storage resources for the storage pool.
Tiers
High-performance tier
Performance tier
Disk domain #2
Capacity tier
Storage Pool and Tier
A storage pool serves as a container for storage resources utilized by application servers and can
include different storage tiers, which group storage media based on performance levels. Tier 0 (SSD)
offers high performance for frequently accessed data, Tier 1 (SAS) provides moderate performance
for less frequently accessed data, and Tier 2 (NL-SAS) is suited for infrequently accessed mass data.
Each tier employs specific RAID levels and policies to ensure data protection and performance
optimization. This structure allows organizations to balance performance and cost based on their
application needs.
A storage pool is a storage resource container. The storage resources used by application servers are all from
storage pools.
A storage tier is a collection of storage media providing the same performance level in a storage pool.
Different storage tiers manage storage media of different performance levels and provide storage space for
applications that have different performance requirements.
Storage tier   Tier type               Supported disk type   Application
Tier 0         High-performance tier   SSD                   Best for storage of data that is frequently accessed, with high performance and price.
Tier 1         Performance tier        SAS                   Best for storage of data that is less frequently accessed, with relatively high performance and moderate price.
Tier 2         Capacity tier           NL-SAS                Best for storage of mass data that is infrequently accessed, with low performance and price, and large capacity per disk.

RAID level   RAID policy
RAID 1       1D + 1D, 1D + 1D + 1D + 1D
RAID 10      2D + 2D or 4D + 4D, which is automatically selected by a storage system.
RAID 3       2D + 1P, 4D + 1P, 8D + 1P
RAID 5       2D + 1P, 4D + 1P, 8D + 1P
RAID 50      (2D + 1P) x 2, (4D + 1P) x 2, or (8D + 1P) x 2
RAID 6       2D + 2P, 4D + 2P, 8D + 2P, 16D + 2P
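The parity-based policies in the table trade capacity for protection, and the usable share of raw capacity follows directly from the xD + yP notation. The sketch below parses only that notation (it does not handle the mirrored RAID 1 or RAID 10 policies or the RAID 50 form); `usable_fraction` is a hypothetical helper.

```python
import re


def usable_fraction(policy):
    """Return the share of raw capacity that holds data for an 'xD + yP' policy."""
    match = re.fullmatch(r"\s*(\d+)D\s*\+\s*(\d+)P\s*", policy)
    if not match:
        raise ValueError(f"unsupported policy format: {policy!r}")
    data_strips, parity_strips = map(int, match.groups())
    return data_strips / (data_strips + parity_strips)


for policy in ["2D + 1P", "4D + 1P", "8D + 1P", "2D + 2P", "16D + 2P"]:
    print(f"{policy}: {usable_fraction(policy):.1%} usable")
```

Wider policies such as 8D + 1P use capacity more efficiently, at the cost of more strips being involved in each reconstruction.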
Disk Group
A disk group (DG) is a set of disks of the same type in a disk domain. The disk
type can be SSD, SAS, or NL-SAS.
Disk types: SSD, SAS, NL-SAS
Logical Drive
A logical drive (LD) is a disk that is managed by a storage system and corresponds
to a physical disk.
LD 0 LD 1 LD 2 LD 3
Chunk
A chunk (CK) is a disk space of a specified size allocated from a storage pool. It is
the basic unit of a RAID array.
Chunk Chunk
Chunk Group
A chunk group (CKG) is a logical storage unit that consists of CKs from different
disks in the same DG based on the RAID algorithm. It is the minimum unit for
allocating resources from a disk domain to a storage pool.
CKG CKG
Disk Disk
CK
DG DG
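The rule that a CKG takes its CKs from different disks can be sketched as a small allocator. The selection policy used here (prefer the disks with the most free chunks) is an assumption made for the example, not a documented behavior, and `build_ckg` is a hypothetical name.

```python
def build_ckg(free_chunks, width):
    """Pick one free chunk from each of `width` different disks.

    free_chunks maps a disk ID to its list of free chunk IDs and is updated in place.
    Assumed policy: prefer disks with the most free chunks, ties broken by disk ID.
    """
    candidates = sorted(
        (disk for disk, chunks in free_chunks.items() if chunks),
        key=lambda disk: (-len(free_chunks[disk]), disk),
    )
    if len(candidates) < width:
        raise ValueError("not enough disks with free chunks")
    return [(disk, free_chunks[disk].pop(0)) for disk in candidates[:width]]


free = {0: [0, 1], 1: [0], 2: [0, 1, 2], 3: []}
ckg = build_ckg(free, width=3)   # for example, a 2D + 1P RAID 5 chunk group
print(ckg)
```

Because no two CKs in a CKG share a disk, a single disk failure removes at most one member from each CKG.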
Extent
Each CKG is divided into logical storage spaces of a fixed size called extents. The extent
size is adjustable. An extent is the minimum unit (granularity) for migration and statistics of
hot data. It is also the minimum unit for space application and release in a storage
pool.
LUN 0 (thick)
Extent
CKG
LUN 1 (thick)
Grain
When a thin LUN is created, extents are divided into blocks of a fixed size, called
grains. A thin LUN allocates storage space by grains. Logical block addresses
(LBAs) in a grain are consecutive.
LUN (thin)
Extent Grain
CKG
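Allocation by grains can be pictured as a map from logical grain numbers to physical grains that is filled in only when a grain is first written. The sketch below is a toy model; the grain size of 16 blocks and the `ThinLun` class are assumptions made for illustration, not values from the course.

```python
class ThinLun:
    """Toy thin LUN: physical grains are allocated on first write."""

    def __init__(self, grain_blocks):
        self.grain_blocks = grain_blocks   # blocks per grain (assumed value)
        self.grain_map = {}                # logical grain -> physical grain

    def write(self, lba):
        """Return (physical_grain, offset_in_grain) for the LBA being written."""
        grain, offset = divmod(lba, self.grain_blocks)
        if grain not in self.grain_map:
            # First write to this grain: allocate the next physical grain.
            self.grain_map[grain] = len(self.grain_map)
        return self.grain_map[grain], offset


lun = ThinLun(grain_blocks=16)
print(lun.write(1000))   # first write allocates a grain
print(lun.write(5))      # a different grain is allocated
print(lun.write(1007))   # same grain as LBA 1000, so nothing new is allocated
print(len(lun.grain_map))
```

LBAs 1000 and 1007 fall in the same grain, so their offsets are consecutive within one allocation, in line with the statement that LBAs in a grain are consecutive.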
Volume and LUN
A volume is an internal management object in a storage system.
A LUN is a storage unit that can be directly mapped to a host for data reads and writes.
A LUN is the external embodiment of a volume.
Server
LUN
Volume Storage
Contents
1. Traditional RAID
2. RAID 2.0+
Huawei Dynamic RAID Algorithm
Common RAID algorithm
When a block in a RAID array fails, recover the data in the faulty block, migrate all the data in the RAID array, and then shield the
RAID array.
Result: A large amount of available flash memory space is wasted.
1. A block in a RAID array fails.
2. Create a new RAID array to store the data of the RAID array where a block fails.
4. Shield and obsolete the faulty RAID array, which wastes space.

Huawei dynamic RAID algorithm
When a block in a RAID array fails, recover and migrate the data in the faulty block, shield the faulty block, and reconstruct a new
RAID array using remaining blocks.
Benefit: The flash memory space is fully and effectively used.
1. A block in a RAID array fails.
2. Create a new RAID array to store the data in the faulty block.
4. Reconstruct a new RAID array using remaining blocks to store data.

Figure (both algorithms): blocks at PBA 0 and PBA 1 across channels ch 0, ch 1, ..., ch n-1, ch n, and parity channel ch P (P0, P1).
RAID-TP
RAID protection is essential to a storage system for consistent high reliability and performance. However, the
reliability of RAID protection is challenged by uncontrollable RAID array construction time due to the drastic
increase in capacity.
Traditional RAID