System z Architecture Course
System z Architecture
E-mail: joachim_von_buttlar@de.ibm.com
Phone: +49-7031-16-2914
Agenda
Introduction
System Configuration
Architecture Modes
Register Sets
Storage
Interrupts
Timing Facilities
Instructions
Storage Protection
Virtual Storage
Multiprocessing
Input/Output
Initial Program Loading (IPL)
Partitioning and Virtualization
Parallel Sysplex
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries.
DB2* HyperSwap System z9*
Cool Blue IBM* Tivoli*
DRDA* IBM logo* WebSphere*
DS8000 OMEGAMON* z9
ESCON* Parallel Sysplex* zArchitecture*
eServer ResourceLink z/OS*
FICON* System p z/VM*
FlashCopy* System Storage z/VSE
GDPS* System x zSeries*
HiperSockets System z
* All other products may be trademarks or registered trademarks of their respective companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that
any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and
the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may
have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be
subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the
performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
[Figure: Processor clock frequency by system generation, 1997 to 2008]
G4 (1997): 300 MHz, 1st full-custom CMOS S/390
G5 (1998): 420 MHz, IEEE-standard BFP; branch target prediction
G6 (1999): 550 MHz, Cu BEOL
z900 (2000): 770 MHz, full 64-bit z/Architecture
z990 (2003): 1.2 GHz, superscalar CISC pipeline
z9 EC (2005): 1.7 GHz, system-level scaling
z10 EC (2008): architectural extensions
Bringing high performance computing benefits to commercial workloads
[Figure: Front view of the frame, showing the power supplies, the CEC cage with processor books and memory, hybrid cooling, 3x I/O cages, and the Support Elements]
Connectivity
– HiperSockets: z10 EC adds HiperSockets Layer 2 and the Multiple Write Facility
– FICON for SANs: up to 336 FICON channels on z10 EC and z9 EC
– Total channels: unchanged, up to 1024 channels
– Internal I/O bandwidth: z10 EC uses industry-standard 6 GBps InfiniBand for high-speed, high-bandwidth connectivity, versus 2.7 GBps Self-Timed Interconnects (STIs) on z9 EC
– Enhanced I/O structure: star L2 cache book interconnect, versus ring topology interconnect on z9 EC
– Coupling: coupling with InfiniBand, with improved distance and potential cost savings
– Cryptography: improved AES 192 and 256, and a stronger hash algorithm with Secure Hash Algorithm (SHA-512)
– LAN connectivity: new OSA-Express3 for 10 Gigabit Ethernet connectivity
On Demand / RAS
– Capacity Provisioning Manager: z10 EC and z/OS (1.9) for policy-based advice and automation
– RAS focus: z10 EC can help eliminate the preplanning required to avoid scheduled outages
– Just-in-time deployment of capacity: Capacity on Demand offerings CBU and On/Off CoD, plus the new Capacity for Planned Events, are resident on z10 EC
Model  Books/PUs  CPs     IFLs    zAAPs   zIIPs   ICFs    Std SAPs  Std spares  Max memory (GB)
E12    1/17       0 - 12  0 - 12  0 - 6   0 - 6   0 - 12  3         2           352
E26    2/34       0 - 26  0 - 26  0 - 13  0 - 13  0 - 16  6         2           752
E40    3/51       0 - 40  0 - 40  0 - 20  0 - 20  0 - 16  9         2           1136
E56    4/68       0 - 56  0 - 56  0 - 28  0 - 28  0 - 16  10        2           1520
E64    4/77       0 - 64  0 - 64  0 - 32  0 - 32  0 - 16  11        2           1520
1 A minimum of one CP, IFL, or ICF must be purchased on every model
2 Machine with 13 or more CPs – all must run at full capacity
3 One zAAP and one zIIP may be purchased for each CP purchased
Operating systems
z/OS
– Providing intelligent dispatching on z10 EC for performance
– Up to 64-way support
– Simplified capacity provisioning on z10 EC
– New high availability disk solution with simplified management
– Enabling extreme storage volume scaling
– Facilitating new zIIP exploitation
z/TPF
– Support for 64+ processors
– Workload charge pricing
– Exploit encryption technology
z/VSE™
– Interoperability with Linux on System z
– Exploit encryption technology
– MWLC pricing with sub-capacity option
z/VM
– Consolidation of many virtual images in a single LPAR
– Enhanced management functions for virtual images
– Larger workloads with more scalability
Linux on System z
– Large Page Support improves performance
– Linux CPU Node Affinity is designed to avoid cache pollution
– Software support for extended CP Assist instructions AES & SHA
Note: Please refer to the PSP buckets for the latest PTFs
[Figure: Upgrade paths from z990 and z9 EC to the z10 EC models E12, E26, E40, E56, and E64, with concurrent upgrade between z10 EC models]
New IBM Systems Director Active Energy Manager (AEM) for Linux on System z V3.1
– Offers a single view of actual energy usage across multiple heterogeneous IBM platforms within the
infrastructure
– AEM V3.1 energy management data can be exploited by Tivoli® enterprise solutions such as IBM
Tivoli Monitoring, IBM Tivoli Usage and Accounting Manager, and IBM Tivoli OMEGAMON® XE on
z/OS
– AEM V3.1 is a key component of IBM’s Cool Blue™ portfolio within Project Big Green
System z PU Characterization
The types of Processor Units (PUs) that can be ordered on z9:
– Central Processors (CPs)
• Provides processing capacity for z/Architecture™ and ESA/390 instruction sets
• Runs z/OS, z/VM, VSE/ESA, z/VSE, TPF/ESA, z/TPF, Linux for System z and Linux
under z/VM or Coupling Facility
– System z Application Assist Processors (zAAPs)
• Under z/OS, zAAPs are used for Java processing by the Java™ Virtual Machine
(JVM) as well as XML processing
– System z Integrated Information Processors (zIIPs)
• First exploited by DB2 Version 8 for z/OS (requires z/OS V1.7)
– Integrated Facility for Linux (IFL)
• Provides additional processing capacity for Linux workloads
– Internal Coupling Facility (ICF)
• Provides additional processing capacity for the execution of the Coupling Facility
Control Code (CFCC) in a CF LPAR
– System Assist Processors (SAPs)
• SAPs manage the start and ending of I/O operations for all Logical Partitions and all
attached I/O
System Configuration
A system consists of
– One or more CPUs
– Main storage accessible by all CPUs
– I/O devices
CPUs plus main storage are referred to as the Central
Electronic Complex (CEC)
Architecture Modes
Register Sets
Program mask
– Bit 20: enable fixed-point overflow exception
– Bit 21: enable decimal overflow exception
– Bit 22: enable hex floating-point exponent underflow exception
– Bit 23: enable hex floating-point significance exception
EA – extended addressing (64-bit addressing mode, BA must also
be 1)
BA – basic addressing (31-bit addressing mode, 0 = 24-bit
addressing mode)
Instruction address
– It is stepped by the length of the current instruction
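The PSW fields listed above can be decoded with a few bit operations. The following is a minimal sketch, not part of the course material: it assumes the 128-bit z/Architecture PSW is held in a Python integer with bit 0 as the leftmost bit, and that EA and BA are PSW bits 31 and 32. All function names are hypothetical.

```python
# Illustrative sketch only: helper names are made up for this example.
PSW_BITS = 128


def psw_bit(psw: int, n: int) -> int:
    """Return bit n of the PSW, counting bit 0 as the leftmost bit."""
    return (psw >> (PSW_BITS - 1 - n)) & 1


def addressing_mode(psw: int) -> int:
    """Return 24, 31, or 64 according to the EA and BA bits."""
    ea = psw_bit(psw, 31)  # assumption: EA is PSW bit 31
    ba = psw_bit(psw, 32)  # assumption: BA is PSW bit 32
    if ea and ba:
        return 64
    if ea:
        raise ValueError("EA = 1 requires BA = 1")
    return 31 if ba else 24


def program_mask(psw: int) -> dict:
    """Decode the four program-mask bits (PSW bits 20 to 23)."""
    return {
        "fixed_point_overflow": bool(psw_bit(psw, 20)),
        "decimal_overflow": bool(psw_bit(psw, 21)),
        "hfp_exponent_underflow": bool(psw_bit(psw, 22)),
        "hfp_significance": bool(psw_bit(psw, 23)),
    }
```

For example, a PSW with only bits 31 and 32 set decodes as 64-bit addressing mode with all four program-mask exceptions disabled.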
Storage
Integral Boundaries
Alignment Requirements
Interrupts
Interrupt Action
Interrupt Masking
Timing Facilities
Time-of-Day Clock
First 8 bits (“zeros”) will be used after September 17, 2042 (good until
year 38,400 A.D.)
Bit 59 equals 1 microsecond
Bit 111 equals 222 x 10^-24 seconds
Programmable field used to generate unique value (set by SET CLOCK
PROGRAMMABLE FIELD instruction, SCKPF)
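The bit positions above are enough to convert an extended TOD clock value to a calendar date. The sketch below is an illustration under stated assumptions: the 128-bit extended value is held as an integer with bit 0 leftmost, the clock epoch is January 1, 1900, and leap seconds are ignored. The function name is hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: a TOD clock value of zero corresponds to 1900-01-01 00:00.
TOD_EPOCH = datetime(1900, 1, 1)


def extended_tod_to_datetime(ext_tod: int) -> datetime:
    """Convert a 128-bit extended TOD value to a naive datetime."""
    # Bit 59 steps once per microsecond, so everything to its right is
    # sub-microsecond resolution (plus the programmable field) and is dropped.
    micros = ext_tod >> (127 - 59)
    return TOD_EPOCH + timedelta(microseconds=micros)


# First carry into the leading 8 "zero" bits: only bit 7 is set.
first_carry = 1 << (127 - 7)
print(extended_tod_to_datetime(first_carry).date())  # 2042-09-17
```

The printed date matches the September 17, 2042 rollover mentioned above, which is a useful sanity check on the bit numbering.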
Clock Comparator
CPU Timer
Timer Stepping
Instructions
Instruction Set
For comparison: System p also has over 700 instructions, half of them
being vector-related operations
Comparisons
– Signed arithmetic and unsigned (LOGICAL)
– 16, 32, and 64 bit
Branches (absolute and relative, looping instructions)
Subroutine linkage
– BRANCH AND SAVE [AND SET MODE]
Bit testing
– TEST UNDER MASK
– FIND LEFTMOST ONE
Storage-to-storage copy and compare
– MOVE [LONG [EXTENDED]]
– COMPARE LOGICAL [LONG [EXTENDED]]
Conversion to/from packed decimal format
– CONVERT TO BINARY / DECIMAL
String processing
– TRANSLATE, TRANSLATE AND TEST
– SEARCH STRING, COMPARE LOGICAL STRING (for null-terminated strings in C)
Conversion little / big endian
– LOAD REVERSED, STORE REVERSED
Checksum generation
Sorting
– COMPARE AND FORM CODEWORD, UPDATE TREE
Encryption
– CIPHER MESSAGE
Atomic updates, locking
– COMPARE AND SWAP, PERFORM LOCKED OPERATION
Decimal Instructions
Floating-Point Instructions
Three precisions
– Short (32-bit), ca. 6 – 7 decimal digits
– Long (64-bit), ca. 16 – 17 decimal digits
– Extended (128-bit), ca. 33 – 34 decimal digits
Binary floating-point format (BFP) characteristics
– IEEE 754 standard
– Number range (long precision): ~ 4.9 x 10^-324 ≤ M ≤ ~ 1.8 x 10^308
– Supports infinity and NaN (not-a-number)
Decimal floating-point format (DFP) characteristics
– Introduced with System z9 GA3 (millicode), new: hardware implementation on System z10
– IEEE P754 standard
– Number range (long precision): 1 x 10^-398 ≤ M ≤ (10^16 – 1) x 10^369
– Supports infinity and NaN
– Exact representation of decimal fractions (e.g. 0.1)
Hexadecimal floating-point (HFP) characteristics
– System z unique, introduced in S/360
– Number range (any precision): ~ 5.4 x 10^-79 ≤ M ≤ ~ 7.2 x 10^75
– Does not support infinity and NaN
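The number ranges and the remark about exact decimal fractions can be checked on any machine. The sketch below relies on two assumptions: Python's float is IEEE 754 binary64 (the same format as long BFP), and Python's decimal module, a software decimal arithmetic rather than the DFP storage format, is used only to illustrate exact decimal fractions. The HFP bounds are computed from base-16 exponents using a short-precision (6 hex digit) fraction.

```python
import sys
from decimal import Decimal

# Long BFP (binary64) range
bfp_max = sys.float_info.max                   # about 1.8 x 10^308
bfp_min_subnormal = float.fromhex("0x1p-1074")  # about 4.9 x 10^-324

# HFP range: base-16 exponent from -64 to +63, fraction below 1
hfp_min = 16.0 ** -65                        # fraction 1/16 at exponent -64
hfp_max = (1 - 16.0 ** -6) * 16.0 ** 63      # short precision, 6 hex digits

# Binary floating point cannot represent 0.1 exactly; decimal can.
binary_exact = (0.1 + 0.2 == 0.3)
decimal_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(bfp_min_subnormal, bfp_max)
print(hfp_min, hfp_max)
print(binary_exact, decimal_exact)
```

The last line shows why decimal formats matter for commercial arithmetic: the binary comparison fails while the decimal one succeeds.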
Instruction Formats
Instruction Execution
After the instruction has been fetched, the instruction address in
the PSW is incremented by the instruction length (2, 4, or 6 bytes)
An instruction ends in one of the following ways:
– Completion and partial completion
• This is the normal end
– Suppression
• Instruction is not executed, but PSW instruction address has been updated
• Occurs with most of the program interrupt conditions (program old PSW points
after the failing instruction)
– Nullification
• Instruction is not executed and PSW instruction address has not been updated
• Occurs with program interrupt conditions that pertain to DAT handling (i.e. the
instruction will be executed after the page has been loaded)
• Occurs also for some interruptible instructions (MVCL, CLCL) that resume at the
point of interrupt
– Termination
• Instruction may have been partially executed and PSW instruction address has
been updated
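The effect of the ending type on the instruction address stored in the old PSW can be summarized in a few lines. This is a simplified model, not the architected definition: it assumes the instruction length is given by the two leftmost bits of the first opcode byte (00 for 2 bytes, 01 or 10 for 4 bytes, 11 for 6 bytes) and ignores address wraparound. The function names and the string labels are hypothetical.

```python
def instruction_length(first_byte: int) -> int:
    """Instruction length in bytes, from the two leftmost opcode bits."""
    return {0b00: 2, 0b01: 4, 0b10: 4, 0b11: 6}[(first_byte >> 6) & 0b11]


def old_psw_address(instr_addr: int, first_byte: int, ending: str) -> int:
    """Instruction address left in the PSW for a given kind of ending."""
    if ending == "nullification":
        # Address not updated: the instruction can simply be re-executed.
        return instr_addr
    if ending in ("completion", "suppression", "termination"):
        # Address already stepped past the instruction.
        return instr_addr + instruction_length(first_byte)
    raise ValueError(f"unknown ending type: {ending}")
```

With a 6-byte instruction at 0x1000, nullification leaves 0x1000 in the PSW, while suppression leaves 0x1006.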
Instruction Format RR
AR R1,R2
AGR R1,R2
Instruction Format RX
A R1,4(R2,R3)
AG R1,-4(R2,R3)
AY R1,-4(R2,R3)
is assembled as 0x E3 1 2 3 FFC FF 5A
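The AY encoding above can be reproduced mechanically. The sketch below assumes the long-displacement layout implied by that example: first opcode byte, R1, X2, B2, the low 12 displacement bits (DL2), the high 8 displacement bits (DH2), and the second opcode byte. The function name is hypothetical.

```python
def encode_rxy(op1: int, op2: int, r1: int, x2: int, b2: int, disp: int) -> bytes:
    """Encode a 6-byte instruction with a signed 20-bit displacement."""
    if not -(1 << 19) <= disp < (1 << 19):
        raise ValueError("displacement does not fit in 20 signed bits")
    d = disp & 0xFFFFF            # two's complement, 20 bits
    dl, dh = d & 0xFFF, d >> 12   # low 12 bits and high 8 bits
    word = (
        (op1 << 40) | (r1 << 36) | (x2 << 32) | (b2 << 28)
        | (dl << 16) | (dh << 8) | op2
    )
    return word.to_bytes(6, "big")


# AY R1,-4(R2,R3) with opcode bytes E3 and 5A, as in the example above
print(encode_rxy(0xE3, 0x5A, r1=1, x2=2, b2=3, disp=-4).hex())  # e3123ffcff5a
```

Splitting the displacement into DL2 and DH2 is what lets these instructions keep the low 12 bits in the same position as the older 12-bit displacement formats.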
Instruction Format RI
AGHI R1,-1
All 47 variations of
ADD (z10)
Compare Instructions
Mnemonic R1 R2 Remarks
BRANCH ON CONDITION
M1 field used as 4-bit mask, each bit corresponds to one condition code (0-3)
BC[R] 8,… means branch on cc 0
Any combination of mask bits is possible, e.g. 2+1 means branch on cc 2 or 3
Branch target is contained in register (R2) or is accumulated from B2, D2, and X2
BRANCH RELATIVE ON CONDITION
M1 field used as 4-bit mask, each bit corresponds to one condition code (0-3)
BRC[L] 8,… means branch if cc is 0
Any combination of mask bits is possible, e.g. 2+1 means cc 2 or cc 3
Branch target is relative, assembled as number of halfwords forward / backward
(+/- 64KB for BRC, +/- 4GB for BRCL)
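The mask test and the relative-target calculation described above fit in two small functions. This is a minimal sketch with hypothetical names; it models only the branch decision and the halfword-based target, not addressing-mode wraparound.

```python
def branch_taken(mask: int, cc: int) -> bool:
    """True if the 4-bit mask selects condition code cc (0 to 3)."""
    # Mask bit values 8, 4, 2, 1 correspond to cc 0, 1, 2, 3.
    return bool(mask & (8 >> cc))


def relative_target(instr_addr: int, halfwords: int) -> int:
    """Target of a relative branch: a signed halfword count from the branch."""
    return instr_addr + 2 * halfwords


print(branch_taken(8, 0), branch_taken(3, 2), branch_taken(3, 0))  # True True False
```

A mask of 15 selects every condition code and therefore always branches; a mask of 0 never does.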
Subroutine Linkage
BRANCH AND SAVE stores the current instruction address
from the PSW (i.e., the address of the instruction following
BASR) in R1 and branches to the second-operand address:
EXECUTE Instruction
Purpose:
– Execute an instruction outside the instruction stream
– Execute an instruction whose length field is known only at run-time
Format:
Function:
– The low-order byte of R1 is (conceptually) ORed into the second byte of the
target instruction located at the second-operand address, then the target
instruction is executed
– If R1 is specified as 0, the target instruction is not modified before execution
Example:
A MOVE instruction (MVC) copies 1 – 256 bytes from the second
operand to the first operand
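The way EXECUTE supplies a run-time length to MVC can be modeled as follows. This is a sketch under assumptions: the MVC template is the 6-byte encoding D2 LL B1D1 B2D2 whose second byte holds the length minus one, and the helper names are hypothetical. The template in storage is never changed; only the copy that gets executed is modified.

```python
def execute_target(target: bytes, r1_number: int, r1_value: int) -> bytes:
    """Return the instruction that EXECUTE would actually run."""
    if r1_number == 0:
        return target                # R1 = 0: target executed unmodified
    modified = bytearray(target)
    modified[1] |= r1_value & 0xFF   # OR low-order byte of R1 into byte 2
    return bytes(modified)


def mvc_length(instruction: bytes) -> int:
    """Number of bytes an MVC copies: the length code plus one."""
    return instruction[1] + 1


# Template: MVC with length code 0, base registers 2 and 3, displacements 0
template = bytes.fromhex("d20020003000")
effective = execute_target(template, r1_number=1, r1_value=10 - 1)
print(effective.hex(), mvc_length(effective))  # d20920003000 10
```

Because the length code is one less than the byte count, the register is loaded with the desired length minus one before the EXECUTE.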
Long-running Instructions
On completion
– The addresses are incremented
– The lengths are decremented
– If completion was partial, cc 3 is
set
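Partial completion implies that the program re-issues the instruction until the condition code is no longer 3. The following is a deliberately simplified model, not the real instruction semantics: it uses a single length, non-overlapping operands, no padding, and an arbitrary 4096-byte unit of work per execution. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MoveOperands:
    dst_addr: int
    src_addr: int
    length: int


def move_step(storage: bytearray, ops: MoveOperands, unit: int = 4096) -> int:
    """One execution: move up to `unit` bytes, update operands, return the cc."""
    n = min(ops.length, unit)
    storage[ops.dst_addr:ops.dst_addr + n] = storage[ops.src_addr:ops.src_addr + n]
    ops.dst_addr += n   # addresses are incremented
    ops.src_addr += n
    ops.length -= n     # length is decremented
    return 3 if ops.length else 0   # cc 3 signals partial completion


def move_all(storage: bytearray, ops: MoveOperands) -> int:
    """Re-execute until the cc is not 3; return the number of executions."""
    executions = 1
    while move_step(storage, ops) == 3:
        executions += 1
    return executions
```

Because the operands are updated in registers on every partial completion, the retry loop needs no bookkeeping of its own: it simply branches back to the same instruction while cc 3 is set.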
Partial Completion
Interruptible Instructions
Storage Protection
Key-controlled protection
Low-address protection
DAT protection (formerly called page protection)
Access-list-controlled protection
Storage Keys
Storage Key
Protection Rules
Storage-Protection-Override
Fetch-Protection-Override
Low-Address Protection
DAT Protection
Prevents a page (4K) from being altered, does not provide fetch protection
Used, for instance, to implement the POSIX fork() function
Enabled by setting the P bit in a page-table entry (PTE):
Virtual Storage
Finally, the page-table entries contain the real address of the page
frames:
[Figure: Translation of a virtual address through the region-first (RF), region-second (RS), region-third (RT), segment (S), and page (P) tables. Each entry holds the origin of the next-lower table; the page-table entry holds the page address that forms the 64-bit real address.]
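The table walk can be sketched as a series of index lookups. The example below is a toy model: it assumes the z/Architecture split of a 64-bit virtual address into four 11-bit indexes (region first, region second, region third, segment), an 8-bit page index, and a 12-bit byte index, and it represents each table as a Python dict instead of real table entries with origin, length, and invalid bits. All names are hypothetical.

```python
def split_virtual_address(va: int) -> dict:
    """Split a 64-bit virtual address into its translation indexes."""
    return {
        "rfx": (va >> 53) & 0x7FF,  # region-first index, 11 bits
        "rsx": (va >> 42) & 0x7FF,  # region-second index, 11 bits
        "rtx": (va >> 31) & 0x7FF,  # region-third index, 11 bits
        "sx": (va >> 20) & 0x7FF,   # segment index, 11 bits
        "px": (va >> 12) & 0xFF,    # page index, 8 bits
        "bx": va & 0xFFF,           # byte index within the 4K page
    }


def translate(va: int, region_first_table: dict) -> int:
    """Walk the nested tables; a KeyError stands in for a translation exception."""
    idx = split_virtual_address(va)
    entry = region_first_table
    for level in ("rfx", "rsx", "rtx", "sx", "px"):
        entry = entry[idx[level]]
    return entry + idx["bx"]  # entry is now the page-frame real address


tables = {1: {2: {3: {4: {5: 0x9000}}}}}
va = (1 << 53) | (2 << 42) | (3 << 31) | (4 << 20) | (5 << 12) | 0x678
print(hex(translate(va, tables)))  # 0x9678
```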
128 System z Architecture May 2008 © 2008 IBM Corporation
Systems and Technology Group
Translation-Lookaside Buffer
Access-Register Mode
Purpose: access multiple address spaces at a time
Enabled when PSW.5 = 1 (DAT) and PSW bits 16-17 are set to 01
Access register is implicitly the same as the base register (NOT index register):
LG R1,8(R3,R4)
AR4 is used to translate the logical address 8(R3,R4) to a real address
If the access register contains 0, primary-space address translation is used (via CR1)
If the access register contains 1, secondary-space address translation is used (via CR7)
Base register 0 means no access register is to be used (not the contents of AR0)
For further details, check the Principles of Operation, “Access-Register Translation“ in chapter 5
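The selection rules above can be written as a small decision function. This is a sketch only: it covers which access register is consulted and the two special values, and it lumps every other value into a single "access-list lookup" case without modeling access-register translation itself. Treating base register 0 like a value of 0 (primary space) is an assumption based on the architecture definition; the names are hypothetical.

```python
def select_address_space(base_register: int, access_registers: list) -> str:
    """Decide which address space a storage operand refers to in AR mode."""
    if base_register == 0:
        # No access register is used; AR0 is NOT consulted.
        # Assumption: this is treated like a value of 0 (primary space).
        return "primary"
    value = access_registers[base_register]
    if value == 0:
        return "primary"      # translated via CR1
    if value == 1:
        return "secondary"    # translated via CR7
    return "access-list lookup"


# LG R1,8(R3,R4): R4 is the base register, so AR4 selects the space
ars = [0] * 16
ars[3] = 1    # index register's AR is ignored
ars[4] = 0
print(select_address_space(4, ars))  # primary
```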
Multiprocessing
[Figure: Prefixing. Absolute storage from 0 to 0x6000 in 0x2000-byte blocks; CPU 0 has prefix 0, CPU 1 has prefix 0x2000, and CPU 2 has prefix 0x4000.]
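Prefixing gives each CPU its own copy of the low-storage area by swapping two blocks of addresses. The sketch below assumes an 8K (0x2000-byte) prefix area, as in the figure, and a prefix value that is a multiple of 0x2000; the function name is hypothetical.

```python
PREFIX_AREA_SIZE = 0x2000  # 8K block, matching the prefix values in the figure


def real_to_absolute(real: int, prefix: int) -> int:
    """Apply prefixing to convert a real address to an absolute address."""
    if real < PREFIX_AREA_SIZE:
        return real + prefix          # low storage maps to the CPU's own block
    if prefix <= real < prefix + PREFIX_AREA_SIZE:
        return real - prefix          # the CPU's block maps back to absolute 0
    return real                       # everything else is unchanged


# CPU 1 (prefix 0x2000) storing at real address 0x10 hits absolute 0x2010
print(hex(real_to_absolute(0x10, 0x2000)))  # 0x2010
```

With a prefix of 0 the mapping is the identity, which is why CPU 0 in the figure sees absolute storage directly.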
Purpose
– Safe update of data in storage shared between several routines
running on different processors
– May also be needed on uniprocessor when routines run in different
threads/tasks
Concept
– Fetch the original value from storage
– Create a new value, based on the original value
– Compare the private copy of the original value to the current value in
storage – if it has not changed since the original fetch, replace it with
the new value
– However, if the value in storage has changed in between, retry the
process
Note: Operand must be aligned according to its length (word, doubleword, or quadword
boundary)
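The fetch, compute, compare, and retry steps above form the classic compare-and-swap loop. Python has no native compare-and-swap, so the sketch below emulates the atomic step with a lock; the lock stands in for the atomicity that the hardware instruction provides. The class and function names are hypothetical.

```python
import threading


class SharedWord:
    """A storage word with an emulated COMPARE AND SWAP operation."""

    def __init__(self, value: int = 0):
        self.value = value
        self._lock = threading.Lock()  # emulates hardware atomicity only

    def compare_and_swap(self, expected: int, new: int) -> bool:
        """Store `new` only if the word still holds `expected`."""
        with self._lock:
            if self.value == expected:
                self.value = new
                return True
            return False


def atomic_add(word: SharedWord, delta: int) -> int:
    """Add delta to the shared word using the fetch / compute / swap loop."""
    while True:
        old = word.value                      # fetch the original value
        new = old + delta                     # create the new value
        if word.compare_and_swap(old, new):   # replace only if unchanged
            return new
        # another CPU changed the word in between: retry


def worker(word: SharedWord, count: int) -> None:
    for _ in range(count):
        atomic_add(word, 1)
```

Running several `worker` threads against one `SharedWord` yields the exact total of all increments, because an update that loses the race is retried rather than lost.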
SIGP R1,R3,D2(B2)
– R1: returned status
– R1+1: parameter
– R3: target CPU
– D2(B2): order code
Input/Output
Device
Channel path
Input/Output
Device Numbers
*
IMID RESOURCE PART=((LP1,1))
*
******************************************************************
* FICON Channel *
******************************************************************
CH21 CHPID PCHID=521,PATH=(CSS(0),21),TYPE=FC, *
PART=LP1
CNTLUNIT CUNUMBR=555,UNITADD=((10,16)),UNIT=FC, *
PATH=((CSS(0),21))
IODEVICE ADDRESS=(5210,16),CUNUMBR=555,UNIT=FC,UNITADD=10,*
STADET=Y
Subchannels
Subchannels (continued…)
Subchannel-Information Block
Modification of Subchannels
CCW Flags
CD: chain data (use the same CCW command on the next
CCW)
CC: chain command (0 = this is the last CCW)
SLI: suppress incorrect length indication
SKP: do not read data on input
PCI: program-controlled interruption (generate intermediate
I/O interrupt when this CCW is fetched)
IDA: indirect data addressing
S: suspend channel program without generating I/O interrupt
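The flag names above can be decoded from the flags byte of a CCW. The sketch assumes the flags occupy one byte with CD as the leftmost bit (0x80) followed by CC, SLI, SKP, PCI, IDA, and S in that order; the helper names are hypothetical, and treating "neither CD nor CC set" as the end of the channel program is a simplification.

```python
# Assumed bit assignments within the CCW flags byte, leftmost bit first.
CCW_FLAGS = [
    (0x80, "CD"),   # chain data
    (0x40, "CC"),   # chain command
    (0x20, "SLI"),  # suppress incorrect length indication
    (0x10, "SKP"),  # skip: do not transfer data on input
    (0x08, "PCI"),  # program-controlled interruption
    (0x04, "IDA"),  # indirect data addressing
    (0x02, "S"),    # suspend
]


def decode_ccw_flags(flags: int) -> list:
    """Return the names of all flags set in the flags byte."""
    return [name for bit, name in CCW_FLAGS if flags & bit]


def is_last_ccw(flags: int) -> bool:
    """A CCW with neither CD nor CC set ends the channel program."""
    return not (flags & 0xC0)


print(decode_ccw_flags(0x60), is_last_ccw(0x60))  # ['CC', 'SLI'] False
```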
When the last CCW has been processed, the channel signals
ending status to the channel subsystem
The subchannel is now made status-pending and an
interruption condition is generated for its I/O-interruption
subclass (ISC)
An I/O interrupt is said to be floating, i.e. it is “offered” to all
CPUs that are enabled for this ISC
If more than one CPU is enabled for this I/O interrupt, only
one will actually take it
As an alternative, the CPU may poll for an interrupt using the
TEST PENDING INTERRUPTION (TPI) instruction
Important:
– Normally, an I/O operation ends with device status
channel end/device end (CE/DE)
– Sometimes, channel end status comes as separate
status before device end
– Unit check is signaled for a variety of exceptional
conditions
– Normally, the subchannel status is 0
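An interrupt handler typically classifies the ending status along the lines of the points above. The sketch below is illustrative only: it assumes the device-status bit values channel end = 0x08, device end = 0x04, and unit check = 0x02, and the returned labels are hypothetical rather than architected terms.

```python
CHANNEL_END = 0x08
DEVICE_END = 0x04
UNIT_CHECK = 0x02


def classify_ending_status(device_status: int, subchannel_status: int) -> str:
    """Classify the status presented at an I/O interrupt."""
    if subchannel_status != 0:
        return "subchannel error"          # normally this byte is 0
    if device_status & UNIT_CHECK:
        return "unit check"                # exceptional condition at the device
    ce_de = device_status & (CHANNEL_END | DEVICE_END)
    if ce_de == (CHANNEL_END | DEVICE_END):
        return "complete"                  # the normal CE/DE ending
    if ce_de == CHANNEL_END:
        return "channel end, awaiting device end"
    if ce_de == DEVICE_END:
        return "device end"
    return "other"


print(classify_ending_status(0x0C, 0))  # complete
```

Checking unit check before CE/DE reflects that an operation can end with both presented together, in which case the exceptional condition takes precedence.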
[Figure: Partitions, subchannels, and channels. First panel: 63K subchannels per partition holding both bases and aliases. Second panel, z9 (2094) processor: two subchannel sets per partition (63.75K and 64K) for bases and aliases.]
MIDAWs continued…
MIDAW Layout
IPL Process
[Figure: IPL process. The initial IPL CCW reads 24 bytes into absolute storage starting at location 0 (offsets 0, 8, 16, 24 shown); the diagram also marks the address of the next CCW.]
Partitioning and Virtualization
Purpose
– Run more than one operating system on a CEC
Method
– Make each operating system think it owns a whole
machine
Comes in two flavors
– LPAR (logical partitioning)
– z/VM (virtual machine)
LPAR
z/VM
[Figure: Workloads on the System z platform: transactions and business applications (CICS, IMS, DL/I, DB2), Java applications (EJB, JVM, WebSphere), and Linux consolidation (application plus database, cluster/parallel applications, file/disk/print, Siebel, e-commerce)]
History of SIE
Guest PSW
Guest CPU timer, clock comparator
Guest epoch difference (for guest TOD clock)
Guest control registers
Guest general registers 14 and 15
Guest prefix register
On SIE entry
– Load guest general registers 0 – 13
– Load guest floating-point registers, floating-point-control
register
– Load guest access registers
On SIE exit
– Handle interception
Interception controls
– Certain instructions
– SVCs by SVC number
– LCTL for any set of control registers
PSW enabled for I/O interrupts (PSW.6 = 1)
PSW enabled for external interrupts (PSW.7 = 1)
Stop request (triggered from another CPU)
An interception
– The state descriptor is updated, an interception code is stored
and the host program resumes after the SIE instruction.
Guest Storage
Absolute storage of an LPAR must be contiguous
Activating and deactivating LPARs may lead to fragmentation of storage
[Figure: Absolute storage layout with the LPAR hypervisor and LPAR 1, LPAR 2, and LPAR 3, before and after an LPAR is deactivated]
Scenario
– Application running in an operating system
– Operating system (guest-2) running under z/VM
– z/VM (guest-1) running in an LPAR
– LPAR managed by the LPAR hypervisor (host)
Parallel Sysplex
[Figure: Coupling link connectivity between z10 EC and other servers: z9 EC (2094), z9 BC (2096), z990 (2084), z890 (2086), and 2064, including z9 EC and z9 BC dedicated CFs. Link types shown: ISC-3, ICB-4 (not available on z10 EC E64), and PSIFB over IFB (12x IB-SDR at 3 GBps, 12x IB-DDR at 6 GBps).]
** STI = ISC-3 and ICB-4 links which use STI technology
Coupling Channels
[Figure: Message exchange between z/OS (sender) and CF (receiver): the MCB (command) and data (up to 64K bytes) flow to the receiver, and the MRB (response) and data (up to 64K bytes) flow back]
Secondary Message
Message Vectors
CF Duplexing
CF Duplexing (continued…)
Appendix
References (1)
http://www.elink.ibmlink.ibm.com/publications/servlet/pbi.wss
References (2)
http://www.redbooks.ibm.com/abstracts/sg247515.html
http://www.redbooks.ibm.com/abstracts/sg247516.html
References (3)
http://www.research.ibm.com/journal/rd46-45.html
http://www.research.ibm.com/journal/rd51-12.html
References (4)
http://www.research.ibm.com/journal/rd48-34.html
http://domino.research.ibm.com/tchjr/journalindex.nsf/495f80c9d0f539778525681e00724804/05ffabe33879ed1485256bfa00685ddf?OpenDocument