
Systems and Technology Group

System z Architecture

Joachim von Buttlar


System z Firmware Development
IBM Laboratories Böblingen, Germany

E-mail: joachim_von_buttlar@de.ibm.com
Phone: +49-7031-16-2914

IBM Laboratories Böblingen May 2008 © 2008 IBM Corporation



Agenda
 Introduction
 System Configuration
 Architecture Modes
 Register Sets
 Storage
 Interrupts
 Timing Facilities
 Instructions
 Storage Protection
 Virtual Storage
 Multiprocessing
 Input/Output
 Initial Program Loading (IPL)
 Partitioning and Virtualization
 Parallel Sysplex


Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries.
DB2* HyperSwap System z9*
Cool Blue IBM* Tivoli*
DRDA* IBM logo* WebSphere*
DS8000 OMEGAMON* z9
ESCON* Parallel Sysplex* z/Architecture*
eServer ResourceLink z/OS*
FICON* System p z/VM*
FlashCopy* System Storage z/VSE
GDPS* System x zSeries*
HiperSockets System z

* Registered trademarks of IBM Corporation

The following are trademarks or registered trademarks of other companies.


Intel is a trademark of Intel Corporation in the United States, other countries, or both.
Java and all Java-related trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States and other countries
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
Red Hat, the Red Hat "Shadow Man" logo, and all Red Hat-based trademarks and logos are trademarks or registered trademarks of Red Hat, Inc., in the United States and
other countries.

* All other products may be trademarks or registered trademarks of their respective companies.

Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that
any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and
the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may
have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be
subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the
performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.


History – from System/360 to System z10

 1964: Introduction of S/360
 1970: S/370, introduced virtual storage and virtual machines
 1983: 370/XA, eXtended Architecture, introduced 31-bit addressing
and a new channel subsystem
 1988: ESA/370, introduced LPAR, access registers
 1990: ESA/390, introduced ESCON channels, crypto, Sysplex Timer
(ETR)
 2000: z/Architecture, introduced 64-bit architecture (z900)
 2003: Up to 32-way, up to 30 LPARs, multiple books, multiple
channel subsystems (z990)
 2005: Up to 54-way, up to 60 LPARs, multiple subchannel sets (z9)
 2008: Up to 64-way, new I/O hubs, up to 1.5TB central storage (z10)


Continuing the modular design for flexibility
Facilitates upgradeability and availability

IBM System z10 Enterprise Class (z10 EC)
Machine Type: 2097
5 Models: E64, E56, E40, E26, E12

Processor Units (PUs):
 New Enterprise Quad Core technology – 4.4 GHz
 One to four book modular design
 Sub-capacity available up to 12 CPs
 Enhanced capacity 64-way model
 17 PUs per book (17 and 20 for Model E64)
– New core sparing technology
– More SAPs per system
– Configurable PUs allow you to design the system to meet your needs (e.g. CPs, specialty engines, SAPs)

Memory:
 Up to 1.5 TB / 384 GB per book
 16 GB HSA separately managed and not included in customer purchased memory
 Star L2 cache book interconnect

I/O:
 6 GB/s InfiniBand host buses for I/O
 New OSA-Express3 10 GbE
 InfiniBand Coupling Links


Improved server performance and scalability with faster engines, more engines and dispatching synergy

 The z10 EC uniprocessor delivers more capacity than z9™ EC uniprocessor *
 The z10 EC 64-way offers more server capacity than the largest z9 EC**
 Introducing HiperDispatch for improved synergy with z/OS® operating system to deliver scalability and performance

[Figure: capacity versus customer engines for z900, z990, z9 EC and z10 EC; callouts: 4.4 GHz processor chip, Hardware Decimal Floating Point, Crypto, IFL, zIIP, zAAP]

Significant capacity for traditional growth and consolidation


IBM z10 EC Continues the CMOS Mainframe Heritage

[Figure: processor clock frequency by generation and year]
 G4 (1997): 300 MHz – 1st full-custom CMOS S/390
 G5 (1998): 420 MHz – IEEE-standard BFP; branch target prediction
 G6 (1999): 550 MHz – Cu BEOL
 z900 (2000): 770 MHz – Full 64-bit z/Architecture
 z990 (2003): 1.2 GHz – Superscalar CISC pipeline
 z9 EC (2005): 1.7 GHz – System level scaling
 z10 EC (2008): 4.4 GHz – Architectural extensions


Making high performance a reality

 New Enterprise Quad Core z10 processor chip
– 4.4 GHz
– Additional throughput means improved price/performance
– 50+ instructions added to improve compiled code efficiency
– Support for 1MB page frames
 Hardware accelerators on the chip
– Hardware data compression
– Cryptographic functions
– Hardware Decimal Floating point
 Java™ and Linux workloads get performance improvements from new core pipeline design

[Figure: Enterprise Quad Core z10 processor chip]


zAAPs – not just for Java anymore!

More new application technology exploiters, more new benefits
 zAAP designed to help implement new application technologies on System z
– Java was the first exploiter – lowering the cost of computing for WebSphere® Application Server and other Java technology-based applications
– z/OS XML System Services (introduced with z/OS V1.9 and rolled back to V1.8 and V1.7) helps make hosting XML data and transactions on System z more attractive. DB2 9 and Enterprise COBOL V4.1 are the first exploiters. (Available 9/07)
 … and more on Java
– SDK6 on z10 EC delivers improved performance over SDK5 on z9 EC
– New function on z10 EC may benefit Java performance
• New core pipeline design of z10 processor chip
• Up to 3X more available memory than the z9 EC
• PLUS zAAP price is same for z10 EC as z9 EC and we offer no charge MES upgrades when moving to new technology

IBM System z10 Application Assist Processor – zAAP



Consolidation with Linux gets a “green light”


z10 EC may help customers become more energy efficient:
 Deploy energy efficient technologies – reduce energy
consumption and save floor space
Economics of IFLs and z/VM® help to drive down the cost of IT
 IFLs attractively priced, have no impact on z/OS license fees, and
z/VM and Linux software priced at real engine capacity
 ‘No charge’ MES upgrades available when upgrading to new
technology

Integrated Facility for Linux – IFL



Helping to drive down the cost of IT

Now even more workloads can benefit from zIIP
 zIIP can help to integrate data across the enterprise to improve resource optimization and lower the cost of ownership for eligible data serving and transaction processing workloads
– Centralized data serving - First to exploit zIIP were workloads such as BI, ERP, and CRM applications running on distributed servers with remote connectivity to DB2 V8
– Network encryption - With z/OS V1.8, IPSec processing added making the zIIP an IPSec encryption engine helpful in creating highly secure connections in an enterprise (Available 9/07)
– Serving XML data – introduced with z/OS V1.9, enables XML parsing from DB2 9 inserting and saving XML data over DRDA® to take advantage of zIIPs (new!)
– Remote mirror – zIIP assisted z/OS Global Mirror function (zGM, formerly XRC) (new!)
• Most of the System Data Mover (SDM) processing eligible for zIIP offload
• Helps reduce server utilization at recovery site
• Potential for reduced software costs
– Exploiting of zIIPs by ISVs
 zIIPs offer economics to help you
– zIIPs attractively priced with no IBM software charges on the zIIP
– No charge MES upgrades available for zIIPs when upgrading to new technology

IBM System z10 Integrated Information Processor - zIIP


Focused performance boost


Hardware Decimal Floating Point
 Decimal arithmetic widely used in commercial and financial applications
 Computations often handled in software
 First delivered with the System z9 - brought improved precision and function
– Avoids rounding and other problems with binary/decimal conversions
 On z10 EC integrated on every core giving a performance boost to execution of
decimal arithmetic
 Growing industry support for DFP standardization
– Java BigDecimal, C#, XML, C/C++, GCC, DB2® V9, Enterprise PL/1, Assembler
– Endorsed by key software vendors including Microsoft® and SAP
– Open standard definition led by IBM

Bringing high
performance computing
benefits to commercial
workloads


z10 EC – Under the Covers (Models E56 and E64)

[Figure: front view of the frame; labeled parts: internal batteries (optional), power supplies, CEC cage with processor books and memory, hybrid cooling, 3x I/O cages, support elements]


System z10 Books


System z10 Multi-chip Module


z10 EC functional comparison to z9 EC

Processor / Memory
 Uniprocessor Perf., System Capacity, Processor Design: New 4.4GHz processor chip
 Models: z10 EC has 5 and z9 EC has 5 models, both with up to 4 books
 Processing Units (PUs): z10 EC has up to 64 PUs to configure, up to 54 on z9 EC
 Granular Capacity: z10 EC has up to 100 Capacity settings versus 78 on the z9 EC
 Memory: z10 EC has up to 1.5 TB vs. up to 512 GB on z9 EC
 Fixed HSA: z10 EC has fixed 16 GB HSA, z9 EC had HSA carved from purchased memory

Virtualization
 LPARs: z10 EC has up to 64 logical processors in an LPAR versus 54 on z9 EC
 HiperDispatch: z10 EC has HiperDispatch for improved synergy with z/OS Operating System to deliver scalability and performance

Connectivity
 HiperSockets: z10 EC New HiperSockets Layer 2 and Multiple Write Facility
 FICON for SANs: Up to 336 FICON channels on z10 EC and z9 EC
 Total channels: Same - Up to 1024 channels
 Internal I/O Bandwidth: z10 EC has industry standard 6 GBps InfiniBand, which supports high speed connectivity and high bandwidth, versus z9 EC using 2.7 GBps Self Time Interconnects (STIs)
 Enhanced I/O structure: Star L2 Cache Book Interconnect versus Ring Topology interconnect on z9 EC
 Coupling: Coupling with InfiniBand – improved distance and potential cost savings
 Cryptography: Improved AES 192 and 256 and stronger hash algorithm with Secure Hash Algorithm (SHA-512)
 LAN Connectivity: New OSA-Express3 for 10 Gigabit Ethernet connectivity

On Demand / RAS
 Capacity Provisioning Mgr: z10 EC & z/OS (1.9) for policy based advice and automation
 RAS Focus: z10 EC can help eliminate preplanning required to avoid scheduled outages
 Just in Time deployment of Capacity: Capacity on Demand offerings CBU and On/Off CoD plus new Capacity for Planned Events are resident on z10 EC

Environmentals
 Monitoring: z10 EC displays energy efficiency on SAD screens
 Utilizes IBM Systems Director Active Energy Manager for Linux on System z for trend calculations and management of other servers that participate


Flexible ordering to meet your requirements

 Huge flexibility in configuration options
 Upgradeability within the z10 EC
 Ability to change PU configuration if business need changes

Model  Books/PUs  CPs (1,2)  IFLs (1)  ICFs (1)  zAAPs (3)  zIIPs (3)  Std SAPs  Std Spares  Max Memory GB
E12    1/17       0 - 12     0 - 12    0 - 12    0 - 6      0 - 6      3         2           352
E26    2/34       0 - 26     0 - 26    0 - 16    0 - 13     0 - 13     6         2           752
E40    3/51       0 - 40     0 - 40    0 - 16    0 - 20     0 - 20     9         2           1136
E56    4/68       0 - 56     0 - 56    0 - 16    0 - 28     0 - 28     10        2           1520
E64    4/77       0 - 64     0 - 64    0 - 16    0 - 32     0 - 32     11        2           1520

(1) A minimum of one CP, IFL, or ICF must be purchased on every model
(2) Machine with 13 or more CPs – all must run at full capacity
(3) One zAAP and one zIIP may be purchased for each CP purchased
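
The figures in the model table appear to be related by simple arithmetic: for each model, the total PU count minus the standard SAPs and standard spares gives the upper limit of the CP range. The sketch below checks this in Python. The dictionary layout and the function name are illustrative assumptions; only the numbers come from the table.

```python
# Values copied from the model table: (total PUs, max CPs, standard SAPs, standard spares).
MODELS = {
    "E12": (17, 12, 3, 2),
    "E26": (34, 26, 6, 2),
    "E40": (51, 40, 9, 2),
    "E56": (68, 56, 10, 2),
    "E64": (77, 64, 11, 2),
}


def configurable_pus(total_pus: int, std_saps: int, std_spares: int) -> int:
    """PUs left for the customer after standard SAPs and spares are set aside."""
    return total_pus - std_saps - std_spares


for model, (pus, max_cps, saps, spares) in MODELS.items():
    # Print the derived value next to the table's maximum CP count for comparison.
    print(model, configurable_pus(pus, saps, spares), max_cps)
```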


Helping to get you connected to your world

 Improved performance and flexibility for connectivity
 Broad set of options to meet your needs
 Excellent investment protection when you upgrade to the z10 EC

Within the server
 HiperSockets™
– Multi Write Facility (new!)
– Layer2 support (new!)
 Integrated console controller
 Integrated communications controller support

To the Data
 FICON/FCP
– FICON® Express4
– FICON Express2
– FICON Express (Required for FCV)
 ESCON®

To the Network
 OSA-Express3
– 10 Gigabit Ethernet (new!)
 OSA-Express2
– 1000BASE-T Ethernet
– Gigabit Ethernet LX and SX
– 10 Gigabit Ethernet LR

For Clustering
 InfiniBand Coupling Links (new!)
 ICB-4
 ISC-3 (peer mode only)
 IC (define only)
 STP - NTP Client Support (new!)
 No support for active participation of a z10 EC in the same Parallel Sysplex® cluster with:
– IBM eServer™ zSeries® 900 (z900)
– IBM eServer zSeries 800 (z800)

* Note: Red items carry forward on a Machine MES only, not available for new system orders


Operating systems

z/OS
 Providing intelligent dispatching on z10 EC for performance
 Up to 64-way support
 Simplified capacity provisioning on z10 EC
 New high availability disk solution with simplified management
 Enabling extreme storage volume scaling
 Facilitating new zIIP exploitation

z/TPF
 Support for 64+ processors
 Workload charge pricing
 Exploit encryption technology

z/VSE™
 Interoperability with Linux on System z
 Exploit encryption technology
 MWLC pricing with sub-capacity option

z/VM
 Consolidation of many virtual images in a single LPAR
 Enhanced management functions for virtual images
 Larger workloads with more scalability

Linux on System z
 Large Page Support improves performance
 Linux CPU Node Affinity is designed to avoid cache pollution
 Software support for extended CP Assist instructions AES & SHA


System z10 EC operating system support

Operating System                             ESA/390 (31-bit)   z/Architecture® (64-bit)
z/OS Version 1 Releases 7(1), 8 and 9        No                 Yes
Linux on System z, RHEL 4, 5 & SLES 9, 10    No                 Yes
z/VM Version 5 Release 2(2) and 3(2)         No                 Yes
z/VSE Version 3 Release 1(3)                 Yes                No
z/VSE Version 4 Release 1(4)                 No                 Yes
z/TPF Version 1 Release 1                    No                 Yes
TPF Version 4 Release 1 (ESA mode only)      Yes                No

1. z/OS R1.7 + zIIP Web Deliverable required for HiperDispatch.
2. Requires Compatibility Support which allows z/VM to IPL and operate on the z10 EC providing z9 functionality for the base OS and Guests.
3. z/VSE V3 runs in 31-bit mode only. It does not implement z/Architecture, and specifically does not implement 64-bit mode capabilities. z/VSE is designed to exploit select features of IBM System z9® and zSeries hardware.
4. z/VSE V4 is designed to exploit 64-bit real memory addressing, but will not support 64-bit virtual memory addressing.

Note: Please refer to the PSP buckets for the latest PTFs


Protecting your investment in IBM System z technology

 Designed to protect your investment by offering upgrades from z9 EC and z990 to the z10 EC
 Full upgradeability within the System z10 family
– Upgrade to Model E64 will require a planned outage
 Temporary or permanent growth when you need it
– New provisioning architecture

[Figure: upgrade paths from z990 and z9 EC to z10 EC models E12, E26, E40, E56 and E64; label: Concurrent Upgrade]


Increasing capacity, reducing outages and enhancing capabilities

 Five hardware models
 Faster Uni Processor (1)
 Up to 64 customer PUs
 36 CP Subcapacity Settings
 Star Book Interconnect
 Up to 1.5 TB memory
 Separate, fixed 16 GB HSA
 Large Page Support
 HiperDispatch
 Enhanced CPACF SHA 512, AES 192 and 256-bit keys
 Hardware Decimal Floating Point
 Just in Time Deployment for capacity offerings – permanent and temporary
 6.0 GB/s InfiniBand (IB) HCA to I/O interconnect
 SCSI IPL included in Base LIC
 OSA-Express3 10 Gb/s
 HiperSockets Layer 2 Support
 InfiniBand Coupling Links
 Capacity Provisioning Support
 Scheduled Outage Reduction
 Improved RAS
 FICON LX Fiber Quick Connect
 Power Monitoring

(1) Compared to z9 EC


IBM System z10 Enterprise Class
Innovative Enterprise Systems Solutions, Now and in the Future

IBM System z10 Enterprise Class enables clients to consolidate and virtualize their server environment, to reduce costs and simplify their IT infrastructure, with high performance, energy efficient green technologies, providing the most resilient and secure system to support business innovation and growth.


Managing energy consumption within the infrastructure


 ResourceLink™ provides tools to calculate server energy requirements before you
purchase a new system or an upgrade

 Offers a 14% improvement in performance per KWh over z9 EC

 Has energy efficiency monitoring tool


– Power and thermal information displayed via the System Activity Display (SAD)

 New IBM Systems Director Active Energy Manager (AEM) for Linux on System z V3.1
– Offers a single view of actual energy usage across multiple heterogeneous IBM platforms within the
infrastructure
– AEM V3.1 energy management data can be exploited by Tivoli® enterprise solutions such as IBM
Tivoli Monitoring, IBM Tivoli Usage and Accounting Manager, and IBM Tivoli OMEGAMON® XE on
z/OS
– AEM V3.1 is a key component of IBM’s Cool Blue™ portfolio within Project Big Green


System z PU Characterization
 The type of Processor Units (PUs) that can be ordered on z9:
– Central Processors (CPs)
• Provides processing capacity for z/Architecture™ and ESA/390 instruction sets
• Runs z/OS, z/VM, VSE/ESA, z/VSE, TPF/ESA, z/TPF, Linux for System z and Linux
under z/VM or Coupling Facility
– System z Application Assist Processors (zAAPs)
• Under z/OS, zAAPs are used for Java processing by the Java™ Virtual Machine
(JVM) as well as XML processing
– System z Integrated Information Processors (zIIPs)
• First exploited by DB2 Version 8 for z/OS (requires z/OS V1.7)
– Integrated Facility for Linux (IFL)
• Provides additional processing capacity for Linux workloads
– Internal Coupling Facility (ICF)
• Provides additional processing capacity for the execution of the Coupling Facility
Control Code (CFCC) in a CF LPAR
– System Assist Processors (SAPs)
• SAPs manage the start and ending of I/O operations for all Logical Partitions and all
attached I/O


System z Architecture

System Configuration


System Configuration

 A system consists of
– One or more CPUs
– Main storage accessible by all CPUs
– I/O devices
 CPUs plus main storage are referred to as Central
Electronic Complex (CEC)


System Configuration (continued…)

 With z900 and older machines, a system could be run in either of two modes
– Basic mode: one operating system owned the whole machine
– LPAR mode: multiple logical partitions (LPARs) could be
defined, each of them hosting an operating system
 With z990 and later, LPAR mode became mandatory
 The presence of LPAR mode is transparent to the software
running in that LPAR
 An operating system running in an LPAR controls only the
resources assigned to it (CPUs, storage, I/O)


System z Architecture

Architecture Modes


Architecture Modes: ESA/390 vs. z/Architecture

 System z supports two architecture modes: ESA/390 and z/Architecture
 In ESA/390 architecture, the system can run old (31-bit) operating
systems and applications (just as if you ran on an S/390 system
such as 9672 G6)
 For reasons of compatibility, a configuration is in ESA/390
architecture initially
 To switch to z/Architecture, the operating system issues a Signal
Processor (SIGP) instruction with order code “set architecture”
 All CPUs except the one that issues SIGP set architecture must be
in stopped state
 This order applies to all CPUs in the configuration, including the
one that issues SIGP
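
The precondition for the mode switch can be sketched as a small model. This is an illustrative simulation, not firmware: the `Cpu` class, the `sigp_set_architecture` function and its boolean result are assumptions made for the example. The real instruction reports its outcome through a condition code and status information.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    address: int
    stopped: bool
    mode: str = "ESA/390"  # a configuration starts out in ESA/390 mode


def sigp_set_architecture(cpus, issuer_address, new_mode="z/Architecture"):
    """Model the rule above: every CPU other than the issuer must be stopped."""
    others = [c for c in cpus if c.address != issuer_address]
    if any(not c.stopped for c in others):
        return False  # order rejected, no CPU changes mode
    for c in cpus:  # the order applies to all CPUs, the issuer included
        c.mode = new_mode
    return True
```

Returning a boolean keeps the sketch short. A closer model would return a condition code so that the caller can tell why the order was not accepted.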


z/Architecture Key Characteristics

 64-bit architecture, uses 64-bit storage addresses and 64-bit integer arithmetic instructions
 Superset of former ESA/390 architecture that provided 31-bit
storage addresses and 32-bit integer arithmetic instructions
 Incompatibilities between z/Architecture and ESA/390
usually affect only operating systems, not applications
– Different PSW formats
– Address translation process for virtual storage
– Layout of the assigned storage locations


System z Architecture

Register Sets


Register Sets

 16 general registers (0 – 15)
– Used for address generation/calculation as well as for integer arithmetic
(signed and unsigned)
– Each register has 64 bits, numbered from 0 to 63
– In ESA/390 architecture, only the low-order 32 bits are accessible (bits 32-63)
 16 floating-point registers (0 – 15)
– Used for binary, decimal, and hexadecimal floating-point operations
– Each register has 64 bits
– Extended precision floating-point arithmetic uses register pairs (e.g., 0/2, 1/3,
4/6, 5/7, etc.)
 Floating-point-control register
– Contains IEEE exception masks and flags, defines rounding mode
– This is a 32-bit register
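
Since bits 32-63 are described as the low-order half, bit 0 is the leftmost (most significant) bit of a register. The helper below extracts a bit field using that numbering. `field` is a hypothetical name chosen for this sketch, and the register value is arbitrary sample data.

```python
def field(value: int, first_bit: int, last_bit: int, width: int = 64) -> int:
    """Extract bits first_bit..last_bit (inclusive); bit 0 is the most significant bit."""
    length = last_bit - first_bit + 1
    shift = width - 1 - last_bit  # distance of the field from the right-hand end
    return (value >> shift) & ((1 << length) - 1)


gr = 0x0123456789ABCDEF        # sample contents of a 64-bit general register
low_word = field(gr, 32, 63)   # the part accessible in ESA/390 architecture
high_word = field(gr, 0, 31)   # the part added by z/Architecture
```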


Register Sets (continued…)

 16 access registers (0 – 15)
– Used to access address spaces
– Each register has 32 bits
 Prefix register
– Used to define the absolute addresses of the assigned
storage locations for a CPU
– This is a 32-bit register
– This register is accessible only by the operating system


Register Sets (continued…)

 16 control registers (0 – 15)
– Used only by the operating system to control interrupt
handling, virtual storage, tracing facilities, etc.
– Each register has 64 bits
– In ESA/390 architecture, only the low-order 32 bits are
accessible (bits 32-63)


Program Status Word (PSW)

 The PSW contains information required for the execution of the current program:
– Instruction address
– Addressing mode
– Condition code
– Interrupt masks
– Problem/supervisor state indicator (user/kernel mode)


Program Status Word – ESA/390 Architecture


Program Status Word – z/Architecture


Program Status Word – Fields

 R – enable program event recording (PER)
 T – enable dynamic address translation (virtual storage)
 I – enable I/O interrupts
 E – enable external interrupts
 Key – define storage protection key
 M – enable machine checks
 W – wait state
 P – problem state (0 = supervisor state)
 AS – address space (00 = primary space mode, 01 = secondary
space mode, 10 = access register mode, 11 = home space mode)
 CC – condition code


Program Status Word – Fields (continued…)

 Program mask
– Bit 20: enable fixed-point overflow exception
– Bit 21: enable decimal overflow exception
– Bit 22: enable hex floating-point exponent underflow exception
– Bit 23: enable hex floating-point significance exception
 EA – extended addressing (64-bit addressing mode, BA must also
be 1)
 BA – basic addressing (31-bit addressing mode, 0 = 24-bit
addressing mode)
 Instruction address
– It is stepped by the length of the current instruction
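
A sketch of how the fields could be pulled out of a 16-byte z/Architecture PSW. The bit positions for I, E, M, the program mask, EA and BA are the ones named on these slides. The positions used here for R, T, key, W, P, AS and CC, and the instruction address in bits 64-127, follow the published z/Architecture PSW layout and should be checked against the Principles of Operation. `decode_psw` is a hypothetical helper, and the sample PSW is made up for illustration.

```python
def decode_psw(psw: bytes) -> dict:
    """Decode a 128-bit z/Architecture PSW; bit 0 is the leftmost bit."""
    if len(psw) != 16:
        raise ValueError("a z/Architecture PSW is 16 bytes long")
    value = int.from_bytes(psw, "big")  # big-endian, as on the machine itself

    def bits(first: int, last: int) -> int:
        return (value >> (127 - last)) & ((1 << (last - first + 1)) - 1)

    return {
        "R": bits(1, 1), "T": bits(5, 5), "I": bits(6, 6), "E": bits(7, 7),
        "key": bits(8, 11), "M": bits(13, 13), "W": bits(14, 14), "P": bits(15, 15),
        "AS": bits(16, 17), "CC": bits(18, 19), "program_mask": bits(20, 23),
        "EA": bits(31, 31), "BA": bits(32, 32),
        "instruction_address": bits(64, 127),
    }


# Sample: DAT on, I/O and external interrupts enabled, 64-bit addressing, address 0x1000.
example = bytes.fromhex("0700000180000000" "0000000000001000")
fields = decode_psw(example)
```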


Problem State (User Mode)

 In problem state, the program must not execute privileged instructions
– No access to control registers
– No access to timers
– No access to storage keys
– Access only to “uncritical” parts of the PSW
– In general: No access to architecture facilities that are vital to
the system as a whole
 If authorized, a program in problem state may execute
certain semiprivileged instructions (e.g. PROGRAM CALL)
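
A toy model of the check made before a privileged instruction runs. The names in `PRIVILEGED` are a small illustrative subset of the privileged instructions, and the exception class is an assumption of this sketch. On the real machine the attempt causes a program interrupt (privileged-operation exception) rather than a language-level exception.

```python
# Illustrative subset only; the architecture defines many more privileged instructions.
PRIVILEGED = {"LOAD PSW", "SET CPU TIMER", "LOAD CONTROL", "SET STORAGE KEY EXTENDED"}


class PrivilegedOperation(Exception):
    """Stands in for the program interrupt taken in problem state."""


def execute(instruction: str, psw_p: int) -> str:
    """psw_p is the P bit of the PSW: 1 = problem state, 0 = supervisor state."""
    if psw_p == 1 and instruction in PRIVILEGED:
        raise PrivilegedOperation(instruction)
    return f"executed {instruction}"
```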


Supervisor State (Kernel Mode)

 In supervisor state, a program may exploit all architecture facilities
 Transition from problem to supervisor state
– Via an interrupt (by loading a new PSW that
indicates supervisor state)
 Transition from supervisor to problem state
– Via the privileged LOAD PSW [EXTENDED]
instruction


System z Architecture

Storage


Storage

 First of all: this is a BIG ENDIAN machine (like System p, unlike Intel)
 Storage addresses are byte addresses
 Types of address:
– Virtual: translated by dynamic address translation (DAT) to
real addresses
– Real: translated to absolute addresses using the prefix register
– Absolute: the address that results after the prefix register has been applied
– Logical: the address seen by the program (this can either be a
virtual or a real address)
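
The real-to-absolute step (prefixing) can be sketched as follows. The slides do not state the block size. The sketch assumes the z/Architecture convention of an 8 KB prefix area, in which real addresses 0-8191 and the block designated by the prefix register are swapped and all other addresses pass through unchanged. `real_to_absolute` is a hypothetical helper name.

```python
PREFIX_AREA = 0x2000  # 8 KB; assumed z/Architecture prefix-area size


def real_to_absolute(real: int, prefix: int) -> int:
    """Apply prefixing; prefix is the 8 KB-aligned address held in the prefix register."""
    if prefix % PREFIX_AREA != 0:
        raise ValueError("prefix must be aligned on an 8 KB boundary")
    if real < PREFIX_AREA:                     # this CPU's assigned storage locations
        return prefix + real
    if prefix <= real < prefix + PREFIX_AREA:  # reverse mapping onto absolute 0-8191
        return real - prefix
    return real                                # every other address is unchanged
```

With a separate prefix value per CPU, each CPU sees its own copy of the assigned storage locations at real address 0 while sharing the rest of main storage.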


Integral Boundaries


Alignment Requirements

 Instructions are always aligned on halfword boundaries
 Operands of non-privileged instructions normally
do not have to be aligned on an integral boundary
– Exception: Compare and Swap
 Operands of privileged instructions normally must
be aligned on an integral boundary
– Example: Set CPU Timer
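
An alignment check is a single modulo operation. The sketch assumes the usual definition of an integral boundary: a field of 2, 4, 8 or 16 bytes is aligned when its address is a multiple of its length. Both function names are illustrative.

```python
def on_integral_boundary(address: int, length: int) -> bool:
    """True if a field of `length` bytes (a power of two) starts on its integral boundary."""
    if length <= 0 or length & (length - 1):
        raise ValueError("length must be a power of two")
    return address % length == 0


def valid_instruction_address(address: int) -> bool:
    """Instructions must start on a halfword (2-byte) boundary."""
    return on_integral_boundary(address, 2)
```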


Storage Addressing Modes

 There are three addressing modes:

PSW.31 (EA)   PSW.32 (BA)   Addressing mode   Address range
0             0             24-bit            16 MB
0             1             31-bit            2 GB
1             0             Invalid           n/a
1             1             64-bit            16 EB (exabytes)

 The instructions SAM24, SAM31, and SAM64 can be used to switch the addressing mode
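
The EA/BA combinations translate directly into a lookup. The sketch below also truncates an address to the width of the current mode. Treating truncation as a plain bit mask is a simplification made for this example, and both function names are illustrative.

```python
def addressing_mode(ea: int, ba: int) -> int:
    """Return the address width in bits selected by PSW bits 31 (EA) and 32 (BA)."""
    modes = {(0, 0): 24, (0, 1): 31, (1, 1): 64}
    if (ea, ba) not in modes:
        raise ValueError("EA=1 with BA=0 is an invalid combination")
    return modes[(ea, ba)]


def effective_address(address: int, ea: int, ba: int) -> int:
    """Truncate an address to the width of the current addressing mode."""
    return address & ((1 << addressing_mode(ea, ba)) - 1)
```

The address ranges follow from the widths: 2**24 bytes is 16 MB, 2**31 bytes is 2 GB, and 2**64 bytes is 16 EB.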


System z Architecture

Interrupts


Interrupts

 There are six classes of interrupts:
– Supervisor call
– Program
– Machine check
– External
– Input/output
– Restart
 Each class is associated with a pair of old/new
PSWs in the assigned storage locations


Interrupt Action

 An interrupt comprises the following steps:
– The current PSW is stored in the assigned location
named “old PSW”, e.g. “program old PSW” for a program
interrupt
– Additional information, such as an interrupt code, is
stored
– A new PSW is loaded from an assigned location, e.g. the
“program new PSW”
 No registers are saved by the machine; saving them is left to software


PSWs in the Assigned Storage Locations


Real addresses     Contents
0x0120 – 0x012F    Restart old PSW
0x0130 – 0x013F    External old PSW
0x0140 – 0x014F    Supervisor-call old PSW
0x0150 – 0x015F    Program old PSW
0x0160 – 0x016F    Machine-check old PSW
0x0170 – 0x017F    I/O old PSW
0x01A0 – 0x01AF    Restart new PSW
0x01B0 – 0x01BF    External new PSW
0x01C0 – 0x01CF    Supervisor-call new PSW
0x01D0 – 0x01DF    Program new PSW
0x01E0 – 0x01EF    Machine-check new PSW
0x01F0 – 0x01FF    I/O new PSW
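
The PSW swap performed by an interrupt can be sketched with the addresses from this table. The Python model below is a simplification under stated assumptions: storage is a plain bytearray, a PSW is treated as an opaque 16-byte value, and the storing of the interrupt code is left out. All names are hypothetical.

```python
# Assigned storage locations of the old PSWs (from the table above).
OLD_PSW = {
    "restart": 0x120, "external": 0x130, "supervisor_call": 0x140,
    "program": 0x150, "machine_check": 0x160, "io": 0x170,
}
# Each new PSW lies 0x80 bytes above the old PSW of the same class.
NEW_PSW = {cls: addr + 0x80 for cls, addr in OLD_PSW.items()}


def take_interrupt(storage: bytearray, current_psw: bytes, cls: str) -> bytes:
    """Store the current PSW as old PSW and return the new PSW of that class."""
    if len(current_psw) != 16:
        raise ValueError("a PSW is modeled as 16 bytes")
    old = OLD_PSW[cls]
    storage[old:old + 16] = current_psw      # step 1: save current PSW
    new = NEW_PSW[cls]
    return bytes(storage[new:new + 16])      # step 3: load new PSW
```

Note that the model touches no registers, which matches the point that saving registers is left to software.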


Interrupt Masking

 I/O interrupts are masked by PSW.6


– Additionally, there are subclass mask bits in control register 6:
• I/O interruption subclass (ISC) 0
• …
• I/O interruption subclass (ISC) 7
 External interrupts are masked by PSW.7
– Additionally, there are subclass mask bits in control register 0:
• CPU timer
• Clock comparator
• …
 Repressible machine checks are masked by PSW.13
– Additionally, there are subclass mask bits in control register 14:
• Channel report
• Degradation
• …


Interrupt Masking (continued…)

 Some program interrupts are masked by PSW.20 – PSW.23:


– Fixed-point overflow
– Decimal overflow
– Hexadecimal-floating-point exponent underflow
– Hexadecimal-floating-point significance

 All other program interrupts cannot be masked


Interrupt Masking (continued…)

 There is no masking for
– Supervisor calls (the whole purpose of the SUPERVISOR CALL
instruction is to invoke the supervisor via the interrupt mechanism)
– Restart (this is a manual operation, available from the support
element (SE), intended to restart the operating system)
– Exigent machine checks (if PSW.13 is 0, the CPU enters the
check-stop state). An example of such a situation is
instruction-processing damage.


System z Architecture

Timing Facilities


Timing Facilities

 Time-of-day (TOD) clock (one for the system)


 Clock comparator (one per CPU)
 CPU timer (one per CPU)


Time-of-Day Clock

 Format of the TOD clock:

 Value 0 defined as January 1, 1900, 00:00:00 UTC


 Overflows on September 17, 2042, 23:53 UTC
 STORE CLOCK (STCK) instruction returns first 64 bits
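
The epoch and the overflow date can be checked with a short conversion. This Python sketch assumes that bit 51 of the 64-bit TOD value (bit 0 being the leftmost bit) steps once per microsecond, and it ignores leap seconds; the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone

# TOD value 0 corresponds to January 1, 1900, 00:00:00 UTC.
TOD_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def tod_to_datetime(tod: int) -> datetime:
    """Convert a 64-bit TOD clock value to a UTC datetime (no leap seconds)."""
    if not 0 <= tod < 2**64:
        raise ValueError("TOD value must fit in 64 bits")
    microseconds = tod >> 12   # bit 51 has weight 2**12 in a 64-bit value
    return TOD_EPOCH + timedelta(microseconds=microseconds)
```

Passing the largest 64-bit value, 2**64 - 1, gives the last moment before the clock wraps.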


Time-of-Day Clock (continued…)

 STORE CLOCK EXTENDED (STCKE) returns 128 bits:

 First 8 bits (“zeros”) will be used after September 17, 2042 (good until
year 38,400 A.D.)
 Bit 59 equals 1 microsecond
 Bit 111 equals 222 x 10^-24 seconds
 Programmable field used to generate unique value (set by SET CLOCK
PROGRAMMABLE FIELD instruction, SCKPF)


Clock Comparator

 Same format as TOD clock (bit 51 = 1 microsecond)
 Continuously compared to TOD clock
 If TOD clock passes the clock comparator value,
an external interrupt is generated
 Good for real-time measurements
 Set with SET CLOCK COMPARATOR (SCKC), read
with STORE CLOCK COMPARATOR (STCKC)


CPU Timer

 Same format as TOD clock (bit 51 = 1 microsecond), but bit 0 is
the sign bit
 Stepped backwards
 When the CPU timer is < 0, an external interrupt is
generated
 Stopped when CPU stops
 Set with SET CPU TIMER (SPT), read with STORE
CPU TIMER (STPT)


Timer Stepping

 On a real running system, TOD clock and CPU timer are stepped
at the same rate.
 On a virtual system, the CPU timer is stepped only when the
virtual system is dispatched, so it may appear to step slower
than the TOD clock.

Assume (on z/VM) the clock comparator is set to TOD clock + 5
seconds and the CPU timer is set to 2 seconds. You do not know
in advance which of the two expires first.


System z Architecture

Instructions


Instruction Set

 S/360 (November 1970) had 143 instructions
 System z10 GA1 Principles of Operation (February 2008) describes 834
instructions
 Groups:
– General instructions
– Decimal instructions
– Floating-point instructions
• General
• Hexadecimal
• Binary
• Decimal
– Control instructions (privileged)
– I/O instructions (privileged)

 For comparison: System p also has over 700 instructions, half of them
being vector-related operations


Instruction Set (continued…)

 Additionally, some instructions are not described in the
Principles of Operation:
– Coupling instructions (Parallel Sysplex)
– Queued direct I/O (QDIO), HiperSockets
– Dynamic I/O configuration (HCD)
– Service Call (SCLP)
– Instructions associated with logical partitioning (LPAR)
and virtualization (z/VM)
 These instructions are all privileged


General Instructions (1)


 Load into/store from general registers
– LOAD, INSERT, STORE
– 8, 16, 32, and 64 bit
 Binary integer arithmetic
– ADD, SUBTRACT, MULTIPLY, DIVIDE
– 16, 32, and 64 bit
– Signed (two's complement)
• From 0 to 2,147,483,647: 0 – 0x7FFFFFFF (32 bit)
• From -1 to -2,147,483,648: 0xFFFFFFFF – 0x80000000
– Unsigned (LOGICAL)
• From 0 to 4,294,967,295: 0 – 0xFFFFFFFF (32 bit)
 Shift/rotate operations
– SHIFT (left and right, signed arithmetic and logical), ROTATE (left)
– 32 and 64 bit
 Bitwise logical operations
– AND, OR, EXCLUSIVE OR
– 8, 16, 32, and 64 bit, 1 – 256 bytes
– Combined ROTATE THEN INSERT / AND / OR / XOR


General Instructions (2)

 Comparisons
– Signed arithmetic and unsigned (LOGICAL)
– 16, 32, and 64 bit
 Branches (absolute and relative, looping instructions)
 Subroutine linkage
– BRANCH AND SAVE [AND SET MODE]
 Bit testing
– TEST UNDER MASK
– FIND LEFTMOST ONE
 Storage-to-storage copy and compare
– MOVE [LONG [EXTENDED]]
– COMPARE LOGICAL [LONG [EXTENDED]]
 Conversion to/from packed decimal format
– CONVERT TO BINARY / DECIMAL


General Instructions (3)

 String processing
– TRANSLATE, TRANSLATE AND TEST
– SEARCH STRING, COMPARE LOGICAL STRING (for null-terminated strings in C)
 Conversion little / big endian
– LOAD REVERSED, STORE REVERSED
 Checksum generation
 Sorting
– COMPARE AND FORM CODEWORD, UPDATE TREE
 Encryption
– CIPHER MESSAGE
 Atomic updates, locking
– COMPARE AND SWAP, PERFORM LOCKED OPERATION

 Note: Some possibly long-running instructions are interruptible


Decimal Instructions

 Use packed decimal format


– One decimal digit per 4 bits (hex digit) encoded 0 – 9
– 1 – 31 decimal digits (always odd number)
– Rightmost hex digit is sign (A, C, E, F mean plus, B, D mean minus)
– e. g., 0x123C is decimal +123, 0x400D is decimal -400
– Used a lot in commercial applications (COBOL, PL/I)
 Integer arithmetic (+, -, *, /)
 Comparison
 Data validation
 Conversion to printable format (EBCDIC)
– EDIT, EDIT AND MARK
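
The packed decimal rules above (one digit per hex digit, sign in the rightmost hex digit) can be made concrete with a small decoder. This is a Python sketch for illustration; the function name is an assumption, and real programs would use the decimal instructions instead.

```python
def unpack_decimal(data: bytes) -> int:
    """Decode a packed decimal field into a Python integer."""
    hex_digits = data.hex().upper()
    body, sign = hex_digits[:-1], hex_digits[-1]
    # Every position except the last must hold a decimal digit 0 - 9.
    if not body.isdigit() or sign not in "ABCDEF":
        raise ValueError("invalid packed decimal data")
    value = int(body)
    # B and D mean minus; A, C, E, and F mean plus.
    return -value if sign in "BD" else value
```

With the two values from the slide, unpack_decimal(bytes.fromhex("123C")) gives 123 and unpack_decimal(bytes.fromhex("400D")) gives -400.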


Floating-Point Instructions
 Three precisions
– Short (32-bit), ca. 6 – 7 decimal digits
– Long (64-bit), ca. 16 – 17 decimal digits
– Extended (128-bit), ca. 33 – 34 decimal digits
 Binary floating-point format (BFP) characteristics
– IEEE 754 standard
– Number range (long precision): ~ 4.9 x 10^-324 ≤ M ≤ ~ 1.8 x 10^308
– Supports infinity and NaN (not-a-number)
 Decimal floating-point format (DFP) characteristics
– Introduced with System z9 GA3 (millicode), new: hardware implementation on System z10
– IEEE P754 standard
– Number range (long precision): 1 x 10^-398 ≤ M ≤ (10^16 – 1) x 10^369
– Supports infinity and NaN
– Exact representation of decimal fractions (e.g. 0.1)
 Hexadecimal floating-point (HFP) characteristics
– System z unique, introduced in S/360
– Number range (any precision): ~ 5.4 x 10^-79 ≤ M ≤ ~ 7.2 x 10^75
– Does not support infinity and NaN


Floating-Point Instructions (continued…)

 One set of floating-point registers used for all formats
 Load into/store from floating-point registers
 Floating-point arithmetic (+, -, *, /, square root)
 Comparison
 Conversion to/from binary integer
 Conversion between BFP/DFP/HFP formats


Control and I/O Instructions

 All instructions are privileged, i.e., available to the operating
system only
 Operate on vital system resources
– Control registers
– Program status word
– Storage keys
– DAT tables
– Timers
– Real storage
– etc.
 Even read access to resources such as CPU timer is privileged


Instruction Formats

 An instruction is 2, 4, or 6 bytes in length


 It is always halfword-aligned
 The first two bits of an instruction determine its length:
– 00: 2 bytes
– 01 and 10: 4 bytes
– 11: 6 bytes
 The operation code consists of 8, 12, or 16 bits
– An 8-bit operation code always occupies the first byte
– A 12-bit operation code occupies the first byte and the second half of
the second byte
– A 16-bit operation code occupies the first byte and either the second
byte or the last byte of a 6-byte instruction
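
The length rule can be expressed in a few lines. The Python sketch below decodes the instruction length from the first byte; the function name is illustrative.

```python
def instruction_length(first_byte: int) -> int:
    """Return the instruction length in bytes from its first byte."""
    top_two_bits = (first_byte >> 6) & 0b11
    # 00 -> 2 bytes, 01 and 10 -> 4 bytes, 11 -> 6 bytes
    return {0b00: 2, 0b01: 4, 0b10: 4, 0b11: 6}[top_two_bits]
```

For instance, a first byte of 0x1A (top bits 00) gives 2, 0x5A (01) and 0xB9 (10) give 4, and 0xE3 (11) gives 6.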


Examples for 8-, 12-, and 16-bit operation codes


Instruction Execution
 After the instruction has been fetched, the instruction address in
the PSW is incremented by the instruction length (2, 4, or 6 bytes)
 An instruction ends in one of the following ways:
– Completion and partial completion
• This is the normal end
– Suppression
• Instruction is not executed, but PSW instruction address has been updated
• Occurs with most of the program interrupt conditions (program old PSW points
after the failing instruction)
– Nullification
• Instruction is not executed and PSW instruction address has not been updated
• Occurs with program interrupt conditions that pertain to DAT handling (i.e. the
instruction will be executed after the page has been loaded)
• Occurs also for some interruptible instructions (MVCL, CLCL) that resume at the
point of interrupt
– Termination
• Instruction may have been partially executed and PSW instruction address has
been updated


Instruction Format RR

 AR R1,R2

adds the contents of general register 2 to the contents of
general register 1. Both operands are 32-bit and signed. In
storage, the instruction appears as 0x1A 12 (operation code,
operands).


Instruction Format RRE

 AGR R1,R2

adds the contents of general register 2 to the contents of
general register 1. Both operands are 64-bit and signed. In
storage, the instruction appears as 0xB908 0012.

The third byte of the instruction is ignored.


Instruction Format RX

 A R1,4(R2,R3)

adds the contents of general register 2 (the index register) to
the contents of general register 3 (the base register), then
adds the value 4 (the displacement). The sum is the address of a
fullword in storage whose contents are added to the contents of
general register 1. In storage, the instruction appears as
0x5A 1 2 3 004.


Instruction Format RX (continued…)

 B2 is called the base register. If it is 0, the value 0 is used,
not the contents of general register 0.
 D2 is called the displacement. It has 12 bits and thus ranges
from 0 to 4095.
 X2 is called the index register. If it is 0, the value 0 is used,
not the contents of general register 0.
 The actual length of the resulting address (24, 31, or 64 bits)
is controlled by the addressing mode in the PSW.
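
The RX encoding and the address arithmetic can be sketched as follows. This Python example is a simplified model: the register file is a list of 16 integers, and the function names and the mode_bits parameter are assumptions made for illustration.

```python
def encode_rx(opcode: int, r1: int, x2: int, b2: int, d2: int) -> bytes:
    """Assemble an RX instruction: opcode, R1, X2, B2, 12-bit D2."""
    if not 0 <= d2 <= 4095:
        raise ValueError("displacement must be 0..4095")
    word = (opcode << 24) | (r1 << 20) | (x2 << 16) | (b2 << 12) | d2
    return word.to_bytes(4, "big")


def rx_address(regs: list, x2: int, b2: int, d2: int, mode_bits: int = 64) -> int:
    """Form the operand address D2(X2,B2) in the given addressing mode."""
    index = regs[x2] if x2 != 0 else 0   # register number 0 means "no index"
    base = regs[b2] if b2 != 0 else 0    # register number 0 means "no base"
    # The addressing mode (24, 31, or 64 bits) limits the address length.
    return (base + index + d2) & ((1 << mode_bits) - 1)
```

encode_rx(0x5A, 1, 2, 3, 4) reproduces the bytes 5A 12 30 04 of A R1,4(R2,R3). With neither base nor index register, rx_address returns the displacement alone.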


Instruction Format RX (continued…)

 Nearly all instructions that access storage use the D2(B2)
operand.
 Many of them allow for the specification of an index register
in the form D2(X2,B2). It is typically used to access array
elements in a loop.
 If neither base nor index register is used, the displacement
alone forms the address. This is used to access the assigned
storage locations.


Instruction Format RXY (long displacement)

 The 12-bit displacement was considered insufficient, so
z900 (GA3) introduced a 20-bit signed displacement:

DH2 is the high-order part of the displacement, so

AG R1,-4(R2,R3)

is assembled as 0xE3 1 2 3 FFC FF 08

(-4 = 0xFF FFC)
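
The split of the 20-bit displacement into a low part (DL2, 12 bits) and a high part (DH2, 8 bits) can be sketched in Python. The function name is illustrative, and the 16-bit opcode is passed as one number whose two bytes end up at the start and at the end of the instruction.

```python
def encode_rxy(opcode: int, r1: int, x2: int, b2: int, disp: int) -> bytes:
    """Assemble an RXY instruction with a signed 20-bit displacement."""
    if not -(1 << 19) <= disp < (1 << 19):
        raise ValueError("displacement must fit in 20 signed bits")
    d20 = disp & 0xFFFFF                 # two's complement, 20 bits
    dl, dh = d20 & 0xFFF, d20 >> 12      # low 12 bits, high 8 bits
    op_first, op_last = opcode >> 8, opcode & 0xFF
    value = ((op_first << 40) | (r1 << 36) | (x2 << 32) | (b2 << 28)
             | (dl << 16) | (dh << 8) | op_last)
    return value.to_bytes(6, "big")
```

encode_rxy(0xE308, 1, 2, 3, -4) yields the bytes E3 12 3F FC FF 08, matching the assembled form of AG R1,-4(R2,R3).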



Instruction Format RXY (long displacement) cont…

 For many of the old ESA/390 32-bit instructions a
long-displacement variation was created:

AY R1,-4(R2,R3)

is assembled as 0xE3 1 2 3 FFC FF 5A


Instruction Format RI

 AGHI R1,-1

adds -1 to general register 1 (64-bit):

The instruction is assembled as 0xA7 1 B FFFF


All 47 variations of
ADD (z10)


Some Hints to Understand Mnemonics


 A = add (in its basic form 32 bits, signed) as first letter only
 B = binary floating-point (IEEE format)
 C = carry (a 1 into high-order part of an integer)
 D = double precision floating-point (64 bits)
 E = single precision floating-point (32 bits)
 F = full word (32 bits, signed)
 G = “grand” (64 bits, signed)
 H = half word (16 bits, signed)
 I = immediate (operand is part of instruction, not somewhere else in
storage)
 L = logical (unsigned integer)
 P = packed decimal format (one decimal digit per hex digit)
 R = register (as second operand)
 U = unnormalized single precision hex floating-point (32 bits)
 W = unnormalized double precision hex floating-point (64 bits)
 X = extended precision floating-point (128 bits)
 Y = long-displacement operand in storage (20 bits displacement
instead of 12 bits)


The condition code in the PSW


 The condition code (cc) in the PSW is set by many, but not all
instructions
 If an instruction does not set the cc, it remains unchanged
 Examples of instructions that set the cc:
– Add
– Subtract
– Compare
 Examples of instructions that do not set the cc:
– Load
– Store
– Multiply
– Divide
 There are conditional branching instructions that act depending on
the condition code


Setting the condition code in the PSW

 The meaning of the cc is specific to each instruction, but there
are common rules
– cc = 0 means “equal operands” for comparisons, “zero
result” for add and subtract operations
– cc = 1 means “first operand is low” for comparisons,
“negative result” for add and subtract operations
– cc = 2 means “first operand is high” for comparisons,
“positive result” for add and subtract operations
– cc = 3 is used for various purposes, e.g. to indicate
overflow


Compare Instructions

 The compare instructions set the condition code in the PSW:
– cc = 0 Operands are equal
– cc = 1 First operand is low
– cc = 2 First operand is high

 A subsequent branch instruction is used to act on that
condition code


Compare Instructions: Register-to-Register

Mnemonic   R1                R2                Remarks
CR         32-bit signed     32-bit signed
CGR        64-bit signed     64-bit signed
CGFR       64-bit signed     32-bit signed     2nd operand is implicitly sign-extended
CLR        32-bit unsigned   32-bit unsigned
CLGR       64-bit unsigned   64-bit unsigned
CLGFR      64-bit unsigned   32-bit unsigned


Compare Instructions: Register-to-Storage

Mnemonic   R1              D2(X2,B2)       Remarks
C, CY      32-bit signed   32-bit signed
CG         64-bit signed   64-bit signed
CGF        64-bit signed   32-bit signed   2nd operand is implicitly sign-extended
CH, CHY    32-bit signed   16-bit signed   2nd operand is implicitly sign-extended


Compare Instructions: Register-to-Immediate


Mnemonic   R1                I2                Remarks
CHI        32-bit signed     16-bit signed     2nd operand is implicitly sign-extended
CFI        32-bit signed     32-bit signed
CLFI       32-bit unsigned   32-bit unsigned
CGHI       64-bit signed     16-bit signed     2nd operand is implicitly sign-extended
CGFI       64-bit signed     32-bit signed     2nd operand is implicitly sign-extended
CLGFI      64-bit unsigned   32-bit unsigned


Compare Instructions: Storage-to-Immediate

Mnemonic    D1(B1)           I2               Remarks
CLI, CLIY   8-bit unsigned   8-bit unsigned


Compare Instructions: Storage-to-Storage


Mnemonic   First Operand       Second Operand      Remarks
CLC        1 - 256 bytes       1 - 256 bytes       Same length for both operands
CLCL       0 – (16M-1) bytes   0 – (16M-1) bytes   Shorter operand is padded; interruptible instruction
CLCLE      0 – (16E-1) bytes   0 – (16E-1) bytes   Shorter operand is padded; interruptible instruction


BRANCH ON CONDITION

 M1 field used as 4-bit mask, each bit corresponds to one condition code (0-3)
 BC[R] 8,… means branch on cc 0
 Any combination of mask bits is possible, e. g. 2+1 means branch on cc 2 or 3
 Branch target is contained in a register (R2) or is computed from B2, D2, and X2
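
The mask test can be written in one line. In this Python sketch (the function name is illustrative), mask bit values 8, 4, 2, and 1 select condition codes 0, 1, 2, and 3 respectively.

```python
def branch_taken(mask: int, cc: int) -> bool:
    """Return True if a 4-bit branch mask selects the given condition code."""
    if not (0 <= mask <= 15 and 0 <= cc <= 3):
        raise ValueError("mask must be 0..15 and cc must be 0..3")
    # Mask value 8 selects cc 0, 4 selects cc 1, 2 selects cc 2, 1 selects cc 3.
    return bool(mask & (8 >> cc))
```

A mask of 8 branches only on cc 0, a mask of 2+1 branches on cc 2 or cc 3, and a mask of 15 always branches.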


BRANCH RELATIVE ON CONDITION [LONG]

 M1 field used as 4-bit mask, each bit corresponds to one condition code (0-3)
 BRC[L] 8,… means branch if cc is 0
 Any combination of mask bits is possible, e. g. 2+1 means cc 2 or cc 3
 Branch target is relative, assembled as number of halfwords forward / backward
(+/- 64KB for BRC, +/- 4GB for BRCL)
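
The relative target and its reach follow from the halfword count. The Python sketch below assumes the offset is applied to the address of the branch instruction itself; the function name and the bits parameter (16 for BRC, 32 for BRCL) are illustrative.

```python
def relative_branch_target(branch_address: int, halfwords: int, bits: int = 16) -> int:
    """Compute the target of a relative branch from a signed halfword count."""
    limit = 1 << (bits - 1)
    if not -limit <= halfwords < limit:
        raise ValueError("offset does not fit the immediate field")
    # The immediate field counts halfwords, so it is doubled to get bytes.
    return branch_address + 2 * halfwords
```

With 16 bits the byte offset ranges from -65536 to +65534 (about +/- 64KB); with 32 bits it ranges from -2^32 to 2^32 - 2 (about +/- 4GB).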


Common Branch Mnemonics (High-Level Assembler)

Mnemonics         Short for    Branch [relative] on…    Condition code
B[R]Z, B[R]E      B[R]C 8,     zero, equal              0
B[R]M, B[R]L      B[R]C 4,     minus, mixed, low        1
B[R]P, B[R]H      B[R]C 2,     plus, high               2
B[R]O             B[R]C 1,     ones, overflow           3
B[R]NZ, B[R]NE    B[R]C 7,     non-zero, not equal      1, 2, or 3
B[RU]             B[R]C 15,    unconditionally          0, 1, 2, or 3

Note: J for “jump” may be used instead of BR for “branch relative”


Subroutine Linkage
 BRANCH AND SAVE stores the current instruction address
from the PSW (i.e., the address of the instruction following
BASR) in R1 and branches to the second-operand address:

 Special case: BASR R1,0 saves the instruction address but
does not branch (very common to establish a program’s base
register)

Subroutine Linkage (continued…)

 BRANCH RELATIVE AND SAVE [LONG] works like BAS, but
the branch target is relative (i.e. assembled as the number of
halfwords to jump forward / backward):


TEST UNDER MASK: Storage-to-Immediate

 A single byte in storage is tested against an 8-bit mask


– cc = 0 Selected bits are all 0
– cc = 1 Selected bits are mixed 0 and 1
– cc = 2 --
– cc = 3 Selected bits are all 1


TEST UNDER MASK: Register-to-Immediate


Mnemonic R1
TMHH bits 0 – 15
TMHL bits 16 – 31
TM[L]H bits 32 – 47
TM[L]L bits 48 – 63

 16 bits in a register are tested against a 16-bit mask


– cc = 0 Selected bits are all 0
– cc = 1 Selected bits are mixed; leftmost selected bit is 0
– cc = 2 Selected bits are mixed; leftmost selected bit is 1
– cc = 3 Selected bits are all 1
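
The four condition codes of the register form can be modeled directly. This Python sketch takes the 16 tested bits and the 16-bit mask as integers; the function name is illustrative.

```python
def tm_condition_code(value: int, mask: int) -> int:
    """Return the cc for testing 16 register bits against a 16-bit mask."""
    selected = value & mask
    if selected == 0:
        return 0                      # selected bits are all 0
    if selected == mask:
        return 3                      # selected bits are all 1
    # Mixed: the cc depends on the leftmost bit selected by the mask.
    leftmost = 1 << (mask.bit_length() - 1)
    return 2 if value & leftmost else 1
```

For a mask of 0x8001, the value 0x8000 gives cc 2 (leftmost selected bit is 1) and the value 0x0001 gives cc 1 (leftmost selected bit is 0).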


EXECUTE Instruction

 Purpose:
– Execute an instruction outside the instruction stream
– Execute an instruction whose length field is known only at run-time
 Format:

 Function:
– The low-order byte of R1 is (conceptually) ORed into the second byte of the
target instruction located at the second-operand address, then the target
instruction is executed
– If R1 is specified as 0, the target instruction is not modified before execution


EXECUTE Instruction (continued…)

 Example:
A MOVE instruction (MVC) copies 1 – 256 bytes from the second
operand to the first operand

 With EXECUTE, the length field L can be supplied from a register
(note: length must be reduced by 1 because length field contains
0 – 255)
 New with z10: EXECUTE RELATIVE LONG (does not require a base
register)
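
The conceptual OR into the second byte can be sketched as follows. In this Python model the target instruction is a byte string, and the function and parameter names are assumptions for illustration. The example target is an MVC (operation code 0xD2) whose length byte is assembled as 0.

```python
def execute_target(target: bytes, r1_number: int, r1_value: int) -> bytes:
    """Return the instruction that EXECUTE would effectively run."""
    if r1_number == 0:
        return bytes(target)          # R1 = 0: target is not modified
    modified = bytearray(target)
    modified[1] |= r1_value & 0xFF    # OR low-order byte of R1 into byte 2
    return bytes(modified)


# MVC with length byte 0; moving 10 bytes needs the value 10 - 1 in the register.
mvc_template = bytes.fromhex("D20020003000")
effective = execute_target(mvc_template, 1, 10 - 1)
```

The template in storage is left unchanged; only the executed copy carries the length.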


Long-running Instructions

 Some instructions may process large amounts of data


– Excessive time required to complete execution
– Instruction accesses many pages of storage
• Not all pages are necessarily available at the same time in a virtual
storage environment
 Some of these instructions are interruptible
 Other instructions indicate partial completion with cc 3


Long-running Instruction: MOVE LONG EXTENDED

 On completion
– The addresses are incremented
– The lengths are decremented
– If completion was partial, cc 3 is set


Partial Completion

 MOVE LONG EXTENDED (and a few others) may perform partial
completion:
– The registers that describe their operands are updated
– cc 3 is set to indicate partial completion
– Software has to branch back to the instruction on cc 3:

LOOP     MVCLE R2,R4,0
         BRO   LOOP

– Interrupts may occur after each execution
– Software has the option to do something else before resuming the
instruction
– This approach is used for all new long-running instructions
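
The retry loop can be modeled in Python. This is a deliberately simplified sketch: both operands are assumed to have the same length (so no padding occurs), the amount moved per execution is a made-up constant, and all names are hypothetical. It only illustrates how updated registers and cc 3 drive the loop.

```python
def mvcle_step(storage: bytearray, regs: dict, unit: int = 4096) -> int:
    """Model one execution of a simplified MVCLE; return the condition code."""
    n = min(unit, regs["length"])
    src, dst = regs["src"], regs["dst"]
    storage[dst:dst + n] = storage[src:src + n]
    regs["src"] += n              # the addresses are incremented
    regs["dst"] += n
    regs["length"] -= n           # the length is decremented
    return 3 if regs["length"] > 0 else 0   # cc 3 = partial completion


def move_long(storage: bytearray, regs: dict) -> int:
    """Counterpart of the MVCLE/BRO pair: re-execute while cc is 3."""
    executions = 1
    while mvcle_step(storage, regs) == 3:
        executions += 1
    return executions
```

Moving 10000 bytes in units of 4096 takes three executions in this model, and between any two of them the program could be interrupted or do other work.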


Interruptible Instructions

 Some instructions (e.g. MOVE LONG, COMPARE LOGICAL LONG) are
interruptible
– Such an instruction may consist of several units of operation
– After each unit of operation, the registers that describe their
operands are updated
– If the instruction is not completed, the instruction address in the
PSW is stepped back to the beginning of the instruction
– The instruction may now be resumed at the point of interruption
– This approach is transparent to application software


System z Architecture

Storage Protection


Storage Protection Mechanisms

 Key-controlled protection
 Low-address protection
 DAT protection (formerly called page protection)
 Access-list-controlled protection


Storage Keys

 A storage key is associated with each 4K-byte block of
real storage. The storage key has the following format:

 ACC = access-control bits


 F = fetch-protection bit
 R = reference bit
 C = change bit


Storage Key
Protection Rules


Storage-Protection-Override

 Allows storing into areas with storage key 9 even though
the PSW key is neither 0 nor 9
 Useful when an application (using PSW key 9) and an
operating system component (using a PSW key
other than 0 or 9) share some read/write data
 Controlled by CR0.39
 For details, read the Principles of Operation,
chapter 3


Fetch-Protection-Override

 Overrides fetch protection for effective addresses 0 – 2047
– Applied before virtual-to-real address translation
 Allows fetch protection to be applied to effective
addresses 2048 – 4095 while overriding it for the
other part of that page
 Controlled by CR0.38
 For details, read the Principles of Operation,
chapter 3


Low-Address Protection

 Prevents storing into effective addresses 0 – 511 and
4096 – 4607 (first 512 bytes of the first and second 4K block)
– Applied before virtual-to-real address translation
 Used as additional safety net to prevent corruption
of vital system information in the assigned storage
locations, such as the new and old PSWs
 Controlled by CR0.35
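
The protected ranges can be checked with a simple predicate. This Python sketch models only the address test for store accesses; the function name and the boolean parameter standing for CR0.35 are illustrative.

```python
def low_address_protected(address: int, cr0_bit35: bool) -> bool:
    """Return True if a store to this effective address would be blocked."""
    in_first_block = 0 <= address <= 511        # first 512 bytes of block 0
    in_second_block = 4096 <= address <= 4607   # first 512 bytes of block 1
    return bool(cr0_bit35 and (in_first_block or in_second_block))
```

For example, a store to 0x150 (the program old PSW) is blocked when CR0.35 is one, while a store to address 512 is not affected.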


DAT Protection

 Prevents a page (4K) from being altered, does not provide fetch protection
 Used, for instance, to implement the POSIX fork() function
 Enabled by setting the P bit in a page-table entry (PTE):

 Can be applied to an entire segment (1M) by setting the P bit in the
segment-table entry:


DAT Protection (continued…)

 New with z10:


– Can be applied to an entire region (2G, 4T, 8P) by setting
the P bit in the region-table entry


System z Architecture

Virtual Storage


Virtual Storage

 Virtual storage is created by multi-level lookup
tables in storage that describe the virtual-to-real
address translation. This process is called
dynamic address translation (DAT).
 The base pointers to the top-level tables are kept
in control registers (CR1, CR7, and CR13) or are
described by access registers.
 Multiple address spaces may be accessed at the
same time.


Virtual Storage – Address Space Control


Several PSW bits control which address space is used:


Virtual Storage (ESA/390)

 In ESA/390 architecture, an address space in virtual storage
is up to 2G in size, consisting of
– 1M segments
– 4K pages

 A virtual address consists of the following components
(segment index, page index, byte index):


Virtual Storage (ESA/390) continued…

 Two-level lookup tables are used. The segment-table
designation (STD) points to the segment table:

 It is contained in CR1, CR7, CR13, or described by an
access register
 For historical reasons, CR0 bits 8 – 12 must be set to 10110


Virtual Storage (ESA/390) continued…

 The segment-table entries, in turn, point to page tables:

 Finally, the page-table entries contain the real address of the page
frames:


Virtual Storage (z/Architecture)

 z/Architecture extends the address size from 31 to 64 bits:

 In z/Architecture, an address space in virtual storage is up to
16E in size, consisting of
– 2G regions
– 1M segments
– 4K pages


Virtual Storage (z/Architecture) continued…

 The new region index consists of 33 bits:

 It is subdivided into 3 groups of 11 bits each: region-first,
region-second, and region-third index.


Virtual Storage (z/Architecture) continued…

 Since a virtual address now consists of six parts (region-first,
region-second, region-third, segment, page, byte), a five-level
table lookup would be required:
– Costly in terms of performance
– Currently unnecessary, because even a huge 4T address space
can be handled with a region-third index (region-first and
region-second will always be zero)

 Therefore, z/Architecture DAT allows software to specify at which
level translation is to start (region-first, region-second,
region-third, or segment)


Maximum Address Space Sizes

DAT table used         Maximum address space size
Segment table          2 Gigabytes (2^31)
Region-third table     4 Terabytes (2^42)
Region-second table    8 Petabytes (2^53)
Region-first table     16 Exabytes (2^64)
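
The relation between address size and the table at which translation must start can be sketched by splitting a 64-bit virtual address into its index fields. The Python example below uses the field widths implied by the sizes above (three 11-bit region indexes, an 11-bit segment index for 1M segments within a 2G region, an 8-bit page index for 4K pages within a segment, and a 12-bit byte index); the function names are illustrative.

```python
def split_virtual_address(va: int) -> dict:
    """Split a 64-bit virtual address into its DAT index fields."""
    return {
        "RFX": (va >> 53) & 0x7FF,   # region-first index, 11 bits
        "RSX": (va >> 42) & 0x7FF,   # region-second index, 11 bits
        "RTX": (va >> 31) & 0x7FF,   # region-third index, 11 bits
        "SX": (va >> 20) & 0x7FF,    # segment index, 11 bits
        "PX": (va >> 12) & 0xFF,     # page index, 8 bits
        "BX": va & 0xFFF,            # byte index, 12 bits
    }


def starting_table(highest_address: int) -> str:
    """Name the highest-level table needed to reach the given address."""
    fields = split_virtual_address(highest_address)
    if fields["RFX"]:
        return "region-first"
    if fields["RSX"]:
        return "region-second"
    if fields["RTX"]:
        return "region-third"
    return "segment"
```

An address space that ends at 2^42 - 1 (4 Terabytes) has zero region-first and region-second indexes throughout, so translation can start at the region-third table.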


z/Architecture Address Space Control Element

 An address space control element (ASCE)
describes an address space, e.g.:

 It is contained in CR1, CR7, CR13, or described by
an access register


z/Architecture DAT Table Entries

 The region-table entries all have the same format:


z/Architecture DAT Table Entries (continued…)

 The segment-table entries look as follows:

 And finally here are the page-table entries:


z/Architecture Dynamic Address Translation


[Figure: z/Architecture dynamic address translation. The 64-bit virtual
address is divided into RFX (11 bits), RSX (11 bits), RTX (11 bits),
SX (11 bits), PX (8 bits), and BX (12 bits). Starting from the address
space control element, the RF table, RS table, RT table, segment table,
and page table are looked up in turn: RF-table entries hold an RST
origin, RS-table entries an RTT origin, RT-table entries an ST origin,
segment-table entries a PT origin, and page-table entries the page
address used to form the 64-bit real address.]

Legend:
RF = region-first
RS = region-second
RT = region-third
S = segment
P = page

Translation-Lookaside Buffer

 To speed up the translation process, translations
are cached in a TLB
 Each CPU has its own TLB
 The TLB is filled in automatically by hardware as
the program executes
 When DAT tables are changed, TLB entries must
be purged


Translation-Lookaside Buffer (continued…)

• The TLB may be purged completely by the instruction PURGE TLB (PTLB)
– This clears the TLB of the CPU issuing PTLB
• Selected entries may be purged by
– INVALIDATE PAGE TABLE ENTRY (IPTE)
– INVALIDATE DAT TABLE ENTRY (IDTE)
– COMPARE AND SWAP AND PURGE (CSP, CSPG)
– The entry is updated in the DAT tables as well as in the TLBs of all CPUs in the configuration


System z10 DAT Enhancements

• With the growing size of address spaces, page tables become huge
– Remember: one page-table entry covering 4K bytes requires 8 bytes. E.g., to map 4G, you need page tables occupying 8M of storage
• With z10, a segment-table entry may either point to
– a page table (as before) or
– directly to a 1M area in absolute storage, thus eliminating the need for 256 page-table entries (256 * 8 = 2K)
• This feature is sometimes referred to as “large page”.


Enhanced DAT: No Page Table Required


[Figure: translation with enhanced DAT. The region-first, region-second, and region-third tables are walked as before, but the segment-table entry holds a segment address instead of a page-table origin. The segment address together with the 20-bit byte index forms the 64-bit absolute address.]

Field   Width     Selects an entry in
RFX     11 bits   Region-first (RF) table
RSX     11 bits   Region-second (RS) table
RTX     11 bits   Region-third (RT) table
SX      11 bits   Segment (S) table
BX      20 bits   (byte within the 1M segment)
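
The sizes quoted for the z10 enhancement can be checked with a few lines of arithmetic. The sketch below assumes only the numbers given on the slides (4K pages, 8-byte page-table entries, 1M segments, a 20-bit byte index); the helper names are invented for this example.

```python
PAGE_SIZE = 4 * 1024          # one page-table entry covers 4K bytes
SEGMENT_SIZE = 1024 * 1024    # one segment-table entry covers 1M bytes
PTE_SIZE = 8                  # bytes per page-table entry


def page_table_bytes(mapped_bytes):
    """Storage needed for the page-table entries that map mapped_bytes."""
    return (mapped_bytes // PAGE_SIZE) * PTE_SIZE


def split_large_page_address(vaddr):
    """Split an address into (upper index bits, 20-bit byte index)."""
    return vaddr >> 20, vaddr & (SEGMENT_SIZE - 1)


entries_per_segment = SEGMENT_SIZE // PAGE_SIZE        # 256 entries
saved_per_segment = page_table_bytes(SEGMENT_SIZE)     # 2K bytes per 1M segment
tables_for_4g = page_table_bytes(4 * 1024 ** 3)        # 8M bytes to map 4G
```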


The “Invalid” Bit

• Address translation is prevented when the I (invalid) bit is set in a region-, segment-, or page-table entry
• Access to such an address results in a page-, segment-, or region-translation exception (program interrupt)
• The instruction or, in case of an interruptible instruction, the unit of operation is nullified, i.e., the program old PSW points to the instruction, not after it


Access-Register Mode

• Purpose: access multiple address spaces at a time
• Enabled when PSW.5 = 1 (DAT) and PSW bits 16-17 are set to 01
• The access register is implicitly the one with the same number as the base register (NOT the index register):
LG R1,8(R3,R4)
AR4 is used to translate the logical address 8(R3,R4) to a real address
• If the instruction is written
LG R1,8(R4,R3)
AR3 is used to translate the logical address 8(R4,R3) to a real address (same logical address as above)
• If the access register contains 0, primary-space address translation is used (via CR1)
• If the access register contains 1, secondary-space address translation is used (via CR7)
• Base register 0 means no access register is to be used (not the contents of AR0)
• For further details, check the Principles of Operation, “Access-Register Translation” in chapter 5


Virtual = Real (V=R) Address Spaces

• Sometimes it is desirable to access real storage while running virtual (R = 1): [figure not reproduced]
• This real-space designation is held in CR1, CR7, or CR13, or is described by an access register


System z Architecture

Multiprocessing


Multiprocessing

• Largest z900 has 16 CPUs (model 216)
• Largest z990 has 32 CPUs (model D32)
• Largest z9 EC has 54 CPUs (model S54)
• Largest z10 has 64 CPUs (model E64)
• Shared main storage
• Prefix area unique for each CPU
• Shared data must be updated with special interlocked-update instructions:
– TEST AND SET (antique)
– COMPARE AND SWAP [AND STORE]
– PERFORM LOCKED OPERATION
• CPUs communicate via the SIGNAL PROCESSOR instruction (SIGP) and external interrupts


Assigned Storage Locations and Prefixing

• Other names: fixed storage locations, low core, prefix area
• Assigned storage locations are used to exchange information between system and software, e.g. to handle interrupts
• They are located in the address range 0 – 0x1FFF (real)
• Each CPU has to manage its own information, so for each CPU this address range must be mapped to a different place in absolute storage
• The prefix register of each CPU specifies this absolute address
• The operating system has to ensure that each CPU has a different prefix register value


Mapping of Real to Absolute Storage (Prefixing)


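[Figure: the real storage of three CPUs shown next to absolute storage (marked at 0, 0x2000, 0x4000, 0x6000). Each CPU's real range 0 – 0x1FFF maps to a different 8K block of absolute storage.]

CPU     Prefix    Real 0 – 0x1FFF maps to absolute
CPU 0   0         0 – 0x1FFF
CPU 1   0x2000    0x2000 – 0x3FFF
CPU 2   0x4000    0x4000 – 0x5FFF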


Example: Prefix Register = 0x4000


 If the prefix register contains 0x4000, then
– Real addresses 0 – 0x1FFF translate to 0x4000 – 0x5FFF absolute
– Real addresses 0x4000 – 0x5FFF translate to 0 – 0x1FFF absolute
– In all other cases, real and absolute addresses are identical
 Essentially, the ranges 0 – 0x1FFF and the range of the prefix register
are swapped
 Each CPU has a separate area addressable as 0 – 0x1FFF
 If the prefix register is 0, prefixing has no effect
 The prefix register must specify an address < 2G
 In ESA/390 architecture, the prefix register contains an address on a
4K boundary and applies only to 4K (0 – 0xFFF)
 For a detailed mapping, check the Principles of Operation, “Assigned
Storage Locations” in chapter 3
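
The swap rule described above fits in a few lines. The following is a minimal sketch of z/Architecture prefixing with an 8K prefix area, as described on this slide; the function name is chosen for this example only.

```python
PREFIX_AREA = 0x2000  # 8K: the range 0 - 0x1FFF


def real_to_absolute(real, prefix):
    """Apply prefixing to a real address, given the prefix register value."""
    if prefix % PREFIX_AREA != 0:
        raise ValueError("prefix must be on an 8K boundary")
    if 0 <= real < PREFIX_AREA:
        return real + prefix          # low core goes to the prefix area
    if prefix <= real < prefix + PREFIX_AREA:
        return real - prefix          # the prefix area goes to absolute 0 - 0x1FFF
    return real                       # everything else is unchanged


low = real_to_absolute(0x0010, 0x4000)     # inside 0 - 0x1FFF
high = real_to_absolute(0x4010, 0x4000)    # inside the prefix area
other = real_to_absolute(0x2010, 0x4000)   # neither range
```

Applying the function twice returns the original address, which reflects that the two ranges are simply swapped.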


Compare and Swap

 Purpose
– Safe update of data in storage shared between several routines
running on different processors
– May also be needed on uniprocessor when routines run in different
threads/tasks
 Concept
– Fetch the original value from storage
– Create a new value, based on the original value
– Compare the private copy of the original value to the current value in
storage – if it has not changed since the original fetch, replace it with
the new value
– However, if the value in storage has changed in between, retry the
process
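
The fetch, modify, compare, retry cycle described above can be modelled in software. The sketch below is not how the hardware works: the class `Word` is a hypothetical stand-in for a storage location, and a Python lock plays the role of the interlocked update that the CS instruction provides.

```python
import threading


class Word:
    """Software model of a storage word that supports compare-and-swap."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()   # stands in for the hardware interlock

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        """Return (True, expected) on success, else (False, current value)."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True, expected
            return False, self._value


def increment(word):
    """Add 1 to word, retrying whenever another thread got in between."""
    old = word.load()                    # fetch the original value
    while True:
        swapped, seen = word.compare_and_swap(old, old + 1)
        if swapped:
            return old + 1
        old = seen                       # value changed: retry with the current one


def worker(word, count):
    for _ in range(count):
        increment(word)


counter = Word(0)
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all threads finish, no increment has been lost, because an update only succeeds when the value is still the one the thread fetched.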


Compare and Swap (continued…)

Example 1: increment a counter

         L     R0,COUNT       fetch original value
LOOP     LR    R1,R0          create copy
         AHI   R1,1           increment copy
         CS    R0,R1,COUNT    if R0 = COUNT then
*                               save R1 in COUNT
         BRNZ  LOOP           else
*                               load COUNT into R0
         ...
COUNT    DS    F              full word


Compare and Swap (continued…)


Example 2: obtain a lock (non-preferred solution)

         LHI   R1,-1          create lock (0xFFFFFFFF)
LOOP     LHI   R0,0           create expected lock (0)
         CS    R0,R1,LOCK     if LOCK = 0, store R1 into LOCK
         BRNZ  LOOP           didn’t get the lock, try again
*
DONE     ...                  proceed
         ...
LOCK     DS    F              full word
 Disadvantage: CS locks the cache line exclusively upfront
 When the lock is held by another CPU, the cache line will bounce back and
forth between the CPU owning the lock and the CPU requesting the lock
 Solution: If it is likely that CS will not succeed (cc 1), do a trial fetch first


Compare and Swap (continued…)


Example 2: obtain a lock (preferred solution)

         LHI   R1,-1          create lock (0xFFFFFFFF)
         LHI   R0,0           create expected lock (0)
LOOP     CS    R0,R1,LOCK     if LOCK = 0, store R1 into LOCK
         BRZ   DONE           we got the lock
*                             else, LOCK is loaded into R0
TEST     LT    R0,LOCK        try again (simple fetch)
         BRNZ  TEST           still locked
         BRU   LOOP           no longer locked, try CS again
*
DONE     ...                  proceed
         ...
LOCK     DS    F              full word


Variations of Compare and Swap

Mnemonic   Operand length          Notes
CS         Word (32 bits)
CSY        Word                    Long displacement
CSG        Double word (64 bits)   Single register
CDS        Double word             Lower halves of even/odd register pair
CDSY       Double word             Long displacement
CDSG       Quad word (128 bits)    Even/odd register pair

Note: The operand must be aligned according to its length (word, double word, or quad word boundary)


More Atomic Instructions

• COMPARE AND SWAP AND STORE (CSST)
– Performs a separate store after the COMPARE AND SWAP function, all within one instruction
– For details, see the Principles of Operation, chapter 7
• PERFORM LOCKED OPERATION (PLO)
– For complex locking protocols (up to 8 operands)
– Locks only against other PLO locks
– For details, see the Principles of Operation, chapter 7


Signal Processor (SIGP)

• SIGP issues an order to
– One CPU (possibly itself)
– All CPUs (set architecture only)
• SIGP R1,R3,D2(B2)
– R1: returned status
– R1+1: parameter
– R3: target CPU
– D2(B2): order code


Start-up Sequence for Multiprocessing Operating System

• Start on CPU 0; the other CPUs are still stopped
• Issue SIGP order “set architecture” to switch the CEC from ESA/390 to z/Architecture
• Allocate prefix areas (assigned storage locations) for the other CPUs
• Initialize the restart new PSWs in the prefix areas of the other CPUs
• Issue SIGP order “set prefix” to each of the other CPUs
• Issue SIGP order “restart” to each of the other CPUs


System z Architecture

Input/Output


The Channel Subsystem

• Directs the flow of information between I/O devices and main storage
• Relieves the CPU of the task of communicating directly with I/O devices
• Allows CPU processing to proceed concurrently with I/O operations
• In general, runs on the system assist processors (SAPs)


Input/Output
[Figure: channel paths connect to control units, and each control unit drives one or more devices.]


Channel Paths, Control Units, and Devices

• A channel path is a separate processor that controls data transfer between main storage and devices
– Data to be read from/written to an external medium
– Control information
• A control unit understands in detail commands such as
– Disk head positioning
– Page eject (printers)
– Tape rewind
• A device is driven by a control unit


Input/Output

• One control unit may serve multiple devices
• Up to 8 channel paths may be attached to one control unit
– Better performance
– Improved reliability, availability, serviceability


Device Numbers

• Devices (such as disks, printers, screens) are assigned device numbers from 0 to 0xFFFF by the system administrator.
• These numbers are used in communication between a system operator and the operating system.
• These numbers do not need to be contiguous in a configuration.


I/O Configuration Program (IOCP)

*
IMID RESOURCE PART=((LP1,1))
*
******************************************************************
* FICON Channel *
******************************************************************
CH21 CHPID PCHID=521,PATH=(CSS(0),21),TYPE=FC, *
PART=LP1
CNTLUNIT CUNUMBR=555,UNITADD=((10,16)),UNIT=FC, *
PATH=((CSS(0),21))
IODEVICE ADDRESS=(5210,16),CUNUMBR=555,UNIT=FC,UNITADD=10,*
STADET=Y


Subchannels

• The IOCP (I/O configuration program) reads the I/O configuration input deck. It defines the relationship between
– Channel paths
– Control units and
– Devices
• IOCP assigns a subchannel number to each device. The numbers range from 0 to 0xFFFF and are contiguous.


Dynamic I/O Configuration

• The I/O configuration may be changed under software control by adding or deleting
– Channel paths
– Control units
– Devices
• The software front-end is known as HCD (Hardware Configuration Definition)


Subchannels (continued…)

• I/O instructions address devices via the subchannel number. Therefore, the operating system must determine the relationship between device number and subchannel number.
• The operating system uses the STORE SUBCHANNEL instruction (STSCH) to retrieve the subchannel-information block (SCHIB).
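
How an operating system might build its device-number lookup can be sketched as a loop over subchannel numbers. Everything here is a model: `SUBCHANNELS` is a made-up configuration, `store_subchannel` is a hypothetical stand-in for STSCH that returns only the device number, and a real SCHIB carries many more fields.

```python
# Hypothetical configuration: subchannel number -> device number.
SUBCHANNELS = {0: 0x5210, 1: 0x5211, 2: 0x0A80}


def store_subchannel(subchannel_number):
    """Model of STSCH: return a SCHIB-like dict, or None if no such subchannel."""
    device = SUBCHANNELS.get(subchannel_number)
    if device is None:
        return None
    return {"device_number": device}


def build_device_map(max_subchannel=0xFFFF):
    """Scan all subchannel numbers and map device number -> subchannel number."""
    device_to_subchannel = {}
    for sch in range(max_subchannel + 1):
        schib = store_subchannel(sch)
        if schib is None:
            continue                      # nothing configured at this number
        device_to_subchannel[schib["device_number"]] = sch
    return device_to_subchannel


device_map = build_device_map()
```

With such a table, a device number typed by the operator can be translated to the subchannel number that the I/O instructions require.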


Subchannel-Information Block

• The SCHIB is 13 words long. Its first seven words (words 0 – 6) comprise the path-management-control word (PMCW): [figure not reproduced]


Modification of Subchannels

• Parts of the SCHIB may be changed by the MODIFY SUBCHANNEL (MSCH) instruction, especially
– The I/O interruption subclass (ISC)
– The enable-bit (E)
• MSCH is usually issued with a SCHIB that was previously retrieved by a STSCH instruction and then modified


Flow of an I/O Operation


Where     Action (in order)
CPU       Software issues SSCH
CPU       Passes subchannel to SAP
CPU       Returns cc 0 to software (operating system)
SAP       Selects channel path (1 out of up to 8)
SAP       Passes subchannel to channel
Channel   Executes channel program
Channel   Signals completion to SAP
SAP       Processes channel’s response
SAP       Offers I/O interrupt to all CPUs
CPU       One CPU in the configuration takes the I/O interrupt

Starting an I/O Operation

• The START SUBCHANNEL instruction initiates an I/O operation: [figure not reproduced]
• General register 1 addresses the I/O device
• The second operand addresses the operation-request block (ORB), which describes details of the operation to be performed


The Subsystem-Identification Word

• General register 1 contains the subsystem-identification word
– 16-bit subchannel number
– 2-bit subchannel set ID (essentially an extension to the subchannel number)
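
As an illustration, the two fields can be packed into a 32-bit word. The exact bit positions are not given on the slide; this sketch assumes the layout from the Principles of Operation (subchannel set ID in bits 13 – 14, a constant one in bit 15, subchannel number in bits 16 – 31, counting from the most significant bit), and the function names are invented for the example.

```python
def make_sid(subchannel_number, ssid=0):
    """Build a subsystem-identification word (assumed layout, see lead-in)."""
    if not 0 <= subchannel_number <= 0xFFFF:
        raise ValueError("subchannel number must fit in 16 bits")
    if not 0 <= ssid <= 3:
        raise ValueError("subchannel set ID must fit in 2 bits")
    # Bit 15 (counting from the left) is always one: value 1 << 16.
    return (ssid << 17) | (1 << 16) | subchannel_number


def split_sid(sid):
    """Return (ssid, subchannel_number) from a subsystem-identification word."""
    return (sid >> 17) & 0x3, sid & 0xFFFF


sid_set0 = make_sid(0x0005)           # subchannel 5 in subchannel set 0
sid_set1 = make_sid(0x0005, ssid=1)   # subchannel 5 in subchannel set 1
```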


The Operation-Request Block (ORB)

• The second operand of START SUBCHANNEL points to the operation-request block: [figure not reproduced]


The Operation-Request Block (continued…)

• The interruption parameter is saved by the channel subsystem and returned unchanged at the time of the I/O interrupt. It is normally used by the operating system to store the address of its own device control block.
• The logical path mask (LPM) can be used by the operating system to route an I/O operation to specific channel paths. Normally, the operating system sets this field to 0xFF to allow the channel subsystem to use any configured path.
• The channel program address points to the first channel-command word (CCW).


Channel-Command Word (CCW) Formats


CCW Command Codes

• CCW command code types:
– Write (normal data output)
– Read (normal data input)
– Read backward (good for tapes)
– Control (e.g. to position disk head)
– Sense (retrieve reason for unit check)
– Sense ID (retrieve device type, e.g. 3390)
– Transfer in channel (branch in channel program)
• The particular command code is device-specific


CCW Flags

 CD: chain data (use the same CCW command on the next
CCW)
 CC: chain command (0 = this is the last CCW)
 SLI: suppress incorrect length indication
 SKP: do not read data on input
 PCI: program-controlled interruption (generate intermediate
I/O interrupt when this CCW is fetched)
 IDA: indirect data addressing
 S: suspend channel program without generating I/O interrupt


Indirect-Data-Address Words (IDAWs)

• IDAWs are used to describe I/O buffers that are contiguous in virtual storage and thus usually scattered in absolute storage.
– With z/Architecture, they provide the only means to address data buffers above 2G.
• IDAWs are used when the IDA flag is set in the CCW. The address in the CCW points to the list of IDAWs.
• Format-1 IDAWs are 32 bits long and cover up to 2K of storage
• Format-2 IDAWs are 64 bits long and cover up to 4K of storage
• When ORB word 1.14 (H) is 1, format-2 IDAWs are used. When ORB word 1.15 (T) is also 1, the IDAWs cover a 2K area; otherwise, they cover 4K


Indirect-Data-Address Words (IDAWs) continued…


 Example: Assume a data buffer is located at virtual addresses
0x10FFD – 0x13001, its size is 0x2005 bytes, also, assume format-2
IDAWs for 4K are used (H = 1, T = 0).

Virtual address   Absolute address in IDAW   Bytes transferred by IDAW
0x10FFD           0x200FFD                   0x0003
0x11000           0x2F2000                   0x1000
0x12000           0x3E8000                   0x1000
0x13000           0x1D7000                   0x0002
Total                                        0x2005
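
The byte counts in the table follow from cutting the buffer at every 4K boundary. The sketch below reproduces that split for the virtual side only; the absolute addresses in the table come from the operating system's page tables and are not computed here. The function name is an assumption made for this example.

```python
def idaw_chunks(start, length, block=0x1000):
    """Split a buffer into pieces that do not cross a block boundary.

    Returns a list of (virtual address, byte count), one per IDAW.
    """
    chunks = []
    addr = start
    remaining = length
    while remaining > 0:
        room = block - (addr % block)     # bytes left before the next boundary
        size = min(room, remaining)
        chunks.append((addr, size))
        addr += size
        remaining -= size
    return chunks


# The example from the slide: buffer at 0x10FFD, 0x2005 bytes, 4K IDAWs.
chunks = idaw_chunks(0x10FFD, 0x2005)
```

Only the first IDAW may start in the middle of a block; every later one starts on a boundary, which is why the middle entries transfer a full 0x1000 bytes.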


Conclusion of I/O Operations

 When the last CCW has been processed, the channel signals
ending status to the channel subsystem
 The subchannel is now made status-pending and an
interruption condition is generated for its I/O-interruption
subclass (ISC)
 An I/O interrupt is said to be floating, i.e. it is “offered” to all
CPUs that are enabled for this ISC
 If more than one CPU is enabled for this I/O interrupt, only
one will actually take it
 As an alternative, the CPU may poll for an interrupt using the
TEST PENDING INTERRUPTION (TPI) instruction


Conclusion of I/O Operations (continued…)

• At the time of the I/O interrupt, the following information is stored in the assigned storage locations: [figure not reproduced]
• This information is also stored by a TPI 0 instruction


Conclusion of I/O Operations (continued…)

• The subsystem-identification word has the same format as general register 1 at the time of START SUBCHANNEL
• The I/O-interruption parameter is taken unchanged from the ORB
• The interruption-identification word has the following format: [figure not reproduced]
• A = 0 means the interrupt is for a subchannel (A = 1 means adapter interrupt, “thin interrupt”)
• ISC is the I/O-interruption subclass of the subchannel


Conclusion of I/O Operations (continued…)

• After the I/O interrupt, the subchannel is still status-pending
• The operating system now issues TEST SUBCHANNEL (TSCH) to retrieve the subchannel status: [figure not reproduced]


Conclusion of I/O Operations (continued…)


Conclusion of I/O Operations (continued…)

 Important:
– Normally, an I/O operation ends with device status
channel end/device end (CE/DE)
– Sometimes, channel end status comes as separate
status before device end
– Unit check is signaled for a variety of exceptional
conditions
– Normally, the subchannel status is 0


z990 Extensions to I/O Architecture

• Multiple channel subsystems (MCSS)
• 1 to 4 logical channel subsystems (LCSS)
– Up to 256 channel paths per logical partition (LPAR)
– Up to 63K devices (subchannels) per LPAR
– An LCSS may be shared by multiple LPARs
• Allows much larger configurations (e.g. to consolidate two z900 systems on a single z990)


z990 – I/O Configuration Support


[Figure: z990 (2084) processor with two logical channel subsystems. Each logical channel subsystem has its own partitions, its own 63K subchannels, and its own channels.]


z990 – I/O Configuration Support


[Figure: zSeries 2084 logical channel subsystem with 63K subchannels. In DASD configurations, the 63K devices are divided between base addresses and alias addresses.]


z9 Extensions to I/O Architecture

• Multiple subchannel sets (MSS)
• The problem: z990 had only one subchannel set
– 64,512 subchannels (1,024 reserved for internal use)
• This became a problem for large installations because of PAV (parallel access volumes):
– With PAV, a single disk drive often consumes four subchannels (base address plus three aliases)
• The solution: z9 supports two subchannel sets
– 65,280 subchannels in set 0 (256 reserved for internal use)
– 65,535 subchannels in set 1
• z/OS 1.7 exploits the second subchannel set to access alias addresses of parallel access volumes (PAV)
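
The subchannel counts above, and the “63K” and “63.75K” labels used in the figures, can be checked with simple arithmetic. The volume counts at the end are an illustration only, assuming every disk uses exactly one base and three aliases as in the slide's example.

```python
SUBCHANNEL_NUMBERS = 0x10000                  # 16-bit subchannel number: 65,536 values

z990_set = SUBCHANNEL_NUMBERS - 1024          # 64,512 usable subchannels
z9_set0 = SUBCHANNEL_NUMBERS - 256            # 65,280 usable subchannels
z9_set1 = 65535                               # as stated on the slide

z990_in_k = z990_set / 1024                   # 63.0  -> "63K"
z9_set0_in_k = z9_set0 / 1024                 # 63.75 -> "63.75K"

# Illustration: disks with one base address and three aliases each.
ALIASES_PER_DISK = 3
z990_disks = z990_set // (1 + ALIASES_PER_DISK)        # bases and aliases share one set
z9_disks = min(z9_set0, z9_set1 // ALIASES_PER_DISK)   # bases in set 0, aliases in set 1
```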


z9 – I/O Configuration Growth - Multiple Subchannel Sets (MSS)

[Figure: System z9 (2094) logical channel subsystem. The number of subchannels is increased to 128K: 63.75K in one set, used for base addresses, and 64K in the other, used for alias addresses.]


z9 – Multiple Subchannel Sets per LCSS

[Figure: z9 (2094) processor with two logical channel subsystems. Each has its own partitions and channels, plus two subchannel sets: SS-0 with 63.75K subchannels (bases) and SS-1 with 64K subchannels (aliases).]


Modified Indirect-Data-Address Words (MIDAWs)

• MIDAWs were invented to increase I/O throughput by reducing the number of CCWs required
• Like IDAWs, MIDAWs specify a list of data areas in absolute storage
• Unlike IDAWs, MIDAWs are used to describe I/O buffers that are scattered in virtual storage:
– MIDAWs never have to specify aligned addresses (the areas must not cross 4K boundaries, though)
– Each MIDAW has its own count field
• MIDAWs were introduced with the IBM System z9


Summary of FICON Express4 channel MIDAWs measurements


MIDAWs continued…

• Use of MIDAWs must be enabled in the ORB (D-bit, word 1, bit 25).
• The MIDA flag must be set in the flags of the CCW (bit 15 in format-1 CCW). When it is used, neither the IDA nor the SKIP flag may be set.
• The CCW points to a list of MIDAWs (MIDAL).
• The MIDAL begins on a quadword boundary. It must not cross a 4K boundary.


MIDAW Layout

• Bit 40: Last MIDAW in list
• Bit 41: Skip data transfer on read
• Bit 42: Force channel-program-check on data transfer


System z Architecture

Initial Program Loading (IPL)


Initial Program Loading (IPL)

• Boots the operating system from an I/O device
• Initiated manually at the support element (SE) or hardware management console (HMC)
• System operator specifies device number
• System operator may specify an IPL parameter


Initial Program Loading (IPL) continued…

• During IPL, a format-0 CCW is stored at address 0: [figure not reproduced]
• Command code 0x02 means Read IPL for all devices.
• This CCW reads 24 bytes from an external device, overwriting itself (data address is 0). CC and SLI are set.
• Since the CCW address has been incremented from 0 to 8, the next CCW is now taken from the data read. It will read more data into storage.
• The third CCW is designated to branch to the additional data read (which may contain more CCWs).
• At the end of this CCW chain, the initial program status word (PSW) is loaded from address 0.
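
The initial CCW described above can be written out byte by byte. The field layout is not shown in this text; the sketch assumes the format-0 CCW layout from the Principles of Operation (1-byte command code, 3-byte data address, 1-byte flags, 1 unused byte, 2-byte count) and the flag values CC = 0x40 and SLI = 0x20. The helper name is made up for this example.

```python
import struct

# Flag bits in the CCW flag byte (assumed values, see lead-in).
CC = 0x40    # chain command
SLI = 0x20   # suppress incorrect length indication


def format0_ccw(command, data_address, flags, count):
    """Pack a format-0 CCW into 8 bytes (big-endian, assumed layout)."""
    if not 0 <= data_address < 1 << 24:
        raise ValueError("format-0 CCWs hold a 24-bit data address")
    return struct.pack(
        ">B3sBBH",
        command,
        data_address.to_bytes(3, "big"),
        flags,
        0,          # unused byte
        count,
    )


# Read IPL (0x02), data address 0, CC and SLI set, 24 bytes.
ipl_ccw = format0_ccw(0x02, 0, CC | SLI, 24)
```

The 24 bytes it reads are exactly three doublewords: the IPL PSW at address 0 and two further CCWs at addresses 8 and 16.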


IPL Process
[Figure: absolute storage, locations 0 to 24, before the first read. The initial IPL CCW, which reads 24 bytes, occupies locations 0 – 7; the address of the next CCW is 0.]


IPL Process
Absolute storage   Contents after the first read
16 – 23            IPL CCW2: may transfer control to CCWs just read
8 – 15             IPL CCW1: reads more data, CCWs
0 – 7              IPL PSW (ESA/390 mode)

The address of the next CCW advances from 8 to 16 to 24 as the chain is processed.


System z Architecture

Partitioning and Virtualization


Partitioning and Virtualization

 Purpose
– Run more than one operating system on a CEC
 Method
– Make each operating system think it owns a whole
machine
 Comes in two flavors
– LPAR (logical partitioning)
– z/VM (virtual machine)


LPAR

• A control program (LPAR hypervisor) manages logical partitions:
– Each partition owns a defined amount of physical storage
– Strictly no storage shared across partitions
– No virtual storage / paging done by LPAR hypervisor
– Zone relocation lets each partition start at address 0
– CPUs may be dedicated to a partition or may be shared by multiple partitions
– I/O channels may be dedicated to a partition or may be shared by multiple partitions (multiple image facility, MIF)
– Each LPAR has its own architecture mode (ESA/390 or z/Architecture)
• LPAR hypervisor is shipped with System z (considered part of firmware)
• Beginning with z990, the LPAR hypervisor is always loaded (no Basic Mode anymore)
• Separation of logical partitions is considered as good as having each partition on a separate CEC (Evaluation Assurance Level 5)


z/VM

• A control program (CP) creates a virtual machine for each user
– Each virtual machine has its own address space starting at 0
– This is virtual storage managed by CP (subject to paging)
– Each virtual machine has its own architecture mode (ESA/390 or z/Architecture)
– CP simulates one or multiple (virtual) CPUs per virtual machine
– CP simulates an I/O configuration for each virtual machine
  • Dedicated devices, e.g. console
  • Shared devices, e.g. minidisks (disks that are partitioned for many virtual machines), printers (spooling devices)
– CP simulates communication paths between virtual machines
  • Channel-to-channel adapter
  • Inter-user communication vehicle (IUCV)
  • Virtual LANs


Server Consolidation on System z


[Figure: server consolidation on System z. Workloads such as ERP, transaction processing (CICS, IMS with DL/I), DB2, business applications, Java and EJB applications on WebSphere, Siebel, e-commerce, cluster/parallel applications, and file/disk/print serving run on z/OS, on Linux, and on Linux under z/VM. These operating systems run on the PR/SM LPAR hypervisor on the System z platform.]


Interpretive Execution (SIE)

• Both LPAR and z/VM use the instruction START INTERPRETIVE EXECUTION (SIE) to run a logical partition or virtual machine, respectively
• The program issuing SIE is called host
• The program running under SIE is called guest
• The operand of the SIE instruction is a state description, which describes the guest
• One level of nesting (SIE under SIE) is supported, used when z/VM runs in a logical partition


History of SIE

• The SIE instruction was introduced with the 370/XA architecture (early 1980s)
• Invented based on the experiences with VM/370 on S/370 systems
• Not documented in the Principles of Operation
• Partially documented in SA22-7095 (public, 1985)
• Updated for z/Architecture


SIE State Description

 Guest PSW
 Guest CPU timer, clock comparator
 Guest epoch difference (for guest TOD clock)
 Guest control registers
 Guest general registers 14 and 15
 Guest prefix register


Host Program Responsibilities

 On SIE entry
– Load guest general registers 0 – 13
– Load guest floating-point registers, floating-point-control
register
– Load guest access registers
 On SIE exit
– Handle interception


Programmable SIE Exit Conditions

 Interception controls
– Certain instructions
– SVCs by SVC number
– LCTL for any set of control registers
 PSW enabled for I/O interrupts (PSW.6 = 1)
 PSW enabled for external interrupts (PSW.7 = 1)
 Stop request (triggered from another CPU)


SIE Exit Conditions

• An interception
– The state description is updated, an interception code is stored, and the host program resumes after the SIE instruction. Example: SSCH instruction
• A host interrupt
– E.g. an external or I/O interrupt, or a translation exception. In this case, the unit of operation is nullified and the old PSW points to the SIE instruction. No interception code is stored.


Guest Storage

• LPAR: Region-relocation mode (RRF mode)
– Guest storage is described by zone parameters; the state description refers to a zone number.
• z/VM: Pageable storage mode
– The primary address space describes guest storage (host CR 1). In addition, main storage origin and limit may restrict this storage further.


Storage Layout in LPAR Mode

• Absolute storage of an LPAR must be contiguous
• Activating and deactivating LPARs may lead to fragmentation of storage
• How can we assign all unused storage to a single LPAR?

[Figure: storage layout, from top to bottom: LPAR 3, LPAR 2, LPAR 1, Hardware System Area (HSA), LPAR hypervisor.]


Storage Layout in LPAR Mode (continued…)

• Another mapping of storage is introduced: absolute storage is mapped to physical storage
– Absolute storage size is larger than physical storage size (e.g. 2x)
– Mapping allows LPAR portions scattered in physical storage to appear contiguous in absolute storage
• With pre-planning, it is possible to extend the storage size of an LPAR on the fly


Absolute vs. Physical Storage

[Figure: absolute storage and physical storage side by side. Both show the LPAR hypervisor, the Hardware System Area (HSA), and the storage of LPARs 1, 3, and 4.]


The Complete Address Translation Process (1)

 Scenario
– Application running in an operating system
– Operating system (guest-2) running under z/VM
– z/VM (guest-1) running in an LPAR
– LPAR managed by the LPAR hypervisor (host)


The Complete Address Translation Process (2)

 Application uses virtual addresses (guest-2 virtual)
 Application address is translated
– To guest-2 real using the operating system’s DAT tables,
then
– To guest-2 absolute using the operating system’s prefix
register


The Complete Address Translation Process (3)

 The guest-2 absolute address is taken to be a guest-1
virtual address by z/VM. It is translated
– To guest-1 real using z/VM’s DAT tables, then
– To guest-1 absolute using z/VM’s prefix register
 The guest-1 absolute address (absolute address
within the LPAR) is now translated
– To host absolute by adding the LPAR’s zone origin
(it is also checked against the LPAR’s zone limit)


The Complete Address Translation Process (4)

 Finally, the host absolute address is translated
– To a physical address using the config array
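
The whole chain from slides (1) to (4) can be strung together in a short sketch. Everything here is simplified and partly assumed: DAT is reduced to a one-level page table held in a dict, prefixing swaps an 8 KB block, the zone limit is taken to be the highest valid host absolute address, and the config array uses a hypothetical 1 MiB increment. All function and key names are illustrative.

```python
PAGE_SIZE = 4096      # 4 KB pages
PREFIX_AREA = 8192    # assumed size of the block swapped by prefixing
INCREMENT = 1 << 20   # hypothetical config-array granularity (1 MiB)


def dat(virtual, page_table):
    """Virtual -> real using a simplified one-level table (page -> frame)."""
    page, offset = divmod(virtual, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


def prefixing(real, prefix):
    """Real -> absolute: swap the first block with the block at `prefix`."""
    if real < PREFIX_AREA:
        return real + prefix
    if prefix <= real < prefix + PREFIX_AREA:
        return real - prefix
    return real


def zone_relocate(absolute, origin, limit):
    """Guest-1 absolute -> host absolute: add zone origin, check zone limit."""
    host_absolute = absolute + origin
    if host_absolute > limit:
        raise ValueError("address beyond the LPAR's zone limit")
    return host_absolute


def config_lookup(host_absolute, config_array):
    """Host absolute -> physical using the config array."""
    index, offset = divmod(host_absolute, INCREMENT)
    return config_array[index] * INCREMENT + offset


def translate(virtual, guest2, guest1, zone, config_array):
    g2_absolute = prefixing(dat(virtual, guest2["page_table"]), guest2["prefix"])
    # The guest-2 absolute address is treated as a guest-1 virtual address.
    g1_absolute = prefixing(dat(g2_absolute, guest1["page_table"]), guest1["prefix"])
    host_absolute = zone_relocate(g1_absolute, zone["origin"], zone["limit"])
    return config_lookup(host_absolute, config_array)


guest2 = {"page_table": {0x10: 0x3}, "prefix": 0x8000}   # operating system
guest1 = {"page_table": {0x3: 0x40}, "prefix": 0x2000}   # z/VM
zone = {"origin": 0x100000, "limit": 0x1FFFFF}           # LPAR
config_array = [0, 5]                                    # absolute -> physical

print(hex(translate(0x10123, guest2, guest1, zone, config_array)))
```

In this example the application address 0x10123 passes through guest-2 absolute 0x3123, guest-1 absolute 0x40123 and host absolute 0x140123 before the config array yields the physical address.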


System z Architecture

Parallel Sysplex


Parallel Sysplex

 Several separate CECs are connected to a coupling facility
(CF) via high-speed fiber links (coupling channels)
 This configuration forms a “Parallel Sysplex” with a star
topology
 The coupling facility is the hub of this star
 In a Parallel Sysplex, each CEC has access to the complete disk
pool (shared-data model)
 Parallel Sysplex is supported by z/OS
 Timing is synchronized with a Sysplex Timer or (new) with the
Server Time Protocol (STP)
2 – 32 systems


Rationale for Parallel Sysplex

 Processing capacity beyond a single SMP
 Non-disruptive addition of processing capacity without
changes to customer applications
 Improved application availability, reduction of planned
outages
 Incremental growth – up to 32 CECs
 Operability / Manageability (single system image)
 Workload management provides load balancing across
CECs


Coupling Facility Characteristics

 LPAR within S/390, zSeries, or System z
 Uses InterSystem Channels (ISCs) and Integrated Cluster
Buses (ICBs) to communicate with CECs
 Does not use ESCON / FICON I/O
 Uses console integration of support element (SE) to
communicate with system operator
 Runs Coupling Facility Control Code (CFCC), a control
program that talks to the CECs via ISCs and ICBs
 CFCC is firmware that is shipped as part of the system
 CFCC can be run in a virtual machine on z/VM


Coupling Facility Characteristics (continued…)

 The CF architecture provides three behavioral models to enable
efficient clustering
– Lock model: enables a specialized lock manager (e.g. database lock
manager) to be extended into a multi-system environment
– Cache model: provides the functions needed for multi-system shared-data
cache coherency management
– Queue/List model: provides a rich set of queuing constructs in
support of workload distribution, message passing, and sharing of
state information
 Multiple CFs can be connected for availability, performance, and
capacity reasons


z10 EC Parallel Sysplex coexistence and coupling connectivity

[Figure: coupling links between a z10 EC and other servers. Labels recoverable from the diagram:]
– z800 (2066), z900 (2064): None!
– z9 EC (2094): ISC-3, ICB-4
– z9 BC (2096): ISC-3, ICB-4
– z990 (2084): ISC-3, ICB-4
– z890 (2086): ISC-3, ICB-4
– z9 EC Dedicated CF (2094): PSIFB, ISC-3, ICB-4; 12x IB-SDR IFB, 3 GBps
– z9 BC Dedicated CF (2096): PSIFB, ISC-3, ICB-4; 12x IB-SDR IFB, 3 GBps
– z10 EC: PSIFB, ISC-3, and ICB-4 (except E64, which has no ICB-4); 12x IB-DDR IFB, 6 GBps
** STI = ISC-3 and ICB-4 links which use STI technology

Coupling Channels

 InterSystem Channels (ISCs)
– With repeaters, can span distances of up to 100 km
– Good for distributed environments (e.g. Geographically Dispersed Parallel
Sysplex, GDPS)
– Without repeaters, 10 km distance and 2 Gigabits/sec
 Integrated Cluster Buses (ICBs)
– Short distance: about 10 m
– High performance
• ICB-3: 1 Gigabyte/sec
• ICB-4: 2 Gigabytes/sec
 Internal Channels (ICs)
– LPAR-to-LPAR communication within a CEC
– No external hardware used
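
To get a feel for the link rates listed above, the following back-of-the-envelope sketch computes the raw time on the wire for one 64 KB data block. It assumes decimal units (1 Gigabit = 10^9 bits) and ignores protocol overhead and signal propagation delay, so the figures are lower bounds only.

```python
def transfer_time_us(payload_bytes, bits_per_second):
    """Raw serialization time in microseconds, without any overhead."""
    return payload_bytes * 8 / bits_per_second * 1e6


PAYLOAD = 64 * 1024  # bytes; assumed size of one data block

# Link rates from the list above, converted to bits per second.
links = {
    "ISC (no repeaters, 2 Gigabits/sec)": 2e9,
    "ICB-3 (1 Gigabyte/sec)": 8e9,
    "ICB-4 (2 Gigabytes/sec)": 16e9,
}

for name, rate in links.items():
    print(f"{name}: {transfer_time_us(PAYLOAD, rate):.1f} microseconds")
```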


Coupling Support Facility

 Specialized hardware provided on each processing node in the Parallel
Sysplex cluster is responsible for controlling communication between the
processor and the CF
 The coupling support facility consists of the following elements:
– special CPU instructions
– high-speed links
 It also utilizes processor memory to contain local state vectors (in the
Hardware System Area, HSA) to locally track the state of resources
maintained in the CF
 The coupling support facility provides several critical functions:
– Coupling facility command delivery
– Secondary command execution
– Local state vector control
– Hardware assisted system isolation (fencing)


Primary Message (SEND MESSAGE)

[Figure: z/OS (sender) on the left and the CF (receiver) on the right. Each side shows three blocks: MCB (command), Data (up to 64K bytes), and MRB (response).]


Secondary Message

 When the CF determines that a data structure has changed
and other z/OS images are registered for the structure, the
CF sends alerts to these z/OS images. These alerts (XI:
Cross Invalidate) are sent as Secondary Messages
 MCB/MRB only (no data)
 z/OS is not interrupted when the MCB arrives, but local status is
kept in system memory for later inspection by z/OS
 Besides XI for a cached structure, LN (List Notification) is
used for list/queue state transitions in the CF (empty/non-empty)
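
The cross-invalidate flow can be illustrated with a toy model. This is a sketch under several assumptions: the class and method names are invented, a validity bit of True means "local copy still valid", and the real message formats (MCB/MRB) are not modelled at all.

```python
class Image:
    """A z/OS image with a local cache vector (one validity bit per entry)."""

    def __init__(self, name, entries):
        self.name = name
        self.local_vector = [False] * entries  # False = local copy not valid

    def local_copy_valid(self, index):
        # Inspected by the image when convenient; nothing interrupts it.
        return self.local_vector[index]


class CouplingFacility:
    def __init__(self):
        self.registrations = {}  # data item name -> {image: vector index}

    def register(self, item, image, index):
        """An image registers interest in a data item it has cached locally."""
        self.registrations.setdefault(item, {})[image] = index
        image.local_vector[index] = True

    def write(self, item, writer):
        """Data changed: cross-invalidate every other registered image."""
        notified = []
        for image, index in self.registrations.get(item, {}).items():
            if image is not writer:
                image.local_vector[index] = False  # the XI signal, no data
                notified.append(image.name)
        return notified


cf = CouplingFacility()
a, b = Image("A", 8), Image("B", 8)
cf.register("PAGE42", a, 0)
cf.register("PAGE42", b, 3)

print(cf.write("PAGE42", writer=a))
print(a.local_copy_valid(0), b.local_copy_valid(3))
```

Image B learns about the change only when it next inspects its vector entry, which mirrors the point above that z/OS is not interrupted by a secondary message.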


Message Vectors

 Message vectors are allocated in HSA
– Local cache vectors
– List notification vectors
– Notification vectors for the completion of an
asynchronous SEND MESSAGE


Message Vector Instructions

 DEFINE VECTOR: Allocates a message vector with a specified
number of entries (bits):
– A local summary flag is associated with such a vector.
– A global summary flag is associated with all local summary flags.
 TEST VECTOR SUMMARY: Tests the local summary flag of a
message vector and the global summary flag
 TEST VECTOR ENTRIES: Tests a number of entries in a vector
 SET VECTOR SUMMARY: Sets the local summary flag of a
message vector and the global summary flag
 SET VECTOR ENTRY: Sets an entry in a message vector
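
The relationship between vector entries, local summary flags, and the global summary flag can be modelled as follows. This is a toy sketch only: the Python method names mirror the instruction names, but operand formats, condition codes, and the way flags are reset are not covered by the slides and are not modelled here.

```python
class MessageVectorFacility:
    """Toy model of message vectors with local and global summary flags."""

    def __init__(self):
        self.vectors = {}            # token -> list of entry bits
        self.local_summary = {}      # token -> local summary flag
        self.global_summary = False  # one flag covering all vectors
        self._next_token = 1         # hypothetical vector identifier

    def define_vector(self, entries):
        token = self._next_token
        self._next_token += 1
        self.vectors[token] = [False] * entries
        self.local_summary[token] = False
        return token

    def set_vector_entry(self, token, index):
        self.vectors[token][index] = True

    def test_vector_entries(self, token, start, count):
        return self.vectors[token][start:start + count]

    def set_vector_summary(self, token):
        self.local_summary[token] = True
        self.global_summary = True

    def test_vector_summary(self, token):
        return self.local_summary[token], self.global_summary


facility = MessageVectorFacility()
cache_vector = facility.define_vector(8)
list_vector = facility.define_vector(4)

facility.set_vector_entry(list_vector, 2)
facility.set_vector_summary(list_vector)

print(facility.test_vector_entries(list_vector, 0, 4))
print(facility.test_vector_summary(list_vector))
print(facility.test_vector_summary(cache_vector))
```

One plausible reading of the two-level design is that a program can check the single global flag first and only then look at individual vectors whose local flag is set.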


CF Duplexing

 If the CF and z/OS run on the same system (as LPARs),
a system failure affecting both images is not recoverable.
 To solve this problem, CF Duplexing is introduced
– It allows pairs of CFs to automatically synchronize two
copies of CF data structures, one in each CF.
 Each CF is placed in a different physical server.


CF Duplexing (continued…)

 z/OS issues two SEND MESSAGE instructions, one to each
CF, for each operation
 The two CF images exchange messages to keep
themselves synchronized; a link is required to
exchange these messages.
 Secondary messages are used for CF-to-CF
communication
 Automatic switch-over on failure from one CF to the
backup CF, transparent to the exploiter
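
The duplexed update and the switch-over can be sketched from the sender's point of view. This is a simplified illustration with invented class names; the CF-to-CF synchronization messages are not modelled, and a failed CF is simply dropped so that operation continues with the surviving copy.

```python
class CF:
    """Toy coupling facility holding one copy of a data structure."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.failed = False

    def send_message(self, key, value):
        if self.failed:
            raise ConnectionError(f"{self.name} is not reachable")
        self.data[key] = value


class DuplexedStructure:
    """Sends every update to both CFs; falls back to one CF on failure."""

    def __init__(self, cf_a, cf_b):
        self.active = [cf_a, cf_b]

    def update(self, key, value):
        survivors = []
        for cf in self.active:
            try:
                cf.send_message(key, value)  # one SEND MESSAGE per CF
                survivors.append(cf)
            except ConnectionError:
                pass  # drop the failed CF and continue with the other copy
        if not survivors:
            raise RuntimeError("no coupling facility left")
        self.active = survivors

    def read(self, key):
        return self.active[0].data[key]


cf1, cf2 = CF("CF1"), CF("CF2")
structure = DuplexedStructure(cf1, cf2)
structure.update("lock-17", "held by SYSA")

cf1.failed = True                      # simulate loss of one CF
structure.update("lock-18", "held by SYSB")
print(structure.read("lock-17"), "|", structure.read("lock-18"))
print([cf.name for cf in structure.active])
```

The caller of `update` and `read` never sees the failure, which is the sense in which the switch-over is transparent to the exploiter.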


System z Architecture

Appendix


References (1)

 z/Architecture Principles of Operation, SA22-7832
 z/Architecture Reference Summary, SA22-7871
 ESA/390 Principles of Operation, SA22-7201
 System/370 Extended Architecture Interpretive Execution,
SA22-7095
 Online (PDF) available here:
http://www.elink.ibmlink.ibm.com/publications/servlet/pbi.wss
Select country. Then click “Search for publications” and enter
the publication number.


References (2)

 IBM System z10 Enterprise Class Technical Introduction

http://www.redbooks.ibm.com/abstracts/sg247515.html

 IBM System z10 Enterprise Class Technical Guide

http://www.redbooks.ibm.com/abstracts/sg247516.html


References (3)

 IBM Journal of Research and Development, Vol. 46, Nos. 4/5,
2002:
Development and attributes of z/Architecture, K. E. Plambeck,
W. Eckert, R. R. Rogers, and C. F. Webb
http://www.research.ibm.com/journal/rd46-45.html

 IBM Journal of Research and Development, Vol. 51, Nos. 1/2,
2007: IBM System z9
http://www.research.ibm.com/journal/rd51-12.html


References (4)

 IBM Journal of Research and Development, Vol. 48, Nos. 3/4,
2004: IBM eServer z990
http://www.research.ibm.com/journal/rd48-34.html

 S/390 cluster technology: Parallel Sysplex, J. M. Nick, B. B.
Moore, J.-Y. Chung, N. S. Bowen, IBM Systems Journal, Vol.
36, No. 2, 1997
http://domino.research.ibm.com/tchjr/journalindex.nsf/495f80c9d0f539778525681e00724804/05ffabe33879ed1485256bfa00685ddf?OpenDocument
