Lydia Parziale
Sam Amsavelu
Dileep Dixith
Gayathri Gopalakrishnan
Pravin Kedia
Manoj Srinivasan Pattabhiraman
David Simpson
Redbooks
IBM Redbooks
June 2024
SG24-8518-01
Note: Before using this information and the product it supports, read the information in “Notices” on
page ix.
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .x
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Chapter 6. Open source database management systems and LinuxONE . . . . . . . . 135
6.1 Migration from MySQL to MongoDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.2 Migration from MySQL to PostgreSQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Playbooks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Base deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
Replica set virtual machine instantiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
OpenStack support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Shadow instance deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Hosts file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
MongoDB deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
Quiescing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Resuming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Terminating . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Embedded task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
Replica set deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
Master playbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
Image and volume tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Deploying or destroying the variables file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
Appendix C. Converting SQL and PL/SQL to Fujitsu Enterprise Postgres SQL and
PL/pgSQL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Challenges that are caused by the specification differences of SQL and PL/SQL . . . . . . 372
Case of error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
Use case with different runtime results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
Key to a successful SQL and PL/SQL conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
SELECT statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
DELETE or TRUNCATE statements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
ROWNUM pseudocolumn. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
Others . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
PL/SQL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
Database trigger migration pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
Cursors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
Error handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
Stored functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Stored procedures migration pattern. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Other migration patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines
Corporation, registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright
and trademark information” at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
DB2®, Db2®, IBM®, IBM Cloud®, IBM Cloud Pak®, IBM FlashSystem®, IBM Spectrum®, IBM Z®, IBM z Systems®, Interconnect®, Passport Advantage®, Redbooks®, Redbooks (logo)®, System z®, z Systems®, z/OS®, z/VM®, zEnterprise®
The registered trademark Linux® is used pursuant to a sublicense from the Linux Foundation, the exclusive
licensee of Linus Torvalds, owner of the mark on a worldwide basis.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its
affiliates.
Ansible, OpenShift, Red Hat, are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries in
the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.
Preface
Data modernization refers to initiatives or processes that lead to more pertinent and precise
data, as well as quicker and more effective data processing and analysis, thereby enhancing
both organizational management and regulatory compliance while catering to industrial and
societal requirements.
This IBM® Redbooks® publication describes how to maximize the data serving capabilities
on LinuxONE and how you can take advantage of these capabilities to modernize your
enterprise.
We start by describing the value of using and migrating your databases to IBM LinuxONE in
order to maximize your data serving capabilities. We provide information about migrating your
current databases to LinuxONE in the following chapters:
“Oracle and LinuxONE” on page 11
“Db2 and LinuxONE ” on page 45
“Postgres and LinuxONE” on page 87
“MongoDB and LinuxONE” on page 136
“Open source database management systems and LinuxONE” on page 151
We also describe using containers and the benefits of automation by using containers (see
section 7.2, “Benefits of automation when using containers” on page 155).
Appendix B, “MongoDB as a service with IBM LinuxONE” on page 351 demonstrates the
benefits of using IBM LinuxONE in your enterprise. This chapter describes how
IBM LinuxONE, combined with IBM Storage, provides high availability (HA), performance,
and security by using a sample-anonymized client environment and includes a use case that
demonstrates setting up a database as a service that can be replicated on a much larger
scale across multiple client sites.
Authors
This book was produced by a team working at IBM Redbooks, Poughkeepsie Center.
Lydia Parziale is a Project Leader for the IBM Redbooks team in Poughkeepsie, New York,
with domestic and international experience in technology management including software
development, project leadership, and strategic planning. Her areas of expertise include
business development and database management technologies. Lydia is a PMI certified PMP
and an IBM Certified IT Specialist with an MBA in Technology Management and has been
employed by IBM for over 30 years in various technology areas.
Sam Amsavelu has worked for IBM for the past 29 years advising customers about how to
build highly available, reliable, and secure infrastructure architectures for their databases.
His expertise includes implementing virtualization, server consolidation, and cloud
technologies to help customers choose the correct platform and technologies for an
optimized environment with the lowest TCO possible. He is a subject matter expert in multiple
operating systems (IBM z/OS®, IBM z/VM®, UNIX, and Linux distributions), relational
databases (IBM Db2®, Oracle, PostgreSQL, and SQL Server), and NoSQL databases (MongoDB).
Dileep Dixith is a Security Architect at IBM India and a Senior Inventor. He has 20 years of
experience in the security and storage domains and has worked at IBM for 14 years. His
areas of expertise include storage, security, and confidential computing. He is passionate
about patents and blogging.
Gayathri Gopalakrishnan is an IT Architect for IBM India and has over 23 years of
experience as a technical solution specialist, working primarily in consulting. She is a
results-driven IT Architect with extensive experience in spearheading the management,
design, development, implementation, and testing of solutions. A recognized leader, she
applies high-impact technical solutions to major business objectives across organizational
boundaries. She is adept at working with management to prioritize activities and achieve
defined project objectives, and at translating business requirements into technical solutions.
Pravin Kedia is the CTO for Data & AI Expert Labs, running the Centre of Excellence in
India. Pravin has more than 26 years of experience in watsonx, CP4D Data Fabric, data
science and DataOps solutions, big data, data warehousing, replication, and AI architecture,
with international work experience in the banking, telecommunication, insurance, and
financial markets domains across worldwide customers. He holds an MBA (Finance) from
the University of Kansas and a Bachelors degree in Electronics and Telecommunications
from VESIT, Mumbai University. He has written extensively on database and replication
technologies. Pravin regularly conducts data strategy, data discovery, and design workshops
with customer C-level executives. As CTO and Solutions Architect for IBM Data and AI
solutions, Pravin is responsible for technical delivery for IBM Expert Labs services projects
across the Hybrid Cloud product portfolio. Pravin has been a PMP since 2005, is a Master
Inventor (Plateau 4), and is on the IDT for the Data and Governance teams. He is an Open
Group Certified Executive IT Specialist Thought Leader and a Certified ITS Profession
Champion. Pravin has received numerous awards and has led or participated in AOT and
Cloud Studies on Data and Governance.
David Simpson is an Oracle Certified Database Administrator on IBM LinuxONE. David has
authored several IBM Redbooks publications and has presented at numerous user
conferences, including Oracle OpenWorld. David has assisted many IBM customers with
implementing Oracle, PostgreSQL, MongoDB, and Open-source database solutions with
IBM Z® hardware and IBM Hybrid Cloud solutions.
Lydia Parziale
IBM Redbooks, Poughkeepsie Center
Robert Haimowitz
IBM, Poughkeepsie Center
Thanks to the authors of the previous editions of this book. The authors of the first edition,
Leveraging LinuxONE to Maximize Your Data Serving Capabilities, SG24-8518, published in
April 2022 and updated in May 2022, were:
Kurt Acker, Sam Amsavelu, Elton de Souza, Gary Evans, Neale Ferguson, Aya Hoshino, Yuki
Ishimori, Niki Kennedy, Riho Minagi, Daiki Mukai, Colin Page, Anand Subramanian, Kaori
Suyama, Kazuhisa Tanimoto and Yoshimi Toyoshima
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, IBM Redbooks
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
For example, when an organization's transformation strategy includes public or private cloud,
the move to CI/CD, open source, and containerization is foundational because the
organization can use built-in automation for agility and portability to achieve cloud-like release
cycles for new competitive functions with pre-packaged scalability, security, and resilience.
This chapter focuses on the enterprise class IBM LinuxONE server, its benefits and how it
can add value to your enterprise data serving needs. We describe how organizations can
undertake application modernization initiatives with flexibility, agility, and cost-effectiveness
without compromising critical data serving requirements for performance, scalability,
resilience, or security.
Over the years, IBM has continuously upgraded LinuxONE systems, integrating more
advanced technology to handle growing data demands and to support emerging technologies
like blockchain and artificial intelligence.
The initial models, such as the Emperor and Rockhopper, were named after penguins,
reflecting their Linux-centric nature. Subsequent releases, like the LinuxONE Emperor II and
III, introduced improvements in processing power, storage capabilities, and energy efficiency,
making them suitable for even more intensive computing tasks.
The latest iteration of the system is the IBM LinuxONE 4, which is available in the following
models and features AI and quantum-safe security technologies:
IBM LinuxONE 4 LA1
The newest member of the LinuxONE family, the IBM LinuxONE 4 became generally available
in late 2022 and maintains the form factor introduced with LinuxONE III, featuring a
19-inch frame that flexibly scales from one to four frames. It is designed around the new
Telum processor, with eight cores per chip in Dual Chip Module packaging running at
5.2 GHz, and is configurable with up to 200 processor cores, up to 40 TB of RAM, and 10 TB
of Redundant Array of Independent Memory (RAIM) per central processing drawer.
IBM LinuxONE 4 LA2
Released in May 2023, the IBM LinuxONE 4 LA2 is the newest entry model in the IBM
LinuxONE family of servers. It delivers a 19-inch single frame (versus the option of up to
four frames for the LA1) with an efficient design and a low entry cost that can easily coexist
with other platforms in a cloud data center. This model is designed around the new Telum
processor, with eight cores per chip in Dual Chip Module packaging. It is configurable with up
to 68 cores running at 4.6 GHz, up to 16 TB of RAM, and 8 TB of RAIM per central
processing drawer.
IBM LinuxONE 4 AGL
Released in May 2023, the IBM LinuxONE 4 AGL is a new entry model option that is
delivered as a rack mount system. It offers a rack mount option from 10U to 39U that can
be collocated with other technologies in a client-supplied 19-inch rack. This model is
designed around the new Telum processor, with eight cores per chip in Dual Chip Module
packaging. It is configurable with up to 68 cores running at 4.6 GHz, up to 16 TB of RAM,
and 8 TB of RAIM per central processing drawer.
IBM LinuxONE Express
This special offering is based entirely on IBM LinuxONE 4 AGL hardware, but is
preconfigured and bundled in three sizes: small, medium, and large. The small size
consists of 4 cores and 384 GB of RAM, the medium size consists of 6 cores and
512 GB of RAM, and the large size consists of 12 cores and 736 GB of RAM.
IBM LinuxONE 4 is the industry’s first quantum-safe enterprise Linux system. It integrates
new hardware encryption capabilities that allow users to apply quantum-safe encryption to
protect data workloads and infrastructure. Each core includes a dedicated coprocessor for
cryptographic functions, which is known as the Central Processor Assist for Cryptographic
Functions (CPACF). CPACF supports pervasive encryption and provides hardware
acceleration for encryption operations.
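As an illustrative check, you can confirm that CPACF features are enabled for a Linux guest by looking for the msa (message security assist) flag in the CPU features line that the kernel reports on s390x:
grep features /proc/cpuinfo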
The new Telum chip is the first server-class chip with a dedicated on-chip AI accelerator that
provides IBM LinuxONE with the capacity to execute real-time inferences at speed and scale
by co-locating data and AI.
These Linux distributions provide customers who use Linux with various support options,
including 24 x 7 support with one-hour response time worldwide for customers running
production systems. In addition to the Linux operating system, all the major Linux distributions
offer a number of other open source products that they also fully support.
The increased interest and usage of Linux resulted from its rich set of features, including
virtualization, security, Microsoft Windows interoperability, development tools, a growing list of
independent software vendor (ISV) applications, performance, and, most importantly, its
multiplatform support.
This multiplatform support allows customers to run a common operating system across all
computing platforms, which means significantly lower support costs and, for Linux, no
incremental license charges. It also allows customers to easily move applications to the most
appropriate platform. For example, many IT organizations choose Linux because of its ability
to scale databases across highly scalable hardware.
1 A Linux distribution is a complete operating system and environment. It includes compilers, file systems, and
applications such as Apache (web server), SAMBA (file and print), sendmail (mail server), Tomcat (Java
application server), MySQL (database), and many others.
capabilities make it perfectly suited for deploying and running distributed ledger
technologies which require robust, tamper-resistant environments.
Artificial Intelligence (AI): AI and machine learning workloads require significant
computational power and data throughput capabilities. LinuxONE was enhanced to handle
these needs by incorporating AI-specific accelerators and optimizing its architecture to
speed up AI inferencing tasks, thus enabling real-time analytics and insights at
unparalleled speeds.
Because of these technological advancements, IBM LinuxONE provides the following key
features:
Scalability
– Flexible workload management: LinuxONE's ability to scale involves its seamless
handling of varying workload sizes—from smaller databases to large-scale enterprise
operations. It achieves this through a unique architecture that supports the vertical
scaling of resources, allowing for the addition of CPUs, memory, and storage without
disrupting ongoing operations. This capability is valuable to businesses that need to
increase their processing power or data storage capacity quickly because of growth
spurts or seasonal spikes in demand.
– Dynamic Resource Allocation: LinuxONE can dynamically allocate resources based on
workload demands. This allows for real-time redistribution of computing power and
storage, based on current demand and workload requirements. This ability ensures
optimal performance and efficiency, particularly useful in environments where workload
volumes can fluctuate significantly, such as cloud services or large e-commerce
platforms.
Performance
– High-speed processing: LinuxONE systems are designed for high-speed data
processing, utilizing advanced processor technology and optimized I/O pathways that
ensure rapid data throughput. This capability makes it ideal for environments that
require quick transaction processing, such as financial trading floors or real-time
analytics applications.
– Enhanced data handling: The systems are engineered to handle large volumes of data
and transactions with minimal latency, supporting the demanding performance
requirements of modern data-intensive applications.
– High Reliability and Performance Consistency: LinuxONE is designed to manage
varied workloads without compromising on performance or reliability. This capability is
crucial for businesses that operate in sectors where time and accuracy are critical,
such as financial services, healthcare, and e-commerce.
Security
– LinuxONE’s pervasive encryption capabilities allow you to encrypt massive amounts of
data with little effect on your system performance. The LinuxONE hardware benefits
from encryption logic and processing on each processor chip in the system.
– The Central Processor Assist for Cryptographic Function (CPACF) is well-suited for
encrypting large amounts of data in real time because of its proximity to the processor
unit. CPACF supports:
• DES
• TDES
• AES-128
• AES-256
• SHA-1
• SHA-2
• SHA-3
• SHAKE
• DRNG
• TRNG
• PRNG
With the LinuxONE 4, CPACF supports Elliptic Curve Cryptography clear key,
improving the performance of Elliptic Curve algorithms.
The following algorithms are supported:
• EdDSA (Ed448 and Ed25519)
• ECDSA (P-256, P-384, and P-521)
• ECDH (P-256, P-384, P-521, X25519, and X448)
Protected key signature creation is also supported.
– Optional cryptography accelerators provide improved performance for specialized
functions:
• Can be configured as a secure key coprocessor or for Secure Sockets Layer (SSL)
acceleration.
• Certified at FIPS 140-3 and Common Criteria EAL 4+.
– IBM’s Hyper Protect Virtual Server offering, which is exclusive to IBM LinuxONE, delivers
more security capabilities to protect Linux workloads from internal and external
threats throughout their lifecycle: the build, management, and deployment phases. Some
of the security benefits include:
• Building images with integrity, which secures continuous integration and delivery
• Managing infrastructure with least privilege access to applications and data
• Deploying images with trusted provenance
– IBM LinuxONE 4 maintains Secure Execution for Linux, a hardware-based security
technology that is designed to protect and isolate workloads on premises or in
IBM LinuxONE and IBM Z hybrid cloud environments. Users, and even system
administrators, cannot access sensitive data in Linux-based virtual environments.
– Secure Service Containers: LinuxONE utilizes secure service containers to provide a
highly secure environment for running applications and workloads. These containers
are isolated from one another and from the host system, protecting them from
cross-container and external threats, as well as from internal vulnerabilities.
• For organizations handling sensitive data or operating under strict regulatory
frameworks, secure service containers offer a compliance-ready environment. The
containers ensure that data is processed and stored in compliance with security
policies and standards, significantly reducing the risk of data leakage or
unauthorized access.
• The security within these containers is managed and enforced at the hardware
level, providing stronger protection than software-based security solutions. This
hardware-level enforcement minimizes the risk of security being compromised
through software vulnerabilities or configuration errors.
– Telum processor: Designed specifically for high-speed data processing, the Telum
processor supports enhanced data throughput and reduced latency. This technological
advancement is pivotal for industries where speed and efficiency are directly linked to
operational success.
• Enhanced Transaction Processing: With the Telum processor, LinuxONE excels in
environments that require rapid transaction processing, such as banking and
financial services. The processor's ability to handle massive volumes of
transactions in real-time ensures that businesses can deliver fast and reliable
service to their customers.
• Real-Time Analytics: Beyond transaction processing, the Telum processor is also
adept at facilitating real-time analytics. This capability allows businesses to analyze
data as it is being collected, enabling immediate insights and decision-making,
which is critical for sectors like retail and telecommunications where market
conditions can change rapidly.
Optimized data pathways
– Efficient Data Movement: The architecture of LinuxONE is optimized to ensure efficient
data movement through the processor and memory, minimizing bottlenecks that can
slow down operations. This design is essential for high-performance computing
environments that require fast access to large datasets.
– Application Performance: For applications that demand real-time data access-such as
financial trading platforms and online transaction systems-optimized data pathways
ensure that information is processed and available without delays. This optimization
supports critical business operations, enhancing the overall agility and competitiveness
of enterprises.
– Scalability and Flexibility: The optimized architecture not only supports current
performance needs but also provides scalability. As business requirements grow and
evolve, LinuxONE can continue to deliver optimal performance without significant
reconfiguration, thereby protecting investment and future-proofing the infrastructure.
Reliability
– Near-Zero downtime: The architecture of LinuxONE is designed to offer unmatched
reliability, capable of providing continuous service with near-zero downtime. This
architecture includes:
• Redundant processors, I/O, and memory.
• Error correction and detection.
• Remote Support Facility.
– Disaster recovery: Robust disaster recovery capabilities ensure that operations can be
quickly restored after any unplanned outages, further reinforcing its reliability.
Open source support
– Community engagement: IBM actively fosters a strong community around LinuxONE,
integrating with various open-source projects and platforms. This support includes
tools like Kubernetes for orchestration, and databases such as MongoDB and
PostgreSQL.
– Innovation through flexibility: The support for open-source software not only provides
flexibility in terms of application development and deployment but also drives
innovation as developers can leverage the best tools available without vendor lock-in.
Energy efficiency
– Reduced energy consumption: LinuxONE systems are designed to be energy-efficient,
using less power than traditional server arrays of comparable capacity. This efficiency
is achieved through optimized hardware that requires less cooling and energy. With its
low power and cooling requirements, IBM LinuxONE is an ideal platform for the
consolidation of distributed servers.
– Environmental impact: The reduced energy consumption contributes to a lower carbon
footprint, aligning with the sustainability goals of many organizations. This feature is
particularly appealing to companies committed to environmental stewardship.
IBM LinuxONE's advanced data serving capabilities are designed to significantly enhance
customer value across various critical aspects of modern business operations. From
bolstering data security to ensuring system reliability and operational efficiency, LinuxONE
provides comprehensive solutions that cater to the demanding needs of today's digital
enterprises.
The features that are described in section 1.2, “Key Technological Advancements and
Features of IBM LinuxONE” on page 3 translate into benefits that satisfy your data serving
needs, which include the following:
Robust data serving
– Unified data access: Implementing a robust data management system that integrates
data from various silos and makes it accessible across the organization is crucial. Such
systems ensure that all departments have a unified view of customer interactions and
can deliver consistent and informed customer experiences.
– Reliability and accuracy: These systems must not only integrate data but also ensure
its accuracy and timeliness, which are critical for maintaining trust and delivering value
to customers.
– Data-serving capabilities: LinuxONE is designed for structured and unstructured data
consolidation and optimized for running modern relational and non-relational
databases.
Scalability
– Handling Growth: As businesses grow and data volumes increase, the need for
scalable solutions becomes critical. Scalable data architectures ensure that
businesses can handle increased loads without performance degradation, thereby
supporting growth without compromising customer service.
– Flexibility for Future Expansion: Scalable solutions also provide the flexibility to expand
and incorporate new technologies without extensive overhauls, thus protecting
investment in existing technologies.
Security and Data Protection
– LinuxONE provides pervasive encryption that ensures that all data, whether at rest or
in transit, is shielded from unauthorized access, thus maintaining confidentiality and
integrity. LinuxONE's pervasive encryption is designed to secure data at-rest and
in-transit, covering all levels of data interaction. This approach ensures that all data,
regardless of its state, is shielded from unauthorized access, thereby protecting
sensitive information across the entire data lifecycle.
– As quantum computing advances, the potential threat to current cryptographic
standards grows. LinuxONE addresses this emerging challenge by incorporating
quantum-safe cryptography into its security architecture, ensuring that data remains
protected even as quantum computing becomes more prevalent.
– Preventing Breaches: Secure data solutions include advanced encryption, regular
security audits, and real-time threat detection mechanisms to protect customer data
and maintain business integrity.
Regulatory compliance
– LinuxONE’s encryption strategy is designed not only to secure data but also to simplify
compliance with global data protection regulations such as the General Data Protection
Regulation (GDPR), the US Health Insurance Portability and Accountability Act (HIPAA),
and the Payment Card Industry Data Security Standard (PCI DSS).
Application modernization
– LinuxONE plays a pivotal role in optimizing software licensing through consolidated
and centralized assets. This simplified and cost-effective subscription model, when
coupled with the high consolidation and scalability features of the IBM LinuxONE
server, enables enterprises to realize significant and easily predictable operating cost
savings over other database software solutions while continuing to meet enterprise
database requirements: automation, security, resilience, portability, and speed.
Encryption capabilities of LinuxONE can help defend and protect business critical data
against external and internal attacks, including privileged users. Financial institutions can
protect all business critical data at rest and in-flight – transparently with no changes to
applications with IBM Pervasive Encryption. Centralized, policy-based data encryption
controls significantly reduce the costs that are associated with data security and with achieving
compliance mandates, including the General Data Protection Regulation (GDPR).
Integrated encryption, data protection, identity and access management and security
intelligence and audit on LinuxONE provide financial institutions with a highly optimized, cost
effective security environment.
To read more on how LinuxONE helps the world’s most complex financial organizations
quickly grow their enterprises to outthink their competition, see:
http://tes-es.com/wp-content/uploads/2019/12/IBM_LinuxONE_Banking__FM_Point_of_View.pdf
To read about how a bank in Jamaica accelerated its core process and positioned itself to
launch new services with LinuxONE technology, see the following case study:
https://www.ibm.com/case-studies/sagicor-bank-jamaica
For product longevity and patching, IBM and Oracle recommend upgrading to Oracle 19c,
which is the long-term release with the longest support end date. Oracle 23ai is the next
projected long-term release for the LinuxONE (s390x) platform.
Although the minimum RHEL 8 Oracle Database release level is 19.11, we also recommend
using the latest level of RHEL 8 (in our lab environment we use release 8.10 for Oracle 19c
installs).
The Oracle Master Note is intended to provide an index and references to the most frequently
used MOS articles concerning Linux OS requirements for an Oracle Database software
installation.
For more information about the master note, see MOS Doc ID 851598.1, which is available by
logging in to your Oracle Support account.
If you have a commercial Oracle license, always download your software from the Oracle
Software Delivery Cloud.
For trial and developer network downloads, the Oracle License agreement must be fully
understood before agreeing and downloading the Oracle 19c database software from the
OTN download site.
In addition to the base software, the latest Patch Set or Release Update must be downloaded
from Oracle Support.
Oracle provides the latest Release Update (RU) for the Grid Infrastructure and Database
from the Oracle support site. The latest RU for your release can be found in the Master Note
for Database Proactive Patch Program (Doc ID 756671.1).
An Oracle 19c base installation is release 19.3 for Linux on System Z. It is recommended to
be on the latest Grid Infrastructure and Database RU when the patches are made available
every fiscal quarter.
The latest (as of July 2024) Oracle 19.24 Grid Infrastructure and Database security and RU
patch is 36582629 for Grid/ASM (including DB) or 36582781 for just the DB RU. Follow the
ReadMe file for the patch you plan to install. Make sure to download the latest OPatch
installer utility patch 6880880 as well for the RU patch as described in the patch readme file.
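After the OPatch utility from patch 6880880 is extracted into the Grid or Database home, its level can be verified before the RU is applied (the path is illustrative and assumes that ORACLE_HOME is set):
$ORACLE_HOME/OPatch/opatch version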
For Linux systems with more than 32 GB of RAM, Oracle recommends 32 GB of swap.
The Oracle guidelines for Linux swap can be reduced (if needed) when enough memory is
available to run all the databases in the Linux guest. The Oracle Installer requires a
minimum of 500 MB of configured Linux swap to complete an installation and 1 GB of swap
for database upgrades and patches.
Customers that use IBM LinuxONE can use a layered virtual in-memory disk (VDisk) for
the Linux swap devices. Linux swap to a memory device (VDisk) is much quicker than
swap to a physical disk storage device.
Figure 2-1 shows an example of a recommended VDisk configuration, with VDisks being
used for the first- and second-level swap at a higher priority. A physical disk or DASD
device can then be used as a lower-priority swap in the case of unexpected memory usage.
Linux uses the higher-priority swap devices first; when a swap device is fully exhausted,
the next-priority swap device is used.
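The following is a minimal sketch of how these priorities might be expressed in /etc/fstab; the device names are assumptions for two VDisk swap devices and one DASD fallback device, and a higher pri= value is used first:
/dev/dasdv1   swap   swap   pri=10   0 0
/dev/dasdv2   swap   swap   pri=9    0 0
/dev/dasdq1   swap   swap   pri=1    0 0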
kernel.shmmax = 8589934592
#set shmall to (sum of sga memory in bytes)/4096
kernel.shmall = 2097152
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.panic_on_oops = 1
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.swappiness = 1
#set nr_hugepages to sum of all Oracle SGAs in MB if using large pages
#vm.nr_hugepages = 4096
vm.hugetlb_shm_group=1001
After kernel parameter values are updated, validate the kernel parameters by using the
following sysctl command:
/sbin/sysctl -p
Issue the following commands to create the oracle group and user IDs:
groupadd -g 1001 oinstall ; groupadd -g 1002 dba ; groupadd -g 1003 asmdba ;
groupadd -g 1004 asmoper ; groupadd -g 1005 oper ;
useradd -m -g oinstall -G dba oracle
echo "oracle:newPassw0rd" | chpasswd
For separation of duties, some sites set up a Linux grid user ID to manage the grid
infrastructure, that is, both the Oracle ASM and Oracle Grid Infrastructure components. If this
is the case, use the following useradd and change password commands:
useradd -m -g oinstall -G dba grid
echo "grid:newPassw0rd" | chpasswd
The settings shown in Example 2-2 must be verified before an installation is performed.
These settings can be found in the /etc/security/limits.conf file.
If the Linux oracle user performs the installation, add the settings that are shown in the
example below to your /etc/security/limits.conf file. If your current values are higher than
these values, leave the higher value in place.
If using a grid user, include the grid user settings as well. For large systems with many users,
increasing the nproc value to 131072 is typical.
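As an illustrative sketch only (the authoritative values are in Example 2-2), oracle user limits that follow the Oracle 19c installation documentation typically look like the following; repeat the entries for the grid user if one is used, and size memlock for your huge page requirements:
oracle   soft   nofile    1024
oracle   hard   nofile    65536
oracle   soft   nproc     16384
oracle   hard   nproc     16384
oracle   soft   stack     10240
oracle   hard   stack     32768
oracle   soft   memlock   134217728
oracle   hard   memlock   134217728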
The values that are shown in Example 2-3 can be added to the oracle and grid (if used) users
.bash_profile file.
ulimit -u 16384
ulimit -n 65536
ulimit -s 32768
If installing Oracle 19c on RHEL 8 or greater, make sure the following environment variable is
set in your .bash_profile or is set manually.
export CV_ASSUME_DISTID=RHEL8.0
If using the graphical user interface (GUI) when running the Oracle Universal Installer (OUI),
Database Configuration Assistant (DBCA) or any related tasks, make sure the environment
variable is set in the xterm window as well.
No environment variable LD_ASSUME_KERNEL value should be used with the Oracle 19c
product.
Use the following command to check the shared memory file system:
cat /etc/fstab |grep tmpfs
If needed, change the mount settings. As the root user, open the /etc/fstab file with a text
editor and modify the tmpfs line. If you are planning to use AMM, set the tmpfs file system
size to the sum of all the MEMORY_TARGET on the system, as shown in the following
example:
tmpfs /dev/shm tmpfs rw,exec,size=30G 0 0
LinuxONE can use large pages of 1 MB with Oracle running under z/VM, and 2 GB large
pages when running in LPAR mode.
Set the vm.nr_hugepages kernel parameter based on the sum of all the Oracle Database
SGAs on the Linux guest that require large pages, as shown in the following equations (a
worked example follows these steps):
vm.nr_hugepages = ((sum of all large page SGAs) * 1024) + 16 (granule) = N
vm.hugetlb_shm_group = <Oracle user Linux group number from /etc/group>
Note: Take care when checking that these kernel parameters are correct. A
mis-configured vm.nr_hugepage kernel parameter setting can cause a Linux system to
start incorrectly because of a lack of memory.
5. Restart your Linux image and Oracle for the changes to take effect. Review your Oracle
Database alert log to verify that the database was started with large pages enabled.
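As an illustrative worked example of the vm.nr_hugepages formula above (the SGA sizes here are assumptions, and the sum is expressed in GB so that multiplying by 1024 gives the number of 1 MB pages), a guest running two databases with 8 GB and 4 GB large page SGAs would use:
#(8 + 4) GB of SGA * 1024 = 12288 pages, plus 16 for the granule
vm.nr_hugepages = 12304
#group number of the oinstall group that owns the Oracle user, from /etc/group
vm.hugetlb_shm_group = 1001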
Silent installation options are available for these Oracle installers as well. However, for
first-time installations, the GUI wizards provide the system checks that verify your system
configuration, and the installations can then be scripted as silent installations later.
For Red Hat systems, the VNC Configurator summarizes the steps to install the VNC server
and client RPMs, and then to configure your VNC xstartup file based on the xterm emulator of
your choosing. Use the following command to install the VNC server:
# yum install tigervnc-server
To install the GNOME desktop for the VNC server, you can use the following command.
# yum group install GNOME base-x Fonts
To start a VNC server session as the Oracle user, run the vncserver command. You may be
required to enter a password for the vncviewer session. Take note of the port (in Figure 2-2 it
is :1) that is used. This port is needed when connecting with the VNC Viewer to the vncserver
session that was started.
If you have problems connecting with the client, verify your Linux firewalld settings, as described
in section 2.5.5, “Oracle RAC firewalld configuration (optional)” on page 38.
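If the connection is blocked, the VNC display port (5900 plus the display number, so 5901 for display :1 in our example) might need to be opened in firewalld. The following commands are a sketch that assumes the default public zone:
# firewall-cmd --permanent --zone=public --add-port=5901/tcp
# firewall-cmd --reload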
To configure disk storage for LinuxONE, assign the logical unit numbers (LUNs) with multiple
paths to Linux to ensure high availability and better performance. This approach spreads the
I/O workload across more channel paths and host bus adapters (HBAs), and is designed to
improve performance and high availability.
To configure multiple disk paths with FCP/SCSI open storage, it is necessary to set up a
/etc/multipath.conf file to enable multipathing for high availability to the disk paths and for
performance reasons.
Note: It is also recommended to consult your storage vendor for their recommendations for
the multipath.conf file settings for the Linux distribution and level that you use.
multipaths {
   multipath {
      wwid 20020c240001221a8
      alias ASMDISK00
   }
}
To restart the multipath service for any changes to take effect, run the following command to
read in the new multipath.conf settings:
/sbin/multipath -v2
systemctl restart multipathd.service
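To confirm that the aliased device is active after the reload, you can list the multipath topology; the alias name below assumes the multipath.conf entry shown earlier:
/sbin/multipath -ll ASMDISK00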
Example 2-8 shows that any disks that are configured with a user_friendly_name that begins
with “ASM*” are owned by the grid or oracle Linux user.
Example 2-8 12-dm-permissions.rules file for FCP/SCSI open system disk storage
ENV{DM_NAME}=="ASM*",OWNER:="oracle",GROUP:="dba",MODE:="660"
You can run the following Linux commands to enable any changes to the UDEV rule:
udevadm control --reload-rules; udevadm trigger
Verify the disk permissions are set correctly by using the following ls command:
ls -lL ASM*
HyperPAV provides more I/O paths to help avoid any disk I/O path bottlenecks.
To configure the ECKD/DASD devices, it is helpful to use the lsdasd device name in the
disk name to make it easier to assign the disk storage, as shown in Example 2-9.
It is also recommended to configure aliases for each DASD device that contains Oracle
data files, to allow for more concurrent I/O. HyperPAV can be set up in z/VM or in LPAR mode
to configure aliases for the base ECKD volumes. The lsdasd -u command can be used to
verify that aliases are configured correctly. If no alias is listed, then the configuration is not
set up correctly.
Run the lsdasd -u command, as shown in Figure 2-3, to ensure that PAV aliases are defined
for any ECKD disk storage devices that are used for database files, for I/O performance.
If configuring Oracle RAC, the same disk devices must be set up as shareable on each of the
nodes in the RAC cluster. Configure a new UDEV, such as
/etc/udev/rules.d/99-udev-oracle.rules, as shown in Figure 2-10.
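The exact rule is shown in Figure 2-10. As a sketch only, a rule of the following shape gives a DASD partition an oracle-owned alias under /dev/oracleasm; the bus ID 0.0.7406 and the name asm7406 are assumptions chosen to match the ls output shown later:
ACTION=="add|change", ENV{ID_PATH}=="ccw-0.0.7406-part1", SYMLINK+="oracleasm/asm7406", OWNER="oracle", GROUP="oinstall", MODE="0660"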
A restart is required for modifications of UDEV rules to take effect unless the rules are
reloaded manually. For Red Hat 8 or SUSE Linux Enterprise Server 12, you can run the
following UDEV commands to reload the rules without a restart.
# udevadm control --reload-rules; udevadm trigger
This change can be done dynamically. Then, run the ls commands to confirm that the file
permissions are set correctly (based on the devices configured in your UDEV rules), as
shown with the following command and its results.
# ls -lL /dev/oracleasm
brw-rw---- 1 oracle oinstall 94, 61 Jun 12 19:18 asm7406
It is recommended to separate the Oracle data and redo log file systems. For high I/O
databases, the file system should be striped across multiple disk storage devices.
For the configuration of IBM Spectrum Scale shared file systems, refer to the following IBM
Redbooks publication:
Chapter 5 of Best practices and Getting Started Guide for Oracle on IBM LinuxONE,
REDP-5499
If ASM is not used, create a Linux file system by completing the following steps by using
YaST or Linux commands.
1. Create the initial mount point:
# mkdir /u01
# chown -R oracle:oinstall /u01
# chmod -R 775 /u01
2. Create a simple file system:
– ECKD DASD: Run the following commands to create a simple file system on ECKD
DASD:
dasdfmt -p -y -b 4096 -f /dev/dasdan
fdasd -a /dev/dasdan
– FCP/SCSI: Run the following commands to create a simple file system on FCP/SCSI:
fdisk /dev/mapper/mymp1
pvcreate /dev/mapper/mymp1p1
For both disk storage types, create a volume group and then a logical volume by using the
commands shown in Example 2-12.
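The authoritative commands are in Example 2-12. As an illustrative sketch (the volume group, logical volume, and mount point names are assumptions), the flow for a file system striped across two multipath devices looks like the following:
vgcreate oradatavg /dev/mapper/mymp1p1 /dev/mapper/mymp2p1
lvcreate -n oradatalv -i 2 -I 4M -l 100%FREE oradatavg
mkfs.xfs /dev/oradatavg/oradatalv
mount /dev/oradatavg/oradatalv /u01/oradata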
Oracle Restart monitors and restarts Oracle Database instances, Oracle Net Listeners, and
Oracle ASM instances.
Oracle ASM manages a small number of storage pools that are called ASM Disk Groups.
Database-related files are assigned to ASM Disk Groups and ASM manages the layout and
data organization.
Oracle combined these two infrastructure products into a single set of binaries that is installed
into an Oracle Restart home directory. To use Oracle ASM or Oracle Restart, we must install
Oracle Grid Infrastructure for a stand-alone server before we install and create the database.
The following documents were referenced for installing Oracle Grid Infrastructure for a
standalone server:
Oracle Database Installation Guide, 19c for Linux, E96432-34
RPM Checker when Installing Oracle Database 19c on Linux on System z® (Doc ID
2553465.1)
The following is an overview of the steps that were completed to install Oracle Grid
Infrastructure (19c) as a stand-alone server on a Red Hat Enterprise Linux guest on an IBM
LinuxONE server. The following are the high level steps:
1. Validate Linux guest requirements.
2. Validate required Disk Space.
3. Validate required Linux RPMs for Oracle 19c.
4. Disable Linux Transparent Huge Pages.
5. Update Linux Kernel parameters.
6. Huge page setup for Oracle Databases.
7. Create users or groups.
8. Create user directories for Oracle product installations.
9. Set up udev rule for storage for ASM.
10.Install Grid infrastructure for the standalone server binary.
11.Verify Grid infrastructure for the standalone server binary.
Installation
The Oracle Grid Infrastructure software is available as an image, available for download from
the Oracle Technology Network website, or the Oracle Software Delivery Cloud portal. For
more information on downloading the Oracle Grid Infrastructure see the Oracle Database
Installation Guide, 19c for Linux, E96432-34.
Example 2-14 shows the image file that we downloaded and is available for our use.
1. Create the grid home directory, and extract the image files in this directory by running the
following commands:
mkdir -p /u01/grid
cd /u01/grid
unzip -q /mnt/oraclenfs/oracode/19c/V982652-01_19cGrid.zip
The extracted files are shown in Figure 2-4.
2. Back up the OPatch directory, and first apply the 6880880 OPatch file (Example 2-15) and
then the latest Release Update from the My Oracle Support site.
3. In a graphical environment, such as VNC, and as the grid user, start the Grid Infrastructure
installation process by running the following command:
gridSetup.sh
Alternatively, configure a response file for a silent installation.
It is advisable to download the latest database RU patch for the LinuxONE (or Linux on
IBM Z) platform and unzip the patch ahead of time. The same patch can be used for the Grid
and Database software installations.
In our case, we downloaded p36233126_190000_Linux-zSer.zip and unzipped the file into
our software staging directory. Unzipping creates a directory that is named after the patch
number; in our case, folder 36233126 was created.
To run gridSetup.sh with the patch by using the VNC GUI, run the following command. The
command and its results are shown in Example 2-16.
./gridSetup.sh -applyRU <path to the latest unzipped RU Patch>
After the RU patch is applied, the Oracle Grid Installer Step 1 of 7 window is displayed,
as shown in Figure 2-8.
4. Select the option to Configure Oracle Grid Infrastructure for a Standalone Server (non-RAC),
which is the second option in the list, and click Next.
5. In the Create ASM Disk Group screen, specify the settings that are listed in Table 2-2.
Table 2-2 Create ASM Disk Group window settings
Field name: Redundancy
Value: External
6. In the Specify ASM Password window, choose Use same passwords for these accounts
option and specify a password that conforms to your security requirements.
7. In the Enterprise manager grid control screen, specify the required information if the
database is to be managed by Enterprise Manager (EM) Cloud Control. This step can be
configured later if needed as well.
8. In the Operating System Groups screen, specify the settings listed in Table 2-3.
10.In the Create Inventory window, specify the value for the Oracle Inventory Directory. This
value is the location where Oracle stores the inventory of all the products that are installed.
For our example, we specified /u01/app/oraInventory.
11.In the Root Script execution configuration window, select the Automatically run
configuration scripts option and enter the root user credentials.
In our installation, the installation process performed prerequisite checks and reported the
following errors:
– Swap size error
– Package cvuqdisk-1.0.10-1 missing error.
The swap size error can be ignored, if encountered, as long as at least 1 GB of swap is
configured. For the package missing error, choose the Fix and Check again option. This
choice creates the fix and provides information about where the fix is available and how
to run it, as shown in Figure 2-6.
12.For an Oracle single installation for a stand-alone server, click OK to run the
runfixup.sh script that was prepared by the installer. Selecting OK automatically runs the
command by using root credentials. Otherwise, a script is created, which you can run
manually.
13.When that step completes, an installation Summary window is displayed. Review all of the
information and select Install to start the grid installation, which runs the configuration
scripts. The installer prompts you to run the configuration scripts as the root user;
click Yes to allow the configuration scripts to run as the root user.
14.After the installer is complete and the Grid Infrastructure is configured, the successful
completion window is displayed, as shown in Figure 2-7.
Once the Installation configuration screen is displayed, complete the following steps:
1. In the Select Configuration Option window, select Set Up Software Only, as shown in
Figure 2-8 and click the "Next" button to continue the installation.
2. In the Select Database Installation Option window, select Single Instance database
installation.
3. In the next Select Database Edition window, click Enterprise Edition.
4. Specify the value for the Oracle base directory. This value is the location where all the
diagnostic information (such as trace files, dumps, and core files) for all the
Oracle products is stored. In our example, we specified /u01/app/oracle.
5. The Privileged Operating System Groups window is next.
a. The OSDBA group is mandatory. Specify dba for all groups. This group identifies
operating system user accounts that have database administrative privileges (the
SYSDBA privilege).
b. The OSOPER group for Oracle Database is optional. This group grants the
OPERATOR privilege to start up and shut down the database (the SYSOPER
privilege). Accept oper for the Database operator group to keep things simple.
6. The installation process performs prerequisite checks and may report a swap size
warning, which can be ignored if at least 1 GB of swap is configured on the Linux system.
7. The summary window is displayed. Select Install.
The Install Product window prompts you to run the root.sh script as the root user. Run it and
then click OK (as shown in Figure 2-9).
Figure 2-10 shows the command and output run as the root user.
Upon completion, the Finish window is displayed and indicates that the installation of
the Oracle Database software was successful.
In our example, we show the xterm GUI method. For silent installation steps, see section 8.1.15,
"Silent installation", in the IBM Redbooks publication Oracle 19c on IBM LinuxONE,
SG24-8487.
1. Add or uncomment the ORACLE_SID in the oracle user’s profile, which is in the user’s HOME
directory by using the export ORACLE_SID=orclfs command.
2. Start the Database Configuration Assistant (dbca) installer to create a database by running
the dbca command in a VNC client terminal.
3. Figure 2-11 shows the Select Database Operation window. Select the Create a database
option.
4. The database creation panels are listed by appearance in Table 2-4 with our selected
options in bold.
Table 2-4 Database creation panels in order of appearance with their options
Panel: Step 7 of 15, Network Configuration
Comments and options: In this panel, you can use any Listeners that are running or create a
Listener for the database. In our lab environment, we chose the default listener that is
configured, named LISTENER.
For more detailed steps, see Chapter 9, of the IBM Redbooks publication, Oracle 19c on IBM
LinuxONE, SG24-8487.
The Oracle Advanced Cluster File System (ACFS) is not supported on IBM LinuxONE. IBM
and Oracle recommend using secure IBM Spectrum® Scale (General Parallel File System or
GPFS) for your shared file system needs, including Oracle Goldengate implementations. See
the following support note for more information on this:
IBM Spectrum Scale 5.0 and Oracle RAC on IBM Linux on System z (Doc ID 2437480.1)
The Linux pre-installation processes in section 2.1, “Obtaining the Oracle 19c software for
IBM LinuxONE” on page 12 apply for Oracle RAC installs as well. In this section, we discuss
additional requirements that are needed for a successful RAC installation, including the
following:
“Oracle network requirements on IBM LinuxONE ” on page 35
“Oracle SCAN configured in DNS ” on page 36
“Public, Private and Virtual IP address requirements” on page 37
“Network Interface Name (NIC) consistency for Oracle RAC ” on page 37
“Oracle RAC firewalld configuration (optional)” on page 38
“Time synchronization across cluster nodes” on page 39
“SSH user equivalency for Oracle RAC installs ” on page 40
The network interconnect for Oracle RAC must be on a private network. If you are configuring
the private interconnect between separate IBM LinuxONE machines, there must be a physical
switch configured ideally with one network hop between systems. The private interconnect
should also be a private IP address in the range that is shown in Example 2-18.
An Oracle RAC workload sends a mixture of short messages of 256 bytes and, for the long
messages, database blocks of the database block size. Another important consideration is
to set the Maximum Transmission Unit (MTU) size to be a little larger than the database block
size for the database. Oracle RAC configuration options are shown in Table 2-5.
Table 2-5 Oracle RAC configuration options
All z/VM Linux guests in one LPAR:
  Private interconnect: Private Layer 2 VSwitch Guest LAN recommended; real layer 2 HiperSocket possible; Guest LAN HiperSocket not supported.
  Public network: Shared Public VSwitch OSA recommended; shared or dedicated OSA card is possible.
z/VM Linux guests on different LPARs:
  Private interconnect: Real layer 2 HiperSocket recommended; Private Layer 2 Gigabit OSA card possible.
  Public network: Shared Public VSwitch recommended; shared or dedicated OSA card.
z/VM Linux guests on different physical machines:
  Private interconnect: Private Layer 2 Gigabit OSA card recommended, with a physical switch in between (one hop).
  Public network: Dedicated OSA card possible.
The following methods can be used to configure a high availability (HA) solution for an IBM Z
environment that uses a multi-node configuration:
Virtual Switch (Active/Passive)
When one Open System Adapter (OSA) Network port fails, z/VM moves the workload to
another OSA card port; z/VM handles the fail over.
Link Aggregation (Active/Active)
Allows up to eight OSA-Express adapters to be aggregated per virtual switch. Each
OSA-Express port must be exclusive to the virtual switch (for example, it cannot be
shared); z/VM handles the load balancing of the network traffic.
Linux Bonding
Creates two Linux interfaces (for example, eth1 and eth2) and a bonded interface
bond0 made up of eth1 and eth2, which the application uses. Linux can be configured in
various ways to handle various failover scenarios.
Oracle HAIP
Oracle can have up to four private interconnect interfaces to load balance Oracle RAC
interconnect traffic. Oracle handles the load balancing, and this feature is exclusive to Oracle
Clusterware implementations. To use HAIP, the following parameters must be set in the
/etc/sysctl.conf file; otherwise, communication between the nodes can be interrupted,
causing a node eviction or a node failing to join the cluster.
The Linux kernel parameters might need to be set in the /etc/sysctl.conf file to avoid
routing issues between the network interfaces.
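A minimal sketch of such a setting, assuming the private interconnect interface is named priv1
as in our environment (the loose reverse-path filtering value shown is a common recommendation
for multiple private interconnects; confirm the exact parameters that your Oracle release
requires before applying them):

# /etc/sysctl.conf -- illustrative entry only
# Use loose reverse-path filtering on the private interconnect interface so
# that traffic across multiple private networks is not dropped.
net.ipv4.conf.priv1.rp_filter = 2

# Reload the settings without a reboot:
sysctl -p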
Oracle SCAN provides a single name that clients can use to connect to a
particular Oracle database. The main benefit of SCAN is the ability to keep a client
connection string the same, even if changes occur within the Oracle RAC database environment,
such as adding or removing nodes within the cluster.
In Example 2-20, orascan0087, our SCAN, is configured with three IP addresses pointing to
the SCAN name DNS entry.
Name: orascan0087.pbm.ihost.com
Address: 129.40.24.8
Name: orascan0087.pbm.ihost.com
Address: 129.40.24.7
Run an nslookup command with your SCAN name to help ensure that the DNS name is
configured correctly, with three IP addresses configured for each SCAN name for each Oracle
RAC cluster that you require.
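For example, with the SCAN name from our lab environment (substitute your own SCAN name),
the check is simply:

nslookup orascan0087.pbm.ihost.com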
If it is not consistent, change the interface names to a consistent name. For more information
on how to do this, see Configuring the System's Network.
In Example 2-21, we list our plan to rename the encXXXX-based interface names on Node1 and
Node2 to pub1 and priv1 respectively.
Open an SSH session from Node2 into Node1 on the private interconnect IP to change the
public interface; this prevents your session from being disconnected when the ifdown
commands are run. Use the commands shown in Example 2-22 to change the interface name.
Reboot the server and verify whether the network interface names have been properly
changed.
Interface configuration (ifcfg) files control the software interfaces for individual network
devices. As the system boots, it uses these files to determine what interfaces to bring up and
how to configure them. We suggest creating an ifcfg script. We copied an existing ifcfg file
from the /etc/sysconfig/network-scripts/ directory, called our copy ifcfg-privora, and saved
it in the same /etc/sysconfig/network-scripts/ directory.
We then edited our file and commented out the ONBOOT=yes and SUBCHANNELS lines that were
copied over. Ensure that NM_CONTROLLED=yes. In our new ifcfg-privora file, we added the
following entries:
DEVICE=privora
ONBOOT=yes
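A fuller ifcfg-privora sketch under the same conventions is shown below. The IP address,
netmask, and MTU values are placeholders for illustration only and are not the values from
our lab environment:

# /etc/sysconfig/network-scripts/ifcfg-privora (illustrative values)
DEVICE=privora
NM_CONTROLLED=yes
ONBOOT=yes
BOOTPROTO=none
# Placeholder private interconnect address and netmask
IPADDR=192.168.100.11
NETMASK=255.255.255.0
# Placeholder MTU; size it to your HiperSocket or OSA configuration
MTU=8992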
If the firewalld service must be running, then various firewall ports need to be opened in
order for Oracle RAC to be installed. Involve the Linux system administrator to modify firewall
rules to allow the Oracle RAC private interconnect nodes to connect to each other per Oracle
MOS note: RAC instabilities due to firewall (netfilter/iptables) enabled on the cluster
interconnect (Doc ID 554781.1).
The following is an example of the procedure we used to configure firewalld with Oracle RAC
installations on LinuxONE.
1. To allow access to a specific database client with an IP address of 10.19.142.54 and to
make requests to the database server via SQL*Net using Oracle's TNS (Transparent
Network Substrate) Listener (default port of 1521), the following permanent firewall rule
within the public zone must be added to the firewalld configuration on each node within the
Oracle RAC cluster.
# firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4"
source address="10.19.142.54" port protocol="tcp" port="1521" accept'
2. Ensure the firewall allows all traffic from the private network by accepting all traffic
(trusted) from the private Ethernet interfaces priv1 from all nodes within the Oracle RAC
cluster. It is highly recommended that the private network be isolated and communicate
only between nodes locally. We used the following command to accomplish this.
# firewall-cmd --permanent --zone=trusted --change-interface=priv1
3. On node one of the Oracle RAC cluster, use the following command to add the public
source IP of all remaining nodes. This reference environment only adds the public IP of
node two of the Oracle RAC cluster.
# firewall-cmd --permanent --zone=trusted --add-source=10.19.142.52/21
4. Once the rules have been added, run the following command to activate.
# systemctl restart firewalld.service
5. On node 2 of the RAC cluster, add the rule to allow temporary access for the install by
using the following command.
# firewall-cmd --permanent --zone=trusted --add-source=10.19.142.52/21
6. Once the rules have been added, run the following command to activate:
# systemctl restart firewalld.service
7. After the RAC installation is complete, be sure to remove the public IP firewall rules
that were previously created for the installation from each of the RAC nodes. Use the
following command to do this.
# firewall-cmd --permanent --zone=trusted --remove-source=10.19.142.52/21
The Oracle Cluster Time Synchronization Service is designed for organizations whose cluster
servers are unable to access NTP services. If you use NTP, then the Oracle Cluster Time
Synchronization daemon (CTSSD) starts up in observer mode. If you do not have NTP
daemons, then CTSSD starts up in active mode and synchronizes time among cluster
members without contacting an external time server.
To manually configure SSH equivalency for each system, generate the SSH keys and then
copy them to userid@newserver as shown in Example 2-23.
Next, run the ssh user@<newserver> date command to verify that SSH runs without prompting
for a password, by using the command shown in Example 2-24 with your <newserver> name.
If you encounter any issues, clean up the existing ~oracle/.ssh folder and retry these steps.
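A minimal sketch of the manual setup for the oracle user (node2 is a placeholder for your
<newserver> name):

# Generate a key pair (accept the defaults), copy the public key to the other
# node, then confirm that the date command runs without a password prompt.
ssh-keygen -t rsa
ssh-copy-id oracle@node2
ssh oracle@node2 date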
For additional detailed steps and screen captures of the Oracle RAC installation on LinuxONE,
see Chapter 9 of the IBM Redbooks publication, Oracle 19c on IBM LinuxONE, SG24-8487.
With this support, IBM LinuxONE servers running Podman on Red Hat Enterprise Linux 8 or
SLES 15 SP3+ with Oracle 19c release 19.16 or greater will be able to quickly spin up
container images.
My Oracle support note, Oracle RAC on PODMAN on IBM System z - Released Versions and
Known Issues (Doc ID 2924682.1) details what is and is not supported and any known
issues.
For example, one issue is CTSSD not running in observer mode. To avoid this error, create an
empty /etc/ntp.conf file in each container.
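For example, from the container host (racnode1 is a placeholder container name; repeat for
each RAC container in your environment):

podman exec racnode1 touch /etc/ntp.conf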
For a detailed white paper on installing Oracle RAC with Podman, see Real Application
Clusters (RAC) in containers on IBM System Z-Implementing Oracle RAC with Podman.
There are many Oracle tools available that work on the LinuxONE platform, whether the
source platform is big endian or little endian. The IBM LinuxONE platform is a true
hybrid-cloud platform for running not only Oracle databases, but many other open-source
databases are also supported on LinuxONE. As the business needs change, CPU, memory,
I/O or network system resources can be dynamically added or removed as the workload
demands.
There are several migration approaches for moving databases, including the following, which
are summarized in Table 2-6, Sample Oracle migration approaches:
Data Pump Conventional Export/Import
RMAN backup sets for Cross-Platform Database transport
Cross Platform Transportable Tablespaces (XTTS) migration steps using incremental
backup
Replication including Oracle Golden Gate, for synchronizing both source and target
databases.
Oracle Data Pump (Conventional Export/Import):
  Pros: Cross-platform; cross-version; simple to use; supports reorganizing data files.
  Cons: Hot backup not supported; Oracle BLOBs/CLOBs can sometimes have issues (verify with an Oracle SR).
  Downtime: Usually hour level, depending on data volume or network transmission.
On LinuxONE systems, you can take advantage of the IBM zEnterprise® Data Compression
(zEDC) hardware accelerator to quickly compress export dumps and move them to other
systems.
Using RMAN backup sets is a technique for migrating Oracle databases from other platforms
(regardless of endian format), particularly if the source Oracle database version is 19.18 or
greater. Oracle support note, M5 Cross Endian Platform Migration using Full Transportable
Data Pump Export/Import and RMAN Incremental Backups (Doc ID 2999157.1) details all the
steps needed.
If an endian conversion is needed, an RMAN command must be run to perform the conversion.
Most of the time needed for this migration approach is spent copying the data to the target
system. One approach to reduce copy time is, if possible, to unmount the migrated file system
on the source system and remount it on the target.
Oracle Change Data Capture (CDC) and Oracle GoldenGate (OGG) are also good approaches
when close to zero downtime is needed or when a failback option is needed for some time
after going live on a new system. The effort of configuring the replication scripts and
keeping them in sync and highly available is one disadvantage of this approach.
With all these migration approaches, a good framework and project plan is paramount to
ensure that each migration is tested and repeatable. The framework is important because it
brings discipline and consistency to the project and helps minimize risk. Typically, we suggest
the 5-step process shown in Figure 2-13.
LinuxONE is IBM’s all-Linux enterprise platform for open innovation that combines the best of
Linux and open technology with the best of enterprise computing in one system. The platform
is designed with a focus on security, scalability, and performance to support customers who
want an efficient and cost-effective solution to thrive in a data-centric economy.
LinuxONE’s hardened Linux-based software stack can run most open-source software
packages, such as databases and data management, virtualization platforms and containers,
automation and orchestration software, and compute-intensive workloads, such as
blockchain.
You can use Db2 for Linux, UNIX, and Windows products on LinuxONE. It works seamlessly
in the virtualized environment without any extra configuration. In addition, autonomic features,
such as self-tuning memory management and enhanced automatic storage, help the
database administrator to maintain and tune the Db2 server.
This multiplatform support allows customers to run a common operating system across all
computing platforms, which means significantly lower support costs and, for Linux, no
incremental license charges. It also offers customers the flexibility of easily moving
applications to the most appropriate platform.
IBM LinuxONE delivers the best of enterprise Linux on the industry’s most reliable and highly
scalable hardware. These systems are specialized scale-up enterprise servers that are
designed exclusively to run Linux applications.
IBM LinuxONE provides the highest levels of availability (near 100 percent uptime with no
single point of failure), performance, throughput, and security. End-to-end security is built in
with isolation at each level in the stack and provides the highest level of certified security in
the industry.
IBM LinuxONE delivers on the promise of a flexible, secure, and smart IT architecture that
can be managed seamlessly to meet the requirements of today’s fast-changing business
climate. LinuxONE provides the following benefits:
Premium Linux experience with sub-second user response times and virtually unlimited
scale.
Broad portfolio of Open Source and other vendor products and tools delivered on the
platform.
Choice of Linux (RHEL, SUSE, and Ubuntu) and tools that best fit your environment.
Eliminates risks by running Linux on the industry’s most secure and resilient hardware
platform.
Easy integration of data and applications with existing IBM z Systems® based solutions.
Overall increase in operational IT efficiency.
2. Check the machine OS version and architecture by using the commands shown in
Example 3-2.
3. Install the IBM DB2® pre-requisites by using the commands shown in Example 3-3.
Example 3-3 Decompress the Db2 executables and run the db2prereqcheck command
gunzip v11.5.9_linux390x64_universal_fixpack.tar.gz
tar -xvf v11.5.9_linux390x64_universal_fixpack.tar
cd universal
./db2prereqcheck
yum install patch -y
yum install gcc-c++ -y
yum install libxcrypt-compat -y
yum install perl -y
yum install ksh -y
yum install mksh -y
yum install compat-openssl* -y
vi /etc/sysconfig/selinux
Change SELINUX=enforcing to SELINUX=disabled
reboot
2. Before installing Db2, create the required instance user ID and fenced user ID, following
the example shown in Example 3-6.
Example 3-6 Create required instance and fence users and groups
groupadd db2grp1
groupadd db2fgrp1
useradd db2inst1
useradd db2fenc1
usermod --append --groups db2fgrp1 db2fenc1
usermod --append --groups db2grp1 db2inst1
[root@rdbkll04 universal]# cat /etc/passwd|grep -i db2
db2inst1:x:1001:1003::/home/db2inst1:/bin/bash
db2fenc1:x:1002:1004::/home/db2fenc1:/bin/bash
[root@rdbkll04 universal]# cat /etc/group|grep -i db2
db2grp1:x:1001:db2inst1
db2fgrp1:x:1002:db2fenc1
3. Install the Db2 Server Edition using the db2_install command, as shown in Example 3-7.
Note that in Example 3-9, we set the DB2COMM variable to TCPIP and used the db2set
-all command to display all defined variables in all registry levels. In our example, the [i]
means that the variable is set on the instance level and the [g] means that the variable is set
globally.
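A minimal sketch of those registry commands, run as the instance owner (the variable and
command names are standard Db2 registry commands; the instance is the db2inst1 instance
from our example):

# Set the communication protocol for the instance, then list every registry
# variable at all levels ([i] = instance level, [g] = global level).
db2set DB2COMM=TCPIP
db2set -all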
Update the services file and specify the ports that you want the server to listen on for
incoming client requests. If you want to make changes to the /etc/services file, the instance
must be fully offline, and you must change the /etc/services file on all hosts in the cluster.
Example 3-10 shows our Linux port configuration to ensure that the Db2 instance manager is
able to run. You would use a text editor to add the connection entries to the services file.
From our example, db2c_db2inst1 represents the connection service name, 20016 represents
the connection port number and tcp represents the communication protocol that you are
using. DB2_db2inst1_END20021/tcp indicates that this is a port range, which we have set up
to enable the communication between Db2 partitions using fast communication manager
(FCM). By default, the first port (20016) is reserved for client to server connection requests,
and the first available four ports above 20021 are reserved for FCM communication.
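The entries described above look similar to the following sketch. Only the first connection
port (20016) and the END entry (20021) are taken from our example; the intermediate FCM
entries and their port numbers are assumptions for illustration, so use the range that
matches your own instance:

# /etc/services entries for the db2inst1 instance (illustrative)
db2c_db2inst1      20016/tcp   # client connection port
DB2_db2inst1       20017/tcp   # first FCM port (assumed)
DB2_db2inst1_1     20018/tcp   # assumed
DB2_db2inst1_2     20019/tcp   # assumed
DB2_db2inst1_3     20020/tcp   # assumed
DB2_db2inst1_END   20021/tcp   # end of the FCM port range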
Next, you will need to change the Linux kernel parameters based on the memory in the
machine. For more information on kernel parameter requirements on Linux, see:
https://www.ibm.com/docs/en/db2/11.5?topic=unix-kernel-parameter-requirements-linux
In our lab environment, we installed Db2 on two systems - rdbkll03 and rdbkll04. These will be
a part of the single Db2 DPF cluster instance.
We set up the Db2 NFS server between the two nodes (with rdbkll03 as the NFS server, also
known as the coordinator node, and rdbkll04 as the NFS client) following the instructions
found at:
https://earihos.medium.com/how-to-install-and-configure-nfs-storage-server-in-red-hat-enterprise-linux-9-389379cbb4ee
The NFS mount can be automated to failover from one machine to the other by using TSA
software. To do this, we followed the instructions found in Chapter 2 of the IBM Redbooks
publication, High Availability and Disaster Recovery Options for DB2 for Linux, UNIX, and
Windows, SG24-7363.
We changed our Linux kernel parameters based on the memory of our IBM LinuxONE. The
minimum kernel parameter requirements for Linux can be found on the following website:
https://www.ibm.com/docs/en/db2/11.5?topic=unix-kernel-parameter-requirements-linux
You could also install GPFS sharing if needed but we did not do that for our example.
The install command and our output are shown in Example 3-12.
Last metadata expiration check: 2:52:55 ago on Wed 07 Aug 2024 11:51:10 PM EDT.
Dependencies resolved.
==================================================================================
 Package          Architecture  Version         Repository                    Size
==================================================================================
Installing:
 nfs-utils        s390x         1:2.5.4-25.el9  rhel-9-for-s390x-baseos-rpms  455 k
Installing dependencies:
 gssproxy         s390x         0.8.4-6.el9     rhel-9-for-s390x-baseos-rpms  109 k
 libev            s390x         4.33-5.el9      rhel-9-for-s390x-baseos-rpms   54 k
 libnfsidmap      s390x         1:2.5.4-25.el9  rhel-9-for-s390x-baseos-rpms   64 k
 libverto-libev   s390x         0.3.2-3.el9     rhel-9-for-s390x-baseos-rpms   15 k
 rpcbind          s390x         1.2.6-7.el9     rhel-9-for-s390x-baseos-rpms   60 k
 sssd-nfs-idmap   s390x         2.9.4-6.el9_4   rhel-9-for-s390x-baseos-rpms   46 k

Transaction Summary
==================================================================================
Install  7 Packages

  Installing        : libev-4.33-5.el9.s390x                               3/7
  Installing        : libverto-libev-0.3.2-3.el9.s390x                     4/7
  Installing        : gssproxy-0.8.4-6.el9.s390x                           5/7
  Running scriptlet : gssproxy-0.8.4-6.el9.s390x                           5/7
  Running scriptlet : nfs-utils-1:2.5.4-25.el9.s390x                       6/7
  Installing        : nfs-utils-1:2.5.4-25.el9.s390x                       6/7
  Running scriptlet : nfs-utils-1:2.5.4-25.el9.s390x                       6/7
  Installing        : sssd-nfs-idmap-2.9.4-6.el9_4.s390x                   7/7
  Running scriptlet : sssd-nfs-idmap-2.9.4-6.el9_4.s390x                   7/7
  Verifying         : libev-4.33-5.el9.s390x                               1/7
  Verifying         : libverto-libev-0.3.2-3.el9.s390x                     2/7
  Verifying         : gssproxy-0.8.4-6.el9.s390x                           3/7
  Verifying         : libnfsidmap-1:2.5.4-25.el9.s390x                     4/7
  Verifying         : nfs-utils-1:2.5.4-25.el9.s390x                       5/7
  Verifying         : rpcbind-1.2.6-7.el9.s390x                            6/7
  Verifying         : sssd-nfs-idmap-2.9.4-6.el9_4.s390x                   7/7
Installed products updated.

Installed:
  gssproxy-0.8.4-6.el9.s390x          libev-4.33-5.el9.s390x
  libnfsidmap-1:2.5.4-25.el9.s390x    libverto-libev-0.3.2-3.el9.s390x
  nfs-utils-1:2.5.4-25.el9.s390x      rpcbind-1.2.6-7.el9.s390x
  sssd-nfs-idmap-2.9.4-6.el9_4.s390x

Complete!
We used the commands shown in Example 3-13 to start the RPC bind services on Red Hat
Enterprise Linux (RHEL).
We used the command found in Example 3-14 to start the NFS server.
We used the commands found in Example 3-15 to create the NFS folder and give the mount
folder permissions.
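On RHEL, those three steps look similar to the following sketch (the export directory name
/nfs_storage and the permission mode are the values from our lab environment; the service
names are the standard RHEL unit names):

# Start and enable the RPC bind and NFS server services.
systemctl enable --now rpcbind
systemctl enable --now nfs-server

# Create the directory to be exported and open up its permissions.
mkdir /nfs_storage
chmod 777 /nfs_storage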
Example 3-16 provides an example of the commands we used to export the NFS to our client,
stop the firewall and check the mount list.
Example 3-16 Export NFS to client, stop the firewall and review mount information
[root@rdbkll03 ~]# cat <<EOF | sudo tee -a /etc/exports
> /nfs_storage 9.76.61.0/24(rw,no_root_squash,insecure,async,no_subtree_check,anonuid=5001,anongid=5001)
EOF
[root@rdbkll03 ~]# mkdir /db2home
[root@rdbkll03 ~]# chmod 777 /db2home
vi /etc/fstab
9.76.61.102:/nfs_storage /db2home/ nfs defaults 0 0
systemctl daemon-reload
Create the Db2 instance to use this NFS folder as a shared directory by using the commands
shown in Example 3-17.
Example 3-17 Create the instance using the NFS directory and set the permissions
[root@rdbkll03 ~]# mkdir /db2home/db2inst2
[root@rdbkll03 ~]# chmod 777 /db2home/db2inst2
[root@rdbkll04 ~]# mkdir /db2home/db2inst2
[root@rdbkll04 ~]# chmod 777 /db2home/db2inst2
Create users db2inst2 and db2fenc2 on both machines with the same user ID (UID) and
group ID (GID). Run the commands on both machines, in our case rdbkll03 and rdbkll04,
as shown in Example 3-18.
db2fenc2:x:2003:1998::/db2home/db2fenc2:/bin/bash
db2igrp2:x:1999:db2inst2
db2fgrp2:x:1998:db2fenc2
db2igrp2:x:1999:db2inst2
db2fgrp2:x:1998:db2fenc2
Verify the Db2 install and list of which products are installed, as shown in Example 3-19.
Create the Db2 instance and use /db2home/db2inst2 as the Db2 SQLLIB directory on both
servers. In our example, once we logged into the rdbkll03 machine as root, we created the
db2inst2 instance in the /db2home/db2inst2 directory. For more details on setting up a
partitioned database environment, see the following website:
https://www.ibm.com/docs/en/db2/11.5?topic=environment-setting-up-partitioned-database
Example 3-20 shows logging in as root to our lab environment, and creating the instance in
the /db2home/db2inst2 directory.
Task #1 start
Description: Setting default global profile registry variables
Estimated time 1 second(s)
Task #1 end
Task #2 start
Description: Initializing instance list
Estimated time 5 second(s)
Task #2 end
Task #3 start
Description: Configuring DB2 instances
Estimated time 300 second(s)
Task #3 end
Task #4 start
Description: Updating global profile registry
Estimated time 3 second(s)
Task #4 end
Edit the /etc/services file on both machines to include the Db2 DPF instance, as shown in
Example 3-21.
The node configuration file (db2nodes.cfg), located in the instance owner's home directory,
contains configuration information that tells the Db2 database system which servers
participate in an instance of the partitioned database environment.
Edit the db2nodes.cfg file to indicate that both servers are a part of the Db2 DPF instance.
Example 3-22 provides an example of doing this in our lab environment.
0 rdbkll03.cpolab.ibm.com 0
1 rdbkll03.cpolab.ibm.com 1
2 rdbkll04.cpolab.ibm.com 0
3 rdbkll04.cpolab.ibm.com 1
Example 3-23 shows starting the Db2 instance from the coordinator node, in our lab
environment, the NFS server, rdbkll03, of the Db2 DPF instance.
From the coordinator node (rdbkll03), we check the database manager configuration to ensure it
is using the correct default database path (Example 3-24).
Next, from the coordinator node (rdbkll03), we create the Db2 database, verify the files in the
default database path and test our connection to the newly created database, as shown in
Example 3-25.
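A minimal sketch of that sequence, run as the instance owner on the coordinator node (the
database name testdb is a placeholder for illustration; our actual values are in
Example 3-23 through Example 3-25):

# Start all partitions, check the default database path, create a database,
# and test the connection.
db2start
db2 get dbm cfg | grep -i DFTDBPATH
db2 create database testdb
db2 connect to testdb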
For more information on Db2 HADR on LinuxONE, see chapters 5, 6, 7, and 8 in the following IBM
Redbooks publication: High Availability and Disaster Recovery Options for DB2 for Linux,
UNIX, and Windows, SG24-7363.
The minimum Red Hat OpenShift architecture consists of five Linux guests (bootstrap, three
control nodes, and one worker node) that are deployed on top of IBM z/VM 7.1 or later. Red
Hat OpenShift on LinuxONE users can use the IBM Cloud® Infrastructure Center to manage
the underlying cluster infrastructure.
Db2U integrates well with AI and analytics tools, making it easier to perform complex data
analysis and leverage machine learning models directly within the database environment.
High Availability and Disaster Recovery
Db2U includes features for high availability (HA) and disaster recovery (DR), ensuring that
your data is protected and that your database can continue to operate in the event of
failures.
Support for Multiple Data Types
Like the traditional Db2, Db2U supports a wide range of data types, including structured,
semi-structured, and unstructured data, making it versatile for different kinds of
applications.
Compliance and Security
IBM Db2U includes robust security features and is compliant with various industry
standards, which is critical for enterprises that handle sensitive data.
See Modernizing Db2 Containerization Footprint with Db2U for more information on the Db2U
architecture.
As of Db2 11.5.5, Red Hat OpenShift has operator-enabled installations, allowing you more
control over your deployment. You deploy Db2 to your Red Hat OpenShift cluster through a
series of API calls to the Db2 Operator.
The Db2 Operator is acquired from either the IBM Operator Catalog or the Red Hat
Marketplace. In our lab environment, we used the IBM Operator Catalog, which is accessible
through the OpenShift user interface (UI) console, while the Red Hat Marketplace is a web
site.
In our lab environment, we chose to install Db2 directly onto the Red Hat OpenShift Container
Platform by using the IBM Cloud Pak® for Data command-line interface (cpd-cli). You do not
need to install the complete IBM Cloud Pak for Data, nor do you need a license for the cpd-cli.
Next, from the RHOCP command line, create a new project. A project is essentially the same
as a namespace, but Red Hat OpenShift provides additional administrative controls for
projects. In our lab environment, we used the following command:
oc new-project db2
When you create a new project by using the oc new-project command, the new project will
automatically be set as the current project. To ensure your new project has been created, you
can run the following command:
oc project db2
To view IBM offerings in the Red Hat OpenShift Operator catalog, the catalog index image
needs to be enabled. To do this, we first created a Db2 catalog source YAML configuration
file, as shown in Example 3-27.
Next, we applied the YAML file to create the Db2 Operator catalog resource, as shown in
Example 3-28.
To verify the installation, use the command shown in Example 3-29 from the command line,
where -n openshift-marketplace is the catalog source namespace, as highlighted in
Example 3-27 in your YAML file.
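Under the assumption that the standard IBM Operator Catalog source is being used (the catalog
image location below is the publicly documented icr.io address; confirm it against
Example 3-27 for your environment), the create, apply, and verify steps look similar to this
sketch:

# Create and apply the catalog source in the openshift-marketplace namespace.
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ibm-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: IBM Operator Catalog
  publisher: IBM
  sourceType: grpc
  image: icr.io/cpopen/ibm-operator-catalog:latest
  updateStrategy:
    registryPoll:
      interval: 45m
EOF

# Verify that the catalog source is available.
oc get catalogsource ibm-operator-catalog -n openshift-marketplace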
For workloads that rely heavily on multi-threading, such as databases, the default pids-limit of
1024 is insufficient. This setting governs not only the number of processes but also the
number of threads, as a thread is essentially a process that shares memory. You will need to
change the worker node process ID limit.
As the Machine Config Operator (MCO) updates the machines in each machine config pool (MCP), it reboots each node one by one.
Example 3-30 provides the command to create the file that will accomplish this task. We set
our pidsLimit to 65536.
Example 3-30 Change the worker node PID limit at the CRI-O level
[root@rdbko7b1 ibm]# vi pids-limit.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: crio-pids-limit
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ''
  containerRuntimeConfig:
    pidsLimit: 65536
oc apply -f pids-limit.yaml
To verify the changes have been successful on the MCP, run the command shown in
Example 3-31.
Example 3-31 Verify the CRI-O configuration changes were successful on the MCP
[root@rdbko7b1 ibm]# oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED
MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-26c1e0d1f03394f2e2e53287cefee8fb True False False 3
3 3 0 37h
worker rendered-worker-c94d116d73511b4aea20b9ec0490001d True False False 3
3 3 0 37h
Once the worker nodes are rebooted, you can login and confirm the health of the nodes by
running the command shown in Example 3-32. Healthy nodes show a status of Ready.
Example 3-32 Verify the CRI-O configuration changes were successful on the worker nodes
[root@rdbko7b1 ibm]# oc get nodes
NAME STATUS ROLES AGE VERSION
rdbko7m1.ocp-dev.cpolab.ibm.com Ready control-plane,master 37h
v1.29.6+aba1e8d
rdbko7m2.ocp-dev.cpolab.ibm.com Ready control-plane,master 37h
v1.29.6+aba1e8d
rdbko7m3.ocp-dev.cpolab.ibm.com Ready control-plane,master 37h
v1.29.6+aba1e8d
rdbko7w1.ocp-dev.cpolab.ibm.com Ready worker 37h
v1.29.6+aba1e8d
rdbko7w2.ocp-dev.cpolab.ibm.com Ready,SchedulingDisabled worker 37h
v1.29.6+aba1e8d
rdbko7w3.ocp-dev.cpolab.ibm.com Ready worker 36h
v1.29.6+aba1e8d
See About listing all the nodes in a cluster for more details on node status.
Verify that the CRI-O PIDs limit has been applied, as shown in Example 3-33.
  finalizers:
  - 99-worker-generated-containerruntime
  generation: 1
  name: crio-pids-limit
  resourceVersion: "688733"
  uid: 3ae2cd55-b060-4b40-b734-011d19fba551
spec:
  containerRuntimeConfig:
    pidsLimit: 65536
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
status:
  conditions:
  - lastTransitionTime: "2024-08-18T05:26:15Z"
    message: Success
    status: "True"
    type: Success
  observedGeneration: 1
Perform the following tasks in order to create the Db2 Operator on the Red Hat OpenShift
Container Platform cluster in the db2 namespace:
1. Log into the Red Hat OpenShift Container Platform console as kubeadmin.
2. From the “hamburger” menu icon, select Operator Hub.
3. In the search bar, enter the keyword db2.
4. Select IBM Db2, provided by IBM to install the Operator.
5. Select Install.
6. Confirm you have the latest version - in our case, it was 110509.02.
7. Set the installation mode to a specific namespace on the cluster.
8. Select Install Operator in the db2 namespace.
9. Set Update approval to Manual.
10.Select Install.
11.When prompted, select Approve.
Verify the Db2U pods in the db2 namespace are running. Example 3-34 shows the command
we used to verify, and its output.
We will use the NFS storage we set up in section 3.2.3, “Db2 DPF install on LinuxONE” on
page 58 as our storage layer. We verify our pre-requisites before we install the Db2 instance.
Example 3-35 shows that all worker nodes are listed in our hosts file.
Next we will verify that our hosts (listed in Example 3-35) have access to the NFS storage
directory by examining the /etc/exports configuration file (Example 3-36). This configuration
file controls which file systems are exported to remote hosts and specifies options. For more
details on the options, see the following website:
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/7/html/storage_administration_guide/nfs-serverconfig#nfs-serverconfig-exports
Finally, we use the command shown in Example 3-37 to query the mount daemon for
information about the state of the NFS server on our machine. With no options, showmount
lists the set of clients that have mounted from that host. In this example, we see the IP
addresses of each of our hosts and can verify that they have the NFS storage directory
(/data/nfs_storage) mounted.
Example 3-37 Query the mount daemon for the mount list
[root@rdbko7b1 ibm]# showmount -e
For the following activities, log into Red Hat OpenShift Container Platform. In order to install
Db2 with a dependency on Db2U, you will need to create a db2u-product-cm ConfigMap to
specify whether Db2U runs with limited privileges or elevated privileges. In our example, we
chose to run Db2U with elevated privileges. We created the YAML file shown in
Example 3-38.
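Our understanding is that the ConfigMap toggles a single flag. A sketch of what we created is
shown below; the key name DB2U_RUN_WITH_LIMITED_PRIVS and its value reflect our reading of the
IBM Db2U documentation, so verify them against Example 3-38 and the current documentation
before applying:

# Create the db2u-product-cm ConfigMap in the db2 namespace so that Db2U runs
# with elevated privileges (key name is our assumption; verify before use).
cat <<EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: db2u-product-cm
  namespace: db2
data:
  DB2U_RUN_WITH_LIMITED_PRIVS: "false"
EOF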
From the bastion node (in our case, rdbko7b1 with an IP address of 9.76.61.174) on the Red
Hat OpenShift Container Platform, use the commands shown in Example 3-39 to export the
NFS storage.
You can now log into the cluster on the Red Hat OpenShift Container Platform. Our command
to login is shown in Example 3-40. The OpenShift Container Platform master includes a
built-in OAuth server. Users obtain OAuth access tokens to authenticate themselves to the
API. For more information on retrieving this token, see Describe the details of a user-owned
OAuth access token.
You have access to 70 projects, the list has been suppressed. You can list all
projects with 'oc projects'
Our next objective will be to create the NFS storage class for the Db2 database on RHOCP.
Since Db2 supports only NFS version 3, we must first change the NFS protocol to NFS3. To
do this, you will edit the NFS mount configuration file using the following command:
vi /etc/nfsmount.conf
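In that file, the change amounts to setting the default NFS version to 3. A sketch of the
relevant lines is shown below (Defaultvers is the standard option name in the RHEL
/etc/nfsmount.conf file; check the section and option names in your own copy of the file):

# /etc/nfsmount.conf -- force NFS version 3 for new mounts
[ NFSMount_Global_Options ]
Defaultvers=3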
Next we will create the storage class by using the command found in Example 3-42.
PLAY [localhost]
**********************************************************************************
TASK [include_role : utils]
**********************************************************************************
PLAY RECAP
**********************************************************************************
To check the NFS storage class, you will run the command shown in Example 3-43.
You can now proceed to create the Db2 instance and database. To do this, you will create a
YAML file, as shown in Example 3-44. This example reflects the configuration parameters for
our lab environment. You can change the configuration parameters to suit your own
environment.
    dataOnMln0: true
    total: 12
    volumePerPartition: true
  license:
    accept: true
  nodes: 2
  podTemplate:
    db2u:
      resource:
        db2u:
          limits:
            cpu: 8
            memory: 60Gi
  advOpts:
    memoryPercent: 99
  storage:
  - name: meta
    spec:
      accessModes:
      - ReadWriteMany
      resources:
        requests:
          storage: 40Gi
      storageClassName: managed-nfs-storage
    type: create
  - name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 40Gi
      storageClassName: managed-nfs-storage
    type: template
  - name: archivelogs
    spec:
      accessModes:
      - ReadWriteMany
      resources:
        requests:
          storage: 40Gi
      storageClassName: managed-nfs-storage
    type: create
  - name: backup
    spec:
      accessModes:
      - ReadWriteMany
      resources:
        requests:
          storage: 80Gi
      storageClassName: managed-nfs-storage
    type: create
  - name: tempts
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 40Gi
      storageClassName: managed-nfs-storage
    type: template
  version: s11.5.9.0-cn2
Ensure you are in the correct RHOCP project (namespace) by running the command shown
in Example 3-45. Our project is “db2”.
While the command shown in Example 3-45 only shows a subset of resources, it is sufficient
for our purposes. Basically, we are just reviewing the status of our pods.
At this point of the process, now that you have ensured everything is in a good state, you can
either apply these configuration details by using the following command:
oc apply -f db2_inst.yaml
Or, you can run the command shown in Example 3-47 to create the instance using the YAML
file you created in Example 3-44.
You will now verify the Db2 persistent volume claims (PVC) are bound, as shown in
Example 3-48. Note that the STORAGECLASS listed in this example is the storage class you
exported in Example 3-39, managed-nfs-storage.
Check the pods once again to ensure they are running, as shown in Example 3-49.
Verify the newly created Db2U instance is in a ready state, as shown in Example 3-50.
Run the command shown in Example 3-51 to verify the Db2 database service port.
We next want to set up basic load balancing by using the HAProxy configuration file. First, we
edit the /etc/haproxy/haproxy.cfg configuration file, as shown in Example 3-52.
To validate the Db2 instance and database on the Red Hat OpenShift Container Platform,
switch to the Db2 instance user ID and connect to the database by using the commands
shown in Example 3-53. The oc rsh command opens a remote shell session to a running
container so that you can manage it locally. In our example, we open a remote shell session
to our container (pod) before we try to connect to our database.
su - db2inst1
[db2inst1@c-db2u-cr-db2u-0 - Db2U ~]$ db2 connect to bludb
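Pulled together, the validation looks similar to the following sketch (the pod name
c-db2u-cr-db2u-0 and the database name bludb are taken from our example output):

# Open a remote shell into the Db2U pod, switch to the instance owner, and
# connect to the database. The last two commands run inside the remote shell.
oc rsh c-db2u-cr-db2u-0
su - db2inst1
db2 connect to bludb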
You also can review the migration steps in section 6.2.2 of the IBM Redbooks publication,
Practical Migration from x86 to LinuxONE, SG24-8377.
For more details on application analysis, see section 5.3 of the IBM Redbooks publication,
Practical Migration from x86 to LinuxONE, SG24-8377.
For applications developed within the company, ensure that you have the source code
available. Regarding the operating system platform, even a workload from a different platform
can be migrated, but start with servers running Linux. This approach substantially increases
the likelihood of a successful migration. Applications that require proximity to corporate
data stored on IBM LinuxONE are also ideal candidates, as are applications that have high
I/O rates because I/O workloads are off-loaded from general-purpose processors onto
specialized I/O processors.
IBM LinuxONE III has a powerful processor with a clock speed of 5.2 GHz. Because IBM
LinuxONE is designed to concurrently run disparate workloads, remember that some
workloads that required dedicated physical processors designed to run at high sustained
CPU utilization rates might not be optimal candidates for migration to a virtualized Linux
environment. This is because workloads that require dedicated processors do not take
advantage of the virtualization and hardware sharing capabilities. An example of such an
application might include video rendering, which requires specialized video hardware.
Different migration strategies, along with their pros and cons, are summarized in Table 3-2.

Table 3-2 Migration strategies with their pros and cons

Linux backup and restore. Migrate backed-up data from an operating environment to IBM LinuxONE.
  Pros: Relatively simple; low amount of downtime.
  Cons: Works only if the old and new servers have the same endianness.

Db2 backup and restore.
  Pros: Relatively simple; low amount of downtime.
  Cons: Works only if the old and new servers have the same endianness.

Table-by-table migration using LOAD FROM CURSORa.
  Pros: Does not require a lot of space to keep the unload files.
  Cons: Takes time to migrate the data; requires a direct connection with good network bandwidth between the old and the new Db2 databases.

Table-by-table migration using the db2look and db2move utilitiesb.
  Pros: Automated in a way that a single command dumps the data from a database with large numbers of tables.
  Cons: Takes time to migrate the data; requires more temporary storage space to hold the unloaded data.

Table-by-table migration using the Db2 HPU utility. IBM Db2 High Performance Unload (Db2 HPU) is a high-speed Db2 utility for unloading Db2 tables from either a table space or from an image copy. Tables are unloaded to one or more files based on a format that you specifyc.
  Pros: Much faster for large tables compared to the above two techniques; restore is from a backup image and hence much faster and easier to schedule.
  Cons: Takes time to migrate the data.
a. For more information on this command, see:
https://www.ibm.com/docs/en/db2/11.5?topic=data-moving-using-cursor-file-type
b. See section 6.2.2 of the IBM Redbooks publication, Practical Migration from x86 to LinuxONE,
SG24-8377.
c. For more information on this utility, see:
https://www.ibm.com/docs/en/dhpufz/5.2.0?topic=documentation-db2-high-performance-unload-overview
If the database is not using all of the available memory, reduce the server memory until it
starts paging. A system that is constantly swapping is the first indication of insufficient
memory.
High load averages are not always an indication of CPU bottlenecks. Monitor the LPAR
performance and determine if any other process or server that is running in the same LPAR is
competing for CPU time when the problem occurs.
Database data files and log files must be in different file systems and should be striped across
the storage hardware. Have multiple paths to the data to ensure availability.
The systems administrator and the database administrator must work together during the
sizing process. Database servers typically require adjustments at both the Linux and database
levels.
The first PostgreSQL release, known as version 6.0, was released on January 29, 1997, and
since then PostgreSQL has continued to be developed by the PostgreSQL Global
Development Group, a diverse group of companies and many thousands of individual
contributors. PostgreSQL version 16, released on September 14, 2023, is the 33rd major
release in over 37 years of development. PostgreSQL is available from Linux distributions like
Red Hat, SLES and Ubuntu and as a Docker image.
The PostgreSQL open source-based database has gained significant adoption in the last few
years. With a vibrant and engaged community, PostgreSQL has become a viable alternative
for enterprises trying to replace and modernize databases that support their legacy systems
of record. There are vendors like Fujitsu and EnterpriseDB (EDB) that enhance the
capabilities of open-source PostgreSQL and provide value added services.
In this chapter we will discuss a specific distribution of PostgreSQL provided by Fujitsu that
augments the capabilities of the open source edition to provide an enterprise-grade
experience and support. This distribution, which is combined with the IBM LinuxONE server,
delivers a robust solution.
Fujitsu Enterprise Postgres version 17 fully supports IBM LinuxONE and leverages the IBM
LinuxONE virtualization capabilities to provide a highly flexible, scalable, and resilient data
serving platform that enables organizations to adopt various data service architectures to
meet the needs of any business transformation initiative. Fujitsu Enterprise Postgres on IBM
LinuxONE embraces open-source innovation with the improved sustainability, performance,
scalability, and resiliency that IBM LinuxONE delivers.
Figure 4-1 provides a view of all the enterprise features covering development, operations,
and management of Fujitsu Enterprise Postgres.
All the above HA requirements are possible through various data replication modes that are
available with Fujitsu Enterprise Postgres. In this section, we describe how Fujitsu Enterprise
Postgres provides multiple ways to reliably back up and recover operational and
business-critical data in multiple scenarios.
Database multiplexing
Database multiplexing (also known as database mirroring) is a mode of operation that
enables HA by creating copies of the database running on other similar systems by using the
Postgres streaming replication technology. With the Fujitsu database multiplexing feature, the
database is mirrored and the capability of the databases to switch over from the primary
database to the standby database automatically is enhanced. The database multiplexing
mode of operation is managed by the software component that is called the Mirroring
Controller (MC).
Figure 4-2 shows a high-level architecture with switchover and failover capabilities that uses a
Mirroring Controller (MC).
The MC feature enables the primary server (the database server that is used for the main
jobs) to be switched automatically to the standby server if an error occurs in the former. In
addition, data on the standby server can be made available for read access, allowing the
standby server to be used for tasks such as data analysis and reporting in parallel to the
primary server workload.
Note: For more information, see the "Database Multiplexing Mode" section in the Fujitsu
Enterprise Postgres Cluster Operation Guide.
The MC is effective in performing failover and controlled switchover when any issues occur in
the primary database server. To resolve a split-brain scenario in data multiplexing mode,
Fujitsu Enterprise Postgres uses the Server Assistant program, which is described in the next
section, “Server Assistant: Avoiding split-brain scenarios” on page 90.
Figure 4-3 shows a high-level architecture with Server Assistant providing the quorum service
(also known as arbitration service) between the primary and standby server and fencing the
primary database server when the heartbeat fails between the primary and the standby
database servers.
Server Assistant is packaged with Fujitsu Enterprise Postgres Advanced Edition and provided
at no additional cost. Server Assistant can be installed on a different server that is also known
as an arbitration server.
Fujitsu Enterprise Postgres supports many backup and recovery methods that are available
through the GUI and command-line interface (CLI). Although the Fujitsu WebAdmin GUI tool
provides a one-click backup and recovery feature, the CLI utilities pgx_dmpall and pgx_rcvall
provide a more granular and automated backup and recovery capability.
Encryption
Files that are backed up by Fujitsu Enterprise Postgres are encrypted and automatically
protected even if the backup medium is lost or somehow accessed by unauthorized users. In
contrast, files that are generated by the equivalent PostgreSQL commands pg_dump and
pg_dumpall are not encrypted and must be encrypted by using OpenSSL commands or other
means before they are saved.
Connection Manager
Fujitsu Enterprise Postgres Connection Manager is a HA feature that provides transparent
connectivity to the Fujitsu Enterprise Postgres HA database cluster. Unlike MC and Server
Assistant, which are part of the database layer, Connection Manager is part of the application
layer. Connection Manager is configured on the application and database servers.
Figure 4-5 shows the operation of Connection Manager. Connection Manager monitors the
application server and Fujitsu Enterprise Postgres Database instances running on the
database servers, which are primary and standby database servers.
Figure 4-5 Connection Manager heartbeat monitoring and transparent connection support features
For more information on this topic, see chapter 7 in the IBM Redbooks publication, Data
Serving with FUJITSU Enterprise Postgres on IBM LinuxONE, SG24-8499, where we
described the setup of JDBC, ODBC drivers, examples of using the libpq library and
compiling a sample code, and embedded SQL in C.
Note: For more information about application development instructions, see the Fujitsu
Enterprise Postgres Application Development Guide
Note: For more information about the Oracle compatibility feature usage and precautions,
see the Fujitsu Enterprise Postgres Installation and Setup Guide.
The contrib modules that are listed in “OSS software peripheral devices that are supported
by Fujitsu” on page 95 are supported by the Fujitsu development team for resolving issues,
which means that customers do not need to wait for the community to provide a solution.
Instead, customers get enterprise-grade 24x7 support for the database software and
database management tools.
Table 4-3 OSS software peripheral devices that are supported by Fujitsu
Description | Open-source software name | Version and level
pg_dbms_stats 1.3.11
Note: For more information about the ORAFCE function within Fujitsu Enterprise Postgres,
see the following file:
fujitsuEnterprisePostgresInstallDir/share/doc/extension/README.asciidoc
For OSS peripheral tools version information, see the Fujitsu Enterprise Postgres General
Description Guide.
4.1.3 Security
Data security is an important area of focus for organizations. Organizations must ensure
the confidentiality and safety of the data that is available in electronic and physical form.
Organizations with even the most sophisticated technology have faced data breaches that
have resulted in expensive consequences of a legal nature. More importantly, damage might
be done to their reputation, which has a long-term impact on customer trust and diminished
confidence and morale in doing business or using services.
Data security generally follows a layered framework that covers every touch point that is
vulnerable to security threats. These layers include human, physical, application, databases,
and other detection technologies (see Figure 4-6). The aim of a layered security framework is
to make sure that a breach in one layer does not compromise the other layers, and in the
worst case, the entire data security.
Figure 4-6 Layered security architecture and common vulnerable data security breach points
The layered security framework secures the three states of data through the implementation
of security policies. These three data states can be classified as:
Data at rest
Data in motion
Data in use
Fujitsu Transparent Data Encryption (TDE) and Data Masking are the Fujitsu solution for
securing data at rest and data in use. TDE and Data Masking are included in Fujitsu
Enterprise Postgres Advanced Edition. In Fujitsu Enterprise Postgres on IBM LinuxONE, TDE
is further enhanced through integration with a Central Processor Assist for Cryptographic
Function (CPACF) instruction set on IBM LinuxONE to make Fujitsu Enterprise Postgres the
most secure enterprise-grade Postgres on the planet.
Fujitsu Enterprise Postgres comes with TDE integrated. Unlike some other proprietary
RDBMS products, you are not required to purchase extra products to gain this function.
Figure 4-7 shows Fujitsu Enterprise Postgres integrated to use CryptoCard based encryption
and keystore management.
Figure 4-7 Transparent Data Encryption: File-based and hardware security module-based keystore
management
Take advantage of the CPACF in the IBM Z processor to minimize encryption and
decryption overhead so that even in situations where previously the minimum encryption
target was selected as a tradeoff between performance and security, it is now possible to
encrypt all the data of an application.
Figure 4-8 shows a high-level implementation of Fujitsu TDE with openCryptoki for storing
master keys on the CryptoCard (also called HSM). openCryptoki allows Fujitsu Enterprise
Postgres to transparently use Common Cryptographic Architecture (CCA) and EP11
(Enterprise PKCS #11) firmware in a CryptoCard.
Data masking
Data masking enables data security governance by obfuscating specific columns or part of
the columns of tables that store sensitive data while still maintaining the usability of the data.
Here are some common use cases for partial or full obfuscation of information:
Test data management
This is the process of sanitizing production data for use in testing of information systems
with realistic data without exposing sensitive information to the application development or
infrastructure management teams, who might not have the appropriate authority and
clearance to access such information. Figure 4-9 shows the data masking use case for
test data management.
Figure 4-9 Data privacy management by transferring masked production data for testing
Compliance
Various organizations collect and store large amounts of complex data as part of business
operations and use analytics to run their business, which creates data privacy challenges
for the organizations to adhere to different data privacy regulations and compliance based
on regions, such as General Data Protection Regulation (GDPR), California Consumer
Privacy Act (CCPA), PCI-DSS, and HIPAA (Health Insurance Portability and Accountability
Act). Data masking is one of the most effective ways to be compliant to various regional
and international data privacy legislation and compliance.
Production data security
While data in production is accessed by people with different roles such as system
administrators, customer support, production application maintenance personnel, and
business process analysts, it is important that data masking is role-sensitive. Fujitsu Data
Masking provides the facility to implement role-sensitive masking. Figure 4-10 shows a
production environment example.
The following are some of the features of Fujitsu Data Masking technology:
Fujitsu Enterprise Postgres uses a flexible and easy to use policy-based data masking
technique. There are advantages to using Fujitsu policy-based Data Masking over other
techniques.
Policy-based data masking allows the development of data-sensitive masking policies for
different data classifications.
Data masking policies can be applied to tables for different columns that come under one
data classification or another.
After the policy-based data masking is applied, the policies can be disabled or enabled as
required without needing to remove or reapply them.
The Fujitsu Data Masking feature is irreversible because the relationship between the
original data and the masked data is severed during the masking policy implementation to
de-risk any reverse engineering to re-create the production data.
The data masking policy is applied at the time that the data is accessed, so queried data is
modified according to the masking policy before the results are returned in the query
chain.
Data masking policies can be configured to be applied to specific roles / conditions so that
the original data cannot be viewed without the appropriate privileges.
Referential integrity is maintained in the source database because the data masking policy
is applied during the data access process.
Masking policies generate data that is the same every time the masked data is accessed,
which maintains test data consistency across multiple iterations of accessing the original data.
Availability of catalog tables for querying data masking policy information to assess the
current sensitive data policy state of a database.
Figure 4-11 shows the three different types of data masking techniques that are offered with
Fujitsu Enterprise Postgres Advanced Edition.
Figure 4-11 Types of data masking that are available with Fujitsu Enterprise Postgres
Audit log
Dedicated audit logging is a unique feature to Fujitsu Enterprise Postgres that helps
organizations manage data accountability, traceability, and auditability. Audit Log is an
important security feature that provides a level of protection while also meeting compliance
expectations and protecting the brand and data asset value of an organization.
OSS PostgreSQL implements basic statement logging through the standard logging
parameter log_statement = all. This implementation is acceptable for monitoring and other
use cases, but this method does not provide the level of information that is required by
security analysts for an audit. The information that is generated is not enough to list what the
user requested. Ideally, the configuration must also establish what happened while the
database was satisfying each request.
Fujitsu Enterprise Postgres Audit Logging enables organizations to trace and audit the usage
of sensitive data and connection attempts of the database. Audit Logging provides a clear
picture of data access by logging the following details:
What data is accessed.
When the data is accessed.
How the data is accessed and who accessed the data.
The information that is provided by audit logs can be used to counter security threats such as:
Spoofing
Unauthorized database access
Privilege misuse
The Audit Log feature uses the pgaudit utility, which enables retrieval of details relating to
database access as an audit log. Additionally, audit logs are externalized to a dedicated log
file or server log to implement efficient and accurate log monitoring.
Figure 4-12 shows the functioning differences between the server logs and the dedicated
audit log (session log). The server log configuration parameters are set in the
postgresql.conf file, but the audit log configuration is managed in the pgaudit.conf
configuration file.
With pgaudit, the following two types of audit log can be output:
Session Audit Log
Object Audit Log
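For comparison, the following sketch shows how session and object audit logging are typically
configured with the open-source pgaudit module through postgresql.conf parameters. In
Fujitsu Enterprise Postgres, the dedicated audit log rules are maintained in the pgaudit.conf
file instead, so treat these parameter names as the OSS baseline and consult the product
documentation for the dedicated audit log syntax; the auditor role and customers table are
hypothetical.
# postgresql.conf (OSS pgaudit style)
shared_preload_libraries = 'pgaudit'
pgaudit.log = 'read, write, ddl, role'  # statement classes captured in the session audit log
pgaudit.log_relation = on               # log each relation that a statement references
pgaudit.role = 'auditor'                # object audit log: audit access granted to this role
-- In the target database
CREATE EXTENSION pgaudit;
CREATE ROLE auditor;
GRANT SELECT, UPDATE ON customers TO auditor;  -- SELECT and UPDATE on customers are now object-audited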
4.1.4 Performance
One of the driving forces behind business innovation is using information to gain insight about
business and respond as needed. This task becomes more challenging because the world
generates 10 zettabytes of data every year (a zettabyte is a trillion gigabytes), and this
number is projected to increase to 175 zettabytes by 2025.
Regardless of the choices that businesses make, IT systems must adapt to keep pace with
exponential data growth and be able to quickly analyze growing amounts of data. Every
minute that is saved in ingesting and analyzing data has tremendous value, so the demand to
quickly perform analysis puts tremendous pressure on IT systems. Business and data
analysts are always looking for faster ways of processing data aggregation and manipulation
to support business leaders in making business decisions.
Fujitsu Enterprise Postgres offers three performance features for enhancing ingesting and
analyzing data by fully optimizing the processor, memory, storage, and I/O resources:
VCI
Fujitsu High-Speed Data Load
Global Meta Cache (GMC)
Columnar index
Columnar storage architecture has gained attention with the introduction of Hybrid
Transactional Analytical Processing (HTAP), which merges OLTP and OLAP capabilities as an
alternative to connecting separate OLTP and OLAP systems.
One of the major challenges that exists in various product implementations of
HTAP processing is maintaining the ACID compliance of an RDBMS. The goal of HTAP data
processing is to remove the ETL processing to an extent that enables real-time analytics on
the OLTP systems before they go to OLAP systems through ETL, which traditionally has been
the de facto method to connect OLTP and OLAP DBMSs. Although row-oriented data is better
suited to OLTP processing, it’s the column-oriented data that is best suited for data analysis
because it improves aggregation performance.
VCI has two components to achieve aggregation performance: one is on disk, and the other
is in memory. Both components are managed by the VCI Analysis Engine.
Figure 4-13 shows that the VCI Analysis Engine combines VCI columnar storage on disk by
using the pgx_prewarm_vci utility to preload VCI pages and make it memory resident. VCI
data is stored in the dedicated portion of the shared buffers on memory. VCI memory
allocation share is controlled by the configuration parameter reserve_buffer_ratio.
VCI benefits
VCI provides the following benefits:
Minimizes the impact on existing jobs and performs aggregation by using job data in real
time.
VCI is crash-safe. It also stores data on the disk, so aggregation jobs can be quickly
resumed by using a VCI even if a failure occurs (when an instance is restarted).
If the amount of memory that is used by VCI exceeds the set value, aggregation can
continue by using VCI data on the disk to avoid impacting on-going operations.
VCI is provided as an index, so no application modification is required.
In HA architectures, VCI provides the opportunity to optimize the extra computing power
that is available with standby servers for performing resource-intensive data analysis
faster.
VCI features
The following are some of the features of VCI:
Disk compression: Compresses VCI data on the disk, minimizing the required disk space.
Even if disk access is required, the read overhead is low.
Parallel scan: Enhances aggregation performance by distributing aggregation processes
to multiple CPU cores and then processing them in parallel.
Preload feature: Preloads VCI data into memory before an application scans VCI and
ensures stable query response times.
Stable buffer feature: Maintains VCI data in memory, which improves query performance
because it reduces disk I/Os by avoiding VCI eviction from memory by other jobs.
Note: The Preload and Stable buffer features keep VCI data in memory and minimize disk
I/Os on each aggregation process.
For more information about the evaluation criteria for installing VCI, resource estimation,
and configuration steps, see 9.1, “Installing Vertical Clustered Index (VCI)”, of the Fujitsu
Enterprise Postgres Operation Guide.
Query planner
VCI optimizes the available CPU and memory within the server to enhance scan
performance. It further speeds up execution time for the following use cases:
Single table processing
Queries processing large aggregations, such as simultaneous sum and average
operations
Queries fetching and processing many rows in the tables
Before the aggregation is performed, the query planner computes the execution cost and,
based on that cost, implements the most cost-efficient plan. The planner therefore decides
whether VCI processing is used, regardless of whether a VCI is available on the table.
Furthermore, the parallelization of the aggregation process further reduces the query
execution time with VCI. Together, both features provide HTAP system design capability to the
database designers.
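As a sketch of how a VCI is used, the following assumes the CREATE INDEX ... USING vci
syntax and a vci extension enabled on the instance as described in the Fujitsu Enterprise
Postgres documentation; the sales table and its columns are hypothetical.
CREATE EXTENSION vci;
CREATE INDEX sales_vci ON sales USING vci (sale_date, region, amount);
-- The planner can now satisfy large aggregations from the columnar index when it is cheaper:
SELECT region, sum(amount), avg(amount)
FROM sales
WHERE sale_date >= date '2024-01-01'
GROUP BY region;
No application change is required: if the planner estimates that a row-based plan is cheaper,
it simply ignores the VCI.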
Fujitsu High-Speed Data Load
Fujitsu High-Speed Data Load distributes the data from an input file to several parallel
workers. Each worker performs data conversion, table creation, and index creation, which
tremendously reduces the data load time.
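A minimal sketch of a parallel load follows, assuming the pgx_loader client command that the
Fujitsu Enterprise Postgres documentation provides for this feature; the option names, file
path, and sales table used here are assumptions that should be verified against the Operation
Guide.
$ pgx_loader load -j 4 -c "COPY sales FROM '/data/sales.csv' WITH (FORMAT csv)"
In this sketch, the -j option sets the number of parallel workers that convert and insert the data.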
Global Meta Cache (GMC)
Query processing goes through a few steps before a query execution plan is created. The
steps include parsing the query, creating a plan, and running the plan. To perform these
steps, the PostgreSQL process accesses the system catalogs, during which the system
catalog data is cached in memory per process.
Moreover, the metadata that is required to access the tables that are referenced in SQL
queries and stored in catalog tables is also aggregated and cached. This cached information
in the memory is called a meta cache. The purpose of a meta cache is to improve the query
processing performance by providing metadata in memory to avoid metadata searches in the
catalog table each time. As the number of transactions and connections and their size
increases, the cache size increases because the metadata is cached per process, which
increases the system’s overall meta cache and causes cache bloat, which impacts the
performance. This performance issue is resolved by GMC.
Fujitsu GMC is a method to avoid cache bloat by centrally managing the meta cache in
shared memory instead of per process. Because the meta cache is managed centrally, it is
called the Global Meta Cache.
Optimizer hints
Optimizer hints enable Fujitsu Enterprise Postgres to stabilize the query plan in
mission-critical systems. Users can use pg_hint_plan to specify a query plan as a hint in each
individual SQL statement.
Fujitsu Enterprise Postgres uses the open-source module pg_hint_plan to provide optimizer
hints for query planning. The query planner estimates the cost of each possible plan that is
available for the SQL statement and picks the lowest-cost plan because the query planner
considers it the best plan. However, that plan might not be the best one because the query
planner does not know some details, such as the association between columns. Optimizer
hints come in handy in such situations because they suggest that the planner choose a
specified plan.
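The following sketch shows the pg_hint_plan comment syntax; the tables, index, and join
order are hypothetical, and pg_hint_plan must be loaded (for example, through
shared_preload_libraries) before hints take effect.
/*+ IndexScan(o idx_orders_customer) NestLoop(o c) Leading(c o) */
SELECT o.order_id, o.total, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE c.region = 'EMEA';
The hint block is written as a special comment at the start of the statement, so the SQL itself
remains valid on servers where the module is not installed.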
In addition, Fujitsu Enterprise Postgres on IBM LinuxONE works with the on-chip
compression accelerator IBM zEnterprise Data Compression (IBM zEDC) to ensure efficient
backup operations in the event of database or hardware failures. IBM zEDC reduces CPU
costs by 90% and processing time during data compression by 40% compared to traditional
software compression. For more information about IBM zEDC, see Data compression with
the Integrated Accelerator for zEDC.
WebAdmin is the Fujitsu graphical user interface (GUI) tool for database installation and
database operations management. It is used for the following tasks:
Fujitsu Enterprise Postgres setup. Instances can be created easily with minimal input
because the tool automatically determines the optimal settings for operation.
Creating and monitoring a streaming replication cluster.
Database backups and recovery.
Managing Fujitsu Enterprise Postgres instances in a single server or instances that are
spread across multiple servers.
– Recovery that uses WebAdmin requires less time and effort because WebAdmin
automatically determines the scope of the operation.
Wallet feature.
– The Wallet feature in WebAdmin is used to conveniently create and store instance
username and passwords.
– This feature can be used to create usernames and passwords when creating remote
standby instances through WebAdmin.
– After the credentials are created in Wallet, they can be used repeatedly.
Anomaly Detection and Resolution.
When certain operations are performed through the command line interface (CLI), they
can cause anomalies in WebAdmin. These operations are:
– Changes to the port and backup_destination parameters in the postgresql.conf file.
– Changes to the MC configuration of cluster replication that is added through
WebAdmin.
WebAdmin checks for anomalies when an instance is selected for viewing or when any
instance operation is performed.
WebAdmin configurations
WebAdmin can be installed as either of two configurations:
Single server
Multi-server
Note: WebAdmin does not support encrypted communication between a browser and
server or between servers. Hence, when using WebAdmin in either configuration, the
network communication path should be built on a private network with no external access.
For more information about WebAdmin installation instructions, see Fujitsu Enterprise
Postgres Installation/Setup.
Offline migration is carried out during a migration window and the database is not available for
write operations. In an offline migration, the definition of each selected object is read and
written to a SQL script. This SQL script is used to migrate the Postgres database to Fujitsu
Enterprise Postgres.
The total time of an offline migration depends on the database size and the time that is taken
to lift and shift the selected objects. The downtime in the case of an offline migration
increases significantly when migrating large databases.
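A minimal sketch of such an offline lift and shift using the standard PostgreSQL tools follows;
the database name, host, and file paths are hypothetical, and larger systems typically add
compression and parallel dump or restore options.
# On the source PostgreSQL server: export object definitions and data
$ pg_dump --schema-only -d appdb -f /migrate/appdb_ddl.sql
$ pg_dump --data-only -Fc -d appdb -f /migrate/appdb_data.dump
# On the Fujitsu Enterprise Postgres target: replay the SQL script, then load the data
$ psql -h fep-host -d appdb -f /migrate/appdb_ddl.sql
$ pg_restore -h fep-host -d appdb /migrate/appdb_data.dump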
We can use PostgreSQL logical replication to perform an online migration to Fujitsu Enterprise
Postgres. Logical replication is a method of replicating data objects and their changes based
on their replication identity (usually a primary key). Logical replication works on the
publisher-subscriber model, where the primary server is defined as the publisher and the
destination server as the subscriber. For a detailed example of performing an online migration,
see the IBM Redbooks publication Data Serving with Fujitsu Enterprise Postgres on IBM
LinuxONE.
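A minimal sketch of this publisher-subscriber setup follows; the host name, database, and
credentials are hypothetical, wal_level must be set to logical on the source, and the target
schema must already exist because logical replication does not copy DDL.
-- On the source (publisher) database
CREATE PUBLICATION mig_pub FOR ALL TABLES;
-- On the Fujitsu Enterprise Postgres target (subscriber) database
CREATE SUBSCRIPTION mig_sub
  CONNECTION 'host=source-host dbname=appdb user=repluser password=secret'
  PUBLICATION mig_pub;
After the initial table synchronization completes and the subscriber has caught up, applications
can be switched over to the new server with minimal downtime.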
4.3.1 Overview
As organizations modernize and consolidate their existing applications, it is essential to have
the flexibility to manage and deploy the entire application portfolio on infrastructures that offer
flexibility, security and scalability.
Some of the major advantages of having EnterpriseDB Postgres deployed on IBM
LinuxONE are:
Performance: Faster response times combined with extreme scalability for data
management and analytics workloads.
Combining EnterpriseDB Postgres with IBM LinuxONE offers high levels of vertical
and horizontal scalability, and the flexibility and high utilization levels that cannot be achieved
with distributed infrastructures. Also, with high VM or container saturation levels on a
single LinuxONE server, it offers great economic advantages and reduces power
and data center footprint from a sustainability perspective.
EnterpriseDB Postgres customers benefit from the most reliable, high-performing, flexible,
open, and cost-effective data management platform available.
From a disaster recovery perspective, standby nodes are deployed to the disaster recovery
site, and asynchronous streaming replication is set up between the primary node and the
disaster recovery standby nodes.
The EnterpriseDB Failover Manager (EFM) agent manages the health of the nodes, detects
any failures, and takes the appropriate failover recovery actions. Within the production data
center, failover happens to the current standby node, and EFM attaches the virtual IP to the
newly promoted primary node. This automatic failover provides near-zero downtime
maintenance with controlled switchover and minimal data loss. In the case of a disaster, a
manual failover mechanism is activated for the DR switchover, as shown in Figure 4-14.
With this kind of streaming replication, the EnterpriseDB deployment can achieve a near-zero
recovery point objective (RPO).
We downloaded EDB v14.10 and placed it in the /installables folder. Before installing, the
firewall must be stopped and an EDB Postgres repository must be created so that the yum or
dnf command can be used to install the binaries.
After the installables are untarred, the next step is to create a repository, as shown below.
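One way to turn the untarred packages into a repository is to create a local yum repository
that points at them; the directory layout, repository ID, and use of createrepo below are
assumptions for this lab setup, and gpgcheck is disabled only because the packages were
obtained directly from EDB.
# createrepo /installables/edb
# cat /etc/yum.repos.d/edb-local.repo
[edb-local]
name=EDB Postgres Advanced Server local repository
baseurl=file:///installables/edb
enabled=1
gpgcheck=0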
Now that the repository is set up, we can go ahead and install EDB Postgres on the virtual
machine.
=================================================================
Package Arch Version Repository Size
=================================================================
Installing:
edb-as14-server s390x 14.4.0-1.rhel8 EDB
9.6 k
Installing dependencies:
boost-atomic s390x 1.66.0-13.el8
rhel-8-for-s390x-appstream-rpms 14 k
boost-chrono s390x 1.66.0-13.el8
:
Transaction Summary
=================================================================
Install 28 Packages
Downgrade 8 Packages
Total size: 91 M
Total download size: 56 M
Is this ok [y/N]: y
Downloading Packages:
(1/16): bcc-0.16.0-3.el8.s390x.rpm 4.2 MB/s | 1.2
MB 00:00
:
(16/16): clang-libs-11.0.1-1.module+el8.4.0+12483+89b287b0.s390x.r 9.9 MB/s | 20
MB 00:02
-----------------------------------------------------------------
Total 14 MB/s | 56
MB 00:03
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing :
1/1
Installing : edb-as14-server-libs-14.4.0-1.rhel8.s390x
1/44
:
:
Installed:
edb-as14-pgagent-4.2.2-2.rhel8.s390x
edb-as14-server-14.4.0-1.rhel8.s390x
edb-as14-server-client-14.4.0-1.rhel8.s390x
:
:
llvm-11.0.0-2.module+el8.4.0+8598+a071fcd5.s390x
Complete!
[root@rdbkll01 ~]#
The installation places the binaries in the /usr/edb/as14 folder. When we switch user (using
the su command), the default bash home is /var/lib/edb.
At this point, we switch to the enterprisedb user ID and set the PGHOME path and INITDB
environment variables in the .bash_profile file. This enables us to automatically apply the
default path and INITDB parameters.
Make sure that the PGHOME path is applied to the bash shell. From the root user ID, switch
to the enterprisedb user and initialize the database by using the initdb command with the
required character settings.
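A sketch of the environment settings and initialization follows; the paths are the defaults used
in this environment, and the encoding and locale options are chosen to match the character
settings shown in the psql \l output later in this section.
# Appended to /var/lib/edb/.bash_profile for the enterprisedb user
export PGHOME=/usr/edb/as14
export PGDATA=/var/lib/edb/as14/data
export PATH=$PGHOME/bin:$PATH
# Initialize the cluster as the enterprisedb user (initdb refuses to run as root)
$ $PGHOME/bin/initdb -D $PGDATA -E UTF8 --locale=en_US.UTF-8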
After the database is initialized, it is time to start the EDB Postgres server.
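Either of the following approaches can start the server; the systemd unit name edb-as-14 is
the one the EDB packages normally install, but verify it on your system, and the pg_ctl form
assumes the data directory path set earlier.
# As root, through systemd
# systemctl start edb-as-14
# Or as the enterprisedb user, directly with pg_ctl
$ /usr/edb/as14/bin/pg_ctl start -D /var/lib/edb/as14/data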
Now that the EDB Postgres server is successfully started, let's verify that the database server
process is running. To do this, we switch from the root user to the enterprisedb user and
issue the pg_ctl command.
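For example, the following commands switch to the enterprisedb user and check the server
status; the data directory path matches the defaults used earlier.
# su - enterprisedb
$ /usr/edb/as14/bin/pg_ctl status -D /var/lib/edb/as14/data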
Now that we know the database process is running, let's log in to the database shell and list
the available databases, as shown in Example 4-11.
Example 4-11 Issue sql statements on the new EDB Postgres Server.
[enterprisedb@rdbkll01 ~]$ /usr/edb/as14/bin/psql -d edb -U enterprisedb
psql (14.4.0, server 14.4.0)
Type "help" for help.
edb=# \l
                                        List of databases
   Name    |    Owner     | Encoding |   Collate   |    Ctype    | ICU |       Access privileges
-----------+--------------+----------+-------------+-------------+-----+-------------------------------
 edb       | enterprisedb | UTF8     | en_US.UTF-8 | en_US.UTF-8 |     |
 postgres  | enterprisedb | UTF8     | en_US.UTF-8 | en_US.UTF-8 |     |
 template0 | enterprisedb | UTF8     | en_US.UTF-8 | en_US.UTF-8 |     | =c/enterprisedb              +
           |              |          |             |             |     | enterprisedb=CTc/enterprisedb
Important: Up to this point, all of the preceding steps apply to both the Postgres primary and
secondary nodes. From here on, the steps differ because we deploy EDB Failover Manager
for replication streaming purposes.
Next, we change the connectivity mode in the pg_hba.conf file that is located in the
/var/lib/edb/as14/data/ folder. For ease of setup, the authentication method is set to trust,
but in a real production setup it should be set to a more secure authentication method, such
as scram-sha-256.
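The following pg_hba.conf entries are a sketch of the lab configuration; the client subnet is an
assumption for this environment, and in production the trust method should be replaced with a
password-based method such as scram-sha-256.
# /var/lib/edb/as14/data/pg_hba.conf (lab values only)
host    all            all    192.168.1.0/24    trust
host    replication    all    192.168.1.0/24    trust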
So, the data must be managed. The organized, related data is called a database, and the
technology for managing the database is called a DBMS. Two prominent DBMS technologies
are:
RDBMS:
– Supports the relational data model and has the following characteristics:
• Maturity
• Reliability
• High availability
• ACID compliance
• Security
• Skills availability
– Db2, Oracle, PostgreSQL, and SQL Server are examples.
In this chapter, we discuss one of the NoSQL databases, MongoDB, and how that database
fits well with and leverages the LinuxONE architecture. The following topics are covered:
MongoDB architecture
MongoDB high availability
MongoDB scalability
Why MongoDB on LinuxONE
MongoDB as a Service
MongoDB architecture
MongoDB is good at handling both structured and unstructured data. The basic components
of MongoDB are:
Documents
MongoDB is a document database and stores all the related data together in binary JSON
(BSON) documents, offering a schema-less model that provides flexibility in terms of
database design. JSON and BSON documents do not require a predefined schema and allow
the data structure to change over time.
The following is an example of a MongoDB document:
{
_id: ObjectId("15cf23abcdef11c23a111a5b"),
string: 'First Record',
first_name: 'Sam',
last_name: 'Amsavelu',
object: { a: 100, b: false },
array: [ 1, 2, 3 ]
}
Collections
Collections are groupings of documents, like tables in an RDBMS, but without a fixed schema.
Each collection can have one or more documents.
Databases
A MongoDB instance can have one or more databases and each database contains one or
more collections. There are default databases, Admin and Local Databases, which are used
for system and internal operations.
Application
This is the software or program that interacts with the MongoDB database. The application
sends queries to the database and receives and processes the results. The application can
be any type of software, such as a web application, mobile app, or desktop application. In a
modern web application, this would be your API. A back-end application built with Node.js (or
any other programming language with a native driver) makes requests to the database and
relays the information back to the clients.
Drivers
The driver is a software library that allows the application to interact with the MongoDB
database. The driver provides an interface for the application to send queries to the database
and receive and process the results. MongoDB supports drivers written in many
programming languages, such as Java, Python, Node.js, and C#. Basically, the drivers are
client libraries that offer interfaces and methods for applications to communicate with
MongoDB databases. Drivers will handle the translation of documents between BSON
objects and mapping application structures.
Storage engines
MongoDB supports multiple storage engines, including the default WiredTiger storage engine
and the older MMAPv1 engine. WiredTiger offers better performance and scalability, while
MMAPv1 is simpler and can be a good choice for small deployments.
MongoDB Server
MongoDB Server is responsible for maintaining, storing, and retrieving data from the
database through several interfaces. mongod is the MongoDB daemon, also known as the
server for MongoDB. The MongoDB server listens for connections from clients on port 27017
by default, and stores data in the /data/db default directory when you use mongod. Each
mongod server instance is in charge of handling client requests, maintaining data storage,
and performing database operations. Several mongod instances work together to form a
cluster in a typical MongoDB setup.
mongos
mongos is a proxy that sits between the client application (mongo/mongosh) and a sharded
database cluster, that is multiple mongod replica sets. The mongos proxy is a map that the
mongo client can use to query or make changes in the cluster when the data is sharded. This
in-between proxy is required because the MongoDB cluster doesn’t know which shard(s) that
data exists on, but the mongos proxy does. We will discuss more about the sharded
databases in the coming section.
MongoDB offers high availability through its replication feature, which involves the use of
replica sets. A replica set is a group of MongoDB servers that maintain the same data set,
providing redundancy and increasing data availability. In simple terms, MongoDB replication
is the process of creating a copy of the same data set in more than one MongoDB server.
MongoDB handles replication through a Replica Set, which consists of multiple MongoDB
nodes that are grouped together as a unit. A replica set in MongoDB is a group of mongod
processes that maintain the same data set. Replica sets provide redundancy and high
availability, and are the basis for all production deployments.
A MongoDB Replica Set requires a minimum of three MongoDB nodes:
One of the nodes will be considered the primary node that receives all the write operations.
The others are considered secondary nodes. These secondary nodes will replicate the data
from the primary node.
The primary records all changes to its data sets in its operation log (oplog).
The secondaries replicate the primary's oplog and apply the operations to their data sets
asynchronously such that the secondaries' data sets reflect the primary's data set.
If the primary is unavailable, or when a primary does not communicate with the other
members of the replica set for more than the configured period (ten seconds by default), an
eligible secondary calls for an election to nominate itself as the new primary. The cluster
attempts to complete the election of a new primary and resume normal operations.
The replica set cannot process write operations until the election completes successfully.
The replica set can continue to serve read queries if such queries are configured to run on
secondaries while the primary is offline.
Each replica set node must belong to one, and only one, replica set. Replica set nodes
cannot belong to more than one replica set.
Replica sets are platform independent.
In relational databases such as Db2 or Oracle, the data is shared across instances for high
availability; in MongoDB, the data itself is replicated. This results in additional storage and
resource requirements. Also, because only the primary node takes all of the write operations,
throughput and scalability are usually reduced.
MongoDB scalability
Database systems with large data sets or high throughput applications can challenge the
capacity of a single server. Very high query rates can exhaust the CPU capacity of the server
and if the working set size is larger than the system's memory then it can stress the I/O and
reduce the throughput. This problem can be addressed by either vertical scaling or horizontal
scaling.
Vertical scaling
Vertical scaling involves increasing the hardware capabilities of the server, such as adding
CPUs, adding more RAM, and increasing the amount of storage space.
A system like IBM LinuxONE is a very good example of meeting the vertical scalability
requirements of applications because it allows CPU, memory, and storage to be added dynamically.
Limitations in other platform technologies or cloud operations may restrict a single machine
from being sufficiently powerful for a given workload.
Horizontal scaling
Horizontal Scaling involves dividing the system dataset and load over multiple servers,
adding additional servers to increase capacity as required. Distributing the load reduces the
strain on the required hardware resources, however horizontal scaling increases the
complexity of underlying architecture.
MongoDB supports horizontal scaling through sharding.
MongoDB sharding
MongoDB sharding works by creating a cluster of MongoDB instances consisting of at least
three servers. A MongoDB sharded cluster consists of the following components:
shard:
A shard is a single MongoDB instance that holds a subset of the sharded data. To increase
availability and provide redundancy shards must be deployed as replica sets. The
combination of multiple shards creates a complete data set. For example, a 2 TB data set can
be broken down into four shards, each containing 500 GB of data from the original data set.
mongos:
The mongos act as the query router providing an interface between the application and the
sharded cluster. This MongoDB instance is responsible for routing the client requests to the
correct shard. mongos also support hedged reads to minimize latencies.
config servers:
Config servers store metadata and configuration settings for the whole sharded cluster.
Config servers must be also deployed as a replica set.
Shard Keys:
MongoDB uses the shard key to distribute the collection's documents across shards. The
shard key consists of a field or multiple fields in the documents. When sharding a MongoDB
collection, a shard key gets created as one of the initial steps. The shard key is immutable
and cannot be changed after sharding. A sharded collection only contains a single shard key.
The choice of shard key affects the performance, efficiency, and scalability of a sharded
cluster.
MongoDB supports two sharding strategies Hashed Sharding and Ranged Sharding for
distributing data across sharded clusters.
For a more detailed explanation of MongoDB high availability, sharding and its limitations, and
other operations, see the MongoDB documentation.
mongodb-enterprise-7.0.14
mongodb-enterprise-database-7.0.14
mongodb-enterprise-server-7.0.14
mongodb-mongosh-7.0.14
mongodb-enterprise-mongos-7.0.14
mongodb-enterprise-tools-7.0.14
In our lab environment, we were running Red Hat Enterprise Linux release 8.4 guests.
We downloaded the current MongoDB Enterprise Server 7.0.14 server package for Red Hat
on s390x from the MongoDB Download Center.
We followed these steps in our lab environment to install MongoDB Enterprise Edition by
using the yum package manager.
[mongodb-enterprise-7.0]
name=MongoDB Enterprise Repository
baseurl=https://repo.mongodb.com/yum/redhat/8/mongodb-enterprise/7.0/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://pgp.mongodb.com/server-7.0.asc
The MongoDB Enterprise repository contains the following officially supported packages:
mongodb-enterprise
• mongoldap
• mongokerberos
• install_compass script
• mongodecrypt binary
We have now installed the MongoDB Enterprise packages, and any required dependent
modules were also installed automatically.
Directory Paths
The package manager creates the default directories for data and log during installation. The
owner and group name are mongod. By default, MongoDB runs using the mongod user
account and uses the following default directories for data and log.
/var/lib/mongo (the data directory)
/var/log/mongodb (the log directory)
If you want to use a data directory or a log directory other than the default directories, create
them and change the ownership of the directories to mongod. Then, edit the configuration file
/etc/mongod.conf and modify the following fields accordingly:
storage.dbPath to specify a new data directory path
systemLog.path to specify a new log file path
Start MongoDB
The MongoDB mongod process can be started by issuing the following command:
# systemctl start mongod
You can verify that MongoDB has started successfully by issuing the following command:
You can check the MongoDB version by issuing the following command:
# mongod --version
You can start a mongosh session, the MongoDB command-line shell, on the same host
machine where mongod is running.
You can run mongosh without any command-line options to connect to a mongod that is
running on your localhost with the default port 27017:
# mongosh
By default, MongoDB launches with bindIp set to 127.0.0.1, which binds to the localhost
network interface. This means that the mongod can only accept connections from clients that
are running on the same machine. Remote clients will not be able to connect to the mongod,
and the mongod will not be able to initialize a replica set unless this value is set to a valid
network interface.
This value can be configured in the MongoDB configuration file /etc/mongod.conf by setting
the bindIp value to 0.0.0.0.
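The following /etc/mongod.conf excerpt is a sketch that combines the default paths described
earlier with a bindIp opened to all interfaces; in production, bind only to the required interfaces
and enable access control.
storage:
  dbPath: /var/lib/mongo
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
net:
  port: 27017
  bindIp: 0.0.0.0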
Stop MongoDB
You can stop the mongod process by issuing the following command:
# systemctl stop mongod
Restart MongoDB
You can restart the mongod process by issuing the following command:
# systemctl restart mongod
Refer to Chapter 6, “MongoDB as a service with Linux on IBM Z”, in the IBM Redbooks
publication Leveraging LinuxONE to Maximize Your Data Serving Capabilities for a description
of how IBM LinuxONE, combined with IBM Storage, provides high availability (HA),
performance, and security by using a sample anonymized client environment. That chapter
includes a use case that demonstrates setting up a database as a service that can be
replicated on a much larger scale across multiple client sites.
and building applications to access the information. And they are looking for the ability to
change the applications quickly, if not dynamically.
Robust data analysis is needed to gain insights and make informed decisions. Scalability and
sustainability are other important factors when performing robust data analysis. Enterprises
want to use data and AI to optimize their departments, such as:
Marketing: Optimize customer reach.
Sales: Uncover client needs.
Operations: Automate business processes.
Finance: Accurately forecast performance.
HR: Attract top talent.
IT: Optimize IT spending.
So, the data must be managed. Organized collections of data arranged for ease and speed of
search and retrieval are called databases and the technology for managing databases is
called a database management system or DBMS. Two prominent DBMS technologies are:
Relational database management systems (RDBMS):
– Supports the relational data model and has the following characteristics:
• Maturity
• Reliability
• High availability
• Atomicity, consistency, isolation, and durability (ACID) compliance
• Security
• Skills availability
– Db2, Oracle, PostgreSQL and SQL Server are examples of RDBMS.
NoSQL is an alternative approach to an RDBMS that focuses on providing a way to store
and retrieve data. NoSQL database designs are an alternative to the RDBMS for the
following reasons:
– They do not use a relational (tabular) data model and they have NoSQL (not only
SQL) and other interfaces.
– They handle large volumes of data that change rapidly, and the data can be structured or
unstructured.
– There is no predefined schema.
– They support Agile and DevOps methodologies for quick delivery of applications in weeks
rather than months.
– The following are types of NoSQL databases:
• Key-value stores: Store pairs of keys and values and retrieve values when a key is
known. Redis is an example.
• Document stores: An extension of key-value stores that is schema free to organize
data, with data stored in a JSON-like format. MongoDB and CouchDB are examples.
• Wide column stores: Two-dimensional key-value stores that support a very large
number of dynamic columns. Cassandra is an example.
• Graph DBMS: Represent data in graph structures as nodes and edges, which are
relationships between nodes; used for social connections, security, and fraud
detection. Neo4j is an example.
In this chapter we will discuss one of the NoSQL databases, MongoDB, and how that
database can satisfy your data serving needs while leveraging the LinuxONE architecture.
MongoDB is great at handling large volumes of data; the name Mongo is derived from
"humongous". One of the benefits of the LinuxONE architecture is that it efficiently handles
large I/O transaction volumes, and MongoDB takes advantage of that.
– Documents
MongoDB is a document database and stores all the related data together in binary
JSON (BSON) documents, offering a schema-less model that provides flexibility in terms
of database design. The following is an example of a MongoDB document:
{
_id: ObjectId("15cf23abcdef11c23a111a5b"),
string: 'First Record',
first_name: 'Sam',
last_name: 'Amsavelu',
object: { a: 100, b: false },
array: [ 1, 2, 3 ]
}
– Collections
Collections are grouping of documents, like tables in RDBMS, but without a fixed
schema. Each collection can have one or more documents.
– Databases
A MongoDB instance can have one or more databases and each database contains
one or more collections. There are default databases, Admin and Local Databases,
which are used for system and internal operations.
MongoDB Application layer
In a MongoDB deployment, the application layer is responsible for communicating with the
database. To ensure secure access to the data, requests are initiated from this tier.
– Application
This is the software or program that interacts with the MongoDB database. The
application sends queries to the database and receives and processes the results. The
application can be any type of software, such as a web application, mobile app, or
desktop application. In a modern web application, this would be your API. A back-end
application built with Node.js (or any other programming language with a native driver)
makes requests to the database and relays the information back to the clients.
– Drivers
The driver is a software library that allows the application to interact with the MongoDB
database. The driver provides an interface for the application to send queries to the
database and receive and process the results. MongoDB supports drivers written in
many programming languages, such as Java, Python, Node.js, and C#. Basically, the
drivers are client libraries that offer interfaces and methods for applications to
communicate with MongoDB databases. Drivers will handle the translation of
documents between BSON objects and mapping application structures.
MongoDB core components
– Storage engines
MongoDB supports multiple storage engines, including the default WiredTiger storage
engine and the older MMAPv1 engine. WiredTiger offers better performance and
scalability, while MMAPv1 is simpler and can be a good choice for small deployments.
– MongoDB Server
MongoDB Server is responsible for maintaining, storing, and retrieving data from the
database through several interfaces. mongod is the MongoDB daemon, also known as
the server for MongoDB. The MongoDB server listens for connections from clients on
port 27017 by default, and stores data in the /data/db default directory when you use
mongod. Each mongod server instance is in charge of handling client requests,
maintaining data storage, and performing database operations. Several mongod
instances work together to form a cluster in a typical MongoDB setup.
– mongo and mongosh
mongo is the shell: the client, a JavaScript interface that you can use to interact with
the MongoDB server (mongod). However, as of June 2020, it was superseded by the
new MongoDB Shell, called mongosh. Compared to the mongo shell, mongosh has
improved syntax highlighting, command history, and logging. mongosh is used to
query or change data in MongoDB.
– mongos
mongos is a proxy that sits between the client application (mongo/mongosh) and a
sharded database cluster, that is multiple mongod replica sets. The mongos proxy is a
map that the mongo client can use to query or make changes in the cluster when the
data is sharded. This in-between proxy is required because the MongoDB cluster
doesn’t know which shard(s) that data exists on, but the mongos proxy does. We will
discuss more about the sharded databases in the coming section.
MongoDB offers high availability through its replication feature, which involves the use of
replica sets. A replica set is a group of MongoDB servers that maintain the same data set,
providing redundancy and increasing data availability. In simple terms, MongoDB replication
is the process of creating a copy of the same data set in more than one MongoDB server.
MongoDB handles replication through a Replica Set, which consists of multiple MongoDB
nodes that are grouped together as a unit. A replica set in MongoDB is a group of mongod
processes that maintain the same data set. Replica sets provide redundancy and high
availability, and are the basis for all production deployments.
One of the nodes will be considered the primary node that receives all the write
operations.
The others are considered secondary nodes. These secondary nodes will replicate the
data from the primary node.
The primary node records all changes to its data sets in its operation log (oplog).
The secondary nodes replicate the primary node’s oplog and apply the operations to their
data sets asynchronously, such that the secondary nodes’ data sets reflect the primary
node's data set.
If the primary node is unavailable or when a primary node does not communicate with the
other members of the replica set for more than the configured period of time (ten seconds
by default), an eligible secondary node calls for an election to nominate itself as the new
primary node. The cluster attempts to complete the election of a new primary node and
resume normal operations.
The replica set cannot process write operations until the election completes successfully.
The replica set can continue to serve read queries if such queries are configured to run on
secondary nodes while the primary node is offline.
Each replica set node must belong to one, and only one, replica set. Replica set nodes
cannot belong to more than one replica set.
Replica-sets are platform independent.
In relational databases such as Db2 or Oracle, the data is shared across their instances for
high availability. However, in MongoDB, the data itself is replicated, which results in additional
storage and resource requirements. Also, because only the primary node takes all of the
write operations, throughput and scalability are usually reduced.
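As a sketch, a three-node replica set can be initialized from mongosh on one of the members;
the host names are hypothetical, and each mongod must already be started with the same
replication.replSetName value (rs0 here) in its configuration file.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1.example.com:27017" },
    { _id: 1, host: "mongo2.example.com:27017" },
    { _id: 2, host: "mongo3.example.com:27017" }
  ]
})
rs.status()   // shows which member was elected primary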
Vertical scaling
Vertical scaling involves increasing the hardware capabilities of the server, such as adding
CPUs, adding more RAM, and increasing the amount of storage space. IBM LinuxONE is a
very good example of meeting the vertical scalability requirements for applications because it
allows CPU, memory, and storage to be added dynamically. Limitations in other platform
technologies or cloud operations may restrict a single machine from being sufficiently
powerful for a given workload.
Horizontal scaling
Horizontal scaling involves dividing the system dataset and load over multiple servers, adding
additional servers to increase capacity as required. Distributing the load reduces the strain on
the required hardware resources, however horizontal scaling increases the complexity of
underlying architecture.
MongoDB sharding
MongoDB sharding works by creating a cluster of MongoDB instances consisting of at least
three servers.
The mongos instance consults the config servers to check which shard contains the
required data set to send the query to that shard.
Finally, the result of the query will be returned to the application.
MongoDB shards data at the collection level, distributing the collection data across the
shards in the cluster.
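As a sketch, sharding a hypothetical salesdb.orders collection on a hashed customerId key
from a mongos session looks like the following; the database, collection, and key names are
illustrative only.
sh.enableSharding("salesdb")
use salesdb
db.orders.createIndex({ customerId: "hashed" })
sh.shardCollection("salesdb.orders", { customerId: "hashed" })
sh.status()   // reports how chunks are distributed across the shards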
Sharding limitations
The following are some limitations of sharding.
Sharding requires careful planning and maintenance.
The shard key directly impacts the overall performance of the underlying cluster.
The shard key is immutable and cannot be changed after sharding. Once a collection
has been sharded, MongoDB provides no method to unshard a sharded collection. While
you can reshard your collection later, it is important to carefully consider your shard key
choice to avoid scalability and performance issues.
Fragmentation, where a sharded collection's data is broken up into an unnecessarily large
number of small chunks.
Also, there are some operational restrictions in sharded clusters such as:
– The geoSearch command is not supported in sharded environments
– MongoDB does not support unique indexes across shards
– MongoDB does not guarantee consistent indexes across shards
For more detailed explanation about MongoDB high availability, sharding and other
operations please refer to the MongoDB documentation.
MongoDB Enterprise Edition contains the officially supported packages shown in Figure 5-2.
mongodb-enterprise-server
mongodb-enterprise-tools
mongodb-mongosh
mongodb-enterprise-7.0.14
mongodb-enterprise-database-7.0.14
mongodb-enterprise-server-7.0.14
mongodb-mongosh-7.0.14
mongodb-enterprise-mongos-7.0.14
mongodb-enterprise-tools-7.0.14
• mongofiles
– mongodb-enterprise-database-tools-extra, contains the following MongoDB
support tools:
• mongoldap
• mongokerberos
• install_compass script
• mongodecrypt binary
2. Install the MongoDB Enterprise packages by using the following command:
# sudo yum install mongodb-enterprise
Sample output from the above command is shown in Example 5-4.
The MongoDB Enterprise Server packages and any required dependent modules have now
been installed automatically.
ulimit settings
MongoDB recommends the following ulimit settings for mongod and mongos deployments;
a sample configuration sketch follows the list:
f (file size): unlimited
t (cpu time): unlimited
v (virtual memory): unlimited
l (locked-in-memory size): unlimited
n (open files): 64000
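One way to apply these values for a mongod that runs as the mongod user is through a limits
configuration file, as sketched below; note that when mongod is started by systemd, the
equivalent Limit* directives in the service unit take precedence, so check the unit file as well.
# /etc/security/limits.d/99-mongodb.conf
mongod  soft  fsize    unlimited
mongod  hard  fsize    unlimited
mongod  soft  cpu      unlimited
mongod  hard  cpu      unlimited
mongod  soft  as       unlimited
mongod  hard  as       unlimited
mongod  soft  memlock  unlimited
mongod  hard  memlock  unlimited
mongod  soft  nofile   64000
mongod  hard  nofile   64000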
Directory paths
The package manager creates default directories for data and log during installation. The
owner and group name are mongod. By default, MongoDB runs using the mongod user
account and uses the following default directories for data and log.
/var/lib/mongo (the data directory)
/var/log/mongodb (the log directory)
If you want to use a data directory and/or a log directory other than the default directories,
create them and change the ownership of the directories to mongod. Edit the configuration file
/etc/mongod.conf and modify the following fields accordingly:
storage.dbPath to specify a new data directory path
systemLog.path to specify a new log file path
Start MongoDB
You can start MongoDB from a command line by issuing the following command:
sudo systemctl start mongod
For more information on the mongod command as well as a list of options, see the following
website:
https://www.mongodb.com/docs/manual/reference/program/mongod/
You can run mongosh without any command-line options to connect to a mongod that is
running on your localhost with default port 27017 by using the following command:
mongosh
This value can be configured in the MongoDB configuration file /etc/mongod.conf by setting
the bindIp value to 0.0.0.0.
Stop MongoDB
You can stop the mongod process by issuing the following command:
sudo systemctl stop mongod
Restart MongoDB
You can restart the mongod process by issuing the following command:
sudo systemctl restart mongod
In general, Database as a Service (DBaaS) can be defined as a managed service that gives
users the ability to use a database without the complexities of hardware setup, software
installation, or intricate configuration. Also, it offloads the responsibility of routine
maintenance such as upgrades, backups, and security to the service provider, ensuring the
database is operational around the clock.
Key characteristics of such a service include:
Simplicity
Flexibility
Monitoring tools
Sustainability
Virtualization
Cataloging
Energy and floor space savings
See Appendix B, “MongoDB as a service with IBM LinuxONE” on page 351, for a description
of how IBM LinuxONE, combined with IBM Storage, provides high availability (HA),
performance, and security by using a sample-anonymized client environment and includes a
use case that demonstrates setting up a database as a service that can be replicated on a
much larger scale across multiple client sites.
However, there are differences between the nature of database management systems
(DBMSs) and business applications that kept DBMSs from being containerized in the past.
DBMSs are more CPU- and memory-intensive; they are stateful; and they require storage.
The rapid progress in container and container management technology, along with DBMS
software's close integration with those technologies, has made containerized databases a reality.
DBMS server software is encapsulated into containers by separating the database engine
from the database files storage, and persistent storage volumes are used. Container
orchestration frameworks such as Kubernetes provide high-throughput, low-latency
networking, built-in high availability (HA) capabilities, and support for stateful container
management, which are essential for DBMS.
With automation that uses containers, the cost of deployment and operations can be reduced.
Organizations can benefit from focusing investments on application development for new
services, which empowers and expedites business reform through modernization.
As shown in Figure 7-1, when modernizing data and services, organizations move through
five elements. These five elements are deployment at the start of reform, operation and
fluctuation during the continuation of the reform, success at the completion of the reform, and
next steps. The successful outcomes of one reform project build the foundation for the next
steps, and reform is delivered continuously.
The following sections describe quick deployment, low-cost database operations, maintaining
optimal performance against fluctuation, and the next steps.
See Modernizing Db2 Containerization Footprint with Db2U for more information on the Db2U
architecture.
Complex database server deployment and setup can be performed by using a GUI. By
modifying the input template file in a declarative way, any configuration can be built in a few
steps, which makes database deployment quick and easy.
This section describes how FUJITSU Enterprise Postgres Operator deploys databases
quickly.
Sections 7.5, “FUJITSU Enterprise Postgres containers” on page 156 through 7.5.3,
“Fluctuation” on page 190 assume that the configuration that is shown in Figure 7-2 on
page 157 is used.
Prerequisites
In this section, we present the prerequisites that are needed to deploy a FUJITSU Enterprise
Postgres database on Red Hat OpenShift Container Platform (RHOCP).
The FUJITSU Enterprise Postgres Operator must be installed first. For more information, see
10.4.1 “FUJITSU Enterprise Postgres Operator installation”, in Data Serving with FUJITSU
Enterprise Postgres on IBM LinuxONE, SG24-8499.
Our storage was on IBM Spectrum Virtualize and we used Gold as our storage class. For
more information, see 10.2.17 “Installing IBM Spectrum Virtualize and setting up a storage
class”, in Data Serving with FUJITSU Enterprise Postgres on IBM LinuxONE, SG24-8499.
Prepare the FQDN that will be used when connecting to the FUJITSU Enterprise Postgres
server from outside of its RHOCP cluster. For example, when using logical replication
between FUJITSU Enterprise Postgres servers on different RHOCP clusters, the FQDNs
for both the publisher and subscriber FUJITSU Enterprise Postgres servers are required.
Certificates for authentication
The certificates of the FUJITSU Enterprise Postgres server and user are required for
authentication. For example, when using logical replication to use FQDN to connect to the
FUJITSU Enterprise Postgres server from outside of its RHOCP cluster, the server
certificate for FUJITSU Enterprise Postgres server must include the FQDNs for both the
publisher and subscriber FUJITSU Enterprise Postgres servers.
FUJITSU Enterprise Postgres client
For the FUJITSU Enterprise Postgres client, download the rpm from FUJITSU Enterprise
Postgres client - download and install it in the client machine or in a container.
For more information about the setup, see Chapter 3, “Setup”, in FUJITSU Enterprise
Postgres 13 on IBM LinuxONE Installation and Setup Guide for Client.
Note: Because data is communicated over the internet, a secured network is required,
such as mutual authentication by Mutual Transport Layer Security (MTLS). MTLS must be
set up before deploying the FUJITSU Enterprise Postgres cluster. For more information
about implementing MTLS, see FUJITSU Enterprise Postgres 13 for Kubernetes User
Guide.
Modernization of data and services requires rapid response to changing consumer needs.
Therefore, database deployment requires speed.
3. In the Operator details window, select Create Instance, as shown in Figure 7-4.
4. In the Create FEPCluster window, click the YAML tab, and update the parameter values
as shown in Table 7-1. To create a cluster, click Create, as shown in Figure 7-5 on
page 161.
5. The HA cluster is deployed, and the deployment status can be checked by selecting
Workloads → Pods, as shown in Figure 7-6. ha-fep-sts-0 is the master server, and
ha-fep-sts-1 and ha-fep-sts-2 are the replica servers against the cluster name ha-fep
that is specified in the CR configuration file. The status shows Running when the cluster is
ready.
All FUJITSU Enterprise Postgres pods must show the status as Running.
You have successfully installed FUJITSU Enterprise Postgres on a Red Hat OpenShift cluster
on the IBM LinuxONE server platform.
Note: For more information about the parameters of the FUJITSU Enterprise Postgres
cluster CR configuration, see FUJITSU Enterprise Postgres 13 for Kubernetes Reference
Guide.
7.5.2 Operation
This section describes how database operations are automated with FUJITSU Enterprise
Postgres Operator.
Automatic backup
This section describes the automatic backup of FUJITSU Enterprise Postgres Operator.
Data is the lifeline for organizations, and protecting data is critical. Taking a backup of your
data periodically and automatically is essential. Restoring data to the latest state is necessary
in cases of disk corruption or data corruption. Point-in-time recovery (PITR) to restore data
after operational or batch processing errors is also imperative.
For users to deploy and use the database system at ease, automatic backup is enabled by
default in FUJITSU Enterprise Postgres Operator. Backup schedules and backup retention
periods can be customized with a declarative configuration.
Note: The following examples assume that the system is used in the UTC-5 time
zone. FUJITSU Enterprise Postgres Operator accepts Coordinated Universal Time
(UTC) values, so the UTC-5 times that are used in this example are converted to
UTC for the parameters.
Note: The FEPCluster monitoring feature must be enabled to view information about
the backups that were taken. For more information about this procedure, see
“Monitoring” on page 176.
1. In the Red Hat OpenShift Console, select Operators → Installed Operators, as shown in
Figure 7-7 on page 163.
3. Select the FEPCluster tab, and select ha-fep, as shown in Figure 7-9.
4. In the FEPCluster details window, select the YAML tab, as shown in Figure 7-10. Set the
parameters as shown in Table 7-2.
In this example, the following backups are taken:
– A full backup is taken at midnight every Sunday.
– An incremental backup is taken at 13:00 every day.
– Backups are retained for 5 weeks.
Figure 7-10 Selecting the YAML tab in the FEPCluster details window
5. Click Save to apply the changes, as shown in Figure 7-11. A message displays that
confirms that the changes were saved successfully, as shown in Figure 7-12.
Note: If saving is unsuccessful, click Reload to apply the changes again, and then click
Save.
Note: This procedure requires the user to have a cluster-admin role for their Red Hat
OpenShift service account.
3. Select the drop-down list Insert metric at cursor. In the search box, type
pg_backup_info_last_full_backup. Possible candidates appear when you start typing, so
select the appropriate metrics, as shown in Figure 7-15.
4. Click Run queries. A chronological graph appears at the top, and a table appears at the
bottom, as shown in Figure 7-16. Check the timestamp of the latest backup under the
Value column of the table.
Timestamps are shown in UNIX time. The following steps convert the timestamp into a
more readable format:
a. Copy the timestamp value.
b. Run the following command in a Linux terminal:
$ date --date '@<TimestampValue>'
In our example, the command is:
$ date --date '@1634029228'
The following output is from our example command:
Sat Oct 2 20:15:18 EDT 2021
5. Similarly, you can check the recovery windows. Select the drop-down list Insert metric at
cursor. In the search box, type pg_backup_info_recovery_window. Possible candidates
appear when you start typing, so select the appropriate metrics, as shown in Figure 7-17.
4. In the Create FEPRestore window, click the YAML tab. Update the values as shown in
Table 7-3 and click Create, as shown in Figure 7-23 on page 173. Wait for the FEPCluster
to be re-created and restored.
spec.toFEPcluster: (Delete this parameter.) When you omit this parameter, the restore
is performed on the existing cluster.
5. The restore is complete when all FUJITSU Enterprise Postgres cluster pods are
re-created and the pod status is Running.
6. With FUJITSU Enterprise Postgres client, check that the database is restored
(Example 7-2).
Autohealing
This section describes the autohealing features of FUJITSU Enterprise Postgres Operator.
Organizations must be prepared for any kind of failure in database operations. However,
creating and rehearsing recovery procedures is costly. In addition, systems with strict
availability requirements require HA configurations, which demand extensive workloads for
system administrators.
FUJITSU Enterprise Postgres Operator enables automatic failover and automatic recovery to
recover systems without human intervention in the event of a problem. Organizations benefit
from the stability of the database systems that are provided with less cost.
Automatic failover promotes a replica pod to master when the master pod fails, and the
database connection is switched. Automatic recovery re-creates the failed pod and restores
the database multiplexing configuration. To use automatic failover, see 7.5, “FUJITSU
Enterprise Postgres containers” on page 156 to learn how to deploy a FUJITSU Enterprise
Postgres cluster in an HA configuration.
3. To simulate a failure, remove the master pod. Click the menu icon of ha-fep-sts-0 and
select Delete Pod, as shown in Figure 7-25 on page 175.
5. The pod that was promoted to master appears, as shown in Figure 7-27. In this simulation,
we can see that ha-fep-sts-2 pod was promoted to the master.
6. To check the results of automatic recovery, open the replica pods in the list by removing
the feprole=master label, as shown in Figure 7-28. The status of ha-fep-sts-0, which is
the old master pod, changed to Running. We can confirm that automatic recovery
completed.
7. Type feprole=replica into the search box. The list of replica pods displays, as shown in
Figure 7-29. We can confirm that the old master ha-fep-sts-0 is now running as a replica
pod, and the database multiplexing configuration is restored.
Monitoring
This section describes the monitoring capabilities of FUJITSU Enterprise Postgres Operator.
Grafana and Prometheus are two of the most common open-source components that are
integrated in a container management platform and containerized software for observability.
Prometheus is a pull-based metrics collection component. Prometheus comes with a built-in
real-time alerting mechanism that is provided by Alertmanager so that users can use existing
communication applications to receive notifications. Grafana is a visualization layer that is
closely tied to Prometheus that offers a template function for dynamic dashboards that can be
customized. Grafana, Alertmanager, and Prometheus together are referred to as the “GAP
stack”, and they provide a convenient way to monitor the health of the cluster and the pods
that run inside the cluster.
FUJITSU Enterprise Postgres Operator provides a standard Grafana user interface that
organizations can use to start monitoring basic database information. In addition, it is
possible for system administrators to respond to sudden fluctuations by using the alert
function.
Note: Before completing the following steps, you must enable monitoring in your project.
For more information, see Enabling monitoring for user-defined projects.
1. Enable the monitoring settings for the FUJITSU Enterprise Postgres cluster. For a new
deployment, add the parameter that is shown in Table 7-4 by performing step 4 on
page 159. For a deployed cluster, add this parameter into the existing FEPCluster CR. To
save the changes, click Save.
2. The Exporter is now deployed. Select Workloads → Pods and check that the pod status for
the Exporter (ha-fep-fepexporter-deployment-XXX) is Running, as shown in Figure 7-30.
2. In the drop-down list for Dashboard, select Kubernetes / Compute Resources / Pod, as
shown in Figure 7-32 on page 179.
3. In the drop-down list for Namespace, select the project that was created in 7.5.1,
“Automatic instance creation” on page 158. In the drop-down list for Pod, select
ha-fep-sts-0. CPU utilization can be checked, as shown in Figure 7-33. Based on the
information that is displayed in this window, database administrators can decide whether a
resource should be added.
4. In addition, graphs can be viewed by using the Grafana sample template. Select Grafana
UI at the top of the window, as shown in Figure 7-34.
5. Grafana opens in a separate window. When prompted to log in, enter your credentials. If
permission is requested, grant permission.
6. On the Grafana home window, select the search icon, as shown in Figure 7-35.
Figure 7-35 Selecting the search icon on the Grafana UI home page
7. Select Kubernetes / Compute Resources / Pod under the Default folder, as shown in
Figure 7-36 on page 181.
8. In each drop-down list, select the target namespace and pod. The metrics for the selected
pod appear, and resource information can be viewed in a rich GUI, as shown in Figure 7-37.
In this example, we can see that the CPU resource configuration is appropriate because the
pod is running within its CPU resource allocation.
Note: Grafana, which is used in this example, is integrated with the RHOCP cluster.
Dashboards cannot be customized. To customize the dashboard, see “Using custom
Grafana dashboards” on page 187.
9. You can also check the disk usage. Select the search icon, and then select USE Method
/Cluster under the Default folder, as shown in Figure 7-38.
10.Scroll down to view a graph of the disk usage, as shown in Figure 7-39.
Alert settings
By default, FUJITSU Enterprise Postgres Operator comes with an alert rule that sends out
notifications if the number of connections exceeds 90% of the maximum number of
connections that is possible. In Alertmanager, set an email receiver and set routing so that the
alert is routed to the email receiver.
Note: For the list of default alerts, see FUJITSU Enterprise Postgres 13 for
Kubernetes User Guide.
Alert rules are configurable. Alert levels, intervals, and thresholds can be set
for any monitoring metrics. For more information, see FUJITSU Enterprise
Postgres 13 for Kubernetes User Guide.
2. Select the Global configuration tab and select Alertmanager, as shown in Figure 7-41.
4. Enter the Receiver name and select Email for the Receiver type, as shown in
Figure 7-43. The detail fields appear, as shown in Figure 7-44.
5. As needed by your environment, enter details such as the email address and SMTP
server.
Figure 7-47 Selecting the dashboard icon on the Grafana UI home page
5. Click Upload JSON file, and upload the JSON file that you downloaded in Step 2 on
page 187, as shown in Figure 7-50 on page 189.
7.5.3 Fluctuation
As businesses grow, more data processing power is needed to support their continuous and
stable growth. FUJITSU Enterprise Postgres Operator monitors resources and scales data
processing power flexibly to ensure that your system maintains optimal performance.
Autoscaling
This section describes the autoscaling feature of FUJITSU Enterprise Postgres Operator.
Database administrators determine the best configurations that meet database performance
requirements in the design phase. During operations, expansion by scale-up and scale-out is
planned and conducted according to fluctuations in system growth and changes in the
business environment.
FUJITSU Enterprise Postgres Operator constantly monitors changes so that it can flexibly
scale data processing power against unexpected fluctuations. Auto scale-out can be set up to
scale out replica pods automatically according to the workload to expand system capacity.
This feature leverages the high scalability of the IBM LinuxONE platform so that the system
obtains performance stability that is resistant to load fluctuations in referencing business
transactions.
1. Specify the replica service as the connection destination for the referencing (read-only)
business application by using the command that is shown in Example 7-3.
2. In the Red Hat OpenShift Console, select Installed Operators, and then click FUJITSU
Enterprise Postgres 13 Operator, as shown in Figure 7-53.
4. In the FEPCluster Details window, select the YAML tab and set the parameters that are
shown in Table 7-5, as shown in Figure 7-55. In this example, we set up a policy so that an
instance is created whenever the average CPU utilization of the master pod and replica
pods in the FEPCluster exceeds 70%.
5. Click Save to apply the changes, as shown in Figure 7-56 on page 193.
Note: When using the auto scale-out feature, consider synchronous mode. The default for
synchronous mode is on. When the number of replicas increases after scale-out, SQL
performance might degrade. Use the auto scale-out feature after validating that the
performance remains within the requirements of the system.
If the performance degradation risks violating the system requirements, set
synchronous mode to off. If you turn synchronous mode off, remember the following impacts
on database behavior:
When data is updated and the same data is read by another session immediately, the
old data might be fetched.
When the master database instance fails and a failover to another database instance is
performed, updates that were committed on the old master database instance might not
be reflected in the new master. After a failover occurs due to a master database failure,
investigate records such as the application log to identify the updates that were in
progress at the time of failure. Verify that the results of those updates are correctly
reflected to all the database instances in the database cluster.
Note: When the workload on the system decreases, users should consider scaling-in to
reduce redundant resources. This task is performed manually by editing the FEPCluster
CR. For more information, see FUJITSU Enterprise Postgres 13 for Kubernetes User’s
Guide.
When expanding tenants on IBM LinuxONE, FUJITSU Enterprise Postgres Operator makes it
easy to deploy new tenants.
Even if the database structure is common, the processing capacity that is required for each
tenant might be different. With FUJITSU Enterprise Postgres Operator, users can adjust the
scale factors (CPU, memory, and disk allocation) of the template that is used in the successful
tenant database to quickly deploy a new database with optimal capacity.
To deploy a database in system expansion on IBM LinuxONE, complete the following steps.
Note: The example that is provided in this section assumes that the storage that is used in
the new database that will be deployed was pre-provisioned.
1. In the Red Hat OpenShift Console of the existing system, select Installed Operators →
FUJITSU Enterprise Postgres 13 Operator → FEPCluster → ha-fep and download the
CR configuration of FEPCluster on the YAML tab, as shown in Figure 7-58.
2. On the Red Hat OpenShift Console of the new system, select Installed Operators.
3. Select FUJITSU Enterprise Postgres 13 Operator, as shown in Figure 7-59.
5. Copy the file that was downloaded on the existing system in step 1 on page 195 to the
target location of the system expansion. Open the downloaded file in a text editor, and
copy the content of the CR configuration.
6. In the Create FEPCluster window, click the YAML tab, and paste the copied contents.
Update the value of the CR configuration parameters, as described in Table 7-6. Click
Create to create a cluster, as shown in Figure 7-61 on page 198.
7. The HA cluster is deployed, and the deployment status can be checked by selecting
Workloads → Pods. When the cluster is ready, the status is displayed as Running, as
shown in Figure 7-62.
Note: To understand how to quickly deploy a new database by using an existing template,
see “Quick deployment of new databases for business expansion” on page 198.
FUJITSU Enterprise Postgres makes it easy and reliable to copy existing data and replicate it
in real time with logical replication.
This use case explains how the database is expanded as a regional service site within one
RHOCP Cluster on IBM LinuxONE, as shown in Figure 7-66.
Note: The slot name that is specified for spec.fep.replicationSlots must be different
from the names of the pods in the cluster (in the example in this section, they are
ha-fep-sts-0, ha-fep-sts-1, and ha-fep-sts-2). If the name of a pod is also used for the
slot name, the replication slot will not be created.
The client must present a certificate, and only certificate authentication is allowed.
Replace <SubClusterName> and <SubNamespace> with the appropriate values according to
the subscriber FEPCluster.
e. Save the changes that you made to FEPCluster CR configuration by clicking Save in
the YAML tab, as shown in Figure 7-68.
Note: A manual restart of the FUJITSU Enterprise Postgres process on all the FUJITSU
Enterprise Postgres pods by using the FEPAction CR is required for changes to
postgresql.conf to take effect. A restart causes a short outage on the cluster, so perform
this action while considering a service interruption to the publisher cluster.
Click the YAML tab, modify the values that are shown in Table 7-11 on page 205, and
click Create, as shown in Figure 7-70 on page 205.
2. This step uses the database that is named publisher, which was created with pgbench in
“Automatic backup” on page 162. From the FUJITSU Enterprise Postgres client, use the
pg_dump command to dump the schema definition of the publisher cluster’s publisher
database into a file (Example 7-4).
b. Create a role that is named logicalrepluser and grant the required privileges to this
role. The privileges that you grant depend on your requirements, as shown in
Example 7-6.
c. Create a publication that is named mypub. Define the publication for the database and
tables that will be replicated, as shown in Example 7-7.
Taking a bank teller system as an example for the regional system, the following three
tables are specified in Example 7-7:
• The branch table (pgbench_branches)
• The account table (pgbench_accounts)
• The bank teller table (pgbench_tellers)
All database operations that include INSERT, UPDATE, and DELETE are replicated by
default.
d. Verify the publication that you created with the query that is shown in Example 7-8. The
output of this command is shown in Example 7-9. To verify the publication, use the
query that is shown in Example 7-10 on page 207.
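The commands themselves are provided in the examples that are referenced in the steps above and are not reproduced here. As a hedged illustration only, the publisher-side setup might look similar to the following sketch. The role name, publication name, and table names come from the steps above; the granted privileges, the dump file name, and the verification queries are assumptions that depend on your environment.
-- Sketch only: publisher-side setup for logical replication.
-- Dump the schema of the publisher database first (run from the client shell), for example:
--   pg_dump -h <publisher-host> -p <port> -d publisher --schema-only -f publisher_schema.sql

-- Step b: create the replication role (the required privileges depend on your requirements).
CREATE ROLE logicalrepluser WITH LOGIN REPLICATION PASSWORD '********';
GRANT SELECT ON pgbench_branches, pgbench_accounts, pgbench_tellers TO logicalrepluser;

-- Step c: create the publication for the three bank teller system tables.
-- INSERT, UPDATE, and DELETE operations are published by default.
CREATE PUBLICATION mypub
    FOR TABLE pgbench_branches, pgbench_accounts, pgbench_tellers;

-- Step d: verify the publication and the tables that it covers.
SELECT * FROM pg_publication;
SELECT * FROM pg_publication_tables WHERE pubname = 'mypub';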
2. From the FUJITSU Enterprise Postgres client on the subscriber side, create a database
that is named subscriber on the subscriber side by using the command that is shown in
Example 7-11.
3. Transfer the file that was dumped by the FUJITSU Enterprise Postgres client that was
created on the publisher side to the FUJITSU Enterprise Postgres client that was created
on the subscriber side. Use the psql command to point to the transferred file on the
subscriber side to create the tables, as shown in Example 7-12.
4. From the FUJITSU Enterprise Postgres client on the subscriber side, connect to the
database that you created by using the command that is shown in Example 7-13.
5. Define a subscription by using the command that is shown in Example 7-14. Logical
replication starts when this command completes. The existing data in the publication that
is targeted in a subscription is copied when replication starts.
6. Check the created subscription with the command that is shown in Example 7-15. A
sample output is provided in Example 7-16 on page 208.
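Again, the authoritative commands are in Examples 7-11 through 7-16. The following sketch shows the general shape of the subscriber-side steps; the subscription name, the connection string, and the verification queries are assumptions. In this environment, the connection options must also supply the client certificate settings that the publisher requires.
-- Sketch only: subscriber-side setup for logical replication.

-- Step 2: create the subscriber database.
CREATE DATABASE subscriber;

-- Step 3: create the tables from the transferred schema dump (run from the client shell), for example:
--   psql -h <subscriber-host> -p <port> -d subscriber -f publisher_schema.sql

-- Step 5: define the subscription; existing data is copied when replication starts.
-- Add the ssl options that your certificate authentication setup requires, and, if a
-- replication slot was pre-created on the publisher, the slot_name and create_slot options.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=<publisher-service> port=<port> dbname=publisher user=logicalrepluser'
    PUBLICATION mypub;

-- Step 6: check the subscription status.
SELECT * FROM pg_stat_subscription;

-- The row counts of the replicated tables can also be compared on both sides, for example:
--   SELECT count(*) FROM pgbench_accounts;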
count
---------
1000000
(1 row)
count
-------
10
(1 row)
count
-------
100
(1 row)
This use case explained how to expand the database within a single RHOCP cluster. It is also
possible to expand to other environments, such as:
Different RHOCP clusters
x86-based IBM Cloud clusters, which have a different CPU architecture than IBM
LinuxONE
In this chapter, we provide an example of database migration that uses Fujitsu Enterprise
Postgres on IBM LinuxONE.
Processable data has become more diverse as data-handling software was developed as
open source. As a result, many features for business use were developed, and open source
became increasingly important, which encouraged the enhancement of mission-critical
quality, a non-functional requirement that is mandatory for business use. This synergy led to
the acceptance of open source technology in mainstream businesses, and open source
became an essential part of enterprise systems.
In recent years, many organizations started the process of data modernization for digital
transformation (DX). As a result, they increasingly handle data by using various open source
software. In addition, open source software with sufficient functions for practical use is
appearing. Table 8-1 shows examples of open source software that is used for data handling
in enterprise systems.
For this reason, linking a mission-critical DBMS with peripheral data sources and data
processing tools to create value is necessary to drive data modernization. To achieve this
goal, the DBMS requires an open interface that can work with peripheral data sources and
data processing tools.
PostgreSQL can work with various data sources that are linked through open interfaces. For
this reason, many database engineers who are considering database migration from Oracle
Database have considered adopting open source PostgreSQL as their target database
because of its open interface and the benefits of reduced licensing fees.
However, database engineers might hesitate to choose open source PostgreSQL because of
their concerns about reliability and operations, especially when migrating from enterprise
systems with HA and reliability requirements. Additionally, if database specialists in the
organization have used only Oracle Database, the skills development for migration is a major
concern. The costs of investigating features and training engineers might be significant if
there is not sufficient knowledge about a migration to PostgreSQL.
This section describes how to solve these challenges with knowledge that is based on
numerous migrations that Fujitsu has carried out, with two viewpoints to be considered in
database migration:
Product
The combination of IBM LinuxONE and Fujitsu Enterprise Postgres enables database
engineers to build highly reliable data processing systems that meet the essential
requirements of enterprise systems, which include HA architectures with FIPS 140-2 Level
4 security.
The following two sections highlight key considerations for migration:
– Section 8.2.1, “Business continuity” on page 215
This section introduces the features that Fujitsu Enterprise Postgres provides for
business continuity and the HA features that are further strengthened by
IBM LinuxONE.
– Section 8.2.2, “Mitigating security threats” on page 217
This section introduces the enhanced security features of Fujitsu Enterprise Postgres
and the data encryption features that are available in combination with
IBM LinuxONE.
Figure 8-1 shows one of the HA, highly secure architecture implementations of Fujitsu
Enterprise Postgres on LinuxONE.
Figure 8-1 Database configuration for an enterprise system with IBM LinuxONE and FUJITSU
Enterprise Postgres
Note: For more information about implementing the architecture that is shown in
Figure 8-1, see Chapter 5 “High availability and high reliability architectures” and Chapter 6
“Connection pooling and load balancing with Pgpool-II” in Data Serving with FUJITSU
Enterprise Postgres on IBM LinuxONE, SG24-8499.
Note: To learn more about SQL performance tuning or other migration works, contact
Fujitsu Professional Services, found at:
https://www.postgresql.fastware.com/contact
Fujitsu services are available for proof-of-concept (POC) assistance and for migration
in production environments for smoother delivery of migration projects.
Oracle RAC is a clustered architecture where multiple nodes make up a single database on
shared storage such as SAN or NAS. Oracle RAC supports an active/active configuration and
load balancing across nodes. Fujitsu Enterprise Postgres supports an active/standby cluster
as an HA architecture. One of the nodes is used for read/write workloads, and the other node
is used for read-only workloads. Load balancing can be implemented by using
connection-pooling software that is known as Pgpool-II.
Figure 8-2 shows the HA architectures for Oracle RAC and FUJITSU Enterprise Postgres.
Figure 8-2 HA architectures for Oracle RAC and FUJITSU Enterprise Postgres
Despite the differences in storage and synchronization, similarities can be seen in Figure 8-2
on page 215. Both databases require appropriate computer resources for each node to
configure HA architectures. If a node fails, the HA mode of operation is degraded to
single-node operations in both database products, which means that the system should be
designed for single-node operations in the case of node failures. For example, to maintain
the same performance after a node failure as before the failure, each node requires twice the
CPU resources at peak hours.
Fujitsu Enterprise Postgres follows a similar approach to Oracle RAC to achieve the business
continuity that is required for enterprise systems.
Note: For more information about Database Multiplexing, see 2.1 “Availability and reliability
features” in Data Serving with FUJITSU Enterprise Postgres on IBM LinuxONE,
SG24-8499.
Connection Manager
Connection Manager allows quick detection of network errors and of server outages in
which there is no response for an extended period. This detection is done by using mutual
heartbeat monitoring between the client and the database servers. When an abnormality is
detected, the database server is notified in the form of a forced collection of SQL
connections with the client, and the client is notified of an error event through the SQL
connection. Because Connection Manager determines which database server to connect to,
applications need to retry only the SQL statements that returned an error, which ensures
that business restarts with minimal downtime. Figure 8-4 shows the Connection Manager
processes.
To keep data secure, database security requires comprehensive management of the following
three aspects of information assets:
Confidentiality: Access to information is managed to prevent leakage of information
outside of the company. Access control or prevention of information leakage must be
considered.
Integrity: Information integrity is ensured, and information cannot be changed or misused
by an unauthorized party. Prevention or detection of data falsification is required.
Availability: Information is accessible to authorized users anytime. Power supplies must be
ensured and a redundant system configuration should be set.
In addition, databases that are used in enterprise systems often require the following two
security features:
Advanced encryption to prevent critical data exposure
An audit feature for early detection of unauthorized access
Fujitsu Enterprise Postgres extends PostgreSQL for use in an enterprise system and provides
these two security features so that organizations can migrate from commercial enterprise
systems such as Oracle Database without compromising their security.
It is also necessary to prevent critical data from being exposed when SQL fetches the data.
Critical data should be obfuscated so that unauthorized users who do not have the correct
permissions cannot see that sensitive information.
Fujitsu Enterprise Postgres fulfills these security requirements through two enterprise
features: Transparent Data Encryption (TDE) and Data Masking.
TDE
The key to data encryption is how seamless it is to encrypt data and how secure the
encryption key management is:
– Seamless data encryption
Storage data and backup data can be transparently encrypted without application
modification:
• The encryption algorithm does not change the size of the object that is encrypted,
so there is no storage overhead.
• The encryption level fulfills the requirements for the Payment Card Industry Data
Security Standard (PCI-DSS) and allows confidential information such as credit
card numbers to be made unrecognizable on disk.
• CP Assist for Cryptographic Functions (CPACF) in the IBM Z processor is used to
minimize the encryption and decryption overhead.
Figure 8-5 on page 219 shows the Fujitsu Enterprise Postgres Transparent Data
Encryption processes.
Figure 8-6 File-based and hardware-based keystore management options in Fujitsu TDE
Data Masking
With Fujitsu Data Masking, you can obfuscate specific columns or part of the columns of
tables that store sensitive data while still maintaining the usability of the data. The data
that is returned for queries to the application is changed so that users can reference the
data without exposing the data. For example, for a query of a credit card number, all the
digits except the last 4 digits of the credit card number can be changed to "*" so that the
credit card number can be referenced.
The benefit of using Fujitsu Data Masking is that SQL modification in existing applications is
not required to obfuscate sensitive data. Query results are masked according to the
configured data masking policy. Database administrators can specify the masking target,
masking type, masking condition, and masking format in a masking policy.
Figure 8-7 shows a Data Masking use case for test data management.
Figure 8-7 Test data management use case for data masking
By using Fujitsu Audit Logging, database access-related logs can be retrieved as audit logs.
Actions by administrators and users that are related to the databases are output to the audit
log.
The benefit of using Fujitsu Enterprise Postgres Audit Logging is that audit logs can be output
to a dedicated log file that is separate from the server log, which enables efficient and
accurate log monitoring. Also, the audit log is written asynchronously, so there is no
performance impact for logging.
Figure 8-8 on page 221 shows an example of the Audit Logging process.
One of the benefits of partitioning is enabling the use of partition pruning. If partition pruning
is enabled, only the partitions that match SQL search conditions are accessed to read data,
which improves SQL query performance compared to accessing all partitions.
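The table definition itself is provided in Example 8-1 on page 222 and is not repeated here. As an illustration only of the kind of layout that is being discussed (the column names and partition bounds are assumptions), a list-partitioned table with partitions t1_1 through t1_9 might be declared as follows.
-- Sketch only: a list-partitioned table that is similar in shape to t1 in Example 8-1.
CREATE TABLE t1 (
    id     char(3),
    value1 integer
) PARTITION BY LIST (id);

CREATE TABLE t1_1 PARTITION OF t1 FOR VALUES IN ('ID1');
CREATE TABLE t1_2 PARTITION OF t1 FOR VALUES IN ('ID2');
-- ... and so on, up to:
CREATE TABLE t1_9 PARTITION OF t1 FOR VALUES IN ('ID9');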
Example 8-2 shows how partition pruning affects the query plan and improves performance. It
is an SQL query that retrieves all rows where the ID column is ID1.
Example 8-2 SQL query retrieving all rows with a specific column value
SELECT * FROM t1 WHERE id = 'ID1';
The query plan for the SQL query that is shown in Example 8-2 is shown in Example 8-3 on
page 223.
In this query plan, only t1_1 is used, and all other partitions such as t1_2 are not used
because the PostgreSQL query engine creates an access plan to fetch data only from
partition t1_1. This process is known as partition pruning, where data is extracted only from
those partitions that match the partitioning key criteria.
Next, we compare the query plans that are created for tables with partition pruning and
without partition pruning.
For comparison, partition pruning can be disabled. To verify the effects of partition pruning,
the query plan for running the same SQL query without partition pruning is shown in
Example 8-4.
As shown in Example 8-4 on page 223, this query plan accesses and uses all the partitions
t1_1 - t1_9. Notice that the execution time that is shown in the last line is 2298 ms without
partition pruning. The execution time was 932 ms with partition pruning, as shown in
Example 8-3 on page 223, which means that the partition pruning feature produced improved
query performance in our test.
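One way to reproduce this kind of comparison, shown here only as a sketch (the plan in Example 8-4 might have been produced differently), is to turn off the planner's enable_partition_pruning parameter for the session and run the same statement again.
-- Sketch only: disable partition pruning for the current session and compare the plans.
SET enable_partition_pruning = off;
EXPLAIN ANALYZE SELECT * FROM t1 WHERE id = 'ID1';

-- Re-enable the default behavior afterward.
SET enable_partition_pruning = on;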
Even if an SQL query is not written in a way that allows partition pruning, partition pruning
can often be enabled with SQL tuning to improve SQL query performance.
In this section, we present two use cases of SQL tuning. These examples use table t1 in
Example 8-1 on page 222 and table t2 in Example 8-5.
Use case 1
Use case 1 uses the SQL query that is shown in Example 8-6. This query specifies id =
'ID1' as a condition for the sub-query. Because the main query specifies the condition
t1.id = t3.id, this query retrieves only rows where t1.id is ID1. Running this query requires
searching only the t1_1 partition.
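The full statement is in Example 8-6 on page 225. Reconstructed from the description above and from the tuned statement that appears in Example 8-9, its shape is roughly the following; treat it as a sketch rather than a verbatim copy of the example.
-- Sketch only: the approximate shape of the Use case 1 query (Example 8-6).
SELECT *
FROM t1,
     (SELECT id, max(value2) FROM t2 WHERE id = 'ID1' GROUP BY id) t3
WHERE t1.id = t3.id;
-- The tuning in Example 8-8 adds AND t1.id = 'ID1' to the WHERE clause so that the
-- planner can prune the partitions of t1.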
However, in some cases, PostgreSQL may not allow partition pruning based on the conditions
that are specified in the sub-query. Therefore, as shown in Example 8-7, all partitions are
accessed and searched, and the execution time is 17,907 ms.
Even if partition pruning does not work as shown in Example 8-6 on page 225, it is possible to
enable partition pruning by adding the search condition id = 'ID1' in the main query, as
shown in Example 8-8.
The query plan for the SQL that is shown in Example 8-8 is shown in Example 8-9. In this
query plan, only partition t1_1 is used. As a result, the execution time is reduced to 3593 ms,
which improves performance.
Example 8-9 Use case 1: Query plan after tuning with partition pruning enabled
[fsepuser@rdbkpgr1 ~]$ psql -p 27500 -d postgres -c "EXPLAIN ANALYZE SELECT * FROM
t1, (SELECT id, max(value2) FROM t2 where id = 'ID1' GROUP BY id) t3 WHERE t1.id =
t3.id and t1.id = 'ID1';"
QUERY PLAN
----------------------------------------------------------------------------------
Nested Loop (cost=0.00..129428.80 rows=9000000 width=16) (actual
time=14.037..3140.969 rows=1000000 loops=1)
-> Seq Scan on t1_1 t1 (cost=0.00..16925.00 rows=1000000 width=8) (actual
time=13.970..512.296 rows=1000000 loops=1)
Filter: (id = 'ID1'::bpchar)
-> Materialize (cost=0.00..3.82 rows=9 width=8) (actual time=0.000..0.001
rows=1 loops=1000000)
-> GroupAggregate (cost=0.00..3.69 rows=9 width=8) (actual
time=0.055..0.057 rows=1 loops=1)
Group Key: t2.id
-> Seq Scan on t2 (cost=0.00..3.50 rows=20 width=8) (actual
time=0.012..0.035 rows=20 loops=1)
Filter: (id = 'ID1'::bpchar)
Rows Removed by Filter: 180
Planning Time: 0.461 ms
JIT:
Functions: 12
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 1.331 ms, Inlining 0.000 ms, Optimization 0.404 ms, Emission
12.987 ms, Total 14.722 ms
Execution Time: 3593.280 ms
(15 rows)
Use case 2
Use case 2 uses the SQL query that is shown in Example 8-10. This query specifies
t2.value2 = 11 as a search condition, which means that the t1.id column, which is the
partition key, is not specified in the search condition.
Therefore, as shown in Example 8-11, all partitions are accessed, and the execution time is
19,545 ms.
JIT:
Functions: 9
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 0.880 ms, Inlining 0.000 ms, Optimization 0.347 ms, Emission
7.850 ms, Total 9.077 ms
Execution Time: 19545.829 ms
(24 rows)
Now, think about the INSERT statement that was shown in Example 8-5 on page 224. Table t2
has only two id values that correspond to a given value2. Suppose it is known that even if
table t2 is updated in the future, only a few id values will correspond to a given value2. In this
case, it is a best practice to change the SQL query to retrieve the id values that correspond to
value2 first, and then specify the search condition by using those id values, as shown in
Example 8-12.
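The two statements and their plans are shown in Example 8-13. As a plain-SQL sketch of the rewrite, which is taken from the psql commands in that example, the application first looks up the id values for value2 = 11 and then uses them in the search condition of the main query.
-- Step 1: retrieve the id values that correspond to value2 = 11.
SELECT DISTINCT t2.id FROM t2 WHERE t2.value2 = 11;

-- Step 2: use the returned id values ('ID1' and 'ID9' in this test) in the search
-- condition so that only partitions t1_1 and t1_9 are scanned.
SELECT *
FROM t1, t2
WHERE t1.id = t2.id
  AND t2.value2 = 11
  AND t1.id IN ('ID1', 'ID9');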
In the second SELECT statement, t1.id is specified in the search condition. Therefore, as
shown in Example 8-13, partition pruning is used and only two partitions, t1_1 and t1_9, are
accessed. The total execution time for the two SELECT statements is 4,670 ms, which
improves performance.
Example 8-13 Use case 2: Query plan after tuning with partition pruning enabled
[fsepuser@rdbkpgr1 ~]$ psql -p 27500 -d postgres -c "EXPLAIN ANALYZE SELECT
DISTINCT t2.id FROM t2 WHERE t2.value2 = 11;"
QUERY PLAN
----------------------------------------------------------------------------------
Unique (cost=3.51..3.52 rows=2 width=4) (actual time=0.049..0.056 rows=2
loops=1)
-> Sort (cost=3.51..3.51 rows=2 width=4) (actual time=0.048..0.050 rows=2
loops=1)
Sort Key: id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on t2 (cost=0.00..3.50 rows=2 width=4) (actual
time=0.010..0.019 rows=2 loops=1)
Filter: (value2 = 11)
Rows Removed by Filter: 198
Planning Time: 0.183 ms
Execution Time: 0.123 ms
(9 rows)
[fsepuser@rdbkpgr1 ~]$ psql -p 27500 -d postgres -c "EXPLAIN ANALYZE SELECT * FROM
t1, t2 WHERE t1.id = t2.id AND t2.value2 = 11 AND t1.id IN ('ID1', 'ID9');"
QUERY PLAN
----------------------------------------------------------------------------------
Hash Join (cost=3.52..57853.53 rows=400000 width=16) (actual
time=0.073..4240.195 rows=1000000 loops=1)
Hash Cond: (t1.id = t2.id)
-> Append (cost=0.00..43850.00 rows=2000000 width=8) (actual
time=0.033..2775.092 rows=2000000 loops=1)
In this section, we described specific examples of SQL tuning to improve query performance.
SQL performance tuning requires field experience because it involves a deep understanding
of how the PostgreSQL query engine processes queries. In addition, the need for SQL tuning
often comes to attention only after performance verification is performed at the end of the
migration process, which means that SQL tuning knowledge is required to keep the schedule
as planned and ensure a successful migration.
Note: For more information about Fujitsu performance tuning, see 8.3.2, “Performance
tuning tips” on page 258.
Applications that are used in enterprise systems often require concurrent connections to
databases to improve throughput. In multiprocessing, memory is allocated for each process,
even for common information, which might lead to a lack of memory.
Fujitsu Enterprise Postgres provides the Global Meta Cache feature to reduce memory usage
by deploying a meta cache, which is the common information between connections, on shared
memory and deploying only the process-specific information to each process memory. In
enterprise systems with several thousand connections and more than 100,000 tables, Global
Meta Cache reduces memory usage from a dozen terabytes to several dozen gigabytes.
Note: For more information about Global Meta Cache on FUJITSU Enterprise Postgres,
see 2.3.3 “Global Meta Cache” in Data Serving with FUJITSU Enterprise Postgres on IBM
LinuxONE, SG24-8499.
Note: For more information about migration technical knowledge, see 8.3.1,
“Experience-based migration technical knowledge” on page 231.
In addition, understanding the features of the target product of migration is critical to bringing
out the performance of the product and ensuring stability. Therefore, in enterprise systems
where stable operation is a key requirement, skills development of the members performing
the migration work is essential.
The content of the required training depends on the level of proficiency of the members
performing the migration work and the roles that they are responsible for. Fujitsu offers
Professional Services to flexibly assist organizations.
In this chapter, the knowledge and approach for successful migration from Oracle Database
to Fujitsu Enterprise Postgres are introduced:
Experience-based migration technical knowledge: Introduction to the Fujitsu approach for
pre-migration planning and migration.
Performance tuning tips: Outline of the tasks that are required for performance tuning.
Note: The migration expertise that is introduced in this chapter is essential to the success
of migration projects. For more information about migration works, contact Fujitsu
Professional Services at:
https://www.postgresql.fastware.com/contact
The following sections cover the differences between Oracle and Fujitsu Enterprise Postgres
in the following areas:
File structure of tables
Concurrency control
Transactions
Locking
Expansion of data storage capacity
History of data changes
Encoding
Database configuration files
Schemas
Other differences and their complexity level of migration
File structure of tables
Table 8-2 shows a comparison between Oracle Database and Fujitsu Enterprise Postgres. A
key difference to be aware of during migration is that Fujitsu Enterprise Postgres cannot store
the data of multiple tables in one data file.
Table 8-2 Oracle and Fujitsu Enterprise Postgres file structure comparison
Oracle Database: Oracle Database stores objects such as tables and indexes in data files,
which are physical files. Data files can be allocated to table spaces.
Fujitsu Enterprise Postgres: Fujitsu Enterprise Postgres creates one or more physical data
files and stores the data of one table in them. Data files are stored in a fixed directory.
However, by using a table space, you can store data files in any directory.
Concurrency control
The management of concurrent access to data is essential for good performance. It also
prevents excessive locking that might potentially restrict access to data while still allowing the
flexibility that is provided by different isolation levels.
Fujitsu Enterprise Postgres achieves concurrency with read consistency by using a similar
method to Oracle Database. Both products make multiple copies of a row to present the
appropriate information to a client based on when the transaction started. However, Fujitsu
Enterprise Postgres does not remove the old copies of row data when a row is updated or
deleted, which results in physical files that can increase considerably in size, especially where
a high frequency of updates occurs. Therefore, the key consideration is the strategy to mitigate
excessive growth of physical files, which can lead to performance degradation over time.
Table 8-3 compares concurrency control between Oracle Database and FUJITSU Enterprise
Postgres.
Table 8-3 Oracle and Fujitsu Enterprise Postgres concurrency control comparison
Oracle Database: When Data Manipulation Language (DML) statements change data, Oracle
Database stores the old value of data in an UNDO table space. This space can be reused
when it is no longer needed for other data updates. If data that is changed in a transaction is
not committed and the same data is viewed in another session, Oracle uses UNDO
information. UNDO information is data that is already committed when viewed.
FUJITSU Enterprise Postgres: When DML statements change data, Fujitsu Enterprise
Postgres marks the old value of row data and adds new values of data to the end. The old
value of data is not deleted and is left for the purposes of rollback. If data that is changed in a
session is not committed and the same data is viewed in another session, Fujitsu Enterprise
Postgres uses the old value of row data.
Note: Fujitsu Enterprise Postgres requires running VACUUM regularly. VACUUM enables
the reuse of data storage spaces that are marked as used. VACUUM can also be
configured to run automatically by using the autovacuum feature.
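As a brief illustration only (the table name and parameter value are hypothetical), VACUUM can be run manually, and autovacuum behavior can be checked and tuned per table.
-- Sketch only: reclaim dead row versions and refresh planner statistics.
VACUUM (VERBOSE, ANALYZE) orders;

-- Check that autovacuum is enabled for the instance.
SHOW autovacuum;

-- Hypothetical per-table tuning: vacuum this table more aggressively than the default.
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05);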
Transactions
Fujitsu Enterprise Postgres supports transactions in the same way as Oracle Database.
However, there are differences in autocommit and transaction error handling.
Because commit is run in different units, application changes might be required when
migrating. Oracle Database commits per statement, but Fujitsu Enterprise Postgres commits
per transaction.
Table 8-4 shows the comparison of transaction processing between Oracle and FUJITSU
Enterprise Postgres.
Table 8-4 Transaction processing comparison: Oracle Database and FUJITSU Enterprise Postgres
Oracle Database: Error handling: If an error occurs within a transaction but COMMIT is run at
the end of it, Oracle Database commits the data that results from successful DML statement
execution.
FUJITSU Enterprise Postgres: Error handling: If an error occurs within a transaction, Fujitsu
Enterprise Postgres rolls back the transaction, even if COMMIT is run at the end of the
transaction.
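The following sketch (the table and values are hypothetical) illustrates the behavioral difference that is described in Table 8-4: after a statement fails inside a transaction, Fujitsu Enterprise Postgres completes the closing COMMIT as a rollback and keeps none of the changes.
-- Sketch only: error handling inside an explicit transaction.
-- Assumes a hypothetical accounts table with a primary key on id.
BEGIN;
INSERT INTO accounts (id, balance) VALUES (1, 100);   -- succeeds
INSERT INTO accounts (id, balance) VALUES (1, 200);   -- fails with a duplicate key error
COMMIT;   -- reported as ROLLBACK; the first INSERT is also discarded
-- In Oracle Database, the equivalent COMMIT would keep the rows that were inserted
-- by the successful statements.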
Locking
Locks are supported on both Fujitsu Enterprise Postgres and Oracle Database. However,
there are some differences in lock behavior.
Unless applications have the appropriate settings, Fujitsu Enterprise Postgres waits to
acquire the lock, so the applications wait for a response from FUJITSU Enterprise Postgres.
Therefore, application changes might be required as part of the migration.
Table 8-5 shows a comparison of locking between an Oracle Database and a Fujitsu
Enterprise Postgres database.
Table 8-5 Oracle Database and Fujitsu Enterprise Postgres locking comparison
Oracle Database: Oracle Database supports table-level locks and row-level locks. Explicit
locks are obtained by applications by using the following SQL statements: the LOCK TABLE
statement sets a table-level lock, and the SELECT statement with the FOR UPDATE clause or
the FOR SHARE clause sets a row-level lock. A DDL statement sets the appropriate locks
automatically. If the lock cannot be set, an error occurs.
FUJITSU Enterprise Postgres: Fujitsu Enterprise Postgres also supports table-level locks and
row-level locks. Explicit locks are obtained by applications by using the following SQL
statements: a LOCK TABLE statement sets a table-level lock, and a SELECT statement with
the FOR UPDATE clause or the FOR SHARE clause sets a row-level lock. A DDL statement
sets the appropriate locks automatically. If DDL cannot set a lock, the application waits until a
lock is set.
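As a sketch of the kind of application-side setting that is mentioned above (the table name and timeout value are hypothetical), a lock timeout can be set so that the application receives an error instead of waiting indefinitely for a lock.
-- Sketch only: explicit locks with a lock timeout.
SET lock_timeout = '5s';

BEGIN;
LOCK TABLE accounts IN SHARE ROW EXCLUSIVE MODE;     -- table-level lock
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;      -- row-level lock
-- ... application work ...
COMMIT;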
Expansion of data storage capacity
Fujitsu Enterprise Postgres requires all the data, such as tables, to be copied to expand the
capacity. Therefore, when designing the target database system, it is a best practice to
consider the following items to avoid the immediate need of expanding data capacity:
Determine the appropriate disk size and table space configuration.
Determine whether to use partitioning.
Determine an appropriate vacuuming strategy.
If it is necessary to expand the data capacity after going into production, plan and implement
expansion while considering the data copy time.
Table 8-6 shows a side-by-side comparison of Oracle Database and Fujitsu Enterprise
Postgres data storage capacity expansion.
Table 8-6 Data storage capacity expansion comparison: Oracle Database and FUJITSU Enterprise
Postgres
Oracle Database: Expanding capacity is achieved by increasing the size of a table space.
FUJITSU Enterprise Postgres: The size of a table space cannot be increased. To expand the
capacity, a new table space must be created on a new disk. Then, all the data, such as tables
and indexes, must be moved to the new table space.
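As a sketch of the expansion procedure on the Fujitsu Enterprise Postgres side (the directory and object names are hypothetical), a new table space is created on the new disk and the objects are moved to it.
-- Sketch only: expand capacity by moving objects to a table space on a new disk.
CREATE TABLESPACE ts_new LOCATION '/newdisk/pgdata/ts_new';

ALTER TABLE orders SET TABLESPACE ts_new;
ALTER INDEX orders_pkey SET TABLESPACE ts_new;

-- Alternatively, move every object in an existing table space at once:
ALTER TABLE ALL IN TABLESPACE ts_old SET TABLESPACE ts_new;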
Encoding
There are differences in the supported encodings between Oracle Database and FUJITSU
Enterprise Postgres. If the source database uses an encoding that Fujitsu Enterprise
Postgres does not support, change it to one of the supported encodings on FUJITSU
Enterprise Postgres.
When migrating a database, consider which encoding to use on the target system.
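For example (the database name and locale are hypothetical), the encoding can be checked on the existing server, and the new database can be created with one of the supported encodings.
-- Sketch only: check the current encoding and create a database with a supported one.
SHOW server_encoding;

CREATE DATABASE appdb
    ENCODING 'UTF8'
    LC_COLLATE 'en_US.UTF-8'
    LC_CTYPE 'en_US.UTF-8'
    TEMPLATE template0;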
Database configuration files
Review the parameters and connection settings on Oracle Database and set the parameters
in the Fujitsu Enterprise Postgres configuration files to have the same effect as Oracle
Database.
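As a sketch only (the parameter values are hypothetical and must be sized for your system), server parameters can be set in postgresql.conf or with ALTER SYSTEM so that memory and connection limits approximate the behavior of the source system.
-- Sketch only: adjust server parameters; some of them require a server restart.
ALTER SYSTEM SET max_connections = 500;
ALTER SYSTEM SET shared_buffers = '8GB';
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();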
Schemas
Fujitsu Enterprise Postgres supports the concept of a schema like Oracle Database.
However, there are differences in their functions.
Oracle Database automatically creates a schema with the same name as the user. Fujitsu
Enterprise Postgres has a “public” schema by default. A schema with the same name as the
user is not automatically created.
Because of the differences in automatically created schemas and schema search order,
design the settings and definitions for Fujitsu Enterprise Postgres so that schemas can be
used in the same way as in Oracle Database operations, such as data extraction.
Specifically, it is important to set the search_path parameter. Fujitsu Enterprise Postgres uses
the search_path parameter to manage the schema search path, which is the list of schemas
to look in. If a schema name is not specified when running queries, Fujitsu Enterprise
Postgres uses the search path to determine which object is meant. The order that is specified
in the search path is used to search the schema, and the first matching object is taken to be
the one that is wanted, and it is used for query execution.
By default, the user who created the schema owns the schema. Therefore, appropriate
privileges are required to allow other users access.
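As an illustration (the schema, role, and grantee names are hypothetical), a schema that matches the migrated user can be created explicitly, the search path can be set for that role, and access can be granted to other users.
-- Sketch only: create a schema for the application user, set its search path,
-- and grant access to another role.
CREATE SCHEMA appuser AUTHORIZATION appuser;

ALTER ROLE appuser SET search_path = appuser, public;

GRANT USAGE ON SCHEMA appuser TO report_role;
GRANT SELECT ON ALL TABLES IN SCHEMA appuser TO report_role;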
Other differences and their complexity level of migration
This section explains the migration complexity of major features from Oracle Database to
Fujitsu Enterprise Postgres and describes the differences that you should be aware of.
The migration complexity levels are outlined in Table 8-7, which serves as the key that is used
in subsequent tables to indicate the migration complexity level for each task.
The major database elements that might be impacted are outlined in Table 8-8.
Maximum database capacity 0 -
Number of indexes in a table 0 -
Binary data types 2 Fujitsu Enterprise Postgres does not support the same binary data types. Thus, redesign and changes are required.
Globalization support 1 -
The major performance elements and their migration complexity level are outlined in
Table 8-9.
Parallel query 1 -
Table 8-11 lists the operational tasks and their migration complexity levels.
Changing DB configuration: Adding columns or indexes 0 -
Reorganize index spaces 1 The REINDEX statement can reorganize the index spaces.
High-speed loader 1 -
Data replication 1 -
Data Masking 1 -
Migration complexity levels and their descriptions for application development are shown in
Table 8-13.
ODBC 1 -
Automatic deadlock detection 0 -
Backup settings 1 -
Recovery 1 -
Therefore, a quality migration assessment should be one of the first activities that an
organization undertakes if they are considering a change in their DBMS.
This section first describes topics that anyone about to conduct a migration assessment
should think about, particularly in terms of establishing appropriate objectives for an
assessment. Then, it describes the process that is used to migrate database resources and
applications.
Although database administrators and application teams are generally the core part of a
migration team, other teams such as infrastructure, DevOps, security, vendor, business,
and support are all required to be involved at some point:
– Extra hardware is often required to perform a migration. Some strategies to avoid
downtime involve running the old and new database platforms side by side for an
extended period.
– Organizational policies often dictate how data should be secured and require
conformance to industry-accepted security benchmarks. Often, different features must be
implemented and configured to meet these goals in a new product.
– Monitoring and alerting might need to be integrated into existing systems, or retraining
might be a requirement for DevOps and support teams to support the new system.
– Replacement of backup and recovery software and processes, or integration into
existing archiving solutions, might be required.
– Training of staff in the new technology or administration tools is often required.
Therefore, the output of a quality migration assessment at the beginning of a migration
journey allows organizations to make informed decisions and build a migration plan that
ensures success by accounting for every aspect.
Feasibility
The decision to change DBMSs should consider many factors, and the methodology to
decide whether to migrate can be different for every organization. The amount of
importance that an organization allocates to a particular factor can differ considerably.
However, some factors stand out as considerable obstacles to performing a migration.
One example is where vendors warrant their software only when it is used with a specific
database. If the application that uses the database is a third-party product, check with the
vendor about support for your proposed database platform because warranties might be
affected.
Partnering with an experienced database migration service provider helps to identify those
things that your organization should be weighing in its decision on whether migration is
feasible.
Understanding effort
Understanding the type of effort, where to expend it, and the amount that is required is
important to calculate the time and cost of a migration. Here is an overview of some of the
different areas where effort can be required and how you can estimate it.
You leverage automation in almost everything that you do to reduce effort, and automation
can certainly help in reducing the effort that is required to migrate a database to FUJITSU
Enterprise Postgres. However, there is often a large capital expense that is associated
with automation investment. Fortunately, significant investment already has been made by
various commercial and open source projects in this area (including Fujitsu). A substantial
amount of this investment focuses on database structure, code, and data because they are
the obvious things to be migrated, with clear mappings (in most cases) to a target database
equivalent. However, there are other areas that should be thoroughly assessed for which
automated tools are not available; they are covered later in this section.
When using a tool to perform an automated assessment of the database schema, it
should ideally be calibrated with the intended migration tool so that it is “aware” of the level
of automation and can provide an accurate assessment of how much can be migrated
automatically and how much requires manual effort.
Because most relational database implementations are based on an ANSI SQL Standard,
there is a certain level of compatibility between different database vendor
implementations. Therefore, tools generally work on the concept of identifying
incompatibilities with the target database in the source database DDL and source files.
These incompatibilities are broadly classified into those incompatibilities that can be
migrated automatically and ones that require manual effort.
Manual effort is a relative and somewhat arbitrary value that is associated with each
incompatibility type and that can be adjusted by a multiplication factor that accounts for the
following items:
– Experience of the migration team
– Contingency buffer
– Other complexities (such as environmental ones)
The total effort is the sum of the manual effort for all incompatibilities, multiplied by the
multiplication factor. For example, if the identified incompatibilities add up to 100 hours of
manual effort and the factor for the team is 1.5, the estimated effort is 150 hours.
For the estimate to be accurate, a good correlation between the experience of the team
and the multiplication factor is required. Generally, the more experienced the team, the
more accurate the factor, the shorter the estimate, and the more accurate the overall
migration estimate. This is one reason to consider using a migration service provider like
Fujitsu.
Another reason to consider the services of an experienced migration provider is their
accumulated experience, which is typically consolidated and articulated through a
knowledge base that covers best practices for resolving incompatibilities that require
rewriting or technical know-how. The provider can represent significant savings compared
to the same activity being conducted by a relatively inexperienced team.
For more information about the types of incompatibilities, see “Other differences and their
complexity level of migration” on page 235.
Areas where automated assessment is difficult include the following ones:
– Architecture.
The database architecture focuses on the design and construction of a database
system that can meet the defined requirements for the system. Such requirements
cover features such as resilience, HA, security, flexibility, and performance.
DBMSs from different vendors deliver these features through different mechanisms, so
architectures might vary across vendor implementations to achieve the same or
equivalent functions.
HA is one area where these differences can occur. Shared storage that is used by two
read/write instances might be regarded as a strength by one vendor, but as a weakness by
another vendor (due to the single point of failure of the storage) who favors a hot
standby with separate, replicated storage as a more resilient approach.
Careful consideration of what the requirements of the organization are and how they
are best met by available architecture designs should be the key focus.
Often, requirements can be met by more than one architecture, so maintainability and
flexibility should also be considered. Complex architectures might add risk when
compared to simple ones that still deliver the needs of an organization.
– Security and governance.
Data security is a major concern for organizations, with significant consequences for
not complying with growing regulatory policies around the management of personal
data. These consequences include financial penalties for noncompliance and the loss of
trust from your customers.
Enterprise organizations have strict policies for how data must be protected and the
benchmarks to be met. Features and configuration differ between vendors, and care
must be taken to ensure that the configuration of target platforms continue to meet
such policies and benchmarks.
– Tools.
Many commercial DBMSs come with their own brand of tools for administration,
monitoring, backups, and so on. Changing DBMSs usually results in also having to use
a different tool for these functions.
There are several tools that are available that provide organizations with suitable
functions for these tasks. Which tools to use might depend on the specific
requirements or areas of importance to them. Adequate time should be allocated for an
evaluation of a suitable toolset and training in its use.
– Training.
Moving employees to a new DBMS can present some significant challenges. Staff members
have invested many years in a product and gained a level of competence that gives them
worth within an organization. Moving to a different product can create concerns for some
employees about the potential loss of that investment in themselves.
Fujitsu Enterprise Postgres is similar to Oracle Database in many ways, and much of the
existing knowledge that is possessed by Oracle DBAs can be applied to a Postgres product.
However, employees should be encouraged to learn about the benefits of the new system.
– Licensing and subscriptions.
– Testing.
One of the most important areas of a migration is testing. Testing often makes up more
than 40% of a migration project, but it is easily underestimated when assessing the
migration effort.
These areas are often overlooked during migration projects and can result in the success
of the project being compromised.
To accurately assess these areas requires a good understanding of the current
environment (from a technical standpoint and an operational and governance
perspective), and a good understanding of the target environment and how it can deliver
the equivalent or better results. Therefore, migrations often involve a blended team that is
made up of an organization’s own subject matter experts (SMEs) and a migration partner
with experience in the targeted platform.
Ensuring success
Planning for success is all about coverage and removing unknowns. A successful migration
project is one that does a good job of mitigating risk, and risk mitigation is all about
comprehensive scoping and detailed planning.
Understanding the data that is being migrated, how migration activities affect customers, and
the impact on the business in terms of loyalty and reputation is an important step to ensure
that appropriate checks or backup options are in place.
Data migrations are seldom an isolated project. More often, they are part of a larger
modernization or transformation project. If so, then close collaboration with the larger project
can avoid many issues that are associated with waiting until the new system is complete.
Although automation helps mitigate technical risks, a thorough data verification process that
runs at the data storage level as data is migrated helps to identify problems early before they
impact the business. This process should be implemented in addition to user testing.
With testing, data verification, and data reconciliation, you can avoid an impact on the
business through early detection of issues. However, identifying how data issues occurred
can present its own set of challenges. Change Data Capture (CDC) or a data auditing
capability should be built in to the migration process so that issues can be understood and
quickly resolved.
Again, using the knowledge of an experienced migration service provider helps to plan for a
successful migration.
Maximizing strengths
A successful migration should provide the business with new technology and the ability to use
data in better ways that allows the business to respond to a fast-changing business
environment. Therefore, new features of the data storage platform are a consideration during
migration planning.
For example, the ability to store and access data in a JSON format is used by applications to
exchange data. Should data from a source system be stored in a binary JSON column of the
target database or as individual columns?
Another example that is applicable to Fujitsu Enterprise Postgres is the Vertical Columnar
Index feature, which updates indexed columnar structures in memory as row data is updated.
This feature allows an efficient execution of various analytical style queries. Using this feature
is something that should be considered in migration planning.
These types of considerations require expert knowledge about the data and how it can be
leveraged by the business, and the features of the target system and how to best leverage
them.
The feasibility of migrating database resources is determined primarily by the level of the
migration effort, which depends on the source system scale and construction. Therefore,
migration projects should always start with the assessment step at the pre-migration planning
stage.
The migration path to Fujitsu Enterprise Postgres includes the following main workflows,
which are illustrated in Figure 8-10 on page 244:
Step 1: Assessment
The source system is examined thoroughly in terms of database architecture, the size of
the data, and assets such as database schemas. Then, the technical impact is analyzed,
and the level of effort of the migration project is identified.
Step 2: Estimation
In this step, the economic impact of a migration project is analyzed. The cost of the
migration project is estimated, including testing, based on the assessment result. In
practice, other costs such as the infrastructure that is required, temporary licenses, and
training staff should also be considered. Based on this estimation, the project owner
determines the feasibility and decides whether to proceed with this migration project.
Step 3: Preparation for migration
The migration plan is created in this step. The plan includes a schedule and team structure
that is determined based on the estimation result. Preparation also takes place to begin
the next step by building the development environment and the production environment.
Step 4: Migration
In the final step, migration is performed according to the following system development
process:
– Database configuration design and construction
– Database operation design
– Application design
– Implementation
– Testing
Step 1: Assessment
Assessment is the first step of a migration project, which analyzes the technical impact and
identifies the level of effort of the migration project. The level of the migration effort varies
widely depending on several factors. Therefore, the source system must be explored in terms
of database architecture, the size of data, and assets such as database schemas to
understand what must be done in the migration step. The difficulty to complete the migration
must be assessed. Assessment is required at the pre-migration planning stage.
One of the major consideration points for database migration is the impact on system
performance. If system performance after migration is a key concern, performance validation
is necessary during this step. Validation is done by pre-migrating some schemas and
applications to evaluate whether they meet the system performance requirements.
In “Step 3: Preparation for migration” on page 247, the migration plan is created based on the
skills and productivity of the engineers. Therefore, if this migration is the first time that you do
a migration from Oracle Database to FUJITSU Enterprise Postgres, it is a best practice to
evaluate the skills and productivity of engineers in this step.
The main exploration targets and assessment points are shown in Table 8-15.
Data: To identify how long data migration will take, assess the following points:
– The number of table constraints and indexes
– Data size
– Data migration strategy
Step 2: Estimation
The second step is to estimate the migration project cost and how long data migration will
take. At the end of this step, the project owner determines the feasibility and decides whether
to proceed with this migration project.
Migration cost
Based on the assessment result, the migration effort and the required scope for design,
implementation, and testing are determined. The cost of the migration project includes
considerations of the amount of work that is required and the skills and productivity of the
engineers performing the migration.
When estimating the testing efforts, it is a best practice to allow for sufficient time to
prepare for database-migration-specific issues. When migrating a database, some
differences between the source database and the target database might remain unnoticed
in the design or implementation steps. For example, it is difficult to see the following
differences before testing:
– Calculation results of numbers in SQL might differ because of differences in rounding
behavior (rounding up or rounding down).
– The SQL output of date and time values might differ because of differences in data type
precision or format.
Data migration time
It is also important to know in this step how long it takes to migrate data.
The migration time depends on the data size and the migration strategy. Even if the data
volume is the same, the data migration time varies depending on the selected migration
method and environment. Therefore, it is a best practice that you perform a data migration
rehearsal in a test environment. The objective of the rehearsal is to verify that data
migration will be successful and validate the migration time.
Step 4: Migration
The final step is to migrate all the assets from Oracle Database to FUJITSU Enterprise
Postgres. This step follows the same flow as a standard system development process.
Database configuration design and construction
The configuration of target databases is designed to meet the organization's system
requirements and deliver a system that is equivalent to source databases. When
designing, consider the differences in the database architecture between Oracle Database
and FUJITSU Enterprise Postgres.
Table 8-16 shows two types of general database architecture. When designing, pay
attention to the differences, especially for HA systems.
Single instance: One server contains a single database with one instance. There is no
special consideration.
Note: For more information about HA on FUJITSU Enterprise Postgres, see 2.1.1,
“Database multiplexing” in Data Serving with FUJITSU Enterprise Postgres on IBM
LinuxONE, SG24-8499.
Audit Logging: See 4.5, “Audit Logging” in Data Serving with FUJITSU Enterprise Postgres
on IBM LinuxONE, SG24-8499.
HA and high reliability: See Chapter 5, “High availability and high reliability architectures” in
Data Serving with FUJITSU Enterprise Postgres on IBM LinuxONE, SG24-8499.
Backup and recovery: See 9.1, “Backup and recovery overview” in Data Serving with
FUJITSU Enterprise Postgres on IBM LinuxONE, SG24-8499.
Database monitoring: See 9.4, “Monitoring” in Data Serving with FUJITSU Enterprise
Postgres on IBM LinuxONE, SG24-8499.
Database version upgrade: See Appendix A, “Version upgrade guide” in Data Serving with
FUJITSU Enterprise Postgres on IBM LinuxONE, SG24-8499.
Application design
In this activity, you design how to modify assets such as SQL and applications so that they
run on the target database system.
– SQL
The basic elements of SQL include data types, DDL, DML, and functions. Table 8-18
shows the key considerations of SQL migration. Table 8-19 on page 250 shows DDL
statements. Table 8-20 on page 251 shows DML statements. Table 8-21 on page 252
and Table 8-22 on page 253 show functions. Table 8-22 on page 253 shows other
types of SQL statements.
Note: Appendix C, “Converting SQL and PL/SQL to Fujitsu Enterprise Postgres SQL and
PL/pgSQL” on page 387 provides specific examples of SQL migrations that are frequently
used in Oracle Database.
Representing Character Data: Both databases support CHAR, VARCHAR2, NCHAR, and
NVARCHAR2 types. However, the maximum sizes are specified differently. CLOB, NCLOB, and
LONG types can be converted to the TEXT type to store text.
Representing Numeric Data: Oracle Database and Fujitsu Enterprise Postgres support
different number data types. Convert as needed to resolve the differences in significant
figures and truncation.
Representing Date and Time Data: Both databases support date and time data types.
However, convert as needed to resolve the differences in precision, time zone specification,
and format.
Representing Specialized Data: Large object data types, such as BLOB, store binary data.
Convert to alternative data types on FUJITSU Enterprise Postgres. Both databases support
XML data types and JSON data types, but the functions are different. Design how to convert.
Other data types are not supported on FUJITSU Enterprise Postgres. Conversion must be
accounted for.
Identifying Rows by Address: Fujitsu Enterprise Postgres does not have equivalent data
types for ROWID and UROWID. Convert as needed to identify the row by using the SERIAL type or
SEQUENCE.
Displaying Metadata for SQL Operators and Functions: Fujitsu Enterprise Postgres does not
support equivalent data types for displaying metadata for SQL operators and functions, such
as the ARGn data type. Conversion must be accounted for.
CREATE SCHEMA The CREATE SCHEMA statement does not create a schema object on Oracle Database.
This statement creates a new schema object on FUJITSU Enterprise Postgres. Specify
a schema name that is different from your existing schemas. Otherwise, the database
server issues an error.
CREATE DATABASE LINK The CREATE DATABASE LINK statement creates a database link on Oracle Database that
enables access to objects on another database.
The Foreign Data Wrapper (FDW) function is used on Fujitsu Enterprise Postgres to
implement the function to access objects on another database, which allows access to
objects on Fujitsu Enterprise Postgres or Oracle Database. The CREATE EXTENSION
statement enables an FDW feature that is supplied as additional modules.
CREATE TRIGGER The CREATE TRIGGER statement creates a database trigger on both databases. However,
conversion must account for the following differences.
Syntax to write trigger functions.
Languages to write trigger functions.
Events that call a trigger's function.
CREATE INDEX Oracle Database supports several types of indexes, such as B-tree indexes, bitmap
indexes, and functional-based indexes.
Fujitsu Enterprise Postgres supports only B-tree indexes. If Oracle Database uses
indexes other than B-tree, they must be changed to B-tree indexes, or these indexes
must be deleted.
The difference in the length of the index keys and the data types that are specified for
the index key must be considered.
CREATE MATERIALIZED VIEW The CREATE MATERIALIZED VIEW statement creates a materialized view on both
databases. However, conversion must account for the following differences:
Features that are supported in materialized views.
How to refresh materialized views.
The syntax to write materialized views.
CREATE OPERATOR The CREATE OPERATOR statement allows you to define operators on both databases.
However, conversion must account for the differences in syntax.
CREATE SEQUENCE The CREATE SEQUENCE statement creates a sequence on both databases. However,
conversion must account for the differences in syntax.
CREATE FUNCTION and CREATE PROCEDURE: Both databases support the CREATE FUNCTION and
CREATE PROCEDURE statements to create stored functions and stored procedures. However,
conversion must account for the differences in syntax and languages.
CREATE TABLE The CREATE TABLE statement creates a relational table on both databases. However,
conversion must account for the differences in the data types that are specified in the
tables and syntax.
After creating a table, partitions can be defined on both databases. However,
conversion must account for the following differences:
Types of partitioning.
Partitioning features.
Syntax to define partitions.
CREATE VIEW The CREATE VIEW statement creates a view on both databases. However, Fujitsu
Enterprise Postgres does not support the WITH READ ONLY option, so conversion must
account for the differences in syntax.
CREATE ROLE The CREATE ROLE statement creates a role on both databases. However, conversion must
account for the differences in syntax.
CREATE USER The features of database users are different. A user on Fujitsu Enterprise Postgres is a
type of role. Thus, conversion must account for the differences in the functions and DDL
syntax.
User authentication on Fujitsu Enterprise Postgres is managed in the configuration file
pg_hba.conf.
CREATE * Fujitsu Enterprise Postgres does not support database objects such as clusters,
index-organized tables, object tables, packages, and synonyms. You must implement
equivalent functions.
MERGE Oracle Database supports the MERGE statement to select rows from one or more
sources for update or insertion into a table or view.
The equivalent function in Fujitsu Enterprise Postgres is implemented by using the
INSERT statement with the ON CONFLICT clause.
EXPLAIN PLAN The EXPLAIN PLAN statement on Oracle Database is equivalent to the EXPLAIN
statement on FUJITSU Enterprise Postgres. However, conversion must account for the
following differences:
The executed results are not stored in a table.
Syntax.
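As an illustration of the MERGE and EXPLAIN PLAN rows above, the following minimal sketch
(using a hypothetical item_stock table that is not part of this book's examples) shows one
possible conversion. It assumes that a unique or primary key constraint exists on the conflict
target column:

-- Oracle Database (for illustration only):
--   MERGE INTO item_stock t
--   USING (SELECT 1001 AS i_id, 25 AS qty FROM dual) s
--   ON (t.i_id = s.i_id)
--   WHEN MATCHED THEN UPDATE SET t.qty = s.qty
--   WHEN NOT MATCHED THEN INSERT (i_id, qty) VALUES (s.i_id, s.qty);
-- Fujitsu Enterprise Postgres equivalent by using ON CONFLICT:
INSERT INTO item_stock (i_id, qty)
VALUES (1001, 25)
ON CONFLICT (i_id) DO UPDATE SET qty = EXCLUDED.qty;
-- EXPLAIN prints the plan directly instead of storing it in a plan table:
EXPLAIN SELECT * FROM item_stock WHERE i_id = 1001;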
Numeric functions Most numeric functions are supported on FUJITSU Enterprise Postgres. However,
conversion must account for the differences in parameters and precision of return
values.
The following functions are not supported. Thus, other functions must be used for
implementation.
BITAND
REMAINDER
Character functions returning character values: Most character functions returning character
values are supported on FUJITSU Enterprise Postgres. However, conversion must account
for the differences in parameters.
The following functions are not supported. Thus, other functions must be used for
implementation.
NCHR
NLS_INITCAP
NLS_LOWER
NLS_UPPER
SOUNDEX
TRANSLATE ... USING
Character functions returning number values: Most character functions returning number
values are supported on FUJITSU Enterprise Postgres.
Single-Row Functions
Date and time Most date and time functions are supported on FUJITSU Enterprise Postgres. However,
functions conversion must account for the differences in precision and time zones that are
available.
General comparison functions: General comparison functions are supported on FUJITSU
Enterprise Postgres.
Conversion functions Most conversion functions are supported on FUJITSU Enterprise Postgres. However,
conversion must account for the differences in precision and parameters.
Collection functions CARDINALITY is not supported on FUJITSU Enterprise Postgres. The equivalent
information is acquired by running SQL.
Other collection functions are also not supported on FUJITSU Enterprise Postgres.
Consider how to implement the equivalent functions.
XML functions Most XML functions are supported on FUJITSU Enterprise Postgres. However,
conversion must account for the differences in parameters.
JSON functions Both databases support JSON functions, but the function is different. Consider how to
implement the equivalent function on FUJITSU Enterprise Postgres.
Encoding and decoding functions: Both databases support DECODE, but the function is
different. Consider how to implement an equivalent function on FUJITSU Enterprise
Postgres. Other encoding and decoding functions are not supported on FUJITSU Enterprise
Postgres. Consider how to implement the equivalent functions.
NULL-related functions Most NULL-related functions are supported on FUJITSU Enterprise Postgres.
Environment and identifier functions: USER is not supported on FUJITSU Enterprise
Postgres. Implement the equivalent function by using an alternative function. Other
environment and identifier functions are not supported on FUJITSU Enterprise Postgres.
Consider how to implement the equivalent functions.
Other functions The following functions are not supported on FUJITSU Enterprise Postgres. Consider
how to implement the equivalent functions.
Character set functions
Collation functions
Large object functions
Hierarchical functions
Data mining functions
Aggregate functions Most aggregate functions are supported on FUJITSU Enterprise Postgres. However,
conversion must account for the differences in return values and options.
Analytic functions Most analytic functions are supported on FUJITSU Enterprise Postgres. However,
conversion must account for the differences in return values and options.
Object reference functions Object reference functions are not supported on FUJITSU Enterprise Postgres.
Consider how to implement the equivalent functions.
Model functions Model functions are not supported on FUJITSU Enterprise Postgres. Consider how to
implement the equivalent functions.
OLAP functions OLAP functions are not supported on FUJITSU Enterprise Postgres. Consider how to
implement the equivalent functions.
Data cartridge functions Data cartridge functions are not supported on FUJITSU Enterprise Postgres. Consider
how to implement equivalent functions.
User-defined functions User-defined functions are supported on FUJITSU Enterprise Postgres. However,
conversion must account for differences such as syntax so that they can be run on
FUJITSU Enterprise Postgres.
Transaction Control Statements
COMMIT: Both databases support COMMIT statements. However, conversion must account for
the differences in syntax.
ROLLBACK: Both databases support ROLLBACK statements. However, conversion must account
for the differences in syntax. If the TO SAVEPOINT clause is specified on Oracle Database, use
the ROLLBACK TO SAVEPOINT statement on FUJITSU Enterprise Postgres.
SAVEPOINT: Both databases support SAVEPOINT statements. However, conversion must account
for the differences in creating a save point with the same name as an existing save point.
SET TRANSACTION: Both databases support SET TRANSACTION statements, but the function is
different. If the ISOLATION LEVEL clause is specified on Oracle Database, use the SET
TRANSACTION statement on Fujitsu Enterprise Postgres and account for the differences in
syntax. If other clauses are specified, consider how to implement the equivalent function.
SET CONSTRAINT: Both databases support the SET CONSTRAINT statement function, but the
statement name is different. Convert to SET CONSTRAINTS on FUJITSU Enterprise Postgres.
Session Control Statement
ALTER SESSION: The ALTER SESSION statement is not supported on FUJITSU Enterprise
Postgres.
SET ROLE: Both databases support SET ROLE statements, but the function is different. Consider
how to implement the equivalent function. SET ROLE statements on Oracle Database enable or
disable a role for the current session. SET ROLE statements on Fujitsu Enterprise Postgres
change the user identifier for the current session.
System Control Statement Both databases support ALTER SYSTEM statements, but the function is different because
of the differences in the database architecture. Consider how to implement the
equivalent function.
Operators Most operators are supported on FUJITSU Enterprise Postgres. However, if Oracle
Database uses the following operators, consider how to convert or implement the
equivalent function:
Hierarchical query operators such as PRIOR and CONNECT_BY_ROOT
MINUS
Multiset operators such as MULTISET, MULTISET EXCEPT, MULTISET INTERSECT, and
MULTISET UNION
Both databases support the concatenation operator '||'. However, conversion must
account for the differences in behavior when specifying concatenated strings containing
NULL.
Expressions Most expressions are supported on FUJITSU Enterprise Postgres. However, conversion
must account for the differences in syntax and format of the returned value.
The following expressions are not supported. Thus, consider how to implement
equivalent functions.
CURSOR expressions
Model expressions
Object access expressions
Placeholder expressions
Type constructor expressions
Others
Implicit Data Conversion: Both databases support implicit data conversion, but the function is
different. The scope of implicit data conversion in Fujitsu Enterprise Postgres is smaller than
in Oracle Database. Consider how to implement the equivalent function by using alternatives
such as explicit data conversion.
Zero-length character value: Oracle Database handles zero-length character values as NULL,
but Fujitsu Enterprise Postgres handles them as not NULL. If zero-length character values are
used as NULL on an Oracle Database, consider how to implement the equivalent function on
FUJITSU Enterprise Postgres.
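The differences in concatenation with NULL and in zero-length character values often surface
only during testing. The following minimal sketch (hypothetical values and the illustrative
customer table, not from this book's examples) shows the behavior on Fujitsu Enterprise
Postgres:

SELECT 'abc' || NULL;                -- returns NULL here; Oracle Database returns 'abc'
SELECT 'abc' || coalesce(NULL, '');  -- returns 'abc'
-- '' and NULL are distinct values, so predicates written for Oracle semantics
-- might need to check for both:
SELECT count(*) FROM customer
 WHERE cust_name IS NULL OR cust_name = '';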
– PL/SQL
Table 8-23 shows key considerations when migrating from Oracle Database PL/SQL to
Fujitsu Enterprise Postgres PL/pgSQL.
Basic syntax
Basic syntax (Block, Variables): Basic syntax elements of PL/SQL, such as blocks, error
handling, and variables, are supported. However, conversion must account for the minor
differences.
Data types
Scalar data types: For more information about SQL data types, see Table 8-18 on page 249.
Both databases support the BOOLEAN data type. Other data types, such as the PLS_INTEGER and
BINARY_INTEGER data types, are not supported. Consider how to implement the equivalent
functions.
Composite data types: Collection types are not supported on FUJITSU Enterprise Postgres.
Consider how to implement the equivalent functions, for example, by using a temporary
table. Both databases support record variables. However, conversion must account for the
minor differences.
LOOP statement Both databases support basic LOOP statements, WHILE LOOP statements, and FOR LOOP
statements. However, conversion must account for the differences in the REVERSE clause
of the FOR LOOP statement (see the sketch after this table).
GOTO statement The GOTO statement is not supported on FUJITSU Enterprise Postgres. Consider how to
implement the equivalent function on FUJITSU Enterprise Postgres.
Cursor Both databases support cursors. However, conversion must account for the minor
differences.
Static SQL
Transaction processing and control: Both databases support COMMIT and ROLLBACK
statements. However, conversion must account for the differences in transaction control
specifications. For example, on FUJITSU Enterprise Postgres, in a block that includes error
handling that uses the EXCEPTION clause, COMMIT and ROLLBACK statements return errors.
Dynamic SQL
EXECUTE IMMEDIATE: The EXECUTE IMMEDIATE statement on Oracle Database is equivalent to the
EXECUTE statement on FUJITSU Enterprise Postgres.
OPEN FOR statement: Both databases support OPEN FOR statements. However, conversion must
account for the minor differences.
Subprograms Subprograms are not supported on FUJITSU Enterprise Postgres. Thus, consider how
to implement the equivalent functions.
Triggers Both databases support triggers. However, conversion must account for the minor
differences.
Packages Packages are not supported on FUJITSU Enterprise Postgres. Thus, consider how to
implement the equivalent functions.
Oracle Supplied PL/SQL packages: Some packages and procedures, such as DBMS_OUTPUT,
UTL_FILE, and DBMS_SQL, are supported on FUJITSU Enterprise Postgres. However,
conversion must account for the minor differences. Other Oracle-supplied PL/SQL packages
are not supported on FUJITSU Enterprise Postgres. Thus, consider how to implement the
equivalent functions.
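The LOOP statement row in Table 8-23 notes that the REVERSE clause behaves differently. The
following minimal PL/pgSQL sketch (not taken from the book's batch job example) shows the
difference:

DO $$
BEGIN
   -- Oracle PL/SQL counts down with REVERSE low..high, for example:
   --   FOR i IN REVERSE 1..5 LOOP ... END LOOP;   -- iterates 5, 4, 3, 2, 1
   -- PL/pgSQL expects the bounds in high..low order instead:
   FOR i IN REVERSE 5..1 LOOP
      RAISE NOTICE 'i = %', i;                     -- prints 5, 4, 3, 2, 1
   END LOOP;
END;
$$;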
– Application
Some interfaces that are supported on Oracle Database are not supported on
FUJITSU Enterprise Postgres. Even if the same interface is supported on both
databases, API specifications might differ. Consider how to implement the equivalent
functions on Fujitsu Enterprise Postgres while accounting for the differences in the
interfaces.
Fujitsu Enterprise Postgres supports the following client interfaces:
• JDBC driver.
• ODBC driver.
• libpq - C Library.
• ECPG - Embedded SQL in C.
– Batch file for database operation
The specifications of operation commands that are written in batch files for database
operations are widely different between Oracle Database and FUJITSU Enterprise
Postgres. Consider how to implement the equivalent functions on FUJITSU Enterprise
Postgres.
– Data
The basic steps of data migration are as follows.
i. Extract data from the source database and store data in files.
ii. Convert data formats that are stored in files in step i to the target database format.
iii. Move and insert data that is converted in step ii to the target database.
Note: For more information about performance tuning, see 8.3.2, “Performance tuning tips”
on page 258.
– Operational testing
Make sure that the target system can operate the business as expected. Create the
test case scenarios for each of the following steady operations, complete all scenarios,
and confirm whether the results are as expected:
• Start and stop the database.
• Copy data as backups and manage them.
• Check disk usage and allocate free spaces by running VACUUM or rebuilding an
index.
• Review the audit log information.
• Check the connection status. For example, check whether there is a connection that
is connected for a long period or that occupies resources to prevent performance
degradation.
• Patching.
• Switching, disconnecting, and failing back nodes for maintenance in an HA environment.
– Recovery testing
Make sure that business can be recovered and continue if there are abnormal
problems in the target system. Create the test case scenarios for each of the following
operations, complete all scenarios, and confirm whether business can be resumed
immediately:
• Recovering from hardware failures, for example, disks and network equipment.
• Recovering data from backups.
• Processing during an abnormal operation of an application. For example, if there is
a connection that occupies resources, disconnect to eliminate the waiting status.
• Processing while running out of disk space.
• Processing during a failover in an HA environment and the subsequent failback.
This section outlines the tasks that are required for performance tuning.
Performance tuning can be organized into three perspectives: the optimal use of computer
resources, minimizing I/O, and narrowing the search area. Each perspective is described in
detail:
Optimal use of computer resources
The key point is to consider the architecture of PostgreSQL itself. PostgreSQL uses a
write-once architecture. To make effective use of this mechanism, the parameters of the
configuration file postgresql.conf must be adjusted for computer resources, and
resources must be distributed for optimization.
Details are provided in the following tables:
– Table 8-24 on page 259
– Table 8-25 on page 260
– Table 8-26 on page 260
Minimizing I/O
The most important aspect of performance tuning is to reduce I/O activity that is
associated with data refresh operations. Therefore, the database performs processing in
memory as much as possible to improve performance. However, there are certain
processes in PostgreSQL that are run to ensure the persistence of the data, such as
COMMIT and checkpoint. COMMIT is the process where updates to the database are written
to the disk and saved. Checkpoint is the process where data that is held in memory is
written to the disk. Writing data in the memory to disks must be done efficiently or it might
lead to various bottlenecks.
Details are provided in the following tables:
– Table 8-27 on page 260
– Table 8-28 on page 261
– Table 8-29 on page 261
Narrowing the search area
The key is ensuring efficient data access by avoiding unnecessary processing and
resource consumption when accessing data in the database. Specifically, SQL statements
must be written and SQL must be run based on the latest statistics to perform data
processing with minimum I/O processing.
Also, actively use mechanisms such as prepared statements to cache similar queries and
connection pooling to eliminate connection overhead; a brief sketch follows the table list below.
Details are provided in the following tables:
– Table 8-30 on page 262
– Table 8-31 on page 262
– Table 8-32 on page 262
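The following minimal sketch shows a prepared statement against the hypothetical
item_stock table (an illustrative name, not from this book's examples). Connection pooling
itself is typically provided outside SQL, for example by middleware such as Pgpool-II or by an
application server pool.

PREPARE get_stock (int) AS
    SELECT qty FROM item_stock WHERE i_id = $1;  -- parsed and planned once
EXECUTE get_stock(1001);                         -- subsequent calls reuse the plan
EXECUTE get_stock(2002);
DEALLOCATE get_stock;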
Table 8-24 Organizing table and index data and updating statistics
Task Description
Tuning goal Organize table and index data and update statistics.
Tuning objectives 1. Prevent bloating of data files and increased I/O processing.
2. Prevent statistics from becoming stale and causing unnecessary
I/O activity without proper execution planning.
Tuning method Consider setting up and implementing the following tasks regularly:
1. Prevent data files from enlarging:
– Reuse unneeded space by running VACUUM.
– Remove unnecessary space by running REINDEX.
2. Update statistics with ANALYZE to avoid unnecessary I/O activity that is
caused by stale statistics.
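A minimal sketch of these regular maintenance tasks, run against the hypothetical customer
table (an illustrative name, not from this book's examples):

VACUUM (VERBOSE) customer;   -- reclaim space that is left by updates and deletes
REINDEX TABLE customer;      -- rebuild bloated indexes on the table
ANALYZE customer;            -- refresh the statistics that the planner uses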
Tuning objectives Adjust the parameters for the execution plan to increase the
likelihood that a better execution plan is selected.
Tuning method In postgresql.conf, adjust the parameters for work memory size,
genetic query optimizer, planner’s estimated costs, parallel
processing, and conflict.
Table 8-26 Accelerating SQL execution with resource partitioning and avoiding locks
Task Description
Tuning goal Accelerate SQL execution with resource splitting and lock
avoidance.
Tuning method 1. Leverage table spaces and partitioning. Tune SQL to use
partition pruning.
2. Review application logic to shorten the database resource lock
duration, such as by splitting transactions.
Tuning objectives Optimize I/O processing by tuning various buffer sizes, such as
shared_buffers and wal_buffers, and parameters that are related to
WAL and checkpoints.
Tuning method Decide the values of the various buffer sizes, including
shared_buffers, wal_buffers, checkpoint_timeout, and max_wal_size,
and the parameters for WAL and checkpoints, based on the hardware
specifications and the characteristics of the operation.
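These parameters can be set directly in postgresql.conf or, as sketched below, with ALTER
SYSTEM. The values are placeholders and must be sized to the actual hardware and workload:

ALTER SYSTEM SET shared_buffers = '8GB';
ALTER SYSTEM SET wal_buffers = '64MB';
ALTER SYSTEM SET checkpoint_timeout = '15min';
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();  -- checkpoint settings reload; shared_buffers and
                          -- wal_buffers take effect only after a restart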
Task Description
Tuning method 1. When inserting bulk data, temporarily remove indexes and
create them later one at a time.
2. If there are many updates, apply a specification that provides a
margin to the data storage area (FILLFACTOR).
This process can streamline processing by eliminating the need
for index updates.
Tuning goal Large amounts of data are stored in bulk and in parallel.
Tuning method Store data by using a COPY command and running multiple
executions in parallel.
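A minimal sketch that combines the two bulk-load tasks above, using the hypothetical
customer table and illustrative file names (server-side COPY requires appropriate privileges;
the psql \copy command is an alternative):

DROP INDEX IF EXISTS customer_name_idx;        -- remove secondary indexes first
ALTER TABLE customer SET (fillfactor = 80);    -- leave update headroom in new pages
COPY customer FROM '/tmp/data/customer_1.csv' WITH (FORMAT csv);
-- Additional COPY commands for other file chunks can run in parallel sessions.
CREATE INDEX customer_name_idx ON customer (cust_name);  -- re-create indexes afterward
ANALYZE customer;                              -- refresh statistics after the load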
Tuning objectives 1. Replace sort and scan processes with indexes to improve SQL
performance.
2. Improve performance by using indexes more efficiently.
Tuning objectives The performance of SQL varies greatly depending on the execution
plan. Ensuring that the execution plan does not fluctuate is critical.
We describe the features of these two use cases and the concepts of migration. Also, we
describe how to determine the target platform when migrating database systems.
Virtualization level 1
Applications run on operating systems (OSs) that are directly hosted by server machines.
Virtualization level 2
The infrastructure layer is virtualized. Historically, with the improvement of server machine
processing power and capacity, an abstraction layer that is known as the hypervisor was
introduced. Hypervisors run between the physical hardware and virtual machines to allow
applications to efficiently use the compute resources of one machine. The abstraction is
realized by software, firmware, or hardware. Each virtual machine (VM) runs its own guest
OS. In virtualization level 2, the applications are abstracted from the infrastructure layer.
Virtualization level 3
The OS layer is virtualized. Applications share the OS kernel of the host system, which
resolves the concerns of excess overhead in memory and storage allocation that is
required for VMs. In virtualization level 3, the applications are abstracted from the OS and
infrastructure layers.
Along with the virtualization of underlying layers, containerization technology can package
a single application and its dependencies into a lightweight “container” to run on any OS.
The advancement of container orchestration tools and platforms promotes the adoption of
containerized applications.
In the following sections, we describe the general characteristics and the migration of legacy
systems, which are divided into two categories:
Migration scenario for large-scale legacy systems
These systems are often migrated to a traditional on-premises environment with
virtualization level 1 or 2.
Migration scenario for small and medium-scale systems
These systems are often migrated to a private cloud with virtualization level 3 or a public
cloud with virtualization level 3.
Modernization should be strategically planned and carried out to achieve DX initiatives with
maintainability and extensibility.
Fujitsu Enterprise Postgres is a database with an open interface that is based on open-source
PostgreSQL, with enhanced features, such as security and HA, that can be used in enterprise
systems. It also provides flexibility for several environments. Therefore, it can be used in
either use case. In addition, there are cost-saving benefits because the HA and
security-strengthening features are available as standard features.
However, the system requirements vary, so migration to a container environment might not be
possible. In some cases, a traditional on-premises environment with virtualization level 1 or 2
might be best to meet the system requirements. Therefore, identify the characteristics of the
systems first; decide whether migration to a container environment is possible; and then
decide whether migration to a traditional on-premises environment with virtualization level 1
or 2 is wanted to determine the appropriate target system platform.
Note: For more information about container technology, contact a FUJITSU representative
at the following website:
https://www.fast.fujitsu.com/contact
Section 8.4.3, “Migration scenario for large-scale legacy systems” on page 266 describes a
use case of migration from large enterprise systems on Oracle RAC to Fujitsu Enterprise
Postgres on a traditional on-premises environment with virtualization level 2. The test cases
that are used to explain this migration were conducted in the lab environment that was set up
for this book, which simulates this use case.
Section 8.4.4, “Migration scenario for small and medium-scale systems” on page 279
describes a use case of migration from small to medium-sized enterprise systems on Oracle
RAC to Fujitsu Enterprise Postgres on a containerized environment with virtualization level 3.
The test cases that are used to explain this migration were conducted in a lab environment
that was set up for this publication, which simulates this use case.
The options that are available for the target architecture of a migration are not limited to the
ones that are presented in this book. For example, to offload hardware maintenance in the
target database system, some organizations might prefer to move from legacy database
systems on private cloud environments to a public cloud while the database software is kept
in a non-containerized form.
For an experience-based look at the installation of Red Hat OpenShift Container Platform
(RHOCP) on IBM LinuxONE, see Red Hat OpenShift Installation Process Experiences on
IBM Z and IBM LinuxONE. Platform and cloud types can be mixed and matched to cater to the
nature of different database systems and workloads.
To further enhance the scalability and manageability of containerized systems, IBM offers
IBM Cloud Pak for Multicloud Management, which provides a management layer for multiple
Red Hat OpenShift clusters across private clouds and the IBM public cloud.
For more information or assistance with designing the migration journey that best achieves
your organization’s goals, contact your IBM or Fujitsu customer service representative.
For more information about these requirements, see the following sections.
8.2.1, “Business continuity” on page 215
8.2.2, “Mitigating security threats” on page 217
8.2.3, “SQL performance tuning” on page 221
System environment
The examples that are shown in this section are validated on the following systems. Both
systems use servers with the same specifications.
Source system
– OS: Red Hat Enterprise Linux
– Database: Oracle Database 19c
Target system
– OS: Red Hat Enterprise Linux
– Database: Fujitsu Enterprise Postgres 13
Migration scope
The migration scope includes DDL, data, and applications in an Oracle Database (PL/SQL):
Online processing and table data
As an example of online transactional processing, we use TPC Benchmark C (TPC-C),
which portrays the activity of a wholesale supplier such as ordering and payment. This
online processing performs consecutive processing of ordering, payment, order status
checks, delivery, and inventory checks.
In addition to online processing, we should migrate the table data that is used in the
processing. The data size that is used for this example is ~1.5 GB.
Batch job: Daily processing for aggregating business data
We created a batch job for the business process by using TPC-C databases for this
example. The batch job was created by using stored procedures of PL/SQL. This process
assumes a closing operation for daily daytime operations. This processing aggregates the
number of orders and sales for each item for the day and inserts the aggregated data into
the daily_sales table. This table is defined as a quarterly partitioned table. The table
definition is shown in Example 8-14.
Migration tools
In this section, we describe the tools that are used in the migration. Ora2Pg is used for DDL
and data migration, which is one of the migration tools that is used when migrating from
Oracle Database to PostgreSQL.
Source of the tool
This tool is open-source software and is available at no charge. It is distributed under the
GPL license. Comply with the license terms when using it.
Note: For more information about and to download Ora2Pg, see Ora2Pg.
Migration steps for a typical OLTP processing and table data scenario
In this section, we introduce the steps to migrate DDL and table data according to the
migration process that is described in 8.3.1, “Experience-based migration technical
knowledge” on page 231. We assume that we already know where changes will be made and
have determined how to convert them during migration.
DDL migration
We show the steps to migrate DDL when using Ora2Pg and running on the Oracle host
machine.
In this example, a partitioned table is used to demonstrate DDL migration.
Table data migration
Ora2Pg can also be used for migrating the data that is stored in the table. As a best
practice, use test data to confirm that the scripts work before using them on live data. In
general, table data can be migrated to the new system at a different point in time (before
go-live). To simplify the steps, DDL and table data are migrated simultaneously in this example.
Because Ora2Pg does not have extract, transform, and load (ETL) or CDC functions, we
cannot ensure data integrity when retrieving data from a running database. Therefore, we
choose a simple migration method that disconnects all client connections to the Oracle
Database and then migrates the DDL and table data.
Note: If you need help when using a solution that uses ETL or CDC, contact a Fujitsu
representative at https://www.postgresql.fastware.com/contact.
3. Divide the DDLs that were retrieved in step 1 on page 269 into “DDL before inserting data”
and “DDL after inserting data”.
Generally in a table with foreign key constraints, it is necessary to determine the order in
which data is inserted and consider the constraint conditions, which can be a complicated
process. However, in database migration, the procedure can be simplified because the
table data in the source database already satisfies foreign key constraints. In this example,
insert all the data without any foreign key constraints that are defined for the target table,
and then define foreign key constraints after data is loaded.
To follow this procedure, divide the DDLs of the table definition that were retrieved in step
1 on page 269 into two parts: DDLs before inserting data (without a definition of foreign
key constraints), and DDLs after inserting data (the definition of foreign key constraints) by
using the ALTER TABLE statement.
Also, divide the DDL defining indexes in the same way as for the table definition to reduce
the time that is required to load data.
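As a minimal sketch of the “DDL after inserting data” part (the table, column, and constraint
names below are illustrative, not the actual TPC-C DDL), the foreign key and its supporting
index are added only after the data is loaded:

ALTER TABLE orders
    ADD CONSTRAINT orders_customer_fk
    FOREIGN KEY (o_c_id) REFERENCES customer (cust_id);  -- assumes cust_id is the primary key
CREATE INDEX orders_customer_idx ON orders (o_c_id);     -- index creation deferred until after the load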
4. Run the “DDL before inserting data” scripts in Fujitsu Enterprise Postgres by using the
psql utility to define tables, as shown in Example 8-17.
5. Insert the table data that was retrieved in step 2 on page 269 into FUJITSU Enterprise
Postgres.
The table data that was retrieved in step 2 on page 269 is inserted into Fujitsu Enterprise
Postgres by using the psql utility to run the script, as shown in Example 8-18.
7. Update statistics.
As a best practice, run the ANALYZE command to update the statistics that are associated
with the table, especially when a large amount of data was inserted. Not doing so can
result in poor query plans. We used the command that is shown in Example 8-20.
Note: The numbers that are used in the annotations (such as “Step 1" and “Step 2") in the
examples indicate the migration step numbers that are described in “Procedure for
aggregating business data daily” on page 271.
WHERE
s.s_i_id = oo.i_id AND
s.s_w_id = oo.s_w_id
GROUP BY s.s_i_id ;
BEGIN
target_day := TRUNC(calc_day, 'DD');--Step 5
-- Delete any previous data
DELETE FROM daily_sales WHERE TRUNC(entry_date, 'DD') = target_day;
COMMIT;
-- Insert aggregated records in daily_sales table
FOR var_rec IN c1 LOOP
INSERT INTO daily_sales VALUES
(target_day, var_rec.i_id, var_rec.quantity, empty_blob());--Step 6
-- Store sales slip data
SELECT bussiness_form INTO wk_blob FROM daily_sales WHERE entry_date =
target_day AND i_id = var_rec.i_id FOR UPDATE;
wk_bfile := BFILENAME('FILE_DIR', var_rec.i_id || '.pdf');
DBMS_LOB.FILEOPEN(wk_bfile, DBMS_LOB.FILE_READONLY);
DBMS_LOB.LOADFROMFILE( wk_blob, wk_bfile, DBMS_LOB.GETLENGTH( wk_bfile));
DBMS_LOB.FILECLOSE(wk_bfile);
END LOOP;
COMMIT;
END;
/
The following steps describe how to convert this procedure in Oracle Database PL/SQL, as
shown in Example 8-21 on page 271, to Fujitsu Enterprise Postgres PL/pgSQL:
1. Change the syntax of CREATE PROCEDURE.
Change the syntax of the CREATE PROCEDURE statement to accommodate the differences
between Oracle and FUJITSU Enterprise Postgres. Table 8-33 shows the differences
between the two procedures.
Table 8-33 The syntax of CREATE PROCEDURE
Oracle Database FUJITSU Enterprise Postgres
BLOB bytes
NUMBER(6, 0) int
Following these steps facilitates the migration of the procedure from PL/SQL to PL/pgSQL, as
shown in Example 8-22.
BEGIN
target_day := date_trunc('day', calc_day);
-- Delete any previous data
DELETE FROM daily_sales WHERE date_trunc('day', entry_date) = target_day;
COMMIT;
-- Insert aggregated records including sales slip data into daily_sales table
FOR var_rec IN c1 LOOP
INSERT INTO daily_sales VALUES
(target_day, var_rec.i_id, var_rec.quantity,
pg_read_binary_file('/tmp/data/' || var_rec.i_id || '.pdf'));
END LOOP;
COMMIT;
END;
$$;
The following steps describe how to convert this function in Oracle Database PL/SQL
(Example 8-23 on page 275) to Fujitsu Enterprise Postgres PL/pgSQL:
1. Change the syntax of CREATE FUNCTION.
Change the syntax of the CREATE FUNCTION statement because it is different between
Oracle Database and Fujitsu Enterprise Postgres (Table 8-39).
Table 8-39 The syntax of CREATE FUNCTION
Oracle Database FUJITSU Enterprise Postgres
NUMBER numeric
6. Convert the outer join by using (+) to the SQL standard description (Table 8-43).
Oracle Database supports (+) to perform an outer join. Fujitsu Enterprise Postgres uses
LEFT JOIN to achieve the same result.
Table 8-43 Converting an outer join that is specified as (+)
Oracle Database FUJITSU Enterprise Postgres
NVL(s.total_quantity, 0) coalesce(s.total_quantity::numeric,
0::numeric)
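A minimal sketch of this conversion, using illustrative table names rather than the exact query
from Example 8-23:

-- Oracle Database outer join that uses (+) (for illustration only):
--   SELECT i.i_id, NVL(s.total_quantity, 0)
--     FROM item i, sales_summary s
--    WHERE i.i_id = s.i_id (+);
-- SQL-standard form on Fujitsu Enterprise Postgres:
SELECT i.i_id, coalesce(s.total_quantity, 0) AS total_quantity
  FROM item i
  LEFT JOIN sales_summary s ON i.i_id = s.i_id;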
Following these steps facilitates the migration of the function from PL/SQL to PL/pgSQL, as
shown in Example 8-24.
We start with the characteristics and key considerations for migration of small and
medium-scale systems:
Characteristics
We assume small or medium-scale line-of-business systems within organizations, and core
systems that are installed in branches or stores, such as in supermarket chains.
Therefore, many systems are scattered throughout the company. These systems are
operated, managed, and secured at each location and often operated primarily on small
servers. The data that is handled in these systems is closed within each line-of-business,
branch, and store to ensure high confidentiality. The total cost of operating and managing
these systems increases proportionately to the number of systems that are scattered
throughout the company.
Key considerations for migration
Consider reducing operational and management costs by consolidating them into a
container environment. There are two options about where to deploy a container: on a
private cloud and on a public cloud. Choose the appropriate option depending on how
sensitive the data is. In either choice of cloud, using container technology enables
organizations to reduce operational costs.
When considering the migration of small and medium-scale database systems, it is also a
great opportunity to consider moving to open interfaces. The benefit of replacing legacy
systems with open systems is that the organization will have access to many tools and
supporting software that can be implemented easily with the open interfaces. The wide
variety of options serve as building blocks of DX. In addition to the benefits of open
systems, Fujitsu Enterprise Postgres for Kubernetes enables the use of automation for
database system operations based on Kubernetes technology.
In addition, when consolidating into a container environment, the data is transferred to an
environment where multiple databases share the hardware layer and the platform layers.
Therefore, it is necessary to consider enhanced security compared to the security
measures that are taken for the source database systems where limited, closed access to
each individual server ensured high confidentiality.
These concerns are addressed by the isolation capabilities of RHOCP and high security
features of FUJITSU Enterprise Postgres. When different database systems are launched on
RHOCP in different containers, there is no need to consider conflicting port numbers, OS
usernames and database usernames, and directory configuration. Also, the RHOCP Project
feature (namespaces in Kubernetes) allows database systems to be isolated, which means
that with the appropriate permissions set for the database administrators, a database
administrator with permission to operate a particular project cannot interfere with the
database systems of other projects.
RHOCP and Fujitsu Enterprise Postgres provide role-based access control (RBAC). When
the authentication and permissions features of RHOCP and Fujitsu Enterprise Postgres are
configured and operational, attacks through legitimate access paths do not compromise the
contents of the database. An example of such an attack is using SQL from client applications
to access the database.
However, there are also attacks through non-legitimate means, for example, stealing the
storage device or file images, or directly accessing the network equipment to eavesdrop on
communication data. In these cases, authentication and permission settings are not a valid
countermeasure.
Encryption is effective to protect the data from attacks against database systems by these
non-legitimate means. In addition to RBAC, Fujitsu Enterprise Postgres provides file-level
encryption and encryption of communication.
Figure 8-12 Access restriction and encryption for data protection with RHOCP and FUJITSU Enterprise
Postgres
The migration can run independently in each of the database systems because a namespace
is allocated for each database cluster in the target architecture. Each of the small and
medium-scale database systems that are scattered within an organization is migrated to
one namespace on the RHOCP environment. Therefore, after migration to the container
environment, organizations can continue to use the database names and database
usernames that are currently used. By separating the database clusters (namespaces), the
relationships between DBAs, database owners, and developers are maintained. The isolation
between databases is also ensured.
In addition, the high capacity of IBM LinuxONE can be leveraged, so hardware resources can
be used effectively.
Note: For more information about the procedures for deployment and operations of Fujitsu
Enterprise Postgres by using Fujitsu Enterprise Postgres Operator, see Chapter 7,
“Leveraging containers” on page 153.
This section describes the important points to remember when moving to a container
environment. The work that is associated with the migration to an RHOCP environment by
using Fujitsu Enterprise Postgres Operator is explained.
For more information about the migration flow, see 8.3.1, “Experience-based migration
technical knowledge” on page 231. For more information about migration to FUJITSU
Enterprise Postgres, see 8.4.3, “Migration scenario for large-scale legacy systems” on
page 266.
This section assumes that the RHOCP environment on the cloud was created in advance.
The following sections explain the migration process for databases.
Assessment
An assessment should be conducted when the migration is being considered.
This task is a common one when migrating to an on-premises environment. There are no
special points to consider for the migration to an RHOCP environment.
Estimation
The cost and time that is required for the migration must be estimated to determine whether
to perform the migration.
This task is a common one when migrating to an on-premises environment. There are no
special points to consider for the migration to an RHOCP environment.
The resources (CPU, memory, and storage), projects (namespace), and administrator ID of
the target RHOCP environment must be prepared.
Migration
In this task, the design and construction of the database configuration, operational design of
the database, application design, implementation, and testing are performed.
Consider the following items for databases in the RHOCP environment for each activity:
Migration activities: Database configuration design and construction:
– Security design
Design security to ensure security in an environment with consolidated databases.
There are two types of security design: permission design for accounts, and encryption
of files and communication paths.
Design permissions and scope for DBAs and developers so that each person has no
access to surrounding database instances. Apply permission settings to the accounts.
For more information about implementing Fujitsu Enterprise Postgres features for
file-level encryption and communication-path encryption, see “Enabling security with
the Fujitsu Enterprise Postgres Operator” on page 285.
– Cluster configuration design
The cluster configuration comes in a template. The design of the cluster configuration,
such as the number of replicas, can be designed based on the source system.
As a best practice, set up at least one replica for availability. Estimate the number of
replicas against performance requirements.
Migration activities: Database operation design:
– Backup design
The backup runs automatically as configured. Design retention periods and schedules
for backups (full backup and incremental backup) in accordance with current
requirements.
– Healing design
Automatic failover and automatic recovery run as configured, so there are no other
considerations aside from configuring them.
– Monitoring design
Configure monitoring with Grafana, Alert Manager, and Prometheus. Design the
operations of monitoring to include the utilization of tools that are linked by open
interfaces.
For more information, see 7.5.2, “Operation” on page 162 and 7.5.3, “Fluctuation” on
page 190.
Migration activities: Application design
Application design remains the same as for a migration to an on-premises environment.
There are no special considerations for migrating to an RHOCP environment.
Migration activities: Implementation
Implementation remains the same as a migration to an on-premises environment. There
are no special considerations for migrating to an RHOCP environment.
Migration activities: Testing
After migrating assets from the source system, perform operation verification to ensure
that the target system is functioning correctly:
– Consolidation on the RHOCP environment
Verify that the isolation and security settings between databases are sufficient.
– Migrating to containers
Verify that the backup works as configured by setting up an automated backup.
Note: If these two security features must be used, perform both settings when the
database cluster is deployed. It is not possible to enable security features after the
database cluster is deployed. By enabling these features, it is possible to dynamically
deploy encrypted table spaces and add clients by using MTLS.
Note: The settings to enable TDE and MTLS are explained in separate procedures. To
enable both security features when deploying the database cluster, follow steps 1 on
page 289 - 19 on page 295 in “Deploying communication path encryption by using MTLS”
on page 289. Then, when setting the parameters for step 20 on page 299, include the
parameter changes that are described in step 3 on page 286 in “Deploying with database
encryption by using TDE” on page 286. Complete the final steps in “Deploying
communication path encryption by using MTLS” on page 289.
In the Create FEPCluster window, click the YAML tab. Update the values as shown in
Table 8-46 and CR configuration parameters as shown in Figure 8-17 on page 288.
Update the deployment parameters and click Create to create a cluster.
Note: The TDE master encryption key can be updated by using pgx_set_master_key. For
more information about updating the TDE master encryption key, see 4.4.10 “Managing the
keystore” through “Rotating the TDE master key” in Data Serving with FUJITSU Enterprise
Postgres on IBM LinuxONE, SG24-8499.
4. From the Fujitsu Enterprise Postgres client, connect to the postgres database and create
a table space to apply TDE by using the command that is shown in Example 8-25.
5. Verify that the created table space is the target of the encryption by using the command
that is shown in Example 8-26.
7. From the Fujitsu Enterprise Postgres client, connect to the securedb database and create
secure_table. Then, insert the data ‘Hello World’ (Example 8-28).
8. Verify that the encrypted table can be referenced transparently (Example 8-29).
The list of certificates, certificate file names, private key file names, and passphrases that
are used in the deployment procedure is listed in Table 8-47.
CA certificate: cacert, -, -
Patroni certificate: -, mydb-patroni-cert, -
repluser certificate: -, mydb-repluser-cert, -
rewinduser certificate: -, mydb-rewinduser-cert, -
Note: For the migration procedure that is presented in this publication, the self-signed
certificates are provided by a private CA. In a production environment, set up a CA that
is trusted by the administrators and users.
2. Create a ConfigMap to store the CA certificate. We used the command that is shown in
Example 8-32.
3. Create a password to protect the Fujitsu Enterprise Postgres server private key. We used
the command that is shown in Example 8-33.
4. Create the Fujitsu Enterprise Postgres server private key (Example 8-34).
Example 8-34 Sample openssl command to create a server private key with output
$ openssl genrsa -aes256 -out fep.key 2048
Generating RSA private key, 2048-bit long modulus
................................................+++
.......+++
e is 65537 (0x10001)
Enter pass phrase for fep.key: abcdefghijk
Verifying - Enter pass phrase for fep.key: abcdefghijk
5. Create a server certificate signing request. We used the command that is shown in
Example 8-35.
$ openssl x509 -req -in fep.csr -CA myca.pem -CAkey myca.key -out fep.pem -days
365 -extfile <(cat /etc/pki/tls/openssl.cnf <(cat san.cnf)) -extensions SAN
-CAcreateserial
Signature ok
subject=/CN=mydb-headless-svc
Getting CA Private Key
Enter pass phrase for myca.key: 0okm9ijn8uhb7ygv
7. Create the TLS secret to store the server certificate and key (Example 8-37).
Note: At the time of writing, the Fujitsu Enterprise Postgres container does not support
a password protected private key for Patroni.
$ openssl x509 -req -in patroni.csr -CA myca.pem -CAkey myca.key -out
patroni.pem -days 365 -extfile <(cat /etc/pki/tls/openssl.cnf <(cat san.cnf))
-extensions SAN -CAcreateserial
Signature ok
subject=/CN=mydb-headless-svc
Getting CA Private Key
Enter pass phrase for myca.key: 0okm9ijn8uhb7ygv
11.Create a TLS secret to store the Patroni certificate and key (Example 8-41).
12.Create a private key for a postgres user client certificate (Example 8-42).
Example 8-42 Creating a private key for the postgres user client certificate
Note: At the time of writing, the SQL client inside the Fujitsu Enterprise Postgres server
container does not support a password-protected certificate.
13.Create a certificate signing request for the postgres user client certificate (Example 8-43).
$ openssl x509 -req -in postgres.csr -CA myca.pem -CAkey myca.key -out
postgres.pem -days 365
15.Create a TLS secret to store the postgres user certificate and key (Example 8-45).
19.In the Operator details window, select Create Instance, as shown in Figure 8-19.
In the Create FEPCluster window, click the YAML tab. Update the values as shown in
Table 8-49. Update the deployment parameters and click Create to create a cluster, as
shown in Figure 8-20 on page 299.
20.The database cluster is deployed, and the deployment status can be checked by selecting
Workloads → Pods, as shown in Figure 8-21. The status shows Running after the cluster
is ready.
All Fujitsu Enterprise Postgres pods must show the status as Running.
You successfully installed Fujitsu Enterprise Postgres on a Red Hat OpenShift cluster on an
IBM LinuxONE server platform with an MTLS-enabled communication path.
Note: For more information about the parameters of the Fujitsu Enterprise Postgres cluster
FEPCluster CR configuration, see 1.1, “FEPCluster Parameter” in Fujitsu Enterprise
Postgres 13 for Kubernetes Reference Guide.
Note: If the server and client root certificates are different, the DBA must update the
spec.fep.postgres.tls.caName parameter. For more information, see the Fujitsu
Enterprise Postgres 13 for Kubernetes Reference Guide.
3. The clients (application developers) use the server root certificate (myca.pem), client
certificate (tls.crt), and private key (tls.key) to connect to the database cluster, as
shown in Example 8-46.
Additional LinuxONE use cases can be found in Appendix A of the IBM Redbooks publication,
Practical Migration from x86 to LinuxONE, SG24-8377.
For more information about other use cases, see the IBM LinuxONE client web page.
Location data is captured in a standard geometric format that is usually based on the
Geographic Coordinate System (GCS), which uses latitude and longitude. These calculations
are based on the 360 degrees of the Earth with the equator representing the positive and
negative division of the latitude degrees and the Prime Meridian (Greenwich Observatory,
London) representing the start and end of the longitude degrees. This measurement system
evolved to a standard known as ISO 6709, but there are many variants and influences that
are based on the requirements to measure and plot above or below ground or beyond Earth,
that is, outer space.
After PostGIS is deployed within Fujitsu Enterprise Postgres, these coordinates are converted
and stored as either geometry or geography data types and then associated with spatial
reference systems (SRSs). Unlike other database management systems (DBMSs), PostGIS
supports multiple SRS IDs instead of only the usual EPSG:4326, which is a
worldwide system that is used by GPS systems. These values consist of components that
describe a series of 3D geographic parameters, such as the orientation, latitude, longitude,
and elevation in reference to geographic objects, which define coordinate systems and spatial
properties on a map.
Figure 8-2 on page 303 provides an example of the Tower of London, which is a Central
London Place of Interest.
Latitude: 51.508530 DMS Lat: 51° 30' 30.7080'' N
Longitude: -0.07702 DMS Long: 0° 4' 34.0752'' W
So, in summary, data from geocoding systems, whether GPS devices, postal code systems, or mapping services such as What3words, and imagery data are stored, converted, and rendered as visual models in graphical mapping solutions. Openstreetmap.org is a common open-source application that graphically maps a location by converting geocoding data into GCS coordinates and providing overlays, such as satellite imagery, that are embedded into many commercial applications. The accuracy of these coordinates depends on the precision of the data: each degree of latitude or longitude represents approximately 111 km on the ground. You can use up to eight decimal places, which represents an accuracy to within approximately 1.11 mm. Typically, postal addresses contain four decimal places (approximately 11.1 meters), and Google Maps uses seven decimal places, which represents an accuracy of approximately 11.1 millimeters.
For more information about this topic, see Precision and Address geocoding.
The common debate among data scientists is whether to run models, predictions, or
R/Python programs locally by using client tools such as Quantum Geographic Information
System (QGIS) or OpenJump against files, or use databases and specifically PostGIS
functions. Raster data is the complex data that provides the second or third dimension of
geospatial queries, often overlaying socioeconomic data, such as population densities and
weather patterns, on top of geographical maps. As models are developed, scientists use their
PCs and dedicated x86 server platforms to provide compute- and memory-intensive
processing, but they can operate only at small volumes and often run out of space or memory.
Therefore, the scientists must break the models into much smaller executable units and then
stitch the results back together. Using public clouds solves some of the compute and memory
challenges, but unpredictable processing models often result in unforeseen billing costs,
which make this experiment expensive.
As you can imagine, this geospatial processing involves large volumes of data and intensive
data processing that combines large mapping data sources with socioeconomic data. You
need a high-performing database that can process the complex analytical queries while
inferring the properties of several data types. FUJITSU Enterprise Postgres on
IBM LinuxONE combines those key properties to ensure a robust, secure, and
high-performing geospatial platform.
Some of these features are implemented as simple SQL functions; others might involve complex pre-built functions that are written in C, PL/pgSQL, or other languages.
In this section, we describe some of the most commonly used features and functions with
comments about the performance requirements where available. For more information about
these features and functions, see the PostGIS Reference guide.
The casts that are shown in Figure 8-3 transform data types from one format to another. Here are the associated data type functions, which are also known as geometry constructors, accessors, and editors.
AddGeometryColumn: Adds a geometry column to an existing table.
DropGeometryColumn: Removes a geometry column from a spatial table.
DropGeometryTable: Drops a table and all its references into geometry_columns.
Find_SRID: Returns the SRID that is defined for a geometry column.
Populate_Geometry_Columns: Ensures that geometry columns are defined with type
modifiers or have the appropriate spatial constraints.
UpdateGeometrySRID: Updates the SRID of all features in a geometry column, and the
table metadata.
Constructors
Here are some of the constructor functions that are used to create geometries:
ST_Point: Creates a point with the provided coordinate values. Alias for ST_MakePoint.
ST_PointZ: Creates a point with the provided coordinate (X, Y, Z) and SRID values.
ST_PointM: Creates a point with the provided coordinate (X, Y, M) and SRID values.
ST_PointZM: Creates a point with the provided coordinate (X, Y, Z, M) and SRID values.
ST_Polygon: Creates a Polygon from a linestring with a specified SRID.
ST_TileEnvelope: Creates a rectangular polygon in Web Mercator (SRID:3857) by using
the XYZ tile system.
ST_HexagonGrid: Returns a set of hexagons and cell indexes that cover the bounds of the
geometry argument.
ST_Hexagon: Returns a single hexagon that uses the provided edge size and cell
coordinate within the hexagon grid space.
ST_SquareGrid: Returns a set of grid squares and cell indexes that cover the bounds of
the geometry argument.
ST_Square: Returns a single square that uses the provided edge size and cell coordinate
within the square grid space.
Accessors
Here are some of the accessor functions (a short SQL sketch follows this list):
ST_Area: Returns the area of the surface if it is a polygon or multi-polygon. For a
“geometry” type, the area is in SRID units. For a “geography” type, the area is in square
meters.
ST_Boundary: Returns the boundary of a geometry.
ST_Distance: For a geometry type, returns the 2-dimensional Cartesian minimum distance (based on the spatial reference) between two geometries in projected units. For a geography type, defaults to returning the spheroidal minimum distance between two geographies in meters.
ST_Intersection: (T) Returns a geometry that represents the shared portion of geomA and geomB. The geography implementation transforms to geometry to compute the intersection and then transforms back to WGS84.
ST_Intersects: Returns TRUE if the geometries or geographies spatially intersect in 2D (share any portion of space) and returns FALSE if they do not (they are disjoint). For geography, the tolerance is 0.00001 meters, so any points that are that close together are considered to intersect.
ST_Length: Returns the 2D length of the geometry if it is a linestring or multilinestring.
Geometry is in units of spatial reference, and geography is in meters (default spheroid).
ST_Perimeter: Returns the length measurement of the boundary of an ST_Surface or
ST_MultiSurface for geometry or geography (polygon or multipolygon). The geometry
measurement is in units of spatial reference, and geography is in meters.
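The following sketch illustrates the geometry versus geography distinction with ST_Distance; the two points are illustrative coordinates in Central London:
-- Distance between the same two points: degrees for geometry, meters for geography.
SELECT ST_Distance('SRID=4326;POINT(-0.07702 51.508530)'::geometry,
                   'SRID=4326;POINT(-0.11982 51.503324)'::geometry) AS distance_in_degrees,
       ST_Distance('SRID=4326;POINT(-0.07702 51.508530)'::geography,
                   'SRID=4326;POINT(-0.11982 51.503324)'::geography) AS distance_in_meters;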
Spatial relationships
Spatial functions model data into objects that can be represented graphically. When
combined with geometry or geography data types, they model information onto maps, such
as population densities or number of fast-food restaurants in an area.
Topological relationships
The PostGIS topology types and functions are used to manage topological objects such as
faces, edges, and nodes. Among these types and functions are the following ones:
ST_3DIntersects: Returns true if two geometries spatially intersect in 3D. Only for points,
linestrings, polygons, and polyhedral surfaces (area).
ST_Contains: Returns true if no points of B lie in the exterior of A, and A and B have at
least one interior point in common.
ST_ContainsProperly: Returns true if B intersects the interior of A but not the boundary or
exterior.
ST_CoveredBy: Returns true if no point in A is outside B.
ST_Covers: Returns true if no point in B is outside A.
ST_Crosses: Returns true if two geometries have some, but not all, interior points in
common.
ST_LineCrossingDirection: Returns a number indicating the crossing behavior of two
linestrings.
ST_Disjoint: Returns true if two geometries do not intersect (they have no point in
common).
ST_Equals: Returns true if two geometries include the same set of points.
ST_Intersects: Returns true if two geometries intersect (they have at least one point in
common).
ST_OrderingEquals: Returns true if two geometries represent the same geometry and
have points in the same directional order.
ST_Overlaps: Returns true if two geometries intersect and have the same dimension but
are not contained by each other.
ST_Relate: Tests whether two geometries have a topological relationship matching an
Intersection Matrix pattern or computes their Intersection Matrix.
ST_RelateMatch: Tests whether a DE-9IM Intersection Matrix matches an Intersection
Matrix pattern.
ST_Touches: Returns true if two geometries have at least one point in common, but their
interiors do not intersect.
ST_Within: Returns true if no points of A lie in the exterior of B, and A and B have at least
one interior point in common.
Distance relationships
The PostGIS distance types and functions provide spatial distance relationships between
geometries:
ST_3DDWithin: Returns true if two 3D geometries are within a certain 3D distance.
ST_3DDFullyWithin: Returns true if two 3D geometries are entirely within a certain 3D
distance.
ST_DFullyWithin: Returns true if two geometries are entirely within a certain distance.
ST_PointInsideCircle: Tests whether a point geometry is inside a circle that is defined by a
center and radius.
ST_DWithin: Returns true if the geometries are within a given distance.
Example 8-1 shows an example of using the function ST_DWithin, and Figure 8-4 provides the results of the example. This example answers the question, “How many fast-food restaurants are within 1 mile of a US highway?”1 A minimal sketch of such a query follows this list.
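Example 8-1 itself is not reproduced here. The following is a minimal sketch under the assumption of hypothetical fastfood and highways tables with geography columns named geog; 1 mile is approximately 1609 meters:
-- Count fast-food restaurants that lie within 1 mile of any US highway.
SELECT count(DISTINCT f.id)
FROM   fastfood f
JOIN   highways h
  ON   ST_DWithin(f.geog, h.geog, 1609);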
Measurement functions
The following functions compute measurements of distance, area, and angles. There are also
functions to compute geometry values that are determined by measurements.
ST_Area: Returns the area of polygonal geometry.
ST_BuildArea: Creates a polygonal geometry that is formed by the linework of a geometry.
ST_Azimuth: Returns the north-based azimuth as the angle in radians as measured
clockwise from the vertical on pointA to pointB.
ST_Angle: Returns the angle between three points, or between two vectors (four points or
two lines).
ST_ClosestPoint: Returns the 2D point on g1 that is closest to g2. That point is the first
point of the shortest line.
ST_3DClosestPoint: Returns the 3D point on g1 that is closest to g2. That point is the first
point of the 3D shortest line.
ST_Distance: Returns the distance between two geometry or geography values.
1
Sources: http://www.fastfoodmaps.com; US highways maps, found at https://gisgeography.com/us-road-map/;
PostGIS In Action, found at https://www.manning.com/books/postgis-in-action-third-edition.
As an example, suppose that you want to know the population density of an area outside of the town center (in effect, a ring road) where you are interested in locating your retail business. The function that is shown in Example 8-2 creates an areal geometry that is formed by the constituent linework of a geometry. The return type can be a polygon or multi-polygon, depending on the input.
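Example 8-2 itself is not reproduced here. A minimal sketch of the idea, assuming a hypothetical ring_road table whose linework closes into a ring, is the following query:
-- Build an areal geometry from the collected linework of the ring road.
SELECT ST_BuildArea(ST_Collect(geom)) AS ring_area
FROM   ring_road;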
Now, consider the effect of applying that query to a population density raster data set and
identifying the number of inhabitants (and potential clients) that are based within an area.
That area might vary in a sparsely populated terrain versus a large city. The data results and
data processing requirements would vary considerably.
Raster functions
Raster data provides a representation of the world as a surface that is divided into a regular grid of cells2, where each cell has an associated value. When transferred into a GIS setting, the cells in a raster grid can represent other data values, such as temperature, rainfall, or elevation. Each raster has one or more tiles, each having a set of pixel values that are grouped into chunks. Rasters can be georeferenced. There are a number of constructors, editors, accessors, and management functions that are related to raster data sets, but in this section we describe the processing functions that transpose socioeconomic data onto geospatial maps (a short SQL sketch follows the function list).
Figure 8-6 on page 311 provides an example of vector and raster grid models.
2 https://spatialvision.com.au/blog-raster-and-vector-data-in-gis/
Box3D: Returns the box 3D representation of the enclosing box of the raster.
ST_Clip: Returns the raster that is clipped by the input geometry. If no band is specified,
all bands are returned. If crop is not specified, true is assumed, which means that the
output raster is cropped.
ST_ConvexHull: Returns the convex hull geometry of the raster, including pixel values that
are equal to BandNoDataValue. For regular-shaped and non-skewed rasters, it provides
the same result as ST_Envelope, so it is useful only for irregularly shaped or skewed
rasters.
ST_DumpAsPolygons: Returns a set of geometry value (geomval) rows from a raster
band. If no band number is specified, the band number defaults to 1.
ST_Envelope: Returns the polygon representation of the extent of the raster.
ST_HillShade: Returns the hypothetical illumination of an elevation raster band by using
the provided azimuth, altitude, brightness, and elevation scale inputs. Useful for visualizing
terrain.
ST_Aspect: Returns the surface aspect of an elevation raster band. Useful for analyzing
terrain.
ST_Slope: Returns the surface slope of an elevation raster band. Useful for analyzing
terrain.
ST_Intersection: Returns a raster or a set of geometry-pixel value pairs representing the
shared portion of two rasters or the geometrical intersection of a vectorization of the raster
and a geometry.
ST_Polygon: Returns a polygon geometry that is formed by the union of pixels that have a
pixel value that does not have a data value. If no band number is specified, the band
number defaults to 1.
ST_Reclass: Creates a raster that is composed of band types that are reclassified from the original. The nband parameter is the band to be changed. If nband is not specified, it is assumed to be 1. All other bands are returned unchanged. For example, you can convert a 16BUI band to an 8BUI band for simpler rendering as viewable formats.
ST_Union: Returns the union of a set of raster tiles into a single raster that is composed of one band. If no band is specified for the union, band number 1 is assumed. The resulting raster's extent is the extent of the whole set. Where tiles intersect, the resulting value is defined by p_expression, which is one of the following values: LAST (the default when none is specified), MEAN, SUM, FIRST, MAX, or MIN.
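The following sketch shows one way to combine these functions; the pop_density raster table and the areas polygon table are hypothetical, and the sketch assumes that each raster cell stores a population count:
-- Clip the population raster to each area and sum the cell values that remain.
SELECT a.name,
       sum(gv.val) AS estimated_population
FROM   areas a
JOIN   pop_density r
  ON   ST_Intersects(r.rast, a.geom)
CROSS  JOIN LATERAL ST_DumpAsPolygons(ST_Clip(r.rast, a.geom)) AS gv
GROUP  BY a.name;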
Summary
Many of the sample functions that we have shown here are complex in nature, and they cast
or explicitly call C programs to read, compare, transform, and process several geospatial data
sources. They are intended to complement or even replace the traditional data science
approach of using statistical analysis in classifying data, identifying similarities, and predicting
trends. Using FUJITSU Enterprise Postgres to support PostGIS functions on IBM LinuxONE
provides a high-performance platform that is scalable and has advanced security.
One such service provider in the UK, Space Syntax Ltd, developed a rich PostgreSQL-based platform with extensive socioeconomic data running on IBM LinuxONE. This platform contains Ordnance Survey data, a rich and detailed mapping source for the UK, and non-UK mapping data sources to support projects around the world. This platform is enriched with socioeconomic data, which includes population information such as demographics, travel, education, welfare, healthcare, and financial-related data.
Figure 8-8 shows a cycle network analysis that was done by Space Syntax for the
Department for Transport.
Figure 8-8 Space Syntax: Department for Transport cycle network analysis
Using a combination of queries that use PostGIS spatial functions, Figure 8-9 shows how government-defined cycle network treatments are assigned to specific parts of the street network based on how likely each street segment is to be used for journeys by bike, the mix of adjacent land uses, and the volume of pedestrian traffic.
Figure 8-9 Space Syntax: Cycle network that is categorized according to character of movement
The queries that are shown in Figure 8-10 on page 315 are spatial queries that run by using
multiple data sources that are available exclusively for approved partners by the UK
Geospatial Commission. These sources include active travel routes and transport data, social
data that is based on census polls, and Ordnance Survey Mastermap data. The Mastermap
includes the core location identifiers (Unique Property Reference Numbers (UPRNs), Unique
Street Reference Numbers (USRNs), and the Topographic Object Identifier (TOID)) that
provide a golden thread to link a wide range of data sets together to provide insights that
otherwise are not possible.
The large tables contain the core road, pavement, cycle pathways, and commercial buildings
data, which contains over 100 million combined records. To determine the quickest routes
between point A -> B in a certain period, geospatial reference queries and functions were
developed. The location grid coordinates are provided, and the query plots the geometrical
positions and then calculates, based on all the available data, the potential distance by road,
cycle route, or pavement within the given period. Results vary from seconds to hours based
on the location, such as rural, semi-rural, town, or city, and the given time, such as 15 minutes
to several hours.
While implementing these tables, the key geometry fields are added to spatial indexes. With the spatiotemporal library, you can use functions to index points within a region, a region containing points, and points within a radius to enable fast queries on this data during location analysis. As the SQL in Example 8-4 on page 316 shows, the sample query uses an index scan within the ST_Intersects PostGIS function to improve the comparison between two geom columns. This spatial index was created with the following statement:
CREATE INDEX sidx_gb_rtl_shp_geom ON uk_landuse.gb_rtl_shp USING gist (geom)
Example 8-4 shows a portion of the code that is required to interpret this geospatial data (the
remainder is commercially sensitive, so it is not shown here).
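Example 8-4 itself is not reproduced here. The following sketch shows the general shape of a join that allows the query planner to use the sidx_gb_rtl_shp_geom index; the street_segments table and its columns are illustrative assumptions:
SELECT r.*
FROM   uk_landuse.gb_rtl_shp r
JOIN   street_segments s
  ON   ST_Intersects(r.geom, s.geom)   -- candidate rows are found through the GiST index
WHERE  s.segment_id = 12345;           -- illustrative predicate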
The results were captured as data sets from which individual locations were transposed into
graphical representations (Figure 8-12).
Summary
In summary, this analysis is complex: It intersects multiple data sources and places them into
a positional framework, breaks down each geometric location into tiles and chunks, and then
applies the socioeconomic data patterns.
FUJITSU Enterprise Postgres with the PostGIS extension running on IBM LinuxONE provides
the ideal environment to run this code against the geospatial data to ensure strong
performance, resilience, data integrity, and security.
What if it were possible to resolve all these business issues by using today's mainframe technology?
This section describes how IBM LinuxONE, when combined with IBM Storage, provides better availability, performance, and security than competing platforms by using a sample anonymized client environment. With the potential to reduce power requirements by approximately 70% while decreasing the footprint that is required to run equivalent services on x86 servers by up to 50%, as shown in a study that is associated with this type of consolidation effort, it is easy to see the start of your potential savings, along with the reduced carbon footprint that this solution provides3.
Figure 8-13 shows the sample comparison that is used for this chapter.
3
The IBM Economics Consulting and Research team provided slightly better numbers for this specific consolidation
effort. Refer to the link to find your savings based on your quantifiable metrics. You can save costs and gain other
benefits by consolidating servers on IBM LinuxONE systems.
Figure 8-15 shows how the environment was expanded to three active data centers for our
use: Poughkeepsie, New York, US (POK), Montpellier, France (MOP), and the Washington
System Center (WSC) in Herndon, Virginia, US.
The three MongoDB database instances were deployed with IBM Cloud Infrastructure Center
(IBM CIC) at each data center while keeping all the nodes in a cluster geographically
dispersed. Connectivity was possible by using multiple virtual private networks (VPNs) for
increased visibility and availability into the collective systems. A mix of IBM FlashSystem® 9100 and IBM FlashSystem 9200 devices was used for storage, with all of them configured to levels that included IBM Safeguarded Copy (IBM SGC) V8.4.0 for the controllers. The IBM SGC volumes are managed and accessed with IBM Copy Services Manager (CSM) for recovery, but creation was handled by automated policies on the storage devices.
IBM CIC does not contain a native mechanism for immutable snapshots because that
capability is provided by the IBM FlashSystem 9x controller and Copy Services Manager.
Figure 8-16 shows a sample GUI that you can use to complete the following tasks:
Select the size of the MongoDB database that you are creating based on standard T-Shirt
sizes.
Select a checkbox to add Appendix J (App J) to the MongoDB databases that you are
creating that require it.
Now, when a new MongoDB cluster is created, the App J Compliant checkbox can be selected, and IBM SGC copies start based on the policies that are defined for them. App J is open to interpretation, and the definition of being FFIEC App J compliant changes from organization to organization. Work with your organization’s compliance group to determine what their technical and process checklist entails.
Figure 8-17 provides a quick outline of the two main types of attacks against data today,
along with the recovery efforts that are needed. Although it is possible to use an IBM SGC
copy of the data for normal database restoration or recovery, these efforts are part of
normal backup and restore practices so that the application and Linux teams can recover
from general inconsistencies that are discovered during normal operations.
The following automation scripts are available online from MongoDB to help with recovery:
Deploy Automatically with GitHub
Backup and Restore with Filesystem Snapshots
The Configure LVM logical volumes Ansible playbook was used to create the Logical Volume
Management (LVM) and snapshot area.
In the sample environment, we did not include automation to determine which copy was
tainted because that task was beyond the scope of the project. It is not a hard prerequisite for
App J compliance. In typical real-world scenarios, an organization has several options for this
task:
Validation before snapshot creation.
Asynchronous validation.
Validation on detection of a cyberattack in another workload (post-mortem).
IBM has solutions for all these options, but they are beyond the scope of this book.
Figure 8-18 on page 322 shows a diagram of IBM Copy Services Manager with a 5-minute frequency and 1-day retention. A 5-minute frequency was used for demonstration purposes because waiting 30 minutes or multiple hours is not practical for showcasing this technology.
Figure 8-18 IBM Copy Services Manager with 5-minute frequency and 1-day retention
In addition, there are several supporting playbooks to perform tasks such as:
1. Quiescing and resuming the database so that backups may be taken.
2. Terminating the replica set gracefully.
In general, the process of enabling IBM SGC copies for a general environment is as follows (a brief CLI sketch follows the list):
1. Gather a list of volumes and respective storage devices.
2. Ensure that the storage devices have the Safeguarded Policy available.
3. Ensure that there is sufficient space for Safeguarded Pools.
4. Create a volume group (with Safeguarded Policy attached).
5. Attach volumes to a volume group.
6. (optional) Validate enablement in CSM (in about 2 - 3 minutes or by logging in to the
storage device and inspecting the Safeguarded pool).
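The following is a sketch of what steps 4 and 5 might look like on the storage device CLI. The command names match the descriptions later in this section, but the flags, the policy name, and the object names are assumptions that depend on your code level and configuration:
# Sketch only: verify flags and policy names against your Spectrum Virtualize level.
mkvolumegroup -name mongo_rs0_vg
chvolumegroup -safeguardedpolicy predefinedsgpolicy0 mongo_rs0_vg
chvdisk -volumegroup mongo_rs0_vg volume-MONGO-RS0-1-data-2a40014a-f233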
The process is controlled by a playbook with a parameter file that defines locations,
credentials, and other important information.
Playbook: The playbook performs the following tasks:
a. Updates the host’s address from the deployment data.
b. Adds the name servers as specified in the parameter file.
c. Refreshes the Red Hat Enterprise Linux subscription manager data.
d. Updates the system and installs some extra packages.
e. Cleans the YUM cache.
f. Captures the VM as an image.
Parameter file: This JSON file contains parameters that are used by the playbook to
construct a base image.
Base inventory file: This trivial file is used by the playbook so that you can create an IP
address for the instantiated VM:
base_image
These tasks are performed by using Ansible playbooks. In addition, there are playbooks for
creating shadow VMs, which can be used for recovery. The process is almost identical to the
steps that are outlined in this section. The differences are described in “Shadow instance deployment” on page 336.
The Ansible configuration consists of a playbook and a set of tasks that is repeated for each
of the VMs that are being provisioned.
Shadow overview
When recovering a replica, here are the three options:
1. Use the existing VMs and replace the Mongo data volume.
2. Create VMs that mimic the size of the original.
3. Create shadows of the original (asynchronously during deployment).
Table 8-1 provides a list of the pros and cons of each option.
Fresh VM deployment: Pro: a clean state that is clear of malware. Con: time-consuming (< 5 minutes), but reasonable given longer RTO windows.
The mongo-operations playbooks were created to facilitate the deployment and operation of
Mongo instances.
Required files
There are two base files that are used to define and control the Ansible environment:
1. ansible.cfg
2. hosts
The hosts file is created by running a script that is invoked by the playbook, which passes the
IP names and addresses of the nodes.
Ansible configuration
Certain settings in Ansible are adjustable through a configuration file. By default, this file is
/etc/ansible/ansible.cfg, and for this project, a simple file is in a local directory, which is
shown in Example 8-5.
The important entry here is inventory, which tells Ansible where to find a definition of all the
endpoints to be made known to Ansible.
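Example 8-5 is not reproduced here. A minimal sketch of such a file, assuming that the inventory file is named after a replica set called MONGO-RS0 and sits in the same directory, might look like the following lines; the host_key_checking entry is an optional convenience and an assumption on our part:
[defaults]
inventory = ./MONGO-RS0
host_key_checking = False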
Hosts
Hosts can be defined in Ansible in many different ways. In our simple implementation, we use a flat file in the .ini format that is created by a script that is invoked from a playbook. An example is shown in “Hosts file” on page 338. The file is named after the replica set. This file is used to create shadows, terminate the replica set, and destroy the replica set.
Deployment
Deployment consists of running a deployment playbook against the hosts that are defined in
the hosts file.
Playbooks
The deployment playbook takes a base RHEL 7 system and performs the following tasks:
1. Install software, which includes MongoDB and supporting software.
2. Install configuration files.
3. Define an admin user for Mongo.
4. Enable operation in an SELinux enforcing environment.
Tasks
The deployment playbook (“MongoDB deployment overview” on page 324) prepares the environment for a working MongoDB installation.
Supporting files
The following files are used to support the deployment process:
SELinux files
Two policies must be created and installed:
a. Enable Full Time Diagnostic Data Capture (FTDC).
b. Allow access to /sys/fs/cgroup.
Proc policy
The current SELinux policy does not allow the MongoDB process to open and read
/proc/net/netstat, which is required for FTDC.
To create the policy that is used by the playbook, run the script that is shown in
Example 8-6 to create mongodb_proc_net.te.
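Example 8-6 is not reproduced here. A minimal sketch of the type enforcement module that such a script might generate (for example, with audit2allow) follows; the exact rules on your system can differ:
module mongodb_proc_net 1.0;

require {
    type mongod_t;
    type proc_net_t;
    class file { open read getattr };
}

# Allow the MongoDB daemon to read /proc/net/netstat for FTDC.
allow mongod_t proc_net_t:file { open read getattr };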
– admin.js
The JavaScript file that is shown in Example 8-9 adds the user admin to the admin
database. The admin password is defined in this script and should be changed to meet
your requirements.
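Example 8-9 is not reproduced here. A minimal sketch of such a script, with a placeholder password and an assumed root role, is the following JavaScript:
// admin.js (sketch): create the admin user in the admin database.
db = db.getSiblingDB("admin");
db.createUser({
    user: "admin",
    pwd: "CHANGE_ME",                        // replace with your own password
    roles: [ { role: "root", db: "admin" } ]
});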
Other playbooks
Here are some other relevant playbooks:
Quiesce
The quiesce playbook (“Quiescing” on page 346) locks the database to prevent it from
being updated. You are prompted for the admin password.
Resume
The resume playbook (“Resuming” on page 346) unlocks the database to allow updates
to proceed. You are prompted for the admin password.
Shutdown
The shutdown playbook (8.2.5, “Terminating” on page 347) uses systemd to gracefully shut
down a replica set.
Data volume
Ansible uses the OpenStack module to delete the MongoDB data volume of each provisioned
VM.
8.2.3 Playbooks
This section provides an annotated description of the playbooks that are used to create
images and deploy MongoDB replica sets.
Base deployment
This playbook creates a base image for use by other deployment processes. Example 8-10
shows our sample playbook. The line numbers are for this and the following examples only so
that you can easily reference the explanations that follow.
[038] file:
[039] path: /etc/resolv.conf
[040] state: absent
[041]
[042] - name: Create new /etc/resolv.conf
[043] file:
[044] path: /etc/resolv.conf
[045] state: touch
[046] owner: root
[047] group: root
[048] mode: 0644
[049]
[050] - name: Add content to /etc/resolv.conf
[051] blockinfile:
[052] path: /etc/resolv.conf
[053] block: |
[054] {% for dns in cic_dns %}
[055] nameserver {{ dns }}
[056] {% endfor %}
[057]
[058] - name: Clean Subscription Manager
[059] command: subscription-manager clean
[060]
[061] - name: Remove Old Subscription Manager
[062] yum:
[063] name: katello-ca-*
[064] state: absent
[065] update_cache: yes
[066]
[067] - name: Add New Subscription Manager
[068] yum:
[069] name: "{{ sub_rpm }}"
[070] state: present
[071] update_cache: yes
[072]
[073] - name: Register and auto-subscribe
[074] community.general.redhat_subscription:
[075] state: present
[076] org_id: "{{ sub_org }}"
[077] activationkey: "{{ sub_key }}"
[078] ignore_errors: yes
[079]
[080] - name: Upgrade System
[081] yum:
[082] name=*
[083] state=latest
[084]
[085] - name: Install Tools
[086] yum:
[087] name: "{{ item }}"
[088] state: present
[089] with_items:
[090] - vim
[091] - yum-utils
[092] - net-tools
[093] - lvm2
[094]
[095] - name: Update .bashrc
[096] become: true
[097] blockinfile:
[098] path: .bashrc
[099] block: |
[100] alias vi=vim
[101]
[102] - name: Clean yum cache
[103] command: yum clean all
[104]
[105] - name: Cleanup
[106] file:
[107] path: "{{ item }}"
[108] state: absent
[109] with_items:
[110] - /var/cache/yum/s390x/7Server
[111]
[112] - name: Create Image Snapshot
[113] local_action:
[114] module: command
[115] cmd: ./capture.py -v "{{ this_file }}" -i "{{ deployed_vm.openstack.id
}}"
[116] register: result
Here are the line numbers from Example 8-10 on page 328 and their descriptions:
[007] - [011]: Check whether the image that you are building exists.
[013] - [016]: If the image exists, then use it or use the one from the parameter file.
[018] - [031]: From the localhost, deploy a starting image.
[033] - [035]: Update the host's address from the deployment data. All references to 'base'
now resolve to this address.
[037] - [040]: Erase the existing /etc/resolv.conf file.
[042] - [048]: Create an empty /etc/resolv.conf.
[050] - [056]: Add the name servers as specified in the parameter file.
[058] - [059]: Clean any subscription manager configuration.
[061] - [065]: Remove the old subscription manager package (katello-ca-*).
[067] - [071]: Install the subscription manager parameter package as specified in the
parameter file.
[073] - [078]: Activate the subscription.
[080] - [083]: Update the system.
[085] - [093]: Install some extra packages.
[095] - [100]: Update .bashrc to include an alias for vim.
[102] - [110]: Clean the YUM cache.
[112] - [116]: Capture the VM as an image.
Parameters
The JSON file that is shown in Example 8-11 contains parameters that are used by the playbook to construct a base image. The line numbers are included for reference in this example only and should not be included in your parameter file. A placeholder sketch of such a file follows the parameter descriptions.
Here are the line numbers from Example 8-11 and their descriptions:
[001] - cic_base_name: The name of the image to be produced.
[002] - cic_url: The URL of the IBM CIC management node.
[003] - cic_user: The user ID that is used for connecting to the IBM CIC host.
[004] - cic_password: The password that is associated with cic_user.
[005] - cic_project: The project under which the IBM CIC work is performed.
[006] - cic_cert: The certificate that is used if IBM CIC uses a self-signed certificate.
[007] - cic_flavor: The flavor of the deployed VM.
[008] - cic_rhel_image: The image on which ours will be based.
[009] - cic_vlan: The LAN to be associated with the deployed VM.
[010] - cic_key_name: The name of the key to be placed in /root/.ssh/authorized_keys. It
must be uploaded before deployment.
[011] - cic_availability_zone: A way to create logical groupings of hosts.
[012] - cic_host: The name of the node on which to create this VM.
[013] - cic_dns: A list of DNS addresses to be placed into /etc/resolv.conf.
[014] - sub_org: The organization to use with subscription manager registration.
[015] - sub_key: The subscription manager activation key.
[016] - sub_rpm: The URL of the subscription manager satellite RPM.
[017]: The parameter file name that is used by the playbook to invoke the capture process.
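Example 8-11 is not reproduced here. The following sketch shows the shape of such a parameter file; every value is a placeholder, and the final this_file entry is an assumption based on the capture step in the playbook:
{
  "cic_base_name": "rhel7-mongo-base",
  "cic_url": "https://cic.example.com:5000/v3",
  "cic_user": "cicadmin",
  "cic_password": "********",
  "cic_project": "mongo-project",
  "cic_cert": "cic-ca.pem",
  "cic_flavor": "medium",
  "cic_rhel_image": "rhel-7.9-s390x",
  "cic_vlan": "VLAN100",
  "cic_key_name": "deploy-key",
  "cic_availability_zone": "Default Group",
  "cic_host": "host1",
  "cic_dns": ["9.0.0.1", "9.0.0.2"],
  "sub_org": "my-org",
  "sub_key": "my-activation-key",
  "sub_rpm": "http://satellite.example.com/pub/katello-ca-consumer-latest.noarch.rpm",
  "this_file": "extra_args.json"
}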
Example 8-12 shows the deploy-hosts.yml playbook, which controls the provisioning
process. The line numbers in this example are for reference only for this book and would not
appear in the playbook.
Here are the line numbers from Example 8-12 and their descriptions:
[002 - 008]: The playbook runs on the localhost and captures data in variables.
[019 - 025]: The playbook invokes the deploy-image.yml tasks for each node in the replica
set.
[027 - 030]: The playbook prepares the parameters to be passed to the prepareMongo
script.
[032 - 033]: The playbook invokes the prepareMongo script, which creates supporting files
for the deploy-mongo.yml playbook.
The deploy-image.yml file that is called in Example 8-13 contains the tasks that are required
to provision a single VM and data volume.
Here are the line numbers from Example 8-13 on page 333 and their descriptions:
[001 - 013]: Deploy a VM by using the parameters that are passed in extra_args.json.
[015 - 024]: Create a data volume for Mongo.
[026 - 031]: Attach the volume to the VM.
[033 - 037]: Query the volume that was created in lines [015 - 024].
[039 - 041]: Form the volume ID that the IBM SAN Volume Controller uses from the
volume information.
[043 - 045]: Define the volume name based on the volume data that you extracted.
[047 - 059]: Send the mkvolumegroup command to the SAN Volume Controller and ignore
any errors (volumegroup might already be defined).
[051 - 053]: Send the chvolumegroup command to the SAN Volume Controller to associate
the policy with volumegroup.
[055 - 057]: Send the chvdisk command to the SAN Volume Controller to include it in the
volumegroup.
[059 - 061]: Set the facts that are used by the prepareMongo script.
The file that is shown in Example 8-14 provides the variables that are required by the
deploy-hosts.yml playbook and deploy-image.yml tasks.
Here are the line numbers from Example 8-14 on page 334 and their descriptions:
[001]: Name of the nodes and hostnames.
[002]: Flavor of image that will be deployed.
[003]: Name of the image that will be deployed.
[004]: Name of the LAN that will be associated with the nodes.
[005]: Storage pool for data volumes.
[006]: Key to use to authenticate SSH connections.
[007]: Zone that will be used for deployment.
[008]: Host on which nodes are run.
[009]: Instance numbers that will be used in hostnames.
[010]: IP name or address of the SAN Volume Controller.
[011]: SAN Volume Controller username, and the public key that was specified when the user was created.
[012]: Name of the policy to apply to volumegroup.
[013]: Name of volumegroup.
[014]: Port to SSH for SAN Volume Controller.
OpenStack support
The file that is shown in Example 8-15 is used by the OpenStack module to authenticate
against the IBM CIC host.
Here are the line numbers from Example 8-15 on page 335 and their descriptions:
[004]: URL of the IBM CIC host.
[005]: Username to authenticate.
[006]: Password that will be used in authentication.
[007]: Project that will be accessed.
[008]: Certificate that is used in connection authentication.
Hosts file
The file that is shown in Example 8-18 is created by the deployment playbooks and contains
all the information that is needed for a successful deployment.
[mongo_nodes]
MONGO-RS0-1 ansible_ssh_host="[ip-1]" shadow="0" vmName="cic003c1" wwn="[wwn]"
volId="2a40014a-f233-46a7-9dfa-07370bb604fe"
volName="volume-MONGO-RS0-1-data-2a40014a-f233" ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
MONGO-RS0-2 ansible_ssh_host="[ip-2]" shadow="0" vmName="cic003c2" wwn="[wwn]"
volId="01d40a84-1409-47e3-be49-429104d8c8fa"
volName="volume-MONGO-RS0-2-data-01d40a84-1409" ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
MONGO-RS0-3 ansible_ssh_host="[ip-3]" shadow="0" vmName="cic003c3" wwn="[wwn]"
volId="146d8c19-1073-41e6-91aa-937e26eb73d2"
volName="volume-MONGO-RS0-3-data-146d8c19-1073" ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
[mongo_master]
MONGO-RS0-1 ansible_ssh_host=[ip-1] ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
Under [mongo_nodes], add the hosts that will participate in the replica set. Under
[mongo_master], specify one of the nodes to become the master.
In Example 8-18 on page 338, our hosts file defines a category of hosts that is called
mongo_nodes that consists of three hosts that are defined by the IP addresses, for example,
129.40.186.[215,218,220]. These addresses identify servers that Ansible may manage. An
IP name also can be used. The second parameter defines which user is used when
contacting that host (any public keys for the Ansible control node must be present on that host). The third parameter defines which Python interpreter to use. The YUM tasks that are used by the deployment playbook require Python 2.7.
The mongo_master parameter defines one of the nodes as the master in the replica set.
Controller
The controller playbook (shown in Example 8-19) is a master playbook that is used to invoke the Tasks playbook (“Tasks” on page 339).
Here are the line numbers from Example 8-19 and their descriptions:
[002] - [003]: Act on those hosts in the inventory file within the 'mongo_nodes' group.
[004] - [005]: The Mongo deployment happens in the mongodb/tasks/main.yml file.
[006] - [008]: Set the node count based on the number of hosts that is defined. Do not
force termination of non-shadow hosts.
[010] - [018]: When Mongo is installed and working, terminate the shadow virtual
machines.
Tasks
The tasks file (mongodb/tasks/main.yml) contains the steps that configure each node of the replica set; the following excerpt shows a portion of those tasks.
[214] with_items:
[215] - { name: 'vm.dirty_ratio', value: '15', state: 'present' }
[216] - { name: 'vm.dirty_background_ratio', value: '5', state: 'present' }
[217] - { name: 'vm.swappiness', value: '10', state: 'present' }
[218] - { name: 'net.core.somaxconn', value: '4096', state: 'present' }
[219] - { name: 'net.ipv4.tcp_fin_timeout', value: '30', state: 'present' }
[220] - { name: 'net.ipv4.tcp_keepalive_intvl', value: '30', state: 'present' }
[221] - { name: 'net.ipv4.tcp_keepalive_time', value: '120', state: 'present' }
[222] - { name: 'net.ipv4.tcp_max_syn_backlog', value: '4096', state: 'present' }
[223]
[224] #
[225] # Enable SNMP so we can monitor Mongo
[226] #
[227] - name: Copy SNMP server configuration
[228] copy:
[229] src: snmpd.conf
[230] dest: /etc/snmp/snmpd.conf
[231] owner: root
[232] group: root
[233] mode: 0644
[234]
[235] - name: Copy SNMP trap configuration
[236] copy:
[237] src: snmptrapd.conf
[238] dest: /etc/snmp/snmptrapd.conf
[239] owner: root
[240] group: root
[241] mode: 0644
[242]
[243] - name: Enforce SELinux
[244] ansible.posix.selinux:
[245] policy: targeted
[246] state: enforcing
[247]
[248] - name: Ensure that services are enabled and running
[249] ansible.builtin.systemd:
[250] name: "{{ item.name }}"
[251] enabled: "{{ item.enabled }}"
[252] state: "{{ item.state }}"
[253] with_items:
[254] - { name: 'snmpd', enabled: 'yes', state: 'started' }
[255] - { name: 'mongod', enabled: 'yes', state: 'started' }
[256]
[257] #
[258] # If we have >= 3 nodes, then we define a replica set
[259] #
[260] - name: Enable replica set operation [1] - Copy script
[261] copy:
[262] src: rs.js
[263] dest: /tmp
[264] owner: root
[265] group: root
[266] mode: 0600
[267] when: inventory_hostname in groups['mongo_master'] and nodeCount >= "3"
[268]
[269] - name: Enable replica set operation [2] - Run shell
[270] command: mongo /tmp/rs.js
[271] when: inventory_hostname in groups['mongo_master'] and nodeCount >= "3"
[272]
[273] #
[274] # Define the admin user
[275] #
[276] - name: Add admin user to Mongo [1] - Copy script
[277] copy:
[278] src: adminuser.js
[279] dest: /tmp
[280] owner: root
[281] group: root
[282] mode: 0600
[283] when: inventory_hostname in groups['mongo_master']
[284]
[285] - name: Add admin user to Mongo [2] - Run shell to add user
[286] command: mongo /tmp/adminuser.js
[287] when: inventory_hostname in groups['mongo_master']
[288]
[289] #
[290] # Get rid of the ephemera
[291] #
[292] - name: Cleanup
[293] file:
[294] path: "{{ item }}"
[295] state: absent
[296] with_items:
[297] - /tmp/mongodb_cgroup_memory.pp
[298] - /tmp/mongodb_proc_net.pp
[299] - /tmp/rs.js
[300] - /tmp/adminuser.js
[301] - /tmp/findVol
Here are the line numbers from Example 8-20 on page 340 and their descriptions:
[003 - 010]: Add the MongoDB repository so that YUM can install it.
[012 - 021]: Install MongoDB and its supporting programs.
[023 - 036]: Load and run the find volume process.
[038 - 043]: Partition the data volume.
[045 - 049]: Create the physical volume (pv) and the volume group.
[051 - 056]: Create a 768 MB logical volume.
[058 - 062]: Make an XFS file system on the logical volume.
[064 - 071]: Create a mount point for the volume.
Quiescing
Use the mongo command to lock the database, as shown in Example 8-21.
Here are the line numbers from Example 8-21 and their descriptions:
[001]: This playbook will run against all nodes because it does not know which node is
primary.
[008]: Run db.fsyncLock() to lock the database.
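A minimal sketch of the underlying command that such a playbook can run (credentials are placeholders) is:
$ mongo admin -u admin -p 'your-password' --eval "db.fsyncLock()"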
Resuming
Use the mongo command to unlock the database, as shown in Example 8-22.
Here are the line numbers from Example 8-22 and their descriptions:
[001]: This playbook will run against all nodes because it does not know which node is the primary.
[009]: We run db.fsyncUnlock until the lockCount reaches 0.
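A minimal sketch of the command behind this step, assuming a MongoDB version whose fsyncUnlock returns the lockCount, is:
$ mongo admin -u admin -p 'your-password' --eval "while (db.fsyncUnlock().lockCount > 0) { }"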
8.2.5 Terminating
The termination of a replica set is performed by using a controller playbook (“Controller” on page 347) and an embedded task.
Controller
The controller playbook that is shown in Example 8-23 includes the terminate-host.yml task
that shuts down replica set VMs.
Here are the line numbers from Example 8-23 on page 347 and their descriptions:
[005] - [006]: Force the termination of the VMs.
[008] - [012]: Invoke the terminate virtual machine task for each member of the replica set.
Embedded task
The playbook that is shown in Example 8-24 shuts down a VM if it is a shadow MongoDB
server or if we force the shutdown of a real MongoDB server.
Here are the line numbers from Example 8-24 and their descriptions:
[002] - [004]: This playbook is also used when deploying shadows to shut them down if the
playbook is configured to do so.
[006] - [011]: Use the OpenStack API to terminate a VM.
Master playbook
The destroy-hosts.yml playbook, which is shown in Example 8-25, controls the
decommissioning process.
Here are the line numbers from Example 8-25 on page 348 and their descriptions:
[002 - 005]: The playbook runs on the localhost.
[007 - 011]: Invoke the destroy-image.yml tasks for each node in the replica set. Use the
data from the inventory file to guide the process.
Here are the line numbers from Example 8-26 and their descriptions:
[001 - 003]: Remove the safeguarded policy from the volume group.
[005 - 007]: Remove the data volume from the volume group.
[009 - 011]: Remove the volume group.
[013 - 020]: Destroy the instance.
[022 - 026]: Destroy the volume.
Here are the line numbers from Example 8-27 and their descriptions:
[001]: Name of the nodes and hostnames.
[002]: Flavor of image to be deployed.
[003]: Name of the image to be deployed.
[004]: Name of the LAN to be associated with the nodes.
[005]: Storage pool for data volumes.
[006]: Key that will be used for authenticating SSH connections.
[007]: Zone that will be used for deployment.
[008]: Host on which the nodes run.
[009]: Instance numbers that will be used in the hostnames.
[010]: IP name or address of the SAN Volume Controller.
[011]: The SAN Volume Controller username and the public key that was specified when
the user was created.
[012]: Name of policy to apply to volumegroup.
[013]: Name of the volume group.
[014]: Port to SSH for SAN Volume Controller.
What if it were possible to resolve all these business issues by using today's mainframe technology?
This chapter describes how IBM LinuxONE, when combined with IBM Storage, provides better availability, performance, and security than competing platforms by using a sample anonymized client environment. With the potential to reduce power requirements by approximately 70% while decreasing the footprint that is required to run equivalent services on x86 servers by up to 50%, as shown in a study that is associated with this type of consolidation effort, it is easy to see the start of your potential savings, along with the reduced carbon footprint that this solution provides1.
1
The IBM Economics Consulting and Research team provided slightly better numbers for this specific consolidation
effort. Refer to the link to find your savings based on your quantifiable metrics. You can save costs and gain other
benefits by consolidating servers on IBM LinuxONE systems.
Figure B-1 shows the sample comparison that is used for this chapter.
Figure B-3 shows how the environment was expanded to three active data centers for our
use: Poughkeepsie, New York, US (POK), Montpellier, France (MOP), and the Washington
System Center (WSC) in Herndon, Virginia, US.
The three MongoDB database instances were deployed with IBM Cloud Infrastructure Center
(IBM CIC) at each data center while keeping all the nodes in a cluster geographically
dispersed. Connectivity was possible by using multiple virtual private networks (VPNs) for
increased visibility and availability into the collective systems. A mix of IBM FlashSystem
9100 and IBM FlashSystem 9200 were used for storage, with all of them configured to levels
that included IBM Safeguarded Copy (IBM SGC) V8.4.0 for the controllers. The IBM SGC
volumes are managed and accessed with CSM for recovery, but creation was handled by
automated policies on the storage devices.
IBM CIC does not contain a native mechanism for immutable snapshots because that
capability is provided by the IBM FlashSystem 9x controller and Copy Services Manager.
Figure B-4 shows a sample GUI that you can use to complete the following tasks:
Select the size of the MongoDB database that you are creating based on standard T-Shirt
sizes.
Select a checkbox to add Appendix J (App J) to the MongoDB databases that you are
creating that require it.
Now, when a new MongoDB cluster is created, the App J Compliant checkbox can be selected, and IBM SGC copies start based on the policies that are defined for them. App J is open to interpretation, and the definition of being FFIEC App J compliant changes from organization to organization. Work with your organization’s compliance group to determine what their technical and process checklist entails.
Figure B-5 provides a quick outline of the two main types of attacks against data today,
along with the recovery efforts that are needed. Although it is possible to use an IBM SGC
copy of the data for normal database restoration or recovery, these efforts are part of
normal backup and restore practices so that the application and Linux teams can recover
from general inconsistencies that are discovered during normal operations.
The following automation scripts are available online from MongoDB to help with recovery:
Deploy Automatically with GitHub
Backup and Restore with Filesystem Snapshots
The Configure LVM logical volumes Ansible playbook was used to create the Logical Volume
Management (LVM) and snapshot area.
In the sample environment, we did not include automation to determine which copy was
tainted because that task was beyond the scope of the project. It is not a hard prerequisite for
App J compliance. In typical real-world scenarios, an organization has several options for this
task:
Validation before snapshot creation.
Asynchronous validation.
Validation on detection of a cyberattack in another workload (post-mortem).
IBM has solutions for all these options, but they are beyond the scope of this book.
Figure B-6 on page 358 shows a diagram of IBM Copy Services Manager with a 5-minute frequency and 1-day retention. A 5-minute frequency was used for demonstration purposes because waiting 30 minutes or multiple hours is not practical for showcasing this technology.
Figure B-6 IBM Copy Services Manager with 5-minute frequency and 1-day retention
In addition, there are several supporting playbooks to perform tasks such as:
1. Quiescing and resuming the database so that backups may be taken.
2. Terminating the replica set gracefully.
In general, the process of enabling IBM SGC copies for a general environment is as follows:
1. Gather a list of volumes and respective storage devices.
2. Ensure that the storage devices have the Safeguarded Policy available.
3. Ensure that there is sufficient space for Safeguarded Pools.
4. Create a volume group (with Safeguarded Policy attached).
5. Attach volumes to a volume group.
6. (optional) Validate enablement in CSM (in about 2 - 3 minutes or by logging in to the
storage device and inspecting the Safeguarded pool).
The process is controlled by a playbook with a parameter file that defines locations,
credentials, and other important information.
Playbook: The playbook performs the following tasks:
a. Updates the host’s address from the deployment data.
b. Adds the name servers as specified in the parameter file.
c. Refreshes the Red Hat Enterprise Linux subscription manager data.
d. Updates the system and installs some extra packages.
e. Cleans the YUM cache.
f. Captures the VM as an image.
Parameter file: This JSON file contains parameters that are used by the playbook to
construct a base image.
Base inventory file: This trivial file is used by the playbook so that you can create an IP
address for the instantiated VM:
base_image
These tasks are performed by using Ansible playbooks. In addition, there are playbooks for
creating shadow VMs, which can be used for recovery. The process is almost identical to the
steps that are outlined in this section. The differences are described in “Shadow instance deployment” on page 372.
The Ansible configuration consists of a playbook and a set of tasks that is repeated for each
of the VMs that are being provisioned.
Shadow overview
When recovering a replica, here are the three options:
1. Use the existing VMs and replace the Mongo data volume.
2. Create VMs that mimic the size of the original.
3. Create shadows of the original (asynchronously during deployment).
Table B-1 provides a list of the pros and cons of each option.
Fresh VM deployment: Pro: a clean state that is clear of malware. Con: time-consuming (< 5 minutes), but reasonable given longer RTO windows.
The mongo-operations playbooks were created to facilitate the deployment and operation of
Mongo instances.
Required files
There are two base files that are used to define and control the Ansible environment:
1. ansible.cfg
2. hosts
The hosts file is created by running a script that is invoked by the playbook, which passes the
IP names and addresses of the nodes.
Ansible configuration
Certain settings in Ansible are adjustable through a configuration file. By default, this file is
/etc/ansible/ansible.cfg, and for this project, a simple file is in a local directory, which is
shown in Example B-1.
The important entry here is inventory, which tells Ansible where to find a definition of all the
endpoints to be made known to Ansible.
Hosts
Hosts can be defined in Ansible in many different ways. In our simple implementation, we use a flat file in the .ini format that is created by a script that is invoked from a playbook. An example is shown in “Hosts file” on page 374. The file is named after the replica set. This file is used to create shadows, terminate the replica set, and destroy the replica set.
Deployment
Deployment consists of running a deployment playbook against the hosts that are defined in
the hosts file.
Playbooks
The deployment playbook takes a base RHEL 7 system and performs the following tasks:
1. Install software, which includes MongoDB and supporting software.
2. Install configuration files.
3. Define an admin user for Mongo.
4. Enable operation in an SELinux enforcing environment.
Tasks
The deployment playbook (“MongoDB deployment overview” on page 360) prepares the environment for a working MongoDB installation.
Supporting files
The following files are used to support the deployment process:
SELinux files
Two policies must be created and installed:
a. Enable Full Time Diagnostic Data Capture (FTDC).
b. Allow access to /sys/fs/cgroup.
Proc policy
The current SELinux policy does not allow the MongoDB process to open and read
/proc/net/netstat, which is required for FTDC.
To create the policy that is used by the playbook, run the script that is shown in
Example B-2 to create mongodb_proc_net.te.
– admin.js
The JavaScript file that is shown in Example B-5 adds the user admin to the admin
database. The admin password is defined in this script and should be changed to meet
your requirements.
Other playbooks
Here are some other relevant playbooks:
Quiesce
The quiesce playbook (“Quiescing” on page 382) locks the database to prevent it from being updated. You are prompted for the admin password.
Resume
The resume playbook (“Resuming” on page 382) unlocks the database to allow updates to proceed. You are prompted for the admin password.
Shutdown
The shutdown playbook (“Terminating” on page 383) uses systemd to gracefully shut down a replica set.
Data volume
Ansible uses the OpenStack module to delete the MongoDB data volume of each provisioned
VM.
Playbooks
This section provides an annotated description of the playbooks that are used to create
images and deploy MongoDB replica sets.
Base deployment
This playbook creates a base image for use by other deployment processes. Example B-6
shows our sample playbook. The line numbers are for this and the following examples only so
that you can easily reference the explanations that follow.
[090] - vim
[091] - yum-utils
[092] - net-tools
[093] - lvm2
[094]
[095] - name: Update .bashrc
[096] become: true
[097] blockinfile:
[098] path: .bashrc
[099] block: |
[100] alias vi=vim
[101]
[102] - name: Clean yum cache
[103] command: yum clean all
[104]
[105] - name: Cleanup
[106] file:
[107] path: "{{ item }}"
[108] state: absent
[109] with_items:
[110] - /var/cache/yum/s390x/7Server
[111]
[112] - name: Create Image Snapshot
[113] local_action:
[114] module: command
[115] cmd: ./capture.py -v "{{ this_file }}" -i "{{ deployed_vm.openstack.id
}}"
[116] register: result
Here are the line numbers from Example B-6 on page 364 and their descriptions:
[007] - [011]: Check whether the image that you are building exists.
[013] - [016]: If the image exists, then use it or use the one from the parameter file.
[018] - [031]: From the localhost, deploy a starting image.
[033] - [035]: Update the host's address from the deployment data. All references to 'base'
now resolve to this address.
[037] - [040]: Erase the existing /etc/resolv.conf file.
[042] - [048]: Create an empty /etc/resolv.conf.
[050] - [056]: Add the name servers as specified in the parameter file.
[058] - [059]: Clean any subscription manager configuration.
[061] - [065]: Remove the old subscription manager package (katello-ca-*).
[067] - [071]: Install the subscription manager parameter package as specified in the
parameter file.
[073] - [078]: Activate the subscription.
[080] - [083]: Update the system.
[085] - [093]: Install some extra packages.
[095] - [100]: Update .bashrc to include an alias for vim.
[102] - [110]: Clean the YUM cache.
[112] - [116]: Capture the VM as an image.
Parameters
The JSON file that is shown in Example B-7 contains parameters that are used by the
playbook to construct a base image. The line numbers are to make it easier for reference in
this example only and should not be included in your parameter file.
Here are the line numbers from Example B-7 and their descriptions:
[001] - cic_base_name: The name of the image to be produced.
[002] - cic_url: The URL of the IBM CIC management node.
[003] - cic_user: The user ID that is used for connecting to the IBM CIC host.
[004] - cic_password: The password that is associated with cic_user.
[005] - cic_project: The project under which the IBM CIC work is performed.
[006] - cic_cert: The certificate that is used if IBM CIC uses a self-signed certificate.
[007] - cic_flavor: The flavor of the deployed VM.
[008] - cic_rhel_image: The image on which ours will be based.
[009] - cic_vlan: The LAN to be associated with the deployed VM.
[010] - cic_key_name: The name of the key to be placed in /root/.ssh/authorized_keys. It
must be uploaded before deployment.
[011] - cic_availability_zone: A way to create logical groupings of hosts.
[012] - cic_host: The name of the node on which to create this VM.
[013] - cic_dns: A list of DNS addresses to be placed into /etc/resolv.conf.
[014] - sub_org: The organization to use with subscription manager registration.
[015] - sub_key: The subscription manager activation key.
[016] - sub_rpm: The URL of the subscription manager satellite RPM.
[017]: The parameter file name that is used by the playbook to invoke the capture process.
Example B-8 shows the deploy-hosts.yml playbook, which controls the provisioning
process. The line numbers in this example are for reference in this book only and do not
appear in the playbook.
Here are the line numbers from Example B-8 and their descriptions:
[002 - 008]: The playbook runs on the localhost and captures data in variables.
[019 - 025]: The playbook invokes the deploy-image.yml tasks for each node in the replica
set.
[027 - 030]: The playbook prepares the parameters to be passed to the prepareMongo
script.
[032 - 033]: The playbook invokes the prepareMongo script, which creates supporting files
for the deploy-mongo.yml playbook.
The deploy-image.yml file, which is shown in Example B-9, contains the tasks that are
required to provision a single VM and its data volume.
Here are the line numbers from Example B-9 on page 369 and their descriptions:
[001 - 013]: Deploy a VM by using the parameters that are passed in extra_args.json.
[015 - 024]: Create a data volume for Mongo.
[026 - 031]: Attach the volume to the VM.
[033 - 037]: Query the volume that was created in lines [015 - 024].
[039 - 041]: Form the volume ID that the IBM SAN Volume Controller uses from the
volume information.
[043 - 045]: Define the volume name based on the volume data that you extracted.
[047 - 049]: Send the mkvolumegroup command to the SAN Volume Controller and ignore
any errors (the volume group might already be defined).
[051 - 053]: Send the chvolumegroup command to the SAN Volume Controller to associate
the policy with volumegroup.
[055 - 057]: Send the chvdisk command to the SAN Volume Controller to include it in the
volumegroup.
[059 - 061]: Set the facts that are used by the prepareMongo script.
The file that is shown in Example B-10 provides the variables that are required by the
deploy-hosts.yml playbook and deploy-image.yml tasks.
Here are the line numbers from Example B-10 on page 370 and their descriptions:
[001]: Name of the nodes and hostnames.
[002]: Flavor of the image that will be deployed.
[003]: Name of the image that will be deployed.
[004]: Name of the LAN that will be associated with the nodes.
[005]: Storage pool for data volumes.
[006]: Key to use to authenticate SSH connections.
[007]: Zone that will be used for deployment.
[008]: Host on which nodes are run.
[009]: Instance numbers that will be used in the hostnames.
[010]: IP name or address of the SAN Volume Controller.
[011]: The SAN Volume Controller username and the public key that was specified when
the user was created.
[012]: Name of the policy to apply to volumegroup.
[013]: Name of volumegroup.
[014]: SSH port for the SAN Volume Controller.
OpenStack support
The file that is shown in Example B-11 is used by the OpenStack module to authenticate
against the IBM CIC host.
Here are the line numbers from Example B-11 on page 371 and their descriptions:
[004]: URL of the IBM CIC host.
[005]: Username to authenticate.
[006]: Password that will be used in authentication.
[007]: Project that will be accessed.
[008]: Certificate that is used in connection authentication.
Hosts file
The file that is shown in Example B-13 is created by the deployment playbooks and contains
all the information that is needed for a successful deployment.
[mongo_nodes]
MONGO-RS0-1 ansible_ssh_host="[ip-1]" shadow="0" vmName="cic003c1" wwn="[wwn]"
volId="2a40014a-f233-46a7-9dfa-07370bb604fe"
volName="volume-MONGO-RS0-1-data-2a40014a-f233" ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
MONGO-RS0-2 ansible_ssh_host="[ip-2]" shadow="0" vmName="cic003c2" wwn="[wwn]"
volId="01d40a84-1409-47e3-be49-429104d8c8fa"
volName="volume-MONGO-RS0-2-data-01d40a84-1409" ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
MONGO-RS0-3 ansible_ssh_host="[ip-3]" shadow="0" vmName="cic003c3" wwn="[wwn]"
volId="146d8c19-1073-41e6-91aa-937e26eb73d2"
volName="volume-MONGO-RS0-3-data-146d8c19-1073" ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
[mongo_master]
MONGO-RS0-1 ansible_ssh_host=[ip-1] ansible_user=root
ansible_python_interpreter="/usr/bin/env python2"
Under [mongo_nodes], add the hosts that will participate in the replica set. Under
[mongo_master], specify one of the nodes to become the master.
In Example B-13 on page 374, our hosts file defines a group of hosts that is called
mongo_nodes, which consists of three hosts that are identified by their IP addresses, for
example, 129.40.186.[215,218,220]. These addresses identify the servers that Ansible
manages. An IP name can also be used. The second parameter defines which user is used
when contacting that host (the public key of the node that runs Ansible must be present on
that host). The third parameter defines which Python interpreter to use. The YUM tasks that
are used by the deployment playbook require Python 2.7.
The mongo_master parameter defines one of the nodes as the master in the replica set.
MongoDB deployment
MongoDB deployment is performed by using a playbook (see "Controller" on page 375) that
invokes another playbook for each member of the replica set. In this section, we describe
each of those playbooks.
Controller
The controller playbook (shown in Example B-14) is a master playbook that is used to
invoke the Tasks playbook (see "Tasks" on page 376).
Here are the line numbers from Example B-14 and their descriptions:
[002] - [003]: Act on those hosts in the inventory file within the 'mongo_nodes' group.
[004] - [005]: The Mongo deployment happens in the mongodb/tasks/main.yml file.
[006] - [008]: Set the node count based on the number of hosts that are defined. Do not
force termination of non-shadow hosts.
[010] - [018]: When Mongo is installed and working, terminate the shadow virtual
machines.
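The playbook itself is listed in Example B-14. The following is only a minimal sketch of its shape, assuming the mongodb role layout (mongodb/tasks/main.yml) that is described above; the variable names are placeholders and not the book's exact code.

- hosts: mongo_nodes          # act on the hosts in the mongo_nodes inventory group
  become: true
  vars:
    nodeCount: "{{ groups['mongo_nodes'] | length }}"   # node count from the inventory
    forceTerminate: "no"                                # do not force termination of non-shadow hosts
  roles:
    - mongodb                 # runs mongodb/tasks/main.yml on each node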
Tasks
This section describes the tasks file (mongodb/tasks/main.yml), which is shown in
Example B-15 on page 376. It installs and configures MongoDB on each node of the replica
set.
[049]
[050] #
[051] # Check if we have a data volume and prepare it if found
[052] #
[053] - name: Get path
[054] stat:
[055] path: "/dev/vdmdv"
[056] register: dev
[057]
[058] - name: partition data device
[059] parted:
[060] device: "/dev/vdmdv"
[061] number: 1
[062] state: present
[063] when: dev.stat.exists
[064]
[065] - name: Get path to partition
[066] find:
[067] recurse: no
[068] paths: "/dev/mapper"
[069] file_type: link
[070] follow: no
[071] patterns: "*{{ wwn }}*1"
[072] register: info
[073] when: dev.stat.exists
[074]
[075] - name: Create Mongo data volume group
[076] lvg:
[077] vg: vg_mongo
[078] pvs: "{{ info.files[0].path }}"
[079] when: dev.stat.exists
[080]
[081] - name: Create LVM
[082] lvol:
[083] vg: vg_mongo
[084] lv: data
[085] size: 100%VG
[086] when: dev.stat.exists
[087]
[088] - name: Create an xfs file system
[089] filesystem:
[090] fstype: xfs
[091] dev: /dev/mapper/vg_mongo-data
[092] when: dev.stat.exists
[093]
[094] - name: Make mount point
[095] file:
[096] path: /var/lib/mongo
[097] state: directory
[098] owner: mongod
[099] group: mongod
[100] mode: '0755'
[101] when: dev.stat.exists
[102]
[103] - name: Add mount point for Mongo volume
[104] ansible.posix.mount:
[105] backup: yes
[106] path: /var/lib/mongo
[107] src: /dev/mapper/vg_mongo-data
[108] fstype: xfs
[109] boot: yes
[110] dump: '1'
[111] passno: '2'
[112] state: present
[113] when: dev.stat.exists
[114]
[115] - name: Mount the volume
[116] command: mount -a
[117] when: dev.stat.exists
[118]
[119] - name: Change ownership of data volume
[120] file:
[121] path: /var/lib/mongo
[122] state: directory
[123] owner: mongod
[124] group: mongod
[125] recurse: yes
[126] when: dev.stat.exists
[127]
[128] #
[129] # SELinux processing
[130] #
[131] - name: Copy SELinux proc policy file
[132] copy:
[133] src: mongodb_proc_net.pp
[134] dest: /tmp/mongodb_proc_net.pp
[135] owner: root
[136] group: root
[137] mode: 0644
[138]
[139] - name: Copy SELinux cgroup policy file
[140] copy:
[141] src: mongodb_cgroup_memory.pp
[142] dest: /tmp/mongodb_cgroup_memory.pp
[143] owner: root
[144] group: root
[145] mode: 0644
[146]
[147] - name: Update SELinux proc file policy
[148] command: semodule -i /tmp/mongodb_proc_net.pp
[149]
[150] - name: Update SELinux cgroup memory policy
[151] command: semodule -i /tmp/mongodb_cgroup_memory.pp
[152]
[153] - name: Apply SELinux policies
[154] community.general.sefcontext:
[155] target: "{{ item.target }}"
[156] setype: "{{ item.set_type }}"
[157] state: "{{ item.state }}"
[158] with_items:
[210] sysctl:
[211] name: "{{ item.name }}"
[212] value: "{{ item.value }}"
[213] state: "{{ item.state }}"
[214] with_items:
[215] - { name: 'vm.dirty_ratio', value: '15', state:
'present' }
[216] - { name: 'vm.dirty_background_ratio', value: '5', state:
'present' }
[217] - { name: 'vm.swappiness', value: '10', state:
'present' }
[218] - { name: 'net.core.somaxconn', value: '4096', state:
'present' }
[219] - { name: 'net.ipv4.tcp_fin_timeout', value: '30', state:
'present' }
[220] - { name: 'net.ipv4.tcp_keepalive_intvl', value: '30', state:
'present' }
[221] - { name: 'net.ipv4.tcp_keepalive_time', value: '120', state:
'present' }
[222] - { name: 'net.ipv4.tcp_max_syn_backlog', value: '4096', state:
'present' }
[223]
[224] #
[225] # Enable SNMP so we can monitor Mongo
[226] #
[227] - name: Copy SNMP server configuration
[228] copy:
[229] src: snmpd.conf
[230] dest: /etc/snmp/snmpd.conf
[231] owner: root
[232] group: root
[233] mode: 0644
[234]
[235] - name: Copy SNMP trap configuration
[236] copy:
[237] src: snmptrapd.conf
[238] dest: /etc/snmp/snmptrapd.conf
[239] owner: root
[240] group: root
[241] mode: 0644
[242]
[243] - name: Enforce SELinux
[244] ansible.posix.selinux:
[245] policy: targeted
[246] state: enforcing
[247]
[248] - name: Ensure that services are enabled and running
[249] ansible.builtin.systemd:
[250] name: "{{ item.name }}"
[251] enabled: "{{ item.enabled }}"
[252] state: "{{ item.state }}"
[253] with_items:
[254] - { name: 'snmpd', enabled: 'yes', state: 'started' }
[255] - { name: 'mongod', enabled: 'yes', state: 'started' }
[256]
[257] #
[258] # If we have >= 3 nodes, then we define a replica set
[259] #
[260] - name: Enable replica set operation [1] - Copy script
[261] copy:
[262] src: rs.js
[263] dest: /tmp
[264] owner: root
[265] group: root
[266] mode: 0600
[267] when: inventory_hostname in groups['mongo_master'] and nodeCount >= "3"
[268]
[269] - name: Enable replica set operation [2] - Run shell
[270] command: mongo /tmp/rs.js
[271] when: inventory_hostname in groups['mongo_master'] and nodeCount >= "3"
[272]
[273] #
[274] # Define the admin user
[275] #
[276] - name: Add admin user to Mongo [1] - Copy script
[277] copy:
[278] src: adminuser.js
[279] dest: /tmp
[280] owner: root
[281] group: root
[282] mode: 0600
[283] when: inventory_hostname in groups['mongo_master']
[284]
[285] - name: Add admin user to Mongo [2] - Run shell to add user
[286] command: mongo /tmp/adminuser.js
[287] when: inventory_hostname in groups['mongo_master']
[288]
[289] #
[290] # Get rid of the ephemera
[291] #
[292] - name: Cleanup
[293] file:
[294] path: "{{ item }}"
[295] state: absent
[296] with_items:
[297] - /tmp/mongodb_cgroup_memory.pp
[298] - /tmp/mongodb_proc_net.pp
[299] - /tmp/rs.js
[300] - /tmp/adminuser.js
[301] - /tmp/findVol
Here are the line numbers from Example B-15 on page 376 and their descriptions:
[003 - 010]: Add the MongoDB repository so that YUM can install it.
[012 - 021]: Install MongoDB and its supporting programs.
[023 - 036]: Load and run the find volume process.
[038 - 043]: Partition the data volume.
[045 - 049]: Create the physical volume (PV) and the volume group.
Quiescing
Use the mongo command to lock the database, as shown in Example B-16.
Here are the line numbers from Example B-16 and their descriptions:
[001]: This playbook will run against all nodes because it does not know which node is
primary.
[008]: Run db.fsyncLock() to lock the database.
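The playbook is shown in Example B-16. As a minimal sketch of the locking step only (authentication options are omitted and the task name is a placeholder), a play like the following runs db.fsyncLock() on every node:

- hosts: mongo_nodes
  become: true
  tasks:
    - name: Quiesce MongoDB with db.fsyncLock()
      command: mongo --eval "db.fsyncLock()"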
Resuming
Use the mongo command to unlock the database, as shown in Example B-17.
Here are the line numbers from Example B-17 and their descriptions:
[001]: This playbook will run against all nodes because it does not know which node is the
primary.
[009]: We run db.fsyncUnlock() until the lockCount reaches 0.
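The playbook is shown in Example B-17. As a minimal sketch of the unlocking step only (authentication options are again omitted), a task like the following calls db.fsyncUnlock() repeatedly until lockCount reaches 0:

- hosts: mongo_nodes
  become: true
  tasks:
    - name: Resume MongoDB with db.fsyncUnlock()
      command: mongo --quiet --eval "while (db.fsyncUnlock().lockCount > 0) {}"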
Terminating
The termination of a replica set is performed by using a controller playbook (see "Controller"
on page 383) and an embedded task.
Controller
The controller playbook that is shown in Example B-18 includes the terminate-host.yml task
that shuts down replica set VMs.
Here are the line numbers from Example B-18 on page 383 and their descriptions:
[005] - [006]: Force the termination of the VMs.
[008] - [012]: Invoke the terminate virtual machine task for each member of the replica set.
Embedded task
The playbook that is shown in Example B-19 shuts down a VM if it is a shadow MongoDB
server or if we force the shutdown of a real MongoDB server.
Here are the line numbers from Example B-19 and their descriptions:
[002] - [004]: This playbook is also used when deploying shadows to shut them down if the
playbook is configured to do so.
[006] - [011]: Use the OpenStack API to terminate a VM.
Master playbook
The destroy-hosts.yml playbook, which is shown in Example B-20, controls the
decommissioning process.
Here are the line numbers from Example B-20 on page 384 and their descriptions:
[002 - 005]: The playbook runs on the localhost.
[007 - 011]: Invoke the destroy-image.yml tasks for each node in the replica set. Use the
data from the inventory file to guide the process.
Here are the line numbers from Example B-21 and their descriptions:
[001 - 003]: Remove the safeguarded policy from the volume group.
[005 - 007]: Remove the data volume from the volume group.
[009 - 011]: Remove the volume group.
Here are the line numbers from Example B-22 and their descriptions:
[001]: Name of the nodes and hostnames.
[002]: Flavor of image to be deployed.
[003]: Name of the image to be deployed.
[004]: Name of the LAN to be associated with the nodes.
[005]: Storage pool for data volumes.
[006]: Key that will be used for authenticating SSH connections.
[007]: Zone that will be used for deployment.
[008]: Host on which the nodes run.
[009]: Instance numbers that will be used in the hostnames.
[010]: IP name or address of the SAN Volume Controller.
[011]: The SAN Volume Controller username and the public key that was specified when
the user was created.
[012]: Name of policy to apply to volumegroup.
[013]: Name of the volume group.
[014]: SSH port for the SAN Volume Controller.
Case of error
In this case, we use the SQL that is shown in Example C-1.
In general, DELETE statements use the FROM clause to specify the database objects from which
to delete rows. However, in SQL1, the FROM clause is missing from the DELETE statement.
When running SQL1 in an Oracle Database, one row is deleted from the table, as shown in
Example C-2.
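Example C-1 and Example C-2 appear on the preceding pages. The following minimal sketch, with a placeholder table and predicate, illustrates the point: Oracle Database accepts a DELETE statement without the FROM keyword, and FUJITSU Enterprise Postgres requires it.

-- Accepted by Oracle Database: the FROM keyword is optional
DELETE TBL WHERE COL_1 = 'xxxxx';

-- FUJITSU Enterprise Postgres: add the FROM keyword
DELETE FROM TBL WHERE COL_1 = 'xxxxx';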
Oracle treats zero-length strings as NULL. Therefore, the SELECT statement that extracts rows
where a column (COL_2) is NULL returns rows, including the rows where a column (COL_1) is
yyyyy, as shown in Example C-5.
COL_1 COL_2
----- -----
yyyyy
zzzzz
When running SQL2 in FUJITSU Enterprise Postgres, the result does not include a row
where column COL_1 is yyyyy, as shown in Example C-6, because FUJITSU Enterprise
Postgres treats the zero-length string as a value that is different from NULL.
col_1 | col_2
-------+-------
zzzzz |
To avoid these problems after a database migration, engineers must modify the SQL and
PL/SQL in applications to ensure that they receive the same results after migration as before.
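As a hedged illustration of such a modification (this is not one of the book's numbered examples, and the table and column names are placeholders), a predicate that treats NULL and the zero-length string alike returns the same rows on both databases:

-- Match rows where COL_2 is NULL (Oracle) or a zero-length string (FUJITSU Enterprise Postgres)
SELECT COL_1, COL_2
  FROM TBL
 WHERE COL_2 IS NULL
    OR COL_2 = '';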
Hierarchical queries
Sequence
Sequence pseudocolumns
REGEXP_LIKE
Function SYSDATE
SYS_CONNECT_BY_PATH
Implicit conversion
Zero-length string
Cursor variables
SQLCODE
Exponentiation operator
FORALL statement
A migration pattern can contain multiple examples. The caption prefix identifies what each
example shows. In the next sections, we demonstrate the differences between Oracle SQL
and FUJITSU Enterprise Postgres SQL.
When showing runtime examples in an Oracle Database, the caption prefixes are as
follows:
– [Oracle-SQL]: An example of SQL.
– [Oracle-PL/SQL]: An example of PL/SQL.
– [Oracle-Result]: A runtime result of SQL or PL/SQL. In some cases, the results that are
not related to the migration pattern might not be shown in this example.
When showing runtime examples in FUJITSU Enterprise Postgres, the caption prefixes
are as follows:
– [FUJITSU Enterprise Postgres: SQL]: An example of SQL.
– [FUJITSU Enterprise Postgres-PL/pgSQL]: An example of PL/pgSQL.
– [FUJITSU Enterprise Postgres-Result]: A runtime result of SQL or PL/pgSQL. In some
cases, the results that are not related to the migration pattern might not be shown in
this example.
Note: Some of the examples in this section use Oracle Compatible features. For more
information about Oracle Compatible features, see 2.5.2, “Oracle compatible features” and
7.3 “Oracle Compatibility features” in Data Serving with FUJITSU Enterprise Postgres on
IBM LinuxONE, SG24-8499.
SQL
This section covers the following topics:
SELECT statement
DELETE or TRUNCATE statements
ROWNUM pseudocolumn
Sequence
Conditions
Function
Others
SELECT statement
This section covers the following topics:
Migration pattern: MINUS operator
Migration pattern: Hierarchical queries
Migration pattern: Correlation name of subquery
How you convert the SQL depends on whether the application accesses other partitions while
deleting data in the target partition:
Case 1: Do not concurrently access any partition other than the one where the data is
deleted.
Replace the ALTER TABLE TRUNCATE PARTITION statement in the Oracle Database
(Example C-23 with the result in Example C-24) with the TRUNCATE statement in FUJITSU
Enterprise Postgres (Example C-25).
Case 2: Concurrently access partitions other than the one where the data is deleted.
Replace the ALTER TABLE TRUNCATE PARTITION statement in the Oracle Database
(Example C-23 with the result in Example C-24) with the DELETE statement in FUJITSU
Enterprise Postgres (Example C-26 on page 397 with the result in Example C-27 on
page 397). The DELETE statement does not lock partitions other than the target partition.
However, when migrating to FUJITSU Enterprise Postgres, consider the following items:
– The DELETE statement takes longer to run than the TRUNCATE statement.
– The DELETE statement works differently than the TRUNCATE statement. The DELETE
statement sets flags to indicate that data is deleted in the area. This area is not freed
until the VACUUM statement runs. As a result, performance might degrade when
retrieving or updating after data is deleted and reinserted into the target partition.
Example: C-25 [FUJITSU Enterprise Postgres: SQL] TRUNCATE statements for a partition (Case 1:
Do not access concurrently)
CREATE TABLE TBL(COL_1 CHAR(2), COL_2 CHAR(5))
PARTITION BY LIST(COL_1);
CREATE TABLE TBL_A1 PARTITION OF TBL FOR VALUES IN ('A1');
CREATE TABLE TBL_A2 PARTITION OF TBL FOR VALUES IN ('A2');
CREATE TABLE TBL_B1 PARTITION OF TBL FOR VALUES IN ('B1');
CREATE TABLE TBL_B2 PARTITION OF TBL FOR VALUES IN ('B2');
INSERT INTO TBL VALUES ('A1', '11111');
INSERT INTO TBL VALUES ('A2', '22222');
INSERT INTO TBL VALUES ('B1', '33333');
INSERT INTO TBL VALUES ('B2', '44444');
TRUNCATE TABLE TBL_A1;
SELECT * FROM TBL;
Example: C-26 [FUJITSU Enterprise Postgres: SQL] TRUNCATE statements for a partition (Case 2:
Access concurrently)
CREATE TABLE TBL(COL_1 CHAR(2), COL_2 CHAR(5))
PARTITION BY LIST(COL_1);
CREATE TABLE TBL_A1 PARTITION OF TBL FOR VALUES IN ('A1');
CREATE TABLE TBL_A2 PARTITION OF TBL FOR VALUES IN ('A2');
CREATE TABLE TBL_B1 PARTITION OF TBL FOR VALUES IN ('B1');
CREATE TABLE TBL_B2 PARTITION OF TBL FOR VALUES IN ('B2');
INSERT INTO TBL VALUES ('A1', '11111');
INSERT INTO TBL VALUES ('A2', '22222');
INSERT INTO TBL VALUES ('B1', '33333');
INSERT INTO TBL VALUES ('B2', '44444');
DELETE FROM TBL_A1;
SELECT * FROM TBL;
ROWNUM pseudocolumn
This section describes the following topics:
Migration pattern: ROWNUM specified in the select list
Migration pattern: ROWNUM specified in the WHERE clause
Example: C-30 [FUJITSU Enterprise Postgres: SQL] ROWNUM specified in the SELECT list
CREATE TABLE TBL(COL_1 CHAR(3));
INSERT INTO TBL VALUES('AAA');
INSERT INTO TBL VALUES('BBB');
INSERT INTO TBL VALUES('CCC');
INSERT INTO TBL VALUES('DDD');
SELECT ROW_NUMBER() OVER() AS ROWNUM, COL_1
FROM (SELECT * FROM TBL ORDER BY COL_1 DESC) WTBL;
Example: C-31 [FUJITSU Enterprise Postgres-Result] ROWNUM specified in the SELECT list
rownum | col_1
--------+-------
1 | DDD
2 | CCC
3 | BBB
4 | AAA
Example: C-34 [FUJITSU Enterprise Postgres: SQL] ROWNUM specified in the WHERE clause
CREATE TABLE TBL(COL_1 CHAR(3));
INSERT INTO TBL VALUES('AAA');
INSERT INTO TBL VALUES('BBB');
INSERT INTO TBL VALUES('CCC');
INSERT INTO TBL VALUES('DDD');
SELECT COL_1
FROM (SELECT * FROM TBL ORDER BY COL_1 DESC) WTBL
LIMIT 2;
Example: C-35 [FUJITSU Enterprise Postgres-Result] ROWNUM specified in the WHERE clause
col_1
-------
DDD
CCC
Sequence
This section describes the following topics:
Migration pattern: Sequence
Migration pattern: Sequence pseudocolumns
Here are the major differences and how to convert them:
CACHE
To specify how many sequence numbers are pre-allocated and stored in memory,
FUJITSU Enterprise Postgres supports the CACHE option, as does Oracle Database, but
the behavior is different: Oracle Database caches sequence numbers on a per-instance
basis, and FUJITSU Enterprise Postgres caches them on a per-session basis.
The default value when the CACHE option is omitted also differs: Oracle Database sets the
cache size to 20, and FUJITSU Enterprise Postgres sets it to 1.
Because of these differences, when migrating to FUJITSU Enterprise Postgres, consider
the intended use of the sequence and the performance impact. For example, if
performance when getting sequence numbers is the highest priority and skipped numbers
within a session are acceptable, set the cache size as you would with Oracle Database. If
avoiding skipped numbers is the highest priority, set the cache size to 1.
NOCACHE
To indicate that values of the sequence are not pre-allocated, Oracle Database supports
the NOCACHE option, but FUJITSU Enterprise Postgres does not. Remove the NOCACHE
keyword when migrating. If the CACHE option is not specified, FUJITSU Enterprise Postgres
sets the cache size to 1, so the state is the same as when Oracle Database specifies the
NOCACHE option.
Example C-36 shows a sequence definition when using an Oracle Database with the results
shown in Example C-37. Example C-38 shows a sequence definition when using FUJITSU
Enterprise Postgres with the results shown in Example C-39.
NO MAXVALUE
NO MINVALUE
NO CYCLE;
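As a minimal sketch of the NOCACHE conversion that is described above (the sequence name is a placeholder, not the book's example):

-- Oracle Database
CREATE SEQUENCE SEQ_1 START WITH 1 INCREMENT BY 1 NOCACHE;

-- FUJITSU Enterprise Postgres: remove NOCACHE; the default cache size is 1
CREATE SEQUENCE SEQ_1 START WITH 1 INCREMENT BY 1;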
Example C-40 on page 401 and Example C-41 on page 401 show the use of sequence
pseudocolumns in the Oracle Database, and Example C-42 on page 401 and Example C-43
on page 401 show the use of sequence pseudocolumns in FUJITSU Enterprise Postgres.
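The referenced examples are on later pages. As a minimal sketch of the usual conversion (seq_1 is a placeholder sequence), the NEXTVAL and CURRVAL pseudocolumns map to the nextval and currval functions:

-- Oracle Database
SELECT seq_1.NEXTVAL FROM DUAL;
SELECT seq_1.CURRVAL FROM DUAL;

-- FUJITSU Enterprise Postgres
SELECT nextval('seq_1');
SELECT currval('seq_1');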
Conditions
This section describes the following topics:
Migration pattern: Inequality operator
Migration pattern: REGEXP_LIKE
Compare the use of inequality operators as used in Oracle Database (Example C-44 and
Example C-45) with their use in FUJITSU Enterprise Postgres (Example C-46 and
Example C-47).
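The referenced examples are on later pages. As a minimal sketch (the table and value are placeholders), Oracle Database accepts the ^= inequality operator, which FUJITSU Enterprise Postgres does not, so replace it with <>:

-- Oracle Database
SELECT COL_1 FROM TBL WHERE COL_1 ^= 'AAA';

-- FUJITSU Enterprise Postgres
SELECT COL_1 FROM TBL WHERE COL_1 <> 'AAA';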
Compare the use of REGEXP_LIKE by Oracle Database in Example C-48 and Example C-49 on
page 403 to the use of REGEXP_LIKE in FUJITSU Enterprise Postgres in Example C-50 on
page 403 and Example C-51 on page 403.
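As a minimal sketch of this pattern (the table and pattern are placeholders), the REGEXP_LIKE condition maps to the POSIX regular-expression operator (~) in FUJITSU Enterprise Postgres:

-- Oracle Database
SELECT COL_1 FROM TBL WHERE REGEXP_LIKE(COL_1, '^A');

-- FUJITSU Enterprise Postgres
SELECT COL_1 FROM TBL WHERE COL_1 ~ '^A';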
Function
This section describes the following topics:
Migration pattern: SYSDATE
Migration pattern: SYS_CONNECT_BY_PATH
When migrating to FUJITSU Enterprise Postgres, replace SYSDATE with the
STATEMENT_TIMESTAMP function. The data type of the STATEMENT_TIMESTAMP result is different
from the SYSDATE result: the runtime result of STATEMENT_TIMESTAMP is a TIMESTAMP WITH TIME
ZONE data type. So, you must cast the result to the appropriate data type, such as the DATE
data type, depending on the requirements of the application.
Compare the use of the function and results in Oracle Database (Example C-52 and
Example C-53 on page 404) with FUJITSU Enterprise Postgres (Example C-54 on page 404
and Example C-55 on page 404).
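As a minimal sketch of the conversion (casting to DATE is only one possibility, as noted above):

-- Oracle Database
SELECT SYSDATE FROM DUAL;

-- FUJITSU Enterprise Postgres: cast the result to the data type that the application expects
SELECT CAST(STATEMENT_TIMESTAMP() AS DATE);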
Replace START WITH and CONNECT BY clauses in Oracle Database hierarchical queries with the
clause WITH RECURSIVE when migrating to FUJITSU Enterprise Postgres. Use the string
concatenation operator (||) and add the following two processing methods in the recursive
query:
1. Add processing to the root row in a recursive query.
The root row of a recursive query is specified in the SELECT list before UNION ALL in the
SELECT statement. Add to this SELECT list the equivalent value of the root row of the
SYS_CONNECT_BY_PATH function. The value to add is the string concatenation of the second
argument delimiter and the first argument of this function.
2. Add processing to the repeated row in a recursive query.
The repeated row of a recursive query is specified in the SELECT list after UNION ALL in the
SELECT statement. Add to this SELECT list the equivalent value of the repeated row of the
SYS_CONNECT_BY_PATH function. The value to add is the string concatenation of the parent
row, the second argument delimiter, and the first argument of this function.
Example C-56 and Example C-57 on page 405 demonstrate how Oracle Database handles
the SYS_CONNECT_BY_PATH function. Example C-58 on page 405 and Example C-59 on
page 405 demonstrate how FUJITSU Enterprise Postgres handles the SYS_CONNECT_BY_PATH
function.
/A/A2/A21
/A/A2/A22
/A/A2/A22/A221
/A/A3
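As a minimal sketch of the two steps that are described above (the table TBL and its columns ID, PARENT_ID, and NAME are placeholders, not the book's example):

WITH RECURSIVE WTBL(ID, NAME, PATH) AS (
    -- Step 1: the root rows get delimiter || first argument
    SELECT ID, NAME, '/' || NAME
      FROM TBL
     WHERE PARENT_ID IS NULL
    UNION ALL
    -- Step 2: the repeated rows get parent path || delimiter || first argument
    SELECT T.ID, T.NAME, W.PATH || '/' || T.NAME
      FROM TBL T, WTBL W
     WHERE T.PARENT_ID = W.ID
)
SELECT PATH FROM WTBL;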
Others
This section describes the following topics:
Migration pattern: Database object name
Migration pattern: Implicit conversion
Migration pattern: Zero-length strings
Migration pattern: Comparing fixed-length character strings and variable-length character
strings
As a best practice when migrating to FUJITSU Enterprise Postgres, define database object
names with unquoted identifiers and refer to them with unquoted identifiers. Quoted
identifiers are supported by FUJITSU Enterprise Postgres, but they are not accepted by
some tools that manage database objects.
Example C-60 and Example C-61 provide examples when using Oracle Database, and
Example C-62 on page 407 and Example C-63 on page 407 provide examples when using
FUJITSU Enterprise Postgres.
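As a minimal sketch of the difference (the tables are placeholders), FUJITSU Enterprise Postgres folds unquoted identifiers to lowercase, while a quoted identifier preserves its case and must always be written with the same quoting:

-- Unquoted: both statements refer to the same table (tbl)
CREATE TABLE TBL(COL_1 CHAR(3));
SELECT col_1 FROM tbl;

-- Quoted: the name is case-sensitive and must always be quoted
CREATE TABLE "Tbl2"(COL_1 CHAR(3));
SELECT COL_1 FROM "Tbl2";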
Example C-64 and Example C-65 provide examples when using an Oracle Database, and
Example C-66 and Example C-67 on page 408 provide examples when using FUJITSU
Enterprise Postgres.
Example C-68 and Example C-69 provide examples when using an Oracle Database, and
Example C-70 and Example C-71 provide examples when using FUJITSU Enterprise
Postgres.
FUJITSU Enterprise Postgres behaves differently than Oracle Database when comparing
fixed-length string data and variable-length string data of different lengths. FUJITSU
Enterprise Postgres removes or adds trailing spaces in the fixed-length string so that its
length matches the length of the variable-length string before comparing. If the lengths still
do not match after trailing spaces are removed or added, the values are determined not to
match.
When migrating SQL that compares fixed-length character string data with variable-length
character string data, apply the RPAD function to the fixed-length character string data on
FUJITSU Enterprise Postgres. Specify the fixed-length character string data as the first
argument of RPAD and the length of the data as the second argument. The result of the RPAD
function is a value that has the declared length of the fixed-length string, including its trailing
spaces, so FUJITSU Enterprise Postgres can compare the values, including trailing spaces,
and the comparison result is the same as though it were done in an Oracle Database.
Example C-72 and Example C-73 provide examples when using an Oracle Database, and
Example C-74 on page 410 and Example C-75 on page 410 provide examples when using
FUJITSU Enterprise Postgres.
Example: C-72 [Oracle-SQL] Comparing a fixed-length character string and variable-length character
string
CREATE TABLE TBL_1(COL_1 CHAR(5));
CREATE TABLE TBL_2(COL_1 VARCHAR2(5));
INSERT INTO TBL_1 VALUES('AAA');
INSERT INTO TBL_1 VALUES('BBB');
INSERT INTO TBL_1 VALUES('CCC');
INSERT INTO TBL_2 VALUES('AAA ');
INSERT INTO TBL_2 VALUES('BBB');
INSERT INTO TBL_2 VALUES('CCC ');
SELECT TBL_1.COL_1 FROM TBL_1, TBL_2
WHERE TBL_1.COL_1 = TBL_2.COL_1
ORDER BY COL_1;
Example: C-73 [Oracle-Result] Comparing a fixed-length character string and variable-length character
string
COL_1
-----
AAA
CCC
Example: C-74 [FUJITSU Enterprise Postgres: SQL] Comparing a fixed-length character string and
variable-length character string
CREATE TABLE TBL_1(COL_1 CHAR(5));
CREATE TABLE TBL_2(COL_1 VARCHAR(5));
INSERT INTO TBL_1 VALUES('AAA');
INSERT INTO TBL_1 VALUES('BBB');
INSERT INTO TBL_1 VALUES('CCC');
INSERT INTO TBL_2 VALUES('AAA ');
INSERT INTO TBL_2 VALUES('BBB');
INSERT INTO TBL_2 VALUES('CCC ');
SELECT TBL_1.COL_1 FROM TBL_1, TBL_2
WHERE RPAD(TBL_1.COL_1, 5) = TBL_2.COL_1
ORDER BY COL_1;
Example: C-75 [FUJITSU Enterprise Postgres-Result] Comparing a fixed-length character string and
variable-length character string
col_1
-------
AAA
CCC
PL/SQL
This section describes the following topics:
Database trigger migration pattern
Cursors
Error handling
Stored functions
Stored procedures migration pattern
Other migration patterns
Example C-76 and Example C-77 on page 412 provide examples when using Oracle
Database, and Example C-78 on page 412 and Example C-79 on page 413 provide
examples when using FUJITSU Enterprise Postgres.
Cursors
Cursors are supported on FUJITSU Enterprise Postgres and on Oracle Database. However,
some functions might need to be modified when migrating because there are some
incompatibilities.
In this section, two major differences are described: One is cursor attributes, and the other is
cursor variables. These examples of FUJITSU Enterprise Postgres enable Oracle
compatibility features.
%FOUND
Use the FOUND variable in FUJITSU Enterprise Postgres to get the same information as
%FOUND.
%ROWCOUNT
GET DIAGNOSTICS enables you to get the number of rows that is processed by the last SQL
in FUJITSU Enterprise Postgres. Use GET DIAGNOSTICS to retrieve the information that is
retrieved by %ROWCOUNT.
Example C-80 and Example C-81 on page 415 provide examples when using Oracle
Database, and Example C-82 on page 415 and Example C-83 on page 416 provide
examples when using FUJITSU Enterprise Postgres.
END IF;
EXIT WHEN cur2%NOTFOUND;
DBMS_OUTPUT.PUT_LINE(
'id=' || var_id || ', row_number=' || cur2%ROWCOUNT);
END LOOP;
CLOSE cur2;
EXCEPTION
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('exception block');
IF cur1%ISOPEN THEN
CLOSE cur1;
END IF;
IF cur2%ISOPEN THEN
CLOSE cur2;
END IF;
END;
/
LOOP
FETCH cur1 INTO var_id, var_val;
IF FOUND THEN
PERFORM DBMS_OUTPUT.PUT_LINE(
'id=' || var_id || ', val=' || var_val);
ELSE
EXIT;
END IF;
END LOOP;
CLOSE cur1;
cur1_isopen := FALSE;
PERFORM DBMS_OUTPUT.PUT_LINE('**** using cur2 ****');
IF NOT cur2_isopen THEN
OPEN cur2 FOR
SELECT id, val FROM cur_tbl WHERE id > 100;
cur2_isopen := TRUE;
var_count := 0;
END IF;
LOOP
FETCH cur2 INTO var_id, var_val;
GET DIAGNOSTICS row_count = ROW_COUNT;
var_count := var_count + row_count;
IF var_count = 0 THEN
PERFORM DBMS_OUTPUT.PUT_LINE('No data found');
EXIT;
END IF;
EXIT WHEN NOT FOUND;
PERFORM DBMS_OUTPUT.PUT_LINE(
'id=' || var_id || ', row_number=' || var_count);
END LOOP;
CLOSE cur2;
cur2_isopen := FALSE;
EXCEPTION
WHEN OTHERS THEN
PERFORM DBMS_OUTPUT.PUT_LINE('exception block');
END;
$$ LANGUAGE plpgsql;
When migrating to FUJITSU Enterprise Postgres, remove the REF CURSOR definition, and
define the variable that was defined as REF CURSOR in Oracle Database as the refcursor
data type.
Example C-84 and Example C-85 provide examples when using Oracle Database, and
Example C-86 and Example C-87 on page 418 provide examples when using FUJITSU
Enterprise Postgres.
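As a minimal sketch of this conversion (the query reuses the cur_tbl table from the cursor examples earlier in this appendix; the block is plain PL/pgSQL, not the book's example):

DO $$
DECLARE
    cur refcursor;      -- replaces the Oracle REF CURSOR type and variable
    var_id INTEGER;
    var_val VARCHAR(10);
BEGIN
    OPEN cur FOR SELECT id, val FROM cur_tbl WHERE id < 3;
    FETCH cur INTO var_id, var_val;
    CLOSE cur;
END;
$$ LANGUAGE plpgsql;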
Error handling
This section describes the following topics:
Migration pattern: Predefined exceptions
Migration pattern: SQLCODE
Table C-3 provides the major predefined exceptions of Oracle Database, and examples of
what predefined exceptions should be used in FUJITSU Enterprise Postgres when migrating.
In this section, we show concrete examples of how to convert the five predefined exceptions
that are listed in Table C-3 on page 418 when migrating to FUJITSU Enterprise Postgres.
These examples enable Oracle compatibility features.
CURSOR_ALREADY_OPEN: Shown in Example C-88 through Example C-91 on page 421.
DUP_VAL_ON_INDEX: Shown in Example C-92 on page 421 through Example C-95 on
page 422.
INVALID_CURSOR: Shown in Example C-96 on page 422 through Example C-99 on
page 423.
INVALID_NUMBER: Shown in Example C-100 on page 423 through Example C-103 on
page 424.
ZERO_DIVIDE: Shown in Example C-104 on page 424 through Example C-107 on
page 425.
CURSOR_ALREADY_OPEN
Example: C-88 [Oracle -PL/SQL] Predefined exceptions: CURSOR_ALREADY_OPEN
CREATE TABLE cur_tbl(
id NUMBER,
val VARCHAR2(10)
);
INSERT INTO cur_tbl(id, val) (
SELECT LEVEL, 'data' || LEVEL FROM DUAL
CONNECT BY LEVEL <= 10
);
DECLARE
CURSOR cur IS SELECT id, val from cur_tbl WHERE id < 3;
var_id PLS_INTEGER;
var_val VARCHAR2(10);
BEGIN
DBMS_OUTPUT.PUT_LINE('Anonymous block: BEGIN');
OPEN cur;
LOOP
FETCH cur INTO var_id, var_val;
EXIT WHEN cur%NOTFOUND;
DBMS_OUTPUT.PUT_LINE(
'id=' || var_id || ', val=' || var_val);
END LOOP;
-- various processing
-- Exception occurs
OPEN cur;
-- various processing
CLOSE cur;
DUP_VAL_ON_INDEX
Example: C-92 [Oracle -PL/SQL] Predefined exceptions: DUP_VAL_ON_INDEX
CREATE TABLE sample_tbl(
id NUMBER UNIQUE,
val VARCHAR2(10)
);
BEGIN
INSERT INTO sample_tbl VALUES(1, 'data1');
DBMS_OUTPUT.PUT_LINE('INSERT: data1');
-- Exception occurs
INSERT INTO sample_tbl VALUES(1, 'data2');
DBMS_OUTPUT.PUT_LINE('INSERT: data2');
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
DBMS_OUTPUT.PUT_LINE('DUP_VAL_ON_INDEX');
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('OTHERS');
END;
/
EXCEPTION
WHEN UNIQUE_VIOLATION THEN
PERFORM DBMS_OUTPUT.PUT_LINE('DUP_VAL_ON_INDEX');
WHEN OTHERS THEN
PERFORM DBMS_OUTPUT.PUT_LINE('OTHERS');
END;
$$ LANGUAGE plpgsql;
INVALID_CURSOR
Example: C-96 [Oracle -PL/SQL] Predefined exceptions: INVALID_CURSOR
CREATE TABLE cur_tbl(
id NUMBER,
val VARCHAR2(10)
);
INSERT INTO cur_tbl(id, val) (
SELECT LEVEL, 'data' || LEVEL FROM DUAL
CONNECT BY LEVEL <= 10
);
DECLARE
CURSOR cur IS SELECT id, val FROM cur_tbl WHERE id = 1;
var_id PLS_INTEGER;
var_val VARCHAR2(10);
BEGIN
DBMS_OUTPUT.PUT_LINE('Anonymous block: BEGIN');
LOOP
-- Exception occurs
FETCH cur INTO var_id, var_val;
EXIT WHEN cur%NOTFOUND;
DBMS_OUTPUT.PUT_LINE('id:' || var_id);
DBMS_OUTPUT.PUT_LINE('val:' || var_val);
END LOOP;
CLOSE cur;
DBMS_OUTPUT.PUT_LINE('Anonymous block: END');
EXCEPTION
WHEN INVALID_CURSOR THEN
DBMS_OUTPUT.PUT_LINE('INVALID_CURSOR');
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('OTHERS');
END;
/
CLOSE cur;
PERFORM DBMS_OUTPUT.PUT_LINE('Anonymous block: END');
EXCEPTION
WHEN INVALID_CURSOR_STATE OR INVALID_CURSOR_NAME THEN
PERFORM DBMS_OUTPUT.PUT_LINE('INVALID_CURSOR');
WHEN OTHERS THEN
PERFORM DBMS_OUTPUT.PUT_LINE('OTHERS');
END;
$$ LANGUAGE plpgsql;
INVALID_NUMBER
Example: C-100 [Oracle -PL/SQL] Predefined exceptions: INVALID_NUMBER
CREATE TABLE sample_tbl(
id NUMBER,
val VARCHAR2(10)
);
DECLARE
id CHAR(5) := 'a001';
val VARCHAR2(10) := 'data';
BEGIN
-- Exception occurs
INSERT INTO sample_tbl VALUES(CAST(id AS NUMBER), val);
EXCEPTION
WHEN INVALID_NUMBER THEN
DBMS_OUTPUT.PUT_LINE('INVALID_NUMBER');
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('OTHERS');
END;
/
PERFORM DBMS_OUTPUT.PUT_LINE('INVALID_NUMBER');
WHEN OTHERS THEN
PERFORM DBMS_OUTPUT.PUT_LINE('OTHERS');
END;
$$ LANGUAGE plpgsql;
ZERO_DIVIDE
Example: C-104 [Oracle -PL/SQL] Predefined exceptions: ZERO_DIVIDE
DECLARE
day_num NUMBER := 0;
total_sales NUMBER := 0;
sales_avg NUMBER := 0;
BEGIN
-- Exception occurs
sales_avg := total_sales / day_num;
DBMS_OUTPUT.PUT_LINE(
In addition, there is a difference in the data type of the returned value. SQLCODE returns a
value of a numeric data type, and SQLSTATE returns an alphanumeric value of a string data
type. Therefore, if the process was created with the expectation of a numeric return value,
you must modify the process to use the string data type.
Compare the handling of the SQLCODE function in Example C-108 and Example C-109 for
Oracle Database with the handling of SQLSTATE in Example C-110 and Example C-111 on
page 427 for FUJITSU Enterprise Postgres.
);
END;
$$ LANGUAGE plpgsql;
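As a minimal sketch of the difference (plain PL/pgSQL with RAISE NOTICE rather than the DBMS_OUTPUT calls of the book's examples), SQLSTATE is available in the exception handler and returns a five-character string:

DO $$
BEGIN
    PERFORM 1 / 0;                               -- raises division_by_zero
EXCEPTION
    WHEN OTHERS THEN
        RAISE NOTICE 'SQLSTATE = %', SQLSTATE;   -- a string, for example '22012'
END;
$$ LANGUAGE plpgsql;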
Stored functions
This section describes the following topics:
Migration pattern: Stored functions
Migration pattern: Stored functions (performance improvements)
The following examples show the basic conversion method of the stored function. These
examples of FUJITSU Enterprise Postgres enable Oracle compatibility features.
RETURN ret_val;
END create_message;
/
DECLARE
msg1 VARCHAR2(50);
msg2 VARCHAR2(100);
BEGIN
msg2 := 'sample message';
msg1 := create_message('Tom', msg2);
DBMS_OUTPUT.PUT_LINE(msg1);
DBMS_OUTPUT.PUT_LINE(msg2);
END;
/
These examples of FUJITSU Enterprise Postgres use the EXPLAIN statement and output the
query plan to confirm the run time. They show that the run time of a stored function with
IMMUTABLE is shorter than that of a stored function without IMMUTABLE, although the
difference is small because this is a simple example.
Example: C-119 [FUJITSU Enterprise Postgres- Result] Stored functions without specifying
IMMUTABLE
message
------------------------
NOTICE: sample message
QUERY PLAN
----------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=32) (actual time=0.006..0.008 rows=1
loops=1)
Output: create_message('sample message'::character varying)
Planning Time: 0.012 ms
Execution Time: 0.021 ms
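As a minimal sketch of declaring a function IMMUTABLE (the function is a placeholder, not the book's create_message example), which allows the planner to evaluate the call at planning time when its arguments are constants:

CREATE OR REPLACE FUNCTION add_tax(price NUMERIC)
RETURNS NUMERIC
AS $$
BEGIN
    RETURN price * 1.10;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- With VERBOSE output, the plan shows the folded constant instead of the function call
EXPLAIN (ANALYZE, VERBOSE) SELECT add_tax(100);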
Example C-122 - Example C-125 on page 433 show the basic conversion method of stored
procedures. These examples of FUJITSU Enterprise Postgres enable Oracle compatibility
features.
IS
BEGIN
DBMS_OUTPUT.PUT_LINE('---- Login message ----');
END;
/
CREATE OR REPLACE PROCEDURE sample_proc(
user_id NUMBER,
message IN OUT VARCHAR2,
login_time OUT TIMESTAMP
)
IS
user_name VARCHAR2(10);
query VARCHAR2(100) := 'SELECT name FROM user_tbl WHERE user_id = :user_id';
BEGIN
EXECUTE IMMEDIATE query INTO user_name USING user_id;
message := message || ' ' || user_name;
login_time := SYSTIMESTAMP;
END;
/
DECLARE
message VARCHAR2(100) := 'Hello';
login_time TIMESTAMP;
BEGIN
login_message;
sample_proc(1, message, login_time);
DBMS_OUTPUT.PUT_LINE(message);
DBMS_OUTPUT.PUT_LINE('Login Time: ' || login_time);
END;
/
AS $$
DECLARE
user_name VARCHAR(10);
query VARCHAR(100) := 'SELECT name FROM user_tbl WHERE user_id = $1';
BEGIN
EXECUTE query INTO user_name USING user_id;
message := message || ' ' || user_name;
login_time := statement_timestamp();
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
DO $$
DECLARE
message VARCHAR(100) := 'Hello';
login_time TIMESTAMP;
BEGIN
PERFORM DBMS_OUTPUT.SERVEROUTPUT(TRUE);
CALL login_message();
CALL sample_proc(1, message, login_time);
PERFORM DBMS_OUTPUT.PUT_LINE(message);
PERFORM DBMS_OUTPUT.PUT_LINE('Login Time: ' || login_time);
END;
$$ LANGUAGE plpgsql;
The FUJITSU Enterprise Postgres examples in Example C-126 through Example C-129 on
page 434 enable Oracle compatibility features.
Example: C-129 [FUJITSU Enterprise Postgres- Result] Cursor FOR LOOP statements
id=1, val=data1
id=2, val=data2
DO
The basic function is the same, but there are differences in syntax and behavior. The major
differences are as follows:
FUJITSU Enterprise Postgres does not support the IMMEDIATE keyword. When migrating
from an Oracle Database, remove the keyword IMMEDIATE (see Example C-130 -
Example C-133 on page 436).
The way to write placeholders differs between Oracle Database and FUJITSU Enterprise
Postgres. Specify $1, $2, and so on, when migrating to FUJITSU Enterprise Postgres.
user_name VARCHAR
)
AS $$
DECLARE
query VARCHAR(256)
:= 'INSERT INTO user_tbl VALUES(nextval(''user_id_seq''), $1)';
BEGIN
EXECUTE query USING user_name;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
CALL user_registration('Mary');
SELECT * FROM user_tbl;
The FUJITSU Enterprise Postgres examples in Example C-134 through Example C-137 on
page 437 enable Oracle compatibility features.
Related publications
The publications that are listed in this section are considered suitable for a more detailed
description of the topics that are covered in this book.
IBM Redbooks
The following IBM Redbooks publications provide more information about the topics in this
document. Some publications that are referenced in this list might be available in softcopy only.
Data Serving with FUJITSU Enterprise Postgres on IBM LinuxONE, SG24-8499
Leveraging LinuxONE to Maximize Your Data Serving Capabilities, SG24-8518
You can search for, view, download, or order this document and other Redbooks, Redpapers,
web docs, drafts, and additional materials, at the following website:
ibm.com/redbooks
Online resources
The following websites are also relevant as further information sources:
FUJITSU Enterprise Postgres on IBM LinuxONE
https://www.postgresql.fastware.com/fujitsu-enterprise-postgres-on-ibm-linuxone
?utm_referrer=https%3A%2F%2Ffastware.com%2F
SG24-8518-01
Printed in U.S.A.
ibm.com/redbooks