Model Order Reduction Via Matlab Parallel Computing Toolbox: Istanbul Technical University
E. Fatih Yetkin (Istanbul Technical Univ.) Terschelling, 2009 September 21, 2009 1 / 40
1 Parallel Computation
Why Do We Need Parallelism in MOR?
What is Parallelism?
Parallel Architectures
2 Tools of Parallelization
Programming Models
Parallel Matlab
3 Rational Krylov Methods
4 Conclusions
Why Do We Need Parallelism in MOR?
Computational Complexity
What is Parallelism?
Sequential Programming
What is Parallelism?
Parallel Programming
Parallel Architectures
Shared Memory
Shared memory machines have in common that all processors can access all memory as a global address space.
Multiple processors operate independently while sharing the same memory resources.
Shared memory machines are divided into two main classes based on memory access times: UMA and NUMA.
Parallel Architectures
UMA vs. NUMA
Parallel Architectures
Distributed Memory
Parallel Architectures
Hybrid Memory
The largest and fastest computers in the world today employ both shared and distributed memory architectures.
The shared memory component is usually a cache-coherent SMP node: processors on a given SMP address that machine's memory as global.
Network communication is required to move data from one SMP to another.
Parallel Programming Models: Threads
POSIX Threads & OpenMP
Parallel Programming Models: Message Passing Interface (MPI)
A set of tasks that use their own local memory during computation.
Multiple tasks can reside on the same physical machine as well as across an arbitrary number of machines.
Tasks exchange data by sending and receiving messages.
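Matlab's Parallel Computing Toolbox exposes the same message-passing pattern inside spmd blocks, with labSend/labReceive (or the combined labSendReceive) in the role of MPI send/receive. A minimal hedged sketch; the ring exchange shown here is illustrative, not from the talk:

```matlab
% Illustrative sketch: point-to-point message passing between workers.
% labindex and numlabs identify each worker within the spmd block.
spmd
    right = mod(labindex, numlabs) + 1;       % neighbour to the right in a ring
    left  = mod(labindex - 2, numlabs) + 1;   % neighbour to the left
    msg   = labindex^2;                       % some local data to exchange
    % Send msg to the right neighbour while receiving from the left one
    % (labSendReceive avoids the deadlock a naive send-then-receive can cause).
    received = labSendReceive(right, left, msg);
end
```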
Matlab Distributed Computing Toolbox
Distributed or Parallel
In Matlab terminology, parallel jobs run on local workers (such as the cores of a single machine), while distributed jobs run on cluster nodes.
Basics of Parallel Computing Toolbox
parfor
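The parfor example shown on this slide was not preserved in the text. As a hedged stand-in, a minimal loop with independent iterations (matlabpool was the pool-opening command in 2009-era releases; newer releases use parpool):

```matlab
matlabpool open 4            % 2009-era syntax; parpool(4) in newer releases
r = zeros(1, 100);
parfor i = 1:100
    % Each iteration is independent, so they can run on different workers;
    % r is a "sliced" output variable, indexed by the loop variable.
    r(i) = max(abs(eig(randn(50))));
end
matlabpool close
```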
Basics of Parallel Computing Toolbox
When can we use parfor?
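parfor applies when the iterations are order-independent: sliced outputs indexed by the loop variable and reduction variables (such as running sums) are both allowed. An illustrative sketch:

```matlab
s = 0;
parfor i = 1:1000
    % s is a reduction variable: the order in which the partial sums
    % are accumulated does not affect the result, so parfor accepts it.
    s = s + 1/i^2;
end
```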
Basics of Parallel Computing Toolbox
When can we not use parfor?
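parfor cannot be used when an iteration depends on the result of a previous one. A hedged illustrative counter-example:

```matlab
x = zeros(1, 100);
x(1) = 1;
for i = 2:100                  % parfor would reject this loop
    % Loop-carried dependence: x(i) needs x(i-1), so the iterations
    % must execute in order and cannot be distributed across workers.
    x(i) = 0.5*x(i-1) + 1;
end
```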
Basics of Parallel Computing Toolbox
single program, multiple data (spmd)
In Matlab, an spmd block runs the same program on multiple workers, each operating on its own data set.
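A minimal hedged sketch of an spmd block (the index-splitting scheme is illustrative):

```matlab
spmd
    % Every worker runs this same block on its own piece of the data.
    chunk    = labindex:numlabs:1000;   % this worker's share of the indices
    localSum = sum(sin(chunk));         % purely local computation
    total    = gplus(localSum);         % global reduction across all workers
end
```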
Basics of Parallel Computing Toolbox
distributed arrays
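The code on this slide was lost in extraction; a hedged sketch of the idea, assuming the 2009-era codistributed-array API:

```matlab
spmd
    % codistributed partitions the matrix across the workers' memories,
    % so no single worker needs to hold all of A.
    A = codistributed(randn(1000));
    B = A * A;          % overloaded operator: the multiply runs in parallel
    C = gather(B);      % collect the full result as a replicated array
end
```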
Matrix Transposition
MPI-Fortran vs. Matlab DCT
Rational Krylov Methods
ẋ = Ax + Bu
y = C^T x + Du
Â = W∗AV,  B̂ = W∗B,  Ĉ = CV   (1)
Rational Krylov Method
Rational Krylov Projectors
H2 norm of a system
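The definition on this slide did not survive extraction; for reference, the standard H2 norm of a stable system with transfer function G(s) = C^T(sI − A)^{-1}B (taking D = 0) is

```latex
\|G\|_{\mathcal{H}_2}^2
  = \frac{1}{2\pi}\int_{-\infty}^{\infty}
      \operatorname{trace}\bigl(G(j\omega)\,G(j\omega)^{*}\bigr)\,d\omega
  = \operatorname{trace}\bigl(C^{T} P\, C\bigr),
\qquad
AP + PA^{T} + BB^{T} = 0,
```

where P is the controllability Gramian, obtained from the Lyapunov equation on the right.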
H2 optimality
Iterative Rational Krylov Algorithm (IRKA)
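The algorithm listing on this slide did not survive extraction. A hedged SISO sketch of the IRKA fixed-point iteration, in which the interpolation points are updated to the mirrored reduced poles; all names are illustrative, and dense solves are used for clarity:

```matlab
function [Ar, Br, Cr] = irka_sketch(A, B, C, sigma, tol, maxit)
% A: n-by-n, B: n-by-1, C: n-by-1 (output y = C'*x), sigma: initial shifts.
n = size(A, 1);  k = numel(sigma);
for it = 1:maxit
    V = zeros(n, k);  W = zeros(n, k);
    for j = 1:k
        V(:, j) = (sigma(j)*eye(n) - A)  \ B;   % input Krylov directions
        W(:, j) = (sigma(j)*eye(n) - A') \ C;   % output Krylov directions
    end
    [V, ~] = qr(V, 0);  [W, ~] = qr(W, 0);      % orthonormal bases
    Ar = (W'*V) \ (W'*A*V);                     % oblique (Petrov-Galerkin) projection
    signew = sort(-eig(Ar));                    % mirror the reduced poles
    if norm(signew - sort(sigma)) < tol*norm(sigma), break; end
    sigma = signew;                             % fixed-point update of the shifts
end
Br = (W'*V) \ (W'*B);
Cr = V'*C;
end
```

At a fixed point, the shifts equal the mirror images of the reduced poles, which is the first-order H2 optimality condition the talk refers to.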
Example RLC network
Frequency plots of the reduced and original systems
N = 201; order of the reduced system k = 20
Computational Cost of Methods
Parallel Parts of Algorithms
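The k shifted linear solves that build the rational Krylov basis are mutually independent, which makes them a natural parfor candidate. A hedged sketch under illustrative data (real shifts assumed):

```matlab
n = 500;  k = 8;
A = sprandn(n, n, 0.01) - 5*speye(n);   % illustrative sparse system matrix
B = randn(n, 1);
sigma = logspace(0, 2, k);              % interpolation points (assumed real)
V = zeros(n, k);
parfor j = 1:k
    % Each shifted solve is independent of the others, so the k solves
    % can be distributed across the available workers.
    V(:, j) = (sigma(j)*speye(n) - A) \ B;
end
[V, ~] = qr(V, 0);                      % orthonormal projection basis
```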
Parallel Version of Alg. 1
CPU times for Rational Krylov
Table: CPU times of the parallel version of Alg. 1 for different system orders; reduced system order k = 200.
CPU times for IRKA
Table: CPU times of the parallel version of Alg. 2 for different system orders; reduced system order k = 200.
Speedup graph for RK
Speedup is defined as S_P = T_1 / T_P, where T_1 is the CPU time on one processor and T_P is the CPU time on P processors.
Speedup graph for IRKA
continued
As the figures show, increasing the number of processors decreases the processing time appreciably up to a point, after which it starts to increase.
This is because communication time becomes dominant over computation time. In both algorithms, however, better speedups are obtained as the system matrices grow larger.
Conclusions