
Chapter 4: Retiming

Keshab K. Parhi
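Note: for circuits with loops, the question is how to approach the iteration bound. (Pipelining reduces the critical path only across feed-forward cutsets.)
Retiming mainly increases speed; its effect on power savings is small.
Retiming: latency is unchanged and the critical path is reduced. Pipelining also reduces the critical path, but it increases latency.
• Moving around existing delays (the loop restriction no longer applies)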
• Does not alter the latency of the system
• Reduces the critical path of the system
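• Node Retiming (type 1)
[Figure: node retiming example; 2D is moved across a node, changing the delays on its edges (labels D, 3D, 5D, 2D, +2D, -2D).]
Key prerequisite: the operations must be linear and time-invariant!
• Cutset Retiming (type 2)
[Figure: cutset retiming example on a DFG with nodes A, B, C, D, E, F.]
Note: the cutset on the right has 2 incoming and 3 outgoing edges. Removing one D from each incoming edge adds one D to each of the 3 outgoing edges.
Treat the cutset as a single node.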
Retiming vs pipelining

• Generalization of Pipelining
• Pipelining is Equivalent to Introducing
Many delays at the Input followed by
Retiming

• Retiming Formulation
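[Figure: an edge U → V with weight ω before retiming and weight ω’ after retiming. U is the source node with retiming value r(U); V is the destination node with retiming value r(V).]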

ω’ = ω + r(V) - r(U)

•Properties of retiming
–The weight of the retimed path p = V0 → V1 → … → Vk is given by
ωr(p) = ω(p) + r(Vk) - r(V0). Note the cycle case: when Vk = V0 the retiming terms cancel, so ωr(p) = ω(p).
–Retiming does not change the number of delays in a cycle.
–Retiming does not alter the iteration bound in a DFG as the
number of delays in a cycle does not change
–Adding the constant value j to the retiming value of each node
does not alter the number of delays in the edges of the retimed
graph. Note that j is added to the nodes, not to the edges: r(V) - r(U) is unchanged, so ω’ = ω + r(V) - r(U) is unchanged.
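The properties above can be checked on a small example. The following is a minimal sketch, assuming a hypothetical 4-node DFG and hand-picked retiming values (neither is taken from the slides); the helper names `retime` and `cycle_delays` are illustrative only.

```python
def retime(edges, r):
    """Apply w'(U,V) = w(U,V) + r(V) - r(U) to every edge (u, v, w)."""
    return [(u, v, w + r[v] - r[u]) for u, v, w in edges]


def cycle_delays(edge_list, cycle):
    """Total number of delays along a cycle given as a list of node ids."""
    weights = {(u, v): w for u, v, w in edge_list}
    return sum(
        weights[(cycle[i], cycle[(i + 1) % len(cycle)])]
        for i in range(len(cycle))
    )


# Hypothetical DFG: edges are (source, destination, number of delays).
edges = [(0, 2, 1), (0, 3, 2), (2, 1, 0), (3, 1, 0), (1, 0, 1)]
r = [0, 1, 0, 0]  # assumed retiming values, chosen for illustration

retimed = retime(edges, r)

# The delay count of each cycle is preserved by retiming.
for cycle in ([0, 2, 1], [0, 3, 1]):
    print(cycle, cycle_delays(edges, cycle), cycle_delays(retimed, cycle))

# Adding a constant j to every node's retiming value changes nothing.
shifted = retime(edges, [value + 5 for value in r])
print(shifted == retimed)
```

A retiming is only usable if every retimed edge weight is non-negative, which holds for the values assumed here.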

•Retiming is done to meet the following two main goals:
– Clock period minimization
– Register minimization
• Retiming for clock period minimization
– Feasibility constraint
ω’(U,V) ≥ 0 ⇒ causality of the system
⇒ ω(U,V) ≥ r(U) - r(V) (one inequality per edge)
– Critical Path constraint
r(U) - r(V) ≤ W(U,V) - 1 for all vertices U and V in the graph
such that D(U,V) > c where c = target clock period. The two
quantities W(U,V) and D(U,V) are given as:
W(U,V) = min{w(p) : p is a path from U to V}
D(U,V) = max{t(p) : p is a path from U to V and w(p) = W(U,V)}
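[Figure: example DFG with nodes A, B, C, D, E, F, G; node computation times are shown in parentheses and edge delays are marked D and 2D.]
Example: W(A,E) = 1 and D(A,E) = 5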
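The two constraint families can be generated mechanically once W and D are known. The sketch below assumes a hypothetical 4-node DFG (not the one in the figure) whose W and D matrices were worked out by hand; the function name `clock_period_constraints` and the triple format are assumptions for illustration.

```python
def clock_period_constraints(edges, W, D, c):
    """Return (u, v, k) triples, each meaning r(u) - r(v) <= k."""
    constraints = []
    # Feasibility: w(U,V) + r(V) - r(U) >= 0 for every edge U -> V.
    for u, v, w in edges:
        constraints.append((u, v, w))
    # Critical path: any pair with D(U,V) > c needs at least one delay
    # on its minimum-delay path after retiming.
    n = len(W)
    for u in range(n):
        for v in range(n):
            if D[u][v] is not None and D[u][v] > c:
                constraints.append((u, v, W[u][v] - 1))
    return constraints


# Hypothetical DFG with computation times t = [1, 1, 2, 2].
edges = [(0, 2, 1), (0, 3, 2), (2, 1, 0), (3, 1, 0), (1, 0, 1)]
# W and D for this graph, precomputed by hand (an assumption of this sketch).
W = [[0, 1, 1, 2], [1, 0, 2, 3], [1, 0, 0, 3], [1, 0, 2, 0]]
D = [[1, 4, 3, 3], [2, 1, 4, 4], [4, 3, 2, 6], [4, 3, 6, 2]]

constraints_c2 = clock_period_constraints(edges, W, D, c=2)
constraints_c3 = clock_period_constraints(edges, W, D, c=3)
print(len(constraints_c2), len(constraints_c3))
```

Lowering the target period c adds critical path constraints, which is why a smaller c is harder to satisfy.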

• Algorithm to compute W(U,V) and D(U,V):
• Let M = tmax·n, where tmax is the maximum computation time of
the nodes in G and n is the number of nodes in G.
• Form a new graph G’ which is the same as G except the edge
weights are replaced by w’(e) = M·w(e) – t(U) for all edges
U → V.
• Solve the all-pairs shortest path problem on G’ using the
Floyd-Warshall algorithm. Let S’UV be the length of the shortest path from U to
V.
• If U ≠ V, then W(U,V) = ⌈S’UV/M⌉ and D(U,V) = M·W(U,V) -
S’UV + t(V). If U = V, then W(U,V) = 0 and D(U,V) = t(U).
• Using W(U,V) and D(U,V), the feasibility and critical path
constraints are formulated as a set of inequalities.
The inequalities are solved using a constraint graph; if a
feasible solution is obtained, the circuit can be
clocked with period c.
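The steps above can be sketched in a few lines. This is an illustrative implementation under stated assumptions: nodes are numbered from 0, the example DFG and the function name `compute_W_D` are hypothetical, and unreachable pairs are reported as `None`.

```python
import math


def compute_W_D(t, edges):
    """Compute W(U,V) and D(U,V) for a DFG.

    t[u] is the computation time of node u; edges are (u, v, delays).
    """
    n = len(t)
    M = max(t) * n
    # Shortest-path matrix on G', where w'(e) = M*w(e) - t(u).
    S = [[math.inf] * n for _ in range(n)]
    for u, v, w in edges:
        S[u][v] = min(S[u][v], M * w - t[u])
    # Floyd-Warshall all-pairs shortest paths.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if S[i][k] + S[k][j] < S[i][j]:
                    S[i][j] = S[i][k] + S[k][j]
    W = [[None] * n for _ in range(n)]
    D = [[None] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u == v:
                W[u][v], D[u][v] = 0, t[u]
            elif S[u][v] != math.inf:
                W[u][v] = -(-S[u][v] // M)  # integer ceiling of S'/M
                D[u][v] = M * W[u][v] - S[u][v] + t[v]
    return W, D


# Hypothetical 4-node DFG, chosen for illustration.
t = [1, 1, 2, 2]
edges = [(0, 2, 1), (0, 3, 2), (2, 1, 0), (3, 1, 0), (1, 0, 1)]
W, D = compute_W_D(t, edges)
for row_w, row_d in zip(W, D):
    print(row_w, row_d)
```

Scaling the delays by M makes the shortest path minimize the delay count first; subtracting t(u) then breaks ties in favor of the longest computation time among the minimum-delay paths.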

• Solving a system of inequalities: given M inequalities in N
variables, where each inequality is of the form ri – rj ≤ k for
integer values of k.
– Draw a constraint graph:
   – Draw the node i for each of the N variables ri, i = 1, 2,
   …, N.
   – Draw the node N+1.
   – For each inequality ri – rj ≤ k, draw the edge j → i of
   length k.
   – For each node i, i = 1, 2, …, N, draw the edge N+1 → i
   from the node N+1 to the node i with length 0.
– Solve using a shortest path algorithm:
   – The system of inequalities has a solution iff the
   constraint graph contains no negative cycles.
   – If a solution exists, one solution is where ri is the
   minimum length path from the node N+1 to node i.
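The constraint-graph procedure above can be sketched with the Bellman-Ford algorithm, which is one possible choice of shortest path algorithm (the slides do not name one here). The function name, the triple format and the example inequalities are assumptions made for illustration.

```python
import math


def solve_difference_constraints(n, constraints):
    """Solve r_i - r_j <= k for variables numbered 1..n.

    constraints holds (i, j, k) triples. Returns a list [r_1, ..., r_n],
    or None when the constraint graph has a negative cycle.
    """
    source = n + 1  # the extra node N+1
    # Each inequality r_i - r_j <= k becomes the edge j -> i of length k.
    graph_edges = [(j, i, k) for i, j, k in constraints]
    graph_edges += [(source, i, 0) for i in range(1, n + 1)]

    dist = {node: math.inf for node in range(1, n + 2)}
    dist[source] = 0
    # The graph has n + 1 nodes, so n relaxation passes suffice.
    for _ in range(n):
        for a, b, k in graph_edges:
            if dist[a] + k < dist[b]:
                dist[b] = dist[a] + k
    # Any further improvement means a negative cycle: no solution.
    for a, b, k in graph_edges:
        if dist[a] + k < dist[b]:
            return None
    return [dist[i] for i in range(1, n + 1)]


# Example system (assumed for illustration):
# r1 - r2 <= 0, r3 - r1 <= 5, r4 - r1 <= 4, r4 - r3 <= -1, r3 - r2 <= 2
system = [(1, 2, 0), (3, 1, 5), (4, 1, 4), (4, 3, -1), (3, 2, 2)]
solution = solve_difference_constraints(4, system)
print(solution)

# r1 - r2 <= -1 together with r2 - r1 <= 0 has no solution.
infeasible = solve_difference_constraints(2, [(1, 2, -1), (2, 1, 0)])
print(infeasible)
```

Bellman-Ford is used here because the edge lengths can be negative; the final pass over the edges doubles as the negative cycle test.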

• K-slow transformation
– Replace each D by kD

[Figure: original DFG; nodes A (1) and B (1) in a loop with one delay D.]
Titer = 2 ut
Clock   Operation
0       A0 → B0
1       A1 → B1
2       A2 → B2

After 2-slow transformation

[Figure: 2-slow DFG; nodes A (1) and B (1) in a loop with 2D.]
Tclk = 2 ut, Titer = 2×2 ut = 4 ut
Clock   Operation
0       A0 → B0
1       (null)
2       A1 → B1
3       (null)
4       A2 → B2

*New input samples are taken every alternate clock cycle.
*Null operations account for the odd clock cycles.
*Hardware is utilized only 50% of the time.
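• Retiming 2-slow graph
[Figure: retimed 2-slow DFG; nodes A and B with one delay D on each of the two edges of the loop.]

Tclk = 1 ut
Titer = 2×1 = 2 ut (Titer is the iteration period: the time between consecutive iterations, here 2 clock cycles.)

*Hardware Utilization = 50 %

*Hardware can be fully utilized if two independent operations are
available.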
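The two-node example can be reproduced numerically. The sketch below assumes the loop A → B → A with one delay on the feedback edge and unit computation times; the helper names and the retiming values r = [0, 1] are illustrative choices, and `critical_path` assumes the graph has no zero-delay cycle.

```python
from functools import lru_cache


def k_slow(edges, k):
    """Replace each D by kD."""
    return [(u, v, k * w) for u, v, w in edges]


def retime(edges, r):
    """Apply w'(U,V) = w(U,V) + r(V) - r(U) to every edge."""
    return [(u, v, w + r[v] - r[u]) for u, v, w in edges]


def critical_path(t, edges):
    """Longest computation time over paths that contain no delays."""
    successors = {u: [] for u in range(len(t))}
    for u, v, w in edges:
        if w == 0:
            successors[u].append(v)

    @lru_cache(maxsize=None)
    def longest_from(u):
        return t[u] + max((longest_from(v) for v in successors[u]), default=0)

    return max(longest_from(u) for u in range(len(t)))


t = [1, 1]                       # computation times of A and B
original = [(0, 1, 0), (1, 0, 1)]  # A -> B with 0 D, B -> A with 1 D
two_slow = k_slow(original, 2)
retimed = retime(two_slow, [0, 1])

t_clk = critical_path(t, retimed)
t_iter = 2 * t_clk  # a 2-slow circuit starts one iteration every 2 clocks
print(critical_path(t, original), critical_path(t, two_slow), t_clk, t_iter)
```

The k-slow step alone leaves the critical path at 2 ut; it only creates the extra delay that retiming then moves onto the A → B edge.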
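2-Slow Lattice Filter (Fig. 4.7)
Note: pipelining cannot be inserted here because none of the cutsets is feed-forward.

[Figure: a 100-stage lattice filter with the critical path and the critical loop marked. The critical path is 2 multiplications and 101 additions.]

Note: critical path = 6 < 7 (the iteration period bound)

[Figure: the 2-slow version.]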
[Figure: a retimed version of the 2-slow circuit with a critical path of 2 multiplications
and 2 additions.]

If Tm = 2 u.t. and Ta = 1 u.t., then
Tclk = 2×2 + 2×1 = 6 u.t., Titer = 2×6 = 12 u.t.

In the original lattice filter, Titer = 2×2 + 101×1 = 105 u.t.

Iteration period bound = 7 u.t.
Other Applications of Retiming
• Retiming for Register Minimization
(Section 4.4.3)
• Retiming for Folding (Chapter 6)
• Retiming for Power Reduction (Chap. 17)
• Retiming for Logic Synthesis (Beyond
Scope of This Class)
• Multi-Rate/Multi-Dimensional Retiming
(Denk/Parhi, Trans. VLSI, Dec. 98, Jun.99)
