A 170 Mbps (8176, 7156) Quasi-Cyclic LDPC Decoder Implementation with FPGA
I. INTRODUCTION
\[
H = \begin{bmatrix}
H_{1,1} & H_{1,2} & \cdots & H_{1,16} \\
H_{2,1} & H_{2,2} & \cdots & H_{2,16}
\end{bmatrix}
\]
Each submatrix is a circulant matrix with both column and row
weight of 2.
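The block structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code's actual definition: the positions of the ones in each circulant are made-up values, and a toy circulant size of 7 is used so the result is easy to inspect. For the real code, a length of 8176 spread over 16 block columns implies 511 x 511 circulants. The helper names `circulant` and `qc_parity_check` are hypothetical.

```python
def circulant(size, ones_in_first_row):
    """Return a size x size binary circulant as a list of rows.

    Row i is the first row cyclically shifted right by i positions.
    """
    first = [1 if j in ones_in_first_row else 0 for j in range(size)]
    return [first[size - i:] + first[:size - i] for i in range(size)]


def qc_parity_check(size, first_row_positions):
    """Stack circulants into H.

    first_row_positions[r][c] lists the one positions in the first row
    of block (r, c).
    """
    h = []
    for block_row in first_row_positions:
        blocks = [circulant(size, set(p)) for p in block_row]
        for i in range(size):
            h.append([bit for block in blocks for bit in block[i]])
    return h


# Toy stand-in: 2 x 16 grid of 7 x 7 circulants, each with two ones in its
# first row. The one positions are hypothetical, chosen only to be distinct.
SIZE, BLOCK_ROWS, BLOCK_COLS = 7, 2, 16
positions = [[(c % SIZE, (c + r + 1) % SIZE) for c in range(BLOCK_COLS)]
             for r in range(BLOCK_ROWS)]
H = qc_parity_check(SIZE, positions)

row_weights = {sum(row) for row in H}
col_weights = {sum(col) for col in zip(*H)}
print(len(H), len(H[0]), row_weights, col_weights)
```

Because every circulant has row and column weight 2, each row of the toy H has weight 32 (16 blocks) and each column has weight 4 (2 blocks), which is what the two weight sets report.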
Among the various LDPC decoding algorithms, the Sum-Product
Algorithm (SPA) has the best decoding performance. The
conventional SPA has unbalanced computational complexity between
the variable-to-check and check-to-variable message updating
phases, which leads to unbalanced datapaths between the processing
units for the two phases. To balance the computation load and
shorten the critical path, a modified version based on algorithmic
transformation was proposed in [10]. The reformulated algorithm for
check node processing, variable node processing, and tentative
decoding is expressed as follows:
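\begin{align}
R_{cv} &= \prod_{n \in N(c)\setminus v} \operatorname{sign}(L_{cn}) \sum_{n \in N(c)\setminus v} \Psi(L_{cn}) \tag{1} \\
L_{cv} &= \sum_{m \in M(v)\setminus c} \bigl(\operatorname{sign}(R_{mv})\,\Psi(R_{mv})\bigr) - \frac{2 r_v}{\sigma^2} \tag{2} \\
L_{v} &= \sum_{m \in M(v)} \bigl(\operatorname{sign}(R_{mv})\,\Psi(R_{mv})\bigr) - \frac{2 r_v}{\sigma^2} \tag{3}
\end{align}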
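The three updates (check node processing, variable node processing, and tentative decoding) can be sketched in floating point as follows. This is a hedged illustration rather than the decoder's implementation: the nonlinearity `psi` is not defined in this excerpt and is assumed here to be log(tanh(|x|/2)), `r_v` is assumed to be the received channel value, and `sigma2` the noise variance. The function and argument names are hypothetical.

```python
import math


def psi(x):
    # Assumed nonlinearity (not defined in this excerpt): log(tanh(|x| / 2)).
    ax = max(abs(x), 1e-12)  # clamp to avoid log(0)
    return math.log(math.tanh(ax / 2.0))


def sign(x):
    return -1.0 if x < 0 else 1.0


def check_node_update(l_c):
    """Check node processing for one check node c.

    l_c maps each neighbour v in N(c) to the incoming message L_cv.
    Returns R_cv for every v: the product of signs times the sum of
    psi values, both taken over N(c) excluding v.
    """
    out = {}
    for v in l_c:
        s, acc = 1.0, 0.0
        for n, val in l_c.items():
            if n != v:
                s *= sign(val)
                acc += psi(val)
        out[v] = s * acc
    return out


def variable_node_update(r_in, r_v, sigma2):
    """Variable node processing and tentative decoding for one node v.

    r_in maps each neighbour m in M(v) to the incoming message R_mv.
    Returns (L_cv for every c, L_v).
    """
    terms = {m: sign(r) * psi(r) for m, r in r_in.items()}
    total = sum(terms.values())
    channel = 2.0 * r_v / sigma2
    l_cv = {c: total - terms[c] - channel for c in terms}
    l_v = total - channel
    return l_cv, l_v
```

Both phases in this sketch reduce to a table look-up of `psi` followed by additions, which is the balance between the two processing units that the reformulation aims for.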
ISCAS 2006
[...] (CPU), which performs [...] message updating. In the
variable-to-check message updating phase, the two data connected by
an arrow are sent to VPU_0 and VPU_1 in the same clock cycle,
respectively. Here, VPU_0 and VPU_1 are the VPU components for
even-column and odd-column data computation.
It can be seen that the data located in the even columns are [...]
Figure. Variable node processing unit architecture.
Figure 3. [...] (diagram labels: MEM_E, MEM_O, CPU_E, and data entries P(i,j))
[...] variable-to-check message updating phase. If the memory is
partitioned in a straightforward way, the last data in the even row
should be stored in the last location of memory sub-bank MEM_E.
Consequently, a data access conflict, indicated by the dashed arrow,
occurs when the two data from columns 4 and 5 are retrieved from the
memory sub-bank MEM_E in the same clock cycle. In our design, the
last data in the even row is stored in the odd memory sub-bank to
eliminate the data access conflict. In this way, a pair of
multiplexers is needed to steer the displaced data between MEM_O and
CPU_E. In this figure, symbol I represents a fixed [...]
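The access conflict and its remedy can be illustrated with a toy model. This sketch assumes single-port sub-banks, five data items per row placed alternately in MEM_E and MEM_O, and a read schedule that fetches two items per clock cycle and wraps around at the end of the row. The item names, the schedule, and the helper `find_conflicts` are hypothetical and do not reproduce the paper's actual address mapping.

```python
def find_conflicts(placement, schedule):
    """Return the cycles in which both reads hit the same sub-bank.

    placement maps an item name to its sub-bank; schedule is a list of
    (item, item) pairs, one pair per clock cycle.
    """
    conflicts = []
    for cycle, (a, b) in enumerate(schedule):
        if placement[a] == placement[b]:
            conflicts.append(cycle)
    return conflicts


items = ["d0", "d1", "d2", "d3", "d4"]

# Straightforward partition: even-indexed items in MEM_E, odd-indexed in MEM_O.
naive = {name: ("MEM_E" if i % 2 == 0 else "MEM_O")
         for i, name in enumerate(items)}

# Modified partition: the last item of the row is displaced to MEM_O.
modified = dict(naive)
modified["d4"] = "MEM_O"

# Two items per cycle; the last pair wraps around to the start of the row.
schedule = [("d0", "d1"), ("d2", "d3"), ("d4", "d0")]

print(find_conflicts(naive, schedule))
print(find_conflicts(modified, schedule))
```

In this toy schedule the straightforward placement collides in the wrap-around cycle, because the first and last items both sit in MEM_E, while the displaced placement does not. The displaced item then has to be routed back to its processing unit, which is the role of the multiplexers mentioned above.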
[Figure: BER curves for 6-bit uniform and 6-bit non-uniform fixed-point (6:3) quantization.]
[Figure: control diagram with labels Control Signals, State, Memory Read, and Memory Write.]
B. Non-uniform Quantization
The new architectures for the CPU and VPU with the non-uniform
quantization scheme are shown in Fig. 7 and Fig. 8, respectively.
The uniform to non-uniform quantization converter, U2NU, is
implemented with simple combinational logic. The look-up tables
(LUTs) for the CPU and VPU are the same.
V. FPGA IMPLEMENTATION

Resource        6-bit uniform quantization    6-bit non-uniform quantization
                Used    Utilization ratio     Used    Utilization ratio
Slices          [...]   39%                   38266   56%
4-input LUTs    28229   41%                   36848   54%
Block RAMs      128     88%                   128     88%

(Unplaced table values: 27460, 81%, 268%.)

VI. CONCLUSION

REFERENCES
[...] parity-check codes," IEEE Trans. on Inform. Theory, vol. 47, pp. 638-656, Feb. 2001.
[7] D.-U. Lee, W. Luk, C. Wang, and C. Jones, "A flexible hardware encoder for low-density parity-check codes," IEEE Symp. on FCCM'04, pp. 101-111.
[8] T. Zhang and K. Parhi, "A 54 Mbps (3,6)-regular FPGA LDPC decoder," IEEE SiPS'2002, pp. 127-132, 2002.
[9] Y. Chen and D. Hocevar, "A FPGA and ASIC implementation of rate 1/2, 8088-b irregular low density parity check decoder," IEEE GLOBECOM'03, vol. 1, pp. 113-117, Dec. 2003.
[10] Z. Wang, Y. Chen, and K. Parhi, "Area-efficient decoding of quasi-cyclic low density parity check codes," ICASSP 2004, vol. 5, pp. 49-52, May 2004.
[11] M. Karkooti and J. R. Cavallaro, "Semi-parallel reconfigurable architectures for real-time LDPC decoding," ITCC'2004, vol. 1, pp. 579-585, Apr. 2004.
[12] J. Chen and M. Fossorier, "Near optimum universal belief propagation based decoding of low-density parity check codes," IEEE Trans. Commun., vol. 50, pp. 406-414, Mar. 2002.
[13] T. Zhang, Z. Wang, and K. K. Parhi, "On finite precision implementation of low density parity check codes decoder," The 2001 IEEE International [...]
[14] Z. Wang and Q. Jia, "Low complexity, high speed decoder architecture for quasi-cyclic LDPC codes," The 2005 IEEE International Symposium on Circuits and Systems, pp. 5786-5789, May 2005.

Figure 7. Check node processing unit with non-uniform quantization.