Electronics: Efficient QC-LDPC Encoder For 5G New Radio
Article
Efficient QC-LDPC Encoder for 5G New Radio
Tram Thi Bao Nguyen, Tuy Nguyen Tan, and Hanho Lee *
Department of Information and Communication Engineering, Inha University, Incheon 22212, Korea;
baotram137@gmail.com (T.T.B.N.); nguyentantuy@gmail.com (T.N.T.)
* Correspondence: hhlee@inha.ac.kr; Tel.: +82-32-860-7449
Received: 22 May 2019; Accepted: 11 June 2019; Published: 13 June 2019
Abstract: This paper presents a novel efficient encoding method and a high-throughput
low-complexity encoder architecture for quasi-cyclic low-density parity-check (QC-LDPC) codes for
the 5th-generation (5G) New Radio (NR) standard. By storing the quantized value of the permutation
information for each submatrix instead of the whole parity check matrix, the required memory
storage size is considerably reduced. In addition, sharing techniques are employed to reduce the
hardware complexity. The encoding complexity of the proposed method was analyzed, and indicated
a substantial reduction in the required area as well as memory storage when compared with existing
state-of-the-art encoding approaches. The proposed method requires only 61% gate area, and 11%
ROM storage when compared with a similar LDPC encoder using the Richardson–Urbanke method.
Synthesis results on TSMC 65-nm complementary metal-oxide semiconductor (CMOS) technology
with different submatrix sizes were carried out, which confirmed that the design methodology is
flexible and can be adapted for multiple submatrix sizes. For all the considered submatrix sizes,
the throughput ranged from 22.1–202.4 Gbps, which sufficiently meets the throughput requirement
for the 5G NR standard.
1. Introduction
Low-density parity-check (LDPC) codes [1], which were first proposed by Gallager in the early
1960s and rediscovered by MacKay and Neal [2] in 1996, have attracted widespread attention thanks
to their remarkable error correction capabilities near the Shannon limit, with advancements in very
large-scale integration (VLSI). Moreover, LDPC codes are among the most widely used types of
forward error correction (FEC) codes in several communications standards such as the wireless local
area network (WLAN, IEEE 802.11n), wireless radio access network (WRAN, IEEE 802.22), digital
video broadcast (DVB), and the Advanced Television System Committee (ATSC). Recently, the fifth
generation (5G) communication has become a hotspot of research and development [3]. More specifically,
LDPC codes play an important role in 5G communication and have been selected as the coding
scheme for the 5G enhanced Mobile Broadband (eMBB) data channel [4]. To support rate-compatible
and scalable data transmission, the 3rd Generation Partnership Project (3GPP) has agreed to consider
two rate-compatible base graphs, BG1 and BG2, for the channel coding [5]. Accordingly, several studies
have been conducted on the 5G LDPC codes. In [6], a low-cost and flexible demonstration platform is
designed and implemented to evaluate the real-time performance of LDPC over the air interface as
defined by 5G New Radio (NR) specifications. An algebra-assisted method for constructing 5G LDPC
codes is presented in [7].
Over recent years, research on LDPC codes has been focused on structured LDPC codes known as
quasi-cyclic low-density parity-check (QC-LDPC) codes [8–12], which exhibit advantages over other
types of LDPC codes with respect to the hardware implementations of encoding and decoding using
simple shift registers and logic circuits. A low-complexity encoder can be realized by using QC-LDPC
codes, due to the sparseness of the parity check matrix. However, it is not straightforward to encode
with low complexity as LDPC codes are defined by their parity check matrix, and the generator matrix
is generally unknown. Various approaches have been suggested to reduce the hardware complexity of
LDPC encoders [13–21]. One of the most conventional approaches is systematic encoding, in which the
generator matrix is derived from the parity check matrix by exploiting Gaussian elimination. The main
drawback related to this method is that the storage overhead is dramatically increased for large block
sizes, which limits its practical applicability. The Richardson–Urbanke (RU) algorithm [13] is a widely used
LDPC encoding scheme. The underlying principle
of the method is the transformation of the parity check matrix into an approximate lower triangular
(ALT) form by using only row and column permutations, which preserves the sparseness of the matrix.
This method suffers from a long critical path, which could make the LDPC encoder unsuitable for
high throughput applications. To overcome the limitations of the previous approaches, the design
proposed in this paper, which is referred to as a low-complexity high-throughput LDPC encoder
architecture for the 5G standard, requires significantly less area and memory storage while maintaining
a high throughput.
This paper targets the design of low-complexity high-throughput QC-LDPC encoders for the
5G NR standard. In LDPC encoders, the memory and interconnecting blocks are considered as
the major influencing factors of the overall area, delay, and power performance of the hardware
design. Hence, the size of the read only memory (ROM) was decreased by storing the quantized
value of the permutation information for each submatrix instead of the entire parity check matrix H.
The proposed architecture requires fewer matrix multiplications than the RU method, by exploiting
the characteristics of the 5G NR base matrix. In addition, the proposed algorithm does not require
the inverse of the component matrix, which presents a primary advantage over the RU method.
Moreover, block-memories are not required to store the generator matrix G, and the number of
required components is reduced. The ROM size of the proposed method is 98.2% and 88.9% lower
than those of the G matrix method and RU method, respectively.
To assess the benefits of the proposed encoding approach, we further implement and synthesize
several QC-LDPC encoder architectures with different submatrix sizes Z = 30, 64, 96, 144, and 352.
The application specific integrated circuit (ASIC) post synthesis implementation results on TSMC
65-nm complementary metal-oxide semiconductor (CMOS) technology revealed an area efficiency of up
to 597 Gbps/mm² when the proposed encoding method was implemented. Hence, it can be concluded
that a promising encoding architecture design for 5G NR LDPC codes was developed in this study.
The remainder of this paper is organized as follows. Section 2 gives a brief overview of the
characteristics of 5G NR QC-LDPC codes. In Section 3, two conventional LDPC encoding algorithms
from the literature are outlined. A novel 5G NR QC-LDPC encoding approach and a low-complexity
high-throughput QC-LDPC encoder architecture are described in Section 4. Section 5 presents the
implementation and comparison results, followed by the conclusions in Section 6.
2. 5G NR QC-LDPC Codes
The NR access technology marks a transition in FEC coding for the 3GPP of cellular
technologies [22]. In this section, the QC-LDPC codes are reviewed, and the characteristics of standard
5G QC-LDPC codes are summarized. In addition, procedures are presented for the construction of the
parity check matrix of the target LDPC codes.
2.1. Preliminary
Let Z be the size of a circulant permutation matrix and $P_{i,j}$ be the shift value. For any integer
value $P_{i,j}$, $0 \le P_{i,j} \le Z$, a Z × Z circulant permutation matrix shifts the Z × Z identity matrix I to the
right by $P_{i,j}$ times for the (i, j)-th non-zero element in a base matrix. This binary circulant permutation
matrix is denoted as $Q(P_{i,j})$. Considering Q(1) as an example,

$$
Q(1) =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
1 & 0 & 0 & \cdots & 0
\end{bmatrix}. \tag{1}
$$
For simple notation, Q(−1) denotes the null matrix (all elements equal to zero) of the same size.
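As a minimal sketch of the definition above (the helper names `circulant` and `mat_vec_gf2` are illustrative and not taken from the paper), the following Python snippet builds Q(p) explicitly and checks that multiplying by it amounts to a pure rotation of the input block. Whether that rotation is called a left or a right shift depends on whether the block is treated as a row or a column vector; the code uses a column vector.

```python
def circulant(Z, p):
    """Return Q(p) as a list of rows; p == -1 gives the Z x Z null matrix."""
    if p == -1:
        return [[0] * Z for _ in range(Z)]
    # Row r of the identity has its 1 in column r; shifting the identity
    # to the right by p moves that 1 to column (r + p) mod Z.
    return [[1 if c == (r + p) % Z else 0 for c in range(Z)] for r in range(Z)]


def mat_vec_gf2(Q, x):
    """Multiply matrix Q by the column vector x over GF(2)."""
    return [sum(q * b for q, b in zip(row, x)) % 2 for row in Q]


Z = 5
Q1 = circulant(Z, 1)
x = [1, 0, 0, 1, 0]
y = mat_vec_gf2(Q1, x)

# Entry r of Q(p) x equals x[(r + p) mod Z], so no arithmetic is needed:
# the product is just a rotation of the list.
rotated = x[1:] + x[:1]
```

In hardware, this is why a circulant block can be applied by a barrel shifter instead of a full matrix multiplier.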
BG1 is targeted for larger block lengths (500 ≤ K ≤ 8448) and higher rates (1/3 ≤ R ≤ 8/9), whereas BG2 is targeted for
smaller block lengths (40 ≤ K ≤ 2560) and lower rates (1/5 ≤ R ≤ 2/3). The actual base graph usage
and the definition of the two matrices are detailed in the NR standard specification TS 38.212 [27].
The base graph that supports $K_{max}$ should support the following set of shift sizes Z, where $Z = a \times 2^j$
for a ∈ {2, 3, 5, 7, 9, 11, 13, 15} and 0 ≤ j ≤ 7.
Figure 1. Sketch of base parity check structure for the 5G NR QC-LDPC codes.
For base graphs BG1 and BG2, the number of shift coefficient designs is 8. All lift sizes are divided
into eight sets based on parameter a, where a is used for the definition of the lifting size $a \times 2^j$. The sets
of shift coefficients are listed in Table 1.
The shift value $P_{i,j}$ can be calculated using the function $P_{i,j} = f(V_{i,j}, Z)$, where $V_{i,j}$ is the shift
coefficient of the (i, j)-th element in the corresponding shift design. The function f is defined as
Equation (4), in which mod denotes the modulo arithmetic:

$$
P_{i,j} = f(V_{i,j}, Z) =
\begin{cases}
-1, & \text{if } V_{i,j} = -1, \\
\operatorname{mod}(V_{i,j}, Z), & \text{else.}
\end{cases} \tag{4}
$$
The following steps construct the parity check matrix of the target (N, K)
QC-LDPC code with a given information block size K and code rate R = K/N. For a base graph, $k_b$
denotes the number of information circulant columns; thus, if the lifting size is Z, $K = Z \times k_b$ nominally.
Step 1: Obtain the base graph BG1 or BG2 and determine the value of $k_b$ for the given K and R.
– For BG1: $k_b$ = 22.
– For BG2: $k_b$ = 10 if K > 640; $k_b$ = 9 if 560 < K ≤ 640; $k_b$ = 8 if 192 < K ≤ 560;
and $k_b$ = 6 elsewhere.
Step 2: Determine Z by selecting the minimum Z value in Table 2, such that $k_b \times Z \ge K$.
Step 3: After the lifting size Z is determined, the corresponding shift coefficient matrix is selected
from Table 1 {Set 1, Set 2, . . . , Set 8} according to the set that contains Z.
Step 4: Calculate the shift value $P_{i,j}$ by the modulo-Z operation, as given
in Equation (4).
Step 5: Replace each entry in the final exponent matrix with the corresponding circulant permutation
matrix or zero matrix of size Z × Z. The QC-LDPC code construction is then complete and
a parity check matrix H of size $m_b Z \times n_b Z$ is obtained. In 5G QC-LDPC codes, shortening
and puncturing are carried out to obtain the desired information lengths and rate adaptation.
Figure 2 presents an illustration of the encoding process of these codes.
Lifting sizes Z = a × 2^j (rows: j; columns: a)
 j \ a     2     3     5     7     9    11    13    15
   0       2     3     5     7     9    11    13    15
   1       4     6    10    14    18    22    26    30
   2       8    12    20    28    36    44    52    60
   3      16    24    40    56    72    88   104   120
   4      32    48    80   112   144   176   208   240
   5      64    96   160   224   288   352
   6     128   192   320
   7     256   384
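Steps 1, 2, and 4 of the construction procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and the list of lifting sizes is regenerated from $Z = a \times 2^j$ under the assumption that 384 is the largest admissible value, which matches the lifting-size table above.

```python
A_VALUES = (2, 3, 5, 7, 9, 11, 13, 15)
Z_MAX = 384  # assumed upper bound; it is the largest entry in the table

# Every lifting size Z = a * 2**j (0 <= j <= 7) that does not exceed Z_MAX.
LIFTING_SIZES = sorted(
    a * 2**j for a in A_VALUES for j in range(8) if a * 2**j <= Z_MAX
)


def select_kb(base_graph, K):
    """Step 1: number of information circulant columns k_b."""
    if base_graph == 1:
        return 22
    if K > 640:
        return 10
    if K > 560:
        return 9
    if K > 192:
        return 8
    return 6


def select_lifting_size(kb, K):
    """Step 2: smallest Z in the table such that kb * Z >= K."""
    for Z in LIFTING_SIZES:
        if kb * Z >= K:
            return Z
    raise ValueError("K is too large for the available lifting sizes")


def shift_value(V, Z):
    """Step 4 / Equation (4): -1 marks a null block, otherwise V mod Z."""
    return -1 if V == -1 else V % Z


kb = select_kb(2, 640)            # BG2 with K = 640
Z = select_lifting_size(kb, 640)  # smallest Z with 9 * Z >= 640
```

For example, BG2 with K = 640 gives $k_b$ = 9 and Z = 72, while BG1 with K = 8448 gives $k_b$ = 22 and Z = 384.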
$$
HC^T = 0^T, \tag{5}
$$
where C is the systematic codeword, which consists of the information bit vector S and parity
code vector P.
This section reviews two generic encoding methodologies for the implementation of
the LDPC encoder: the Gaussian elimination method and the RU method.
The codeword C is then obtained by multiplying the generator matrix G by the systematic bits S
as follows:
C = SG. (8)
The sequential LDPC encoder based on the multiplication of the G matrix requires a ROM to store
the generator matrix used to compute the codeword C. The main drawback of this approach is that,
unlike parity check matrix H, the corresponding generator matrix G will most likely not be sparse.
The complexity of this straightforward encoding algorithm is $O(N^2)$, where N is the number of bits
in a codeword. Therefore, the implementation of the matrix multiplication at the encoder results in
a very high complexity. For an arbitrary parity check matrix, the construction of G should be avoided
and encoding should be carried out using back substitution with H.
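To make Equation (8) concrete, the sketch below encodes with a dense generator matrix over GF(2). The (7,4) Hamming code used here is a toy stand-in chosen for brevity, not a code from the paper; the helper name `vec_mat_gf2` is likewise illustrative. The inner sum touches every entry of G, which is where the quadratic cost comes from when G is dense.

```python
def vec_mat_gf2(s, G):
    """Row vector s times matrix G over GF(2)."""
    n = len(G[0])
    return [sum(s[k] * G[k][col] for k in range(len(s))) % 2 for col in range(n)]


# Toy systematic code: G = [I | P], H = [P^T | I].
P = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1],
     [1, 1, 1]]
G = [[1 if r == c else 0 for c in range(4)] + P[r] for r in range(4)]
H = [[P[r][i] for r in range(4)] + [1 if i == c else 0 for c in range(3)]
     for i in range(3)]

s = [1, 0, 1, 1]
c = vec_mat_gf2(s, G)  # systematic codeword: message bits followed by parity

# A valid codeword has an all-zero syndrome H c^T.
syndrome = [sum(h * b for h, b in zip(row, c)) % 2 for row in H]
```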
Here, the matrix T has a lower triangular form with 1s along the diagonal, and all the entries
above the diagonal are 0s. By multiplying H from the left by

$$
\begin{bmatrix}
I & 0 \\
-ET^{-1} & I
\end{bmatrix}, \tag{10}
$$

where

$$
\begin{aligned}
\tilde{C} &= -ET^{-1}A + C, \\
\tilde{D} &= -ET^{-1}B + D, \\
\tilde{E} &= -ET^{-1}T + E = 0.
\end{aligned} \tag{12}
$$
The actual encoding step is performed by matrix-multiplication, forward-substitution and vector
addition operations. Let the codeword $C = [s \;\; p_1 \;\; p_2]$, where s represents the information bits,
$p_1$ denotes the first G parity check bits, and $p_2$ contains the remaining (M − G) parity check bits.
The codeword C must satisfy the parity check equation $HC^T = 0^T$. The two equations are then
expressed by:

$$
\begin{aligned}
As^T + Bp_1^T + Tp_2^T &= 0^T, \\
\tilde{C}s^T + \tilde{D}p_1^T + 0p_2^T &= 0^T.
\end{aligned} \tag{13}
$$
Using the RU method, the parity bits in the first parity portion $p_1$ depend only on
the information bits, given that E was cleared. Hence, $p_1$ can be calculated independently
of the parity bits in $p_2$. If $\tilde{D}$ is non-singular, then $p_1^T$ can be obtained from the second line of Equation (13):

$$
p_1^T = -\tilde{D}^{-1}\tilde{C}s^T.
$$

If $\tilde{D}$ is singular in GF(2), then it is necessary to further permute the columns of $\tilde{H}$ to eliminate this
singularity. Once $p_1$ is known, $p_2$ can be determined using the first line of Equation (13):

$$
p_2^T = -T^{-1}\left(As^T + Bp_1^T\right).
$$

Given that T is lower triangular, $p_2$ can be found using back substitution. The complexity
of this encoding procedure can be kept low since A, B and T are sparse. Tables 3 and 4 present the
complexity of calculation of $p_1^T$ and $p_2^T$, respectively. The complexity of the RU algorithm is given by
$O(N + G^2)$, where N is the block length and G is the gap to linear encoding. The gap is the
number of rows of the parity check matrix that cannot be set into a triangular form using only row and
column permutations. The smaller the gap G, the lower the encoding complexity of the code.
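The substitution step used to recover $p_2$ can be illustrated with a short sketch. It assumes a small dense lower-triangular T with 1s on the diagonal and a right-hand side b standing in for $As^T + Bp_1^T$; both values are made up for illustration, and the function name is hypothetical. Over GF(2) the minus signs vanish, so each unknown is simply the XOR of the right-hand side with the already-solved unknowns selected by the row.

```python
def solve_lower_triangular_gf2(T, b):
    """Solve T x = b over GF(2) for lower-triangular T with unit diagonal."""
    x = []
    for i, row in enumerate(T):
        acc = b[i]
        # Only entries left of the diagonal refer to already-known unknowns.
        for j in range(i):
            acc ^= row[j] & x[j]
        x.append(acc)  # T[i][i] == 1, so no division is needed
    return x


T = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]
b = [1, 0, 1]
x = solve_lower_triangular_gf2(T, b)

# Re-multiplying confirms the solution.
check = [sum(t * v for t, v in zip(row, x)) % 2 for row in T]
```

Because the rows are processed strictly in order, each unknown waits on the previous ones, which is one way to see why this step contributes to a long sequential path.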
The disadvantage of encoding using the RU method is that there is no exact programmable
step-by-step algorithm. The multiple matrix calculations in this algorithm significantly limit the
development of a rapid flexible encoder [28]. In addition, the RU method is subject to a long critical
path and odd constraints, which could render the LDPC encoder non-systematic [19].
The parity check matrix H of 5G NR QC-LDPC codes can be partitioned into six matrices and
presented in the following form:

$$
H =
\begin{bmatrix}
A & B & 0 \\
C_1 & C_2 & I
\end{bmatrix}, \tag{17}
$$
$$
C_1 s^T + C_2 p_a^T + I p_c^T = 0^T. \tag{21}
$$
The proposed algorithm is performed in two steps. In the initial step, the parity bits in the first
portion p a are computed by solving Equation (20). The second step in the encoding process includes
the computation of the pc parity portions using Equation (21).
The first step in the encoder implementation is the determination of the p a part. Initially,
Equation (20) is re-written in block form as follows:
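$$
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,k_b} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,k_b} \\
a_{3,1} & a_{3,2} & \cdots & a_{3,k_b} \\
a_{4,1} & a_{4,2} & \cdots & a_{4,k_b}
\end{bmatrix}
\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_{k_b} \end{bmatrix}
+
\begin{bmatrix}
1 & 0 & -1 & -1 \\
0 & 0 & 0 & -1 \\
-1 & -1 & 0 & 0 \\
1 & -1 & -1 & 0
\end{bmatrix}
\begin{bmatrix} p_{a1} \\ p_{a2} \\ p_{a3} \\ p_{a4} \end{bmatrix}
= 0. \tag{22}
$$

Expanding Equation (22) row by row gives

$$
\sum_{j=1}^{k_b} a_{1,j}s_j + p_{a1}^{(1)} + p_{a2} = 0, \tag{23}
$$

$$
\sum_{j=1}^{k_b} a_{2,j}s_j + p_{a1} + p_{a2} + p_{a3} = 0, \tag{24}
$$

$$
\sum_{j=1}^{k_b} a_{3,j}s_j + p_{a3} + p_{a4} = 0, \tag{25}
$$

$$
\sum_{j=1}^{k_b} a_{4,j}s_j + p_{a1}^{(1)} + p_{a4} = 0, \tag{26}
$$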
where $p_{a1}^{(\alpha)}$ denotes the α-th (right) cyclically shifted version of $p_{a1}$ for 0 ≤ α ≤ Z. By adding up all the
above equations, the following is obtained:

$$
p_{a1} = \sum_{i=1}^{4} \sum_{j=1}^{k_b} a_{i,j}s_j. \tag{27}
$$
It should be noted that a straightforward implementation of $a_{i,j}s_j$ can be done with the use of
Z-bit cyclic shifters. Since $a_{i,j}s_j$ is a circular right shift of $s_j$ with the shift coefficient defined by $a_{i,j}$,
the hardware complexity is trivial. Based on the definition

$$
\lambda_i = \sum_{j=1}^{k_b} a_{i,j}s_j \quad \text{for } i = 1, 2, 3, 4, \tag{28}
$$

the parity blocks follow from Equations (23)–(27):

$$
p_{a1} = \sum_{i=1}^{4} \lambda_i, \tag{29}
$$

$$
p_{a2} = \lambda_1 + p_{a1}^{(1)}, \tag{30}
$$

$$
p_{a3} = \lambda_3 + p_{a4}, \tag{31}
$$

$$
p_{a4} = \lambda_4 + p_{a1}^{(1)}. \tag{32}
$$
From Equation (28), each $\lambda_i$ value is computed by accumulating all the $a_{i,j}s_j$ values. In modulo-2 arithmetic,
$\lambda_i$ is obtained by XORing all the $a_{i,j}s_j$ terms. One $\lambda_i$ value is
computed per clock cycle, so all of them are available after g = 4 cycles. The first block of the parity bits $p_{a1}$ is then calculated by
accumulating all the $\lambda_i$ values. The remaining parity blocks $p_{ai}$ follow
directly from Equations (30)–(32). This takes two clock cycles because $p_{a3}$
depends on $p_{a4}$. All the parity bits $p_a$ in the first parity portion are stored in registers.
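The first encoding step can be sketched in software as follows. This is a minimal model under stated assumptions, not the authors' hardware: blocks are Python lists of Z bits, `rot` stands in for a barrel shifter (its rotation direction is a modelling choice), the shift table A and the message are random placeholders rather than 5G NR values, and $p_{a2}$ is obtained by rearranging Equation (23). The sketch then checks that the computed blocks satisfy Equations (23)–(26).

```python
import random


def rot(v, p):
    """Apply the circulant Q(p) to block v; p == -1 is the null matrix."""
    if p == -1:
        return [0] * len(v)
    p %= len(v)
    return v[p:] + v[:p]


def xor_blocks(*blocks):
    """Bitwise modulo-2 sum of equally sized blocks."""
    return [sum(bits) % 2 for bits in zip(*blocks)]


def encode_pa(A, s):
    """Compute lambda_1..4 (Eq. 28) and the four p_a blocks."""
    lam = [xor_blocks(*(rot(s[j], A[i][j]) for j in range(len(s))))
           for i in range(4)]
    pa1 = xor_blocks(*lam)            # Eq. (27): sum of all lambda_i
    pa1_shift = rot(pa1, 1)           # p_a1^(1)
    pa2 = xor_blocks(lam[0], pa1_shift)   # rearranged Eq. (23)
    pa4 = xor_blocks(lam[3], pa1_shift)   # Eq. (32)
    pa3 = xor_blocks(lam[2], pa4)         # Eq. (31), needs p_a4 first
    return lam, [pa1, pa2, pa3, pa4]


rng = random.Random(0)
Z, kb = 8, 3  # toy sizes, far smaller than any 5G NR configuration
A = [[rng.randrange(-1, Z) for _ in range(kb)] for _ in range(4)]
s = [[rng.randrange(2) for _ in range(Z)] for _ in range(kb)]

lam, (pa1, pa2, pa3, pa4) = encode_pa(A, s)

# Left-hand sides of Equations (23)-(26); each should be the zero block.
checks = [
    xor_blocks(lam[0], rot(pa1, 1), pa2),
    xor_blocks(lam[1], pa1, pa2, pa3),
    xor_blocks(lam[2], pa3, pa4),
    xor_blocks(lam[3], rot(pa1, 1), pa4),
]
```

Note that Equation (24) is never used to compute a parity block, yet it holds automatically, because every term other than $p_{a1}$ cancels in pairs when the four equations are summed.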
In a second step, the pc portion can be easily determined based on Equation (21), where matrices
C1 and C2 are given by
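$$
C_1 =
\begin{bmatrix}
c_{1,1} & c_{1,2} & \cdots & c_{1,k_b} \\
c_{2,1} & c_{2,2} & \cdots & c_{2,k_b} \\
\vdots & \vdots & \ddots & \vdots \\
c_{m_b-g,1} & c_{m_b-g,2} & \cdots & c_{m_b-g,k_b}
\end{bmatrix};
\quad
C_2 =
\begin{bmatrix}
c_{1,k_b+1} & c_{1,k_b+2} & \cdots & c_{1,k_b+g} \\
c_{2,k_b+1} & c_{2,k_b+2} & \cdots & c_{2,k_b+g} \\
\vdots & \vdots & \ddots & \vdots \\
c_{m_b-g,k_b+1} & c_{m_b-g,k_b+2} & \cdots & c_{m_b-g,k_b+g}
\end{bmatrix}. \tag{33}
$$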
Upon the application of Equation (21), the elements of pc can be computed using the
following equations:
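$$
\begin{aligned}
p_{c1} &= \sum_{j=1}^{k_b} c_{1,j}s_j + \sum_{j=1}^{g} c_{1,k_b+j}\,p_{aj}, \\
p_{c2} &= \sum_{j=1}^{k_b} c_{2,j}s_j + \sum_{j=1}^{g} c_{2,k_b+j}\,p_{aj}, \\
&\;\;\vdots \\
p_{c(m_b-g)} &= \sum_{j=1}^{k_b} c_{m_b-g,j}s_j + \sum_{j=1}^{g} c_{m_b-g,k_b+j}\,p_{aj}.
\end{aligned} \tag{34}
$$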
Similarly, $c_{i,j}s_j$ represents a circular shift of $s_j$ with the shift coefficient defined by $c_{i,j}$, and $c_{i,k_b+j}\,p_{aj}$
represents a circular shift of $p_{aj}$ with the shift coefficient defined by $c_{i,k_b+j}$. As soon as $c_{i,j}s_j$ and $c_{i,k_b+j}\,p_{aj}$
have been obtained, they can be used to determine the value of the corresponding parity bits in the
second parity portion $p_c$. Each block $p_{ci}$ can be computed in a single clock cycle. Hence, all the $p_c$ parity
bits can be acquired in $(m_b - g)$ clock cycles. The encoded codeword is then a combination of the
original message s and the two calculated parity portions $p_a$ and $p_c$.
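The second step, Equation (34), reduces to one XOR of shifted blocks per row. The sketch below is illustrative only: the helper names are hypothetical, `rot` again models a barrel shifter with an assumed rotation direction, and the tiny shift tables and bit patterns are invented for the example rather than taken from a 5G NR base graph.

```python
def rot(v, p):
    """Apply the circulant Q(p) to block v; p == -1 is the null matrix."""
    if p == -1:
        return [0] * len(v)
    p %= len(v)
    return v[p:] + v[:p]


def xor_blocks(*blocks):
    """Bitwise modulo-2 sum of equally sized blocks."""
    return [sum(bits) % 2 for bits in zip(*blocks)]


def encode_pc(C1, C2, s, pa):
    """One p_c block per row of [C1 C2], following Equation (34)."""
    pc = []
    for c1_row, c2_row in zip(C1, C2):
        terms = [rot(sj, c) for sj, c in zip(s, c1_row)]
        terms += [rot(pj, c) for pj, c in zip(pa, c2_row)]
        pc.append(xor_blocks(*terms))
    return pc


# Toy dimensions: Z = 4, k_b = 2, g = 1, and two rows in C1 and C2.
s = [[1, 0, 0, 0], [0, 1, 1, 0]]
pa = [[0, 0, 1, 1]]
C1 = [[1, -1],
      [0, 2]]
C2 = [[0],
      [-1]]

pc = encode_pc(C1, C2, s, pa)
```

Each row is independent of the others, which is consistent with computing one $p_{ci}$ block per clock cycle while reusing the same shifters and XOR trees.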
circular shift process. The vector addition of all the $\lambda_i$ components is then carried out by the XOR trees.
Each intermediate $\lambda_i$ value corresponding to Equation (28) is computed in one clock cycle and stored
in the λ_memory to be used later. Thus, the value of $p_{a1}$ can be obtained in g = 4 clock cycles, once all
$\lambda_i$ values are obtained and stored in memory. The remaining parity bits of $p_a$ can be obtained in 2 clock
cycles with the use of XOR gates.
The objective of the second step is the calculation of the parity bits in
the second portion $p_c$. According to Equation (34), the parity blocks $p_{ci}$ are obtained by the vector addition of
$c_{i,j}s_j$ and $c_{i,k_b+j}\,p_{aj}$. The contribution of $c_{i,j}s_j$ is computed by accumulating all the cyclic shift results of
$s_j$, and the contribution of $c_{i,k_b+j}\,p_{aj}$ by accumulating all the cyclic shift results of $p_{aj}$. In this step, the overall hardware complexity can be further decreased by exploiting the sharing
technique. More specifically, the barrel shifters and XOR trees are reused for the computation of $p_c$ in
this step. Control signals are generated by the controller block. The required number of Z-bit barrel shifters is g = 4.
The main blocks of the proposed architecture can be described as follows.
(1) Input/Output Buffer: the input buffer, which is implemented as a number of serial-input
parallel-output shift registers, stores the input systematic bits $s_i$ received by the encoder.
The output buffer is used to store the encoded codeword.
(2) Memory Blocks: two memory blocks are utilized, namely, one for the submatrix permutation
values, and the other for the accumulated values λ that correspond to matrix A. In Figure 4, the A ROM,
C1 ROM, and C2 ROM correspond to the ROMs that store the coefficients of matrix A, matrix $C_1$,
and matrix $C_2$, respectively. Under the assumption that $\tilde{q} = \lceil \log_2 Z \rceil$ bits represent the required word
length to store the permutation information for each submatrix, $\tilde{q}\,g\,k_b$, $\tilde{q}\,(m_b - g)\,k_b$, and $\tilde{q}\,(m_b - g)\,g$
bits are required to store matrices A, $C_1$, and $C_2$, respectively. A significant portion of the hardware
complexity of the LDPC encoder consists of the memory required to store the parity check matrix.
Unlike the RU method, the proposed algorithm does not require the inverse of the component matrix,
which is its primary advantage over the RU method. Compared with the Gaussian method, the
proposed architecture does not require block memories to store the generator matrix G, which
further decreases the number of required components. The λ_memory is implemented as a dual-port
random access memory (RAM) for storing $\lambda_i$ messages (i = 1, 2, . . . , g). Each memory word $\lambda_i$ consists
of Z bits, corresponding to one accumulated message of matrix A. Thus, a total of (g × Z) bits of
λ_memory are required for the proposed encoder.
(3) Barrel Shifters: barrel shifters are used to implement the cyclic shift permutations, according
to the shift values provided by the cyclic shifter controllers. It should be noted that the number of
cyclic shifters is equal to the number of message blocks, and the size of the barrel shifters is equal to
submatrix size Z.
(4) XOR Trees: in modulo-2 arithmetic, addition is implemented by carrying out an XOR
operation on all the elements.
(5) Controller: this block generates control signals, such as data_sel to indicate the step being
processed; and mem_en, to enable write access to the λ_memory.
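The memory expressions given for the ROMs and the λ_memory can be evaluated directly. In the sketch below, the function name is hypothetical, and the parameter values $k_b$ = 22, g = 4, and $m_b$ = 46 are assumed for BG1 ($m_b$ = 46 is not stated in this section; it is the value that reproduces the XOR-gate counts reported for Z = 16). The word length uses `(Z - 1).bit_length()`, which equals $\lceil \log_2 Z \rceil$ for integer Z ≥ 1 and avoids floating-point rounding.

```python
def memory_bits(Z, kb, g, mb):
    """Storage in bits implied by the word-length expressions above."""
    q = (Z - 1).bit_length()  # ceil(log2(Z)) using integer arithmetic only
    return {
        "q": q,
        "A_rom": q * g * kb,
        "C1_rom": q * (mb - g) * kb,
        "C2_rom": q * (mb - g) * g,
        "lambda_ram": g * Z,
    }


sizes = memory_bits(Z=352, kb=22, g=4, mb=46)
total_rom = sizes["A_rom"] + sizes["C1_rom"] + sizes["C2_rom"]
```

Under these assumptions the three ROMs together hold about 10.6 kbit for Z = 352, since only one shift value per submatrix is stored rather than the expanded parity check matrix.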
the number of clock cycles required per codeword for the encoding of the proposed encoder design
decreased to 14.8% of that of the RU method.
              Gaussian              RU                     Proposed
Flip-flops    k_b Z                 (k_b + g) Z            (k_b + g) Z
XOR gates     (k_b Z − 1) m_b Z     2 m_b + (m_b − g) Z    (k_b + 2g − 1) Z

Table 6. Comparison between Gaussian method, RU method, and proposed method for submatrix size
Z = 16.
              Gaussian     RU      Proposed
Flip-flops    352          416     416
XOR gates     258,336      764     464
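The symbolic expressions in the comparison above can be checked against the Z = 16 figures in Table 6. The sketch below assumes BG1 parameters $k_b$ = 22, g = 4, and $m_b$ = 46; the value of $m_b$ is an assumption not stated alongside the table, adopted here because it reproduces every entry. The function name is illustrative.

```python
def complexity(Z, kb, g, mb):
    """Flip-flop and XOR-gate counts from the symbolic comparison."""
    return {
        "Gaussian": {"flip_flops": kb * Z,
                     "xor_gates": (kb * Z - 1) * mb * Z},
        "RU": {"flip_flops": (kb + g) * Z,
               "xor_gates": 2 * mb + (mb - g) * Z},
        "Proposed": {"flip_flops": (kb + g) * Z,
                     "xor_gates": (kb + 2 * g - 1) * Z},
    }


table6 = complexity(Z=16, kb=22, g=4, mb=46)
ratio = table6["Proposed"]["xor_gates"] / table6["RU"]["xor_gates"]
```

With these values the proposed design uses 464 XOR gates against 764 for RU, a ratio of roughly 0.61.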
The ASIC post-synthesis implementation results on TSMC 65-nm CMOS technology are shown in
Table 7, for various QC-LDPC encoders with expansion factors Z = 30, 64, 96, 144, and 352, which are
indicated in the table as BG1-Z30, BG1-Z64, BG1-Z96, BG1-Z144, and BG1-Z352, respectively. In Table 7,
q̃ size denotes the word length required to store the shift sizes while CPC stands for the number of
clock cycles required per codeword for encoding. Note that all input data bits were assumed to be
available for encoding, and the serialization factors are not included in the results. In the proposed
design, the CPC is equal to the maximum number of clock cycles required for the calculation of the $p_a$
and $p_c$ parity check bits. The computation of $p_a$ requires (g + 2) clock cycles, in which g clock cycles
are used to compute all the λ values and $p_{a1}$, and two extra clock cycles are required for the computation
of the remaining parity bits in the $p_a$ portion. The computation of $p_c$ requires $(m_b - g)$ clock cycles.
Hence, this method requires $(m_b + 2)$ clock cycles in total. The information throughput reported in
Table 7 is given by the formula

$$
\text{Throughput} = \frac{m_b \times Z \times f_{max}}{\text{CPC}}, \tag{35}
$$

where $f_{max}$ is the maximum operating frequency (post synthesis). For different submatrix sizes,
the throughput varied from 22.1–202.4 Gbps. In Table 7, the occupied areas are also reported.
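Equation (35) and the cycle count CPC = $m_b$ + 2 can be exercised with a short calculation. The clock frequency used below (600 MHz) is a hypothetical input, not a value quoted from Table 7, and $m_b$ = 46 is assumed for BG1; the function name is illustrative.

```python
def throughput_gbps(mb, Z, f_max_hz, cpc):
    """Equation (35), converted from bit/s to Gbps."""
    return mb * Z * f_max_hz / cpc / 1e9


mb, g = 46, 4                 # assumed BG1 parameters
cpc = (g + 2) + (mb - g)      # cycles for p_a plus cycles for p_c
f_max_hz = 600e6              # hypothetical post-synthesis clock frequency

t_z352 = throughput_gbps(mb, 352, f_max_hz, cpc)
```

Under these assumptions the formula gives about 202.4 Gbps for Z = 352, which coincides with the upper end of the reported range; the actual synthesized frequencies would have to be read from Table 7.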
It should be noted that there is a significant increase in the core area when processing larger
submatrix sizes. Since the encoder architecture for a larger submatrix size Z requires a larger $\tilde{q}$ size,
additional memory and hardware components are required. The results show that the encoding complexity
of the proposed design is linearly proportional to the submatrix size Z of the code. To keep the
throughput comparison on an equal basis, the throughput-to-area ratio metric was further defined as
TAR = Throughput/Area (Gbps/mm²). For all the considered submatrix sizes in Table 7, the TAR
ranged from 520–597 Gbps/mm².
Based on the implementation results presented above, it is clear that the design methodology
is applicable to different submatrix sizes and offers a significantly high area efficiency and high
information throughput, which is more than enough to satisfy the throughput requirement for the 5G
NR standard.
Table 7. ASIC implementation results of LDPC encoders for different lifting sizes Z = 30, 64, 96, 144,
and 352.
6. Conclusions
In this paper, a novel low-complexity high-throughput encoder approach for the 5G NR standard
is proposed. Based on the proposed encoding algorithm, five encoder architectures with different
submatrix sizes were implemented. The derived architecture exhibited a significantly lower hardware
complexity, as it decreased the memory and logic component requirements. The proposed design
demonstrates superior performance compared with the alternative methods. Moreover, the synthesis results
revealed that the proposed design is appropriate for the high throughput 5G standard.
Author Contributions: T.T.B.N. conceptualized the idea of this research, conducted experiments, collected data,
and prepared the original version. T.N.T. reviewed, analyzed data, and updated the manuscript. H.L. supervised,
validated, reviewed, and supported the research with funding.
Funding: This work was supported by the INHA UNIVERSITY Research Grant.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Gallager, R.G. Low-Density Parity-Check Codes; MIT Press: Cambridge, MA, USA, 1963.
2. MacKay, D.J.C.; Neal, R.M. Near Shannon Limit Performance of Low-Density Parity-Check Codes.
Electron. Lett. 1996, 32, 1645–1646. [CrossRef]
3. Huo, Y.; Dong, X.; Xu, W. 5G Cellular User Equipment: From Theory to Practical Hardware Design.
IEEE Access 2017, 5, 13992–14010. [CrossRef]
4. Session Chairman (Nokia). Chairman’s Notes of Agenda Item 7.1.5 Channel Coding and Modulation, 3GPP
TSG RAN WG1 Meeting No. 87, R1-1613710 (2016). Available Online: https://portal.3gpp.org/ngppapp/
CreateTdoc.aspx?mode=view&contributionId=752413 (accessed on 22 May 2019).
5. Richardson, T.; Kudekar, S. Design of Low-Density Parity Check Codes for 5G New Radio. IEEE Commun. Mag.
2018, 56, 28–34. [CrossRef]
6. Ji, W.; Wu, Z.; Zheng, K.; Zhao, L.; Liu, Y. Design and Implementation of a 5G NR System Based on LDPC in
Open Source SDR. In Proceedings of the 2018 IEEE Globecom Workshops (GC Wkshps), Abu Dhabi, UAE,
9–13 December 2018; pp. 1–6.
7. Tang, H.; Xu, J.; Kou, Y.; Lin, S.; Abdel-Ghaffar, K. On Algebraic Construction of Gallager and Circulant
Low-Density Parity-Check Codes. IEEE Trans. Inf. Theory 2004, 50, 1269–1279. [CrossRef]
8. Ajaz, S.; Nguyen, T.T.B.; Lee, H. An Area-Efficient Half-Row Pipelined Layered LDPC Decoder Architecture.
J. Semicond. Technol. Sci. 2017, 17, 845–853. [CrossRef]
9. Nguyen, T.T.B.; Lee, H. Low-Complexity Multi-mode Multi-way Split-row Layered LDPC Decoder for Gigabit
Wireless Communications. Integration 2018. [CrossRef]
10. Ajaz, S.; Lee, H. Efficient multi-Gb/s multi-mode LDPC decoder architecture for IEEE 802.11ad applications.
Integration 2015, 51, 21–36. [CrossRef]
11. Ajaz, S.; Lee, H. An efficient radix-4 Quasi-cyclic shift network for QC-LDPC decoders. IEICE Electron. Express
2014, 11, 1–6. [CrossRef]
12. Ajaz, S.; Lee, H. Reduced-complexity Local Switch Based Multi-mode QC-LDPC Decoder Architecture for
Gigabit Wireless Communications. IET Electron. Lett. 2013, 49, 1246–1248. [CrossRef]
13. Richardson, T.J.; Urbanke, R.L. Efficient Encoding of Low-density Parity-check Codes. IEEE Trans. Inf. Theory
2001, 47, 638–656. [CrossRef]
14. Khodaiemehr, H.; Kiani, D. Construction and Encoding of QC-LDPC Codes Using Group Rings. IEEE Trans.
Inf. Theory 2017, 63, 2039–2060. [CrossRef]
15. Huang, Q.; Tang, L.; He, S.; Xiong, Z.; Wang, Z. Low-Complexity Encoding of Quasi-Cyclic Codes Based on
Galois Fourier Transform. IEEE Trans. Commun. 2014, 62, 1757–1767. [CrossRef]
16. Li, Z.; Chen, L.; Zeng, L.; Lin, S.; Fong, W. Efficient Encoding of Quasi-cyclic Low-density Parity-check Codes.
IEEE Trans. Commun. 2006, 54, 71–81. [CrossRef]
17. Ilani, I. Designing and Encoding QC-LDPC Codes Using Matrices over Commutative Rings. In Proceedings
of the 2016 IEEE International Conference on the Science of Electrical Engineering (ICSEE), Eilat, Israel, 16–18
November 2016; pp. 1–5.
18. Jung, Y.; Chung, C.; Jung, Y.; Kim, J. 7.7 Gbps Encoder Design for IEEE 802.11ac QC-LDPC Codes. J. Semicond.
Technol. Sci. 2014, 14, 419–425. [CrossRef]
19. Cohen, A.E.; Parhi, K.K. A Low-Complexity Hybrid LDPC Code Encoder for IEEE 802.3an (10GBase-T)
Ethernet. IEEE Trans. Signal Process. 2009, 57, 4085–4094. [CrossRef]
20. Zhang, P.; Liu, C.; Jiang, L. Efficient Encoding of QC-LDPC Codes Based on Rotate-left-accumulator Circuits.
Electron. Lett. 2013, 49, 810–812. [CrossRef]
21. Jung, Y.; Kim, J. Memory-efficient and High-speed LDPC Encoder. Electron. Lett. 2010, 46, 1035–1036.
[CrossRef]
22. Li, H.; Bai, B.; Mu, X.; Zhang, J.; Xu, H. Algebra-Assisted Construction of Quasi-Cyclic LDPC Codes for 5G
New Radio. IEEE Access 2018, 6, 50229–50244. [CrossRef]
23. Chen, L.; Xu, J.; Djurdjevic, I.; Lin, S. Near-Shannon-limit Quasi-cyclic Low-density Parity-check Codes.
IEEE Trans. Commun. 2004, 52, 1038–1042. [CrossRef]
24. Chen, L.; Lan, L.; Djurdjevic, I.; Lin, S.; Abdel-Ghaffar, K. An Algebraic Method for Constructing Quasi-cyclic
LDPC Codes. In Proceedings of the International Symposium on Information Theory and Its Applications,
Parma, Italy, 10–13 October 2004; pp. 535–539.
25. Li, J.; Lin, S.; Abdel-Ghaffar, K.; Ryan, W.; Costello, D.J., Jr. LDPC Code Designs, Constructions, and Unification;
Cambridge University Press: Cambridge, UK, 2017.
26. Chen, T.; Vakilinia, K.; Divsalar, D.; Wesel, R.D. Protograph Based Raptor-like LDPC Codes. IEEE Trans.
Commun. 2015, 63, 1522–1532. [CrossRef]
27. Ad-Hoc chair (Nokia). Chairman’s Notes of Agenda Item 7.1.4. Channel Coding, 3GPP TSG RAN WG1
Meeting AH 2, R1-1711982 (2017). Available Online: https://portal.3gpp.org/ngppapp/CreateTdoc.aspx?
mode=view&contributionId=805088 (accessed on 22 May 2019).
28. Yasotharan, H.; Carusone, A.C. A Flexible Hardware Encoder for Systematic Low-density Parity-check Codes.
In Proceedings of the 52nd IEEE International Midwest Symposium on Circuits and Systems, Cancun, Mexico,
2–5 August 2009; pp. 54–57.
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).