Fast Sequential Decoding of Polar Codes
Abstract
A new score function is proposed for stack decoding of polar codes, which enables one
to accurately compare paths of different lengths. The proposed score function includes a bias
term, which reflects the average behaviour of the correct path. This enables a significant
complexity reduction with respect to the original stack algorithm at the expense of a
negligible performance loss.
1 Introduction
Polar codes were recently shown to be able to achieve the symmetric capacity of memoryless channels,
while having low-complexity construction, encoding and decoding algorithms [1]. However,
the performance of polar codes of moderate length is substantially worse than that of the LDPC and
turbo codes used in today's communication systems. This is due both to the suboptimality of the
classical successive cancellation (SC) decoding method and to the poor minimum distance of polar codes.
The list decoding algorithm introduced in [2] enables one to implement near-maximum-likelihood decoding
of polar codes with complexity O(Ln log n), where n is the code length and L is the list size. Furthermore,
the performance of polar codes concatenated with a CRC outer code [2] and of polar subcodes [3, 4]
under list decoding appears to be better than that of known LDPC and turbo codes.
However, the complexity of the Tal-Vardy list decoding algorithm turns out to be rather high.
It can be reduced by employing sequential decoding techniques [5, 6, 7]. These methods avoid
construction of many useless low-probability paths in the code tree; processing of such paths
constitutes most of the computational burden of the list decoder. In this paper we show that
by careful weighting of paths of different lengths, one can significantly reduce the computational
complexity of the decoder. The proposed path score function aims to estimate the conditional
probability of the most likely codeword of a polar code that may be obtained as a continuation
of the considered path in the code tree. It turns out that such a function can be well approximated
by the path score of the min-sum list SC decoder biased by its expected value. Simulation results
indicate that the proposed approach results in a significant reduction of the average number of
iterations performed by the decoder.
The paper is organized as follows. Section 2 provides some background on polar codes and
polar subcodes. Section 3 introduces the proposed sequential decoding method. Its improvements
are discussed in Section 4. Simulation results illustrating the performance and complexity of the
proposed algorithm are provided in Section 5. Finally, some conclusions are drawn.
∗ The authors are with the Distributed Computing and Networking Department, Saint Petersburg Polytechnic
2 Polar codes
2.1 Code construction
An (n = 2^m, k), or (n, k, d), polar code over F_2 is a linear block code generated by
k rows of the matrix A_m = B_m F^{⊗m}, where B_m is the bit reversal permutation matrix,

\[
F = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix},
\]

F^{⊗m} denotes the m-fold Kronecker product of F with itself, and d is the minimum distance
of the code [1]. Hence, a codeword of a classical polar code is obtained as
c_0^{n−1} = u_0^{n−1} A_m, where a_s^t = (a_s, ..., a_t), u_i = 0 for i ∈ F,
F ⊂ {0, ..., n − 1} is the set of n − k frozen symbol indices, and the remaining
symbols of the information vector u_0^{n−1} are set to the data symbols being encoded.
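For illustration (a sketch, not a production encoder), the matrix A_m and the encoding map can be computed directly; the information vector below is an arbitrary example:

```python
import numpy as np

def arikan_matrix(m):
    """A_m = B_m F^{(x)m} over F_2, with B_m the bit-reversal permutation."""
    F = np.array([[1, 0], [1, 1]], dtype=int)
    Fm = np.array([[1]], dtype=int)
    for _ in range(m):
        Fm = np.kron(Fm, F)                    # m-fold Kronecker power of F
    n = 1 << m
    rev = [int(format(i, f"0{m}b")[::-1], 2) for i in range(n)]
    B = np.eye(n, dtype=int)[rev]              # bit-reversal permutation matrix
    return B @ Fm % 2

m = 3
n = 1 << m
A = arikan_matrix(m)
assert ((A @ A) % 2 == np.eye(n, dtype=int)).all()   # A_m is an involution over F_2
u = np.array([0, 0, 0, 1, 0, 1, 1, 0])               # frozen symbols already set to 0
c = u @ A % 2                                        # codeword c_0^{n-1} = u_0^{n-1} A_m
assert (c @ A % 2 == u).all()                        # re-encoding recovers u
```

Since A_m is an involution over F_2, applying it twice recovers the information vector, which makes the last assertion a convenient sanity check.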
Let U_0, ..., U_{n−1} and Y_0, ..., Y_{n−1} be the random variables corresponding to the input symbols
of the polarizing transformation A_m and the output symbols of a memoryless output-symmetric channel,
respectively. It is possible to show that the matrix A_m transforms the original binary-input memoryless
output-symmetric channel W_0^{(0)}(Y|C) = W(Y|C) into bit subchannels W_m^{(i)}(Y, U_0^{i−1}|U_i).
The capacities of these subchannels converge with m to 0 or 1, and the fraction of subchannels
with capacity close to 1 converges to the capacity of W_0^{(0)}(Y|C). Here Y = Y_0^{n−1}, and C ∈ F_2 is
the random variable corresponding to the channel input.
The conventional approach to the construction of an (n, k) polar code assumes that F is the set of
n − k indices i of the bit subchannels W_m^{(i)}(Y_0^{n−1}, U_0^{i−1}|U_i) with the highest error probability.
It was suggested in [3] to set some frozen symbols U_i, i ∈ F, not to zero, but to linear combinations of
other symbols, i.e. the random variables should satisfy

\[
U_i = \sum_{s=0}^{i-1} V_{j_i,s} U_s, \tag{1}
\]

where V is an (n − k) × n binary matrix such that its rows have their last non-zero values in distinct
columns, and j_i is the index of the row having its last non-zero element in column i. Alternatively,
these constraints can be stated as U_0^{n−1} V^T = 0. This special shape of the matrix V enables one
to implement decoding of polar subcodes using straightforward generalizations of the successive
cancellation algorithm and its extensions.
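To make constraint (1) concrete, the following sketch (an illustration, not the authors' implementation; the matrix V below is a made-up toy example) evaluates dynamically frozen symbols during encoding: each row of V freezes the symbol at the column of its last non-zero entry to a parity of earlier symbols.

```python
import numpy as np

def freeze_map(V):
    """For each row of V, the column of its last non-zero entry is frozen;
    the frozen value is the mod-2 sum of the earlier symbols selected by that row."""
    return {int(np.flatnonzero(row)[-1]): row for row in V}

def fill_frozen(data_bits, V, n):
    """Build the input vector u: frozen positions get dynamic values per (1),
    the remaining positions carry data bits (processed in index order)."""
    fmap = freeze_map(V)
    u = np.zeros(n, dtype=int)
    it = iter(data_bits)
    for i in range(n):
        if i in fmap:
            u[i] = int(fmap[i][:i] @ u[:i]) % 2   # U_i = sum_s V_{j_i,s} U_s
        else:
            u[i] = next(it)
    return u

# toy example: n = 8, two dynamic freezing constraints (hypothetical V)
V = np.array([[1, 0, 1, 1, 0, 0, 0, 0],    # freezes u_3 = u_0 + u_2
              [0, 1, 0, 0, 1, 1, 0, 0]])   # freezes u_5 = u_1 + u_4
u = fill_frozen([1, 0, 1, 1, 0, 1], V, 8)
assert u[3] == (u[0] + u[2]) % 2 and u[5] == (u[1] + u[4]) % 2
assert (u @ V.T % 2 == 0).all()            # equivalently, u V^T = 0
```

The staircase shape of V guarantees that each frozen value depends only on symbols with smaller indices, so it can be computed on the fly during successive cancellation processing.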
The obtained codes are referred to as polar subcodes. Polar codes with CRC [2] can be
considered as a special case of polar subcodes. Polar subcodes were shown to provide substantially
better performance compared to classical polar codes under list decoding. Therefore, we derive
the decoding algorithm for the case of polar subcodes.
where a.b denotes a vector obtained by appending b to a. The probabilities used in this expression
can be recursively computed as
\[
W_\lambda^{(2i)}\left\{v_0^{2i}\,\middle|\,y_0^{2^\lambda-1}\right\} = \sum_{v_{2i+1}} W_{\lambda-1}^{(i)}\left\{v_{0,e}^{2i+1}\oplus v_{0,o}^{2i+1}\,\middle|\,y_0^{2^{\lambda-1}-1}\right\} W_{\lambda-1}^{(i)}\left\{v_{0,o}^{2i+1}\,\middle|\,y_{2^{\lambda-1}}^{2^\lambda-1}\right\}, \tag{4}
\]

\[
W_\lambda^{(2i+1)}\left\{v_0^{2i+1}\,\middle|\,y_0^{2^\lambda-1}\right\} = W_{\lambda-1}^{(i)}\left\{v_{0,e}^{2i+1}\oplus v_{0,o}^{2i+1}\,\middle|\,y_0^{2^{\lambda-1}-1}\right\} W_{\lambda-1}^{(i)}\left\{v_{0,o}^{2i+1}\,\middle|\,y_{2^{\lambda-1}}^{2^\lambda-1}\right\}, \tag{5}
\]
where 0 < λ ≤ m, and a_{0,e}^t and a_{0,o}^t denote the subvectors of a_0^t consisting of elements with even and odd
indices, respectively. It is convenient to implement these calculations in the LLR domain as
\[
L_\lambda^{(2i)}(v_0^{2i-1}|y_0^{2^\lambda-1}) = 2\tanh^{-1}\left(\tanh\frac{L_{\lambda-1}^{(i)}(v_{0,e}^{2i-1}\oplus v_{0,o}^{2i-1}|y_0^{2^{\lambda-1}-1})}{2}\cdot\tanh\frac{L_{\lambda-1}^{(i)}(v_{0,o}^{2i-1}|y_{2^{\lambda-1}}^{2^\lambda-1})}{2}\right), \tag{6}
\]

\[
L_\lambda^{(2i+1)}(v_0^{2i}|y_0^{2^\lambda-1}) = (-1)^{v_{2i}} L_{\lambda-1}^{(i)}\left(v_{0,e}^{2i-1}\oplus v_{0,o}^{2i-1}\,\middle|\,y_0^{2^{\lambda-1}-1}\right) + L_{\lambda-1}^{(i)}\left(v_{0,o}^{2i-1}\,\middle|\,y_{2^{\lambda-1}}^{2^\lambda-1}\right), \tag{7}
\]

where

\[
L_\lambda^{(i)}(v_0^{i-1}|y_0^{2^\lambda-1}) = \log\frac{W_\lambda^{(i)}\left\{v_0^{i-1}.0\,\middle|\,y_0^{2^\lambda-1}\right\}}{W_\lambda^{(i)}\left\{v_0^{i-1}.1\,\middle|\,y_0^{2^\lambda-1}\right\}},\qquad 0\le i<n,\ 0\le\lambda\le m,
\]

so that the decision rule for i ∉ F becomes

\[
\hat u_i = \begin{cases} 0, & L_m^{(i)}(\hat u_0^{i-1}|y_0^{n-1}) > 0\\ 1, & \text{otherwise.}\end{cases}
\]
3 Path score
3.1 Stack decoding algorithm
Let u_0^{n−1} be the information vector used by the transmitter. Given a received noisy vector y_0^{n−1},
the proposed decoding algorithm sequentially constructs a number of partial candidate information
vectors v_0^{φ−1} ∈ F_2^φ, φ ≤ n, evaluates how close their continuations v_0^{n−1} may be to the received
sequence, and eventually produces a single codeword, which is returned as the solution of the decoding problem.
It is convenient to represent the set of information vectors as a tree. The nodes of the tree
correspond to vectors v0φ−1 , 0 ≤ φ < n, satisfying (1). At depth φ, each node v0φ−1 has two children
v0φ−1 .0 and v0φ−1 .1. The root of the tree corresponds to an empty vector. By abuse of notation,
the path from the root of the tree to a node v0φ−1 is also denoted by v0φ−1 . A decoding algorithm
for a polar (sub)code needs to consider only valid paths, i.e. the paths satisfying constraints (1).
The stack decoding algorithm [8, 9, 5, 7] employs a priority queue1 (PQ) to store paths to-
gether with their scores. A PQ is a data structure, which contains tuples (M, v0φ−1 ), where
M = M (v0φ−1 , y0n−1 ) is the score of path v0φ−1 , and provides efficient algorithms for the following
operations [10]:
• push a tuple into the PQ;
• pop a tuple (M, v0φ−1 ) (or just v0φ−1 ) with the highest M ;
• remove a given tuple from the PQ.
We assume here that the PQ may contain at most D elements.
In the context of polar codes, the stack decoding algorithm operates as follows:
1. Push into the PQ the root of the tree with score 0. Let t0n−1 = 0.
2. Extract from the PQ a path v0φ−1 with the highest score. Let tφ ← tφ + 1.
3. If φ = n, return codeword v0n−1 Am and terminate the algorithm.
4. If the number of valid children of path v0φ−1 exceeds the amount of free space in the PQ,
remove from it the element with the smallest score.
5. Compute the scores M (v0φ , y0n−1 ) of valid children v0φ of the extracted path, and push them
into the PQ.
6. If tφ ≥ L, remove from PQ all paths v0j−1 , j ≤ φ.
7. Go to step 2.
In what follows, one iteration means one pass of the above algorithm over steps 2–7. Variables tφ
are used to ensure that the worst-case complexity of the algorithm does not exceed that of a list
SC decoder with list size L.
The parameter L has the same impact on the performance of the decoding algorithm as the
list size in the Tal-Vardy list decoding algorithm, since it imposes an upper bound on number of
paths tφ considered by the decoder at each phase φ. Step 6 ensures that the algorithm terminates
in at most Ln iterations. This is also an upper bound on the number of entries stored in the PQ.
However, the algorithm can work with PQ of much smaller size D. Step 4 ensures that this size
is never exceeded.
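For illustration, the loop above can be sketched with a binary heap. This is a simplified skeleton under stated assumptions: `score` and `valid_children` are caller-supplied stand-ins for the LLR machinery of Section 3.2 and the freezing constraints, and they are not part of the original text.

```python
import heapq

def stack_decode(n, L, D, score, valid_children):
    """Stack decoding skeleton (steps 1-7 of Section 3.1). heapq is a min-heap,
    so scores are negated to pop the path with the highest score M."""
    pq = [(-score(()), ())]                    # step 1: root = empty path
    t = [0] * (n + 1)                          # visit counters t_phi
    while pq:
        neg_m, path = heapq.heappop(pq)        # step 2: best path
        phi = len(path)
        t[phi] += 1
        if phi == n:                           # step 3: full length reached
            return path
        kids = list(valid_children(path))
        while pq and len(pq) + len(kids) > D:  # step 4: keep at most D entries
            pq.remove(max(pq))                 # drop the smallest score
            heapq.heapify(pq)
        for child in kids:                     # step 5: push scored children
            heapq.heappush(pq, (-score(child), child))
        if t[phi] >= L:                        # step 6: kill all paths of length <= phi
            pq = [e for e in pq if len(e[1]) > phi]
            heapq.heapify(pq)
    return None                                # queue exhausted: decoding failure

# toy run with no frozen symbols and a score preferring paths with many ones
out = stack_decode(4, L=2, D=16,
                   score=lambda p: float(sum(p)),
                   valid_children=lambda p: [p + (0,), p + (1,)])
assert out == (1, 1, 1, 1)
```

In a real decoder the paths would carry the Tal-Vardy intermediate LLR arrays rather than plain tuples, but the control flow is the same.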
1 A PQ is commonly called a "stack" in the sequential decoding literature. However, the implementation of the
considered algorithm relies on the Tal-Vardy data structures [2], which make use of true stacks. Therefore, we
employ the standard terminology of computer science.
3.2 Score function
There are many possible ways to define a score function for sequential decoding. In general, this
should be done so that one can perform a meaningful comparison of paths v_0^{φ−1} of different lengths
φ. The classical Fano metric for sequential decoding of convolutional codes is given by

\[
P\left\{\mathbf{M}\,\middle|\,y_0^{n-1}\right\} = \frac{P\left\{\mathbf{M},\, y_0^{n-1}\right\}}{\prod_{i=0}^{n-1} W(y_i)},
\]

where M is a variable-length message (i.e. a path in the code tree), and W(y_i) is the probability
measure induced on the channel output alphabet when the channel inputs follow some prescribed
(e.g. uniform) distribution [11]. In the context of polar codes, a straightforward implementation
of this approach would correspond to the score function

\[
M_1(v_0^{\varphi-1}, y_0^{n-1}) = \log W_m^{(\varphi-1)}\left\{v_0^{\varphi-1}\,\middle|\,y_0^{n-1}\right\}.
\]
This is exactly the score function used in [5]. However, this definition has several shortcomings:

1. Although the value of the score does depend on all y_i, 0 ≤ i < n, it does not take into account
the freezing constraints on symbols u_i, i ∈ F, i ≥ φ. As a result, there may exist incorrect paths
v_0^{φ−1} ≠ u_0^{φ−1} which have many low-probability continuations v_φ^{n−1} ∈ F_2^{n−φ}, such that
the probability

\[
W_m^{(\varphi-1)}\left\{v_0^{\varphi-1}\,\middle|\,y_0^{n-1}\right\} = \sum_{v_\varphi^{n-1}} W_m^{(n-1)}\left\{v_0^{n-1}\,\middle|\,y_0^{n-1}\right\}
\]

becomes quite high, and the stack decoder is forced to expand such a path. Note that this
is not a problem in the case of convolutional codes, where the decoder may recover after an
error burst, i.e. obtain a codeword which is identical to the transmitted one except for a
few closely located symbols.

2. Due to the freezing constraints, not all vectors v_0^{φ−1} ∈ F_2^φ correspond to valid paths in the code
tree. This does not allow one to fairly compare the probabilities of paths of different lengths,
which include different numbers of frozen symbols.

3. Computing the probabilities W_m^{(φ−1)}{v_0^{φ−1}|y_0^{n−1}} involves expensive multiplications and is prone
to numeric errors.
The first of the above problems can be addressed by considering only the most probable
continuation of path v_0^{φ−1}, i.e. the score function can be defined as

\[
M_2(v_0^{\varphi-1}, y_0^{n-1}) = \max_{v_\varphi^{n-1}\in\mathbb F_2^{n-\varphi}} \log W_m^{(n-1)}\left\{v_0^{n-1}\,\middle|\,y_0^{n-1}\right\}. \tag{8}
\]

Observe that the maximization is performed over the last n − φ elements of the vector v_0^{n−1}, while the
remaining ones are given by v_0^{φ−1}. Let us further define

\[
\mathcal V(v_0^{\varphi-1}, y_0^{n-1}) = \arg\max_{\substack{w_0^{n-1}\in\mathbb F_2^{n}\\ w_0^{\varphi-1}=v_0^{\varphi-1}}} \log W_m^{(n-1)}\left\{w_0^{n-1}\,\middle|\,y_0^{n-1}\right\},
\]

i.e. M_2(v_0^{φ−1}, y_0^{n−1}) = log W_m^{(n−1)}{V(v_0^{φ−1}, y_0^{n−1})|y_0^{n−1}}.

As will be shown below, employing such a score function already provides a significant reduction
of the average number of iterations at the expense of a negligible performance degradation.
Furthermore, it turns out that this score is exactly equal to the one used in the min-sum version
of the Tal-Vardy list decoding algorithm [12], i.e. it can be computed in a very simple way.
To address the second problem, we need to evaluate the probabilities of vectors v_0^{n−1} under the
freezing conditions. To do this, consider the set of valid length-φ prefixes of the input vectors of
the polarizing transformation, i.e.

\[
\mathcal C(\varphi) = \left\{ v_0^{\varphi-1}\in\mathbb F_2^{\varphi} \,\middle|\, v_i = \sum_{s=0}^{i-1} V_{j_i,s} v_s,\ i\in\mathcal F,\ 0\le i<\varphi \right\}.
\]

Let us further define the set of their most likely continuations, i.e.

\[
\bar{\mathcal C}(\varphi) = \left\{ \mathcal V(v_0^{\varphi-1}, y_0^{n-1}) \,\middle|\, v_0^{\varphi-1}\in\mathcal C(\varphi) \right\}.
\]

For any v_0^{n−1} ∈ C̄(φ), the probability of transmission of v_0^{n−1} A_m, under the condition
v_0^{φ−1} ∈ C(φ) and given the received vector y_0^{n−1}, equals

\[
W\left\{v_0^{n-1}\,\middle|\,y_0^{n-1},\, \mathcal C(\varphi)\right\} = \frac{W_m^{(n-1)}\left\{U_0^{n-1}=v_0^{n-1}\,\middle|\,y_0^{n-1}\right\}}{W_m^{(n-1)}\left\{U_0^{n-1}\in\bar{\mathcal C}(\varphi)\,\middle|\,y_0^{n-1}\right\}}.
\]

Observe that this function is defined only for vectors v_0^{φ−1} ∈ C(φ), i.e. those satisfying the freezing
constraints up to phase φ.

Unfortunately, there is no simple and obvious way to compute π(φ, y_0^{n−1}) = W_m^{(n−1)}{U_0^{n−1} ∈ C̄(φ)|y_0^{n−1}}.
Therefore, we have to develop an approximation.
It can be seen that

\[
\pi(\varphi, y_0^{n-1}) = W_m^{(n-1)}\left\{\mathcal V(u_0^{\varphi-1})\,\middle|\,y_0^{n-1}\right\} + \underbrace{\sum_{\substack{v_0^{\varphi-1}\in\mathcal C(\varphi)\\ v_0^{\varphi-1}\neq u_0^{\varphi-1}}} W_m^{(n-1)}\left\{\mathcal V(v_0^{\varphi-1})\,\middle|\,y_0^{n-1}\right\}}_{\mu(u_0^{\varphi-1},\, y_0^{n-1})}. \tag{9}
\]

Observe that p = E_Y[μ(u_0^{φ−1}, Y)/π(φ, Y)] is the probability that the min-sum version of the Tal-Vardy list
decoding algorithm with infinite list size does not obtain u_0^{φ−1} as the most probable path at phase φ.
We consider decoding of polar (sub)codes, which are constructed to have a low list SC decoding error
probability even for a small list size in the considered channel W(y|c). Hence, it can be assumed
that p ≪ 1. This implies that with high probability μ(u_0^{φ−1}, y_0^{n−1}) ≪ W_m^{(n−1)}{V(u_0^{φ−1})|y_0^{n−1}},
i.e. π(φ, y_0^{n−1}) ≈ W_m^{(n−1)}{V(u_0^{φ−1})|y_0^{n−1}}.
However, a real decoder cannot compute this value, since the transmitted vector u_0^{n−1} is not
available at the receiver side. Hence, we propose to further approximate the logarithm of the first
term in (9) by its expected value over Y, i.e.

\[
\log \pi(\varphi, y_0^{n-1}) \approx \Psi(\varphi) = E_Y\left[\log W_m^{(n-1)}\left\{\mathcal V(u_0^{\varphi-1})\,\middle|\,Y\right\}\right].
\]

Observe that this value depends only on φ and the underlying channel W(y|c), and can be precomputed offline.

Hence, instead of the ideal score function M(v_0^{φ−1}, y_0^{n−1}) we propose to use the approximate one

\[
M_3(v_0^{\varphi-1}, y_0^{n-1}) = M_2(v_0^{\varphi-1}, y_0^{n-1}) - \Psi(\varphi). \tag{10}
\]
3.3 Computing the score function
Consider computing

\[
R_m^{(\varphi-1)}(v_0^{\varphi-1}, y_0^{n-1}) = M_2(v_0^{\varphi-1}, y_0^{n-1}).
\]

Let the modified log-likelihood ratios be defined as

\[
S_m^{(\varphi)}(v_0^{\varphi-1}|y_0^{n-1}) = R_m^{(\varphi)}(v_0^{\varphi-1}.0,\, y_0^{n-1}) - R_m^{(\varphi)}(v_0^{\varphi-1}.1,\, y_0^{n-1}). \tag{11}
\]

Then the path score can be computed recursively as

\[
R_m^{(\varphi)}(v_0^{\varphi}, y_0^{n-1}) = R_m^{(\varphi-1)}(v_0^{\varphi-1}, y_0^{n-1}) + \tau\left(S_m^{(\varphi)}(v_0^{\varphi-1}|y_0^{n-1}),\, v_\varphi\right), \tag{12}
\]

where

\[
\tau(S, v) = \begin{cases} 0, & \text{if } \operatorname{sgn}(S) = (-1)^{v}\\ -|S|, & \text{otherwise}\end{cases}
\]

is the penalty function. Indeed, let ṽ_0^{n−1} = V(v_0^{φ−1}). If v_φ = ṽ_φ, then
the most probable continuations of v_0^{φ−1} and v_0^{φ} are identical. Otherwise,
|S_m^{(φ)}(v_0^{φ−1}|y_0^{n−1})| is exactly the difference between the log-probabilities
of the most likely continuations of v_0^{φ−1} and v_0^{φ}.
The initial value for recursion (12) is given by

\[
R_m^{(-1)}(y_0^{n-1}) = \log \prod_{i=0}^{n-1} W\{C = \hat c_i \mid Y = y_i\},
\]

where ĉ_i is the hard decision corresponding to y_i. However, this value can be replaced with 0,
since it does not affect the selection of paths in the stack algorithm.
In order to obtain a simple expression for the proposed score function, observe that

\[
W_m^{(n-1)}\left\{v_0^{n-1}\,\middle|\,y_0^{n-1}\right\} = W_{m-1}^{(n/2-1)}\left\{v_{0,e}^{n-1}\oplus v_{0,o}^{n-1}\,\middle|\,y_0^{n/2-1}\right\}\cdot W_{m-1}^{(n/2-1)}\left\{v_{0,o}^{n-1}\,\middle|\,y_{n/2}^{n-1}\right\},
\]

so that

\[
R_\lambda^{(2i+1)}(v_0^{2i+1}, y_0^{N-1}) = R_{\lambda-1}^{(i)}\left(v_{0,e}^{2i+1}\oplus v_{0,o}^{2i+1},\, y_0^{N/2-1}\right) + R_{\lambda-1}^{(i)}\left(v_{0,o}^{2i+1},\, y_{N/2}^{N-1}\right),
\]

where N = 2^λ, 0 < λ ≤ m, and the initial values for these recursive expressions are given by
R_0^{(0)}(b, y_j) = log W_0^{(0)}{b|y_j}, b ∈ {0, 1}. From (11) one obtains
\[
\begin{aligned}
S_\lambda^{(2i)}(v_0^{2i-1}|y_0^{2^\lambda-1}) &= \max(J(0)+K(0),\, J(1)+K(1)) - \max(J(1)+K(0),\, J(0)+K(1))\\
&= \max(J(0)-J(1)+K(0)-K(1),\, 0) - \max(K(0)-K(1),\, J(0)-J(1)),
\end{aligned}
\]

\[
S_\lambda^{(2i+1)}(v_0^{2i}|y_0^{2^\lambda-1}) = J(v_{2i}) + K(0) - J(v_{2i}\oplus 1) - K(1),
\]
where J(c) = R_{λ−1}^{(i)}((v_{0,e}^{2i−1} ⊕ v_{0,o}^{2i−1}).c, y_0^{2^{λ−1}−1}) and
K(c) = R_{λ−1}^{(i)}(v_{0,o}^{2i−1}.c, y_{2^{λ−1}}^{2^λ−1}). Observe that

\[
J(0) - J(1) = a = S_{\lambda-1}^{(i)}(v_{0,e}^{2i-1}\oplus v_{0,o}^{2i-1}|y_0^{2^{\lambda-1}-1})
\]

and

\[
K(0) - K(1) = b = S_{\lambda-1}^{(i)}(v_{0,o}^{2i-1}|y_{2^{\lambda-1}}^{2^\lambda-1}).
\]

It can be obtained from these expressions that the modified log-likelihood ratios are given by

\[
S_\lambda^{(2i)}(v_0^{2i-1}|y_0^{2^\lambda-1}) = Q(a, b) = \operatorname{sgn}(a)\operatorname{sgn}(b)\min(|a|, |b|), \tag{13}
\]

\[
S_\lambda^{(2i+1)}(v_0^{2i}|y_0^{2^\lambda-1}) = P(v_{2i}, a, b) = (-1)^{v_{2i}} a + b. \tag{14}
\]

The initial values for this recursion are given by S_0^{(0)}(y_i) = log(W{0|y_i}/W{1|y_i}).
These expressions can be readily recognized as the min-sum approximation of (6)–(7), and
(12) coincides with an approximation for M_1(v_0^φ, y_0^{n−1}) [13, 14, 12]. However, these are also the
exact values reflecting the probability of the most likely continuation of a given path v_0^{φ−1} in
the code tree.
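For concreteness, the kernels (13)–(14), the penalty function τ, and the score update can be written out directly; this is a minimal illustration (not the authors' code), with the values below checked by hand:

```python
import math

def Q(a, b):
    """Check-node kernel (13): sgn(a) sgn(b) min(|a|, |b|)."""
    return math.copysign(1.0, a) * math.copysign(1.0, b) * min(abs(a), abs(b))

def P(v, a, b):
    """Variable-node kernel (14): (-1)^v * a + b for the chosen bit v."""
    return (-a if v else a) + b

def tau(S, v):
    """Penalty: 0 if bit v agrees with the sign of the LLR S, otherwise -|S|."""
    return 0.0 if (S > 0) == (v == 0) else -abs(S)

# one path-score update per (12): R(v.v_phi) = R(v) + tau(S, v_phi)
R, S = -1.5, -2.0                 # current score and LLR of the next bit
assert Q(2.0, -3.0) == -2.0
assert P(0, 2.0, -3.0) == -1.0 and P(1, 2.0, -3.0) == -5.0
assert tau(S, 1) == 0.0 and tau(S, 0) == -2.0   # S < 0 favours bit 1
assert R + tau(S, 0) == -3.5                    # penalized extension
```

Since τ never adds a positive term, the score of any path is non-increasing along the tree, which is exactly why a length-aware bias such as Ψ(φ) is needed to compare paths of different lengths.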
Finally, we illustrate the meaning of M_2(v_0^{n−1}, y_0^{n−1}). Let S_0^{n−1} be the LLR vector corresponding
to the received noisy sequence y_0^{n−1}. Let

\[
E(c_0^{n-1}, S_0^{n-1}) = -\sum_{i=0}^{n-1}\tau(S_i, c_i)
\]

be the ellipsoidal weight (also known as the correlation discrepancy) of a vector c_0^{n−1} ∈ F_2^n [15, 16]. It
is possible to show that the ML decoding problem for the case of transmission of codewords of a
code C over a memoryless channel can be formulated as minimization of E(c_0^{n−1}, S_0^{n−1}) over c_0^{n−1} ∈ C.
Furthermore, the ellipsoidal weight can be decomposed as

\[
E(c_0^{2n-1}, S_0^{2n-1}) = E(c_{0,e}^{2n-1}\oplus c_{0,o}^{2n-1},\, \tilde S_0^{n-1}) + E(c_{0,o}^{2n-1},\, \hat S_0^{n-1}),
\]

where \tilde S_i = Q(S_{2i}, S_{2i+1}) and \hat S_i = P((c_{0,e}^{2n-1}\oplus c_{0,o}^{2n-1})_i, S_{2i}, S_{2i+1}).
[Figure 1: the bias function Ψ(φ) vs. phase φ for noise standard deviations σ = 0.5, 0.75, 1.0, 1.25]
assumption of zero codeword transmission. Indeed, in this case the cumulative distribution functions
F_λ^{(i)}(x) of S_λ^{(i)} are given by [17]

\[
F_\lambda^{(2i)}(x) = \begin{cases} 2F_{\lambda-1}^{(i)}(x)\left(1 - F_{\lambda-1}^{(i)}(-x)\right), & x < 0\\[2pt] 2F_{\lambda-1}^{(i)}(x) - \left(F_{\lambda-1}^{(i)}(-x)\right)^2 - \left(F_{\lambda-1}^{(i)}(x)\right)^2, & x \ge 0, \end{cases} \tag{16}
\]

\[
F_\lambda^{(2i+1)}(x) = \int_{-\infty}^{\infty} F_{\lambda-1}^{(i)}(x-y)\, dF_{\lambda-1}^{(i)}(y), \tag{17}
\]

where F_0^{(0)}(x) is the CDF of the channel output LLRs. Then one can compute

\[
\Psi(\varphi) = -\sum_{i=0}^{\varphi-1}\int_{-\infty}^{0} F_m^{(i)}(x)\, dx. \tag{18}
\]
The bias function Ψ(φ) depends only on m and channel properties, so it can be used for
decoding of any polar (sub)code of a given length.
Figure 1 illustrates the bias function for the case of BPSK modulation and AWGN channel
with different noise standard deviations σ.
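The recursions (16)–(18) lend themselves to direct numerical evaluation on a discretized LLR grid. The sketch below is our illustration, not the authors' implementation; the grid half-width `A` and resolution `N` are arbitrary choices. It computes Ψ(φ) for a toy length n = 2^m over the BPSK-AWGN channel, where under the zero-codeword assumption the channel LLRs are Gaussian with mean 2/σ² and variance 4/σ²:

```python
import numpy as np
from math import erf, sqrt

def bias_function(m, sigma, A=60.0, N=1201):
    """Psi(phi), phi = 0..n, via the CDF recursions (16)-(17) and formula (18)."""
    n = 1 << m
    xs = np.linspace(-A, A, N)
    h = xs[1] - xs[0]
    mu, var = 2.0 / sigma**2, 4.0 / sigma**2
    F0 = np.array([0.5 * (1.0 + erf((x - mu) / sqrt(2.0 * var))) for x in xs])

    def check(F):              # (16): CDF of Q(a, b) for a, b i.i.d. with CDF F
        Fneg = np.interp(-xs, xs, F)
        return np.where(xs < 0, 2 * F * (1 - Fneg), 2 * F - Fneg**2 - F**2)

    def var_node(F):           # (17): CDF of a + b (CDF convolved with PDF)
        pdf = np.gradient(F, h)
        G = np.interp((xs[:, None] - xs[None, :]).ravel(), xs, F).reshape(N, N)
        return np.clip(G @ (pdf * h), 0.0, 1.0)

    memo = {}
    def cdf(lam, i):           # F_lam^{(i)}, built recursively from F_0^{(0)}
        if lam == 0:
            return F0
        if (lam, i) not in memo:
            parent = cdf(lam - 1, i // 2)
            memo[(lam, i)] = check(parent) if i % 2 == 0 else var_node(parent)
        return memo[(lam, i)]

    neg = xs <= 0.0            # E[tau] = -integral of F over (-inf, 0], per (18)
    incs = [-cdf(m, i)[neg].sum() * h for i in range(n)]
    return np.concatenate(([0.0], np.cumsum(incs)))

psi = bias_function(m=3, sigma=0.8)
d = np.diff(psi)                         # per-phase increments E[tau(S_i, 0)]
assert psi[0] == 0.0 and all(d <= 1e-9)  # Psi is non-increasing
assert d[0] < d[-1]                      # worst subchannel pays the largest penalty
```

Summing the per-phase increments reproduces the qualitative shape of Figure 1: the bias decreases fastest over the unreliable subchannels and flattens out over the nearly noiseless ones.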
3.6 Complexity analysis
The algorithm presented in Section 3.1 extracts from the PQ length-φ paths at most L times.
At each iteration it needs to calculate the LLR S_m^{(φ)}(v_0^{φ−1}|y_0^{n−1}). Intermediate values for these
calculations can be reused in the same way as in [2]. Hence, LLR calculations require at most
O(Ln log n) operations. However, simulation results presented below suggest that the average
complexity of the proposed algorithm is substantially lower and at high SNR approaches O(n log n),
the complexity of the SC algorithm.
4 Improvements
4.1 List size adaptation
In the case of an infinite size D of the priority queue, the performance of the proposed algorithm
depends mostly on the parameter L. Setting L = ∞ results in near-maximum-likelihood decoding at
the cost of excessive complexity and memory consumption. It was observed in experiments that
in most cases the decoding can be completed successfully with a small L, and only rarely is a high
value of L required.
It was also observed that if the decoder needs to kill short paths too many times while processing
a single noisy vector, this most likely means that the correct path has already been killed, and a
decoding error is unavoidable.
Therefore, in order to reduce the decoding complexity, we propose to change the list size L adaptively.
We propose to start decoding of a noisy vector with a small L, and to increase it if the decoder
is likely to have killed the correct path. In order to do this, we propose to keep track of the
number of times κ the algorithm actually removes some paths at step 6 of the algorithm presented
in Section 3.1. If this value exceeds some threshold κ_0, then the decoding may need to be restarted
with a larger L, similarly to [18].
However, a more efficient approach is possible. In order to avoid repeating the same calculations,
we propose not to kill paths permanently at step 6, but to remove them from the PQ and save
the corresponding pairs (M̃, l) in a temporary array, where l is an identifier of some path v_0^{φ−1}
and M̃ = M_3(v_0^{φ−1}, y_0^{n−1}). Such paths are referred to as suspended paths. If κ exceeds some
threshold κ_0, instead of restarting the decoder, we propose to double the list size L and re-introduce
into the PQ the suspended paths with scores better than the score M_0 of the path extracted at step 2
of the algorithm presented in Section 3.1. This is performed until L reaches some upper bound L_max.
The value of κ_0 should be optimized by simulations.
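The suspend-and-resume idea can be sketched as follows (an illustration only; the path identifiers and the `score_of` lookup stand in for the decoder state of Section 3.1):

```python
import heapq

def on_kill(suspended, killed_paths, score_of):
    """Step 6 replacement: move killed paths to a suspended list instead of discarding."""
    for p in killed_paths:
        suspended.append((score_of(p), p))

def maybe_grow(pq, suspended, L, L_max, kappa, kappa_0, M0):
    """If too many kill events occurred, double L and revive the suspended paths
    whose score exceeds the score M0 of the currently extracted path."""
    if kappa >= kappa_0 and L < L_max:
        L *= 2
        keep = []
        for m, p in suspended:
            if m > M0:
                heapq.heappush(pq, (-m, p))   # back into the max-heap (negated)
            else:
                keep.append((m, p))
        suspended[:] = keep
    return L

pq, suspended = [], []
on_kill(suspended, ["p1", "p2"], {"p1": -3.0, "p2": -9.0}.get)
L = maybe_grow(pq, suspended, L=32, L_max=128, kappa=20, kappa_0=20, M0=-5.0)
assert L == 64 and len(pq) == 1 and suspended == [(-9.0, "p2")]
```

Because suspended paths keep their scores and intermediate data, reviving them avoids recomputing anything, unlike a full restart with a larger list size.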
Table 1: Average decoding complexity of the (1024, 512, 28) code with L = 32, ×10^3 operations

            |       Summations        |       Comparisons
Eb/N0, dB   | Proposed | Path score   | Proposed | Path score
            |  score   | from [7]     |  score   | from [7]
0.5         |   63.2   |    133       |  122.5   |    218
1           |   34.8   |     73       |   55.6   |    122
1.5         |   16     |     32       |   21.9   |     54
2           |    8.8   |     18       |   12.0   |     31
we propose to set T to the value of pMAP -quantile of the distribution of µ, where pMAP is the
codeword error probability of the MAP decoder.
It follows from (15) that

\[
M_3(u_0^{n-1}, Y) = \sum_{i=0}^{n-1}\left(\tau(S_i, c_i) - E_{Y_i}[\tau(S_i, c_i)]\right),
\]

where S_i = S_0^{(0)}(Y_i) and c_0^{n−1} = u_0^{n−1} A_m. Assuming that the zero codeword was transmitted,
i.e. u_0^{n−1} = 0, one can derive the probability density function of the log-likelihood ratios S_0^{(0)}(Y_i),
and compute the PDF of M_3(u_0^{n−1}, Y) as the n-fold convolution of the PDFs of
τ(S_0^{(0)}(Y_i), 0) − E_{Y_i}[τ(S_0^{(0)}(Y_i), 0)].
The decoding error probability p_MAP of the MAP decoder can be estimated by running simulations
of the proposed sequential decoding algorithm with a very large L. This enables one to
derive the termination threshold T, which depends only on the channel and code properties. Numeric
results suggest that in the case of the AWGN channel such a threshold function can be well approximated
by

\[
T \approx -\min\left(\frac{a_C \sigma^2 + b_C}{\sigma^2},\, \frac{t_C}{\sigma^2}\right) \tag{19}
\]

for some parameters a_C, b_C, t_C, which depend on the code C and can be obtained by curve fitting
techniques.
Let p_seq and p_T be the codeword error probabilities of the sequential decoding algorithm
presented in Section 3.1 and of the algorithm which additionally discards all paths v_0^{n−1} with
M(v_0^{n−1}, y_0^{n−1}) < T, respectively. Obviously, p_MAP ≤ p_seq ≤ p_T.
Then one obtains
5 Numeric results
The performance and complexity of the proposed decoding algorithm were investigated in the case
of BPSK modulation and AWGN channel. The results are reported for polar codes with 16-bit
[Figure 2: FER and average number of iterations vs. Eb/N0 for the (1024, 512) polar subcode, with scores M1, M2 and the proposed M3, for L = 32 and L = 256]
Figure 2: The impact of the score function on the decoder performance and complexity.
CRC (polar-CRC) and randomized polar subcodes3 (PS) [4]. The size of the priority queue was
set in all cases to D = Ln. The complexity of the decoding algorithm is reported in terms of
the average number of iterations and average number of arithmetic operations. The first measure
enables one to assess the efficiency of path selection by the proposed score function, and the second
reflects the practical complexity of the algorithm.
Figure 2 illustrates the decoding error probability and average number of iterations performed
by the sequential decoder (without list size adaptation and early termination) for the case of
the path scores M1 , M2 and M3 . The first two scores correspond to the Niu-Chen stack decoding
algorithm and its min-sum version [5]. Observe that the Niu-Chen algorithm was shown to achieve
exactly the same performance as the Tal-Vardy list decoding algorithm with the same value of L,
provided that the size of the priority queue D is sufficiently high.
It can be seen that employing score M2 results in a marginal performance loss, but significant
3 In order to ensure reproducibility of the results, we have set up a web site
http://dcn.icc.spbstu.ru/index.php?id=polar&L=2 containing the specifications of the considered polar
subcodes.
reduction of the average number of iterations performed by the decoder. This is due to the existence
of multiple paths v_0^{n−1} with low probability W_m^{(n−1)}{v_0^{n−1}|y_0^{n−1}}, which add up (see (2)) to
non-negligible probabilities W_m^{(φ−1)}{v_0^{φ−1}|y_0^{n−1}}. Hence, employing the path score M1 causes the
decoder to inspect many incorrect paths v_0^{φ−1}. At sufficiently high SNR the most probable continuation
V(v_0^{φ−1}) of a path extracted at some phase from the PQ with high probability satisfies all freezing
constraints, so that the value given by the M2 score function turns out to be close to the final path
score. This enables the decoder to avoid visiting many incorrect paths in the code tree.
Even more significant complexity reduction is obtained if one employs the proposed path score
M3, which enables one to correctly compare the probabilities of paths v_0^{φ−1} of different lengths φ.
This results in an order of magnitude reduction of the average number of iterations. Observe that
the performance of the decoder employing the proposed score M3 is essentially the same as in the
case of score M2. Table 1 provides a comparison of the average number of arithmetic operations
performed by the sequential decoder (without early termination and list size adaptation) implementing
the proposed path score function and the one presented in our prior work [7]. It can be seen that
employing the proposed score function results in substantially lower average decoding complexity.
Figure 3 presents the performance and average decoding complexity of (2048, 1024) codes. For
comparison, we report also the performance of polar codes with CRC-16 under list decoding with
adaptive list size (ALS) [18], a CCSDS LDPC code under belief propagation decoding, and the
complexity of the min-sum implementation of the Tal-Vardy algorithm with fixed list size. The
complexity is presented in terms of the average number of summations and comparison operations
for polar (sub)codes, and average number of summations and evaluations of log tanh(x/2) for the
LDPC code.
It can be seen that for the case of a polar code with CRC the performance loss of the sequential
decoding algorithm with respect to the Niu-Chen algorithm and list decoding with adaptive list
size is more significant than in the case of the (1024, 512) code. Observe that in this case the decoder
needs to perform iterations until a vector v_0^{n−1} with valid CRC is extracted from the PQ. Hence, in
this case the assumption that the value of (9) is dominated by the first term4 may be invalid with
high probability.
However, the performance loss is much less significant in the case of polar subcodes. Observe
that at high SNR the ALS decoding algorithm has slightly lower average complexity than the
proposed one. However, it is not obvious how to use the ALS decoder in conjunction with polar
subcodes, which provide substantially better performance than polar codes with CRC, since this
algorithm relies on checking CRC of the obtained data vector in order to detect if another decoding
attempt with larger list size is needed. In the low-SNR region, where the frame error rate is at
least 10−3 , the proposed algorithm has lower average complexity compared to the ALS one, and
in the case of polar subcodes provides up to 0.1 dB performance gain.
Observe also that the average number of summation and comparison operations in the case of
the proposed decoding algorithm quickly converges to the complexity of the min-sum SC algorithm.
It can also be seen that polar subcodes under the proposed sequential decoding algorithm with
L = 32 provide performance comparable to the state-of-the-art LDPC code, and with larger L
far outperform it. The average complexity of the proposed algorithm turns out to be substantially
lower than that of BP decoding. Observe also that reducing the maximal number of iterations
for the BP algorithm results in a 0.5 dB performance loss with almost no gain in complexity, while
the sequential decoding algorithm enables much better performance-complexity tradeoffs.
Figure 4 illustrates the performance and complexity of the sequential decoding algorithm with
score M3 and the list size adaptation (LSA) method described in Section 4.1. Here the list size L was
allowed to grow from 32 to L_max = 128. It was doubled after κ_0 = 20 iterations in which at least
one path was removed at step 6 of the algorithm described in Section 3.1. It can be seen that the
proposed list size adaptation method enables one to achieve essentially the same performance as
in the case of the non-adaptive algorithm with L = L_max, but with lower average complexity. Observe
4 Recall that this assumption is only a probabilistic one.
Table 2: Early termination parameters

Code          a_C       b_C      t_C
(1024, 736)   -108.27    50.84   12
(1024, 512)   -116.37   121.41   43
that the proposed implementation of list size adaptation does not require restarting the decoder
from scratch, as in the case of the techniques considered in [18, 19].
Figure 5 illustrates the probability distribution of the number of iterations performed by the
decoder in the cases of correct and incorrect decoding. It can be seen that the distribution of the
number of iterations in the event of correct decoding has a rather heavy tail, and employing list
size adaptation results in higher tail probabilities. In the event of a decoding error the number of
iterations may become quite high. However, there is a very small probability that the decoding
is correct if the decoder has performed more than 20000 iterations. This can also be used
for early termination of decoding. As may be expected, employing list size adaptation results
in a heavier tail of the distribution for the case of incorrect decoding.
Figure 6 illustrates the termination threshold for the case of (1024, 736, 16) polar subcode for
different values of 1/σ 2 . It can be seen that (19) indeed represents a good approximation for T .
Figure 7 illustrates the performance and complexity of the sequential decoding algorithm with
and without the proposed early termination method. The parameters of the early termination
threshold function (19) for the considered polar subcodes are given in Table 2. It can be seen
that the early termination condition enables one to significantly reduce the decoding complexity
in the low-SNR region, where decoding error probability is high. This can be used to implement
HARQ and adaptive coding protocols. Observe also that the performance and complexity of the
decoding algorithms employing the exact and approximate termination threshold functions are
very close.
6 Conclusions
In this paper a novel decoding algorithm for polar (sub)codes was proposed. The proposed
approach relies on the ideas of sequential decoding. The key contribution is a new path score function,
which reflects the probability of the most likely continuation of a path in the code tree, as well as
the probability of satisfying the already processed dynamic freezing constraints. The latter probability
is difficult to compute exactly, so an approximation is proposed, which corresponds to the average
behaviour of the correct path.
The proposed score function enables one to significantly reduce the average number of iterations
performed by the decoder at the cost of a negligible performance loss, compared to the case of
the SCL decoding algorithm. The worst-case complexity of the proposed decoding algorithm is
O(Ln log n), similarly to the case of the SCL algorithm. Furthermore, the proposed algorithm
is based on the min-sum SC decoder, i.e. it can be implemented using only summation and
comparison operations.
It was also shown that the performance of the proposed decoding algorithm can be substantially
improved by recovering previously removed paths and resuming their processing with an increased
list size. The improvement comes at the cost of a small increase in the average decoding
complexity. The average decoding complexity in the low-SNR region can be further reduced by
employing the proposed early termination method.
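The overall control flow of list size adaptation can be illustrated by the following simplified sketch. Note that, for brevity, it restarts decoding from scratch with a larger list, whereas the algorithm described above resumes previously removed paths; the function names and the choice of list sizes are hypothetical.

```python
def decode_with_adaptation(decode, received, list_sizes=(32, 64, 128)):
    """Retry decoding with progressively larger list sizes (simplified sketch).

    decode   -- a decoder returning a codeword on success, or None when all
                candidate paths fail the dynamic freezing / CRC constraints
    received -- the channel output to be decoded
    """
    for L in list_sizes:
        codeword = decode(received, L)
        if codeword is not None:
            return codeword
    return None  # declare a decoding failure
```

In the algorithm proposed above, the retries reuse the paths discarded at smaller list sizes, which is why the average complexity grows only slightly.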
[Figure: FER and average number of operations vs. Eb/N0 (dB) for (2048,1024) codes: polar subcodes (PS) and polar-CRC under sequential decoding with M1/M3 scores (L = 32, 512), adaptive list SCL (ALS, Lmax = 32, 512), LDPC with at most 20/200 iterations, and min-sum SC/list decoding baselines.]
[Figure 4 panels: FER and average number of iterations vs. Eb/N0 (dB) for L = 32, 64, 128 and LSA with L = 32, Lmax = 128.]
Figure 4: Performance and complexity of sequential decoding with list size adaptation for
(2048, 1024) polar subcode.
[Figure: distribution of the number of decoding iterations for the (2048,1024,48) code at Eb/N0 = 1.5 dB, for correct and incorrect decoding with L = 32 and with LSA L = 32..128.]
[Figure 6 panel: exact and approximate early termination threshold T vs. 1/σ².]
Figure 6: Approximation of the early termination threshold for a (1024, 736, 16) polar subcode.
[Figure 7: FER and average number of iterations vs. Eb/N0 (dB) for (1024,512) and (1024,736) polar subcodes with L = 32, without early termination and with exact/approximate termination thresholds.]