Fasp
Contents
Complex Numbers
Signal Spaces
    Lebesgue spaces for DT signals
    Lebesgue spaces for CT signals
    Lebesgue spaces for periodic CT signals
Fourier Transform
    A Fundamental Isomorphism of Hilbert Spaces
    Fourier Transform for Non‑periodic CT signals
    Properties of the Fourier Transform
    Fourier Transform of DT Signals
    Discrete Fourier Transform (DFT)
Filter Design
2D Signal Processing
    2D Fourier Transform
Jan‑Niklas 1
Foundations of Audio Signal Processing 2022‑02‑14
Complex Numbers
• As $a + ib$ or $(a, b)$ or $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$ with $a, b \in \mathbb{R}$
• In polar coordinates: $(|z|, \arg(z))$ or $r e^{i\varphi}$
• $z$ is an $n$th root of unity if $z^n = 1$. It is primitive if it is not an $m$th root of unity for any positive $m < n$.
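The definition of primitive roots can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `roots_of_unity` and `is_primitive` are made up for illustration, and the test for primitivity uses the standard fact that $e^{2\pi i k/n}$ is primitive exactly when $\gcd(k, n) = 1$.

```python
import cmath
from math import gcd

def roots_of_unity(n):
    """Return the n-th roots of unity e^{2*pi*i*k/n} for k = 0..n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

def is_primitive(k, n):
    """e^{2*pi*i*k/n} is a primitive n-th root of unity iff gcd(k, n) == 1."""
    return gcd(k, n) == 1

n = 6
roots = roots_of_unity(n)
# Indices k whose root e^{2*pi*i*k/n} is primitive
primitive_indices = [k for k in range(n) if is_primitive(k, n)]
```

For $n = 6$ only $k = 1$ and $k = 5$ give primitive roots; for example $k = 2$ already satisfies $z^3 = 1$.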
Signal Spaces
– Continuous
• Box function: $b_w(t) := \begin{cases} 1, & |t| \le \frac{w}{2} \\ 0, & \text{otherwise} \end{cases}$
• Frequency signal of frequency $\omega \in [0, 1)$: $k \mapsto e^{2\pi i \omega k}$
Hilbert basis: An orthonormal system $(e_i)$ of a Hilbert space $V$ is called a Hilbert basis if the following equivalent conditions hold:
• Fourier series: Every $x \in V$ has a representation $x = \sum_i \langle x, e_i \rangle e_i$, where the number of non‑zero terms is countable
$$\ell^p(\mathbb{Z}) = \Big\{ x : \mathbb{Z} \to \mathbb{C} \;\Big|\; \sum_n |x_n|^p < \infty \Big\}$$
And $\ell^\infty(\mathbb{Z})$ is the space of all bounded signals on $\mathbb{Z}$. These have the norms
$$\|x\|_p := \Big( \sum_n |x_n|^p \Big)^{1/p}$$
$\ell^2(\mathbb{Z})$ is the only Hilbert space among these, with scalar product $\langle x, y \rangle = \sum_n x(n)\overline{y(n)}$.
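For finitely supported signals the norm and the scalar product reduce to finite sums. The sketch below assumes a signal is given as a Python list of its non-zero values; the function names `lp_norm` and `inner` are illustrative only.

```python
def lp_norm(x, p):
    """l^p norm of a finitely supported signal, for finite p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def inner(x, y):
    """l^2 scalar product <x, y> = sum_n x(n) * conj(y(n))."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

x = [3, 4j]
y = [1j, 1]

norm_1 = lp_norm(x, 1)   # |3| + |4i|
norm_2 = lp_norm(x, 2)   # sqrt(9 + 16)
xy = inner(x, y)         # 3 * conj(i) + 4i * conj(1)
```

The 2-norm is the one induced by the scalar product: `inner(x, x)` equals `lp_norm(x, 2) ** 2` up to rounding.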
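$$L^p(\mathbb{R}) = \Big\{ f : \mathbb{R} \to \mathbb{C} \text{ measurable} \;\Big|\; \int_{\mathbb{R}} |f(t)|^p \, \mathrm{d}t < \infty \Big\}$$
And $L^\infty(\mathbb{R})$ is the space of all measurable signals that are essentially bounded on $\mathbb{R}$. These have the norms
$$\|f\|_p := \Big( \int_{\mathbb{R}} |f(t)|^p \, \mathrm{d}t \Big)^{1/p}$$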
$L^2(\mathbb{R})$ is the only Hilbert space among these, with scalar product $\langle f, g \rangle = \int_{\mathbb{R}} f(t)\overline{g(t)} \, \mathrm{d}t$.
Fourier Transform
Hilbert space 𝐿2 ([0, 1]) has (among others) these Hilbert bases:
• $\{ 1, A_k, B_k \mid k \in \mathbb{N} \}$ with $A_k(t) = \sqrt{2}\cos(2\pi k t)$ and $B_k(t) = \sqrt{2}\sin(2\pi k t)$
The map $f \mapsto \hat{f} := (\langle f, \mathrm{e}_k \rangle)_{k \in \mathbb{Z}}$, which assigns to every signal the sequence of its Fourier coefficients, is an isomorphism of the Hilbert spaces $L^2([0, 1])$ and $\ell^2(\mathbb{Z})$ that also preserves the scalar product.
$$\hat{f}(\omega) := \int_{\mathbb{R}} f(t) e^{-2\pi i \omega t} \, \mathrm{d}t .$$
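This definition can be extended to all of $L^2(\mathbb{R})$ to define a unitary transform $f \mapsto \hat{f}$ on $L^2(\mathbb{R})$. Its inverse is
$$\check{f}(t) := \int_{\mathbb{R}} f(\omega) e^{2\pi i \omega t} \, \mathrm{d}\omega .$$
$$\hat{f}(\omega) = \int_{\mathbb{R}} f(t) \cdot (\cos(2\pi\omega t) - i \sin(2\pi\omega t)) \, \mathrm{d}t$$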
For $x \in \ell^2(\mathbb{Z})$:
$$\hat{x}(\omega) := \sum_k x(k) e^{-2\pi i \omega k} \in L^2([0, 1])$$
Properties:
• $\hat{x}(\omega) = \overline{\hat{x}(-\omega)}$ for real‑valued $x$
Table 1: Formulas for Fourier transforms. All of these are unitary isomorphisms.
Parseval’s Theorem: $\int_{-\infty}^{\infty} x(t)\overline{y(t)} \, \mathrm{d}t = \int_{-\infty}^{\infty} X(\omega)\overline{Y(\omega)} \, \mathrm{d}\omega$
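Table 3: Fourier Transform Pairs, with $\theta(t) = \frac{1}{2}\operatorname{sgn}(t) + \frac{1}{2}$.

| Signal $f(t)$ | Fourier transform $\hat{f}(\omega)$ |
|---|---|
| $b_1(t)$ | $\operatorname{sinc}(\omega)$ |
| $\operatorname{sinc}(t)$ | $b_1(\omega)$ |
| $e^{-\pi t^2}$ | $e^{-\pi \omega^2}$ |
| $\delta(t)$ | $1$ |
| $1$ | $\delta(\omega)$ |
| $\delta(t - t_0)$ | $e^{-2\pi i \omega t_0}$ |
| $e^{2\pi i \omega_0 t}$ | $\delta(\omega - \omega_0)$ |
| $\operatorname{sgn}(t)$ | $\frac{1}{\pi i \omega}$ |
| $\frac{1}{\pi t}$ | $-i \operatorname{sgn}(\omega)$ |
| $\theta(t)$ | $\frac{1}{2\pi i \omega} + \frac{1}{2}\delta(\omega)$ |
| $\frac{1}{2}\delta(t) + \frac{i}{2\pi t}$ | $\theta(\omega)$ |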
Fourier coefficients of a finite DT signal can be computed via matrix‑vector multiplication with the DFT
matrix.
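The matrix-vector view can be sketched in a few lines. The normalization is an assumption here, since the notes' DFT definition is not shown in this section: the sketch uses the unnormalized matrix with entries $e^{-2\pi i k n / N}$, and the names `dft_matrix` and `dft` are illustrative.

```python
import cmath

def dft_matrix(N):
    """N x N matrix with entries w^(k*n), where w = e^{-2*pi*i/N} (unnormalized, assumed convention)."""
    w = cmath.exp(-2j * cmath.pi / N)
    return [[w ** (k * n) for n in range(N)] for k in range(N)]

def dft(x):
    """Fourier coefficients of a finite signal via matrix-vector multiplication."""
    N = len(x)
    W = dft_matrix(N)
    return [sum(W[k][n] * x[n] for n in range(N)) for k in range(N)]

X = dft([1, 0, 0, 0])   # unit impulse: every coefficient is 1
Y = dft([1, 1, 1, 1])   # constant signal: only the 0th coefficient is non-zero
```

This direct product costs $O(N^2)$ operations, which is fine for illustrating the definition on short signals.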
• Analog to digital conversion (ADC): Sampling ((ℝ → ℝ) → (ℤ → ℝ)) and Quantization ((ℤ →
ℝ) → (ℤ → ℤ))
Sampling
$$x(n) := f(T \cdot n) .$$
$\frac{1}{T}$ is the sampling rate.
Synthesis function: Approximate signal 𝑓 given the sampled signal 𝑥.
Bandlimited function: Signal which only contains frequencies up to a certain threshold.
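Shannon Sampling Theorem: A bandlimited function can be perfectly reconstructed from a suitable set of samples. Let $f \in L^2(\mathbb{R})$ be $\Omega$‑bandlimited and $x$ the $T$‑sampled version of $f$ with $T = \frac{1}{2\Omega}$. Then $f$ can be reconstructed from $x$:
$$f(t) = \sum_{n=-\infty}^{\infty} x(n) \operatorname{sinc}\Big( \frac{t - nT}{T} \Big) = \sum_{n=-\infty}^{\infty} f\Big( \frac{n}{2\Omega} \Big) \operatorname{sinc}(2\Omega t - n)$$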
A sampling rate of $2\Omega$ Hz is sufficient for perfect reconstruction; it is called the Nyquist rate. $\Omega$ itself is called the Nyquist frequency.
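The reconstruction formula can be tried out with a truncated sum. The sketch below makes several assumptions that are not from the notes: it uses the convention $\operatorname{sinc}(t) = \sin(\pi t)/(\pi t)$ (consistent with the pair $b_1 \leftrightarrow \operatorname{sinc}$), it takes $f(t) = \operatorname{sinc}(t)^2$ as a test signal, whose spectrum is a triangle on $[-1, 1]$ so that $\Omega = 1$, and it cuts the infinite sum off at $|n| \le$ `n_max`.

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi*t) / (pi*t), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def reconstruct(x, T, t, n_max):
    """Truncated Shannon sum: sum over |n| <= n_max of x(n) * sinc((t - n*T) / T)."""
    return sum(x(n) * sinc((t - n * T) / T) for n in range(-n_max, n_max + 1))

def f(t):
    """Test signal, bandlimited to Omega = 1."""
    return sinc(t) ** 2

Omega = 1.0
T = 1.0 / (2.0 * Omega)       # sampling period from the theorem
x = lambda n: f(T * n)        # T-sampled version of f

approx = reconstruct(x, T, 0.3, n_max=500)
exact = f(0.3)
```

With 1001 samples the truncated sum agrees with `exact` to several decimal places; the remaining error comes only from the cut-off, not from the sampling.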
If $x$ is the $T$‑sampled version of $f$, then:
$$\hat{x}(\omega) = \frac{1}{T} \sum_{k \in \mathbb{Z}} \hat{f}\Big( \frac{\omega + k}{T} \Big)$$
Quantization
Partition ℝ into contiguous intervals, represent each interval by a codeword. Coder maps real value
to codeword, decoder maps back to real value.
Uniform Scalar Quantizer: Uniform width 𝑄 of quantizer levels, except the first and last level which
extend to infinity.
Non‑uniform quantization: Coder compresses the amplitude range before performing a uniform
quantization, decoder expands amplitude range after reversing the quantization. 𝜇‑Law compressor
function for 𝜇 > 0:
$$c_\mu(x) := x_{\max} \cdot \frac{\ln\big( 1 + \mu \frac{|x|}{x_{\max}} \big)}{\ln(1 + \mu)} \cdot \operatorname{sign}(x)$$
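A small sketch of the compressor, together with an expander obtained by solving $y = c_\mu(x)$ for $x$. The expander formula is derived here and not taken from the notes; the default $\mu = 255$ is a commonly used value and likewise an assumption, as are the function names.

```python
import math

def mu_law_compress(x, mu=255.0, x_max=1.0):
    """c_mu(x) = x_max * ln(1 + mu*|x|/x_max) / ln(1 + mu) * sign(x)."""
    magnitude = x_max * math.log(1.0 + mu * abs(x) / x_max) / math.log(1.0 + mu)
    return math.copysign(magnitude, x)

def mu_law_expand(y, mu=255.0, x_max=1.0):
    """Inverse of mu_law_compress (derived by solving y = c_mu(x) for x)."""
    magnitude = (x_max / mu) * ((1.0 + mu) ** (abs(y) / x_max) - 1.0)
    return math.copysign(magnitude, y)

small = mu_law_compress(0.1)                  # small amplitudes are boosted
roundtrip = mu_law_expand(mu_law_compress(-0.3))
```

Because small amplitudes are stretched before the uniform quantizer is applied, they receive finer effective quantization steps than large amplitudes.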
A system or operator is a mapping that transforms signals. Here we only consider DT‑signals, i.e. the signal spaces are $\ell^p(\mathbb{Z})$. A system is linear if its input and output spaces are linear and the mapping is $\mathbb{C}$‑linear.
Common operators:
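$$T^k[x](n) := x(n - k) .$$
$$\downarrow M[x](n) := x(M \cdot n) .$$
$$\uparrow M[x](n) := \begin{cases} x\big( \frac{n}{M} \big), & M \mid n \\ 0, & \text{else} \end{cases} .$$
$$\operatorname{Cut}_\lambda[x](n) := \begin{cases} x(n), & |x(n)| \le \lambda \\ \operatorname{sgn}(x(n)) \cdot \lambda, & \text{else} \end{cases} .$$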
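The four operators above translate directly into higher-order functions. In this sketch a signal is modelled as a Python function from integers to numbers, which is a representation chosen for illustration; `cut` additionally assumes a real-valued signal so that the sign is well defined.

```python
import math

def shift(x, k):
    """T^k[x](n) = x(n - k)."""
    return lambda n: x(n - k)

def downsample(x, M):
    """(down M)[x](n) = x(M * n)."""
    return lambda n: x(M * n)

def upsample(x, M):
    """(up M)[x](n) = x(n / M) if M divides n, else 0."""
    return lambda n: x(n // M) if n % M == 0 else 0

def cut(x, lam):
    """Cut_lambda[x](n): clip real values to the range [-lam, lam]."""
    def y(n):
        v = x(n)
        return v if abs(v) <= lam else math.copysign(lam, v)
    return y

ramp = lambda n: n   # test signal x(n) = n
```

Usage on the ramp signal: `shift(ramp, 2)(5)` gives 3, `downsample(ramp, 3)(2)` gives 6, `upsample(ramp, 3)(7)` gives 0, and `cut(ramp, 4)(10)` gives 4.0.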
A linear system $T$ is continuous if it transforms a convergent sequence of input signals into a convergent sequence of output signals. A linear system $T$ is time invariant if it commutes with every time‑shift. Linear, time invariant systems are called LTI‑systems.
Convolution
$$(x * y)(n) := \sum_{k \in \mathbb{Z}} x(k) y(n - k) .$$
• Convolution theorem: $\widehat{x * y} = \hat{x} \cdot \hat{y}$
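For finitely supported signals both sides of the convolution theorem can be evaluated exactly, which makes a quick numerical check possible. The sketch assumes signals stored as lists starting at index 0; `convolve` and `dtft` are illustrative names, with `dtft` following the formula $\hat{x}(\omega) = \sum_k x(k) e^{-2\pi i \omega k}$.

```python
import cmath

def convolve(x, y):
    """(x * y)(n) = sum_k x(k) * y(n - k) for lists supported on 0..len-1."""
    out = [0j] * (len(x) + len(y) - 1)
    for k, xk in enumerate(x):
        for m, ym in enumerate(y):
            out[k + m] += xk * ym
    return out

def dtft(x, omega):
    """Fourier transform of a finite DT signal at frequency omega."""
    return sum(v * cmath.exp(-2j * cmath.pi * omega * k) for k, v in enumerate(x))

x = [1, 2, 3]
y = [1, -1]
z = convolve(x, y)

omega = 0.15
lhs = dtft(z, omega)                     # transform of the convolution
rhs = dtft(x, omega) * dtft(y, omega)    # product of the transforms
```

Here `z` has the values 1, 1, 1, -3, and `lhs` and `rhs` agree up to rounding for any `omega`.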
A continuous LTI‑system 𝑇 coincides with the convolution operator of its impulse response ℎ ≔ 𝑇 [𝛿],
i.e. 𝑇 [𝑥] = 𝐶ℎ [𝑥] = ℎ ∗ 𝑥. ℎ(𝑛) is called the 𝑛‑th filter coefficient of 𝑇 .
A linear system 𝑇 ∶ ℓ𝑝 (ℤ) → ℓ𝑝 (ℤ) is called stable if it is continuous and 𝑇 [𝛿 𝑘 ] ∈ ℓ1 (ℤ) for all 𝑘 ∈ ℤ.
Special case for a system on ℓ∞ (ℤ): BIBO‑stable (bounded input, bounded output). 𝑇 is stable and
time‑invariant iff there exists an ℎ ∈ ℓ1 (ℤ) such that 𝑇 = 𝐶ℎ .
A continuous LTI‑system is called an FIR‑system if only finitely many filter coefficients are non‑zero. Otherwise it is called an IIR‑system. The system is called causal if $h(n) = 0$ for all $n < 0$. The length $\ell(h)$ of an FIR‑filter $h$ is the number of elements from its first to its last non‑zero element (inclusive). The order of a causal FIR‑filter $h$ is $\ell(h) - 1$.
The Fourier transform of the impulse response $h$ is called the frequency response $\hat{h}$ or $H$. For each frequency $\omega$, the frequency sequence $f_\omega$ is an eigenvector of the convolution operator $C_h$ for the eigenvalue $H(\omega)$:
$$C_h[f_\omega] = H(\omega) f_\omega , \quad \text{where } f_\omega := (e^{2\pi i \omega k})_{k \in \mathbb{Z}}$$
• For real‑valued $h$: $H(\omega) = \overline{H(-\omega)}$, and thus $|H(\omega)| = |H(-\omega)|$ and $\Phi_h(-\omega) = -\Phi_h(\omega)$
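The eigenvector relation can be verified numerically for an FIR filter. In this sketch the filter is stored as a dictionary mapping index to coefficient, and the example coefficients are arbitrary; both choices are assumptions made for illustration.

```python
import cmath

def freq_response(h, omega):
    """H(omega) = sum_n h(n) * e^{-2*pi*i*omega*n} for a finitely supported h."""
    return sum(c * cmath.exp(-2j * cmath.pi * omega * n) for n, c in h.items())

def apply_filter(h, x, n):
    """C_h[x](n) = (h * x)(n) = sum_k h(k) * x(n - k)."""
    return sum(c * x(n - k) for k, c in h.items())

h = {0: 0.5, 1: 0.25, 2: 0.25}     # arbitrary real-valued example filter
omega = 0.1
f_omega = lambda k: cmath.exp(2j * cmath.pi * omega * k)

H = freq_response(h, omega)
# Largest deviation between C_h[f_omega](n) and H(omega) * f_omega(n)
max_error = max(abs(apply_filter(h, f_omega, n) - H * f_omega(n)) for n in range(-5, 6))
```

`max_error` is at rounding level, so filtering a pure frequency only scales it by the complex number $H(\omega)$. Since `h` is real-valued, `freq_response(h, -omega)` is also the complex conjugate of `H`.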
𝑧‑Transform
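The values of a discrete signal can be interpreted as the coefficients of a polynomial (for finite signals) or of a formal power series (for infinite signals) in the variable $z$. The $z$‑transform formalizes this idea: let $x : \mathbb{Z} \to \mathbb{C}$. The $z$‑transform $Z[x] =: X$ of $x$ is the function $X : D_x \to \mathbb{C}$ defined by
$$X(z) = \sum_{n \in \mathbb{Z}} x(n) \cdot z^{-n} ,$$
where $D_x \subseteq \mathbb{C}$ is the set of all $z$ for which the series converges.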
The 𝑧‑transform of the impulse response of a continuous LTI‑system is called the transfer‑function
of the system.
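For a finite signal the $z$‑transform is a finite sum and can be evaluated directly. The sketch below uses a dictionary from index to value as the signal representation and a two-tap example filter, both chosen for illustration only. It also evaluates the transfer function on the unit circle $z = e^{2\pi i \omega}$, where it coincides with the Fourier transform $\sum_n x(n) e^{-2\pi i \omega n}$.

```python
import cmath

def z_transform(x, z):
    """X(z) = sum_n x(n) * z^(-n) for a finitely supported signal x (dict n -> x(n))."""
    return sum(c * z ** (-n) for n, c in x.items())

x = {0: 1.0, 1: -0.5}            # X(z) = 1 - 0.5 / z

at_zero = z_transform(x, 0.5)    # z = 0.5 is a zero of X
at_one = z_transform(x, 1.0)     # sum of all coefficients

omega = 0.2
on_circle = z_transform(x, cmath.exp(2j * cmath.pi * omega))
```

Evaluating at `z = 1.0` simply sums the coefficients, which is the response of the filter to a constant input.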
Filter Design
Coefficients of an ideal low‑pass filter with cut‑off frequency $\omega_0 \in (0, \frac{1}{2}]$: $k \mapsto 2\omega_0 \operatorname{sinc}(2\omega_0 k)$
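The coefficient formula above is easy to tabulate. The sketch assumes the normalized convention $\operatorname{sinc}(t) = \sin(\pi t)/(\pi t)$ and, purely for illustration, keeps only the taps $k = -3, \dots, 3$; truncating the infinite ideal filter like this is what produces the ripples of a real low-pass filter.

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi*t) / (pi*t), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def ideal_lowpass(omega0, k):
    """k-th coefficient of the ideal low-pass filter: 2*omega0 * sinc(2*omega0*k)."""
    return 2.0 * omega0 * sinc(2.0 * omega0 * k)

omega0 = 0.25
# Coefficients for k = -3..3 (a truncated, hence non-ideal, version of the filter)
taps = [ideal_lowpass(omega0, k) for k in range(-3, 4)]
```

For $\omega_0 = 0.25$ the centre tap is $0.5$, the taps at $k = \pm 1$ are $1/\pi$, and the taps at $k = \pm 2$ vanish. The coefficients are even in $k$ and decay only like $1/k$.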
Amplitude response of real low‑pass filter: pass band oscillates around 1, transition band goes straight
from 1 to 0, stop band oscillates around 0. Pass band is at 1 ± 𝛿1 , stop band at 0 + 𝛿2 . Pass band to
transition band at frequency 𝜔𝑝 , transition band to stop band at frequency 𝜔𝑠 .
In general, the phase response of a filter induces a time‑shift in the filtered signal that depends on the frequency. A filter has linear phase when $\Phi_h(\omega) = c\omega \bmod 1$ for some $c \in \mathbb{R}$. In this case, the time‑delay is independent of the frequency. This time‑delay is formalized using the group delay defined by
$$\tau_h(\omega) := -\frac{\mathrm{d}\Phi_h}{\mathrm{d}\omega}(\omega) .$$
Filters with linear phase have a constant group delay.
Even filters have a purely real frequency response, odd filters a purely imaginary one. The frequency response of a real‑valued filter has an even real part and an odd imaginary part.
A causal FIR‑filter $h$ that satisfies the symmetry requirement $h(k) = h(2N - k)$ or the anti‑symmetry requirement $h(k) = -h(2N - k)$ has linear phase.
Averaging filter: $h_N(n) = \frac{1}{N}$ if $0 \le n < N$ and $h_N(n) = 0$ otherwise
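The averaging filter is a convenient example of a symmetric causal FIR filter, so its linear phase can be checked numerically. The sketch assumes an odd length $N = 5$, for which the filter is symmetric about index $(N - 1)/2 = 2$; multiplying $H(\omega)$ by $e^{2\pi i \omega (N-1)/2}$ undoes this delay and should leave a purely real value. The function names are illustrative.

```python
import cmath

def averaging_filter(N):
    """h_N(n) = 1/N for 0 <= n < N, stored as a list indexed by n."""
    return [1.0 / N] * N

def freq_response(h, omega):
    """H(omega) = sum_n h(n) * e^{-2*pi*i*omega*n}."""
    return sum(c * cmath.exp(-2j * cmath.pi * omega * n) for n, c in enumerate(h))

N = 5
h = averaging_filter(N)
delay = (N - 1) / 2     # centre of symmetry of the coefficients

def rotated(omega):
    """H(omega) with the delay phase factor removed; real-valued if the phase is linear."""
    return freq_response(h, omega) * cmath.exp(2j * cmath.pi * omega * delay)

dc_gain = freq_response(h, 0.0)
```

`rotated(omega).imag` vanishes up to rounding for every `omega`, and `dc_gain` is 1, so constant signals pass through the filter unchanged.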
The Fourier coefficients carry no explicit temporal information about the signal; they describe its frequency content averaged over the whole signal. The windowed Fourier transform is used to obtain temporal information as well.
A window function is any 𝑔 ∈ 𝐿2 (ℝ) with ‖𝑔‖2 > 0. For a window function 𝑔 we define (musical)
notes of frequency 𝜔 and time position 𝑡 as
• The center of $g$:
$$t_0(g) := \int_{\mathbb{R}} t \, |g(t)|^2 \, \mathrm{d}t$$
• The width of $g$:
$$T(g) := \sqrt{\int_{\mathbb{R}} (t - t_0)^2 \, |g(t)|^2 \, \mathrm{d}t}$$
The product of the widths of $g$ and $\hat{g}$ is at least $\frac{1}{4\pi}$.
$$f(u) = \frac{1}{\|g\|^2} \int_{\mathbb{R}} \int_{\mathbb{R}} \tilde{f}(\omega, t) \, g_{\omega,t}(u) \, \mathrm{d}\omega \, \mathrm{d}t$$
Discrete WFT
$$f(u) = \tau\nu \sum_{n \in \mathbb{Z}} \sum_{m \in \mathbb{Z}} \tilde{f}(m\nu, n\tau) \, e^{2\pi i m \nu u} \, \frac{g(u - n\tau)}{H^{\tau}(u)}$$
For:
2D Signal Processing
CT signals 𝑓 ∶ ℝ2 → ℂ, DT signals 𝑥 ∶ ℤ2 → ℂ.
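$$\ell^p(\mathbb{Z}^2) = \Big\{ x : \mathbb{Z}^2 \to \mathbb{C} \;\Big|\; \sum_{n_1} \sum_{n_2} |x(n_1, n_2)|^p < \infty \Big\}$$
$$L^p(\mathbb{R}^2) = \Big\{ f : \mathbb{R}^2 \to \mathbb{C} \text{ measurable} \;\Big|\; \int_{\mathbb{R}} \int_{\mathbb{R}} |f(t_1, t_2)|^p \, \mathrm{d}t_1 \, \mathrm{d}t_2 < \infty \Big\}$$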
Linearity, continuity and time‑invariance of systems carry over to the 2D case, with the 2D shift operator $\tau_{k_1,k_2}[x](n_1, n_2) := x(n_1 - k_1, n_2 - k_2)$.
2D Fourier Transform
When sampling 2D signals we measure the sampling rate in each dimension separately.
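2D sampling theorem: Let $f \in L^2(\mathbb{R}^2)$ be an $(\Omega_1, \Omega_2)$‑bandlimited function and $x$ the $(T_1, T_2)$‑sampled version of $f$ with $T_1 = \frac{1}{2\Omega_1}$, $T_2 = \frac{1}{2\Omega_2}$. Then $f$ can be reconstructed from $x$ using
$$f(t_1, t_2) = \sum_{n_1 \in \mathbb{Z}} \sum_{n_2 \in \mathbb{Z}} x(n_1, n_2) \operatorname{sinc}\Big( \frac{t_1 - n_1 T_1}{T_1} \Big) \operatorname{sinc}\Big( \frac{t_2 - n_2 T_2}{T_2} \Big) .$$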
– $\mathrm{H}^2 x = -x$
– $\mathrm{H}^{-1} = -\mathrm{H}$