This paper presents a generic approach to highly efficient image registration in two and three dimensions. Both monomodal and multimodal registration problems are considered. We focus on the important class of affine-linear transformations in a derivative-based optimization framework. Our main contribution is an explicit formulation of the objective function gradient and Hessian approximation that allows for very efficient, parallel derivative calculation with virtually no memory requirements. The flexible parallelism of our concept allows for direct implementation on various hardware platforms. Derivative calculations are fully matrix free and operate directly on the input data, thereby reducing the auxiliary space requirements from \({\mathcal {O}}(n)\) to \({\mathcal {O}}(1)\). The proposed approach is implemented on multicore CPU and GPU. Our GPU code outperforms a conventional matrix-based CPU implementation by more than two orders of magnitude, thus enabling usage in real-time scenarios. The computational properties of our approach are extensively evaluated, thereby demonstrating the performance gain for a variety of real-life medical applications.

J. Rühaak, L. König, F. Tramnitzke and J. Modersitzki received funding from the European Union, European Regional Development Fund, Grant No. 122-10-002. All authors declare that they have no conflicts of interest.
Extension to the three-dimensional case
In this appendix, explicit matrix-free calculation rules are derived for affine-linear registration of three-dimensional images with the SSD and NGF distance measures. Most definitions of the occurring functions are briefly repeated here to improve readability.
2.1 Sum of squared differences (SSD)
For any \(y:\Omega _{{\mathcal {R}}}\rightarrow {\mathbb {R}}^{3}\), the sum of squared differences (SSD) distance measure [28] is given by
Let \(y_w:{\mathbb {R}^{3}}\rightarrow {\mathbb {R}^{3}}, \quad x\,\mapsto \,Ax+b\) denote a three-dimensional affine-linear transformation with \(w=(w_1,\ldots ,w_{12})\) and
Setting \({\mathcal {D}}_{\text {SSD}}(w):={\mathcal {D}}_{\text {SSD}}({\mathcal {R}},{\mathcal {T}};y_w)\) yields the formulation of affine-linear image registration with SSD as minimization problem
with \({\mathcal {D}}_{\text {SSD}}:{\mathbb {R}^{12}}\rightarrow {\mathbb {R}}\). For discretization, the domain \(\Omega _{{\mathcal {R}}}\) is assumed to be cuboid and decomposed into n cells of equal size with center points \({\mathbf {x}}_{i},\, i=1,\ldots ,n\), arranged in lexicographical ordering. Using the midpoint quadrature rule for numerical integration, a discretized version of (25) reads
where \(\bar{h}\) denotes the volume of each cell. Multilinear interpolation with Dirichlet zero boundary conditions is used to evaluate the discrete template image at arbitrary coordinates.
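The discretization described above can be illustrated with a short NumPy sketch. It is only a sketch under stated assumptions: the function names `cell_centers`, `trilinear_zero` and `ssd` are hypothetical, the template is assumed to live on the same cell-centered grid as the reference, the zero boundary condition is realized by a one-voxel zero border, and the parameters are assumed to be ordered as the rows of \((A\,|\,b)\).

```python
import numpy as np

def cell_centers(shape, h):
    """Cell centers x_i as an (n, 3) array, first index running fastest."""
    axes = [(np.arange(m) + 0.5) * hk for m, hk in zip(shape, h)]
    grids = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel(order="F") for g in grids], axis=1)

def trilinear_zero(T, h, y):
    """Trilinear interpolation of T at points y (n, 3); zero outside the image."""
    Tp = np.pad(T, 1)                                  # zero border around the data
    top = np.array(Tp.shape) - 1.0
    q = np.clip(y / np.asarray(h) + 0.5, 0.0, top)     # index coordinates in Tp
    i0 = np.minimum(np.floor(q), top - 1).astype(int)  # lower corner of each cell
    f = q - i0                                         # fractional offsets in [0, 1]
    val = np.zeros(len(y))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                wx = f[:, 0] if dx else 1.0 - f[:, 0]
                wy = f[:, 1] if dy else 1.0 - f[:, 1]
                wz = f[:, 2] if dz else 1.0 - f[:, 2]
                val += wx * wy * wz * Tp[i0[:, 0] + dx, i0[:, 1] + dy, i0[:, 2] + dz]
    return val

def ssd(w, R, T, h):
    """Midpoint-rule SSD for w assumed ordered as the rows of (A | b)."""
    W = np.asarray(w, dtype=float).reshape(3, 4)
    A, b = W[:, :3], W[:, 3]
    x = cell_centers(R.shape, h)
    Tw = trilinear_zero(T, h, x @ A.T + b)             # template at y_w(x_i)
    r = Tw - R.ravel(order="F")                        # residual per cell
    return 0.5 * np.prod(h) * np.sum(r ** 2)           # prod(h) is the cell volume
```

For the identity parameters `[1,0,0,0, 0,1,0,0, 0,0,1,0]` and identical images, the sketch returns a distance of zero up to rounding, since the cell centers coincide with the interpolation nodes.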
Let \(({\mathbf {x}}_{i})_j\) denote the j-th component of \({\mathbf {x}}_{i}\in {\mathbb {R}}^{3}\). For transformation parameters \(w\in {\mathbb {R}^{12}}\), we define the vector
to construct the function
Using \({{\mathbf {y}}_{i}} = (y_i,y_{i+n},y_{i+2n})^\top \), we define
With \(R_i := {{\mathcal {R}}}({{\mathbf {x}}}_{i})\), we set
as residual function and finally
as the sum of all squared residual elements. Now, \(D_{\mathrm {SSD}}\) can be written as a composition of four functions:
2.1.1 Matrix-based differentiation
The differentiation of (28) is performed with the chain rule as
just as in the two-dimensional case. Again, we define the gradient as a row vector. The first two individual derivatives in (29) are given by
with \(I_{n} \in {\mathbb {R}^{n\;\times\;n}} \) as the identity matrix. Denoting the partial derivative with respect to the i-th component by \(\partial _i\) and defining \(\partial _i {\mathcal {T[y]}}\) as
it holds that
Finally, the derivative of the function y is given by
with the Kronecker product \(\otimes \) and the grid matrix X as
thus completing the analysis of the gradient components from (29). With
the Gauss–Newton approximation \(H_{\text {SSD}}\) of the Hessian matrix is given by
with \({\mathrm {d}}_2\psi =\bar{h}\). Again, note \(\frac{\partial r}{\partial T}=I_n\).
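To make the structure of the matrix-based derivatives concrete, the following NumPy sketch builds them for a handful of random stand-in points. It assumes that \(\frac{\partial y}{\partial w}=I_3\otimes X\) with the rows of X given by \(({\mathbf {x}}_i^\top ,1)\), that \(\frac{\partial T}{\partial y}\) consists of three diagonal blocks holding the template partial derivatives, and that w is ordered as the rows of \((A\,|\,b)\); all variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
x = rng.random((n, 3))                    # stand-in for the cell centers x_i
w = rng.random(12)                        # assumed ordering: rows of (A | b)
dT = rng.random((n, 3))                   # stand-in for the template partial derivatives

X = np.hstack([x, np.ones((n, 1))])       # grid matrix, shape (n, 4)
dy_dw = np.kron(np.eye(3), X)             # I_3 (x) X, shape (3n, 12)
dT_dy = np.hstack([np.diag(dT[:, j]) for j in range(3)])   # shape (n, 3n)

# y = (I_3 (x) X) w stacks all first components, then all second, then all third.
y = dy_dw @ w
A, b = w.reshape(3, 4)[:, :3], w.reshape(3, 4)[:, 3]
y_points = x @ A.T + b                    # row i is A x_i + b

dr = dT_dy @ dy_dw                        # product of the last two chain-rule factors
row_2 = np.kron(dT[2], X[2])              # row 2 of dr, built from point 2 alone
```

Here `y.reshape(3, n).T` agrees with `y_points`, and `dr[2]` agrees with `row_2`. The second observation is the key to a matrix-free formulation: each row of the product depends only on data at a single grid point, so the matrices never have to be stored.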
2.1.2 Matrix-free derivative calculation
With (31) and (32), it follows that
Using (30), it holds that
with \({\mathcal {T}_{w}}({{\mathbf {x}}}_i)\,\,{:=}\,\,{{\mathcal {T}}}(y_w({{\mathbf {x}}}_i))\). The explicit calculation rule for the objective function gradient in the three-dimensional case is therefore given by
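As a hedged illustration, assume \(\frac{\partial y}{\partial w}=I_3\otimes X\) with the parameter ordering \(w=(a_{11},a_{12},a_{13},b_1,\ldots ,a_{31},a_{32},a_{33},b_3)\), and write \(\tilde{{\mathbf {x}}}_i:=(({\mathbf {x}}_i)_1,({\mathbf {x}}_i)_2,({\mathbf {x}}_i)_3,1)\) as a shorthand introduced here only for this sketch. The gradient then takes the form of a single sum over the grid points:

\[
\nabla D_{\mathrm {SSD}}(w)=\bar{h}\sum _{i=1}^{n}\big ({\mathcal {T}}_w({\mathbf {x}}_i)-R_i\big )\,\big (\partial _1{\mathcal {T}}_w({\mathbf {x}}_i)\,\tilde{{\mathbf {x}}}_i,\ \partial _2{\mathcal {T}}_w({\mathbf {x}}_i)\,\tilde{{\mathbf {x}}}_i,\ \partial _3{\mathcal {T}}_w({\mathbf {x}}_i)\,\tilde{{\mathbf {x}}}_i\big )\in {\mathbb {R}}^{1\times 12},
\]

where \(\partial _j{\mathcal {T}}_w({\mathbf {x}}_i)\) is used to denote the j-th partial derivative of the interpolated template evaluated at \(y_w({\mathbf {x}}_i)\). Every summand depends only on data at the point i, so the sum can be evaluated in parallel as a reduction without storing any Jacobian matrix.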
The Gauss–Newton approximation to the Hessian for the SSD distance measure is defined as
By utilizing (34) and setting
it directly follows that
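The matrix-free rules of this subsection can be sketched in a few lines of NumPy. The function name `ssd_grad_hess_matrix_free` and its argument layout are assumptions made for illustration; the template values and partial derivatives at the transformed points are taken as given inputs, and w is again assumed to be ordered as the rows of \((A\,|\,b)\). The explicit loop is kept for clarity; each iteration is independent, so a parallel implementation would turn it into a reduction.

```python
import numpy as np

def ssd_grad_hess_matrix_free(x, Tw, dTw, R, hbar):
    """Accumulate the SSD gradient (12,) and Gauss-Newton Hessian (12, 12).

    x    : (n, 3) grid points x_i
    Tw   : (n,)   template values T(y_w(x_i))
    dTw  : (n, 3) template partial derivatives at y_w(x_i)
    R    : (n,)   reference values R(x_i)
    hbar : cell volume
    """
    grad = np.zeros(12)
    hess = np.zeros((12, 12))
    for i in range(len(x)):
        xt = np.append(x[i], 1.0)          # (x_1, x_2, x_3, 1)
        v = np.kron(dTw[i], xt)            # i-th Jacobian row, built on the fly
        grad += (Tw[i] - R[i]) * v         # residual times Jacobian row
        hess += np.outer(v, v)             # rank-one update of the Hessian approximation
    return hbar * grad, hbar * hess
```

Apart from the 12 + 144 accumulator entries, the loop needs only a constant amount of working memory per point, in contrast to the matrix-based variant that stores Jacobian factors whose size grows with n.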
2.2 Normalized gradient fields (NGF)
We consider the NGF distance measure [16]
\(\langle a,b \rangle _{\alpha ,\beta }:=\sum _{i=1}^{3}a_ib_i+\alpha \beta ,\ a,b\in {\mathbb {R}}^{3}\), \(\Vert a\Vert _\varepsilon :=\sqrt{\sum _{i=1}^3 a_i^2+\varepsilon ^2}\), with separate edge parameters for reference and template image, cf. [35]. Setting \({\mathcal {D}}_{\text {NGF}}(w) := {\mathcal {D}}_{\text {NGF}}({\mathcal {R}},{\mathcal {T}};y_w)\), affine-linear image registration with NGF translates to
For numerical optimization, the continuous formulation in (37) is discretized. For a reference image of size \(n_1\;\times\;n_2\;\times\;n_3\) and an index \(i,\ i=1,\ldots ,n\), let \(i', j',k'\in {\mathbb {N}},1\le i'\le n_1,\ 1\le j'\le n_2,\ 1\le k'\le n_3\) such that \(i = i' + (j'-1)n_1 + (k'-1)n_1n_2\). The indices of neighboring points with Neumann zero boundary conditions are given by
We define functions
for gradient and scalar product type operations at the position i, respectively. Further setting
the discretized version of (37) is given by
with \((T_w)_i = {\mathcal {T}}(y_w({{\mathbf {x}}}_i))\).
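A compact NumPy sketch of the discretized NGF distance may help to fix ideas. It rests on several assumptions that are not confirmed by the text above: the gradient operation is taken to be a central difference over the two neighbors along each axis, neighbor indices are clamped at the image boundary to realize the Neumann zero condition, and the distance is taken as \(\bar{h}\sum _i (1-r_i^2)\). The names `neighbors`, `grad_fd` and `ngf` are hypothetical, and indices are zero-based.

```python
import numpy as np

AXIS = (2, 1, 0, 0, 1, 2)         # axes of the neighbors -z, -y, -x, +x, +y, +z
SIGN = (-1, -1, -1, 1, 1, 1)

def neighbors(shape):
    """Linear indices (n, 6) of the six neighbors of every point, clamped at the boundary."""
    idx = np.arange(np.prod(shape)).reshape(shape, order="F")   # i + j*n1 + k*n1*n2
    cols = []
    for axis, step in zip(AXIS, SIGN):
        sel = np.clip(np.arange(shape[axis]) + step, 0, shape[axis] - 1)
        cols.append(np.take(idx, sel, axis=axis).ravel(order="F"))
    return np.stack(cols, axis=1)

def grad_fd(U, nbr, h):
    """Central differences of the flattened image U, shape (n, 3)."""
    g = np.zeros((len(U), 3))
    for k in range(6):
        g[:, AXIS[k]] += SIGN[k] * U[nbr[:, k]] / (2.0 * h[AXIS[k]])
    return g

def ngf(Tw, R, shape, h, tau, rho):
    """NGF distance for flattened images Tw (deformed template) and R (reference)."""
    nbr = neighbors(shape)
    gT, gR = grad_fd(Tw, nbr, h), grad_fd(R, nbr, h)
    s = np.sum(gT * gR, axis=1) + tau * rho                     # <gT, gR>_{tau, rho}
    norm_T = np.sqrt(np.sum(gT ** 2, axis=1) + tau ** 2)
    norm_R = np.sqrt(np.sum(gR ** 2, axis=1) + rho ** 2)
    r = s / (norm_R * norm_T)
    return np.prod(h) * np.sum(1.0 - r ** 2)
```

By the Cauchy-Schwarz inequality, \(|r_i|\le 1\), so the value is non-negative; it vanishes when both images and both edge parameters coincide.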
2.2.1 Matrix-based differentiation
Let y and T be as in (26) and (27). We define the residual function \(r:{\mathbb {R}^{n}}\rightarrow {\mathbb {R}^{n}}\) by setting the i-th component function \(r_i:{\mathbb {R}^{n}}\rightarrow {\mathbb {R}}\) to
The reduction function \(\psi :{\mathbb {R}^{n}}\rightarrow {\mathbb {R}}\) is given by
yielding the function chain
The derivatives of T and y have already been computed in (31) and (32). For the reduction function \(\psi \), it holds that
The calculation of \(\frac{\partial r}{\partial T}\) is performed by differentiating the component functions \(r_i,\; i=1,\ldots ,n\). The functions \(r_i\) are composed of \(s_i\), \(g_i\) and \(n_\varepsilon \) whose derivatives are given by

with \(\frac{\partial g_i}{\partial T}\in {\mathbb {R}^{3\;\times\;n}}\). Applying the chain rule in both numerator and denominator of \(r_i\) yields
with the vector entries at positions \(i_{-z},i_{-y},i_{-x},i_{+x},i_{+y},\) and \(i_{+z}\) (in that order) as defined in (38). Note that these positions may coincide, in which case the values are added.
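The sparsity of \(\frac{\partial r_i}{\partial T}\) can be made tangible with a small NumPy sketch that returns, for every i, the six entries in the order stated above. It assumes central differences for \(g_i\) and a precomputed zero-based neighbor index array `nbr` of shape (n, 6) with columns ordered as \(i_{-z},i_{-y},i_{-x},i_{+x},i_{+y},i_{+z}\); the function name `residual_and_row_entries` is hypothetical.

```python
import numpy as np

AXIS = np.array([2, 1, 0, 0, 1, 2])       # axes of the neighbors -z, -y, -x, +x, +y, +z
SIGN = np.array([-1, -1, -1, 1, 1, 1])

def residual_and_row_entries(T, R, nbr, h, tau, rho):
    """Return r (n,) and the six candidate nonzeros of each row of dr/dT, shape (n, 6)."""
    scale = SIGN / (2.0 * np.asarray(h)[AXIS])          # d g_i / d T at the six neighbors

    def g(U):                                           # central differences, shape (n, 3)
        out = np.zeros((len(U), 3))
        for k in range(6):
            out[:, AXIS[k]] += scale[k] * U[nbr[:, k]]
        return out

    gT, gR = g(T), g(R)
    nT = np.sqrt(np.sum(gT ** 2, axis=1) + tau ** 2)
    nR = np.sqrt(np.sum(gR ** 2, axis=1) + rho ** 2)
    s = np.sum(gT * gR, axis=1) + tau * rho
    r = s / (nR * nT)
    # Quotient rule: d r_i / d g = gR / (nR nT) - s gT / (nR nT^3)
    dr_dg = gR / (nR * nT)[:, None] - (s / (nR * nT ** 3))[:, None] * gT
    coeff = dr_dg[:, AXIS] * scale                      # chain rule through g_i
    return r, coeff
```

If a dense Jacobian is wanted for checking, it can be assembled with `np.add.at(J, (np.arange(n)[:, None], nbr), coeff)`, which adds the values of coinciding positions as noted above.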
The Gauss–Newton approximation \(H_{\text{NGF}}\) to the Hessian is given by
with \({\mathrm {d}}r\) defined as in (33) and \({\mathrm {d}}_2\psi =-{\bar{h}}\).
2.2.2 Matrix-free derivative calculation
Setting \(r_i:=\frac{s_i(g_i(T))}{ n_\varrho ( g_i(R))\ n_\tau ( g_i(T))}\) and \({\mathrm {d}}r_i:=\frac{\partial r_i}{\partial T}\frac{\partial T}{\partial y}\frac{\partial y}{\partial w}\), it holds with (39) that
As \(r_i\in {\mathbb {R}}\) are scalars, it suffices to derive a matrix-free description of the vectors \({\mathrm {d}}r_i\in {\mathbb {R}}^{12}\) to achieve a fully matrix-free formulation of the objective function gradient. Let \(1\le i \le n\) and define indices \(i_{-z},i_{-y},i_{-x},i_{+x},i_{+y},i_{+z}\) as in (38). With the definition
for \(i=1,\ldots ,n\), \(j=1,2,3\), \(k=1,\dots ,4\) and
it follows that
which according to (40) yields
completing the gradient calculation for the three-dimensional case. Since
the calculation of the Hessian approximation can directly be performed with the help of the matrix-free formulation of \({\mathrm {d}}r_i\) from (41). By defining the matrices \(l_k\in {\mathbb {R}}^{12\;\times\;12}\) as
analogous to the case of SSD, the matrix-free formulation for the Gauss–Newton approximation to the Hessian is given by
This completes the derivation of matrix-free calculation rules for the objective function gradient and the Gauss–Newton approximation to the Hessian for the Normalized Gradient Fields distance measure with three-dimensional images.
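The accumulation pattern behind the NGF rules can be sketched as follows. The sketch takes the residuals, the six row entries of \(\frac{\partial r_i}{\partial T}\) and the zero-based neighbor indices as given inputs, and it assumes \(\psi (r)=\bar{h}\sum _i(1-r_i^2)\), so that \(\frac{\partial \psi }{\partial r_i}=-2\bar{h}r_i\). The function name and argument layout are hypothetical. The sum of outer products is returned unscaled, leaving the factor \({\mathrm {d}}_2\psi \) to the caller.

```python
import numpy as np

def ngf_grad_matrix_free(r, coeff, nbr, x, dTw, hbar):
    """Accumulate the NGF gradient (12,) and the sum of dr_i^T dr_i (12, 12).

    r     : (n,)   residuals r_i
    coeff : (n, 6) entries of dr_i/dT at the six neighbor positions
    nbr   : (n, 6) zero-based neighbor indices, same column order as coeff
    x     : (n, 3) grid points
    dTw   : (n, 3) template partial derivatives at y_w(x_i)
    hbar  : cell volume
    """
    grad = np.zeros(12)
    outer_sum = np.zeros((12, 12))
    for i in range(len(r)):
        dr_i = np.zeros(12)
        for k in range(6):
            m = nbr[i, k]                                   # neighbor contributing to r_i
            dr_i += coeff[i, k] * np.kron(dTw[m], np.append(x[m], 1.0))
        grad += r[i] * dr_i
        outer_sum += np.outer(dr_i, dr_i)
    return -2.0 * hbar * grad, outer_sum
```

Each \({\mathrm {d}}r_i\) touches at most six grid points, so the work per point is constant and the points can be processed independently before the final reduction.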
