A matrix-free approach to efficient affine-linear image registration on CPU and GPU

  • Special Issue Paper
  • Journal of Real-Time Image Processing

Abstract

This paper presents a generic approach to highly efficient image registration in two and three dimensions. Both monomodal and multimodal registration problems are considered. We focus on the important class of affine-linear transformations in a derivative-based optimization framework. Our main contribution is an explicit formulation of the objective function gradient and Hessian approximation that allows for very efficient, parallel derivative calculation with virtually no memory requirements. The flexible parallelism of our concept allows for direct implementation on various hardware platforms. Derivative calculations are fully matrix free and operate directly on the input data, thereby reducing the auxiliary space requirements from \({\mathcal {O}}(n)\) to \({\mathcal {O}}(1)\). The proposed approach is implemented on multicore CPU and GPU. Our GPU code outperforms a conventional matrix-based CPU implementation by more than two orders of magnitude, thus enabling usage in real-time scenarios. The computational properties of our approach are extensively evaluated, thereby demonstrating the performance gain for a variety of real-life medical applications.

References

  1. Alavi, A., et al.: Is PET-CT the only option? Eur. J. Nucl. Med. Mol. Imaging 34, 819–821 (2007)

  2. Berg, R., König, L., Rühaak, J., Lausen, R., Fischer, B.: Highly efficient image registration for embedded systems using a distributed multicore DSP architecture. J. Real Time Image Process. (2014). doi:10.1007/s11554-014-0457-3

  3. Björck, A.: Numerical Methods for Least Squares Problems. SIAM, Philadelphia (1996)

  4. Bronsert, P., Enderle-Ammour, K., Bader, M., Timme, S., Kuehs, M., Csanadi, A., Kayser, G., Kohler, I., Bausch, D., Hoeppner, J., et al.: Cancer cell invasion and EMT marker expression: a three-dimensional study of the human cancer-host interface. J. Pathol. 234(3), 410–422 (2014)

  5. Brown, L.G.: A survey of image registration techniques. ACM Comput. Surv. (CSUR) 24(4), 325–376 (1992)

  6. Buluc, A., Gilbert, J.R.: Parallel sparse matrix-matrix multiplication and indexing: implementation and experiments. SIAM J. Sci. Comput. 34(4), C170–C191 (2012)

  7. Castro-Pareja, C.R., Jagadeesh, J.M., Shekhar, R.: FAIR: a hardware architecture for real-time 3-D image registration. IEEE Trans. Inf. Technol. Biomed. 7(4), 426–434 (2003)

  8. Collignon, A., Maes, F., Delaere, D., Vandermeulen, D., Suetens, P., Marchal, G.: Automated multi-modality image registration based on information theory. Inf. Process. Med. Imaging 3, 264–274 (1995)

  9. Davis, T.A.: Direct Methods for Sparse Linear Systems, vol. 2. SIAM, Philadelphia (2006)

  10. De Luca, V., Benz, T., Kondo, S., König, L., Lübke, D., Rothlübbers, S., Somphone, O., Allaire, S., Bell, M.L., Chung, D., et al.: The 2014 liver ultrasound tracking benchmark. Phys. Med. Biol. 60(14), 5571 (2015)

  11. Dennis Jr, J.E., Schnabel, R.B.: Numerical Methods for Unconstrained Optimization and Nonlinear Equations, vol. 16. SIAM, Philadelphia (1996)

  12. Ferroli, P., Franzini, A., Marras, C., Maccagnano, E., D’Incerti, L., Broggi, G.: A simple method to assess accuracy of deep brain stimulation electrode placement: pre-operative stereotactic CT + postoperative MR image fusion. Stereotact. Func. Neurosurg. 82(1), 14–19 (2004)

  13. Fischer, B., Modersitzki, J.: Fast diffusion registration. Contemp. Math. 313, 117–128 (2002)

  14. Gigengack, F., Ruthotto, L., Burger, M., Wolters, C.H., Jiang, X., Schafers, K.P.: Motion correction in dual gated cardiac PET using mass-preserving image registration. IEEE Trans. Med. Imaging 31(3), 698–712 (2012)

  15. Haber, E., Modersitzki, J.: A multilevel method for image registration. SIAM J. Sci. Comput. 27(5), 1594–1607 (2006)

  16. Haber, E., Modersitzki, J.: Intensity gradient based registration and fusion of multi-modal images. Methods Inf. Med. 46, 292–9 (2007)

  17. Haber, E., Heldmann, S., Modersitzki, J.: An octree method for parametric image registration. SIAM J. Sci. Comput. 29(5), 2008–2023 (2007)

  18. Haber, E., Heldmann, S., Modersitzki, J.: Adaptive mesh refinement for nonparametric image registration. SIAM J. Sci. Comput. 30(6), 3012–3027 (2008)

  19. Harris, M., et al.: Optimizing parallel reduction in CUDA. NVIDIA Developer Technology 2(4) (2007). http://docs.nvidia.com/cuda/samples/6_Advanced/reduction/doc/reduction.pdf

  20. Kabus, S., Lorenz, C.: Fast elastic image registration. In: Proceedings of the medical image analysis for the clinic: a grand challenge, pp. 81–89. (2010)

  21. Köhn, A., Drexl, J., Ritter, F., König, M., Peitgen, HO.: GPU accelerated image registration in two and three dimensions. In: Bildverarbeitung für die Medizin 2006, Springer, pp. 261–265 (2006)

  22. König, L., Rühaak, J.: A fast and accurate parallel algorithm for non-linear image registration using normalized gradient fields. In: 2014 IEEE 11th international symposium on biomedical imaging (ISBI), pp. 580–583 (2014)

  23. König, L., Kipshagen, T., Rühaak, J.: A non-linear image registration scheme for real-time liver ultrasound tracking using normalized gradient fields. In: Proceedings of MICCAI challenge on liver ultrasound tracking (CLUST 2014) (2014)

  24. König, L., Derksen, A., Hallmann, M., Papenberg, N.: Parallel and memory efficient multimodal image registration for radiotherapy using normalized gradient fields. In: 2015 IEEE 12th international symposium on biomedical imaging (ISBI) (2015)

  25. Lange, T., Papenberg, N., Heldmann, S., Modersitzki, J., Fischer, B., Lamecker, H., Schlag, P.M.: 3D ultrasound-CT registration of the liver using combined landmark-intensity information. Int. J. Comput. Assist. Radiol. Surg. 4(1), 79–88 (2009)

  26. Lombardi, F., Spigler, R.: The evolution of the approach to scientific computing: a survey. J. Parallel Cloud Comput. 3(2), 32–42 (2014)

  27. Maintz, J., Viergever, M.A.: A survey of medical image registration. Med. Image Anal. 2(1), 1–36 (1998)

  28. Modersitzki, J.: Numerical Methods for Image Registration. Oxford University Press, Oxford (2004)

  29. Modersitzki, J.: FAIR: Flexible Algorithms for Image Registration, vol. 6. SIAM, Philadelphia (2009)

  30. Murphy, K., Van Ginneken, B., Reinhardt, J.M., Kabus, S., Ding, K., Deng, X., Cao, K., Du, K., Christensen, G.E., Garcia, V., et al.: Evaluation of registration methods on thoracic CT: the EMPIRE10 challenge. IEEE Trans. Med. Imaging 30(11), 1901–1920 (2011)

  31. Nocedal, J., Wright, S.: Numerical Optimization. Springer, Berlin (1999)

  32. NVIDIA Corporation: NVIDIA CUDA C Programming Guide. NVIDIA Corporation, Santa Clara (2014)

  33. Powell, M.J.: An efficient method for finding the minimum of a function of several variables without calculating derivatives. The Comput. J. 7(2), 155–162 (1964)

  34. Rühaak, J., Heldmann, S., Kipshagen, T., Fischer, B.: Highly accurate fast lung CT registration. In: SPIE Medical Imaging 2013, image processing, pp. 86,690Y–86,690Y–9 (2013a)

  35. Rühaak, J., König, L., Hallmann, M., Papenberg, N., Heldmann, S., Schumacher, H., Fischer, B.: A fully parallel algorithm for multimodal image registration using normalized gradient fields. In: 2013 IEEE 10th international symposium on biomedical imaging (ISBI), pp. 572–575 (2013b)

  36. Rühaak, J., Derksen, A., Heldmann, S., Hallmann, M., Meine, H.: Accurate CT-MR image registration for deep brain stimulation: a multi-observer evaluation study. In: SPIE Medical Imaging 2015: image processing (2015)

  37. Salas Gonzalez, D., Górriz, J., Ramírez, J., Lassl, A., Puntonet, C.: Improved Gauss–Newton optimisation methods in affine registration of SPECT brain images. Electr. Lett. 44(22), 1291–1292 (2008)

  38. Schmitt, O., Modersitzki, J., Heldmann, S., Wirtz, S., Fischer, B.: Image registration of sectioned brains. Int. J. Comput. Vis. 73(1), 5–39 (2007)

  39. Shams, R., Sadeghi, P., Kennedy, R., Hartley, R.: A survey of medical image registration on multicore and the GPU. IEEE Sig. Process. Mag. 27(2), 50–60 (2010a)

  40. Shams, R., Sadeghi, P., Kennedy, R., Hartley, R.: Parallel computation of mutual information on the GPU with application to real-time registration of 3D medical images. Comput. Methods Prog. Biomed. 99(2), 133–146 (2010b)

  41. Shi, L., Liu, W., Zhang, H., Xie, Y., Wang, D.: A survey of GPU-based medical image computing techniques. Quant. Imaging Med. Surg. 2(3), 188 (2012)

  42. Sotiras, A., Davatzikos, C., Paragios, N.: Deformable medical image registration: a survey. IEEE Trans. Med. Imaging 32(7), 1153–1190 (2013)

  43. Soza, G., Bauer, M., Hastreiter, P., Nimsky, C., Greiner, G.: Non-rigid registration with use of hardware-based 3D Bézier functions. In: Medical image computing and computer-assisted intervention—MICCAI 2002, Springer, pp. 549–556 (2002)

  44. Stone, J.E., Gohara, D., Shi, G.: OpenCL: a parallel programming standard for heterogeneous computing systems. Comput. Sci. Eng. 12(3), 66 (2010)

  45. Stürmer, M., Köstler, H., Rüde, U.: A fast full multigrid solver for applications in image processing. Numer. Linear Algebra Appl. 15(2–3), 187–200 (2008)

  46. Tramnitzke, F., Rühaak, J., König, L., Modersitzki, J., Köstler, H.: GPU based affine linear image registration using normalized gradient fields. In: Proceedings of 7th international workshop on high performance computing for biomedical image analysis (HPC-MICCAI) (2014)

  47. Vercauteren, T., Pennec, X., Perchant, A., Ayache, N.: Diffeomorphic demons: efficient non-parametric image registration. NeuroImage 45(1), S61–S72 (2009)

  48. Verma, P.S., Wu, H., Langer, M.P., Das, I.J., Sandison, G.: Survey: real-time tumor motion prediction for image-guided radiation treatment. Comput. Sci. Eng. 13(5), 24–35 (2011)

  49. Viola, P., Wells III, W.M.: Alignment by maximization of mutual information. Int. J. Comput. Vis. 24(2), 137–154 (1997)

  50. Wilt, N.: The CUDA handbook: a comprehensive guide to GPU programming. Pearson Education, Upper Saddle River (2013)

  51. Zitova, B., Flusser, J.: Image registration methods: a survey. Image Vis. Comput. 21(11), 977–1000 (2003)

Acknowledgments

J. Rühaak, L. König, F. Tramnitzke and J. Modersitzki received funding from the European Union, European Regional Development Fund, Grant No. 122-10-002. All authors declare that they have no conflicts of interest.

Author information

Corresponding author

Correspondence to Jan Rühaak.

Appendix

Extension to the three-dimensional case

In this appendix, explicit matrix-free calculation rules are derived for affine-linear registration of three-dimensional images with the SSD and NGF distance measures. Most definitions of the occurring functions are briefly repeated here to improve readability.

2.1 Sum of squared differences (SSD)

For any \(y:\Omega _{{\mathcal {R}}}\rightarrow {\mathbb {R}}^{3}\), the sum of squared differences (SSD) distance measure [28] is given by

$$\begin{aligned} {\mathcal {D}}_{\text {SSD}}({\mathcal {R}},{\mathcal {T}};y) := \frac{1}{2} \int _{\Omega _{\mathcal {R}}} \left( {\mathcal {T}}(y({\mathbf {x}})) - {\mathcal {R}}({\mathbf {x}}) \right) ^2 {\mathrm {d}}{\mathbf {x}}. \end{aligned}$$

Let \(y_w:{\mathbb {R}^{3}}\rightarrow {\mathbb {R}^{3}}, \quad x\,\mapsto \,Ax+b\) denote a three-dimensional affine-linear transformation with \(w=(w_1,\ldots ,w_{12})\) and

$$ A: = \left( {\begin{array}{*{20}c} {w_{1} } & {w_{2} } & {w_{3} } \\ {w_{5} } & {w_{6} } & {w_{7} } \\ {w_{9} } & {w_{{10}} } & {w_{{11}} } \\ \end{array} } \right),\;b: = \left( {\begin{array}{*{20}c} {w_{4} } \\ {w_{8} } \\ {w_{{12}} } \\ \end{array} } \right). $$
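The row-wise parameter layout above can be illustrated with a short NumPy sketch. The helper names `unpack_affine` and `apply_affine` are hypothetical and chosen only for this illustration; the code merely assumes that points are stored as rows of an array of shape (n, 3).

```python
import numpy as np

def unpack_affine(w):
    """Split w = (w_1, ..., w_12) into A (3x3) and b (3,), row by row."""
    # Rows of W are (w1..w4), (w5..w8), (w9..w12): three entries of A, then one of b.
    W = np.asarray(w, dtype=float).reshape(3, 4)
    return W[:, :3], W[:, 3]

def apply_affine(w, x):
    """Evaluate y_w(x) = A x + b for points x stored as rows of an (n, 3) array."""
    A, b = unpack_affine(w)
    return np.asarray(x, dtype=float) @ A.T + b

# The identity transformation corresponds to A = I_3 and b = 0.
w_identity = np.array([1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0], dtype=float)
```

With this layout, the translation components sit at positions 4, 8 and 12 of w, which is why the gradient and Hessian formulas below fall into three blocks of four entries.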

Setting \({\mathcal {D}}_{\text {SSD}}(w):={\mathcal {D}}_{\text {SSD}}({\mathcal {R}},{\mathcal {T}};y_w)\) yields the formulation of affine-linear image registration with SSD as the minimization problem

$$\begin{aligned} \min _w \ {\mathcal {D}}_{\text {SSD}}(w) \end{aligned}$$
(25)

with \({\mathcal {D}}_{\text {SSD}}:{\mathbb {R}^{12}}\rightarrow {\mathbb {R}}\). For discretization, the domain \(\Omega _{{\mathcal {R}}}\) is assumed to be cuboid and decomposed into n cells of equal size with center points \({\mathbf {x}}_{i},\, i=1,\ldots ,n\), arranged in lexicographical ordering. Using the midpoint quadrature rule for numerical integration, a discretized version of (25) reads

$$\begin{aligned} \min _w \ D_{\mathrm {SSD}}(w) :=\frac{\bar{h}}{2} \displaystyle \sum _{i=1}^{n} \left( {\mathcal {T}}(y_w({\mathbf {x}}_i)) - {\mathcal {R}}({\mathbf {x}}_i) \right) ^2, \end{aligned}$$

where \(\bar{h}\) denotes the volume of each cell. Multilinear interpolation with Dirichlet zero boundary conditions is used to evaluate the discrete template image at arbitrary coordinates.
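The discretized objective can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the implementation evaluated in the paper: cell centers are assumed to lie at integer coordinates 0, ..., n_k - 1 (unit spacing), the zero boundary condition is realized by a one-voxel zero border, and the names `trilinear` and `ssd` are hypothetical.

```python
import numpy as np

def trilinear(T, pts):
    """Trilinear interpolation of volume T at pts (n, 3); zero outside the grid.

    Assumption: voxel centers sit at integer coordinates 0..n_k-1.
    """
    Tp = np.pad(np.asarray(T, dtype=float), 1)        # zero border (Dirichlet zero)
    hi = np.array(Tp.shape) - 1                       # largest valid index per axis
    p = np.clip(np.asarray(pts, dtype=float) + 1.0, 0.0, hi)  # shift into padded array
    i0 = np.minimum(np.floor(p).astype(int), hi - 1)  # lower corner of enclosing cell
    f = p - i0                                        # fractional offsets in [0, 1]
    out = np.zeros(len(p))
    for dx, dy, dz in np.ndindex(2, 2, 2):            # the 8 corners of the cell
        wgt = (np.abs(1 - dx - f[:, 0]) * np.abs(1 - dy - f[:, 1])
               * np.abs(1 - dz - f[:, 2]))
        out += wgt * Tp[i0[:, 0] + dx, i0[:, 1] + dy, i0[:, 2] + dz]
    return out

def ssd(R, T, w, hbar=1.0):
    """Midpoint-rule SSD between reference R and template T under parameters w."""
    axes = [np.arange(k) for k in R.shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    W = np.asarray(w, dtype=float).reshape(3, 4)
    y = grid @ W[:, :3].T + W[:, 3]                   # y_w(x_i) for all cell centers
    # Any fixed point ordering works, as long as R is flattened the same way.
    res = trilinear(T, y) - np.asarray(R, dtype=float).reshape(-1)
    return 0.5 * hbar * np.sum(res ** 2)
```

For the identity parameters and T equal to R, every residual vanishes, so `ssd` returns zero. Note that this sketch stores the grid and the residual vector explicitly and is therefore not matrix free; it only fixes the objective that the derivative rules refer to.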

Let \(({\mathbf {x}}_{i})_j\) denote the j-th component of \({\mathbf {x}}_{i}\in {\mathbb {R}}^{3}\). For transformation parameters \(w\in {\mathbb {R}^{12}}\), we define the vector

$$ v_{i} : = \left( {\begin{array}{*{20}c} {\left( {A{\mathbf{x}}_{1} + b} \right)_{i} } \\ {\left( {A{\mathbf{x}}_{2} + b} \right)_{i} } \\ \vdots \\ {\left( {A{\mathbf{x}}_{n} + b} \right)_{i} } \\ \end{array} } \right) \in {\mathbb{R}^{n}} , \quad i = 1,2,3, $$

to construct the function

$$ y:{\mathbb{R}^{{12}}} \to {\mathbb{R}^{{3n}}} ,\quad w \; \mapsto \left( {\begin{array}{*{20}c} {v_{1} } \\ {v_{2} } \\ {v_{3} } \\ \end{array} } \right). $$
(26)

Using \({{\mathbf {y}}_{i}} = (y_i,y_{i+n},y_{i+2n})^\top \), we define

$$ T:{\mathbb{R}^{{3n}}} \to {\mathbb{R}^{n}} ,\;\left( {\begin{array}{*{20}c} {y_{1} } \\ \vdots \\ {y_{{3n}} } \\ \end{array} } \right) \mapsto \left( {\begin{array}{*{20}c} {{\mathcal{T}}({\mathbf{y}}_{1} )} \\ \vdots \\ {{\mathcal{T}}({\mathbf{y}}_{n} )} \\ \end{array} } \right). $$
(27)

With \(R_i := {{\mathcal {R}}}({{\mathbf {x}}}_{i})\), we set

$$ r:{\mathbb{R}^{n}} \to {\mathbb{R}^{n}} ,\;\left( {\begin{array}{*{20}c} {T_{1} } \\ \vdots \\ {T_{n} } \\ \end{array} } \right) \mapsto \left( {\begin{array}{*{20}c} {T_{1} - R_{1} } \\ \vdots \\ {T_{n} - R_{n} } \\ \end{array} } \right) $$

as residual function and finally

$$ \psi :{\mathbb{R}^{n}} \to {\mathbb{R}},\;\left( {\begin{array}{*{20}c} {r_{1} } \\ \vdots \\ {r_{n} } \\ \end{array} } \right) \mapsto \frac{{\bar{h}}}{2}\sum\limits_{{i = 1}}^{n} {r_{i}^{2} } $$

as the sum of all squared residual elements. Now, \(D_{\mathrm {SSD}}\) can be written as a composition of four functions:

$$\begin{aligned} D_{\text {SSD}}: {{\mathbb {R}}^{12}\xrightarrow {y}}{{\mathbb {R}}^{3n}\xrightarrow {T}}{{\mathbb {R}}^{n}\xrightarrow {r}}{{\mathbb {R}}^{n}\xrightarrow {\psi }}{\mathbb {R}}. \end{aligned}$$
(28)

2.1.1 Matrix-based differentiation

The differentiation of (28) is performed with the chain rule as

$$\begin{aligned} \nabla D_{\text {SSD}}(w) = \frac{\partial \psi }{\partial r}\frac{\partial r}{\partial T}\frac{\partial T}{\partial y} \frac{\partial y}{\partial w} \end{aligned}$$
(29)

just as in the two-dimensional case. Again, we define the gradient as a row vector. The first two individual derivatives in (29) are given by

$$\begin{aligned} \frac{\partial \psi }{\partial r}[r]&= {\bar{h}}(r_1,\ldots ,r_{n}) \ \text {and} \nonumber \\ \frac{\partial r}{\partial T}[T]&= I_{n}, \end{aligned}$$
(30)

with \(I_{n} \in {\mathbb {R}^{n\;\times\;n}} \) as the identity matrix. Denoting the partial derivative with respect to the i-th component by \(\partial _i\) and defining \(\partial _i {\mathcal {T}}[y]\) as

$$ \partial _{i} {\mathcal{T}}[y]: = \left( {\begin{array}{*{20}c} {\partial _{i} {\mathcal{T}}({\mathbf{y}}_{1} )} & {} & {} \\ {} & \ddots & {} \\ {} & {} & {\partial _{i} {\mathcal{T}}({\mathbf{y}}_{n} )} \\ \end{array} } \right),\quad i = 1,2,3, $$

it holds that

$$ \frac{{\partial T}}{{\partial y}}[y] = \left( {\begin{array}{*{20}c} {\partial _{1} {\mathcal{T}}} & {\partial _{2} {\mathcal{T}}} & {\partial _{3} {\mathcal{T}}} \\ \end{array} } \right) \in {\mathbb{R}}^{{n\; \times \;3n}} . $$
(31)

Finally, the derivative of the function y is given by

$$\begin{aligned} \frac{\partial y}{\partial w}[w] = I_3 \otimes {\mathbf {X}} \in {\mathbb {R}^{3n\;\times\;12}} \end{aligned}$$
(32)

with the Kronecker product \(\otimes \) and the grid matrix X as

$$ {\mathbf{X}}: = \left( {\begin{array}{*{20}c} {({\mathbf{x}}_{1} )_{1} } & {({\mathbf{x}}_{1} )_{2} } & {({\mathbf{x}}_{1} )_{3} } & 1 \\ {({\mathbf{x}}_{2} )_{1} } & {({\mathbf{x}}_{2} )_{2} } & {({\mathbf{x}}_{2} )_{3} } & 1 \\ \vdots & \vdots & \vdots & \vdots \\ {({\mathbf{x}}_{n} )_{1} } & {({\mathbf{x}}_{n} )_{2} } & {({\mathbf{x}}_{n} )_{3} } & 1 \\ \end{array} } \right) \in {\mathbb{R}^{{n\; \times \;4}}} , $$

thus completing the analysis of the gradient components from (29). With

$$\begin{aligned} {\mathrm {d}} r := \frac{\partial r}{\partial T}\frac{\partial T}{\partial y}\frac{\partial y}{\partial w}\in {\mathbb {R}}^{n\;\times\;12}, \end{aligned}$$
(33)

the Gauss–Newton approximation \(H_{\text {SSD}}\) of the Hessian matrix is given by

$$\begin{aligned} H_{\text {SSD}}(w) := {\mathrm {d}} r^\top {\mathrm {d}}_2\psi {\mathrm {d}} r \end{aligned}$$

with \({\mathrm {d}}_2\psi =\bar{h}\). Again, note \(\frac{\partial r}{\partial T}=I_n\).
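The Kronecker structure in (32) can be checked numerically. The following sketch uses randomly chosen points in place of actual cell centers (an assumption made purely for illustration) and confirms that the stacked vector from (26) is reproduced by multiplying \(I_3 \otimes {\mathbf {X}}\) with w.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
x = rng.normal(size=(n, 3))              # stand-ins for the cell centers x_1..x_n
X = np.hstack([x, np.ones((n, 1))])      # grid matrix X in R^{n x 4}
dy_dw = np.kron(np.eye(3), X)            # I_3 (x) X in R^{3n x 12}

w = rng.normal(size=12)
W = w.reshape(3, 4)
A, b = W[:, :3], W[:, 3]

# Stack the first, second and third components of A x_i + b as (v_1, v_2, v_3).
y_direct = (x @ A.T + b).T.reshape(-1)
y_kron = dy_dw @ w                       # y is linear in w, so y(w) = (dy/dw) w
```

Both vectors agree up to rounding. The matrix `dy_dw` has 36n entries, of which only 12n are nonzero, which illustrates the storage that the matrix-based scheme spends on redundant grid information.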

2.1.2 Matrix-free derivative calculation

With (31) and (32), it follows that

$$\left( \frac{\partial T}{\partial y}\frac{\partial y}{\partial w}\right) _{i,j} \,= \, \left\{ \begin{array}{ll} \partial _1 {\mathcal{T}}({\mathbf{y}}_i) {\mathbf{X}}_{i,j}&{}1\le j \le 4\\ \partial _2 {\mathcal{T}}({\mathbf{y}}_{i}){\mathbf{X}}_{i,j-4}&{}5\le j \le 8\\ \partial _3 {\mathcal{T}}({\mathbf{y}}_{i}){\mathbf{X}}_{i,j-8}&{}9\le j \le 12 \end{array}\right. . $$
(34)

Using (30), it holds that

$$\begin{aligned} \left( \frac{\partial \psi }{\partial r}\right) _i = \bar{h}\left( {\mathcal {T}_{w}}({{\mathbf {x}}}_i) - {\mathcal R}({{\mathbf {x}}}_i)\right) \end{aligned}$$

with \({\mathcal {T}_{w}}({{\mathbf {x}}}_i)\,\,{:=}\,\,{{\mathcal {T}}}(y_w({{\mathbf {x}}}_i))\). The explicit calculation rule for the objective function gradient in the three-dimensional case is therefore given by

$$ \nabla D_{{{\text{SSD}}}} (w) = \bar{h}\sum\limits_{{i = 1}}^{n} {\left( {{\mathcal{T}}_{w} ({\mathbf{x}}_{i} ) - {\mathcal{R}}({\mathbf{x}}_{i} )} \right)} \left( {\begin{array}{*{20}c} {\partial _{1} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{1} } \\ {\partial _{1} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{2} } \\ {\partial _{1} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{3} } \\ {\partial _{1} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )} \\ {\partial _{2} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{1} } \\ {\partial _{2} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{2} } \\ {\partial _{2} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{3} } \\ {\partial _{2} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )} \\ {\partial _{3} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{1} } \\ {\partial _{3} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{2} } \\ {\partial _{3} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )({\mathbf{x}}_{i} )_{3} } \\ {\partial _{3} {\mathcal{T}}_{w} ({\mathbf{x}}_{i} )} \\ \end{array} } \right)^{{ \top }}. $$
(35)
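Rule (35) translates into a single accumulation loop over the grid points. The sketch below is a minimal illustration, not the parallel implementation described in the paper. To keep it self-contained, a hypothetical analytic function `template` with known gradient stands in for the interpolated template image and its derivatives; the name `ssd_and_gradient` is likewise an assumption.

```python
import numpy as np

def template(y):
    """Hypothetical smooth stand-in for T and its gradient at points y (n, 3)."""
    val = np.sin(y[:, 0]) * np.cos(y[:, 1]) + 0.5 * y[:, 2] ** 2
    grad = np.stack([np.cos(y[:, 0]) * np.cos(y[:, 1]),
                     -np.sin(y[:, 0]) * np.sin(y[:, 1]),
                     y[:, 2]], axis=1)
    return val, grad

def ssd_and_gradient(w, x, R, hbar):
    """Accumulate D_SSD(w) and its gradient point by point, following (35)."""
    W = np.asarray(w, dtype=float).reshape(3, 4)
    D, grad = 0.0, np.zeros(12)
    for xi, Ri in zip(x, R):
        yi = W[:, :3] @ xi + W[:, 3]              # y_w(x_i)
        Ti, dTi = template(yi[None, :])
        res = Ti[0] - Ri                          # T_w(x_i) - R(x_i)
        D += 0.5 * hbar * res ** 2
        # Entries d_j T * ((x_i)_1, (x_i)_2, (x_i)_3, 1) for j = 1, 2, 3.
        grad += hbar * res * np.kron(dTi[0], np.append(xi, 1.0))
    return D, grad
```

Apart from the 12 accumulators, no auxiliary storage that grows with n is needed. Since the summands are independent, the loop can be split among threads and combined by a reduction.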

The Gauss–Newton approximation to the Hessian for the SSD distance measure is defined as

$$\begin{aligned} H_{\mathrm {SSD}}&= {\mathrm {d}}r^\top {\mathrm {d}}_2\psi {\mathrm {d}}r\\&= {\bar{h}}\left( \frac{\partial T}{\partial y}\frac{\partial y}{\partial w}\right) ^\top \left( \frac{\partial T}{\partial y}\frac{\partial y}{\partial w}\right) \in {\mathbb {R}}^{12\;\times\;12}. \end{aligned}$$

By utilizing (34) and setting

$$ l_{k} : = \left( {\begin{array}{*{20}c} {} \\ {\left( {\frac{{\partial T}}{{\partial y}}\frac{{\partial y}}{{\partial w}}} \right)_{{k,i}} \cdot\left( {\frac{{\partial T}}{{\partial y}}\frac{{\partial y}}{{\partial w}}} \right)_{{k,j}} } \\ {} \\ \end{array} } \right)_{{1 \le i,j \le 12}} , $$
(36)

it directly follows that

$$\begin{aligned} H_{\mathrm {SSD}}(w) = \bar{h} \displaystyle \sum _{k=1}^{n} l_k. \end{aligned}$$
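The sum of the matrices \(l_k\) can be accumulated in the same way as the gradient. The following sketch assumes that the template derivatives at the transformed points are already available as rows of an array `dT`; the function name `gauss_newton_hessian` is chosen only for this illustration.

```python
import numpy as np

def gauss_newton_hessian(dT, x, hbar):
    """Accumulate H_SSD = hbar * sum_k l_k without forming the n x 12 Jacobian.

    dT: (n, 3) template derivatives at y_k; x: (n, 3) cell centers.
    """
    H = np.zeros((12, 12))
    for dTk, xk in zip(dT, x):
        row = np.kron(dTk, np.append(xk, 1.0))   # k-th row of (dT/dy)(dy/dw), cf. (34)
        H += hbar * np.outer(row, row)           # l_k from (36)
    return H
```

Because every \(l_k\) is symmetric, an implementation may restrict itself to the 78 entries of the upper triangle; the sketch computes the full matrix for clarity.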

2.2 Normalized gradient fields (NGF)

We consider the NGF distance measure [16]

$$ {\mathcal{D}}_{{{\text{NGF}}}} : = \frac{1}{2}\int\limits_{{\Omega _{{\mathcal{R}}} }} 1 - \left( {\frac{{\langle \nabla{\mathcal{R}}({\mathbf{x}}),\nabla {\mathcal{T}}(y({\mathbf{x}}))\rangle _{{{\varrho },\tau }} }}{{|| \nabla {\mathcal{R}}({\mathbf{x}})|| _{{\varrho }} \, || \nabla{\mathcal{T}}(y({\mathbf{x}}))|| _{\tau } }}} \right)^{2} \;{\text{d}}{\mathbf{x}}, $$

where \(\langle a,b \rangle _{\alpha ,\beta }:=\sum _{i=1}^{3}a_ib_i+\alpha \beta ,\ a,b\in {\mathbb {R}}^{3}\), and \(\Vert a\Vert _\varepsilon :=\sqrt{\sum _{i=1}^3 a_i^2+\varepsilon ^2}\), with separate edge parameters \(\varrho \) and \(\tau \) for reference and template image, cf. [35]. Setting \({\mathcal {D}}_{\text {NGF}}(w) := {\mathcal {D}}_{\text {NGF}}({\mathcal {R}},{\mathcal {T}};y_w)\), affine-linear image registration with NGF translates to

$$\begin{aligned} \min _w \ {\mathcal {D}}_{\text {NGF}}(w). \end{aligned}$$
(37)

For numerical optimization, the continuous formulation in (37) is discretized. For a reference image of size \(n_1\;\times\;n_2\;\times\;n_3\) and an index \(i,\ i=1,\ldots ,n\), let \(i', j',k'\in {\mathbb {N}},1\le i'\le n_1,\ 1\le j'\le n_2,\ 1\le k'\le n_3\) such that \(i = i' + (j'-1)n_1 + (k'-1)n_1n_2\). The indices of neighboring points with Neumann zero boundary conditions are given by

$$\begin{aligned} i_{-x}&= \max (i'-1,1)+ (j'-1)n_1 + (k'-1)n_1n_2, \nonumber \\ i_{+x}&= \min (i'+1,n_1)+(j'-1)n_1+ (k'-1)n_1n_2, \nonumber \\ i_{-y}&= i'+(\max (j'-1,1)-1)n_1+ (k'-1)n_1n_2, \nonumber \\ i_{+y}&= i'+(\min (j'+1,n_2)-1)n_1+ (k'-1)n_1n_2, \nonumber \\ i_{-z}&= i'+(j'-1)n_1+ (\max (k'-1,1)-1)n_1n_2, \nonumber \\ i_{+z}&= i'+(j'-1)n_1+ (\min (k'+1,n_3)-1)n_1n_2. \end{aligned}$$
(38)
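The clamped neighbor lookup in (38) can be sketched as a small helper. For convenience the sketch uses 0-based indices, i.e. it assumes \(0\le i'<n_1\), \(0\le j'<n_2\), \(0\le k'<n_3\) and \(i = i' + j'n_1 + k'n_1n_2\) with i running from 0 to n-1. The function name `neighbor_indices` is hypothetical.

```python
def neighbor_indices(i, n1, n2, n3):
    """Return the linear indices (i_-x, i_+x, i_-y, i_+y, i_-z, i_+z) of point i.

    Assumption: 0-based indexing with i = i' + j'*n1 + k'*n1*n2.
    Out-of-range neighbors are clamped to the boundary point itself.
    """
    kp, rem = divmod(i, n1 * n2)     # slice index k'
    jp, ip = divmod(rem, n1)         # row index j' and column index i'

    def lin(a, b, c):
        return a + b * n1 + c * n1 * n2

    return (lin(max(ip - 1, 0), jp, kp), lin(min(ip + 1, n1 - 1), jp, kp),
            lin(ip, max(jp - 1, 0), kp), lin(ip, min(jp + 1, n2 - 1), kp),
            lin(ip, jp, max(kp - 1, 0)), lin(ip, jp, min(kp + 1, n3 - 1)))
```

At a corner point several of the returned indices coincide with i itself, so the corresponding central difference degenerates to a one-sided difference with the same denominator.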

We define functions

$$ g_{i} :{\mathbb{R}^{n}} \to {\mathbb{R}^{3}} , \quad T \mapsto \left( {\begin{array}{*{20}c} {\frac{1}{{2h_{1} }}( - T_{{i_{{ - x}} }} + T_{{i_{{ + x}} }} )} \\ {\frac{1}{{2h_{2} }}( - T_{{i_{{ - y}} }} + T_{{i_{{ + y}} }} )} \\ {\frac{1}{{2h_{3} }}( - T_{{i_{{ - z}} }} + T_{{i_{{ + z}} }} )} \\ \end{array} } \right) $$

and

$$\begin{aligned} s_i:&{\mathbb {R}^{3}} \rightarrow {\mathbb {R}}, \quad a \; \mapsto \; \langle g_i(R),a\rangle + \varrho \tau \end{aligned}$$

for gradient and scalar product type operations at the position i, respectively. Further setting

$$\begin{aligned} n_\varepsilon :&{\mathbb {R}^{3}} \rightarrow {\mathbb {R}}, \quad a \; \mapsto\; \sqrt{a_1^2 + a_2^2 + a_3^2 + \varepsilon ^{2}}, \end{aligned}$$

the discretized version of (37) is given by

$$\begin{aligned} \min _w \ D_{\mathrm {NGF}}(w) :=\frac{\bar{h}}{2} \displaystyle \sum _{i=1}^{n} 1 - \left( \frac{s_i(g_i(T_w))}{n_\varrho (g_i(R)) \ n_\tau (g_i(T_w))} \right) ^2 \end{aligned}$$

with \((T_w)_i = {\mathcal {T}}(y_w({{\mathbf {x}}}_i))\).
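A compact NumPy sketch of the discretized NGF measure is given below. It assumes that the deformed template \(T_w\) has already been sampled on the reference grid, that the spacings \(h_1,h_2,h_3\) coincide with the cell widths so that \(\bar{h}=h_1h_2h_3\), and it realizes the clamped neighbors by edge replication. The names `central_gradient` and `ngf_distance` are hypothetical.

```python
import numpy as np

def central_gradient(img, h):
    """Central differences g_i with clamped neighbors; result has shape (n1, n2, n3, 3)."""
    P = np.pad(np.asarray(img, dtype=float), 1, mode="edge")   # replicate boundary voxels
    gx = (P[2:, 1:-1, 1:-1] - P[:-2, 1:-1, 1:-1]) / (2 * h[0])
    gy = (P[1:-1, 2:, 1:-1] - P[1:-1, :-2, 1:-1]) / (2 * h[1])
    gz = (P[1:-1, 1:-1, 2:] - P[1:-1, 1:-1, :-2]) / (2 * h[2])
    return np.stack([gx, gy, gz], axis=-1)

def ngf_distance(R, Tw, h, rho, tau):
    """Discrete NGF distance between reference R and sampled template Tw."""
    gR, gT = central_gradient(R, h), central_gradient(Tw, h)
    s = np.sum(gR * gT, axis=-1) + rho * tau                   # s_i(g_i(T_w))
    nR = np.sqrt(np.sum(gR ** 2, axis=-1) + rho ** 2)          # n_rho(g_i(R))
    nT = np.sqrt(np.sum(gT ** 2, axis=-1) + tau ** 2)          # n_tau(g_i(T_w))
    r = s / (nR * nT)
    return 0.5 * np.prod(h) * np.sum(1.0 - r ** 2)
```

For identical images and equal edge parameters, every quotient equals one and the distance vanishes. By the Cauchy-Schwarz inequality each quotient has magnitude at most one, so the distance is nonnegative.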

2.2.1 Matrix-based differentiation

Let y and T be as in (26) and (27). We define the residual function \(r:{\mathbb {R}^{n}}\rightarrow {\mathbb {R}^{n}}\) by setting the i-th component function \(r_i:{\mathbb {R}^{n}}\rightarrow {\mathbb {R}}\) to

$$\begin{aligned} r_i:T \; \mapsto \; \frac{s_i(g_i(T))}{ n_\varrho ( g_i(R))\ n_\tau ( g_i(T))}. \end{aligned}$$

The reduction function \(\psi :{\mathbb {R}^{n}}\rightarrow {\mathbb {R}}\) is given by

$$\begin{aligned} \psi (r)&= \frac{\bar{h}}{2} \sum _{i=1}^{n} 1 - r_i^2, \end{aligned}$$

yielding the function chain

$$\begin{aligned} D_{\text {NGF}}: {{\mathbb {R}}^{12}\xrightarrow {y}}{{\mathbb {R}}^{3n}\xrightarrow {T}}{{\mathbb {R}}^{n}\xrightarrow {r}}{{\mathbb {R}}^{n}\xrightarrow {\psi }}{\mathbb {R}}. \end{aligned}$$

The derivatives of T and y have already been computed in (31) and (32). For the reduction function \(\psi \), it holds that

$$\begin{aligned} \frac{\partial \psi }{\partial r}&= -\bar{h} r^\top \in {\mathbb {R}^{1\;\times\;n}}. \end{aligned}$$
(39)

The calculation of \(\frac{\partial r}{\partial T}\) is performed by differentiating the component functions \(r_i,\; i=1,\ldots ,n\). The functions \(r_i\) are composed of \(s_i\), \(g_i\) and \(n_\varepsilon \) whose derivatives are given by

$$\begin{aligned} \frac{\partial s_i}{\partial a} = g_i(R)^\top \in {\mathbb {R}^{1\;\times\;3}}, \end{aligned}$$

and

$$\begin{aligned} \frac{\partial n_\varepsilon }{\partial a} = \frac{1}{n_\varepsilon (a)}a^\top \in {\mathbb {R}^{1\;\times\;3}} \end{aligned}$$

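and, since \(g_i\) is linear in T, by a constant matrix \(\frac{\partial g_i}{\partial T}\in {\mathbb {R}^{3\;\times\;n}}\) whose only nonzero entries \(\mp \frac{1}{2h_k},\ k=1,2,3\), are located at the neighbor positions from (38). Applying the chain rule in both numerator and denominator of \(r_i\) yields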

$$ \frac{{\partial r_{i} }}{{\partial T}} = \left( {\begin{array}{*{20}c} \vdots \\ {\frac{1}{{2h_{3} }}\left[ {\frac{{ - g_{i} (R)_{3} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))}} + \frac{{s_{i} (g_{i} (T))g_{i} (T)_{3} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))^{3} }}} \right]} \\ \vdots \\ {\frac{1}{{2h_{2} }}\left[ {\frac{{ - g_{i} (R)_{2} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))}} + \frac{{s_{i} (g_{i} (T))g_{i} (T)_{2} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))^{3} }}} \right]} \\ \vdots \\ {\frac{1}{{2h_{1} }}\left[ {\frac{{ - g_{i} (R)_{1} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))}} + \frac{{s_{i} (g_{i} (T))g_{i} (T)_{1} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))^{3} }}} \right]} \\ 0 \\ {\frac{1}{{2h_{1} }}\left[ {\frac{{g_{i} (R)_{1} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))}} - \frac{{s_{i} (g_{i} (T))g_{i} (T)_{1} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))^{3} }}} \right]} \\ \vdots \\ {\frac{1}{{2h_{2} }}\left[ {\frac{{g_{i} (R)_{2} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))}} - \frac{{s_{i} (g_{i} (T))g_{i} (T)_{2} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))^{3} }}} \right]} \\ \vdots \\ {\frac{1}{{2h_{3} }}\left[ {\frac{{g_{i} (R)_{3} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))}} - \frac{{s_{i} (g_{i} (T))g_{i} (T)_{3} }}{{n_{{\varrho }} (g_{i} (R))n_{\tau } (g_{i} (T))^{3} }}} \right]} \\ \vdots \\ \end{array} } \right)^{\top} $$

with the vector entries at positions \(i_{-z},i_{-y},i_{-x},i_{+x},i_{+y},\) and \(i_{+z}\) (in that order) as defined in (38). Note that these positions may coincide, in which case the values are added.
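The bracketed terms above are the components of the derivative of \(r_i\) with respect to \(g_i(T)\), scaled by \(\mp \frac{1}{2h_k}\). A minimal sketch with hypothetical helper names `r_and_dr_dg` and `scatter_to_neighbors` makes this structure explicit for a single grid point:

```python
import numpy as np

def r_and_dr_dg(gR, gT, rho, tau):
    """Return r_i and its derivative w.r.t. g_i(T) for 3-vectors gR = g_i(R), gT = g_i(T)."""
    s = gR @ gT + rho * tau                       # s_i(g_i(T))
    nR = np.sqrt(gR @ gR + rho ** 2)              # n_rho(g_i(R))
    nT = np.sqrt(gT @ gT + tau ** 2)              # n_tau(g_i(T))
    r = s / (nR * nT)
    dr = gR / (nR * nT) - s * gT / (nR * nT ** 3)
    return r, dr

def scatter_to_neighbors(dr, h):
    """Entries of dr_i/dT at the positions (i_-z, i_-y, i_-x, i_+x, i_+y, i_+z)."""
    c = dr / (2.0 * np.asarray(h, dtype=float))
    return np.array([-c[2], -c[1], -c[0], c[0], c[1], c[2]])
```

Only six coefficients per grid point are required, so \(\frac{\partial r}{\partial T}\) never has to be stored as a sparse matrix. If two of the positions coincide at the image boundary, the corresponding coefficients are summed.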

The Gauss–Newton approximation \(H_{\text{NGF}}\) to the Hessian is given by

$$\begin{aligned} H_{\text {NGF}}(w) := {\mathrm {d}} r^\top {\mathrm {d}}_2\psi {\mathrm {d}}r \approx \nabla ^2 D_{\text {NGF}}(w) \end{aligned}$$

with \({\mathrm {d}}r\) defined as in (33) and \({\mathrm {d}}_2\psi =-{\bar{h}}\).

2.2.2 Matrix-free derivative calculation

Setting \(r_i:=\frac{s_i(g_i(T))}{ n_\varrho ( g_i(R))\ n_\tau ( g_i(T))}\) and \({\mathrm {d}}r_i:=\frac{\partial r_i}{\partial T}\frac{\partial T}{\partial y}\frac{\partial y}{\partial w}\), it holds with (39) that

$$\begin{aligned} \nabla D_{\text {NGF}}(w) = -\bar{h} \sum _{i=1}^{n}r_i{\mathrm {d}}r_i. \end{aligned}$$
(40)

As \(r_i\in {\mathbb {R}}\) are scalars, it suffices to derive a matrix-free description of the vectors \({\mathrm {d}}r_i\in {\mathbb {R}}^{12}\) to achieve a fully matrix-free formulation of the objective function gradient. Let \(1\le i \le n\) and define indices \(i_{-z},i_{-y},i_{-x},i_{+x},i_{+y},i_{+z}\) as in (38). With the definition

$$ \begin{aligned} d_{i}^{{j,k}}& \,\,{:=}\,\,\,\, \partial r_{i} [i_{{ - z}} ]\partial _{j} {\mathcal{T}}({\mathbf{y}}_{{i_{{ - z}} }} ){\mathbf{X}}_{{i_{{ - z}} ,k}} \\ &\,\, + \,\partial r_{i} [i_{{ - y}} ]\partial _{j} {\mathcal{T}}({\mathbf{y}}_{{i_{{ - y}} }} ){\mathbf{X}}_{{i_{{ - y}} ,k}} \\ & \,\,+ \,\partial r_{i} [i_{{ - x}} ]\partial _{j} {\mathcal{T}}({\mathbf{y}}_{{i_{{ - x}} }} ){\mathbf{X}}_{{i_{{ - x}} ,k}} \\ & \,\, + \, \partial r_{i} [i_{{ + x}} ]\partial _{j} {\mathcal{T}}({\mathbf{y}}_{{i_{{ + x}} }} ){\mathbf{X}}_{{i_{{ + x}} ,k}} \; \\ & \,\, + \, \partial r_{i} [i_{{ + y}} ]\partial _{j} {\mathcal{T}}({\mathbf{y}}_{{i_{{ + y}} }} ){\mathbf{X}}_{{i_{{ + y}} ,k}} \; \\ & \,\, + \, \partial r_{i} [i_{{ + z}} ]\partial _{j} {\mathcal{T}}({\mathbf{y}}_{{i_{{ + z}} }} ){\mathbf{X}}_{{i_{{ + z}} ,k}} \\ \end{aligned} $$

for \(i=1,\ldots ,n\), \(j=1,2,3\), \(k=1,\dots ,4\) and

$$ d_{i}^{j} : = \left( {\begin{array}{*{20}c} {d_{i}^{{j,1}} ,d_{i}^{{j,2}} ,d_{i}^{{j,3}} ,d_{i}^{{j,4}} } \\ \end{array} } \right), $$

it follows that

$$ {\text{d}}r_{i} = \left( {\begin{array}{*{20}c} {d_{i}^{1} } & {d_{i}^{2} } & {d_{i}^{3} } \\ \end{array} } \right)^{{ \top }} \in {\mathbb{R}^{{12}}} , $$
(41)

which according to (40) yields

$$ \nabla D_{{{\text{NGF}}}} (w) = - \bar{h}\sum\limits_{{i = 1}}^{n} {\frac{{s_{i} (g_{i} (T))}}{{n_{{\varrho }} (g_{i} (R))\;n_{\tau } (g_{i} (T))}}} \left( {\begin{array}{*{20}c} {{\text{d}}r_{i} [1]} \\ {{\text{d}}r_{i} [2]} \\ \vdots \\ {{\text{d}}r_{i} [12]} \\ \end{array} } \right)^{{ \top }} , $$
(42)

completing the gradient calculation for the three-dimensional case. Since

$$ H_{{{\text{NGF}}}} (w) = \left( {\frac{{\partial r}}{{\partial T}}\frac{{\partial T}}{{\partial y}}\frac{{\partial y}}{{\partial w}}} \right)^{{ \top }} {\text{d}}_{2} \psi \left( {\frac{{\partial r}}{{\partial T}}\frac{{\partial T}}{{\partial y}}\frac{{\partial y}}{{\partial w}}} \right) = \left( {\begin{array}{*{20}c} {{\text{d}}r_{1}^{{ \top }} } & \ldots & {{\text{d}}r_{n}^{{ \top }} } \\ \end{array} } \right){\text{d}}_{2} \psi \left( {\begin{array}{*{20}c} {{\text{d}}r_{1} } \\ \vdots \\ {{\text{d}}r_{n} } \\ \end{array} } \right), $$

the calculation of the Hessian approximation can directly be performed with the help of the matrix-free formulation of \({\mathrm {d}}r_i\) from (41). By defining the matrices \(l_k\in {\mathbb {R}}^{12\;\times\;12}\) as

$$ l_{k} : = \left( {\begin{array}{*{20}c} {} \\ {{\text{d}}r_{k} [i]\cdot{\text{d}}r_{k} [j]} \\ {} \\ \end{array} } \right)_{{1 \le i,j \le 12}} $$
(43)

analogous to the case of SSD, the matrix-free formulation for the Gauss–Newton approximation to the Hessian is given by

$$\begin{aligned} H_{\mathrm {NGF}}(w) = \bar{h} \displaystyle \sum _{k=1}^{n} l_k. \end{aligned}$$

This finalizes the derivation of matrix-free calculation rules for objective function gradient and Gauss–Newton approximation to the Hessian also for the Normalized Gradient Fields distance measure with three-dimensional images.

Cite this article

Rühaak, J., König, L., Tramnitzke, F. et al. A matrix-free approach to efficient affine-linear image registration on CPU and GPU. J Real-Time Image Proc 13, 205–225 (2017). https://doi.org/10.1007/s11554-016-0564-4

  • DOI: https://doi.org/10.1007/s11554-016-0564-4
