Infrared and Visible Image Fusion for Highlighting Salient Targets in the Night Scene
Abstract
1. Introduction
- To reduce the computational burden caused by stacking convolutional layers, a global attention module is added to the image fusion network. This module rescales the weights of different channels after capturing global contextual information; assigning higher weights to the important channels makes the salient targets more prominent (a minimal sketch of such a block is given after this list);
- To make the loss function more comprehensive, it is divided into a foreground loss and a background loss, forcing the fused image to contain more prominent salient targets and richer texture details (see the loss sketch after this list);
- To address the problem of uneven luminance in night scenes, a luminance estimation function is introduced into the training process. It outputs the trade-off control parameters of the foreground loss based on the nighttime luminance, which helps retain the foreground information from the source images more reasonably (see the luminance sketch after this list).
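The following PyTorch sketch illustrates one way a global attention module of this kind could be realized, assuming a squeeze-and-excitation-style design (global average pooling to capture global context, followed by a small bottleneck gating network). The class name, reduction ratio, and layer choices are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class GlobalAttention(nn.Module):
    """Channel re-weighting from global context (SE-style gating; illustrative only).

    Global average pooling summarizes each channel over the whole image,
    a bottleneck gating network produces per-channel weights in (0, 1),
    and the input features are rescaled by those weights.
    """
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # (B, C, H, W) -> (B, C, 1, 1)
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),                              # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gate(self.pool(x))                    # global context -> channel weights
        return x * w                                   # rescale the channels


if __name__ == "__main__":
    feats = torch.randn(2, 64, 120, 160)               # dummy deep features
    print(GlobalAttention(64)(feats).shape)            # torch.Size([2, 64, 120, 160])
```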
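A minimal sketch of a foreground/background loss of this kind is given below, assuming a binary salient-target mask (e.g., derived from the infrared image) and simple L1 intensity and Sobel-gradient terms. The function names, the mask source, and the exact terms are assumptions rather than the paper's definition.

```python
import torch
import torch.nn.functional as F

def sobel_gradient(img: torch.Tensor) -> torch.Tensor:
    """Gradient magnitude (|Gx| + |Gy|) of a single-channel image batch."""
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx.to(img), padding=1)
    gy = F.conv2d(img, ky.to(img), padding=1)
    return gx.abs() + gy.abs()

def fusion_loss(fused, ir, vis, mask, alpha: float, beta: float = 1.0):
    """Foreground + background loss for the fused image (illustrative form).

    mask  : binary salient-target mask (1 = foreground), assumed to be given.
    alpha : trade-off between infrared and visible foreground terms
            (a scalar here; e.g., the output of a luminance estimator).
    """
    fg, bg = mask, 1.0 - mask
    # Foreground: keep the salient targets, balancing infrared and visible content.
    loss_fg = (alpha * F.l1_loss(fused * fg, ir * fg)
               + (1.0 - alpha) * F.l1_loss(fused * fg, vis * fg))
    # Background: keep visible intensity and texture (gradient) details.
    loss_bg = (F.l1_loss(fused * bg, vis * bg)
               + F.l1_loss(sobel_gradient(fused) * bg, sobel_gradient(vis) * bg))
    return loss_fg + beta * loss_bg
```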
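A toy sketch of a luminance estimation function follows, under the assumption that it maps the mean luminance of the visible night image to the foreground trade-off parameter, giving the infrared foreground more weight in darker scenes. The reference levels and the linear mapping are illustrative only.

```python
import torch

def luminance_tradeoff(vis: torch.Tensor, low: float = 0.2, high: float = 0.6) -> float:
    """Map mean visible-image luminance to a foreground trade-off weight.

    Illustrative only: dark scenes (low mean luminance) give a weight close to 1,
    so the infrared foreground dominates the foreground loss; bright scenes give
    a weight close to 0. `low` and `high` are assumed reference luminance levels.
    """
    mean_lum = float(vis.mean())                        # batch-mean luminance, assumed in [0, 1]
    t = min(max((mean_lum - low) / (high - low), 0.0), 1.0)
    return 1.0 - t                                      # use as `alpha` in fusion_loss
```

In training, the two pieces would be combined as, e.g., `loss = fusion_loss(fused, ir, vis, mask, alpha=luminance_tradeoff(vis))`.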
2. Related Work
2.1. Infrared and Visible Image Fusion
2.2. Attention Mechanism
3. The Proposed Method
3.1. Problem Formulation
3.2. Loss Function
3.3. Network Structure
3.3.1. Feature Extraction Module
3.3.2. Global Attention Module
3.3.3. Image Reconstruction Module
4. Experiments
4.1. Experimental Details
4.1.1. Training Dataset
4.1.2. Training Details
4.2. Results of the MSRS Dataset
4.2.1. Qualitative Results
4.2.2. Quantitative Results
4.3. Results of the M3FD Dataset
4.3.1. Qualitative Results
4.3.2. Quantitative Results
4.4. Ablation Experiments
4.4.1. Global Attention Module
4.4.2. Feature Extraction Module
4.4.3. Trade-Off Control Parameters
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Zhu, D.; Zhan, W.; Jiang, Y. MIFFuse: A multi-level feature fusion network for infrared and visible images. IEEE Access 2021, 9, 130778–130792.
- Zhu, D.; Zhan, W.; Fu, J. RI-MFM: A Novel Infrared and Visible Image Registration with Rotation Invariance and Multilevel Feature Matching. Electronics 2022, 11, 2866.
- Qiu, S.; Luo, J.; Yang, S. A moving target extraction algorithm based on the fusion of infrared and visible images. Infrared Phys. Technol. 2019, 98, 285–291.
- Zhang, A.; Min, Z.; Zhang, Z. Generalized Point Set Registration with Fuzzy Correspondences Based on Variational Bayesian Inference. IEEE Trans. Fuzzy Syst. 2022, 30, 1529–1540.
- Yang, Z.; Zeng, S. TPFusion: Texture preserving fusion of infrared and visible images via dense networks. Entropy 2022, 24, 294.
- Zhu, D.; Zhan, W.; Jiang, Y. IPLF: A Novel Image Pair Learning Fusion Network for Infrared and Visible Image. IEEE Sens. J. 2022, 22, 8808–8817.
- Xu, H.; Ma, J.; Jiang, J. U2Fusion: A unified unsupervised image fusion network. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 502–518.
- Zhang, Y.; Liu, Y.; Sun, P. IFCNN: A general image fusion framework based on convolutional neural network. Inf. Fusion 2020, 54, 99–118.
- Long, Y.; Jia, H.; Zhong, Y. RXDNFuse: A aggregated residual dense network for infrared and visible image fusion. Inf. Fusion 2021, 69, 128–141.
- Ma, J.; Zhang, H.; Shao, Z. GANMcC: A generative adversarial network with multiclassification constraints for infrared and visible image fusion. IEEE Trans. Instrum. Meas. 2020, 70, 5005014.
- Ma, J.; Tang, L.; Xu, M. STDFusionNet: An infrared and visible image fusion network based on salient target detection. IEEE Trans. Instrum. Meas. 2021, 70, 5009513.
- Xu, D.; Wang, Y.; Xu, S. Infrared and visible image fusion with a generative adversarial network and a residual network. Appl. Sci. 2020, 10, 554.
- Xu, D.; Wang, Y.; Zhang, X. Infrared and visible image fusion using a deep unsupervised framework with perceptual loss. IEEE Access 2020, 8, 206445–206458.
- Tang, L.; Yuan, J.; Zhang, H. PIAFusion: A progressive infrared and visible image fusion network based on illumination aware. Inf. Fusion 2022, 83, 79–92.
- Ma, J.; Yu, W.; Liang, P. FusionGAN: A generative adversarial network for infrared and visible image fusion. Inf. Fusion 2019, 48, 11–26.
- Ma, J.; Xu, H.; Jiang, J. DDcGAN: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion. IEEE Trans. Image Process. 2020, 29, 4980–4995.
- Hou, J.; Zhang, D.; Wu, W. A generative adversarial network for infrared and visible image fusion based on semantic segmentation. Entropy 2021, 23, 376.
- Li, H.; Wu, X.J.; Durrani, T. NestFuse: An infrared and visible image fusion architecture based on nest connection and spatial/channel attention models. IEEE Trans. Instrum. Meas. 2020, 69, 9645–9656.
- Li, H.; Wu, X.J.; Kittler, J. RFN-Nest: An end-to-end residual fusion network for infrared and visible images. Inf. Fusion 2021, 73, 72–86.
- Guo, M.H.; Xu, T.X.; Liu, J.J. Attention mechanisms in computer vision: A survey. Comput. Vis. Media 2022, 8, 331–368.
- Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62.
- Wang, L.; Yao, W.; Chen, C. Driving behavior recognition algorithm combining attention mechanism and lightweight network. Entropy 2022, 24, 984.
- Hui, Y.; Wang, J.; Shi, Y. Low Light Image Enhancement Algorithm Based on Detail Prediction and Attention Mechanism. Entropy 2022, 24, 815.
- Li, H.; Qiu, K.; Chen, L. SCAttNet: Semantic segmentation network with spatial and channel attention mechanism for high-resolution remote sensing images. IEEE Geosci. Remote Sens. Lett. 2020, 18, 905–909.
- Tao, H.; Geng, L.; Shan, S. Multi-Stream Convolution-Recurrent Neural Networks Based on Attention Mechanism Fusion for Speech Emotion Recognition. Entropy 2022, 24, 1025.
- Li, J.; Huo, H.; Li, C. AttentionFGAN: Infrared and visible image fusion using attention-based generative adversarial networks. IEEE Trans. Multimed. 2020, 23, 1383–1396.
- Li, J.; Li, B.; Jiang, Y. MSAt-GAN: A generative adversarial network based on multi-scale and deep attention mechanism for infrared and visible light image fusion. Complex Intell. Syst. 2022, 8, 4753–4781.
- Wang, X.; Girshick, R.; Gupta, A. Non-local neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7794–7803.
- Mishra, S.; Liang, P.; Czajka, A. CC-NET: Image complexity guided network compression for biomedical image segmentation. In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; pp. 57–60.
- Zhu, Z.; Xu, M.; Bai, S. Asymmetric non-local neural networks for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 593–602.
- Piao, J.; Chen, Y.; Shin, H. A new deep learning based multi-spectral image fusion method. Entropy 2019, 21, 570.
- He, K.; Gkioxari, G.; Dollár, P. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
- Ma, J.; Chen, C.; Li, C.; Huang, J. Infrared and visible image fusion via gradient transfer and total variation minimization. Inf. Fusion 2016, 31, 100–109.
- Li, H.; Wu, X.J. DenseFuse: A fusion approach to infrared and visible images. IEEE Trans. Image Process. 2018, 28, 2614–2623.
- Roberts, J.W.; Van Aardt, J.A.; Ahmed, F.B. Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J. Appl. Remote Sens. 2008, 2, 023522.
- Qu, G.; Zhang, D.; Yan, P. Information measure for performance of image fusion. Electron. Lett. 2002, 38, 313–315.
- Rao, Y.J. In-fibre Bragg grating sensors. Meas. Sci. Technol. 1997, 8, 355.
- Eskicioglu, A.M.; Fisher, P.S. Image quality measures and their performance. IEEE Trans. Commun. 1995, 43, 2959–2965.
- Han, Y.; Cai, Y.; Cao, Y.; Xu, X. A new image fusion performance metric based on visual information fidelity. Inf. Fusion 2013, 14, 127–135.
- Xydeas, C.S.; Petrovic, V. Objective image fusion performance measure. Electron. Lett. 2000, 36, 308–309.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Wang, Q.; Wu, B.; Zhu, P. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020.
Module | Layer | Input Channel | Output Channel | Activation
---|---|---|---|---
Shallow Feature Extraction Module | 1 | 1 | 16 | None
Shallow Feature Extraction Module | 2 | 16 | 16 | Leaky ReLU
Deep Feature Extraction Module | 1 | 16 | 32 | Leaky ReLU
Deep Feature Extraction Module | 2 | 32 | 64 | Leaky ReLU

Module | Layer | Input Channel | Output Channel | Activation
---|---|---|---|---
Image Reconstruction Module | 1 | 224 | 128 | Leaky ReLU
Image Reconstruction Module | 2 | 128 | 64 | Leaky ReLU
Image Reconstruction Module | 3 | 64 | 32 | Leaky ReLU
Image Reconstruction Module | 4 | 32 | 1 | Leaky ReLU
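Taken together, the two tables above fix the channel progression of the network. The sketch below wires them up in PyTorch, assuming 3 × 3 convolutions and that the 224 reconstruction-input channels come from concatenating the multi-level features of both source images (16 + 32 + 64 = 112 per image); the kernel size and the concatenation scheme are assumptions, not details stated in the tables.

```python
import torch
import torch.nn as nn

def conv(in_ch: int, out_ch: int, act: bool = True) -> nn.Sequential:
    """3x3 convolution (kernel size assumed) with optional Leaky ReLU."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)]
    if act:
        layers.append(nn.LeakyReLU(inplace=True))
    return nn.Sequential(*layers)

class FeatureExtractor(nn.Module):
    """Shallow (1->16->16) and deep (16->32->64) feature extraction per the table."""
    def __init__(self):
        super().__init__()
        self.shallow = nn.Sequential(conv(1, 16, act=False), conv(16, 16))
        self.deep1 = conv(16, 32)
        self.deep2 = conv(32, 64)

    def forward(self, x):
        s = self.shallow(x)
        d1 = self.deep1(s)
        d2 = self.deep2(d1)
        # Concatenate multi-level features: 16 + 32 + 64 = 112 channels per image.
        return torch.cat([s, d1, d2], dim=1)

class Reconstructor(nn.Module):
    """Image reconstruction: 224 -> 128 -> 64 -> 32 -> 1, Leaky ReLU throughout."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(conv(224, 128), conv(128, 64), conv(64, 32), conv(32, 1))

    def forward(self, feats):
        return self.net(feats)

if __name__ == "__main__":
    ir, vis = torch.randn(1, 1, 240, 320), torch.randn(1, 1, 240, 320)
    extract = FeatureExtractor()
    fused_feats = torch.cat([extract(ir), extract(vis)], dim=1)   # 112 * 2 = 224 channels
    print(Reconstructor()(fused_feats).shape)                     # torch.Size([1, 1, 240, 320])
```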
Metric | GTF | DenseFuse | U2Fusion | GANMcC | RFN-Nest | STDFusion | Ours |
---|---|---|---|---|---|---|---|
EN | 5.4836 | 5.4013 | 4.2694 | 4.3417 | 5.7275 | 5.6072 | 6.0108 |
MI | 1.4152 | 2.3624 | 1.7075 | 1.7122 | 1.8086 | 3.5956 | 4.2372 |
SD | 6.0832 | 6.1874 | 4.7325 | 4.7983 | 6.2975 | 6.7386 | 6.9495 |
SF | 0.0231 | 0.0182 | 0.0239 | 0.0248 | 0.0334 | 0.0249 | 0.0350 |
VIF | 0.5170 | 0.6963 | 0.4536 | 0.4887 | 0.9274 | 0.6236 | 0.8951 |
Qab/f | 0.3361 | 0.3546 | 0.3220 | 0.5644 | 0.5874 | 0.4576 | 0.6250 |
Metric | GTF | DenseFuse | U2Fusion | GANMcC | RFN-Nest | STDFusion | Ours |
---|---|---|---|---|---|---|---|
EN | 6.5086 | 6.6316 | 7.0221 | 6.9833 | 6.7907 | 6.7776 | 7.1993 |
MI | 3.0838 | 3.0782 | 3.0634 | 2.9526 | 2.6379 | 3.9293 | 4.4602 |
SD | 9.9339 | 8.9730 | 9.3571 | 8.9571 | 8.9489 | 8.5261 | 9.0074 |
SF | 0.0340 | 0.0204 | 0.0336 | 0.0362 | 0.0390 | 0.0298 | 0.0295 |
VIF | 0.8978 | 0.8294 | 1.0279 | 1.0200 | 1.0047 | 1.0537 | 1.1064 |
Qab/f | 0.5001 | 0.3739 | 0.5174 | 0.5095 | 0.5048 | 0.5132 | 0.5209 |
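For reference, EN, SD, and SF in the tables above are simple statistics of the fused image. A minimal NumPy sketch of how they are commonly computed is given below; the exact normalization (e.g., whether intensities lie in [0, 1] or [0, 255]) may differ from the paper's implementation, so absolute values need not match the tables.

```python
import numpy as np

def entropy(img: np.ndarray) -> float:
    """EN: Shannon entropy of the intensity histogram of an 8-bit grayscale image."""
    hist, _ = np.histogram(img, bins=256, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def standard_deviation(img: np.ndarray) -> float:
    """SD: standard deviation of the image intensities."""
    return float(np.std(img.astype(np.float64)))

def spatial_frequency(img: np.ndarray) -> float:
    """SF: combines row-wise and column-wise intensity differences."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))   # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))   # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))
```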
Metric | Without the 1st Attention Module | Without the 2nd Attention Module | Without Both Attention Modules | Ours |
---|---|---|---|---|
EN | 6.7450 | 6.7584 | 6.6899 | 6.7844 |
MI | 4.4664 | 4.5162 | 4.4048 | 4.7104 |
SD | 8.9386 | 8.9254 | 8.8987 | 8.9747 |
SF | 0.0320 | 0.0271 | 0.0220 | 0.0294 |
VIF | 0.9936 | 0.9921 | 0.9836 | 1.0026 |
Qab/f | 0.5010 | 0.4927 | 0.4793 | 0.5199 |
Metric | Replacing Attention Module with SENet | Replacing Attention Module with ECANet | Ours |
---|---|---|---|
EN | 6.5809 | 6.6445 | 6.7844 |
MI | 4.6827 | 4.6111 | 4.7104 |
SD | 8.9298 | 8.9390 | 8.9747 |
SF | 0.0277 | 0.0283 | 0.0294 |
VIF | 0.9867 | 0.9370 | 1.0026 |
Qab/f | 0.5170 | 0.5007 | 0.5199 |
Metric | 3-Conv-Block | 4-Conv-Block | Ours (2-Conv-Block) |
---|---|---|---|
EN | 6.7299 | 6.3615 | 6.7844 |
MI | 4.6058 | 3.0175 | 4.7104 |
SD | 8.9157 | 8.7353 | 8.9747 |
SF | 0.0293 | 0.0296 | 0.0294 |
VIF | 0.9843 | 0.8739 | 1.0026 |
Qab/f | 0.5310 | 0.4540 | 0.5199 |
Metric |  |  |  |  |  |  |  |  | Ours
---|---|---|---|---|---|---|---|---|---
EN | 6.5890 | 6.6176 | 6.6162 | 6.6245 | 6.6959 | 6.6796 | 6.7081 | 6.6526 | 6.7844 |
MI | 4.2757 | 4.4161 | 4.4141 | 4.4687 | 4.5292 | 4.5340 | 3.9771 | 3.6744 | 4.7104 |
SD | 8.6407 | 8.6807 | 8.6769 | 8.6910 | 8.7493 | 8.5791 | 8.7539 | 8.7759 | 8.9747 |
SF | 0.0289 | 0.0289 | 0.0293 | 0.0269 | 0.0271 | 0.0257 | 0.0218 | 0.0179 | 0.0294 |
VIF | 0.9412 | 0.9467 | 0.9455 | 0.9462 | 0.9625 | 0.9423 | 0.9358 | 0.9341 | 1.0026 |
Qab/f | 0.5290 | 0.5287 | 0.5240 | 0.5082 | 0.4921 | 0.4509 | 0.4201 | 0.3629 | 0.5199 |
Metric |  |  |  |  |  |  |  |  |  | Ours
---|---|---|---|---|---|---|---|---|---|---
EN | 6.2031 | 6.3875 | 6.5618 | 6.6762 | 6.7122 | 6.7530 | 6.7006 | 6.7238 | 6.7444 | 6.7844 |
MI | 3.1523 | 3.5214 | 3.8086 | 4.7037 | 4.5235 | 4.6019 | 4.6483 | 4.4814 | 4.3923 | 4.7104 |
SD | 8.3903 | 8.8980 | 8.8988 | 8.9110 | 8.9037 | 8.9236 | 8.9331 | 8.9339 | 8.9419 | 8.9747 |
SF | 0.0296 | 0.0283 | 0.0292 | 0.0298 | 0.0286 | 0.0288 | 0.0284 | 0.0285 | 0.0263 | 0.0294 |
VIF | 0.8754 | 0.8825 | 0.9089 | 0.9730 | 0.9754 | 0.9915 | 0.9735 | 0.9819 | 0.9807 | 1.0026 |
Qab/f | 0.5133 | 0.5124 | 0.5275 | 0.5350 | 0.5308 | 0.5261 | 0.5187 | 0.5154 | 0.4890 | 0.5199 |
Metric |  |  |  |  |  |  |  |  | Ours
---|---|---|---|---|---|---|---|---|---
EN | 6.7816 | 6.7521 | 6.7838 | 6.7537 | 6.7061 | 6.7635 | 6.7244 | 6.7521 | 6.7844 |
MI | 4.8015 | 4.5154 | 4.8101 | 4.6564 | 4.5619 | 4.5265 | 4.5265 | 4.7309 | 4.7104 |
SD | 8.9698 | 8.9504 | 8.9545 | 8.9312 | 8.9061 | 8.9546 | 8.9399 | 8.9504 | 8.9747 |
SF | 0.0282 | 0.0290 | 0.0283 | 0.0288 | 0.0291 | 0.0280 | 0.0288 | 0.0290 | 0.0294 |
VIF | 1.0003 | 0.9940 | 0.9946 | 1.0012 | 0.9788 | 0.9944 | 0.9861 | 0.9940 | 1.0026 |
Qab/f | 0.5170 | 0.5248 | 0.5235 | 0.5138 | 0.5264 | 0.5111 | 0.5250 | 0.5248 | 0.5199 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).