Abstract
Lung diseases affect the lives of billions of people worldwide, and 4 million people die prematurely from them each year. These pathologies are characterized by specific imaging findings in CT scans. Traditional Computer-Aided Diagnosis (CAD) approaches have shown promising results in helping clinicians; however, CAD systems normally analyze only a small part of the medical image, excluding information that may be relevant for clinical evaluation. The Multiple Instance Learning (MIL) approach takes into consideration different small pieces of the image that are relevant for the final classification and creates a comprehensive analysis of pathophysiological changes. This study uses MIL-based approaches to identify the presence of lung pathophysiological findings in CT scans for the characterization of lung disease development. The work focused on the detection of Fibrosis, Emphysema, Satellite Nodules in Primary Lesion Lobe, Nodules in Contralateral Lung, and Ground Glass. Fibrosis and Emphysema gave the best results, reaching an Area Under the Curve (AUC) of 0.89 and 0.72, respectively. Additionally, the MIL-based approach was used to predict the mutation status of EGFR, the most relevant oncogene in lung cancer, with an AUC of 0.69. The results showed that this comprehensive approach can be a useful tool for lung pathophysiological characterization.
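To make the MIL idea concrete, the following is a minimal illustrative sketch, not the models used in the study. It assumes the standard MIL setting in which a CT scan is a "bag" of patch-level "instances" and only the bag carries a label, and it assumes max pooling as the aggregation rule. The instance scores, the names `bag_score` and `auc`, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def bag_score(instance_scores: Sequence[float]) -> float:
    """Max pooling (an assumed aggregation rule): a bag is scored by its
    most suspicious instance, so one positive patch can flag the scan."""
    return max(instance_scores)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area Under the ROC Curve, computed as the probability that a random
    positive bag is scored above a random negative bag (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative bag")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: each inner list holds made-up patch scores for
# one scan; bag_labels marks whether the finding is present in that scan.
bags: List[List[float]] = [
    [0.1, 0.2, 0.9],  # finding present
    [0.2, 0.1, 0.3],  # finding absent
    [0.4, 0.6, 0.2],  # finding present
    [0.1, 0.7, 0.1],  # finding absent
]
bag_labels = [1, 0, 1, 0]

scores = [bag_score(b) for b in bags]
print("bag scores:", scores)
print("AUC:", auc(bag_labels, scores))
```

In this toy example three of the four positive/negative bag pairs are ranked correctly, which is what an AUC of 0.75 expresses; the AUC values reported in the abstract can be read the same way.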









Acknowledgements
We acknowledge The Cancer Imaging Archive (TCIA) for the publicly available, open-access NSCLC-Radiogenomics dataset.
We acknowledge the University Hospitals of Geneva (HUG) for access to the ILD DATABASE, a multimedia database of interstitial lung diseases.
Funding
This work is financed by the ERDF - European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme and by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project POCI-01-0145-FEDER-030263 and a PhD Grant Number: 2021.05767.BD.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Julieta Frade and Tania Pereira contributed equally to this work
About this article
Cite this article
Frade, J., Pereira, T., Morgado, J. et al. Multiple instance learning for lung pathophysiological findings detection using CT scans. Med Biol Eng Comput 60, 1569–1584 (2022). https://doi.org/10.1007/s11517-022-02526-y
DOI: https://doi.org/10.1007/s11517-022-02526-y