1 Introduction

Neoadjuvant chemotherapy (NAC) is the standard treatment for locally advanced breast cancer [1]. Favorable response to NAC is best measured by pathological complete response (pCR), in which the patient has no residual invasive disease in the breast or lymph nodes following treatment [2]. Only 10–50% of patients receiving NAC will achieve pCR, meaning that non-responsive patients endure the side effects and diminished quality of life of chemotherapy without its benefit [3]. However, a growing body of work suggests that computational medical image analysis can enable the prediction of response to NAC from dynamic contrast-enhanced (DCE) MRI. Radiomics, the high-throughput extraction and analysis of algorithmically defined features from radiological images, has been shown to predict response on pre-treatment DCE-MRI through attributes such as image texture [4]. Deep learning, in which a convolutional neural network is trained to identify novel predictive image patterns, has also been successfully applied to response prediction prior to treatment [5].

An emerging trend is that prediction of response to NAC is not only a matter of selecting the right computational tools, but also of where they are deployed. For instance, previous work has shown that supplementing radiomic features extracted from the tumor itself with texture features computed within the peri-tumoral region, the tumor's surrounding environment, enables improved prediction of pCR from pre-treatment DCE-MRI [4] and the identification of genotypes associated with response to targeted therapy [6]. A related study used deep learning on the tumor and peritumoral region of PET images to predict treatment response in esophageal cancer [7].

While several studies have investigated the fusion of radiomics and deep learning, a large portion employ naive ensembling strategies [8, 9] that weight each model equally, without consideration of their relative strengths and weaknesses. However, it is not clear that each type of representation is equally well suited to all parts of the region of interest (ROI). For instance, in the case of a lung nodule on CT, while CNNs might capture unique heterogeneity patterns pertaining to ground glass opacity within the nodule, radiomic edge operators might accentuate unique attributes relating to margin spiculation. In other words, spatially invoking specific representations might be a better mechanism for feature fusion than combining multiple representations within the same ROI.

In this work, we present Response Estimation through Spatially Oriented Neural Network and Texture Ensemble (RESONATE): a novel approach for fusing radiomic and deep learning data streams by invoking each in the spatial regions where it is most discriminative.

2 Previous Works and Novel Contributions

Despite the individual promise of radiomics and deep learning approaches, relatively few studies have explored how they can be combined. Antropova et al. [8] and Paul et al. [9] averaged the outputs and predictions of deep learning- and radiomics-based classifiers for diagnostic problems relating to breast and lung cancer, respectively. Others have explored more complex fusion strategies that attempt to account for differences in model performance by training a model across ensembled classifiers and data streams [10, 11]. A common theme of these approaches is that they leverage CNNs and radiomics to provide different types of representations from within the same region to boost classifier performance.

The approach presented in this work, RESONATE, differs from these previous approaches by using a spatial preference to invoke different types of representations, radiomics and deep learning, within different tumor subcompartments. The approach differentially weights representations based on their relative regional strengths within a fused regression model.

3 Methodology

Fig. 1. RESONATE workflow – Image segmentation (left) separates the patient's tumor into intratumoral (red) and peritumoral (blue) regions. Deep learning (bottom) and radiomic (top) classification is then performed separately within each region, and the predictions of each classifier are fused with a logistic regression classifier (right) to create a final prediction. (Color figure online)

3.1 Spatial Localization of Tumor Habitat

We define an image scene \(\mathcal {I}\) as a 3-dimensional spatial grid of voxels, corresponding to one phase of a DCE-MRI volume. Let \(\mathcal {T}\) represent a sub-volume of \(\mathcal {I}\) corresponding to a segmentation of a tumor. From \(\mathcal {T}\) we further define a peri-tumoral volume, \(\mathcal {P}\), which corresponds to the region surrounding the tumor, originating at the tumor border, and extending out to all voxels within some user-specified maximum physical distance.

3.2 Spatial-Specific Model Representation

The spatial regions, \(\mathcal {T}\) and \(\mathcal {P}\), were analyzed separately using deep learning and radiomic classification methods. For each \(\mathcal {I}\), let y denote an accompanying binary treatment outcome, where a value of 1 indicates successful response.

Deep Learning. Two models were created, one for each region \(\mathcal {S} \in \left\{ \mathcal {T}, \mathcal {P}\right\} \), using the following procedure. We define \(\mathcal {I}_\mathcal {S}\) as a volumetric box, a sub-volume of \(\mathcal {I}\) large enough to contain all points of \(\mathcal {S}\). To isolate the ROI, the intensity values of all voxels contained within \(\mathcal {I}_\mathcal {S}\) but not in \(\mathcal {S}\) are set equal to the mean intensity within \(\mathcal {S}\), preventing deep learning models from relying too heavily on annotation boundaries. A convolutional neural network, \(\mathcal {D}_\mathcal {S}\), is then trained using \(\mathcal {I}_\mathcal {S}\) from a number of different image volumes to predict a probability of response \(p_\mathcal {S}^\mathcal {D}\) as close as possible to y, as measured by some loss function \(\textit{L}(y,p_\mathcal {S}^\mathcal {D})\).
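A minimal sketch of this isolation step, assuming NumPy arrays for the bounding box \(\mathcal {I}_\mathcal {S}\) and the region mask (names are hypothetical):

```python
import numpy as np

def isolate_roi(box, roi_mask):
    """Fill voxels outside the ROI with the mean intensity inside it,
    so the network cannot key on the sharp annotation boundary."""
    out = box.copy()
    out[~roi_mask] = box[roi_mask].mean()
    return out
```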

Radiomics. Radiomic models were created for each region \(\mathcal {S} \in \left\{ \mathcal {T}, \mathcal {P}\right\} \). For every voxel within \(\mathcal {S}\), a set of unique radiomic descriptors is computed. The distribution of each radiomic descriptor across all voxels of \(\mathcal {S}\) is statistically summarized into a feature vector, and a feature selection algorithm is applied to this vector to create an optimal reduced set of features. A classifier \(\mathcal {R}_\mathcal {S}\) accepts this reduced feature set and outputs a response prediction \(p_\mathcal {S}^\mathcal {R}\).
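Using, for example, the five first-order statistics listed in Sect. 4.2, the summarization step might look as follows (a sketch under assumed array layouts; names are hypothetical):

```python
import numpy as np
from scipy import stats

def summarize_feature_maps(feature_maps, roi_mask):
    """Collapse voxelwise radiomic maps (one 3-D map per descriptor)
    into first-order statistics over the ROI."""
    vec = []
    for fmap in feature_maps:
        vals = fmap[roi_mask]
        vec += [vals.mean(), np.median(vals), vals.std(),
                stats.skew(vals), stats.kurtosis(vals)]
    return np.asarray(vec)
```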

3.3 Model Fusion

A logistic regression classifier, \(\mathcal {L}\), was designed to fuse the predictions of the classifiers \(\mathcal {C}_{\mathcal {S}}\), where \(\mathcal {C} \in \left\{ \mathcal {D}, \mathcal {R}\right\} \) and \(\mathcal {S} \in \left\{ \mathcal {T}, \mathcal {P}\right\} \). Each classifier \(\mathcal {C}_{\mathcal {S}}\) is first trained independently; \(\mathcal {L}\) is then trained using the predictions of each individual classifier as

$$\begin{aligned} \ln \left( \frac{p_{\mathcal {L}}}{1-p_{\mathcal {L}}}\right) = W_0 + \sum _{\mathcal {C} \in \left\{ \mathcal {D}, \mathcal {R}\right\} } \sum _{\mathcal {S} \in \left\{ \mathcal {T}, \mathcal {P}\right\} } W_{\mathcal {S}}^{\mathcal {C}}\, p_{\mathcal {S}}^{\mathcal {C}} \end{aligned}$$
(1)

where \(p_{\mathcal {S}}^{\mathcal {C}}\) represents the prediction output of classifier \(\mathcal {C}\) in region \(\mathcal {S}\) and \(W_{\mathcal {S}}^{\mathcal {C}}\) its learned weight. This model fusion approach allows \(\mathcal {L}\) to learn a weighted combination of per-patient predictions from each classifier based on the relative strengths of each representation and location, enabling a stronger ensemble prediction.
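In scikit-learn terms, Eq. (1) amounts to fitting a logistic regression on a matrix of per-patient classifier outputs. The sketch below uses random stand-in probabilities purely for illustration; it is not the authors' code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-ins for per-patient probabilities from D_T, D_P, R_T, R_P
# (one row per patient, one column per classifier-region pair).
train_probs = rng.uniform(size=(80, 4))
y_train = rng.integers(0, 2, size=80)

fuser = LogisticRegression().fit(train_probs, y_train)
# W_0 corresponds to fuser.intercept_; the four W weights to fuser.coef_.
p_fused = fuser.predict_proba(train_probs)[:, 1]
```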

4 Experimental Design

4.1 Data Description

The dataset consists of axial-plane breast DCE-MRIs of 114 patients with biopsy-proven breast cancer, acquired on a 1.5 or 3 T magnet prior to the administration of neoadjuvant chemotherapy [4, 12]. DCE-MRI acquisitions were collected across six separate contrast enhancement sequences for each patient, and three-frame intratumoral masks were annotated at the peak-enhancement sequence by a trained radiologist. Peritumoral masks were generated by expanding the intratumoral mask 3 mm outward. For all experiments, the patients were divided into a training (N = 80) and a held-out testing (N = 34) cohort. The training cohort was further stratified into three training (N = 53) and validation (N = 27) folds for cross-validation.

4.2 Implementation Details

Deep Learning: CNN inputs, of size \(150\times 110\times 3\times 3\) for the peritumoral network and \(146\times 104\times 3\times 3\) for the tumoral network, consist of 3-dimensional blocks centered at the region of interest with a fourth, temporal dimension corresponding to different phases of the DCE-MRI acquisition. The network (depicted in Fig. 1) consists of three convolutional blocks, each containing two convolutional layers with ReLU activation followed by batch normalization and max pooling layers. Each block sequentially widens the network by increasing the number of filters in its convolutional layers, from 8 to 16 to 32, all with kernels of size \(3\times 3\times 1\). After flattening of the final convolutional outputs, a small dense block follows, consisting of two dense layers (with 128 and 64 units, respectively), each followed by ReLU activation and 50% dropout. A final sigmoid layer computes the output probability. Data augmentation was performed by applying random rotations and spatial zooming, as well as by randomly sampling, in temporal order, from the 5 available DCE-MRI post-contrast acquisitions to combine with the pre-contrast scan as training input. Training used a binary cross-entropy loss and a stochastic gradient descent optimizer with Nesterov momentum of 0.9, a learning rate of 0.0005, and a learning rate decay of 0.01. Visual attention with guided back-propagation [13], via keras-vis [14], was applied post hoc to evaluate the image regions corresponding to a prediction of response.
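For concreteness, below is a minimal Keras sketch of the architecture described above, using the tumoral input size; padding, pooling strides, and the handling of learning-rate decay are assumptions, and the authors' exact configuration may differ:

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(146, 104, 3, 3))  # 3-D block, temporal phases as channels
x = inputs
for filters in (8, 16, 32):                      # three widening convolutional blocks
    x = layers.Conv3D(filters, (3, 3, 1), activation="relu", padding="same")(x)
    x = layers.Conv3D(filters, (3, 3, 1), activation="relu", padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling3D(pool_size=(2, 2, 1))(x)
x = layers.Flatten()(x)
for units in (128, 64):                          # dense block with 50% dropout
    x = layers.Dense(units, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(
    # The paper's learning rate decay of 0.01 would be added via a
    # schedule in current Keras versions.
    optimizer=tf.keras.optimizers.SGD(learning_rate=5e-4, momentum=0.9,
                                      nesterov=True),
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.AUC()],
)
```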

Radiomics Classifier: Within the tumor and peri-tumoral region, a total of 99 radiomic texture features were extracted voxelwise on the DCE-MRI phase of peak contrast enhancement, including 25 Laws, 48 Gabor, 13 gray level co-occurrence matrix (GLCM), and 13 co-occurrence of local anisotropic gradient orientation (CoLlAGe) features. See [4] for a more detailed description of the radiomic feature set explored. Five first-order statistics - mean, median, standard deviation (SD), skewness, and kurtosis - were computed to describe the distribution of features within each region. A set of top features was chosen with a two-part feature selection scheme. First, the feature set was pruned to eliminate correlated features based on a maximum allowable Spearman correlation between features, with the retained feature of each correlated pair chosen by Wilcoxon rank-sum test. Second, two rounds of minimum redundancy maximum relevance (mRMR) feature selection were used to identify between 1 and 20 top features, optimized over 1000 iterations in cross-validation [4]. Top features within each fold were used to train several classifiers: naive Bayes, support vector machine (SVM), and diagonal linear discriminant analysis (DLDA). The optimal combination of correlation threshold for pruning, number of top features, and type of classifier was chosen based on cross-validated AUC within the training set, then applied to the testing set.
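A sketch of the correlation-based pruning step (Spearman correlation with Wilcoxon-guided retention); the threshold and tie-breaking details are assumptions:

```python
import numpy as np
from scipy import stats

def prune_correlated(X, y, max_rho=0.9):
    """Drop one feature from each pair with |Spearman rho| above max_rho,
    keeping the feature with the stronger Wilcoxon rank-sum separation."""
    rho, _ = stats.spearmanr(X)          # feature-feature correlation matrix
    rho = np.abs(rho)
    # Rank-sum statistic of each feature between responders and non-responders.
    score = np.array([abs(stats.ranksums(X[y == 1, j], X[y == 0, j]).statistic)
                      for j in range(X.shape[1])])
    keep = np.ones(X.shape[1], dtype=bool)
    for i in range(X.shape[1]):
        for j in range(i + 1, X.shape[1]):
            if keep[i] and keep[j] and rho[i, j] > max_rho:
                keep[j if score[i] >= score[j] else i] = False
    return np.flatnonzero(keep)
```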

Model Fusion: The fused model was developed in the following fashion:

1. Tuning individual models: Region-specific radiomic and deep learning classifiers were optimized via 3-fold cross-validation within the training set. Specific hyper-parameter details are provided in Sect. 5.1.

2. Evaluating model fusion by cross-validation: For each optimized model, validation-fold predictions were accumulated into a set of predicted response probabilities covering the full training set. Cross-validation was then repeated, this time training the logistic regression model weights on predictions from the training folds and evaluating on predictions from the validation fold.

3. Creating and testing final fusion models: The final regression model was trained on the accumulated cross-validation predictions from all 3 folds. A final version of each individual model was retrained on the full training set with the optimal hyper-parameters discovered in cross-validation. Fusion model predictions were then collected by passing the prediction outputs of each final individual model to the logistic regression classifier, as sketched below.
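A minimal sketch of steps 2 and 3, assuming base models exposing scikit-learn-style fit/predict_proba interfaces (all names are hypothetical):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression

def out_of_fold_probs(model_factory, X, y, n_splits=3):
    """Accumulate validation-fold predictions covering the full training set."""
    oof = np.zeros(len(y))
    for tr, va in StratifiedKFold(n_splits=n_splits).split(X, y):
        model = model_factory().fit(X[tr], y[tr])
        oof[va] = model.predict_proba(X[va])[:, 1]
    return oof

# One column of out-of-fold probabilities per classifier-region pair
# (D_T, D_P, R_T, R_P); `base_models` pairs a factory with its feature matrix.
# oof_matrix = np.column_stack([out_of_fold_probs(f, X_f, y)
#                               for f, X_f in base_models])
# fuser = LogisticRegression().fit(oof_matrix, y)
```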

5 Results and Discussion

5.1 Experiment 1: Individual pCR Prediction Models

Deep Learning: Final deep learning models were trained on the full training set for 67 epochs, based on the average convergence time observed in cross-validation. Two variants of the deep learning classifier were trained, one using intratumoral segmentations and the other using peritumoral segmentations. The deep learning model focused on the tumor outperformed the peritumoral model in both the training and testing sets (Table 1).

Radiomics: Best performance was achieved in cross-validation of the training set when using an SVM classifier, initially pruning features with correlation higher than 90%, and choosing a final set of 11 features via mRMR feature selection. Two variants of the SVM classifier, \(\mathcal {R_\mathcal {T}}\) and \(\mathcal {R_\mathcal {P}}\), were trained, one within each region. Performance of the tumoral and peri-tumoral models was found to be comparable (Table 1).

Table 1. Classification results for single model and multi-model fusion

5.2 Experiment 2: Multi-region, Multi-representation Response Prediction via RESONATE

A fusion of predictions from all spatially-oriented classifiers, \(\mathcal {L}\)(\(\mathcal {D_\mathcal {T}}\), \(\mathcal {D_\mathcal {P}}\), \(\mathcal {R_\mathcal {T}}\), \(\mathcal {R_\mathcal {P}}\)), was found to best identify pCR, achieving an AUC of \(0.78 \pm 0.05\) in cross-validation and 0.79 in the testing set. Confidence intervals (CI) and p-values for the testing set AUC were computed via 50,000-iteration permutation testing [15], giving a 95% CI of 0.62–0.96, p = 0.003. Note that, for the full RESONATE model, as well as for some individual models, performance increased in the testing set relative to the training set, likely because final models were derived from the full training set, whereas models evaluated in cross-validation leveraged only a portion of the training data. Based on the weights of \(\mathcal {L}\), the ensemble prediction relied primarily on \(\mathcal {R_\mathcal {P}}\), \(\mathcal {R_\mathcal {T}}\), and \(\mathcal {D_\mathcal {T}}\), which had weights of 0.80, 0.99, and 0.77, respectively. The differences in these representations between patients identified as pCR and non-responders are depicted visually in Fig. 2. Meanwhile, the peritumoral deep learning model, \(\mathcal {D_\mathcal {P}}\), with a weight of −0.08, contributed the least to pCR prediction relative to the other models.
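A label-permutation p-value for the testing set AUC can be sketched as follows (the CI computation of [15] may use a different resampling scheme; this example covers the p-value only, and names are hypothetical):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def permutation_pvalue(y_true, y_score, n_iter=50_000, seed=0):
    """One-sided p-value for AUC under the null of no label-score association;
    n_iter matches the paper's 50,000 permutations."""
    rng = np.random.default_rng(seed)
    observed = roc_auc_score(y_true, y_score)
    null = np.array([roc_auc_score(rng.permutation(y_true), y_score)
                     for _ in range(n_iter)])
    return observed, (null >= observed).mean()
```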

5.3 Experiment 3: Comparative Strategies - Pairwise Fusion

Each pairwise combination of classifiers was similarly fused for comparison against the full RESONATE model (Table 1); none matched its performance. A radiomics model combining information from both the tumor and peritumoral region, previously shown to be an effective pCR prediction strategy [4], under-performed relative to the model that also incorporates multi-region deep learning, with an AUC of \(0.75 \pm 0.04\) in the training set and 0.77 in the testing set (95% CI: 0.58–0.97, p = 0.006). Likewise, fusion of the deep learning models from both regions achieved an AUC of \(0.73 \pm 0.07\) in the training set and 0.75 in the testing set (95% CI: 0.50–0.99, p = 0.026).

Similarly, fusion of representations within a single region predicted pCR less effectively: models fusing deep learning and radiomic representations only inside or only outside the tumor achieved testing set AUCs of 0.78 (95% CI: 0.55–1.00, p = 0.012) and 0.70 (95% CI: 0.45–0.94, p = 0.052), respectively. Of all pairwise combinations, the best performance was observed when combining the tumoral deep learning model and the peritumoral radiomics model (AUC = 0.78, 95% CI: 0.55–1.00, p = 0.014). This finding emphasizes the value of considering both multiple representations and multiple regions of analysis in pre-treatment response prediction.

Fig. 2. Middle, radiomics feature maps: increased expression of mid-frequency Gabor features within the peri-tumoral region and CoLlAGe entropy features within the tumor distinguished non-response (right). Visual attention maps for the intra-tumoral CNN emphasize the tumor border and core in patients who achieved pCR (right).

6 Conclusion

Our results show that an ensemble of classifiers oriented spatially within the tumor habitat is a viable method for predicting favorable response to NAC in patients with biopsy-proven breast cancer. We applied deep learning and radiomic classifiers with attention focused on either the intratumoral or peritumoral region of the breast, with the individual classifier predictions further boosted by a logistic regression ensemble model, achieving an AUC of 0.79 in a held-out test set. This work is the first to present a methodology for the fusion of radiomics and deep learning approaches across multiple regions of biological significance, and it emphasizes the importance of multi-region, multi-representation analysis in the pre-treatment determination of which patients will benefit from therapeutic intervention.