Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images

Masoud Tafavvoghi, Anders Sildnes, Mehrdad Rakaee, Nikita Shvetsov, Lars Ailo Bongo, Lill-Tove
Rasmussen Busund, Kajsa Møllersen
Journal of Pathology Informatics, January 2025

Abstract

This study uses hematoxylin and eosin (H&E) stained whole-slide images (WSIs) for molecular
subtyping of breast cancer. The pipeline has two steps: tumor vs. non-tumor tile classification,
followed by molecular subtype prediction on the tumor tiles. The proposed method achieved an F1
score of 0.95 for tumor detection and a macro F1 score of 0.73 for subtype classification. These
findings underscore the potential of deep learning models as cost-effective and efficient
alternatives to traditional methods like immunohistochemistry and gene expression profiling.

Introduction

Breast cancer is one of the main causes of cancer-related deaths, and molecular subtyping is
essential for individualised treatment plans. Conventional techniques have drawbacks: gene
expression profiling is costly, and immunohistochemistry (IHC) interpretation can be subjective.
Deep learning offers an automated and economical way to classify subtypes directly from
H&E-stained whole-slide images (WSIs). This paper proposes a multi-step deep learning pipeline
for accurate and efficient classification. By restricting the analysis to tumor tissue before
predicting the subtype, the method aims to improve accessibility and lower diagnostic costs.

Objectives

1. Create a workflow that uses H&E-stained WSIs to classify the molecular subtypes of
breast cancer (Luminal A, Luminal B, HER2-enriched, and Basal-like).
2. Make sure subtype classification is based only on tumor tissue by training a deep
learning model to differentiate between tumor and non-tumor regions.
3. Use binary classifiers in a One-vs-Rest (OvR) approach, and use an XGBoost model to
aggregate the findings for subtype prediction.
4. Improve model accuracy by using preprocessing methods including data balancing, color
normalization, and tile extraction.
5. Evaluate the pipeline's performance on a sizable dataset using recall, precision, and
F1 score, and assess how well the model generalizes and how useful it is in research and
clinical contexts.

Methodology

• Data Collection: The study utilized publicly available datasets such as TCGA-BRCA,
BRACS, CPTAC-BRCA, and HER2-Warwick, collectively comprising 1433 WSIs.
These datasets were annotated and categorized based on breast cancer molecular subtypes
(LumA, LumB, HER2, and BL).
• Preprocessing:
  ◦ WSI regions were divided into tiles of 512x512 pixels at a fixed magnification.
  ◦ Tumor regions were identified using a binary classifier to exclude irrelevant tiles.
  ◦ Color normalization was performed using the Macenko method to reduce
  variations in image acquisition.
• Segmentation: Tumor tiles were separated from non-tumor tiles using Inception_V3, a
convolutional neural network architecture optimized for hierarchical feature extraction.
• Feature Extraction and Classification:
  ◦ The molecular subtype classification used a One-vs-Rest (OvR) strategy with four
  binary classifiers.
  ◦ Results were aggregated using an eXtreme Gradient Boosting (XGBoost) model.
  ◦ Class imbalances were addressed by augmenting HER2 data through tile overlaps
  and balanced sampling.
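
The tiling step, and the use of overlapping tiles to enlarge an underrepresented class, can be sketched as a grid of tile origins with a configurable stride. This is a minimal illustration, not the authors' code: the function name `tile_origins`, the region size, and the 50% overlap are assumptions chosen for the example. Only the 512x512 tile size comes from the text above.

```python
from typing import Iterator, Tuple


def tile_origins(
    width: int, height: int, tile: int = 512, overlap: float = 0.0
) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) pixel coordinates of full tiles inside a region.

    `overlap` is the fraction of a tile shared with its neighbour; 0.0 gives a
    non-overlapping grid, larger values give more tiles from the same tissue.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    stride = max(1, int(tile * (1.0 - overlap)))
    # Only full tiles are produced; partial tiles at the border are skipped.
    for y in range(0, height - tile + 1, stride):
        for x in range(0, width - tile + 1, stride):
            yield (x, y)


# Hypothetical 2048x1024 region: same tissue, with and without overlap.
no_overlap = list(tile_origins(2048, 1024))
half_overlap = list(tile_origins(2048, 1024, overlap=0.5))
print(len(no_overlap), len(half_overlap))
```

With these illustrative settings the overlapping grid yields more tiles from the same region, which is the sense in which overlap can serve as augmentation for a minority class such as HER2.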

Techniques and Algorithms

• Models:
  ◦ Inception_V3 (deep learning): Used for tumor vs. non-tumor classification due to its
  multi-scale feature capture capabilities.
  ◦ XGBoost (gradient-boosted trees): Aggregated predictions from the OvR classifiers for
  molecular subtyping.
• Macenko Color Normalization: Ensured consistent color representation across WSIs,
mitigating acquisition-related biases.
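
One way to picture the aggregation step: each of the four binary OvR classifiers scores every tumor tile of a slide, and per-slide summaries of those scores become the input to the aggregating model. The sketch below only builds such a feature vector (the mean score per classifier) and uses a plain argmax as a stand-in for the trained XGBoost model. The feature choice, the function names, and the scores are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean
from typing import Dict, List

SUBTYPES = ["LumA", "LumB", "HER2", "BL"]


def slide_features(tile_scores: Dict[str, List[float]]) -> List[float]:
    """Mean positive-class score per OvR classifier, in SUBTYPES order."""
    return [mean(tile_scores[name]) for name in SUBTYPES]


def argmax_subtype(features: List[float]) -> str:
    """Stand-in for the trained aggregator: pick the highest mean score."""
    best = max(range(len(SUBTYPES)), key=lambda i: features[i])
    return SUBTYPES[best]


# Made-up tile-level scores for one slide (three tumor tiles per classifier).
example_scores = {
    "LumA": [0.9, 0.7, 0.8],
    "LumB": [0.4, 0.5, 0.3],
    "HER2": [0.1, 0.2, 0.0],
    "BL": [0.2, 0.1, 0.3],
}
features = slide_features(example_scores)
predicted = argmax_subtype(features)
print(features, predicted)
```

In a pipeline like the one described, a learned aggregator would replace `argmax_subtype`, since it can weigh the four classifiers against each other instead of trusting the largest raw score.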

Results and Discussion

• Tumor Detection: Achieved an F1 score of 0.95, indicating that tumor and non-tumor
regions were separated reliably.
• Molecular Subtyping: Achieved a macro F1 score of 0.73. Performance varied across
subtypes, with better results for well-represented classes like LumA.
• Advantages:
  ◦ Reduced reliance on expensive gene expression profiling.
  ◦ Improved accessibility for resource-limited settings.
  ◦ Streamlined diagnostic workflows with automated image analysis.
• Challenges:
  ◦ Dataset imbalance affected minority classes (e.g., HER2).
  ◦ Model generalizability requires further validation across diverse datasets.
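
Because the subtype result is reported as a macro F1 score, it helps to see how that metric treats classes of different sizes: F1 is computed per class and then averaged without weighting, so a weak minority class pulls the score down as much as a weak majority class. The labels below are invented for illustration and are not data from the study.

```python
from typing import Dict, List


def f1_per_class(
    y_true: List[str], y_pred: List[str], labels: List[str]
) -> Dict[str, float]:
    """Per-class F1 = 2*TP / (2*TP + FP + FN); 0.0 if the class never occurs."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: List[str], y_pred: List[str], labels: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    per_class = f1_per_class(y_true, y_pred, labels)
    return sum(per_class.values()) / len(labels)


labels = ["LumA", "LumB", "HER2", "BL"]
# Toy slide-level labels, made up for this example.
y_true = ["LumA", "LumA", "LumA", "LumA", "LumB", "LumB", "HER2", "BL"]
y_pred = ["LumA", "LumA", "LumA", "LumB", "LumB", "LumA", "HER2", "BL"]

per_class = f1_per_class(y_true, y_pred, labels)
macro = macro_f1(y_true, y_pred, labels)
print(per_class, macro)
```

In this toy case the per-class scores are 0.75, 0.5, 1.0, and 1.0, so the macro F1 is 0.8125 even though six of the eight predictions are correct.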

Research Gaps

• Limited generalization to datasets with different imaging conditions or scanner types.
• Difficulty in accurately classifying underrepresented subtypes.
• Lack of integration with multimodal data, such as genetic and clinical records.

Future Scope

• Enhancing the dataset with diverse and balanced samples to improve minority class
performance.
• Expanding the pipeline to include multimodal inputs like immunohistochemistry or
radiological data.
• Deploying the system in real-world clinical settings for large-scale validation.

Conclusion

The study highlights the feasibility of using deep learning for breast cancer molecular subtyping
via H&E WSIs. The approach reduces cost, time, and dependency on specialized tools, making it
suitable for broader adoption in clinical settings. With improvements in data diversity and model
robustness, this methodology could transform precision oncology practices.