Main

Single-cell omics provides a detailed view of gene expression patterns in individual cells. This level of granularity is used to decode complex biological processes and disease states. It facilitates the identification of distinct types and states of cells1, the inference of cellular interactions within a tissue or organism2, and the characterization of temporal dynamics during processes such as development3, disease progression4 or treatment-induced response5.

To organize the vast, noisy and high-dimensional raw data from single-cell omics experiments into interpretable patterns, individual cells can be labeled or annotated in terms of their type, state, location or phenotypic association. The resulting annotations serve as a linchpin in unraveling the intricate details of cellular identities, function and responses to various stimuli.

Cell annotations can be assigned directly on the basis of experimental conditions (for example, samples originating from patients versus healthy controls) or inferred by computational analysis of cellular gene expression and additional measured attributes6. While continuous, hierarchical7 or otherwise structured annotations exist in some cases8,9,10,11, the computational annotation process can be inherently ambiguous, compounded by the assignment of discrete labels to heterogeneous cell populations based on noisy, sparse and high-dimensional data. As a result, annotations must often be refined manually on the basis of prior information relevant to a particular system, followed by arduous verification based on independent statistical analyses, expert knowledge or additional experimental measurements6. Even then, due to their ambiguous nature, a substantial subset of the assigned annotations may be fully or partially incongruent with the cells they aim to describe12.

Here, we show that the level of congruence between cells and their original annotations carries critical information for downstream interpretation of single-cell data. To evaluate the match between a cell population and its given annotations, we exploit the dynamics of training deep neural networks (DNNs) on such data—an information source that is usually discarded. Specifically, we do not use DNNs for direct predictions of cell types, states or other types of annotation, as has been done in many recent works13,14. Instead, we focus on the time it takes and the stability with which a DNN learns to predict the original input annotation for each cell in the data.

DNNs were shown, in many cases, to have the capacity to memorize practically any training dataset, including noisy data points and inaccurate labels15. Although DNNs that are expressive enough will eventually learn all input labels or annotations provided to them, empirical evidence suggests a progressive learning pattern across epochs. DNNs tend to first learn decision boundaries that capture the general structure underlying the annotated data. This implies that the DNN will first learn data points that were correctly annotated and associated with low noise and then progress to those characterized by high noise and/or incorrect annotations16,17,18. Indeed, in the context of image classification, the learning time of an input data point has been successfully used to determine its correct labeling17,19. In the context of natural language processing, ‘data maps’ were introduced to analyze a model’s behavior during training, an approach that was shown to be useful for predicting the reliability of the input label for each data point18.

Here, we leverage the structure of annotated data revealed by neural network training dynamics to identify meaningful patterns in single-cell omics. We present Annotatability, a method for quantifying the congruence between a cell and its input annotation by annotation-trainability analysis. Annotatability is equipped with several modules that address diverse challenges of single-cell omics analysis, including auditing and rectifying erroneous annotations, identifying ambiguous or intermediate cell states, identifying cellular communities with shared gene expression profiles and annotation trainability via graph embedding, and detecting annotation-associated genes (Fig. 1). We demonstrate the utility of Annotatability in different real-world scenarios, including the identification of false cell type annotations and intermediate cell types in human peripheral blood mononuclear cells (PBMCs) (Fig. 2); analysis of different annotations in the mouse small intestine (Supplementary Section A.1 and Extended Data Fig. 1); correction of false cell type annotations in spatial transcriptomics of the MERFISH (multiplexed error-robust fluorescence in situ hybridization) mouse hypothalamic preoptic region (Fig. 3); embedding of cells along the epithelial-to-mesenchymal transition (EMT) (Fig. 4); and embedding of pancreatic β cells according to disease-related cell states, with applications for screening of disease-associated genes, evaluation of treatment effectiveness and sensitive detection of rare subpopulations of healthy-like cells (Fig. 5).

Fig. 1: Annotatability schematic workflow.
figure 1

a, Given an annotated dataset consisting of observations (for example, single-cell gene expression profiles) and corresponding annotations per cell (for example, cell types) (step 1), Annotatability trains a DNN to predict the input annotations (step 2), analyzes its predictions along the training procedure (step 3) and computes confidence and variability scores per cell (step 4). b,c, Each cell is subsequently classified as easy to learn (high confidence/low variability), hard to learn (low confidence/low variability) or ambiguous (mid-confidence/high variability) (b), corresponding to correctly annotated, erroneously annotated or ambiguously annotated cells, respectively (c). d, Annotatability can potentially amend the annotations of cells identified as erroneously annotated using its reannotation module. e,f, Annotatability includes a trainability-aware graph-embedding module, which incorporates similarity between cells in both training dynamics statistics and gene expression (e), enabling signal-specific downstream analyses (f). g, Annotatability is also equipped with a training-dynamics-based score that captures either positive or negative association of genes relative to a given annotation, corresponding to a given biological signal.

Fig. 2: Identification of erroneous and ambiguous annotations in human PBMC scRNA-seq data.
figure 2

a,b, Two-dimensional Uniform Manifold Approximation and Projection (UMAP) of cells colored by either the input cell type annotations (a) or the confidence of the cell type annotation (b) inferred by Annotatability25. Megakaryocytes were filtered out due to low cell counts (88/11,990). c, Two-dimensional confidence–variability map of annotated cells. Subpopulations of cells identified as correctly, ambiguously or erroneously annotated are colored in orange, blue and green, respectively. One-dimensional histograms of each corresponding category appear across the confidence and variability axes. d, A dot plot of the expression of cell type marker genes in each cluster of the annotated cell types. e,f, Dot plots of the expression of cell type marker genes of the annotated cell types in either CD14+ monocytes (e, n = 2,227) or B cells (f, n = 1,621) identified as correctly annotated (top row) or erroneously annotated (bottom row). g, A heatmap of the expression of marker genes of NK cell subtypes (CD56bright and CD56dim) in NK cells (n = 457) identified as ambiguously annotated (top row) or correctly annotated (bottom row). h, A heatmap of the mean expression of marker genes of monocyte subtypes (classical, intermediate and nonclassical) in monocytes (n = 2,578), corresponding to cell groups with different initial annotations and trainability characterization. i, A plot of the mean inferred probability of initially annotated CD14+ monocytes as a function of training epoch, for cells identified as correctly annotated (blue), ambiguously annotated (orange) or erroneously annotated (red) CD14+ cells, and for cells ambiguously annotated as FCGR3A+ (green).

Source data

Fig. 3: Reannotation of erroneous annotations in spatial transcriptomics MERFISH dataset of mouse hypothalamic preoptic region (n = 64,373).
figure 3

a, Two-dimensional UMAP of cells colored by cell type annotations29. b,c, A spatial scatter plot of MERFISH data colored by either cell type annotations (b) or cell type annotation confidence inferred by Annotatability (c). d, Two-dimensional confidence–variability map of the annotated cells. e, AUROC (area under the receiver operating characteristic curve) versus percentage of mislabeled cells, for four different methods: Annotatability (green), or scReClassify with either SVM classifier (red), KNN classifier (brown) or RF classifier (blue). The dots represent the mean value, and the vertical lines mark the standard deviation. f–i, Heatmaps of the mean expression of marker genes of the annotated cell types, in inhibitory neurons (f, n = 24,761), excitatory neurons (g, n = 11,757), astrocytes (h, n = 8,393) or endothelial cells (i, n = 5,749) identified as correctly annotated (top row) or erroneously annotated (bottom row). The cell type marker genes29 are: Gad1 (inhibitory neurons), Slc17a6 (excitatory neurons), Myh11 (pericytes), Fn1 (endothelial), Cd24a (ependymal), Selplg (microglia), Aqp4 (astrocytes), Pdgfra (immature oligodendrocytes), Ttyh2 and Mbp (mature oligodendrocytes). j,k, Heatmaps of the mean expression of marker genes of cells that were initially annotated as either inhibitory neurons (j) or excitatory neurons (k) and identified by Annotatability as either correctly annotated (first row) or erroneously annotated (second row), as well as their expression after reannotation by Annotatability (next rows).

Source data

Fig. 4: Capturing the epithelial to mesenchymal transition via trainability-aware graph analysis, based on EMT scRNA-seq data (n = 25,806).
figure 4

a,b, Two-dimensional Uniform Manifold Approximation and Projection (UMAP) of cells colored by either cell line and treatment (a) or cell phenotype (b)36. c, Two-dimensional confidence–variability map of the annotated cells, colored by treatment. d, Two-dimensional UMAP of cells colored by inferred confidence. e,f, The mean expression of epithelial (E, top row) and mesenchymal (M, bottom row) marker genes as a function of the inferred confidence rank in epithelial (e) and mesenchymal (f) annotations. g–i, Two-dimensional UMAPs for (left to right columns) the baseline expression graph of mesenchymal-annotated cells (from the MCF10A cell line), the trainability-aware graph of mesenchymal-annotated cells, the baseline expression graph of epithelial-annotated cells, and the trainability-aware graph of epithelial-annotated cells. The graphs are colored by (top to bottom rows) Louvain clustering (g), treatment (h) and inferred confidence by Annotatability (i). j, Heatmaps of scaled and centered expression of epithelial and mesenchymal marker genes in cells belonging to the two clusters as in g. The columns, from left to right, correspond to the graphs as in g–i.

Source data

Fig. 5: Inferring disease progression, state heterogeneity and treatment responses, based on pancreatic islets scRNA-seq data (n = 32,888).
figure 5

a, A two-dimensional map of the ranks of the confidence and variability scores, computed across all mSTZ cells (regardless of follow-up treatment) with respect to a disease annotation, colored by Ins1 expression38. b, Two-dimensional Uniform Manifold Approximation and Projection (UMAP) showing the trainability-aware graph colored by Ins1 expression; inset: the same UMAP colored by the Louvain clustering of the graph. h, healthy-like; i, intermediate; d, disease. c, A heatmap showing the mean scaled expression of nine marker genes for β cell activity and maturation and β cell dedifferentiation in the Louvain clusters obtained from the trainability-aware graph (disease, intermediate and healthy-like clusters), and in the healthy control. d, A bar plot showing the mean scaled gene expression for all β cell activity/maturation markers (blue) and dedifferentiation markers (orange) in the Louvain clusters and the healthy control. e, Two-dimensional UMAP of the trainability-aware graph colored by the ranks of the confidence score, computed across all mSTZ cells. Inset: the same UMAP colored by the raw confidence scores. f, The same heatmap as in c for five clusters; the number of clusters was increased by tuning the Louvain clustering hyperparameters (Supplementary Fig. 7h–m). g, The same as in a, split by the different treatments. h, A normalized stacked bar plot showing the share of cells from each Louvain cluster for each treatment. i, A two-dimensional confidence–variability map of all mSTZ cells either treated with insulin (alone or with other compounds; red) or not treated with insulin (gray). The rectangle indicates the 43 hard-to-learn cells, of which 41 were treated with insulin (variance <0.15, confidence <0.15). One-dimensional histograms of each corresponding category appear across the confidence and variability axes.
j, Dot plots showing the mean scaled expression of the same marker genes as in c, for either the easy-to-learn + ambiguous mSTZ cells (top row), the hard-to-learn mSTZ cells (middle row) or the control healthy cells (bottom row). Inter., intermediate.

Source data

Results

Characterizing cells and annotations with Annotatability

We developed Annotatability, a framework for annotation-trainability analysis for single-cell omics data, achieved by monitoring the training dynamics of DNNs. Annotatability quantifies these training dynamics as follows. First, it takes as input an annotated dataset consisting of observations such as gene expression profiles of single cells and corresponding annotations such as cell types or disease states per cell (Fig. 1a, step 1). Second, it trains a DNN, in our case, a simple multilayer perceptron, to predict the input annotation of each cell (‘Annotatability workflow’ section in Methods; Fig. 1a, step 2). Third, it analyzes the DNN’s predictions along the training procedure (Fig. 1a, step 3). Finally, it computes two scores for each cell18: (1) the cell’s confidence score is the mean probability assigned to its input annotation across epochs, and (2) the cell’s variability score is the standard deviation of the probability assigned to its input annotation across epochs (‘Annotatability workflow’ section in Methods; Fig. 1a, step 4).
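In code, the score computation reduces to recording, at every epoch, the probability the network assigns to each cell's input annotation, then taking the mean and standard deviation across epochs. The sketch below is a minimal stand-in on synthetic data, replacing the multilayer perceptron with a plain softmax classifier; it illustrates the idea, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an annotated dataset: 60 "cells" x 5 "genes", two classes.
X = rng.normal(size=(60, 5))
y = (X[:, 0] > 0).astype(int)   # input annotations, driven by "gene" 0
y[:5] = 1 - y[:5]               # deliberately flip a few annotations

n_epochs, n_classes = 50, 2
W = np.zeros((5, n_classes))                   # softmax classifier weights
probs_per_epoch = np.zeros((n_epochs, len(y)))

for epoch in range(n_epochs):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    # record the probability assigned to each cell's *input* annotation
    probs_per_epoch[epoch] = p[np.arange(len(y)), y]
    # one full-batch gradient step on the cross-entropy loss
    W -= 0.5 * X.T @ (p - np.eye(n_classes)[y]) / len(y)

confidence = probs_per_epoch.mean(axis=0)   # mean probability across epochs
variability = probs_per_epoch.std(axis=0)   # std of probability across epochs
```

On this toy data, the deliberately flipped cells end up with markedly lower confidence than the rest, mirroring the expected behavior for erroneously annotated cells.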

Based on previous results18, we expect learning to become easier, quicker and less volatile as the fit between a cell and its input annotation increases, and vice versa. Consequently, we expect correctly annotated cells to have high confidence and low variability scores, erroneously annotated cells to have low confidence and low variability scores, and ambiguously annotated cells to have mid-confidence and high variability scores. By setting appropriate thresholds (‘Classifying cells on the basis of confidence and variability scores and setting thresholds’ section in Methods), we classify each cell to one of these three categories: correctly annotated (easy to learn), erroneously annotated (hard to learn) or ambiguously annotated (Fig. 1b,c).
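Given the two scores, the three-way classification is a pair of thresholding rules. A minimal sketch follows; the numeric thresholds here are illustrative placeholders, not the values used in the paper, whose Methods describe how thresholds are actually set:

```python
import numpy as np

def classify_cells(confidence, variability, conf_hi=0.8, conf_lo=0.2, var_hi=0.1):
    """Assign each cell to a trainability category.

    The thresholds are illustrative placeholders only.
    """
    labels = np.full(len(confidence), "ambiguous", dtype=object)
    labels[(confidence >= conf_hi) & (variability < var_hi)] = "correct"    # easy to learn
    labels[(confidence <= conf_lo) & (variability < var_hi)] = "erroneous"  # hard to learn
    return labels

conf = np.array([0.95, 0.05, 0.50, 0.90, 0.10])
var = np.array([0.02, 0.03, 0.30, 0.05, 0.04])
categories = classify_cells(conf, var)
```

High-confidence/low-variability cells come out "correct", low-confidence/low-variability cells "erroneous", and everything else "ambiguous".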

To reduce the sensitivity of Annotatability to the details of the DNN, we use a general, nonspecialized architecture, specifically a fully connected network20. Moreover, the known tendency of DNNs to overfit is a key aspect of our approach: the data will eventually be memorized by the network without needing to impose particular assumptions on its structure or on the network hyperparameters. Indeed, Annotatability was found to be robust across a range of hyperparameter and architecture settings (Supplementary Section A.5 and Supplementary Fig. 1).

Annotatability uses the trainability-based cell classifications for downstream biological analysis in several ways. First, it allows us to focus any downstream analysis solely on correctly annotated cells and to study ambiguously annotated cells as candidates for intermediate cell states. It also allows us to identify distinct subpopulations of cells that do not fit their input annotations, such as healthy-like cells in a diseased sample. Second, the identification of correctly and erroneously annotated cells can be used to amend erroneous annotations; we equipped Annotatability with an optional reannotation module that trains a DNN solely on correctly annotated cells; the network is then used to reannotate the entire dataset (‘Reannotation of erroneously annotated cells’ section in Methods; Fig. 1d). Third, Annotatability incorporates a trainability-aware graph-embedding module, utilizing the trainability of a given annotation as a proxy for the biological variation, or signal, encoded by the cell. Recent years have seen the development of many single-cell RNA sequencing (scRNA-seq) graph-based analysis pipelines, where nodes represent cells and edges between cell pairs are weighted by their proximity in gene expression space10,21,22,23,24. Annotatability extends this approach by also considering the similarity of cells’ Annotatability confidence scores when weighting edges (‘Trainability-aware graph embedding’ section in Methods). This reweighting enhances signals of interest, such as a cell’s developmental stage or disease state, enabling signal-specific downstream analyses (‘Trainability-aware graph embedding’ section in Methods; Fig. 1e,f).
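One plausible way to make an expression graph trainability-aware is to damp edges between cells whose confidence scores disagree. The sketch below uses a Gaussian kernel on confidence-score differences as an assumed reweighting; the module's actual formula is given in the Methods:

```python
import numpy as np

def trainability_aware_weights(expr_weights, confidence, sigma=0.1):
    """Reweight expression-graph edges by confidence similarity.

    Assumption: a Gaussian kernel on confidence differences, one
    plausible choice; not necessarily the paper's exact reweighting.
    """
    c = np.asarray(confidence)
    sim = np.exp(-np.abs(c[:, None] - c[None, :]) / sigma)
    return expr_weights * sim

# Three cells: 0 and 1 share similar confidence, 2 differs sharply.
w = np.ones((3, 3))            # uniform expression-graph weights
conf = np.array([0.9, 0.88, 0.2])
w2 = trainability_aware_weights(w, conf)
```

Edges between cells with similar confidence (0 and 1) keep most of their weight, while edges to the dissimilar cell (2) are strongly damped, so the reweighted graph emphasizes the annotation-related signal.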
Finally, Annotatability is equipped with a training-dynamics-based score that captures either positive or negative association of genes relative to a given biological signal, revealed by their correlation or anticorrelation with the confidence in a particular annotation (‘Annotation-trainability score’ section in Methods; Fig. 1g).
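The gene-association idea can be sketched as a per-gene correlation between expression and the per-cell confidence score. The paper's exact annotation-trainability score is defined in its Methods; the Pearson correlation below on synthetic data is only an illustrative proxy:

```python
import numpy as np

def annotation_trainability_scores(X, confidence):
    """Per-gene Pearson correlation of expression with annotation confidence.

    Positive scores mark positively associated genes, negative scores
    mark anticorrelated genes (a proxy for the paper's score).
    """
    Xc = X - X.mean(axis=0)
    cc = confidence - confidence.mean()
    denom = np.sqrt((Xc**2).sum(axis=0) * (cc**2).sum())
    return (Xc * cc[:, None]).sum(axis=0) / denom

rng = np.random.default_rng(1)
conf = rng.uniform(size=200)                  # per-cell confidence scores
X = np.column_stack([
    conf + 0.1 * rng.normal(size=200),        # positively associated gene
    -conf + 0.1 * rng.normal(size=200),       # negatively associated gene
    rng.normal(size=200),                     # unassociated gene
])
scores = annotation_trainability_scores(X, conf)
```

The first gene scores strongly positive, the second strongly negative, and the random gene near zero.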

Identifying erroneous annotations and ambiguous cell states in human PBMCs

We begin by demonstrating how annotation-trainability analysis via Annotatability can be used to identify both erroneously and ambiguously annotated cells, the latter potentially corresponding to intermediate cell types. We apply Annotatability to a scRNA-seq dataset of human PBMCs25. In the original dataset, each cell was preannotated to one of multiple cell types (Fig. 2a). We used Annotatability to train a DNN classifier that predicts cell types from the annotated PBMC scRNA-seq data, and assign confidence and variability scores to each cell (Fig. 2b,c).

We first assess the set of cells classified as erroneously annotated by Annotatability on the basis of their low confidence and low variability scores (‘Classifying cells on the basis of confidence and variability scores and setting thresholds’ section in Methods). These erroneously annotated cells indeed exhibit underexpression (or no expression) of marker genes corresponding to their input cell type annotation (Fig. 2d). For example, cells classified as erroneously annotated CD14+ monocytes underexpress CD14, yet ~80% of them express the T cell marker IL7R and ~60% express the natural killer (NK)/T cell marker NKG7. Similarly, cells classified as erroneously annotated B cells underexpress the B cell marker CD79A, yet a substantial fraction expresses various T cell or dendritic cell markers (Fig. 2d–f and Supplementary Fig. 2a,b).

We then assessed whether cells classified as ambiguously annotated indeed correspond to intermediate or otherwise ambiguous cell types, focusing first on the monocyte population. Monocytes are broadly divided into three major subtypes: classical (CD14++CD16−), intermediate (CD14++CD16+) and nonclassical (CD14+CD16++)26. However, in the original scRNA-seq data, monocytes were annotated as either classical (CD14+) or nonclassical (FCGR3A+), where FCGR3A is a common alias for CD16. Thus, we expect intermediate cells in the PBMC dataset to be classified as ambiguously annotated. To support our approach, we used previously established marker genes25, specifically, the top two highly expressed marker genes for each monocyte subtype that were not filtered out during preprocessing (classical monocyte markers: S100A12/ALOX5AP; intermediate monocyte markers: GPR35 and MARCO; nonclassical monocyte markers: ICAM4/CD79b). Indeed, CD14+ monocytes classified by Annotatability as ambiguously annotated express lower levels of classical monocyte markers and higher levels of intermediate monocyte markers than CD14+ monocytes classified as correctly annotated (Fig. 2d,h, Supplementary Fig. 3 and Supplementary Section A.2.2). Similarly, FCGR3A+ monocytes classified as ambiguously annotated express lower levels of nonclassical monocyte markers and higher levels of intermediate monocyte markers than FCGR3A+ monocytes classified as correctly annotated (Fig. 2d,h, Supplementary Fig. 3 and Supplementary Section A.2.2). In a semi-simulated dataset in which cells from the PBMC dataset were randomly blended, Annotatability outperformed other classifiers in identifying ambiguous cell states (Supplementary Section A.4). Similar analysis of peripheral blood NK cells revealed intermediate cell states classified by Annotatability as ambiguously annotated (Fig. 2g and Supplementary Section A.2.1).

Having validated Annotatability’s categorization of cells as correctly, erroneously or ambiguously annotated, we next directly evaluate differences in training dynamics among these three categories as a function of the learning epoch. We focus on the group of cells originally annotated as classical (CD14+) monocytes. For each epoch, we compute the mean probability that cells in each category are recognized as classical monocytes by the neural network. We identify three distinct regimes along training: early (0–5 epochs), middle (5–50 epochs) and late (50–100 epochs). For cells in the correctly annotated category, the mean inferred probability of retaining the classical (CD14+) annotation increases to a value close to 1.0 during the early regime and remains high throughout the training process (Fig. 2i, blue). In that sense, correctly annotated cells are easy to learn. For cells in the erroneously annotated category, the mean inferred probability decreases to a value close to 0.0 during the early regime and remains low throughout the middle regime, yet gradually increases during the late regime, reaching a value of 0.25 (Fig. 2i, red). In that sense, erroneously annotated cells are hard to learn, but the network begins to memorize their incorrect annotations during the late regime, which is therefore also termed the overfitting regime. For cells in the ambiguously annotated category, the mean inferred probability of retaining the classical (CD14+) annotation increases gradually during the early and middle regimes, saturating only upon reaching the late regime (Fig. 2i, orange). For the same set of cells, the mean inferred probability that they are in fact nonclassical monocytes (FCGR3A+) increases during the early regime, but reverts to 0.0 during the middle regime.
This example demonstrates how the training dynamics patterns of ambiguously annotated cells differ from those of correctly and erroneously annotated cells, and why they can inform an intermediate transcriptional state that is neither classical nor nonclassical (Fig. 2i).
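Curves like those in Fig. 2i follow directly from the recorded training dynamics: average the per-epoch probabilities within each trainability category. The toy dynamics below are synthetic sigmoids chosen to mimic the described regimes, not real data:

```python
import numpy as np

def mean_probability_curves(probs_per_epoch, categories):
    """Mean inferred probability of the input annotation per training
    epoch, for each trainability category (as in the curves of Fig. 2i)."""
    return {cat: probs_per_epoch[:, categories == cat].mean(axis=1)
            for cat in np.unique(categories)}

# Toy dynamics: easy cells are learned early; hard cells are only
# memorized during the late (overfitting) regime, plateauing near 0.25.
epochs = np.arange(100)
easy = 1 / (1 + np.exp(-(epochs - 3)))            # saturates early
hard = 0.25 / (1 + np.exp(-(epochs - 80) / 5))    # rises only late
probs = np.column_stack([easy, easy, hard])       # epochs x cells
cats = np.array(["correct", "correct", "erroneous"])
curves = mean_probability_curves(probs, cats)
```

With real data, `probs_per_epoch` would hold the probabilities recorded during DNN training and `categories` the confidence/variability-based classification.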

Reannotating spatial transcriptomics data

After identifying erroneously annotated cells using Annotatability, one option is to filter them from their corresponding single-cell dataset, based on the assumption that the remaining cells retain sufficiently rich information. However, in some cases, discarding all misclassified cells may disrupt the biological interpretation of the data. For example, in spatially aware single-cell transcriptomic measurements27, preserving comprehensive spatial maps of the measured regions is crucial for correctly interpreting spatial gene expression and for inferring associated collective cell behavior. Therefore, we equipped Annotatability with a reannotation module for the robust reannotation of cells identified as erroneously annotated (‘Reannotation of erroneously annotated cells’ section in Methods). Furthermore, the annotation of spatial data poses unique challenges compared with nonspatial data, as many of the spatially aware transcriptomics experimental methods are limited in the number of genes that are measured and/or in their spatial resolution, where each spatial location can contain mixed measurements of multiple cells28. While Annotatability, in its current form, does not simultaneously infer confidence for mixtures of cell types, the confidence score in a cell type annotation appears to be strongly correlated with the relative contribution of the corresponding type to the gene expression mixture (Supplementary Fig. 4e,f).

We evaluated the capability of Annotatability to identify erroneously annotated cells and reannotate them in spatial transcriptomics data for a MERFISH dataset of the mouse hypothalamic preoptic region29 (Fig. 3a–d). To benchmark our approach, we first used a semi-synthetic dataset, created by randomly perturbing the annotation of varying fractions of cells, with the underlying assumption that the majority of cells were correctly classified. We used Annotatability to identify the erroneously annotated cells in the perturbed MERFISH data, and compared its performance with scReClassify, a state-of-the-art method for the identification of misclassified cells30 based on either a random forest (RF) or a support vector machine (SVM) classifier, in addition to comparing with a k-nearest neighbors (KNN) classifier based on the scReClassify framework (Supplementary Section A.4). Annotatability consistently outperformed baselines in distinguishing between correctly and erroneously annotated cells in the perturbed data, its advantage increasing with the fraction of synthetically misclassified cells (Fig. 3e). Similarly, Annotatability outperformed baselines in identifying misclassified cells in three additional synthetically perturbed spatial datasets: a 10x Genomics Visium dataset of a coronal section of mouse brain31, a 4i dataset of subcellular-resolution human tissue cell culture32 and a seqFISH (sequential fluorescence in situ hybridization) mouse embryo dataset33 (Supplementary Fig. 5). The primary advantage of Annotatability becomes evident in ‘noisy’ scenarios, for example, in cases with a large proportion of misclassified cells (Fig. 3 and Supplementary Fig. 5).
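The benchmark's core loop is easy to reproduce: perturb some annotations, score each cell by how poorly its annotation is learned, and compute the AUROC for recovering the perturbed cells. A sketch with a rank-based AUROC and simulated confidence scores follows; the Beta distributions are assumptions for illustration, not fitted to any dataset:

```python
import numpy as np

def auroc(scores, is_mislabeled):
    """AUROC for detecting mislabeled cells from a per-cell score
    (higher score = more suspect), via the Mann–Whitney rank formula."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = is_mislabeled.astype(bool)
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Semi-synthetic benchmark: perturb ~20% of annotations, then score each
# cell by (1 - confidence) and measure separation of the perturbed cells.
rng = np.random.default_rng(2)
n = 500
mislabeled = rng.uniform(size=n) < 0.2
confidence = np.where(mislabeled,
                      rng.beta(2, 8, size=n),    # mislabeled -> low confidence
                      rng.beta(8, 2, size=n))    # correct -> high confidence
score = auroc(1 - confidence, mislabeled)
```

With confidence distributions this well separated, the AUROC is close to 1; in the paper's benchmark the curves of Fig. 3e play this role.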

Progressing to a real-world setting, we inspected cells identified as erroneously annotated in the unperturbed hypothalamic MERFISH dataset (1,138 out of 62,880 cells; Fig. 3f–i and Supplementary Fig. 5a–d), of which the majority were initially annotated as either inhibitory neurons (587/24,761 erroneously annotated; Fig. 3f,j) or excitatory neurons (314/11,757 erroneously annotated; Fig. 3g,k). To reannotate these erroneously annotated cells, we trained Annotatability’s reannotation module and used the trained classifier to predict annotations for the set of erroneously annotated cells (‘Reannotation of erroneously annotated cells’ section in Methods). Indeed, following this procedure, the annotations of cells identified as erroneously annotated were transformed into annotations overall consistent with their marker gene characterization, based on established marker genes for the different cell types29 (Fig. 3j,k and Supplementary Table 1). For example, the mean normalized expression of Gad1, a marker gene for inhibitory neurons29, is 6.830 in the correctly annotated inhibitory neurons, 1.458 in the erroneously annotated inhibitory neurons and 6.795 in cells reannotated as inhibitory neurons. More generally, for cells that Annotatability identifies as erroneously annotated, marker gene expression, on average, aligns with the new cell type annotations following reannotation, rather than the original input annotations (Supplementary Table 1). Thus, error detection based on training dynamics combined with retraining on a subset of high-confidence/low-variability cells enables robust reannotation of spatial transcriptomics data.
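The logic of the reannotation module can be sketched as: fit a classifier only on cells identified as correctly annotated, then predict annotations for the remaining cells. Annotatability trains a DNN for this step; the nearest-centroid classifier below is a deliberately simplified stand-in on synthetic data:

```python
import numpy as np

def reannotate(X, labels, is_correct):
    """Fit on correctly annotated cells only, then relabel all cells.

    A nearest-centroid classifier stands in for the DNN used by the
    actual reannotation module.
    """
    classes = np.unique(labels[is_correct])
    centroids = np.stack([X[is_correct & (labels == c)].mean(axis=0)
                          for c in classes])
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

rng = np.random.default_rng(3)
# Two well-separated "cell types"; cell 0 carries a wrong input label.
X = np.vstack([rng.normal(0, 0.3, (20, 4)), rng.normal(3, 0.3, (20, 4))])
labels = np.array(["A"] * 20 + ["B"] * 20, dtype=object)
labels[0] = "B"                        # erroneous annotation
is_correct = np.ones(40, bool)
is_correct[0] = False                  # flagged as hard to learn
new_labels = reannotate(X, labels, is_correct)
```

The mislabeled cell is pulled back to the type matching its expression profile, while all correctly annotated cells keep their labels.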

Uncovering the epithelial-to-mesenchymal pseudotime trajectory

Going beyond reasoning about annotations of static cell properties and their interpretation, we next demonstrate that annotation-trainability analysis can reveal dynamic trajectories and temporal shifts in cellular states. We illustrate this concept using single-cell data related to the EMT34. Traditionally perceived as a binary transition, the EMT has been conceptually generalized to a continuous temporal transition through intermediate cellular phenotypes35. Computationally, these intermediate states can be inferred using techniques such as graph-based clustering and pseudotime reconstruction35. However, graph-based clustering that relies solely on gene expression, as commonly done21, can be sensitive to sources of variation other than the EMT, such as treatment effects. In contrast, we use Annotatability to construct a trainability-aware gene expression graph that specifically enhances the EMT signal.

We used Annotatability to analyze scRNA-seq datasets of the MCF10A and HuMEC cell lines36, which serve as models for benign mammary epithelial cells. Our analysis included both untreated cells (mock condition) and cells treated with the cytokine TGF-β, which, among other effects37, induces the EMT36 (Fig. 4a). To generate initial input labels for Annotatability, we clustered the cells on the basis of their gene expression using Louvain clustering and labeled cells in the resulting 22 clusters as either epithelial (E) or mesenchymal (M) on the basis of known marker genes36 (epithelial: CDH1, CRB3 and DSP; mesenchymal: VIM, FN1 and CDH2) (Fig. 4b). Following standard preprocessing (‘Data preprocessing’ section in Methods) and without requiring batch integration, we then used Annotatability to compute confidence and variability scores for each cell (‘Annotatability workflow’ section in Methods; Fig. 4c,d). We also used Annotatability to identify erroneously annotated cells and filter them out.
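The initial labeling step, clustering followed by marker-based cluster labeling, can be sketched as follows; the marker lists match the text, but the expression data and the precomputed clustering (standing in for Louvain) are synthetic:

```python
import numpy as np

def label_clusters(expr, genes, clusters, epi_markers, mes_markers):
    """Label each cluster 'E' or 'M' by comparing mean expression of
    epithelial versus mesenchymal marker genes (a simplified version
    of the cluster-labeling step)."""
    gi = {g: i for i, g in enumerate(genes)}
    e_idx = [gi[g] for g in epi_markers]
    m_idx = [gi[g] for g in mes_markers]
    out = {}
    for c in np.unique(clusters):
        sub = expr[clusters == c]
        out[c] = "E" if sub[:, e_idx].mean() > sub[:, m_idx].mean() else "M"
    return out

genes = ["CDH1", "CRB3", "DSP", "VIM", "FN1", "CDH2"]
rng = np.random.default_rng(4)
# Synthetic expression: one epithelial-like and one mesenchymal-like cluster.
expr = np.vstack([
    np.column_stack([rng.gamma(5, 1, (30, 3)), rng.gamma(1, 1, (30, 3))]),
    np.column_stack([rng.gamma(1, 1, (30, 3)), rng.gamma(5, 1, (30, 3))]),
])
clusters = np.array([0] * 30 + [1] * 30)   # stand-in for Louvain clusters
cluster_labels = label_clusters(expr, genes, clusters,
                                ["CDH1", "CRB3", "DSP"],
                                ["VIM", "FN1", "CDH2"])
```

In practice the clusters would come from Louvain clustering of the expression graph (for example, via scanpy), with the same marker-based vote deciding each cluster's E/M label.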

In principle, the assigned confidence scores provide a continuous measure of each cell’s congruence with its input label. Therefore, we hypothesized that these scores reflect a cell’s state along the gradual EMT. Indeed, for cells that were originally labeled as epithelial, the mean expression of epithelial (mesenchymal) marker genes increases (decreases) monotonically with Annotatability’s confidence score (Spearman’s rank correlations: 0.410 and −0.421, respectively) (Fig. 4e,f). In comparison, the correlation of the mean expression of the same genes to pseudotimes computed using the diffusion pseudotime trajectory inference algorithm21,22 is 0.260 and −0.127, respectively (Supplementary Section A.4). Similarly, the mean expression of mesenchymal (epithelial) marker genes increases (decreases) monotonically with the confidence score for cells that were originally labeled as mesenchymal (Spearman’s rank correlations: 0.606 and −0.210, respectively, compared with 0.555 and 0.031 for diffusion pseudotimes, respectively).
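The reported Spearman's rank correlations can be computed with a tie-ignoring rank correlation; the data below are synthetic, so the coefficients here are illustrative, not the paper's values:

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation via Pearson correlation of ranks
    (ties are ignored in this minimal sketch)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return (rx * ry).sum() / np.sqrt((rx**2).sum() * (ry**2).sum())

rng = np.random.default_rng(5)
conf = rng.uniform(size=300)                              # confidence scores
epithelial_marker = conf + 0.5 * rng.normal(size=300)     # rises with confidence
mesenchymal_marker = -conf + 0.5 * rng.normal(size=300)   # falls with confidence
r_e = spearman(conf, epithelial_marker)
r_m = spearman(conf, mesenchymal_marker)
```

With real data, `conf` would be the per-cell confidence in the E (or M) label and the marker vectors the mean expression of the corresponding marker genes; `scipy.stats.spearmanr` is a tie-aware alternative.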

To focus on the EMT signal, we applied our trainability-aware graph embedding algorithm (‘Trainability-aware graph embedding’ section in Methods) to the MCF10A cell line scRNA-seq data, individually for the epithelial and mesenchymal cells, followed by Louvain clustering of each graph (analogous results were obtained for the HuMEC cell line; Supplementary Fig. 6). The trainability-aware graphs, integrating information about both training dynamics statistics and gene expression, indeed capture the progression of cells along the EMT, as the confidence in the input E/M labels changes gradually along the graphs (Fig. 4g–i). This is in contrast to baseline expression graphs, that is, graphs constructed on the basis of gene expression alone, which result in graph structures that capture treatment effects (mock versus TGF-β) instead of the EMT process (‘Trainability-aware graph embedding’ section in Methods; Fig. 4g–i and Supplementary Fig. 6a–c). Specifically, in the mesenchymal (epithelial) baseline expression graph, 99% (100%) of cells in one cluster and 98% (99%) of cells in the second cluster were treated with mock or TGF-β, respectively. This is in contrast to 69% (69%) and 77% (81%) of cells in each of the corresponding clusters in the trainability-aware graph that were treated with mock or TGF-β, respectively. Furthermore, E/M marker genes are differentially expressed at different stages along the trainability-aware graph structure, but not along the baseline expression graph structure, further supporting the correspondence between the trainability-aware graph and the EMT process (Methods, Fig. 4j and Supplementary Fig. 6d).

Inference of disease-related cell states and treatment responses

We next used Annotatability to infer disease states, disease-associated genes and individual cell responses to treatment. We analyzed scRNA-seq data collected from pancreatic islets isolated from healthy mice and from multiple low-dose streptozotocin (mSTZ)-induced diabetic models38. In the mSTZ-induced diabetic models, STZ is applied in multiple low doses, partially degrading β cell activity39; the level of degradation may differ between individual cells40. To quantify the damage to individual α, β and δ cells in the mSTZ mice, we used Annotatability to infer confidence and variability scores with respect to the mSTZ label for each cell, against the backdrop of control cells labeled as healthy (‘Annotatability workflow’ section in Methods). We hypothesized that learning the mSTZ label is easiest for the most severely damaged cells and hardest for the least severely damaged cells. Indeed, the inferred confidence scores are negatively correlated with the expression of β cell activity markers (Ins1, Ins2, Slc2a2, Trpm5, G6pc2, Slc30a8 and Ucn3; Spearman’s ρ = −0.662 for the sum of their expression; Fig. 5a and Supplementary Fig. 7a) and positively correlated with the expression of β cell dedifferentiation markers (ρ = 0.573; Supplementary Fig. 7b,c)41. In comparison, for the diffusion pseudotime trajectory inference algorithm21,22 (Supplementary Section A.4), the corresponding correlation coefficients are −0.021 for the β cell activity markers and −0.222 for the dedifferentiation markers. Conversely, Annotatability’s confidence score also enabled us to identify genes associated with the disease state of individual mSTZ β cells, in the sense that their expression levels are correlated (positively associated) or anticorrelated (negatively associated) with the confidence score assigned to each cell’s disease annotation (‘Annotation-trainability score’ section in Methods; Fig. 1g, Supplementary Section A.3.1 and Supplementary Table 2).

To refine the classification of mSTZ cells by disease level, we constructed a trainability-aware graph (‘Trainability-aware graph embedding’ section in Methods) followed by Louvain clustering, similarly to the above EMT analysis. Initially, we tuned the clustering resolution to guarantee an output of three clusters (Fig. 5b). The three clusters are distinguished by the expression patterns of the nine disease progression markers (Fig. 5b–d and Supplementary Fig. 7d–f): a disease cluster expressing low levels of activity and maturation markers and high levels of dedifferentiation markers; a healthy-like cluster expressing high levels of activity and maturation markers and low levels of dedifferentiation markers, resembling those in the healthy control; and an intermediate cluster expressing intermediate levels of all markers relative to the disease and healthy-like clusters. We verified that the difference in the expression of disease-associated genes across clusters indeed corresponds to changes in the training dynamics scores, specifically, decreasing confidence in the disease annotation along the disease, intermediate and healthy-like clusters (Fig. 5e and Supplementary Fig. 7g). The clustering quality is robust with respect to clustering granularity, consistently manifesting a graded progression from a disease-like to a healthy-like expression pattern of marker genes when the cells are divided into three, four or five clusters (Fig. 5f and Supplementary Fig. 7h–m). Thus, the clustering of the trainability-aware graph provides a refined quantification of the damage to individual mSTZ β cells.

Finally, the islets dataset includes six subgroups of mSTZ-induced diabetic models, each treated with a different compound or combinations thereof38. The distribution of the confidence and variability score ranks for β cells in each treatment subgroup reveals both treatment-independent and treatment-dependent cellular heterogeneity (Fig. 5g,h, Supplementary Fig. 7a–c and Supplementary Section A.3.2). Specifically, each treatment group can be subdivided into disease, intermediate and healthy-like clusters using a trainability-aware graph as above (Fig. 5h and Supplementary Fig. 7n). However, insulin-treated cells have a larger proportion of healthy-like cells, and the combination of glucagon-like peptide-1 and estrogen was shown to be more effective than either treatment alone, in line with previous studies on treatment effects42. Furthermore, our analysis revealed a small subset of 43 hard-to-learn β cells, almost all of which are insulin treated. Marker gene expression for these hard-to-learn cells aligns particularly closely with that of healthy cells, to a degree that is not observed in other treatment groups and cannot be explained by treatment-independent heterogeneity (Fig. 5i,j and Supplementary Section A.3.2). Thus, annotation-trainability analysis provides a highly sensitive tool for evaluating treatment effectiveness and inferring individual cell responses.

Discussion

Annotations of single cells enrich raw biological data with descriptive information that provides context, explanation and detail. However, discrete annotations also force heterogeneous cell populations into rigid molds, whose interpretation is inherently subjective. Consequently, a cell can be incongruent with its input annotation for reasons ranging from simple assignment errors to fundamental ambiguities associated with underlying biological processes. In this work, we introduced Annotatability, which identifies annotation mismatches by tapping into the commonly neglected signal encoded in a DNN’s training dynamics, and, subsequently, relies on these mismatches to enhance the interpretation of single-cell and spatial omics data.

Annotatability is a generic approach in the sense that, given a dataset of cells and their corresponding annotations (or, in principle, any biological entities and their annotations), it leverages training dynamics to categorize cells according to their ‘goodness-of-match’ to their annotations. While fully connected DNNs are utilized for Annotatability’s internal implementation, the approach itself is oblivious to how the input annotations were assigned externally.

At the dataset level, the convergence rate of training a DNN can be influenced by various properties of the data and annotations, such as geometric structure and noise levels. Consequently, these differences may manifest as variations in the distribution of annotated data points in terms of their confidence and variability scores, even for the same dataset using different annotation classes. For example, when the mSTZ dataset (‘Inference of disease-related cell states and treatment responses’ section in Results)38 was annotated by experimental condition (healthy versus mSTZ mouse model) or treatment type, the confidence and variability scores displayed a relatively large spread and a relatively low proportion of easy-to-learn cells (Extended Data Fig. 2a,c), in line with the marked heterogeneity introduced by different disease states and treatment effectiveness (Fig. 5). In contrast, when the same dataset was annotated by cell type, the spread was much lower, with a high proportion of easy-to-learn cells (Extended Data Fig. 2b), probably due to the expression of distinct marker genes that define differentiated α, β and δ cells. Changes in the distribution of confidence and variability scores may reveal biological differences even within a single annotation class. For example, in a dataset of mouse small-intestine cells43 annotated by cell type, the DNN had more difficulty learning stem cell labels than labels associated with more differentiated cells (Extended Data Fig. 1). In future work, we aim to extend Annotatability to enable comparative analysis of the global properties of annotated datasets based on their collective training dynamics. For example, training dynamics can serve as a practical proxy for complexity measures developed in the context of machine learning theory44.

We next discuss several limitations of Annotatability and how to address them in the future. First, Annotatability does not directly handle convolved or aggregated data, such as that obtained from certain spatial transcriptomics technologies, and future work can extend the methodology explicitly to fractional or probabilistic annotations. Second, while Annotatability is robust to the precise choice of hyperparameter values (Supplementary Fig. 1), the determination of hyperparameters such as training time needs to be standardized, as they may influence the resulting confidence and variability scores. Third, the use of thresholding (‘Classifying cells on the basis of confidence and variability scores and setting thresholds’ section in Methods) on these scores for categorization introduces arbitrary cutoffs that could influence the interpretation of the data; this step will be further automated in the future. Fourth, Annotatability currently treats input annotations as nominal categories. For the special but common binary case (for example, epithelial versus mesenchymal, healthy versus diseased tissue), the confidence in a cell’s annotation naturally gives rise to ordinal or numerical interpretations, where lack of confidence (and low variability) in one annotation may be considered as increased confidence in its complement. In future work, we aim to generalize Annotatability to probabilistic and continuous annotations, which will enable us to leverage the internal structure of such annotated datasets, including those continuously annotated according to a developmental or disease trajectory.
Furthermore, while the measures of confidence and variability of DNN predictions along the training process provided ample information to reveal cellular heterogeneity, structure and other patterns related to label incongruence, future work can incorporate additional measures derived from the training dynamics of a DNN to characterize further aspects of the congruence of cells with their input annotations, potentially providing a wider interpretive lens into single-cell heterogeneity. Finally, in this work we demonstrated the application of annotation-trainability analysis to scRNA-seq and spatial transcriptomics datasets. However, our approach is general and can readily extend to other single-cell multi-omic techniques and annotated biological datasets.

Methods

Annotatability workflow

Given input data that include single-cell or spatial omics observables (for example, scRNA-seq gene expression profiles) and corresponding annotations (for example, cell type annotation per cell), the Annotatability general-case workflow is as follows:

  1. preprocess the data;

  2. train a DNN on the input data, monitoring the training dynamics and recording the prediction of the DNN after each epoch;

  3. calculate the training dynamics scores, confidence and variability;

  4. classify each observable (for example, cell) into one of three categories: easy to learn (correctly annotated), hard to learn (erroneously annotated) or ambiguous;

  5. optional step: filter out erroneously annotated cells;

  6. optional step: reannotate erroneously annotated cells using the reannotation module;

  7. optional step: construct a trainability-aware graph embedding, integrating information from gene expression and training dynamics statistics, followed by signal-specific analysis;

  8. optional step: score and rank genes according to their (positive or negative) annotation-trainability association.

Rationale

One of the main aims of Annotatability is to detect cells that are erroneously annotated and cells that are in ambiguous cell states. For both of these tasks, we train a deep learning-based classifier to assign each cell to its input annotation and monitor the training dynamics. We set the model’s level of complexity to be sufficiently high to ensure that the network will eventually assign the input annotations to all provided examples; the underlying assumption is that a complex network such as the one we are using is capable of memorizing any input annotation, even if it is incorrect45. Formally, our approach is based on a model that involves selecting parameters to minimize empirical risk by a stochastic gradient-based optimization procedure over E epochs. The model is assumed to establish a probability distribution over labels for a given observation (cell). The training dynamics of a given instance i, following previous work in the context of natural language processing18, are described by statistical metrics computed over E epochs, which are then used as coordinates on a map. The first metric quantifies the level of confidence with which the learning algorithm assigns the correct label (yi) to an observation (the gene expression of the cell, xi) based on its probability distribution. Specifically, we define confidence, \(\widehat{{u}_{i}}\), as the average probability assigned by the model to the correct label (yi) across all epochs:

$$\widehat{{u}_{i}}=\frac{1}{E}\mathop{\sum }\limits_{e=1}^{E}{p}_{{\theta }^{e}}({y}_{i}| {x}_{i}),$$
(1)

where \({p}_{{\theta }^{e}}\) denotes the model’s probability with parameters θe for the eth epoch. In addition, variability is defined using the standard deviation of \({p}_{{\theta }^{e}}({y}_{i}| {x}_{i})\) across epochs:

$${\hat{\sigma}}_{i}=\sqrt{\frac{\mathop{\sum }\nolimits_{e = 1}^{E}{({p}_{{\theta }^{e}}({y}_{i}| {x}_{i})-{\hat{u}}_{i})}^{2}}{E}}.$$
(2)

Based on previous studies in the fields of computer vision46,47 and natural language processing18, our underlying assumption is that cells with high confidence and low variability (easy to learn) probably have correctly labeled annotations, which the network learns early in the training process. Conversely, cells with low confidence and low variability (hard to learn) are probably mislabeled and would be learned only at later stages of the training procedure. Cells with moderate confidence and high variability are in an ambiguous or intermediate state associated with at least two different annotations.
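Given the per-epoch probabilities recorded during training, equations (1) and (2) reduce to a mean and a standard deviation across epochs. A minimal sketch, using a toy probability matrix as a stand-in for recorded training dynamics:

```python
import numpy as np

# p_correct[e, i] is the probability that the model with epoch-e parameters
# assigns to cell i's input label (toy values standing in for a real run)
p_correct = np.array([
    [0.20, 0.90, 0.30],  # epoch 1
    [0.30, 0.95, 0.70],  # epoch 2
    [0.25, 0.99, 0.40],  # epoch 3
])

confidence = p_correct.mean(axis=0)   # equation (1): mean over epochs
variability = p_correct.std(axis=0)   # equation (2): population SD over epochs

# Cell 2 (index 1) is learned early and confidently (easy to learn);
# cell 1 stays low (hard to learn); cell 3 fluctuates (ambiguous).
print(confidence)
print(variability)
```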

Initial training phase

For training, we constructed a simple DNN consisting of three fully connected layers, each utilizing the ReLU activation function48. For the output layer, we used a log softmax activation function. To calculate the confidence and variability scores, we exponentiate the resulting log probabilities to transform them into a standard probability distribution. As a loss function, we used cross-entropy loss, which is minimized using a stochastic gradient descent optimizer with an initial learning rate of 0.001 and momentum of 0.9. We used PyTorch49 to perform the training procedure described above. To mitigate the effects of class imbalance (for example, rare cell types), we used a weighted sampler, ensuring that rare annotations are given higher importance during the training process. The number of epochs is chosen to be sufficiently large that the empirical risk is minimized, reaching a value close to zero by the end of the training procedure. In cases where the training process fails to converge, often characterized by high loss and only a small fraction of easy-to-learn samples, it is advisable to deepen the neural network. This option is exemplified in the Annotatability code package. The full procedure can be found in the publicly available code via GitHub at https://github.com/nitzanlab/Annotatability.
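A minimal PyTorch sketch of this setup is shown below; the layer widths and input dimensions are illustrative assumptions, not the package’s actual defaults, and training-loop bookkeeping (weighted sampling, per-epoch recording) is omitted.

```python
import torch
import torch.nn as nn

class AnnotationNet(nn.Module):
    """Three fully connected layers with ReLU, log-softmax output."""
    def __init__(self, n_genes, n_labels, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_genes, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_labels), nn.LogSoftmax(dim=1),
        )

    def forward(self, x):
        return self.net(x)

model = AnnotationNet(n_genes=2000, n_labels=5)   # toy dimensions
# NLLLoss on log-probabilities is equivalent to cross-entropy on logits
criterion = nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

x = torch.randn(8, 2000)       # toy batch of 8 cells
log_probs = model(x)
probs = log_probs.exp()        # exponentiate back to a probability distribution
```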

Annotation-trainability score

We define the annotation-trainability positive association score (gj per gene j), which utilizes training dynamics to identify genes with higher expression levels in cell states associated with a given annotation. Using the annotation-trainability association score, genes can be ranked on the basis of the correlation between their expression in each cell and the corresponding confidence scores:

$$\mathbf{g}={A}^{T}\cdot \mathbf{f}(\mathbf{\hat{u}}),$$
(3)

where \(\mathbf{g}\) is the vector of gene association scores, A is the gene expression matrix (rows correspond to cells and columns correspond to genes), f: R → R is a monotonically increasing function for rescaling confidence scores (by default, Annotatability uses the identity function \(f(\hat{{u}_{i}})=\hat{{u}_{i}}\)), and \(\mathbf{f}\) is a vector-valued function applying f(x) to every entry in the vector of cell confidence scores \(\mathbf{\hat{u}}\). As a preprocessing step, the expression level of each gene is scaled to unit variance.

Similarly, we define an annotation-trainability negative association (\({\widetilde{g}}_{j}\)), which identifies genes with lower expression levels in cell states associated with a given annotation based on training dynamics:

$$\mathbf{\widetilde{g}}={A}^{T}\cdot \mathbf{f}(1-\mathbf{\hat{u}}).$$
(4)

For the special case where \(\mathbf{f}(x)\) is the identity function, \(\mathbf{\widetilde{g}}\) equals \(C-\mathbf{g}\) for some constant C and, thus, \(\mathbf{\widetilde{g}}\) ranks genes in the reverse order relative to g. However, for the general case where f(x) may be a nonlinear monotonically increasing function, genes ranked by \(\mathbf{\widetilde{g}}\) may be sorted differently from the reverse of their rankings by g.
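Equations (3) and (4) amount to a matrix–vector product after the per-gene scaling. A minimal sketch on random toy data (the counts and confidence scores are made up); with the identity f, the two scores sum to the per-gene totals, so their rankings are exact reverses:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.poisson(2.0, size=(100, 4)).astype(float)   # toy cells x genes matrix
u_hat = rng.uniform(0.0, 1.0, size=100)             # toy per-cell confidence scores

A = A / A.std(axis=0)            # scale each gene to unit variance

f = lambda x: x                  # default rescaling function: identity
g_pos = A.T @ f(u_hat)           # equation (3): positive association score
g_neg = A.T @ f(1.0 - u_hat)     # equation (4): negative association score

# For identity f, g_neg equals the per-gene column sums minus g_pos,
# so ranking by g_neg reverses the ranking by g_pos.
```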

Classifying cells on the basis of confidence and variability scores and setting thresholds

To classify cells as easy to learn, hard to learn or ambiguous, Annotatability applies a threshold on the computed confidence and variability scores. This threshold is dataset dependent, similarly to the case of area under the margin ranking statistic in the context of computer vision17.

In many cases, the training process generates well-separated groups of cells in the confidence–variability plane, corresponding to low confidence/low variability, mid-confidence/high variability and high confidence/low variability; in such cases, a threshold can be set manually to assign these groups to hard to learn, ambiguous and easy to learn, respectively. We next discuss how to set each of the thresholds when the cells cannot be easily clustered as described above.

Threshold for cells that are hard to learn

In the general case, we infer the threshold for hard-to-learn cells as follows. (1) Randomly sample c cells (5–10% of the dataset) and change their annotations (for each, sample a different annotation out of the set of input annotations uniformly at random). (2) Train a DNN classifier (as described in ‘Initial training phase’ section) and record the training dynamics. Set the threshold over the confidence and variability scores for hard-to-learn cells as the qth percentile of the confidence and variability scores over the c reannotated cells. q is chosen in a dataset-dependent manner, between 25% (for datasets with a relatively low fraction of erroneously annotated cells, such as the PBMC scRNA-seq dataset) and 90% (for datasets with a relatively high fraction of erroneously annotated cells, such as spatial transcriptomics).
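This corruption-and-percentile procedure can be sketched as follows; `train_and_score` is a hypothetical stand-in for the actual DNN training run, returning random toy confidence scores rather than real training dynamics.

```python
import numpy as np

rng = np.random.default_rng(1)
labels = rng.integers(0, 3, size=1000)   # toy input annotations (3 classes)

frac, q = 0.05, 75                       # corrupt 5% of cells; 75th percentile
corrupt = rng.choice(len(labels), size=int(frac * len(labels)), replace=False)
shuffled = labels.copy()
for i in corrupt:
    # step (1): sample a *different* annotation uniformly at random
    shuffled[i] = rng.choice([c for c in range(3) if c != labels[i]])

def train_and_score(annotations):
    """Hypothetical stand-in for step (2): in practice, train the DNN on
    the corrupted annotations and return per-cell confidence scores."""
    return rng.uniform(0.0, 1.0, size=len(annotations))

confidence = train_and_score(shuffled)
# Threshold = q-th percentile of the corrupted cells' confidence scores
threshold = np.percentile(confidence[corrupt], q)
hard_to_learn = confidence < threshold
```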

Threshold between easy-to-learn and ambiguously annotated cells

When the underlying process captured by the annotations is continuous, we can bypass the need for a threshold altogether by ranking the cells to capture the transition between the easy-to-learn and ambiguous regions in the confidence–variability plane (such as in the case of the EMT process; ‘Uncovering the epithelial-to-mesenchymal pseudotime trajectory’ section in Results). A threshold can be chosen manually if those groups are separated in the confidence–variability plane (as in Fig. 3d). Alternatively, if marker genes for these groups are available, their expression and transition over the confidence–variability plane can be used to inform thresholding. In this manuscript, the threshold for easy-to-learn cells was set consistently for all datasets to be at confidence score 0.95 and variability score 0.15. The threshold can also be tuned, keeping in mind that, with more epochs, the confidence will increase and the variability will decrease. An alternative way to set the threshold is to use the signal-aware graph embedding (‘Trainability-aware graph embedding’ section) and perform graph-based clustering on the inferred graph (as in ‘Characterizing cells and annotations with Annotatability’ and ‘Inference of disease-related cell states and treatment responses’ sections in Results), to generate structure that can potentially distinguish cells by integrating both gene expression and trainability-based measures.
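Applied as a simple rule, the categorization looks as follows. The easy-to-learn cutoffs (0.95 on confidence, 0.15 on variability) are the manuscript’s defaults; the hard-to-learn cutoff shown here is an illustrative assumption (in practice it is set by the percentile procedure of the previous subsection).

```python
import numpy as np

def categorize(confidence, variability,
               conf_easy=0.95, var_easy=0.15, conf_hard=0.3):
    """Three-way categorization in the confidence-variability plane.
    conf_hard is an illustrative placeholder, not a package default."""
    cats = np.full(len(confidence), "ambiguous", dtype=object)
    cats[(confidence >= conf_easy) & (variability <= var_easy)] = "easy to learn"
    cats[(confidence <= conf_hard) & (variability <= var_easy)] = "hard to learn"
    return cats

conf_scores = np.array([0.98, 0.10, 0.55])   # toy per-cell scores
var_scores = np.array([0.05, 0.08, 0.30])
print(categorize(conf_scores, var_scores))   # ['easy to learn' 'hard to learn' 'ambiguous']
```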

Reannotation of erroneously annotated cells

To correct the annotations of cells that were identified by Annotatability as erroneously annotated, a DNN classifier (as described in ‘Initial training phase’ section) is trained exclusively on the subset of cells identified as correctly annotated. The erroneously annotated cells are then reannotated according to the predictions of the newly trained DNN.
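A minimal sketch of this two-stage scheme on synthetic data, with a nearest-centroid rule standing in for the DNN classifier (the actual module retrains the network described in ‘Initial training phase’):

```python
import numpy as np

rng = np.random.default_rng(2)
# Two well-separated toy populations of 50 cells x 10 genes each
X = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(4, 1, (50, 10))])
y = np.array([0] * 50 + [1] * 50)
y_noisy = y.copy()
y_noisy[0] = 1                                # one erroneous annotation
flagged = np.zeros(100, bool)
flagged[0] = True                             # flagged as erroneously annotated

# Fit only on cells deemed correctly annotated (stand-in for DNN training)
centroids = np.vstack([X[~flagged & (y_noisy == c)].mean(axis=0) for c in (0, 1)])
# Reannotate flagged cells with the fitted classifier's predictions
dists = ((X[flagged, None, :] - centroids[None]) ** 2).sum(-1)
y_noisy[flagged] = dists.argmin(axis=1)
```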

Trainability-aware graph embedding

To construct a trainability-aware gene expression graph, we first compute pairwise distances between cells by integrating gene expression information and training dynamics statistics. Specifically, the trainability-aware gene expression distance matrix \(\widetilde{W}\) is computed as

$${\widetilde{W}}_{ij}=\alpha \times {W}_{ij}+(1-\alpha )\times \frac{(| {\hat{u}}_{i}-{\hat{u}}_{j}| )}{N},$$
(5)

where W is the Euclidean distance matrix, Wij is the Euclidean distance between cells i and j in gene expression space following dimensionality reduction using PCA (default number of principal components: 50), \({\hat{u}}_{i}\) is the confidence score for cell i, N is the mean value of \(| {\hat{u}}_{i}-{\hat{u}}_{j}|\) across all pairs 1 ≤ i, j ≤ n (where n is the total number of cells) and α is a tunable parameter 0 ≤ α ≤ 1, which interpolates between a gene expression-based distance matrix (α = 1) and a trainability-based distance matrix (α = 0).

Next, the trainability-aware expression distance matrix \(\widetilde{W}\) is transformed into an affinity matrix M using a Gaussian kernel:

$${M}_{ij}=\exp (-{\widetilde{W}}_{ij}/\bar{\widetilde{W}}),$$
(6)

where \(\bar{\widetilde{W}}\) is the mean over all values in \(\widetilde{W}\). Finally, the trainability-aware expression graph is constructed by computing a KNN graph (k = 15) over the affinity matrix M.
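The blending and kernel steps can be sketched on toy data as follows; the kernel is written with a negative sign so that affinity decays with distance, and the final KNN-graph construction (k = 15) is omitted. The dimensions and α value are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
X_pca = rng.normal(size=(30, 5))     # toy cells after PCA (30 cells, 5 PCs)
u_hat = rng.uniform(0, 1, size=30)   # toy per-cell confidence scores
alpha = 0.5                          # interpolation parameter, 0 <= alpha <= 1

# Equation (5): blend Euclidean expression distances with confidence differences
W = np.linalg.norm(X_pca[:, None] - X_pca[None], axis=-1)   # pairwise Euclidean
D = np.abs(u_hat[:, None] - u_hat[None])                    # |u_i - u_j|
N = D.mean()                                                # mean over all pairs
W_tilde = alpha * W + (1 - alpha) * D / N

# Equation (6): Gaussian-type kernel; affinity decays with distance
M = np.exp(-W_tilde / W_tilde.mean())
```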

Data preprocessing

We used standard scRNA-seq preprocessing, which includes per-cell normalization (to 10,000 counts) and a log transformation applied to stabilize variance (\(\log ({\rm{normalized}}\,{\rm{expression}}\,{\rm{value}}+1)\)). For the EMT dataset, we used the raw data and filtered out low-quality cells21: cells with a high fraction of mitochondrial counts (≥5% of total counts), cells with a high total count number (more than 4,000 expressed genes) and cells expressing fewer than 200 genes. In addition, we retained only genes that were expressed in more than 3 cells and were highly variable (3,000 genes; sc.pp.filter_genes_dispersion21). The remaining datasets were already preprocessed, so we did not perform feature selection on them. For the PBMC dataset, we excluded cells annotated as megakaryocytes owing to their low number (88/11,990).
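The normalization and log steps (mirroring scanpy’s sc.pp.normalize_total and sc.pp.log1p) can be mimicked directly in NumPy; a minimal sketch on a toy count matrix:

```python
import numpy as np

rng = np.random.default_rng(4)
counts = rng.poisson(1.0, size=(5, 100)).astype(float)   # toy cells x genes counts

# Per-cell normalization to 10,000 counts
totals = counts.sum(axis=1, keepdims=True)
normalized = counts / totals * 1e4

# Variance-stabilizing log transformation: log(normalized value + 1)
logged = np.log1p(normalized)
```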

Distribution across the confidence–variability plane

The distribution across the confidence–variability plane produced by Annotatability for a given dataset of annotated cells can reflect the difficulty of learning that annotated dataset and the relative fractions of cells in ambiguous and hard-to-learn states within it. Therefore, we suggest a top-down measure for comparing the hardness of learning different classes of annotations for the same dataset, specifically, comparing those classes, following identical training procedures, by calculating the variance of both the confidence and variability scores, as inferred by Annotatability, across all cells.
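This measure reduces to a variance computation per score. A minimal sketch with made-up score vectors, contrasting a cell-type-like annotation class (tight, mostly easy to learn) with a condition-like class (broad spread):

```python
import numpy as np

def score_spread(confidence, variability):
    """Variance of the confidence and variability scores across all cells,
    as a top-down measure of how hard an annotation class is to learn."""
    return float(np.var(confidence)), float(np.var(variability))

# Toy scores: tight cluster of easy-to-learn cells vs broad spread
tight = score_spread(np.array([0.97, 0.98, 0.96, 0.99]),
                     np.array([0.02, 0.03, 0.02, 0.01]))
broad = score_spread(np.array([0.20, 0.90, 0.50, 0.75]),
                     np.array([0.10, 0.30, 0.25, 0.05]))
# The broad class has larger variance in both scores, indicating a
# harder-to-learn set of annotations.
```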

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.