
nature medicine

Article    https://doi.org/10.1038/s41591-022-02180-9

Early detection of visual impairment in young children using a smartphone-based deep learning system

A full list of authors and their affiliations appears at the end of the paper.
e-mail: dingxiaowei@sjtu.edu.cn; linht5@mail.sysu.edu.cn

Received: 16 June 2022
Accepted: 9 December 2022
Published online: 26 January 2023

Early detection of visual impairment is crucial but is frequently missed in
young children, who are capable of only limited cooperation with standard
vision tests. Although certain features of visually impaired children, such as
facial appearance and ocular movements, can assist ophthalmic practice,
applying these features to real-world screening remains challenging. Here,
we present a mobile health (mHealth) system, the smartphone-based Apollo
Infant Sight (AIS), which identifies visually impaired children with any of 16
ophthalmic disorders by recording and analyzing their gazing behaviors
and facial features under visual stimuli. Videos from 3,652 children (≤48
months in age; 54.5% boys) were prospectively collected to develop and
validate this system. For detecting visual impairment, AIS achieved an
area under the receiver operating curve (AUC) of 0.940 in an internal
validation set and an AUC of 0.843 in an external validation set collected
in multiple ophthalmology clinics across China. In a further test of AIS for
at-home implementation by untrained parents or caregivers using their
smartphones, the system was able to adapt to different testing conditions
and achieved an AUC of 0.859. This mHealth system has the potential to be
used by healthcare professionals, parents and caregivers for identifying
young children with visual impairment across a wide range of ophthalmic
disorders.

Visual impairment is one of the most important causes of long-term disability in children worldwide and has a detrimental impact on education and socioeconomic achievements1,2. Infancy and toddlerhood (early childhood) are critical periods for visual development3, during which early detection and prompt treatment of ocular pathology can prevent irreversible visual loss4,5. Young children are unable to complain of visual difficulties, and since they are unwilling or find it difficult to cooperate with standard vision tests (for example, optotype tests), age-appropriate tests such as grating acuity cards are commonly used to observe their reactions to visual stimuli6,7. However, evaluating the vision of young children using these tests requires highly trained operators, which greatly hinders their wider adoption, especially in low-income and middle-income countries with the highest prevalence of visual impairment but poor medical resources8. In addition, these tests, even when performed by experienced pediatric ophthalmologists, have been shown to have low repeatability in large-scale population screening studies9–11. Therefore, it is imperative to develop an easy-to-use and effective detection tool to enable the timely diagnosis of visual impairment in young children and prompt intervention.


Ocular abnormalities causing visual impairment in children often manifest with typical phenotypic features, such as leukocoria (white eye) in cataract12 and retinoblastoma13, eyelid drooping in congenital ptosis14, and a cloudy and enlarged cornea in congenital glaucoma15. In addition, previous studies have found that dynamic aberrant behavioral features such as abnormal ocular movement, fixation patterns or visual preference can also point toward an underlying ocular pathology in children16,17. These phenotypic manifestations are frequently seen in ocular diseases, such as amblyopia and strabismus, and they can provide valuable clues for diagnosing visual impairment in young children18–20. However, systematically recording and applying these features to real ophthalmic practice are still in their infancy due to the lack of practical and effective tools.

Given the rapid development of mobile health (mHealth) and artificial intelligence (AI) algorithms in identifying or monitoring disease states21,22, the use of mobile devices, such as smartphones, to record and analyze phenotypic features to help identify visual impairment in young children presents great opportunities. However, developing such a system for large-scale ophthalmic application is hindered by three main challenges: (1) collecting phenotypic data that reliably reflect the visual status of the children in complex environments, (2) generalizing the system for large-scale applications and (3) providing evidence of its feasibility. The major bottleneck that impedes the widespread adoption of many medical AI systems is the limited feasibility and reliability when applied to settings with various data distributions in the real world23,24. A lack of cooperation is very common in pediatric ophthalmic practice, with constant head movement during examinations introducing test noise that poses several challenges to the stability of the system25. For the nascent technology of mHealth, rigorous evidence of clinical application is necessary but generally lacking21. These major difficulties explain the current lack of an effective and practical tool for detecting visual impairment in young children.

In this prospective, multicenter, observational study, we developed and validated a smartphone-based system, the Apollo Infant Sight (AIS), to identify visual impairment in young children in real-world settings. AIS was designed to induce a steady gaze in children by using cartoon-like video stimuli and to collect videos that capture phenotypic features (facial appearance and ocular movements) for further analysis using deep learning (DL) models with a robust quality control design against test noise. We collected more than 25,000,000 frames of videos from 3,652 children using AIS for DL model training and testing. We evaluated the system for detecting visual impairment caused by any of 16 ophthalmic disorders in five clinics at different institutions. Furthermore, we validated this system under different conditions with various test noise levels or ambient interference presented in real-world settings. We also evaluated AIS used by untrained parents or caregivers at home to test its wider applicability. This preliminary study indicates that AIS shows potential for early detection of visual impairment in young children in both clinical and community settings.

Results

Overview of the study
We conducted this prospective, multicenter and observational study (identifier: NCT04237350) in three stages from 14 January 2020 to 30 January 2022 and collected a total of 3,865 videos with 25,972,800 frames of images from 3,652 Chinese children (aged ≤48 months) to develop and validate the AIS system in clinical and at-home settings (Fig. 1). The AIS system was developed and comprehensively tested (internal validation and reliability analyses under different testing conditions) at the clinic of Zhongshan Ophthalmic Center (ZOC) in the first stage, and was further tested in four other centers (external validation) and community settings (at-home implementation) in the second and third stages, respectively.

Development of the mHealth AIS system
We developed AIS for detecting visual impairment in young children tailored to the present study (Fig. 1a and Supplementary Video 1). A child-friendly app was designed to attract children to maintain their gaze using cartoon-like stimuli (Extended Data Fig. 1). The inbuilt front camera of the smartphone recorded 3.5-min videos that captured phenotypic features of the facial appearance and ocular movements during gazing. In this process, the mHealth app interactively guided users (healthcare professionals, volunteers, parents and caregivers) to familiarize themselves with the system and complete standardized preparations, including choosing and maintaining a suitable testing setting (Extended Data Fig. 2). After data collection was completed, DL models were applied to analyze the collected features and identify visually impaired children. To ensure the system's performance in chaotic settings (environments with various interference factors or biases that can impact the system's performance), a series of algorithm-based quality checking operations, including face detection (max-margin object detection (MMOD) convolutional neural network (CNN)); facial key point localization (ensemble of regression trees); and crying, occlusion and interference factor detections (the EfficientNet-B2 backbone shown in Extended Data Fig. 3a,b), was first automatically performed by a quality control module to extract consecutive frames of high quality from the original video as short clips. Facial areas were cropped out to further eliminate environmental interference before the qualified clips were sent to a DL-based detection model for identifying visually impaired children and a diagnostic model for discriminating multiple ocular disorders (the EfficientNet-B4 backbone shown in Extended Data Fig. 3c). The final results were returned to the mHealth app to alert users to promptly refer children at high risk of visual impairment to experienced pediatric ophthalmologists for timely diagnosis and intervention.

We first developed the data quality control module. Two facial detection and key point localization models were pretrained on publicly available datasets and adopted from an open-source library26. Additionally, we developed three CNNs for crying, interference and occlusion detection using images sampled from raw videos collected at the ZOC clinic (Extended Data Fig. 3d and Supplementary Table 1). Then, we trained and validated the detection/diagnostic models on the development dataset collected by trained volunteers using iPhone-7/8 smartphones at the clinic of ZOC (Extended Data Fig. 3e). A total of 2,632 raw videos from 2,632 children were collected, and after automatic quality control, videos of 2,344 children (89.1%) were reserved as the development dataset (Fig. 1b), including 871 (37.2%) children in the 'nonimpairment' group, 861 (36.7%) in the 'mild impairment' group and 612 (26.1%) in the 'severe impairment' group. Detailed information on the qualified dataset is provided in Table 1. Before model training, the development dataset was randomly split into training, tuning and validation sets stratified on sex, age and the ophthalmic condition (Supplementary Table 2). The videos utilized for quality control module development were excluded from the detection/diagnostic model validation.

Performance of the detection model in real clinical settings with trained volunteers
The detection model was trained to discriminate visually impaired children from nonimpaired children based on the high-quality clips extracted from the phenotypic videos. At the clip level, the detection model achieved an area under the receiver operating curve (AUC) of 0.925 (95% confidence interval (95% CI), 0.914–0.936) in the internal validation (Extended Data Fig. 4a). Furthermore, we evaluated the performance of the detection model via an independent external validation performed by trained volunteers using iPhone-7/iPhone-8 smartphones at the routine clinics of four other centers. In this stage, quality checking was embedded in the data acquisition process, and the quality control module automatically reminded volunteers to recollect data when the videos were of low quality (Fig. 1b). Qualified videos for 298 children undergoing ophthalmic examinations were utilized for final validation, including 188 (63.1%) nonimpaired children, 67 (22.5%) mildly impaired children and 43 (14.4%) severely impaired children (Table 1). At the clip level, the detection model achieved an AUC of 0.814 (95% CI, 0.790–0.838) in the external validation (Extended Data Fig. 4b).
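The frame-level quality control described above can be pictured with a short sketch. The code below is a minimal illustration and not the authors' implementation: dlib's MMOD face detector and 68-point shape predictor stand in for the face detection and key point localization steps, `is_frame_disturbed` is a hypothetical placeholder for the crying/occlusion/interference CNNs, and the model file paths, clip length and crop size are assumptions.

```python
# Minimal sketch of frame-level quality control and clip extraction (illustrative only).
import cv2
import dlib

FACE_DETECTOR = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
LANDMARKS = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
CLIP_LEN = 32  # consecutive qualified frames per clip (assumed value)

def extract_qualified_clips(video_path, is_frame_disturbed):
    """Scan a recorded video and return clips of consecutive, cropped facial frames.

    `is_frame_disturbed` stands in for the crying/occlusion/interference classifiers."""
    cap = cv2.VideoCapture(video_path)
    clips, current = [], []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        faces = FACE_DETECTOR(rgb, 0)
        if len(faces) != 1 or is_frame_disturbed(rgb):
            # Low-quality frame: close the current run of qualified frames.
            if len(current) >= CLIP_LEN:
                clips.append(current[:CLIP_LEN])
            current = []
            continue
        rect = faces[0].rect
        shape = LANDMARKS(rgb, rect)  # 68 facial key points (used for alignment checks)
        x0, y0 = max(rect.left(), 0), max(rect.top(), 0)
        face_crop = rgb[y0:rect.bottom(), x0:rect.right()]
        current.append(cv2.resize(face_crop, (224, 224)))
    cap.release()
    if len(current) >= CLIP_LEN:
        clips.append(current[:CLIP_LEN])
    return clips
```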


[Figure 1 graphics omitted: panel a, workflow of the AIS system; panel b, participant flow diagram.]

Fig. 1 | Overall study design and participant flow diagram. a, Workflow of the system. The smartphone-based AIS system consists of two key components: an app for user education, testing preparation and data collection and a DL-based back end for data analysis. Parents or other users utilize the app to induce children to gaze at the smartphone, allowing the app to record their phenotypic states as video data. Then, the phenotypic videos are sent to a quality control module to discard low-quality frames. After automatic quality checking, multiple sets of consecutive qualified frames are extracted from the original video as clips, and the child's facial regions are cropped from the clips to serve as candidate inputs to the detection/diagnostic models. A small rectangle indicates input or output data, a large rectangle indicates mathematical operation, and a trapezoid indicates DL or machine learning algorithm. b, Participant flow diagram. Children were recruited at multiple clinics to develop and comprehensively test the AIS system in stage 1 and stage 2. Children were recruited online to perform an at-home validation by untrained parents or caregivers in stage 3. I, input video; O, clip-level model outputs; P, key point coordinates; Q, qualified clips; Sface, facial regions of the clips; FC, fully connected.
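To make the encoder/temporal pooling/decoder arrangement labeled in Fig. 1a concrete, the following is a minimal PyTorch sketch under stated assumptions (torchvision's EfficientNet-B4 as the frame encoder, mean pooling over frames, a single fully connected output). It is illustrative only and is not the released AIS model.

```python
# Sketch of a clip-level detector: EfficientNet-B4 frame encoder -> temporal pooling -> FC decoder.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b4

class ClipDetector(nn.Module):
    def __init__(self, num_outputs: int = 1):
        super().__init__()
        backbone = efficientnet_b4(weights=None)
        self.encoder = backbone.features                 # per-frame feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)               # spatial pooling per frame
        feat_dim = backbone.classifier[1].in_features     # 1,792 for EfficientNet-B4
        self.decoder = nn.Linear(feat_dim, num_outputs)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, frames, 3, H, W) of cropped facial regions
        b, t, c, h, w = clip.shape
        x = self.encoder(clip.reshape(b * t, c, h, w))
        x = self.pool(x).flatten(1).reshape(b, t, -1)
        x = x.mean(dim=1)                                  # temporal average pooling
        return torch.sigmoid(self.decoder(x))              # clip-level probability of impairment

# e.g. ClipDetector()(torch.rand(2, 8, 3, 380, 380)) -> tensor of shape (2, 1)
```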


Table 1 | Summary of the qualified datasets used in this study

Values are given in the column order Dataset A (n = 2,344 children, 2,344 videos), Dataset B (n = 187 children, 374 videos), Dataset C (n = 361 children, 361 videos), Dataset D (n = 298 children, 298 videos), Dataset E (n = 32 children, 32 videos) and Dataset F (n = 88 children, 88 videos).

Sources: ZOC clinic | ZOC clinic | ZOC clinic | Clinics of multiple hospitals | At-home environment | At-home environment
Usage of dataset: Model development and reliability analyses | Retest analysis | Across-smartphone analysis | External validation | Model fine-tuning | At-home validation
Images, n: 15,751,680 | 2,513,280 | 2,425,920 | 2,002,560 | 215,040 | 591,360
Visual conditions, n (%)
  Nonimpairment: 871 (37.2%) | 102 (54.5%) | 87 (24.1%) | 188 (63.1%) | 10 (31.3%) | 31 (35.2%)
  Mild impairment: 861 (36.7%) | 52 (27.8%) | 169 (46.8%) | 67 (22.5%) | 14 (43.8%) | 31 (35.2%)
  Severe impairment: 612 (26.1%) | 33 (17.6%) | 105 (29.1%) | 43 (14.4%) | 8 (25.0%) | 26 (29.5%)
Age in months (mean ± s.d.): 25.2 ± 11.7 | 28.7 ± 10.9 | 25.7 ± 10.8 | 28.0 ± 13.0 | 29.5 ± 10.1 | 30.0 ± 10.9
Sex, n (%)
  Boys: 1,265 (54.0%) | 107 (57.2%) | 202 (56.0%) | 169 (56.7%) | 16 (50.0%) | 50 (56.8%)
  Girls: 1,079 (46.0%) | 80 (42.8%) | 159 (44.0%) | 129 (43.3%) | 16 (50.0%) | 38 (43.2%)
Room illuminance, lx (mean ± s.d.): 289.5 ± 130.1 | 280.5 ± 122.5* | 334.0 ± 117.5 | N/A | N/A | N/A
Testing distance, n (%)
  Short: 195 (8.3%) | 45 (12.0%)* | 34 (9.4%) | 61 (20.5%) | 6 (18.8%) | 19 (21.6%)
  Medium: 1,738 (74.2%) | 291 (77.8%)* | 279 (77.3%) | 125 (42.0%) | 9 (28.1%) | 51 (58.0%)
  Long: 411 (17.5%) | 38 (10.2%)* | 48 (13.3%) | 112 (37.6%) | 17 (53.1%) | 18 (20.5%)
Laterality of the eye disorder, n (%)
  Bilateral: 995 (67.6%) | 49 (57.7%) | 209 (76.3%) | 77 (70.0%) | 8 (36.4%) | 29 (50.9%)
  Unilateral: 478 (32.5%) | 36 (42.4%) | 65 (23.7%) | 33 (30.0%) | 14 (63.6%) | 28 (49.1%)
Smartphones used: iPhone-7/iPhone-8 | iPhone-7/iPhone-8 | Huawei Honor-6 Plus/Redmi Note-7 | iPhone-7/iPhone-8 | Parents' own smartphones (no restriction) | Parents' own smartphones (no restriction)

*Metrics calculated in the unit of video. Except for the asterisk-marked metrics in dataset B, metrics were calculated in the unit of child. ZOC, Zhongshan Ophthalmic Center; N/A, not applicable.

The performance of the detection model in identifying visually impaired children was evaluated by averaging the clip-level predictions. Figure 2a shows distinct clip-level predicted probability patterns for children with various visual conditions. At the child level, the detection model achieved an AUC of 0.940 (95% CI, 0.920–0.959), an accuracy of 86.5% (95% CI, 83.4%–89.0%), a sensitivity of 84.1% (95% CI, 80.2%–87.4%) and a specificity of 91.9% (95% CI, 86.9%–95.1%) in the internal validation (Fig. 2b and Supplementary Table 3). It achieved a child-level AUC of 0.843 (95% CI, 0.794–0.893), an accuracy of 82.6% (95% CI, 77.8%–86.4%), a sensitivity of 80.9% (95% CI, 72.6%–87.2%) and a specificity of 83.5% (95% CI, 77.6%–88.1%) in the external validation (Fig. 2c and Supplementary Table 3).

Furthermore, we investigated whether our system could identify visual impairment with any of 16 common ophthalmic disorders at the child level (Table 2 and Supplementary Table 4). For different ophthalmic disorders, the predicted probabilities of the detection model were all significantly higher than those for nonimpairment (Fig. 2d). AIS achieved AUCs of over 0.800 in 15 of 16 binary classification tasks to distinguish visual impairment with various causes from nonimpairment (Fig. 2e,f and Supplementary Table 5), except for limbal dermoid with an AUC of 0.747 (95% CI, 0.646–0.849). Even for diseases not present in the training set, our system showed effective discriminative capabilities, revealing wider extendibility and generalizability to other conditions (Fig. 2f). In addition, we initially recruited children with aphakia (including iatrogenic aphakia cases with common features of visual impairment, accounting for 10.2% of the visually impaired participants enrolled) to increase the diversity of training samples for the robustness of the system. Therefore, to evaluate the performance of AIS in the natural population without iatrogenic cases or cases with medical interventions, the children with aphakia were removed from the validation datasets for further analysis, and AIS remained reliable (Supplementary Table 6). These results indicate the advanced classification ability of AIS in detecting common causes of visual impairment in young children.

Additionally, the performance of AIS in discriminating mild or severe impairment from nonimpairment was assessed at the child level (Fig. 2g–j and Supplementary Table 3). Significantly lower predicted probabilities of AIS were obtained for the nonimpaired group than for the mild or severe impairment groups. For discriminating mild impairment from nonimpairment, an AUC of 0.936 (95% CI, 0.912–0.960) and an AUC of 0.833 (95% CI, 0.774–0.892) were obtained for the internal validation and the external validation, respectively. For discriminating severe impairment from nonimpairment, an AUC of 0.944 (95% CI, 0.919–0.969) and an AUC of 0.859 (95% CI, 0.779–0.939) were obtained for the internal validation and the external validation, respectively.

To further evaluate the performance of AIS when applied to a population with a rare-case prevalence of visual impairment, we conducted a 'finding a needle in a haystack' test based on the internal validation dataset, with simulated prevalences ranging from 0.1% to 9%. AIS successfully identified visually impaired children at different simulated prevalences, with AUCs stabilized around 0.940 (Supplementary Table 7).
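The child-level scoring and the 'finding a needle in a haystack' evaluation described above can be sketched as follows. The averaging rule matches the text, while the resampling scheme, cohort size and random seed are assumptions made for illustration.

```python
# Sketch of child-level aggregation and AUC evaluation at a simulated prevalence.
import numpy as np
from sklearn.metrics import roc_auc_score

def child_level_score(clip_probs):
    """Average the clip-level predicted probabilities to obtain one score per child."""
    return float(np.mean(clip_probs))

def auc_at_prevalence(scores_impaired, scores_normal, prevalence, rng, n_total=10_000):
    """Resample children so that visually impaired cases form `prevalence` of the cohort."""
    n_pos = max(1, int(round(n_total * prevalence)))
    pos = rng.choice(scores_impaired, size=n_pos, replace=True)
    neg = rng.choice(scores_normal, size=n_total - n_pos, replace=True)
    y = np.concatenate([np.ones(n_pos), np.zeros(n_total - n_pos)])
    return roc_auc_score(y, np.concatenate([pos, neg]))

rng = np.random.default_rng(0)
# e.g. auc_at_prevalence(vi_scores, ni_scores, prevalence=0.001, rng=rng)
```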


[Figure 2 graphics omitted: panels a–n (predicted probability patterns, ROC curves and the diagnostic-model confusion matrix).]

Fig. 2 | Performance of the AIS system in clinical and at-home settings. a, Typical predicted probability patterns of the detection model. b,c, Receiver operating characteristic (ROC) curves of the detection model for distinguishing visually impaired children from nonimpaired children in the internal validation set (b) and in the external validation set (c). Center lines show ROC curves and shaded areas show 95% CIs. d, The predicted probabilities of children with the indicated ophthalmic disorders and nonimpaired children in the internal validation set. Results are expressed as mean ± s.d. *P < 0.001 (ranging from 4.83 × 10−27 for congenital cataract (CC) to 2.40 × 10−5 for high ametropia (HA) compared with nonimpairment (NI), two-tailed Mann–Whitney U-tests). e,f, ROC curves of the detection model for distinguishing nonimpaired children from children with the indicated ophthalmic disorders that overlap (e) or did not overlap (f) with those in the training set (AUCs range from 0.747 for limbal dermoid (LD) to 0.989 for congenital ptosis (CP)). g,i,k, The predicted probabilities of the detection model for the nonimpaired, mildly impaired and severely impaired groups in the internal validation set (g), in the external validation set (i) and in the at-home implementation (k). Results are expressed as mean ± s.d. #P < 0.001, two-tailed Mann–Whitney U-tests. h,j, ROC curves of the detection model for distinguishing mildly or severely impaired children from nonimpaired children in the internal validation set (h) and in the external validation set (j). l, ROC curves of the detection model for distinguishing impaired, mildly impaired or severely impaired children from nonimpaired children in the at-home implementation. m, The confusion matrix of the diagnostic model. n, ROC curves of the diagnostic model for discriminating each category of ophthalmic disorder from the other categories (aphakia (AA), AUC = 0.947 (0.918–0.976); congenital glaucoma (CG), AUC = 0.968 (0.923–1.000); NI, AUC = 0.976 (0.959–0.993); CP, AUC = 0.996 (0.989–1.000); strabismus (SA), AUC = 0.918 (0.875–0.961)). 95% DeLong CIs are shown for AUC values. MO, microphthalmia; NA, nystagmus; OF, other fundus diseases; PA, Peters' anomaly; PFV, persistent fetal vasculature; PM, pupillary membrane; RB, retinoblastoma; SSOM, systemic syndromes with ocular manifestations; VI, visual impairment.

Performance of the detection model in at-home settings with untrained parents or caregivers
After validation in real clinical settings, we further implemented a more challenging application in at-home settings by parents or caregivers using their smartphones according to the system's instructions (Fig. 1b). Of the 125 children recruited online from the Guangdong area, 122 children (97.6%) successfully completed qualified video collection, among whom 120 children undergoing ophthalmic examinations were enrolled. Other detailed information on the qualified data is summarized in Table 1. Given the great difference in data distributions for the home environments compared with the clinics, we fine-tuned the detection model using qualified videos from 32 children and then tested it on the subsequently collected validation set from another 88 children. On the validation set, 31 (35.2%) children were classified as nonimpaired and 57 (64.8%) children were classified as visually impaired. AIS achieved effective performance in the at-home implementation, with an AUC of 0.817 (95% CI, 0.756–0.881) for discriminating clips of visually impaired children from those of nonimpaired children (Extended Data Fig. 4c). At the child level, significantly lower predicted probability patterns were obtained for the nonimpaired children compared with mildly or severely impaired children (Fig. 2k). An AUC of 0.859 (95% CI, 0.767–0.950), an accuracy of 77.3% (95% CI, 67.5%–84.8%), a sensitivity of 77.2% (95% CI, 64.8%–86.2%) and a specificity of 77.4% (95% CI, 60.2%–88.6%) were attained for discriminating visual impairment from nonimpairment (Fig. 2l and Supplementary Table 3).
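Fine-tuning the detection model on the 32-child at-home pilot set, as described above, might look like the following sketch. Freezing the frame encoder, the learning rate, the number of epochs and the `home_clips` loader are assumptions, not the study's protocol; `ClipDetector` refers to the sketch model shown after Fig. 1.

```python
# Hedged sketch of fine-tuning the clip-level detector on at-home pilot data.
import torch
import torch.nn as nn

def fine_tune(model, home_clips, epochs: int = 5, lr: float = 1e-4, device: str = "cpu"):
    """`home_clips` is a hypothetical DataLoader yielding (clip, label) pairs."""
    model.to(device).train()
    # Assumption: adapt only the decoder (final layer) to the new domain,
    # keeping the frame encoder learned from clinic data frozen.
    for p in model.encoder.parameters():
        p.requires_grad = False
    optimizer = torch.optim.Adam(model.decoder.parameters(), lr=lr)
    loss_fn = nn.BCELoss()
    for _ in range(epochs):
        for clips, labels in home_clips:
            clips, labels = clips.to(device), labels.float().to(device)
            optimizer.zero_grad()
            probs = model(clips).squeeze(1)   # clip-level probabilities
            loss = loss_fn(probs, labels)
            loss.backward()
            optimizer.step()
    return model
```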


Table 2 | Summary of the ophthalmic conditions of participants in this study

Values are n (%), given in the column order Dataset A (n = 2,344 children), Dataset B (n = 187 children), Dataset C (n = 361 children), Dataset D (n = 298 children), Dataset E (n = 32 children) and Dataset F (n = 88 children).

Nonimpairment: 871 (37.2%) | 102 (54.5%) | 87 (24.1%) | 188 (63.1%) | 10 (31.3%) | 31 (35.2%)
Aphakia: 153 (6.5%) | 7 (3.7%) | 44 (12.2%) | 4 (1.3%) | 6 (18.8%) | 6 (10.5%)
Congenital cataract: 348 (14.8%) | 7 (3.7%) | 133 (36.8%) | 28 (9.4%) | 10 (31.3%) | 30 (52.6%)
Congenital glaucoma: 95 (4.1%) | 5 (2.7%) | 0 | 2 (0.7%) | 0 | 1 (1.8%)
High ametropia: 69 (2.9%) | 7 (3.7%) | 2 (0.6%) | 10 (3.4%) | 2 (6.3%) | 3 (5.3%)
Peters' anomaly: 39 (1.7%) | 4 (2.1%) | 0 | 0 | 0 | 1 (1.8%)
Nystagmus: 174 (7.4%) | 7 (3.7%) | 39 (10.8%) | 21 (7.0%) | 1 (3.1%) | 4 (7.0%)
PFV: 36 (1.5%) | 2 (1.1%) | 6 (1.7%) | 0 | 1 (3.1%) | 3 (5.3%)
Other fundus diseases: 54 (2.3%) | 4 (2.1%) | 2 (0.6%) | 7 (2.3%) | 0 | 0
Congenital ptosis: 101 (4.3%) | 10 (5.3%) | 0 | 2 (0.7%) | 0 | 0
Retinoblastoma: 41 (1.7%) | 6 (3.2%) | 2 (0.6%) | 3 (1.0%) | 0 | 2 (3.5%)
Strabismus: 245 (10.5%) | 15 (8.0%) | 35 (9.7%) | 28 (9.4%) | 1 (3.1%) | 6 (10.5%)
Limbal dermoid: 34 (1.5%) | 3 (1.6%) | 0 | 0 | 0 | 0
Microphthalmia: 19 (0.8%) | 2 (1.1%) | 1 (0.3%) | 3 (1.0%) | 0 | 0
Pupillary membranes: 19 (0.8%) | 2 (1.1%) | 7 (1.9%) | 1 (0.3%) | 1 (3.1%) | 1 (1.8%)
SSOM: 11 (0.5%) | 0 | 0 | 1 (0.3%) | 0 | 0
Other: 35 (1.5%) | 4 (2.1%) | 3 (0.8%) | 0 | 0 | 0

PFV, persistent fetal vasculature; SSOM, systemic syndromes with ocular manifestations.

Model visualization and explanation
We improved the interpretability of the detection model outputs by visualizing the model results in the internal validation set. After being projected into a two-dimensional space, the feature information extracted by the detection model exhibited distinct patterns between the visually impaired and nonimpaired clips (Fig. 3a). The attention patterns of the detection model presented by the average heat maps varied with the children's visual functions and underlying ophthalmic disorders (Fig. 3b,c). Among the visually impaired children, the detection model focused more on the eyes and areas around the neck (Fig. 3c). In particular, for the clips extracted from visually impaired samples, those classified by human experts as having abnormal patterns were more likely to be predicted by our system as 'visual impairment' than those that were randomly extracted (Fig. 3d,e and Supplementary Table 8), indicating that the detection model might pay more attention to the morphological appearance or behavioral patterns of the eye and head regions, as we previously reported16.

Additionally, the clips misidentified by the system exhibited different clustering characteristics from the correctly recognized clips (true visually impaired or true nonimpaired clips), and more of the misidentified clips fell in the intermediate zone of the two clusters for the correctly recognized clips (Extended Data Fig. 5). Moreover, for the 20% of samples with the lowest predicted confidence values, the false identification rate was significantly higher than that of the other groups, and the system was equivocal. We aimed to find a solution for cases in which the system was unreliable by filtering out equivocal samples for manual review by ophthalmologists. The results show that system performance improved substantially as the proportion of cases referred for manual review increased. For instance, when selecting cases with confidence values less than 0.071 for manual review, accounting for 3% of the total cases, the sensitivity improved from 84.1% to 85.1% and the specificity improved from 91.9% to 93.1%; when selecting cases with confidence values less than 0.193 for manual review, accounting for 7% of the total cases, the sensitivity and specificity improved to 85.4% and 94.2%, respectively (Extended Data Fig. 6).

Multiple-category classification of ophthalmic disorders
Considering that our system exhibited different attention patterns for visual impairment caused by specific ophthalmic disorders (Fig. 3c), we further developed a DL-based diagnostic model to differentiate ophthalmic disorders with characteristic attention patterns in the detection model (aphakia, congenital glaucoma, congenital ptosis and strabismus) and nonimpairment at the child level. In the diagnostic validation, our system effectively discriminated multiple ophthalmic disorders, achieving AUCs ranging from 0.918 for strabismus to 0.996 for congenital ptosis (Fig. 2m,n).
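The confidence-based referral strategy described in the visualization section above (deferring equivocal cases to ophthalmologist review) can be prototyped as below. The mapping from predicted probability to a 'confidence value' is an assumption; only the idea of deferring the least confident fraction of cases mirrors the text, which reports confidence cutoffs of 0.071 and 0.193 directly.

```python
# Sketch of deferring low-confidence predictions to manual review.
import numpy as np

def triage(probs, threshold=0.5, review_fraction=0.03):
    """Return automatic decisions plus a mask of cases deferred for manual review.

    Confidence is taken here as the distance of the child-level probability from
    the decision threshold (an assumption for illustration)."""
    probs = np.asarray(probs, dtype=float)
    confidence = np.abs(probs - threshold)
    cutoff = np.quantile(confidence, review_fraction)
    needs_review = confidence <= cutoff   # the most equivocal cases go to an ophthalmologist
    decisions = probs >= threshold        # automatic calls for the remaining cases
    return decisions, needs_review

# Example: defer the 3% least confident cases, as in the text.
# decisions, needs_review = triage(child_probs, review_fraction=0.03)
```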


[Figure 3 graphics omitted: panels a–e (t-SNE embedding, face cropping, average heat maps and clip-level predicted probabilities).]

Fig. 3 | Interpretability and visualization of the detection model. a, The t-SNE algorithm was applied to visualize the detection model at the clip level. b, Facial detection and facial landmark localization algorithms were applied to detect and crop the facial regions of the children before data served as inputs to the AIS system. c, Average heat maps obtained from the detection model based on the inputs of facial regions in (b) for nonimpaired children and for children with the indicated ophthalmic disorders. d, The predicted probabilities for various types of clips were compared: clips randomly extracted from the videos of nonimpaired children, clips randomly extracted from the videos of visually impaired children and clips labeled by experienced ophthalmologists as having abnormal behavioral patterns extracted from videos of visually impaired children. e, Predicted probabilities of the detection model for various types of clips in (d) were compared: motionless fixation, n = 48; squinting, n = 18; nystagmus (NA), n = 95; head position, n = 115; suspected strabismus (SA), n = 360; random visual impairment (VI), n = 1,000; nonimpairment (NI), n = 1,000. Results are expressed as mean ± s.d. *P < 0.01 for comparisons with random VI (motionless fixation, P = 4.60 × 10−8; suspected SA, P = 1.09 × 10−15; NA, P = 1.52 × 10−7; head position, P = 0.005; two-tailed Mann–Whitney U-tests). AA, aphakia; CC, congenital cataract; CG, congenital glaucoma; CP, congenital ptosis; HA, high ametropia; LD, limbal dermoid; MO, microphthalmia; OF, other fundus diseases; PA, Peters' anomaly; PFV, persistent fetal vasculature; PM, pupillary membrane; RB, retinoblastoma; SSOM, systemic syndromes with ocular manifestations.

Reliability and adjusted analyses
Stable performance is critical for real-world applications of mHealth and medical AI systems. Thus, we investigated the reliability of AIS at the clinic of ZOC. We first evaluated the influences of patient-related factors, including sex, age, laterality of the eye disorder and the conspicuousness of the phenotypic features, on the performance of AIS. For the reliability stratified by sex, AIS achieved an AUC of 0.948 (95% CI, 0.921–0.971) in the boys group and an AUC of 0.931 (95% CI, 0.899–0.961) in the girls group (Fig. 4a). The predicted probability pattern of AIS remained stable under various age conditions (Fig. 4b), and the system achieved AUCs ranging from 0.909 for age group 4 to 0.954 for age group 3 (Fig. 4c). Additionally, AIS effectively identified visually impaired children with bilateral or unilateral eye disorders, with an AUC of 0.921 (95% CI, 0.891–0.952) in the unilateral group and an AUC of 0.952 (95% CI, 0.932–0.973) in the bilateral group (Fig. 4d). In addition, AIS achieved satisfactory performance with an AUC of 0.939 (95% CI, 0.918–0.960) in identifying hard-to-spot visually impaired children, who could have insidious phenotypic features and be easily neglected by community ophthalmologists (Supplementary Table 9).

Furthermore, we investigated the reliability of AIS under different data capture conditions, including testing distance, room illuminance, repeated testing and duration of the video recording. Similarly, AIS obtained stable detection performance among groups of different testing distances, with the lowest AUC of 0.935 (95% CI, 0.912–0.958) in the medium-distance group (Fig. 4e).
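Subgroup reliability results of the kind reported above (AUCs by sex, age, laterality or testing distance) can be reproduced with a small helper. The paper reports DeLong CIs; the sketch below substitutes a simpler percentile bootstrap, and the subgroup masks are assumptions.

```python
# Sketch of subgroup AUCs with percentile-bootstrap CIs (a stand-in for DeLong CIs).
import numpy as np
from sklearn.metrics import roc_auc_score

def subgroup_auc(y_true, y_score, mask, n_boot=2000, seed=0):
    y_true, y_score = np.asarray(y_true)[mask], np.asarray(y_score)[mask]
    rng = np.random.default_rng(seed)
    point = roc_auc_score(y_true, y_score)
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:   # the resample needs both classes
            continue
        boots.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return point, (lo, hi)

# e.g. subgroup_auc(labels, ais_scores, mask=(sex == "girl"))
```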


[Figure 4 graphics omitted: panels a–i (subgroup performance, dispersion scatterplots, ROC curves, retest scatterplot and duration curves).]

Fig. 4 | Performance of the AIS system in reliability analyses. a, Performance of AIS in detecting children with visual impairment (VI) based on sex: girls, n = 254; boys, n = 315. b, Scatterplot of dispersion of the AIS predicted probability changes by age (months). c, Receiver operating characteristic (ROC) curves of AIS for detecting children with VI by age groups: age group 1, age ≤ 12 months, n = 98, AUC = 0.925 (0.847–1.000); age group 2, 12 months < age ≤ 24 months, n = 160, AUC = 0.936 (0.895–0.977); age group 3, 24 months < age ≤ 36 months, n = 189, AUC = 0.954 (0.928–0.980); age group 4, 36 months < age ≤ 48 months, n = 122, AUC = 0.909 (0.855–0.964). d, Performance of AIS for identifying children with unilateral or bilateral VI: unilateral, n = 158; bilateral, n = 238; nonimpairment (NI), n = 173. e, Performance of AIS for detecting children with VI under various testing distance conditions: long distance, n = 47; medium distance, n = 432; short distance, n = 90. f, Scatterplot of dispersion of the AIS predicted probability changes by room illuminance (in lux (lx)). g, ROC curves of AIS for distinguishing children with VI under various room illuminance conditions: illuminance group 1, room illuminance ≤ 200 lx, n = 125, AUC = 0.936 (0.895–0.976); illuminance group 2, 200 lx < room illuminance ≤ 400 lx, n = 317, AUC = 0.932 (0.901–0.963); illuminance group 3, room illuminance > 400 lx, n = 127, AUC = 0.950 (0.915–0.985). h, Predicted probabilities of the detection model for repeated detection tests (NI, n = 102; VI, n = 85). i, Performance curves of AIS by video duration. In a,d,e, results are expressed as means and 95% CIs with DeLong CIs for AUC values and 95% Wilson CIs for other metrics. ACC, accuracy; SEN, sensitivity; SPE, specificity.

Additionally, the AIS predicted probability pattern remained stable under different room illuminance conditions (Fig. 4f). Our system achieved the lowest AUC of 0.932 (95% CI, 0.901–0.963) in the medium illuminance group (Fig. 4g). In the retest analysis, the system remained robust, with an intraclass correlation coefficient for predicted probabilities of 0.880 (95% CI, 0.843–0.908) and a Cohen's κ for predicted categories of 0.837 (95% CI, 0.758–0.916) in another independent validation population recruited at ZOC (Fig. 4h and Table 1). In addition, as the duration of the video recording increased, AIS remained stable and achieved a maximal AUC of 0.931 (95% CI, 0.914–0.956) with a video duration longer than 30 s (Fig. 4i).
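The retest agreement statistics quoted above (intraclass correlation for repeated predicted probabilities and Cohen's κ for repeated predicted categories) can be computed as sketched below. The use of pingouin's ICC2 estimate and a 0.5 decision threshold are assumptions, not the study's specification.

```python
# Sketch of test-retest agreement: ICC on probabilities, Cohen's kappa on categories.
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

def retest_agreement(p_test, p_retest, threshold=0.5):
    n = len(p_test)
    long = pd.DataFrame({
        "child": list(range(n)) * 2,
        "session": ["test"] * n + ["retest"] * n,
        "prob": list(p_test) + list(p_retest),
    })
    icc_table = pg.intraclass_corr(data=long, targets="child",
                                   raters="session", ratings="prob")
    icc2 = icc_table.loc[icc_table["Type"] == "ICC2", "ICC"].item()
    kappa = cohen_kappa_score([p >= threshold for p in p_test],
                              [p >= threshold for p in p_retest])
    return icc2, kappa
```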


To further verify that the detection results of our system were reliable and not solely mediated by baseline characteristics as confounders, we examined the odds ratios (ORs) of the AIS predictions adjusted for baseline characteristics at the child level. Even after controlling for potential baseline confounders, the AIS predictions had statistically significant adjusted ORs for detecting visual impairment in the internal and external validations and the at-home implementation (P < 0.001). The adjusted ORs ranged from 3.034 to 3.248 for tasks in the internal validation (Supplementary Table 10) and from 2.307 to 2.761 for tasks in the external validation (Supplementary Table 11). For the at-home implementation, the AIS predictions had a statistically significant adjusted OR of 2.496 (95% CI, 1.748–3.565, P = 4.815 × 10−7) for detecting visual impairment (Supplementary Table 12).

Performance of the AIS across different smartphone platforms
To test the stability of our system in more complex settings, we applied adjustments to a dataset randomly sampled from the ZOC validation set with various blurring, brightness, color or Gaussian noise adjustment gradients to simulate the diversity of data quality collected by different smartphone cameras. Our system remained reliable and achieved AUCs of over 0.800 with blurring factors no more than 25 or brightness factors no more than 0.7, and it achieved AUCs of over 0.930 under different color adjustments and over 0.820 under various Gaussian noise adjustments (Extended Data Fig. 7).

Furthermore, an independent validation set from 389 children was collected at ZOC using the Huawei Honor-6 Plus and Redmi Note-7 smartphones with the Android operating system to evaluate the performance of AIS (Fig. 1b and Supplementary Table 13). After data quality checking, videos of 361 children were reserved (92.8%), including 87 (24.1%) children without visual impairment, 169 (46.8%) children with mild visual impairment and 105 (29.1%) children with severe visual impairment (Table 1). AIS showed significantly higher predicted probabilities for mild or severe impairment than for nonimpairment and achieved an AUC of 0.932 (95% CI, 0.902–0.963) for identifying visual impairment with the Android system at the child level (Extended Data Fig. 8).
Discussion
With the high incidence of visual problems during the first few years of life, timely intervention to counter pathological visual deprivation mechanisms during this critical development period can prevent or minimize long-term visual loss3. However, early detection of visual impairment in young children is challenging due to the lack of accurate and easy-to-use tools applicable to both clinical and community environments. To overcome these challenges, we developed and validated a smartphone-based system (AIS) that provides a holistic and quantitative technique to identify visual impairment in young children in real-world settings. We comprehensively evaluated this system for 16 important causes of childhood vision loss. Our system achieved an AUC of 0.940 in the internal validation and an AUC of 0.843 in the external validation at the clinics of four different hospitals. Furthermore, our system proved reliable when used by parents or caregivers at home, achieving an AUC of 0.859 under these specific testing conditions.

One of the merits of AIS is its applicability to different ocular diseases. Previous studies have utilized photographs to detect ocular and visual abnormalities in childhood27,28. These technologies, which focus on a single static image, are not suitable for large-scale applications due to their limited effectiveness and inability to handle multiple abnormalities with variable patterns. Given the complexity of ocular pathologies in children, the concept of accurately assessing a broad range of ocular conditions is attractive. In our prospective multicenter study, we analyzed more than 25,000,000 frames of information-rich phenotypic videos and accurately identified visual impairment caused by a wide range of sight-threatening eye diseases. Strikingly, AIS was able to detect most of the common causes of visual impairment in childhood, including anterior and posterior segment disorders, strabismus, ocular neoplasms, developmental abnormalities and ocular manifestations of systemic and genetic diseases29. Although cases like congenital cataracts tend to be easily diagnosed in specialist settings by experienced doctors, they are still frequently missed in the community, especially in areas with a pediatric ophthalmic resource shortfall28. To apply AIS to various scenarios, we recruited cases of a broad range of eye disorders with variable severity in terms of their impact on vision. Our system was reasonably accurate in identifying mildly impaired children who could have subtle phenotypic features, making them easy to miss. Furthermore, our results indicate that AIS can be extended to diseases that have not been previously encountered in the training process, demonstrating its broader applicability.

The use of smartphones to detect visual impairment caused by extraocular or systemic diseases is an important future application, but its feasibility remains to be verified. Some systemic diseases, such as cardiovascular, hepatobiliary and renal diseases, can exhibit ocular manifestations that are recognizable by algorithms, which is also indicated by our findings in small samples30–32. Furthermore, disorders of the neurological system can impact vision and cause cerebral visual impairment with pathology outside the eye, which is a common type of visual impairment in developed countries but was not represented in this study33,34. Therefore, future work is needed to evaluate the merit of AIS in detecting visual impairment caused by a broad range of diseases, such as cerebral visual impairment, and in reducing the extraocular morbidity associated with systemic diseases in a larger population: for example, the cardiovascular complications linked with Marfan syndrome.

A major strength of AIS is its reliability in real-world practice. Although a large number of medical AI systems have been evaluated with high performance in the laboratory setting, only a few systems have demonstrated real-world medical feasibility23,25. Bias from training data and low stability of the model design greatly limit the generalizability of these AI systems. Previously, we evaluated the feasibility of identifying visual impairment in children by analyzing their phenotypic characteristics using DL algorithms16. For that study, the evaluation was conducted by experienced experts under a tightly controlled, standardized laboratory setting to strictly control for interference factors, which is not possible in routine ophthalmic practice. In this study, we prospectively collected a large amount of phenotypic data (facial features and ocular movements) to develop a DL system with a highly reliable design. Our results show that AIS exhibited high stability and prediction effectiveness under various testing conditions. Importantly, AIS remained effective in multicenter external validation and, crucially, when rolled out in the community and used by parents or caregivers at home. When transferred to at-home settings, factors such as environmental interference, blurring, brightness, the pixels of different cameras and the influence of untrained operators may impact the system's performance. Therefore, we used a pilot dataset to fine-tune our system for its generalizability to various home environments and broader applications. AIS achieved an acceptable AUC of 0.859 in the subsequent implementation, which indicates that it can benefit from further model updating on larger-scale datasets for broader applications. Importantly, AIS remained stable across 88 different home environments after one round of fine-tuning, demonstrating its potential to be used in a variety of complex environments without requiring regular adaptation or fine-tuning in future applications.

Our findings demonstrate that sensory states, especially vision, can be derived from phenotypic video data recorded using consumer-grade smartphones. Two types of underlying features seemed to be captured by smartphones. First, changes in facial appearance caused by ocular pathologies can be directly recorded by mobile devices, especially those of the ocular surface or adnexa: for example, eyelid drooping in congenital ptosis. Second and more importantly, individuals may display aberrant behaviors to adapt to changes in their sensory modality, a process conserved from arthropods to mammals35,36 and confirmed in human children16. Our results show that the model can focus on behavioral features replicated in various eye diseases, such as abnormal ocular movement or alignment/fixation patterns.


These common behavioral patterns may broaden the applicability of AIS to multiple ocular diseases, including posterior segment abnormalities that are more challenging to diagnose based on phenotypic video data.

A smartphone-based system to detect ocular pathology in children has obvious clinical implications. Early identification by parents or caregivers of ocular abnormalities facilitates timely referral to pediatric ophthalmologists and prompt intervention. AIS does not require professional medical equipment; smartphones and simple stabilization are sufficient. This low-barrier system is a promising tool for the timely testing of children in the community, which is a major advantage given the rapidly changing nature of the ocular pathology encountered in children. This could have a major impact by improving vision-related outcomes and even survival rates in cases such as retinoblastoma37,38. Furthermore, AIS is a promising tool to screen young children for ocular abnormalities remotely, which can reduce ophthalmologists' exposure risk to infectious agents, as exemplified by the impact of the coronavirus disease 2019 (COVID-19) pandemic, in the so-called 'new normal' period39.

This study has several limitations. First, although we may have missed the recruitment of some patients with conditions causing slight visual impairment in specialist clinical settings, our system was satisfactorily accurate in identifying mildly impaired children with subtle phenotypic features. Importantly, the versatile AIS system maintained reliable performance in detecting visually impaired children who were hard to spot even for community ophthalmologists, which supports expanding our future work to the general population and to groups of children with mild or early-stage ocular pathology. Second, to develop the quality control module and analyze the influencing factors, only a single video was collected for each child at ZOC, accounting for the relatively high rate of unsuccessful cases in this stage. However, our system allowed users to repeat video recordings until qualified videos were acquired. As a result, the success rate of identification greatly improved. Although a proportion of uncooperative children may not be appropriate for our tool, our AIS system has greatly lowered the minimal operating threshold for untrained users, indicating its potential for general application. Third, our cohorts recruited in clinical settings may not represent the real-world population. Although AIS effectively identified visually impaired children in the 'finding a needle in a haystack' test with a prevalence simulated to a general population, a large-scale screening trial is needed in the future to validate the utility of the AIS system in real-world applications. Fourth, AIS requires collecting facial information from children, which may pose a risk of privacy exposure. To avoid potential privacy risks, future techniques such as lightweight model backbones40 and model pruning41 could be applied to deploy the DL system on individual smartphones with no requirement for additional computing resources. In addition, digital fingerprint technology, such as blockchain42, can also be applied to monitor data usage and mitigate abuse effectively. Additionally, we developed a real-time three-dimensional facial reconstruction technology to irreversibly erase biometric attributes while retaining gaze patterns and eye movements43, which can be used in the future to safeguard children's privacy when using AIS.

In conclusion, we developed and validated an innovative smartphone-based technique to detect visual impairment in young children affected with a broad range of eye diseases. Given the ubiquity of smartphones, AIS is a promising tool that can be applied in real-world settings for secondary prevention of visual loss in this particularly vulnerable age group.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41591-022-02180-9.

References
1. Kliner, M., Fell, G., Pilling, R. & Bradbury, J. Visual impairment in children. Eye 25, 1097–1097 (2011).
2. Mariotti, A. & Pascolini, D. Global estimates of visual impairment. Br. J. Ophthalmol. 96, 614–618 (2012).
3. Bremond-Gignac, D., Copin, H., Lapillonne, A. & Milazzo, S. Visual development in infants: physiological and pathological mechanisms. Curr. Opin. Ophthalmol. 22, S1–S8 (2011).
4. Teoh, L., Solebo, A. & Rahi, J. Temporal trends in the epidemiology of childhood severe visual impairment and blindness in the UK. Br. J. Ophthalmol. https://doi.org/10.1136/bjophthalmol-2021-320119 (2021).
5. Gothwal, V. K., Lovie-Kitchin, J. E. & Nutheti, R. The development of the LV Prasad-Functional Vision Questionnaire: a measure of functional vision performance of visually impaired children. Investigative Ophthalmol. Vis. Sci. 44, 4131–4139 (2003).
6. Brown, A. M. & Yamamoto, M. Visual acuity in newborn and preterm infants measured with grating acuity cards. Am. J. Ophthalmol. 102, 245–253 (1986).
7. Dutton, G. N. & Blaikie, A. J. How to assess eyes and vision in infants and preschool children. BMJ Br. Med. J. 350, h1716 (2015).
8. Blindness and Vision Impairment (World Health Organization, 2021); https://www.who.int/en/news-room/fact-sheets/detail/blindness-and-visual-impairment
9. Mayer, D. L. & Dobson, V. in Developing Brain Behaviour (ed. Dobbing, J.) 253–292 (Academic, 1997).
10. Quinn, G. E., Berlin, J. A. & James, M. The Teller acuity card procedure: three testers in a clinical setting. Ophthalmology 100, 488–494 (1993).
11. Johnson, A., Stayte, M. & Wortham, C. Vision screening at 8 and 18 months. Steering Committee of Oxford Region Child Development Project. Br. Med. J. 299, 545–549 (1989).
12. Long, E. et al. Monitoring and morphologic classification of pediatric cataract using slit-lamp-adapted photography. Transl. Vis. Sci. Technol. 6, 2 (2017).
13. Balmer, A. & Munier, F. Differential diagnosis of leukocoria and strabismus, first presenting signs of retinoblastoma. Clin. Ophthalmol. 1, 431 (2007).
14. SooHoo, J. R., Davies, B. W., Allard, F. D. & Durairaj, V. D. Congenital ptosis. Surv. Ophthalmol. 59, 483–492 (2014).
15. Mandal, A. K. & Chakrabarti, D. Update on congenital glaucoma. Indian J. Ophthalmol. 59, S148 (2011).
16. Long, E. et al. Discrimination of the behavioural dynamics of visually impaired infants via deep learning. Nat. Biomed. Eng. 3, 860–869 (2019).
17. Brown, A. M. & Lindsey, D. T. Infant color vision and color preferences: a tribute to Davida Teller. Vis. Neurosci. 30, 243–250 (2013).
18. Holmes, J. M. & Clarke, M. P. Amblyopia. Lancet 367, 1343–1351 (2006).
19. Abadi, R. & Bjerre, A. Motor and sensory characteristics of infantile nystagmus. Br. J. Ophthalmol. 86, 1152–1160 (2002).
20. Wright, K. W., Spiegel, P. H. & Hengst, T. Pediatric Ophthalmology and Strabismus (Springer, 2013).
21. Sim, I. Mobile devices and health. N. Engl. J. Med. 381, 956–968 (2019).
22. Grady, C. et al. Informed consent. N. Engl. J. Med. 376, 856–867 (2017).
23. Beede, E. et al. A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. In Proc. 2020 CHI Conference on Human Factors in Computing Systems 1–12 (Association for Computing Machinery, 2020).
24. Davenport, T. H. & Ronanki, R. Artificial intelligence for the real world. Harvard Bus. Rev. 96, 108–116 (2018).


25. Lin, H. et al. Diagnostic efficacy and therapeutic decision-making capacity of an artificial intelligence platform for childhood cataracts in eye clinics: a multicentre randomized controlled trial. eClinicalMedicine 9, 52–59 (2019).
26. King, D. E. Dlib-ml: a machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009).
27. Munson, M. C. et al. Autonomous early detection of eye disease in childhood photographs. Sci. Adv. 5, eaax6363 (2019).
28. Long, E. et al. An artificial intelligence platform for the multihospital collaborative management of congenital cataracts. Nat. Biomed. Eng. 1, 0024 (2017).
29. Gogate, P., Gilbert, C. & Zin, A. Severe visual impairment and blindness in infants: causes and opportunities for control. Middle East Afr. J. Ophthalmol. 18, 109–114 (2011).
30. Cheung, C. Y. et al. A deep-learning system for the assessment of cardiovascular disease risk via the measurement of retinal-vessel calibre. Nat. Biomed. Eng. 5, 498–508 (2021).
31. Sabanayagam, C. et al. A deep learning algorithm to detect chronic kidney disease from retinal photographs in community-based populations. Lancet Digital Health 2, e295–e302 (2020).
32. Xiao, W. et al. Screening and identifying hepatobiliary diseases through deep learning using ocular images: a prospective, multicentre study. Lancet Digital Health 3, e88–e97 (2021).
33. Pehere, N., Chougule, P. & Dutton, G. N. Cerebral visual impairment in children: causes and associated ophthalmological problems. Indian J. Ophthalmol. 66, 812–815 (2018).
34. Gilbert, C. & Foster, A. Childhood blindness in the context of VISION 2020—the right to sight. Bull. World Health Organ. 79, 227–232 (2001).
35. Dey, S. et al. Cyclic regulation of sensory perception by a female hormone alters behavior. Cell 161, 1334–1344 (2015).
36. Klein, M. et al. Sensory determinants of behavioral dynamics in Drosophila thermotaxis. Proc. Natl Acad. Sci. USA 112, E220–E229 (2015).
37. Finger, P. T. & Tomar, A. S. Retinoblastoma outcomes: a global perspective. Lancet Glob. Health 10, e307–e308 (2022).
38. Wong, E. S. et al. Global retinoblastoma survival and globe preservation: a systematic review and meta-analysis of associations with socioeconomic and health-care factors. Lancet Glob. Health 10, e380–e389 (2022).
39. Romano, M. R. et al. Facing COVID-19 in ophthalmology department. Curr. Eye Res. 45, 653–658 (2020).
40. Howard, A. et al. Searching for MobileNetV3. In Proc. IEEE/CVF International Conference on Computer Vision 1314–1324 (IEEE, 2019).
41. Hoefler, T., Alistarh, D., Ben-Nun, T., Dryden, N. & Peste, A. Sparsity in deep learning: pruning and growth for efficient inference and training in neural networks. J. Mach. Learn. Res. 22, 1–124 (2021).
42. Leeming, G., Ainsworth, J. & Clifton, D. A. Blockchain in health care: hype, trust, and digital health. Lancet 393, 2476–2477 (2019).
43. Yang, Y. et al. A digital mask to safeguard patient privacy. Nat. Med. 28, 1883–1892 (2022).

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

© The Author(s), under exclusive licence to Springer Nature America, Inc. 2023

Wenben Chen1,24, Ruiyang Li1,24, Qinji Yu2,24, Andi Xu1,24, Yile Feng3,24, Ruixin Wang1, Lanqin Zhao 1, Zhenzhe Lin1,
Yahan Yang1, Duoru Lin1, Xiaohang Wu1, Jingjing Chen1, Zhenzhen Liu1, Yuxuan Wu1, Kang Dang3, Kexin Qiu3,
Zilong Wang 3, Ziheng Zhou3, Dong Liu1, Qianni Wu1, Mingyuan Li1, Yifan Xiang 1, Xiaoyan Li1, Zhuoling Lin1,
Danqi Zeng1, Yunjian Huang1, Silang Mo4, Xiucheng Huang4, Shulin Sun5, Jianmin Hu6, Jun Zhao7, Meirong Wei8,
Shoulong Hu9,10, Liang Chen11, Bingfa Dai6, Huasheng Yang1, Danping Huang1, Xiaoming Lin1, Lingyi Liang1, Xiaoyan Ding1,
Yangfan Yang1, Pengsen Wu1, Feihui Zheng12, Nick Stanojcic13, Ji-Peng Olivia Li 14, Carol Y. Cheung15, Erping Long 1,
Chuan Chen16, Yi Zhu17, Patrick Yu-Wai-Man 14,18,19,20, Ruixuan Wang21, Wei-shi Zheng 21, Xiaowei Ding 2,3 &
Haotian Lin 1,22,23

1State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of
Ophthalmology and Vision Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China. 2Institute of Image
Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China. 3VoxelCloud, Shanghai, China. 4School of Medicine,
Sun Yat-sen University, Shenzhen, China. 5Department of Urology, Peking University Third Hospital, Peking University Health Science Center, Beijing,
China. 6Department of Ophthalmology, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China. 7Shenzhen People’s Hospital
(The Second Clinical Medical College, Jinan University; The First Affiliated Hospital, Southern University of Science and Technology), Shenzhen,
China. 8Liuzhou Maternity and Child Healthcare Hospital, Affiliated Women and Children’s Hospital of Guangxi University of Science and Technology,
Liuzhou, China. 9National Center for Children’s Health, Department of Ophthalmology, Beijing Children’s Hospital, Capital Medical University, Beijing,
China. 10Department of Ophthalmology, Zhengzhou Children’s Hospital, Zhengzhou, China. 11Shenzhen Eye Hospital, Jinan University, Shenzhen Eye
Institute, Shenzhen, China. 12Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore. 13Department of Ophthalmology,
St. Thomas’ Hospital, London, UK. 14Moorfields Eye Hospital, London, UK. 15Department of Ophthalmology & Visual Sciences, Faculty of Medicine, The
Chinese University of Hong Kong, Hong Kong, China. 16Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami,
FL, USA. 17Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL, USA. 18University College
London Institute of Ophthalmology, University College London, London, UK. 19Cambridge Eye Unit, Addenbrooke’s Hospital, Cambridge University
Hospitals, Cambridge, UK. 20Cambridge Center for Brain Repair and Medical Research Council (MRC) Mitochondrial Biology Unit, Department of Clinical
Neurosciences, University of Cambridge, Cambridge, UK. 21School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
22Hainan Eye Hospital and Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Haikou, China. 23Center for Precision
Medicine and Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China. 24These
authors contributed equally: Wenben Chen, Ruiyang Li, Qinji Yu, Andi Xu, Yile Feng. e-mail: dingxiaowei@sjtu.edu.cn; linht5@mail.sysu.edu.cn


Methods

Ethics approval
The predefined protocol of the clinical study was approved by the Institutional Review Board/Ethics Committee of ZOC and prospectively registered at ClinicalTrials.gov (identifier: NCT04237350), and it is shown in Supplementary Note. Consent was obtained from all individuals whose eyes or faces are shown in the figures or video for publication. Before data collection, informed written consent was obtained from at least one parent or guardian of each child. The investigators followed the requirements of the Declaration of Helsinki throughout the study.

Study design and study population
This prospective, multicenter and observational study was conducted between 14 January 2020 and 30 January 2022 to recruit children for the development and validation of the mHealth system in three stages (Fig. 1b). Major eligibility criteria included an age of 48 months or younger and informed written consent obtained from at least one parent or guardian of each child. We did not include children having central nervous system diseases, mental illnesses or other known illnesses that could affect their behavioral patterns, in the absence of ocular manifestations. Children who could not cooperate to complete the ophthalmic examinations or the detection test using AIS were excluded. We also excluded children who had received ocular interventions and treatments in the month immediately preceding data collection.
In the first stage, completed from 14 January 2020 to 15 September 2021, children were enrolled at the clinic of ZOC (Guangdong Province) to develop and comprehensively validate (internal validation and reliability analyses) the system. In the second stage, which occurred from 22 September 2021 to 19 November 2021, children were enrolled at the clinics of the Second Affiliated Hospital of Fujian Medical University (Fujian Province), Shenzhen Eye Hospital (Guangdong Province), Liuzhou Maternity and Child Healthcare Hospital (Guangxi Province) and Beijing Children's Hospital of Capital Medical University (Beijing) to additionally evaluate the system (external validation). We selected these sites from three provinces across northern and southern China, representing the variations in clinical settings. In the first two stages, recruited children underwent ophthalmic examinations by clinical staff, and phenotypic videos were collected by trained volunteers using mHealth apps installed on iPhone-7 or iPhone-8 smartphones at each center. In the third stage, conducted from 24 November 2021 to 30 January 2022, we advertised our study through the online platform of the Pediatric Department of ZOC and the social media platform WeChat. We recruited children and their parents or caregivers online from the Guangdong area for at-home implementation. The investigators recruited the children following the same eligibility criteria as the previous two stages by collecting their basic information and medical history online. In addition, children who could not come to ZOC for an ophthalmic assessment or who had been included in other stages of this study were excluded. Untrained parents or caregivers recorded the phenotypic videos with their smartphones according to the instructions of the AIS app at home (Extended Data Figs. 1 and 2). The quality control module automatically reminded parents or caregivers to repeat data collection when the video recordings were unqualified. In this stage, all the children who completed successful video recordings underwent ophthalmic examinations at ZOC. A total of 3,652 children were finally enrolled, recording more than 25,000,000 frames of videos for development and validation of the system.

Definition of visual impairment
Comprehensive functional and structural examinations were performed to stratify children's visual conditions for developing and validating the DL-based AIS. For unified examination, a Teller vision card (Stereo Optical Company) was utilized to measure children's monocular visual acuity44. In addition, high-resolution slit lamp examinations, fundoscopy examinations and cycloplegic refraction were used to detect abnormalities in the eyes. Additional examinations, such as intraocular pressure, ultrasound, computerized tomography scans and genetic tests, were determined by experienced pediatric ophthalmologists when necessary.
According to the results of the abovementioned examinations and a referenced distribution of monocular visual acuity45, experienced pediatric ophthalmologists comprehensively stratified children's visual conditions into three groups. Children with the best-corrected visual acuity (BCVA) of both eyes in the 95% referenced range with no abnormalities of structure or other examination results were assigned to the nonimpaired group. Children with the BCVA in the 99% referenced range in both eyes with abnormalities of structure or other examination results were assigned to the mildly impaired group. Children with the BCVA of at least one eye outside the 99% referenced range or worse than light perception with structural abnormalities or other examination results were assigned to the severely impaired group16. We recruited visually impaired children with primary diagnoses of the following 16 ocular disorders: aphakia, congenital cataract, congenital glaucoma, high ametropia, Peters' anomaly, nystagmus, congenital ptosis, strabismus, persistent fetal vasculature, retinoblastoma, other fundus diseases, limbal dermoid, microphthalmia, pupillary membranes, systemic syndromes with ocular manifestations and other ocular conditions (Table 2 and Supplementary Table 4). A tiered panel consisting of two groups of experts assigned and confirmed the primary diagnosis as the most significant diagnostic label for each child. The first group of experts consisted of two pediatric ophthalmologists with over 10 years of experience in each recruiting ophthalmic center who separately provided the preliminary labeling information. If a consensus was not reached at this stage, a second group of more senior pediatric ophthalmologists with over 20 years of experience at ZOC verified the diagnostic labels as the ground truth. The diagnoses of children recruited online for the at-home implementation were made by experts at ZOC following the same criteria.

Concept of the AIS system
The AIS system consisted of a smartphone app (available for iPhone and Android operating systems) for data collection and a DL back end for data analysis (Fig. 1a and Extended Data Fig. 1). To ensure the quality of data collected in real-world settings, AIS interactively instructed users to follow a standardized preparation sequence for data collection (Extended Data Fig. 2). Before data collection, a short demo video was displayed to instruct users on the standard operation and how to choose an appropriate environment to minimize testing biases (for example, room illuminance, background, testing distance and interference). Once the smartphone was firmly in place, a face-positioning frame was shown on the screen to help adjust the distance and position of the child in relation to the smartphone. After all preparations were completed properly, AIS played a cartoon-like video stimulus lasting approximately 3.5 min to attract children's attention, and the inbuilt front camera recorded the children's phenotypic features (ocular movements and facial appearance) in video format.
Then, the collected data were transferred to the DL-based back end, where the quality control module automatically performed quality checking on each frame first. To eliminate background interference, the children's facial regions were then cropped out of consecutive frames of sufficient quality to form short video clips as inputs of the subsequent DL models for final decision-making (a detection model to distinguish visually impaired children from nonimpaired individuals and a diagnostic model to discriminate multiple ocular disorders). The DL models produced classification probabilities for short video clips, which were eventually merged into the video-level classification probability as the final outcome by averaging. The final results were returned to the mHealth app to alert users to promptly refer children at high risk of visual impairment to experienced pediatric ophthalmologists for further diagnosis and intervention.
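
As a concrete illustration of the decision logic just described, the following minimal Python sketch shows how clip-level probabilities could be merged into the video-level outcome: clip probabilities are averaged, the detection task compares the mean with a 0.50 cut-off, and the diagnostic task takes the category with the highest mean probability. This is not the authors' released code; the function names, example numbers and class list are illustrative assumptions.

```python
# Hedged sketch of the video-level aggregation described above (illustrative only).
from typing import List, Sequence

import numpy as np


def detect_video(clip_probs: Sequence[float], threshold: float = 0.50) -> dict:
    """Average clip-level impairment probabilities and apply the 0.50 cut-off."""
    video_prob = float(np.mean(clip_probs))
    return {"probability": video_prob, "visually_impaired": video_prob > threshold}


def diagnose_video(clip_class_probs: List[Sequence[float]],
                   class_names: Sequence[str]) -> str:
    """Return the category with the highest average probability across clips."""
    mean_probs = np.asarray(clip_class_probs).mean(axis=0)
    return class_names[int(np.argmax(mean_probs))]


# Example with made-up numbers for one video containing five qualified clips.
print(detect_video([0.62, 0.71, 0.55, 0.48, 0.66]))
print(diagnose_video(
    [[0.1, 0.5, 0.2, 0.1, 0.1], [0.2, 0.4, 0.2, 0.1, 0.1]],
    ["nonimpairment", "aphakia", "congenital glaucoma", "congenital ptosis", "strabismus"],
))
```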


Deep quality control module
To ensure prediction reliability, we adopted a strict data quality control strategy to ensure that the input clips of the detection/diagnostic models satisfied certain quality criteria (Fig. 1a). First, for each frame, the child's facial area was detected, and frames without successful face detection were rejected. If two or more faces were detected in a given frame, it suggested that the child's parents or other persons were inside the scene, and such a frame was also rejected. The facial region detection algorithm was based on MMOD CNN46, which consisted of a series of convolutional layers for feature learning and max-margin operation during model training. In this study, the MMOD CNN face detector pretrained on publicly available datasets was adopted from the Dlib Python Library, which has been proven to be effective and robust in facial detection tasks26.
Second, a facial key point localization algorithm was applied to the detected facial area to extract the landmarks of facial regions, including the left eye, right eye, nose tip, chin and mouth corners, which served as the reference coordinates for the cropping of facial regions. The facial key point localization algorithm was realized based on a pretrained ensemble of regression trees, which was also provided by the Dlib Python Library47,48. We adopted a cascade of regressors to take the facial region of the frame as the input. The network was able to learn coarse-to-fine feature representations of the child's face, especially details of the facial patterns. The output of this model was then fitted to the coordinates representing facial structures to generate 68 key target points. The coordinates of the key points then served as the reference for facial region cropping. All video data and image data processing were performed using the FFmpeg toolkit and OpenCV Python Library49.
Then, a combination of crying, interference and occlusion classification models based on EfficientNet-B2 networks (Extended Data Fig. 3a,b) was applied to each frame, which was trained based on the data collected at ZOC (Extended Data Fig. 3d and Supplementary Table 1)50. During model training and inference, the input frame was first rescaled to 384 × 384 resolution and then sent into the models for deep feature representation learning (Supplementary Table 14). Positive outputs by the models indicated that the child was crying, was interfered with or had its facial region blocked by objects such as toys or other persons' hands, and the corresponding frames were also discarded. In practice, we fine-tuned the models pretrained on the ImageNet public dataset51.
Eventually, the remaining frames were considered high-quality candidates, and consecutive high-quality frames were selected to form short video clips. Each clip lasted at least 1.5 s and at most 5 s. The child's facial region within each clip was then cropped out to serve as the final input of the subsequent detection/diagnostic models based on the facial key point coordinates to eliminate the interference of the background region. A qualified video should contain more than ten clips; otherwise, the video was treated as a low-quality sample and discarded.

DL framework of the detection/diagnostic models
Two models with various clinical purposes were developed in this study: a detection model to detect visually impaired children from nonimpaired children and a five-category diagnostic model to discriminate specific ophthalmic disorders (aphakia, congenital glaucoma, congenital ptosis and strabismus) and nonimpairment. The backbone of each DL model was built on a deep convolutional network known as EfficientNet-B4 (Extended Data Fig. 3c and Supplementary Table 14)50. The models made predictions on the children's cropped facial regions. Specifically, spatial cues of the input clips were learned by cascaded convolutional layers, while temporal cues were integrated by temporal average pooling layers, which was inspired by successful applications in gait recognition52. The temporal average pooling operator was given by $\frac{1}{n}\sum_{i=1}^{n}\vec{x}_i$, where $n$ was the number of frames in the input clip and $\vec{x}_i$ was the feature map of each frame output by the last convolutional layer of the network. Before training, all convolutional blocks were initialized by the parameters of the models pretrained on the ImageNet dataset51. At the inference stage, class scores given by the models were treated as the final clip-level probability outcomes. For the detection model, the output of the last classification layer, indicated by $x_i$, was normalized to the range between 0.00 and 1.00 for each clip using the sigmoid function $p_i = \frac{1}{1+\exp(-x_i)}$, representing the final probability of the $i$th clip being classified as a visually impaired candidate. To train the detection model, the cost function was given by the classic binary cross-entropy loss $L = -\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i \log(p_i) + (1-\hat{y}_i)\log(1-p_i)\right)$, where $N$ was the number of clips within each batch, $\hat{y}_i$ was the ground truth label of the $i$th clip and $p_i$ was the output classification probability of the model.
The diagnostic model was developed based on the same EfficientNet-B4 backbone as the detection model. The only difference was that the output of the diagnostic model was activated by a five-category softmax function that indicated the probability of each class: $p_k = \frac{e^{x_k}}{\sum_{j=1}^{5} e^{x_j}}$, where $x_k$ was the output of the last classification layer for the $k$th class. The cost function of the network was given by the stochastic cross-entropy loss $L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{5}\hat{y}_k^{\,i}\log(p_k^i)$, where $N$ was the batch size and $\hat{y}_k^{\,i} \in \{0,1\}$ was the binary Boolean variable of the $i$th input clip within each batch, indicating whether the $k$th class matched the ground truth label of the $i$th clip.
Child-level classification was based on clip-level predictions. A sliding window integrated with the quality control module was applied along the temporal dimension of the whole video to extract high-quality clips. Such clips then served as the candidate inputs for detection/diagnostic models. For the detection model, if the average score of the clips exceeded 0.50 within each video, the child was eventually classified as a visually impaired individual. For the diagnostic model, the category with the highest average probability was treated as the final prediction outcome.

Model training and internal validation
We first developed the data quality control module using both publicly available datasets and the ZOC dataset (Supplementary Table 1). Then, we trained and validated the detection/diagnostic models with the ground truth of the visual conditions using the development dataset at ZOC. In this stage, data collection preceded the development of the quality control module, so raw videos without quality checking were collected. In total, raw videos from 2,632 children undergoing ophthalmic examinations were collected by trained volunteers using the mHealth apps installed on iPhone-7 or iPhone-8 smartphones. After initial quality checking by the quality control module, qualified videos of 2,344 (89.1%) children were reserved as the development dataset, which was randomly split into training, tuning and validation (internal validation) sets using a stratified sampling strategy according to sex, age and the category of ophthalmic disorder to train and internally validate the detection/diagnostic models (Fig. 1b, Extended Data Fig. 3e and Supplementary Table 2). The age distribution and the proportions of children with unilateral and bilateral severe visual impairment for different datasets are shown in Supplementary Tables 15 and 16, respectively. Internal validation refers to the assessment of the performance of the selected optimized model, after training and hyperparameter selection and tuning, on the independent datasets from the same settings as training datasets. The top-performing checkpoint was selected on the basis of accuracy on the tuning set. In particular, the videos utilized for quality control module development did not overlap with those in the detection/diagnostic model validation.
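
To make the frame-level quality control described above more tangible, the sketch below runs the Dlib MMOD CNN face detector, rejects frames with no face or more than one face, localizes the 68 facial key points and crops the facial region. The model file names are the standard public Dlib releases; the crying/interference/occlusion EfficientNet-B2 classifiers and the clip-assembly rules (1.5–5 s clips, more than ten clips per qualified video) are omitted, so this is only an illustrative fragment, not the authors' pipeline.

```python
# Minimal sketch of the per-frame quality control and face cropping (assumptions noted above).
import cv2
import dlib

detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")


def crop_face(frame_bgr):
    """Return the cropped facial region, or None if the frame fails quality control."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    detections = detector(rgb, 1)              # upsample once to help with small faces
    if len(detections) != 1:                   # reject frames with no face or several faces
        return None
    rect = detections[0].rect
    landmarks = predictor(rgb, rect)           # 68 facial key points
    xs = [landmarks.part(i).x for i in range(68)]
    ys = [landmarks.part(i).y for i in range(68)]
    x0, y0 = max(min(xs), 0), max(min(ys), 0)
    return frame_bgr[y0:max(ys), x0:max(xs)]   # facial region only; background removed
```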

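The clip-level classifier of the 'DL framework' subsection above can be pictured with the following minimal PyTorch sketch, which applies an EfficientNet-B4 backbone to every frame, merges frame features by temporal average pooling and maps the pooled feature to a logit that is turned into a probability with a sigmoid (detection) or a softmax (five-category diagnosis). It assumes the EfficientNet-PyTorch package listed under Code availability; the input resolution, head design and training loop are simplified and this is not the released model.

```python
# Hedged sketch of an EfficientNet-B4 clip classifier with temporal average pooling.
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet  # https://github.com/lukemelas/EfficientNet-PyTorch


class ClipClassifier(nn.Module):
    """EfficientNet-B4 on each frame, temporal average pooling, then a linear head."""

    def __init__(self, num_outputs: int = 1):
        super().__init__()
        self.backbone = EfficientNet.from_pretrained("efficientnet-b4")  # ImageNet weights
        self.pool = nn.AdaptiveAvgPool2d(1)       # spatial pooling of each frame's feature map
        self.head = nn.Linear(1792, num_outputs)  # 1792 = EfficientNet-B4 feature width

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = clip.shape                              # (batch, frames, 3, H, W)
        feats = self.backbone.extract_features(clip.reshape(b * t, c, h, w))
        feats = self.pool(feats).flatten(1).reshape(b, t, -1)   # (batch, frames, 1792)
        feats = feats.mean(dim=1)                               # temporal average pooling
        return self.head(feats)                                 # clip-level logits x_i


# Detection task: one logit per clip; p_i = sigmoid(x_i), trained with binary cross-entropy.
model = ClipClassifier(num_outputs=1)
logits = model(torch.randn(2, 4, 3, 380, 380))                  # input resolution illustrative
loss = nn.BCEWithLogitsLoss()(logits, torch.tensor([[1.0], [0.0]]))
probs = torch.sigmoid(logits)
```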

Finding a needle in a haystack test
To estimate the performance of the AIS system in the general population with a rare-case prevalence of visual impairment, we simulated a gradient of prevalences ranging from 0.1% to 9% to conduct a finding a needle in a haystack test. For each simulated prevalence, we resampled 10,000 children based on the internal validation dataset in a bootstrap manner to test whether the AIS system could pick up the 'needle' (visually impaired children at the simulated prevalence) in the 'haystack' (10,000 resampled children) and repeated this process 100 times to estimate the 95% CIs.

Data augmentation
To ensure better model capacity and reliability in complex environments, data augmentation was performed during model training using brightness and contrast adjustments, together with blurring techniques. Specifically, the brightness of the input frames was randomly adjusted by a factor of 0.40, and the contrast was randomly adjusted by a factor of 0.20. Blurring techniques included Gaussian blur, median blur and motion blur. The factor of all blurring techniques was set to five. Each input frame had a probability of 0.50 to perform data augmentation (Supplementary Table 17). All data augmentation processes were based on a publicly available Python library known as Albumentations53.

Multicenter external validation
External validation refers to the assessment of the performance of the AI system using independent datasets, captured from different clinical settings. This is to ensure the generalizability of the system to different settings. Trained volunteers used mHealth apps installed on iPhone-7 or iPhone-8 smartphones to perform external validation in the ophthalmology clinics of the Second Affiliated Hospital of Fujian Medical University, Shenzhen Eye Hospital, Liuzhou Maternity and Child Healthcare Hospital and Beijing Children's Hospital of Capital Medical University. In this stage, the quality control module automatically reminded volunteers to repeat data collection when the videos were of low quality. In total, 305 children were recruited and qualified videos for 301 children (98.7%) were successfully collected. Qualified videos for 298 children undergoing ophthalmic examinations were reserved for final validation of the detection model (see Fig. 1b and Table 1 for details of the participants and the dataset used for external validation).

Implementation by untrained parents or caregivers at home
We further challenged our system in an application administered by untrained parents or caregivers with their smartphones in daily routines (Fig. 1b). Children (independent from the development and external validation participants) were recruited online, and their parents or caregivers autonomously used AIS at home according to the system's instructions to collect qualified videos and perform tests without pretraining or controlling any biases before testing, such as brands and models of smartphones and the home environment. This process generated data with huge variations of distributions that had an extremely high requirement of generalizability and extensibility for the DL-based system. Thus, before final implementation, we performed a pilot study to collect a dataset for fine-tuning our system to chaotic home environments. To efficiently evaluate the performance of AIS for identifying visual impairment in at-home settings, a sufficient proportion of visually impaired children with various ocular diseases were recruited. Of the 125 children recruited, 122 children (97.6%) successfully completed the detection tests and collected qualified videos, among whom 120 children undergoing ophthalmic examinations were enrolled to fine-tune and evaluate the detection model. We fine-tuned the detection model using qualified videos from 32 children collected first and then tested it by the subsequently collected validation set from another 88 children. See Fig. 1b and Table 1 for more information on the fine-tuning and implementation.

Reliability analyses and adjusted analyses
To test the stability and generalizability of AIS under various conditions, investigators conducted a batch of reliability analyses and adjusted analyses (Fig. 1b and Table 1).

Reliability across different smartphone platforms
We applied blur, brightness, color or Gaussian noise adjustments at different gradients to a dataset (n = 200 children and n = 200 qualified videos) randomly sampled from the ZOC validation set to simulate the characteristics of data collected by various cameras and evaluate the reliability of AIS. Furthermore, we collected another dataset in an independent population of children at ZOC to assess the reliability of the AIS system across different operating systems. In total, raw videos from 389 children undergoing ophthalmic examinations were collected by trained volunteers using two Android smartphones, Redmi Note 7 and Huawei Honor-6 Plus. After initial quality checking, qualified videos of 361 (92.8%) children were reserved for testing. The technical specifications of the smartphones used in this study are summarized in Supplementary Table 13.

Retest reliability analysis
We performed detection tests for each child twice by two volunteers at least 1 day apart on another independent population recruited at ZOC to evaluate the retest reliability. Raw videos from 213 children undergoing ophthalmic examinations were collected using iPhone-7 or iPhone-8 smartphones. Qualified videos of 187 (87.8%) children were reserved for retest analysis after initial quality checking (Fig. 1b and Table 1). An intraclass correlation coefficient was calculated for repeated predicted probabilities of the detection model, and a Cohen's κ was calculated for repeated predicted categories to evaluate retest reliability.

Hard-to-spot test
To investigate the influence of the apparency of the phenotypic features on the AIS system, a panel of 14 community ophthalmologists with 3–5 years of clinical experience identified 'likely impaired' children based on the phenotypic videos in the ZOC validation dataset. The true impaired and nonimpaired children were mixed at a ratio of 1:1 during identification. Each case was independently reviewed by three ophthalmologists. When no more than one ophthalmologist provided 'likely impaired' labels for one true impaired child, this child was classified as a hard-to-spot case with insidious phenotypic features rather than a relatively evident case. The performance of the AIS system for relatively evident/hard-to-spot cases was assessed.

Other reliability analyses
We tested AIS under different room illuminance conditions. Photometers (TESTES-1330A; TES Electrical Electronic Corp.) were used to measure the mean room illuminance intensity before and after data collection. The following criteria were applied to estimate the distances between the children and the smartphones to assess the reliability of AIS in different testing distance groups. When most of the vertical lengths of a child's head regions were less than one-third of the height of the smartphone screen at the frame level, the video was determined to be taken from a long distance. When most of the lengths were between one-third and one-half of the height of the screen, the video was judged to be taken from a medium distance, and when most of the lengths were larger than one-half of the height of the screen, the video was judged to be taken at a close distance. For each full-length video, subvideos with various durations were generated to serve as inputs to evaluate the influence of the duration of the video recording on the performance of AIS. We also evaluated the performances of AIS grouped by patient-related factors including sex, age and laterality of the eye disorder.

Adjusted analyses
To further verify that the predictions of this system were not solely mediated by sample characteristics as confounders, we performed adjusted analyses to examine the ORs of the predictions of the system adjusted for sample characteristics leveraging logistic regression models.
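
As an illustration of the adjusted analyses just described, the sketch below fits a logistic regression of the system's prediction on impairment status and sample characteristics and reports exponentiated coefficients as adjusted odds ratios. The column names and the simulated data are assumptions of this sketch and do not reproduce the study data or covariate set.

```python
# Hedged sketch of an adjusted analysis with a logistic regression model (synthetic data).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
impaired = rng.integers(0, 2, n)                    # ground-truth visual impairment (example)
age_months = rng.uniform(1, 48, n)                  # example covariate
sex_male = rng.integers(0, 2, n)                    # example covariate
logit = -1.0 + 2.5 * impaired + 0.01 * age_months   # made-up effect sizes
predicted = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

df = pd.DataFrame({"predicted_impaired": predicted, "impaired": impaired,
                   "age_months": age_months, "sex_male": sex_male})

X = sm.add_constant(df[["impaired", "age_months", "sex_male"]])
fit = sm.Logit(df["predicted_impaired"], X).fit(disp=0)
print(np.exp(fit.params))   # adjusted odds ratios for each covariate
```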

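The training-time augmentation described in the 'Data augmentation' subsection above could be expressed with Albumentations roughly as follows. The parameter mapping (brightness 0.40, contrast 0.20, blur limit 5, overall probability 0.50 per frame) follows the text, but the exact composition the authors used may differ, so treat this as a sketch rather than the study pipeline.

```python
# Hedged sketch of the augmentation pipeline with Albumentations (parameters from the text).
import albumentations as A

augment = A.Compose([
    A.RandomBrightnessContrast(brightness_limit=0.4, contrast_limit=0.2, p=1.0),
    A.OneOf([
        A.GaussianBlur(blur_limit=5, p=1.0),
        A.MedianBlur(blur_limit=5, p=1.0),
        A.MotionBlur(blur_limit=5, p=1.0),
    ], p=1.0),
], p=0.5)  # each input frame is augmented with probability 0.50

# Usage: augmented = augment(image=frame)["image"] on an HxWx3 uint8 frame (e.g. from OpenCV).
```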

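For the 'finding a needle in a haystack' simulation described earlier in this section, one way to implement the prevalence-controlled resampling is sketched below: for each simulated prevalence, 10,000 children are drawn with replacement from the validation set at that prevalence, the detection metrics are recomputed, and the draw is repeated 100 times for percentile confidence intervals. The arrays `probs` and `labels` (per-child predicted probability and ground truth) are assumptions of this sketch.

```python
# Hedged sketch of the prevalence-controlled bootstrap ('needle in a haystack' test).
import numpy as np

rng = np.random.default_rng(0)


def simulate(probs, labels, prevalence, n=10_000, repeats=100, threshold=0.5):
    probs, labels = np.asarray(probs), np.asarray(labels)
    pos, neg = np.where(labels == 1)[0], np.where(labels == 0)[0]
    n_pos = max(1, round(n * prevalence))          # number of 'needles' at this prevalence
    sens, spec = [], []
    for _ in range(repeats):
        idx = np.concatenate([rng.choice(pos, n_pos, replace=True),
                              rng.choice(neg, n - n_pos, replace=True)])
        pred = probs[idx] > threshold
        truth = labels[idx] == 1
        sens.append((pred & truth).sum() / truth.sum())
        spec.append((~pred & ~truth).sum() / (~truth).sum())
    return {"sensitivity": (np.mean(sens), np.percentile(sens, [2.5, 97.5])),
            "specificity": (np.mean(spec), np.percentile(spec, [2.5, 97.5]))}
```
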
Detection model visualization and explanation
Two strategies were used to interpret and visualize the detection model: t-distributed stochastic neighbor embedding (t-SNE) and gradient-weighted class activation mapping (Grad-CAM)54–56. The former was used to visualize the high-dimensional activation status of the deep CNN at the clip level by projecting its feature vector into a two-dimensional space, and the latter was adopted to create a heat map showing the area within each frame of the clip that contributed most to the output class of the network. In practice, the feature vectors output by the temporal average pooling layer and flatten operation and the feature maps output by the last convolutional layer before the temporal average pooling operation were chosen to visualize the results generated by t-SNE and Grad-CAM, respectively. Specifically, 1,200 visually impaired clips and 1,200 nonimpaired clips were randomly selected from the ZOC validation set to perform t-SNE analysis. To generate average heat maps, we randomly sampled ten videos for each ophthalmic disorder from the internal validation dataset. Since each video had multiple clips, we ranked these clips according to the model predicted probabilities and selected the two clips with the highest probabilities. For each selected clip, we took 30 frames at equal intervals to generate the corresponding average heat map. In summary, we had a total of 600 heat maps for each type of disorder, and we summed and averaged these heat maps to obtain the typical heat map for a certain disease. A public machine learning Python library named Scikit-learn was used to generate two-dimensional coordinates of t-SNE results, and Grad-CAM analysis was performed based on an open-source GitHub code set57.
Additionally, we compared the model-predicted probabilities of three groups of clips (clips randomly sampled from videos of nonimpaired children, clips randomly sampled from videos of visually impaired children, and clips annotated by experts as having abnormal behavioral patterns from videos of visually impaired children) to investigate whether the detection model focused on specific behavioral patterns in children (Fig. 3d and Supplementary Table 8).

Triage-driven approach to select equivocal cases for manual review
We assessed a triage strategy to find a solution when the system was likely unreliable by choosing equivocal cases for manual review in the internal validation set. An equivocal case referred to a child predicted by the AIS system with a low confidence value, given by $|p - 0.50|$, where $p$ was the predicted probability for the child. Three ophthalmologists from ZOC with over 10 years of clinical experience vetted the phenotypic videos of the equivocal cases and the AIS predictions in a voting manner. Additional information, including baseline information and medical histories, was provided when necessary. An increasing ratio from 0 to 19% of equivocal cases with the lowest confidence values was chosen for manual review to evaluate this triage strategy.

Statistical analysis
The primary outcomes were the AUCs of the detection/diagnostic models. The secondary outcomes included the accuracy, sensitivity and specificity of the models and the reliability of the detection model under various settings. The 95% CIs of the AUC, accuracy, sensitivity and specificity of the models were estimated. Specifically, the DeLong CIs of AUCs were calculated at the child level. To eliminate bias due to the association of multiple clips for the same child, the bootstrap CIs of the AUCs of the detection model were calculated at the clip level. One clip for each child was randomly taken to form a bootstrap sample, and this process was repeated 1,000 times. Wilson CIs were reported for other proportional metrics. Descriptive statistics, including means, s.d., numbers and percentages, were used. Mann–Whitney U-tests were used to compare means on continuous variables, and Fisher exact tests were used to compare distributions on categorical variables. A two-sided P value of <0.05 indicates statistical significance. All statistical analyses were performed in R Statistics (v.4.1.2) or Python Programs (v.3.9.7), and plots were created with the ggplot2 package (v.3.3.5) in R Statistics.

Computational hardware
Hardware information for this study is shown as follows: graphics processing unit (GPU), Nvidia Titan RTX 24 GB memory × 4, Driver v.440.82, Cuda v.10.2; central processing unit (CPU), Intel(R) Xeon(R) CPU E5-2678 v.3 @ 2.50 GHz × 2, 48 threads; random access memory (RAM), Samsung 64 GB RAM × 8, configured speed 2,133 MHz.

Use of human data
The ethical review of this study was approved by the Institutional Review Board/Ethics Committee of ZOC. The test was prospectively registered at ClinicalTrials.gov (identifier: NCT04237350).

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
The data that support the findings of this study are divided into two groups: published data and restricted data. The authors declare that the published data supporting the main results of this study can be obtained within the paper and its Supplementary Information. For research purposes, a representative video deidentified using digital masks on children's faces for each disorder or behavior in this study is available. In the case of noncommercial use, researchers can sign the license, complete a data access form provided at https://github.com/RYL-gif/Data-Availability-for-AIS and contact H.L. Submitted license and data access forms will be evaluated by the data manager. For requests from verified academic researchers, access will be granted within 1 month. Due to portrait rights and patient privacy restrictions, restricted data, including raw videos, are not provided to the public.

Code availability
Since we made use of proprietary libraries in our study, release of our codes for system development and validation to the public is therefore not feasible. We detail the methods and experimental protocol in this paper and its Supplementary Information to provide enough information to reproduce the experiment. Several major components of our work are available in open-source repositories: PyTorch (v.1.7.1): https://pytorch.org; Dlib Python Library (v.19.22.1): https://github.com/davisking/dlib (frameworks for facial region detection and facial key point localization); EfficientNet-PyTorch: https://github.com/lukemelas/EfficientNet-PyTorch (frameworks for models in the quality control module and the detection/diagnostic models); Albumentations (v.0.5.2): https://github.com/albumentations-team/albumentations (data augmentation); and OpenCV Python Library (v.4.5.3.56): https://github.com/opencv/opencv-python (video data and image data processing).
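
To illustrate the triage rule defined in the 'Triage-driven approach' subsection above, the snippet below ranks children by the confidence value |p − 0.50| and flags the least confident fraction for manual review. The function name, review fraction and example values are placeholders, not the authors' implementation.

```python
# Hedged sketch of the confidence-based triage selection described above.
import numpy as np


def select_for_review(probs, review_fraction):
    """Return indices of the most equivocal children (lowest |p - 0.50|)."""
    probs = np.asarray(probs)
    confidence = np.abs(probs - 0.50)
    n_review = int(round(review_fraction * len(probs)))
    return np.argsort(confidence)[:n_review]


# Example: flag the 20% least confident predictions for manual review.
equivocal_idx = select_for_review([0.93, 0.52, 0.11, 0.47, 0.88], review_fraction=0.2)
```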

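The clip-level bootstrap used for the AUC confidence intervals in the 'Statistical analysis' subsection above can be sketched as follows, assuming a mapping from each child to that child's clip probabilities. This is illustrative only and not the analysis code used in the study.

```python
# Hedged sketch of the clip-level bootstrap CI for the AUC (one clip per child, 1,000 draws).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)


def clip_bootstrap_auc_ci(clip_probs, child_labels, n_boot=1000):
    """clip_probs: {child_id: [clip probabilities]}; child_labels: {child_id: 0 or 1}."""
    children = list(clip_probs)
    aucs = []
    for _ in range(n_boot):
        y_score = [rng.choice(clip_probs[c]) for c in children]  # one clip per child
        y_true = [child_labels[c] for c in children]
        aucs.append(roc_auc_score(y_true, y_score))
    return np.percentile(aucs, [2.5, 97.5])


# Example (toy data):
# ci = clip_bootstrap_auc_ci({"a": [0.7, 0.8], "b": [0.2, 0.4]}, {"a": 1, "b": 0})
```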

References
44. Drover, J. R., Wyatt, L. M., Stager, D. R. & Birch, E. E. The Teller acuity cards are effective in detecting amblyopia. Optom. Vis. Sci. 86, 755 (2009).
45. Mayer, D. L. et al. Monocular acuity norms for the Teller Acuity Cards between ages one month and four years. Investigative Ophthalmol. Vis. Sci. 36, 671–685 (1995).
46. King, D. E. Max-margin object detection. Preprint at https://ui.adsabs.harvard.edu/abs/2015arXiv150200046K (2015).
47. Zhou, E., Fan, H., Cao, Z., Jiang, Y. & Yin, Q. Extensive facial landmark localization with coarse-to-fine convolutional network cascade. In 2013 IEEE International Conference on Computer Vision Workshops 386–391 (IEEE, 2013).
48. Kazemi, V. & Sullivan, J. One millisecond face alignment with an ensemble of regression trees. In 2014 IEEE Conference on Computer Vision and Pattern Recognition 1867–1874 (IEEE, 2014).
49. Bradski, G. The OpenCV library. Dr. Dobb's J. Softw. Tools 25, 120–123 (2000).
50. Tan, M. & Le, Q. EfficientNet: rethinking model scaling for convolutional neural networks. In Proc. 36th International Conference on Machine Learning 6105–6114 (PMLR, 2019).
51. Deng, J. et al. ImageNet: a large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).
52. Chao, H., He, Y., Zhang, J. & Feng, J. GaitSet: regarding gait as a set for cross-view gait recognition. In Proc. Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence Article 996 (AAAI Press, 2019).
53. Buslaev, A. et al. Albumentations: fast and flexible image augmentations. Information 11, 125 (2020).
54. Hinton, G. E. & Roweis, S. Stochastic neighbor embedding. In Advances in Neural Information Processing Systems 15 (Eds. Becker, S., Thrun, S. and Obermayer, K.) 833–840 (NIPS, 2002).
55. Belkina, A. et al. Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets. Nat. Commun. 10, 5415 (2019).
56. Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In 2017 IEEE International Conference on Computer Vision (ICCV) 618–626 (IEEE, 2017).
57. Zuppichini, F. S. FrancescoSaverioZuppichini/cnn-visualisations. GitHub https://github.com/FrancescoSaverioZuppichini/cnn-visualisations (2018).

Acknowledgements
We thank all the participants and the institutions for supporting this study. We thank H. Sun, T. Wang, T. Li, W. Lai, X. Wang, L. Liu, T. Cui, S. Zhang, Y. Gong, W. Hu, Y. Huang, Y. Pan and C. Lin for supporting the data collection; M. Yang for the help with statistical suggestions and Y. Mu for the help with our demo video. This study was funded by the National Natural Science Foundation of China (grant nos. 82171035 and 91846109 to H.L.), the Science and Technology Planning Projects of Guangdong Province (grant no. 2021B1111610006 to H.L.), the Key-Area Research and Development of Guangdong Province (grant no. 2020B1111190001 to H.L.), the Guangzhou Basic and Applied Basic Research Project (grant no. 2022020328 to H.L.), the China Postdoctoral Science Foundation (grant no. 2022M713589 to W.C.), the Fundamental Research Funds of the State Key Laboratory of Ophthalmology (grant no. 2022QN10 to W.C.) and Hainan Province Clinical Medical Center (H.L.). P.Y.-W.-M. is supported by an Advanced Fellowship Award (NIHR301696) from the UK National Institute of Health Research (NIHR). P.Y.-W.-M. also receives funding from Fight for Sight (UK), the Isaac Newton Trust (UK), Moorfields Eye Charity (GR001376), the Addenbrooke's Charitable Trust, the National Eye Research Centre (UK), the International Foundation for Optic Nerve Disease, the NIHR as part of the Rare Diseases Translational Research Collaboration, the NIHR Cambridge Biomedical Research Centre (BRC-1215-20014) and the NIHR Biomedical Research Centre based at Moorfields Eye Hospital National Health Service Foundation Trust and University College London Institute of Ophthalmology. The views expressed are those of the author(s) and not necessarily those of the National Health Service, the NIHR or the Department of Health. The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author contributions
W.C., R.L. and H.L. contributed to the concept of the study and designed the research. W.C., R.L., A.X., Ruixin Wang, Yahan Yang, D. Lin, X.W., J.C., Z. Liu, Y.W., K.Q., Z.Z., D. Liu, Q.W., Y.X., X.L., Zhuoling Lin, D.Z., Y.H., S.M., X.H., S.S., J.H., J.Z., M.W., S.H., L.C., B.D., H.Y., D.H., X.L., L.L., Xiaoyan Ding, Yangfan Yang and P.W. collected the data. W.C., R.L., Q.Y., Y.F., Zhenzhe Lin, K.D., Z.W., M.L. and Xiaowei Ding conducted the study. W.C., R.L. and L.Z. analyzed the data. W.C., R.L., Q.Y., Y.F. and H.L. cowrote the manuscript. D. Lin, X.W., F.Z., N.S., J.-P.O.L., C.Y.C., E.L., C.C., Y.Z., P.Y.-W.-M., Ruixuan Wang and W.-s.Z. critically revised the manuscript. Zhenzhe Lin, Ruixuan Wang, W.-s.Z., Xiaowei Ding and H.L. performed the technical review. All authors discussed the results and provided comments regarding the manuscript.

Competing interests
Zhongshan Ophthalmic Center and VoxelCloud have filed for patent protection for W.C., R.L., A.X., Y.F., Zhenzhe Lin, K.D., K.Q., Xiaowei Ding and H.L. for work related to the methods of detection of visual impairment in young children. All other authors declare no competing interests.

Additional information
Extended data is available for this paper at https://doi.org/10.1038/s41591-022-02180-9.

Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41591-022-02180-9.

Correspondence and requests for materials should be addressed to Xiaowei Ding or Haotian Lin.

Peer review information Nature Medicine thanks Pete Jones, Ameenat Lola Solebo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Michael Basson, in collaboration with the Nature Medicine team.

Reprints and permissions information is available at www.nature.com/reprints.


Extended Data Fig. 1 | The app for data collection. a, The operation interface of the app. b, Using the smartphone for data collection in real-world settings.


Extended Data Fig. 2 | The standard preparation sequence guided by the app for data collection.



Extended Data Fig. 3 | Development of deep learning models of the AIS system. a, Basic building blocks and architecture of EfficientNet. Two model architectures, EfficientNet-B2 and EfficientNet-B4, were used in data quality control and detection/diagnostic tasks, respectively. b, Architecture of the EfficientNet-B2 model. c, Architecture of the EfficientNet-B4 model. d, ROC curves of the models trained for the quality control module. e, The training and tuning curves of the detection model at the clip level. Conv 2d, 2-dimensional convolutional layer; ReLU, rectified linear unit; Temporal Avg Pooling, average pooling along the temporal dimension; ROC curve, receiver operating characteristic curve; AIS, Apollo Infant Sight.


Extended Data Fig. 4 | Performance of the detection model at the clip level. a, ROC curves of the detection model in the internal validation (NI, n = 6,735; mild, n = 8,310; severe, n = 6,685; VI versus NI, AUC = 0.925 (0.914–0.936); mild versus NI, AUC = 0.916 (0.904–0.928); severe versus NI, AUC = 0.935 (0.924–0.946)). b, ROC curves of the detection model in the external validation (NI, n = 7,392; mild, n = 2,580; severe, n = 1,569; VI versus NI, AUC = 0.814 (0.790–0.838); mild versus NI, AUC = 0.802 (0.770–0.831); severe versus NI, AUC = 0.834 (0.807–0.863)). c, ROC curves of the detection model in the at-home implementation by parents or caregivers (NI, n = 947; mild, n = 943; severe, n = 809; VI versus NI, AUC = 0.817 (0.756–0.881); mild versus NI, AUC = 0.809 (0.735–0.884); severe versus NI, AUC = 0.825 (0.764–0.886)). Parentheses show 95% bootstrap CIs. A cluster-bootstrap biased-corrected 95% CI was computed, with individual children as the bootstrap sampling clusters. NI, nonimpairment; VI, visual impairment; ROC curve, receiver operating characteristic curve; AUC, area under the curve; CI, confidence interval.


Extended Data Fig. 5 | Visualization of the clips correctly classified or misclassified by the detection model. a, The t-distributed stochastic neighbor embedding (t-SNE) algorithm was applied to visualize the clustering patterns of clips correctly classified or misclassified by the detection model. b, Distances from true VI and false clips to the center of true VI clips in the t-SNE scatter plot were compared. *P < 0.001 (true VI clip, n = 999; false clip, n = 317; P < 1.00 × 10−36, two-tailed Mann-Whitney U test). c, Distances from true NI and false clips to the center of true NI clips in the t-SNE scatter plot were compared. *P < 0.001 (true NI clip, n = 1,084; false clip, n = 317; P < 1.00 × 10−36, two-tailed Mann-Whitney U test). The thick central lines denote the medians, the lower and upper box limits denote the first and third quartiles, and the whiskers extend from the box to the outermost extreme value but no further than 1.5 times the interquartile range (IQR). VI, visual impairment; NI, nonimpairment.


Extended Data Fig. 6 | The triage-driven approach to select the equivocal cases with the lowest predicted confidence values for manual review. a, The false predicted rate (both false positive and false negative) in different percentile intervals of predicted confidence values. *P < 0.001 (0th–9th, n = 51; 10th–20th, n = 61; 20th–30th, n = 59; 30th–40th, n = 57; 40th–50th, n = 57; 50th–60th, n = 56; 60th–70th, n = 57; 70th–80th, n = 57; 80th–90th, n = 57; 90th–100th, n = 57; 0th–9th percentile versus other percentile intervals, P ranging from 7.92 × 10−8 for 90th–100th to 1.45 × 10−3 for 20th–30th; 10th–20th percentile versus other percentile intervals, P ranging from 2.02 × 10−6 for 90th–100th to 2.02 × 10−2 for 20th–30th; two-tailed Fisher's exact tests). Results are expressed as means and the 95% Wilson confidence intervals (CIs). b, The performance of the triage-driven system with increasing manual review ratios for the equivocal cases. SPE, specificity; SEN, sensitivity; ACC, accuracy.



Extended Data Fig. 7 | Performance of the detection model under blurring, brightness, color, and noise adjustment gradients. a, Cartoon diagram showing adjusting effect on the input data by blurring factors. b, Cartoon diagram showing adjusting effect on the input data by brightness factors. c, Cartoon diagram showing adjusting effect on the input data by color factors. d, Cartoon diagram showing adjusting effect on the input data by noise factors. e, ROC curves of the detection model for identifying visual impairment change by blurring factors (AUCs range from 0.683 for factor 37 to 0.951 for factor 0). f, ROC curves of the detection model for identifying visual impairment change by brightness factors (AUCs range from 0.551 for factor 0.9 to 0.951 for factor 0). g, ROC curves of the detection model for identifying visual impairment change by color factors (AUCs range from 0.930 for factor 70 to 0.952 for factor 20). h, ROC curves of the detection model for identifying visual impairment change by noise factors (AUCs range from 0.820 for factor 1800 to 0.951 for factor 0). NI, n = 60; VI, n = 140; ROC curve, receiver operating characteristic curve; VI, visual impairment; NI, nonimpairment.


Extended Data Fig. 8 | Performance of the AIS system using Huawei Honor-6 Plus/Redmi Note-7 smartphones. a, Comparisons of the predicted probabilities for the AIS system between the nonimpairment, mild impairment, and severe impairment groups. *P < 0.001 (NI versus mild, P = 8.10 × 10−28; NI versus severe, P = 1.51 × 10−27; two-tailed Mann-Whitney U tests). The cross symbols denote the means, the thick central lines and triangle symbols denote the medians, the lower and upper box limits denote the first and third quartiles, and the whiskers extend from the box to the outermost extreme value but no further than 1.5 times the interquartile range (IQR). b, ROC curves of the AIS system with Android smartphones. c, Performance of the AIS system in the across-smartphone analysis. VI, visual impairment; NI, nonimpairment; ROC curve, receiver operating characteristic curve; AIS, Apollo Infant Sight.

