Statement of Purpose
December 15, 2023
I want to design next-generation spatial user interfaces and intelligent tools. This aspiration took root during my middle school years, when the mobile internet was beginning to boom and I was deeply inspired by the design innovations of electronic products such as the iPhone. It led me to a multidisciplinary education at ShanghaiTech University, where I studied computer science, design, and the humanities, and I further refined my skills and taste through research at UIUC and Harvard. I believe human-centered interface design is critical for technology to be applied effectively, and I find it fascinating to explore design challenges through diverse research methods. For these reasons, I am determined to pursue a Ph.D. in HCC at Georgia Tech.
Sparks of Artificial General Intelligence (AGI) and emerging spatial computing platforms open a vast design space for Human-Computer Interaction, and three challenges within it excite me the most. First, spatial computing platforms break free of traditional screen-based interfaces, calling for new input and display methods that take full advantage of the interaction freedom of 3D space. Second, as Large Language Models (LLMs) begin to exhibit AGI-like capabilities, the boundary between human and AI abilities is becoming increasingly blurred; human-AI collaboration therefore requires a clearer division of roles that preserves human values such as creativity. Lastly, as rapid technological advancement places ever-higher demands on people's digital literacy and adaptability, we need to design user interfaces for technologically underprivileged groups, so that technology becomes a new opportunity for personal growth rather than a barrier. Over the past few years, I have conducted research on several projects spanning these themes, with a primary focus on human-centered user interface design. For my Ph.D. studies, I want to take a broader perspective while continuing to address the design challenges around these three themes.
Exploring intuitive interaction design for spatial computing. Recognizing that human gaze reveals spatial attention during interaction, I explored using the eyes as an input source for spatial computing headsets and led a research project on VR gaze interaction. Existing gaze interaction methods mainly use gaze direction for target pointing but struggle with the Midas Touch problem for target selection: distinguishing between 'looking' and 'choosing' intents. The key problem is that gaze direction does not naturally carry reliable information to differentiate these two intentions, so an additional input signal is needed for selection. After analyzing the design space of eye input, we found that visual depth, computed from binocular vision, is an intriguing dimension. For instance, we can look through a window at a distant landscape or focus closer to observe the dust on the glass; that is, we can intentionally change how far we see to selectively observe different objects. Inspired by this, we placed a 'virtual window' in front of users so that they can select an object by shifting their focus closer, which displays the object's information on the window. This idea led to FocusFlow, a hands-free gaze selection method for VR headsets based on visual depth shift. We also designed multiple user studies to evaluate the usability and learnability of this novel eye input technique, and found that immediate and clear visual feedback greatly helps users understand and master the interaction. This intuitive and efficient gaze-depth interaction method has shown value in both general and professional scenarios; it was showcased as a demo at UIST 2023, and the full paper is currently under review. [project page]
Supporting human creativity in future human-AI interaction. The advent of LLMs has
provided a powerful technological foundation for human-AI interaction. During the summer of
2023, I explored the collaboration patterns between designers and LLMs at the Harvard Visual
Computing Group. We aimed to leverage the LLM’s fuzzy matching capability to translate UI