
Enhancing Multi-user Interaction with Multi-touch Tabletop Displays using Hand Tracking

K.C. Dohse (1), Thomas Dohse (2), Jeremiah D. Still (1), Derrick J. Parkhurst (1,3)
(1) Human Computer Interaction Program, (2) Computer Science Department, (3) Psychology Department
Iowa State University
{ kcd | dohse | derrick | jeremiah } @ iastate.edu

Abstract

A rear-projection multi-touch tabletop display was augmented with hand tracking utilizing computer vision techniques. While both touch detection and hand tracking can be independently useful for achieving interaction with tabletop displays, these techniques are not reliable when multiple users in close proximity simultaneously interact with the display. To solve this problem, we combine touch detection and hand tracking techniques in order to allow multiple users to simultaneously interact with the display without interference. Our hope is that by considering activities occurring on and above a tabletop display, multi-user interaction will become more natural and useful, which should ultimately support collaborative work.

1. Introduction

Large displays are useful for information visualization when multiple people must jointly use the information to work together and accomplish a single goal. The social interactions that result from using a shared display can be highly valuable [1]. However, these displays can fail to allow multiple users to simultaneously interact with the information.

Tabletop interfaces can provide a large shared display while simultaneously accepting natural and direct interaction from multiple users through touch detection. For example, multi-touch surfaces that utilize the phenomenon of frustrated total internal reflection (FTIR) have received widespread attention recently [2]. FTIR detection techniques allow the system to track a large number of simultaneous touch points with very high spatial and temporal frequency. FTIR has several advantages over other multi-touch detection technologies, such as being both low cost and scalable [3]. Other computer vision based tracking systems have a limited ability to distinguish touches from near touches, which is an important element of interacting with the table surface.

FTIR tracking alone has two shortcomings compared to other methods of touch tracking. First, each touch in FTIR appears as an independent event. Although inferences based on the distance between touch points can be leveraged to guess which touches are part of the same event, each touch ultimately remains a standalone piece of data. As the number of users and the complexity of their actions increase, so does the probability of incorrectly grouping touch points with a single user. Second, the system is inherently susceptible to spurious IR noise (e.g., poor lighting conditions or flash photography).

To solve the lighting and touch differentiation problems, we augmented a FTIR tabletop display with an overhead camera. Using the camera, hands on or over the table can be tracked using skin color segmentation techniques. With hand coordinates available, touch points can be assigned the ownership necessary to support multiple users and to correctly identify events comprised of multiple touches. This technique works well even when gestures are made by multiple users in close proximity because it does not need to differentiate touches based on closeness. The fusion of hand and touch point locations also increases the robustness of touch sensing in the presence of unwanted IR light because of the redundancy of the point's location.

Additionally, tracking hands allows users to generate interactions without touching the surface, instead making movements above the table. This creates a hybridization of the two interaction techniques that is still being explored.
2. Related Work

Techniques and technologies used for interaction detection on tabletop displays are rapidly maturing, but many researchers are still seeking better methods for capturing natural interactions made by multiple users within the context of real world applications.

There are a number of approaches to tracking user interactions with a tabletop display. One successful method is to use a surface material that is laden with sensors, such as the commercially available DiamondTouch system, which uses a technique where a circuit is capacitively closed when the user touches the table [4]. Interfaces like this one use front projection due to the opaque surface needed for the sensors. Other systems, such as the metaDESK, also use sensors, but integrate them in physical objects that can be manipulated [5].

Another common approach is to use video cameras to track interactions. For example, the HoloWall uses a semi-opaque diffuser that allows infrared (IR) light projected from behind the screen to reflect off of objects at a certain distance from the surface [6]. The TouchLight interface uses two IR cameras to determine when contact with the screen has occurred [7]. Other projection based systems, like ViCAT, use overhead cameras to track hand gestures [8]. This table does not use physical touches on the surface, but instead uses an overhead camera to track hand gestures in order to interact with the display.

Work is also being done to improve the nature of multi-touch interaction itself; these areas of research are as vital to the field as designing new systems to support multi-touch. Examples include designing cooperative gestures to facilitate teamwork [9]. Other research has aimed to build a framework for designing and evaluating multi-touch interactions [10].

3. System

We developed a rear-projection tabletop display with multi-touch sensing capabilities using the principles of frustrated total internal reflection (see Figure 1). A 4' x 3' acrylic diffusing sheet is mounted at waist level and projected on from below through a series of mirrors in order to create a rear-projection tabletop display. Infrared light entering the sides of a second acrylic sheet mounted just above the diffusing sheet is totally reflected at the acrylic-air border, preventing any light from escaping the surface. When touched, infrared light is able to leave the acrylic, creating a spot of light below each point where a finger is in contact with the acrylic. Touch points can then be tracked using an infrared camera in the same optical path as the projector. A visible spectrum camera is mounted above the display for hand tracking via skin detection.

Figure 1: Three users working together using a rear-projection multi-touch tabletop display augmented with hand tracking using an overhead camera.

3.1 Touch Detection

Each frame captured by the camera has a known base image subtracted from it and is then converted to a binary image by setting each pixel to black or white based on a user-defined threshold. The centers of non-contiguous regions in the binary image are extracted as touch locations. Since touches are viewed as atomic, largely analogous to a mouse click, the reduction of a circle of light to a single point makes sense from a design philosophy perspective as well as minimizing memory and network usage. These locations are then compared to the known locations of all other touches to determine whether they represent a new touch or the movement of a finger already in contact with the screen. When new points of light appear, there is a short period where the point is tracked but kept in limbo and not reported to the rest of the system. This minimizes the risk of random environmental noise interfering with an application. Similarly, existing points which are being tracked are kept briefly in limbo after they disappear, in order to continue tracking a point even if it is not visible for a few frames. Each touch is assigned an identification number which it keeps until the finger is lifted from the screen. When no fingers are in contact with the screen, the system resets the identification list and the next touch will be given the identification number zero.
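To make this pipeline concrete, the following minimal sketch re-implements the steps above with the modern Python bindings of OpenCV (the paper used the older C interface, e.g. cvFindContours). The threshold value, limbo duration and frame-to-frame matching radius are illustrative assumptions, not the authors' parameters.

    import cv2
    import numpy as np

    THRESHOLD = 40      # user-defined binarization threshold (assumed value)
    LIMBO_FRAMES = 3    # frames a point spends in limbo before being reported/retired
    MAX_JUMP = 30.0     # max pixel distance to match a touch across frames (assumed)

    def detect_touch_points(frame_gray, base_gray):
        """Subtract the known base image, binarize, and reduce each
        spot of light to a single center point."""
        diff = cv2.subtract(frame_gray, base_gray)
        _, binary = cv2.threshold(diff, THRESHOLD, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        centers = []
        for c in contours:
            m = cv2.moments(c)
            if m["m00"] > 0:
                centers.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
        return centers

    class TouchTracker:
        """Maintains persistent touch IDs; new and vanished points are
        held briefly in limbo, and IDs reset when the table is clear."""

        def __init__(self):
            self.touches = {}   # id -> {"pos", "age", "missing"}
            self.next_id = 0

        def update(self, centers):
            unmatched = list(centers)
            for tid, t in list(self.touches.items()):
                if unmatched:
                    dists = [np.hypot(c[0] - t["pos"][0], c[1] - t["pos"][1])
                             for c in unmatched]
                    i = int(np.argmin(dists))
                    if dists[i] < MAX_JUMP:
                        # movement of a finger already in contact with the screen
                        t["pos"] = unmatched.pop(i)
                        t["age"] += 1
                        t["missing"] = 0
                        continue
                t["missing"] += 1
                if t["missing"] > LIMBO_FRAMES:   # gone too long: retire the point
                    del self.touches[tid]
            for c in unmatched:                   # brand-new points start in limbo
                self.touches[self.next_id] = {"pos": c, "age": 0, "missing": 0}
                self.next_id += 1
            if not self.touches:                  # no fingers: reset the ID list
                self.next_id = 0
            # report only touches that have survived the limbo period
            return {tid: t["pos"] for tid, t in self.touches.items()
                    if t["age"] >= LIMBO_FRAMES}

Holding new points back for a few frames suppresses one-frame IR noise, while the grace period for vanished points keeps an ID alive through brief dropouts, matching the limbo behavior described above.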
3.2 Hand Detection

The use of skin color segmentation has proved successful in a wide range of applications [11, 12]. Yang and Waibel have presented a successful skin-color algorithm that tracks skin in real time [13, 14]. They are able to achieve skin tracking by dimensional reduction of the available color space; the reduction is achieved by targeting the clustered normalized skin colors. According to Yang, Weier and Waibel (1997), color differences between people appear within the intensity dimension rather than in color [15]. This dimensional reduction to normalized skin color is an effective method for skin detection, so we adopted their method of color space dimensional reduction to effectively target spatial regions likely to be skin.

Hands are detected by creating a binary image of the pixels which fall within the appropriate RGB (red, green, blue) ranges. Each contour of the binary image is found using the cvFindContours function provided by OpenCV. Very small contours are assumed to be noise in the image and are discarded, with the remaining elements assumed to be the hands and arms of users. Three pieces of data, the "fingers" point, the "table edge" point and an integer value representing the side of the table, are extracted from each contour and passed over the network to be used in applications. This data is acquired by iterating over the points in each contour to find the points with minimum and maximum Y-axis values. For simplicity we will refer to the two primary opposing sides of the table as side one and side two. Users standing on side one will create contours which have very low Y-axis values at one end and middle-ranged Y-axis values at the opposite end of their arm contour. In this case, the low-value point is known to be the "table edge" point and the high-value point is known to be the "fingers" point. Conversely, users on side two will create contours with very high Y-axis values at the table edge and middle-ranged Y-axis values at their fingertips. Similarly, the system can easily be extended to recognize users on the remaining two sides by doing comparisons with X-axis values.

Because the table often operates in low lighting and each application causes substantially different colors and intensities on screen, all the hand detection parameters for color ranges and sizes are adjusted at runtime. A limitation of the system worth noting is that, to facilitate accurate tracking of hands, applications need to maintain a reasonably constant color and intensity across the screen. Map-based applications, virtual board games and workspaces for collaborative planning, command and control are not adversely affected by this limitation and comprise the focus of our research.
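A minimal sketch of this step is given below, again using the Python OpenCV bindings. The BGR skin range and the minimum contour area are illustrative assumptions (the actual ranges are tuned at runtime, as noted above); the side decision follows the min/max Y-axis rule just described.

    import cv2
    import numpy as np

    SKIN_LO = np.array([60, 40, 90], dtype=np.uint8)     # B, G, R lower bound (assumed)
    SKIN_HI = np.array([180, 160, 255], dtype=np.uint8)  # B, G, R upper bound (assumed)
    MIN_AREA = 800.0    # contours smaller than this are treated as noise (assumed)

    def detect_hands(frame_bgr):
        """Return one (fingers, table_edge, side) triple per arm contour."""
        height = frame_bgr.shape[0]
        mask = cv2.inRange(frame_bgr, SKIN_LO, SKIN_HI)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        hands = []
        for c in contours:
            if cv2.contourArea(c) < MIN_AREA:
                continue                      # discard small noise contours
            pts = c.reshape(-1, 2)
            top = pts[pts[:, 1].argmin()]     # minimum Y-axis point
            bottom = pts[pts[:, 1].argmax()]  # maximum Y-axis point
            # An arm reaching in from side one touches the low-Y border, so
            # its low-Y extreme is the table edge and its high-Y extreme is
            # the fingers; side two is the mirror case.
            if top[1] < height - bottom[1]:
                side, table_edge, fingers = 1, top, bottom
            else:
                side, table_edge, fingers = 2, bottom, top
            hands.append((tuple(fingers), tuple(table_edge), side))
        return hands

Recognizing users on the remaining two sides would follow the same pattern using X-axis comparisons.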

3.3 Event Generation

The touch screen interface requires a more complex event handling system than a standard mouse and keyboard interface, since it must allow applications to treat multiple touches as belonging to a single gesture event. To facilitate robust event recognition from camera tracking data, a publisher-subscriber design pattern was implemented to distribute event generation responsibilities over multiple functions, allowing each function to have a very limited scope. Each subscriber checks whether the raw input from the computer vision components of the system meets its specific criteria to generate an event. To allow each application to control the events the system recognizes, and to have multiple modes of operation within an application, subscribers are dynamically added and removed at runtime by the client application. Client applications need only handle events; the functions that recognize events can be treated as a black box. Data from both the overhead skin tracking and the infrared touch tracking is handled simultaneously by data listeners, allowing events to incorporate both data sources into their detection criteria.
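The sketch below illustrates the publisher-subscriber arrangement described here. The FrameData container, the EventBus name and the example subscriber (including its distance threshold) are hypothetical, intended only to show how recognizers with narrow criteria can be added and removed at runtime.

    from dataclasses import dataclass
    from typing import Callable, Dict, List, Tuple

    @dataclass
    class FrameData:
        touches: Dict[int, Tuple[float, float]]  # touch id -> (x, y), infrared camera
        hands: List[Tuple[Tuple[float, float], Tuple[float, float], int]]  # overhead camera

    class EventBus:
        """Distributes event generation over many small recognizer functions;
        client applications only subscribe/unsubscribe and handle events."""

        def __init__(self):
            self._subscribers: List[Callable[[FrameData], list]] = []

        def subscribe(self, recognizer):
            self._subscribers.append(recognizer)

        def unsubscribe(self, recognizer):
            self._subscribers.remove(recognizer)

        def publish(self, data: FrameData) -> list:
            events = []
            for recognize in self._subscribers:
                events.extend(recognize(data))   # each checks its own criteria
            return events

    def touch_with_owner(data: FrameData) -> list:
        """Example subscriber combining both data sources: a touch event is
        emitted only when a tracked hand is near the touch point."""
        events = []
        for tid, (tx, ty) in data.touches.items():
            for fingers, _edge, side in data.hands:
                if (tx - fingers[0]) ** 2 + (ty - fingers[1]) ** 2 < 50.0 ** 2:
                    events.append(("touch", tid, side))
        return events

An application would call bus.subscribe(touch_with_owner) when entering a mode that needs owned touches and bus.unsubscribe(touch_with_owner) when leaving it, matching the runtime add/remove behavior described above.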
Figure 2: System dataflow; red in the synchronization phase indicates erroneous points removed.

4. Results

Tracking hands creates two important functionalities. The first is the ability to associate a touch with a user. Multi-touch interfaces inherently accept interaction from multiple users; however, it has so far been impossible to discriminate between users in the discreet manner of computer vision skin tracking. User identities are maintained by the overhead tracking information and are not dependent upon an individual's skin tone. By having one user standing on each side of the table, each touch can be associated with a corresponding user with very little risk of error. With the added information of which hand created a touch, role-specific functions can be given to different users for increased collaborative power. Users should avoid overlapping hands to prevent occlusions in the overhead camera's line of sight, which may interfere with tracking. The other main benefit is that, because a hand is continuously tracked, multiple touches created by the same hand can be treated as part of the same event. By doing this, touches no longer need to be treated as singular events, but can be given a history and associated with previous touches. Furthermore, users can now interact with objects on the table without physical contact. One such usage was to have a checker float in front of the user's hand after touching it in a simulation of the checkers board game.

Additionally, hand location data can be used to increase the touch sensitivity without increasing error. The grayscale image is converted to binary by comparing each pixel with a variable threshold value. A lower threshold creates more false positives due to random noise in the image. By tracking hand locations, the system automatically ignores erroneous touch data by removing touch locations with large distances from any hand.
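A sketch of this assignment-and-filtering step, under the same assumptions as the earlier examples: the acceptance radius around each tracked finger point is a hypothetical value, not the authors' parameter.

    import numpy as np

    ACCEPT_RADIUS = 60.0  # px; touches farther than this from every hand are noise (assumed)

    def assign_touches(touches, hands):
        """Assign each touch to the side (user) of the nearest tracked hand;
        touches with no hand nearby are discarded as spurious IR light."""
        assigned, noise = {}, []
        for tid, (tx, ty) in touches.items():
            best_side, best_dist = None, ACCEPT_RADIUS
            for fingers, _table_edge, side in hands:
                d = np.hypot(tx - fingers[0], ty - fingers[1])
                if d < best_dist:
                    best_side, best_dist = side, d
            if best_side is None:
                noise.append(tid)          # e.g. a camera flash or stray IR light
            else:
                assigned[tid] = best_side
        return assigned, noise

Because every accepted touch must be corroborated by a nearby hand, the binarization threshold can be lowered for extra sensitivity without admitting more false positives.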
To allow the system to simultaneously track multiple hands and touches while running computationally intensive programs, we adopted a distributed architecture in which each camera is connected to a separate computer, each of which communicates over a standard network connection. This allows the system to be responsive to user actions with minimal delay while maintaining a high frame rate in the application using only average hardware.

Both cameras are low-cost webcams running at 640 x 480 resolution, with frame rates of approximately 25 fps for touch tracking and 16 fps for overhead hand tracking under regular usage.
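As an illustration of what travels between a camera node and the application machine, here is a hypothetical sender for the hand data. The UDP transport, address and JSON message format are assumptions made for the sketch; the paper does not specify its wire protocol.

    import json
    import socket

    APP_ADDR = ("192.168.0.10", 9000)   # application machine (hypothetical address)

    def send_hands(hands, sock=None):
        """Publish (fingers, table_edge, side) triples from a camera node."""
        sock = sock or socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        message = json.dumps([
            {"fingers": list(map(float, f)),
             "table_edge": list(map(float, e)),
             "side": s}
            for f, e, s in hands
        ]).encode("utf-8")
        sock.sendto(message, APP_ADDR)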
The representation of touches as single points based on their centers, and of hands as finger location points and table edge points, offers enough precision for actions such as button presses while minimizing memory and network usage.

Figure 3: Red reflects touch points determined to be noise; blue shows a user A touch; green shows a user B touch; white reflects detected skin; yellow circles indicate the acceptable spatial region for a touch to occur.

5. Conclusions and Future Work

Augmenting a FTIR multi-touch table with computer vision skin tracking improves collaboration, expands the methods of interaction and increases touch tracking robustness. The overhead camera is able to discreetly track multiple users by establishing identities for each individual when they make their initial actions and tracking them for the remainder of their usage.

Maintaining user identities has two important aspects: preventing users from inadvertently interfering with each other and delegating role-specific functions to different users. When two or more people are using a FTIR interface, touches can be confused by the system and unintentional actions can be performed, but this is overcome when the system can differentiate touches between users. In many collaborative projects it is common for people to have different specialties and assignments; efficiency is increased by allowing different users to interact with the same object or region and have different actions performed.

The system is also able to recognize gestures that occur above the table, so interactions are not relegated to physical contact with the table. Objects on the table can be picked up and moved in a very natural manner because the object follows the hand that selected it. Furthermore, a whole new assortment of natural gestures can be integrated, reducing the number of cumbersome gestures a user needs to learn.
This technology was originally developed for, and is showcased well within, the dynamic environment of military command and control. While planning and reviewing missions, the augmented FTIR multi-touch table can mimic the traditional sandbox interface in virtually every way, while providing additional features impossible to recreate in the real world. The fast-paced and multi-faceted nature of live mission command and control can utilize all of the advantages our table offers. For example, while managing a group of unmanned aerial vehicles, there are many activities that need to be performed by soldiers with different individual objectives, but all of the soldiers have a common goal and can greatly benefit from the shared space that a table provides. User tracking and role assignment effectively act as a gesture multiplier and ensure that no soldier can interfere with another's mission goal in this time-critical activity.

Future work will consist of adding stereo to hand tracking, automatically calibrating skin color parameters, incorporating tangible objects and usability testing. A second overhead camera will be added so that above-table gestures can be tracked in three dimensions by determining the disparity between the two images. To allow more robust skin detection, the system will calibrate the skin color parameters based on the colors being projected on the screen, rather than on predefined values. Tangible objects can be integrated because the skin tracking algorithm can be readily adjusted for any color or intensity, thereby making objects placed on the surface distinguishable from projected images. Current interactions have been fairly basic; usability research will be conducted to improve the merging of touch and above-table interactions.

Our goal is to continue to advance multi-touch technology by removing as many barriers as possible between the user's intentions and the interface's input capabilities. We believe the research efforts discussed in this paper remove some existing barriers and provide users with a more intuitive experience than multi-touch or hand gesture interaction alone.

6. Acknowledgements

We thank the Air Force Office of Scientific Research (AFOSR) for supporting this research.

7. References

[1] Bly, S. A., & Minneman, S. L. (1990). Commune: A shared drawing surface. SIGOIS Bulletin, 184-192.

[2] Kasday, L. (1984). Touch position sensitive surface. U.S. Patent 4,484,179.

[3] Han, J. Y. (2005). Low-cost multi-touch sensing through frustrated total internal reflection. Proceedings of the 18th Annual ACM Symposium on User Interface Software and Technology (UIST).

[4] Dietz, P. H., & Leigh, D. L. (2001). DiamondTouch: A multi-user touch technology. Proceedings of the ACM Symposium on User Interface Software and Technology (UIST), 219-226.

[5] Ullmer, B., & Ishii, H. (1997). The metaDESK: Models and prototypes for tangible user interfaces. Proceedings of the 10th Annual ACM Symposium on User Interface Software and Technology (UIST), Banff, Alberta, Canada.

[6] Matsushita, N., & Rekimoto, J. (1997). HoloWall: Designing a finger, hand, body and object sensitive wall. Proceedings of the ACM Symposium on User Interface Software and Technology (UIST).

[7] Wilson, A. (2004). TouchLight: An imaging touch screen and display for gesture-based interaction. Proceedings of the International Conference on Multimodal Interfaces, 69-76.

[8] Chen, F., Close, B., Eades, P., Epps, J., Hutterer, P., Lichman, S., Takatsuka, M., Thomas, B., & Wu, M. (2006). ViCAT: Visualization and interaction on a collaborative access table. Proceedings of the IEEE Workshop on Horizontal Human-Computer Systems, 59-60.

[9] Morris, M., Huang, A., Paepcke, A., & Winograd, T. (2006). Cooperative gestures: Multi-user gestural interactions for co-located groupware. Proceedings of the ACM CHI Conference on Human Factors in Computing Systems, 1201-1210.

[10] Wu, M., Shen, C., Ryall, K., Forlines, C., & Balakrishnan, R. (2006). Gesture registration, relaxation, and reuse for multi-point direct-touch surfaces. Proceedings of IEEE TableTop: the International Workshop on Horizontal Interactive Human Computer Systems, 183-190.

[11] Ohta, Y., Kanade, T., & Sakai, T. (1980). Color information for region segmentation. Computer Graphics and Image Processing, 13(3), 222-241.

[12] Swain, M. J., & Ballard, D. H. (1991). Color indexing. International Journal of Computer Vision, 7(1), 11-32.

[13] Yang, J., & Waibel, A. (1995). Tracking human faces in real-time. Technical Report CMU-CS-95-210, Computer Science Department, Carnegie Mellon University.

[14] Yang, J., & Waibel, A. (1996). A real-time face tracker. Proceedings of the 3rd IEEE Workshop on Applications of Computer Vision, Sarasota, Florida, 142-147.

[15] Yang, J., Weier, L., & Waibel, A. (1997). Skin-color modeling and adaptation. Technical Report CMU-CS-97-146, Computer Science Department, Carnegie Mellon University.