
Real-time augmentation of a children’s card game

Jordan Rabet
Stanford University
jrabet@stanford.edu

Abstract—Still today, trading card games make up a sizable part of the entertainment industry. However, their tired paper form is being rapidly taken over by digital equivalents. In this paper, we propose a way to reinvigorate interest in physical trading card games by making them more interactive through real-time 3D augmentation. While our ideal application would run on an AR headset such as the Hololens, we build a proof-of-concept application on a commercially available Android tablet which is able to augment existing Pokemon trading cards. We show that using a modern CPU in conjunction with a GPU, this task is entirely tractable on a mobile device with few restrictions.

I. INTRODUCTION

Many trading card games are based on the idea of cards representing monsters or other entities which are used to fight one another. Playing one prominent example of such a game, Yu-Gi-Oh, is often portrayed in media as being accompanied by holograms representing those monsters and their actions, which make the game more fun and engaging to play. The goal of this project is to build a mobile application which is able to augment an existing trading card game in a similar fashion. While the ideal target for such an application would be an augmented reality headset, the application was developed for a commercially available Android tablet (specifically, the Nvidia SHIELD tablet we have been provided) because of the current lack of such headsets as consumer products.

A. Previous work

There are several commercial products which offer similar functionality to this project. The most notable of these is the Playstation 3 game "The Eye of Judgment", developed by Sony Computer Entertainment and released in 2007. This game implemented the same concept of augmenting physical gamecards to make playing more engaging; however, it differed in several key aspects:
1) Game cards were designed specifically for the purpose of being augmented in this game. In fact, they were all marked with special tags meant to make detection and classification easier: CyberCodes [1].
2) The scene was controlled: players had to place a paper mat supplied with the game to place their cards on. Additionally, the camera had to be mounted above said mat using the supplied camera stand. The camera had to remain immobile for the duration of the game.
3) The game ran on the Playstation 3, a powerful non-mobile device, more powerful than modern tablets.
4) The camera used (the Playstation Eye) was designed for computer vision purposes, and was therefore capable of streaming uncompressed video at high framerates (up to 120 frames per second).
A more recent, similar example is "Drakerz", a game released in 2014 for the PC. While it does not require a playmat, it still requires that the camera be immobile and mounted above the playing field. It also runs only on modern computers.

B. Contributions

The main contribution of this project is building an end-to-end system which is able to detect, classify, track and augment multiple commercially-available trading cards at once in real time on a mobile device. It mostly differs from previous comparable applications by the fact that the camera need not be fixed, that the required computations were optimized for a mobile device, and that the gamecards being augmented were not designed for the purpose of augmentation, making the task more challenging.

II. TECHNICAL EXECUTION

A. System overview

Our system's goal is to take sequential camera frames as input, detect Pokemon game cards in them, track their position through time and finally augment them with the appropriate 3D representation. In order to achieve this goal, we divided our system into five distinct components:
• Card detector: takes a single frame as input and attempts to find all cards it contains, outputting a list of 2D quads corresponding to the detected cards.
• Card classifier: takes a single image extracted and rectified from a frame (likely by the card detector) and matches it against known game cards to find which one
it is. If found, returns the name of the corresponding Pokemon as well as the quad's proper orientation.
• Card tracker: takes frame n − 1, the location of a card at frame n − 1 (in the image plane) as well as frame n as input, and outputs the location of that same card (in the image plane) at frame n.
• Pose estimator: takes 4 points in the image plane representing a square in world space and outputs its world space position and orientation.
• Scene augmenter: takes a single frame accompanied by the corresponding 3D locations and orientations of cards in world space and outputs the scene augmented with 3D models on top of the corresponding cards.

The system's flow can be seen in Figure 1. The main loop consists of the latest camera frame being sent to the card tracker, with the tracker's output being fed to the pose estimator, which in turn feeds the scene augmenter. When initializing the system, the camera's latest frame is also sent to the card detector, which uses the card classifier to identify cards before injecting its findings into the card tracker. The card detection system is run at a much lower frequency than the tracker due to its higher computational requirements. In practice, for the purposes of our tests, the detector was only run when the user tapped the tablet's screen, which ended up being sufficient thanks to the tracker's robustness. For the final product, however, we envision the detector continuously running in its own thread/core in parallel to the main loop.

Fig. 1. Diagram representing our augmentation system's organization.

In order to simplify the problem, we currently make the assumption that all cards are located in the same plane, and that the camera's intrinsics are known (i.e., we assume that we know the camera matrix K).

B. Card detector

The game card detector's goal is to allow the system to automatically find relevant targets in the current scene without having the user manually tag them. Unlike previous works, our application is built to work with the approximately 10,000 Pokemon cards already in circulation [5], meaning we cannot equip them with specially designed markers making detection easier. That being said, as can be seen in Figure 2, Pokemon cards do have a very distinctive feature: their thick, yellow border, which is what our detector targets. Interestingly, most trading card games have a similarly thick border on all their cards (though it isn't usually yellow), making the idea of detecting cards by looking for their border applicable to more than just the Pokemon trading card game.

Fig. 2. Sample Pokemon card. Note the thick yellow outline.

The borders we are looking for have two main characteristics: as thick borders to a quadrilateral, we can expect them to show up as two quadrilaterals in edge maps, and being bright yellow, we can expect to be able to distinguish them from the rest of the scene based on color. As such, the first step in our detection process is finding all pixels which might belong to a card border based on color. In order to do this, we first apply a white balance filter in order to correct colors and make sure our detector will work in different lighting environments. In the current version, this is done using global histogram equalization on individual RGB channels. While this method was only supposed to be a baseline, it ended up yielding very good results under multiple different lighting conditions, so it was kept. Then, we convert the color corrected image to the HSV color space, where we apply a simple threshold function to keep only pixels which are approximately the right color. The values for this threshold function were manually tuned and chosen to be more permissive than restrictive; in most cases, this filter will catch a large number of pixels outside of card borders, but this is remedied later in the detection pipeline.

Then, we compute an edge map of our image using the Canny edge detection filter applied to a Gaussian-blurred version of the frame. We then blur the resulting edge map using a flat 5x5 kernel, blur the color threshold image with a 21x21 Gaussian kernel, and compute the element-wise product of those two images. The result is a bitmap where all lit pixels have to be near both a well defined edge and the color yellow. An example of those steps can be seen in Figure 3. As can be seen, while the color map and edge map show a lot of data which is not related to the targets, once put together they yield a decently accurate search area.

With this bitmap generated, the goal becomes to isolate
parts of it which might be cards and eliminate those which
aren’t. This is done by finding all connected components in
the image. This can be done rather efficiently in a single
top-to-bottom left-to-right pass on the image, by connecting
each lit pixel to its left and top neighbors if either is also
lit, and merging components together if they both are. That
done, we filter connected components to eliminate unlikely
candidates; doing this, we assume that there should be a 1:1
mapping between components and cards, and therefore we
discard components which are too small (likely noise), those
which are contained in another component (likely yellow
markings on the card’s illustration), as well as components
which do not have the right proportions (likely due to noise,
glare, or an extraneous and unfortunately colored object). The
results of this filter can be seen in Figure 3.
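The single-pass labeling described above can be sketched with a union-find structure. The following is an illustrative pure-Python sketch (the helper names are ours), not the project's actual C++ implementation:

```python
def connected_components(bitmap):
    """Label 4-connected lit pixels in one top-to-bottom, left-to-right pass."""
    h, w = len(bitmap), len(bitmap[0])
    parent = {}  # pixel -> parent pixel (union-find forest)

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two components

    for y in range(h):
        for x in range(w):
            if not bitmap[y][x]:
                continue
            parent[(y, x)] = (y, x)
            if x > 0 and bitmap[y][x - 1]:   # connect to left neighbour if lit
                union((y, x - 1), (y, x))
            if y > 0 and bitmap[y - 1][x]:   # connect to top neighbour if lit
                union((y - 1, x), (y, x))

    comps = {}
    for p in parent:
        comps.setdefault(find(p), []).append(p)
    return list(comps.values())
```

The union step is what handles the case where the left and top neighbours belong to two previously distinct components, merging them as the paper describes.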

With only components deemed likely to be an actual card's border left, we consider each of them individually. This means taking each component's bitmap and computing its bitwise AND with the original edge map, which gives us an edge map that should correspond to the border; we then use this edge map to fit a quadrilateral representing the card. An example of the output of the AND product
can be seen in Figure 4. Quad fitting here is done using the
Hough transform, and is a fairly simple process. We run the
Hough line detection algorithm (with an angular resolution of
1 degree) on the card candidate’s edgemap, which returns a
list of local maxima lines sorted by number of votes. We go
through this list in descending order of votes and only keep
lines which are different enough from previously picked lines,
in order to only keep the best lines of each cluster, stopping
at a maximum of 8 lines (though we typically end up with
fewer than 6). That done, we go through combinations of the
lines, computing all possible intersections, and only keeping
combinations which result in exactly 4 intersections within
the region of interest. Assuming there is more than one such
combination, we pick the one which results in the largest area
quad, which becomes our initial estimate for the card’s border.
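The combination search just described can be sketched as follows. This is a simplified, pure-Python illustration under our own conventions: lines are given as coefficients (a, b, c) of ax + by = c rather than Hough (ρ, θ) pairs, and the helper names are hypothetical, not the project's code:

```python
from itertools import combinations
import math

def intersect(l1, l2):
    """Intersection of two lines a*x + b*y = c, or None if near-parallel."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        return None
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def quad_area(pts):
    """Shoelace area of 4 points, ordered by angle around their centroid."""
    cx = sum(p[0] for p in pts) / 4.0
    cy = sum(p[1] for p in pts) / 4.0
    ordered = sorted(pts, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    s = 0.0
    for i in range(4):
        x0, y0 = ordered[i]
        x1, y1 = ordered[(i + 1) % 4]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def fit_quad(lines, roi):
    """Among all 4-line combinations, keep those whose pairwise
    intersections give exactly 4 corners inside roi=(xmin, ymin, xmax, ymax),
    and return the largest-area quad, as in the detector described above."""
    xmin, ymin, xmax, ymax = roi
    best, best_area = None, -1.0
    for combo in combinations(lines, 4):
        corners = []
        for l1, l2 in combinations(combo, 2):
            p = intersect(l1, l2)
            if p and xmin <= p[0] <= xmax and ymin <= p[1] <= ymax:
                corners.append(p)
        if len(corners) == 4:
            area = quad_area(corners)
            if area > best_area:
                best, best_area = corners, area
    return best
```

An extraneous Hough line whose intersections fall outside the region of interest never produces exactly 4 in-region corners, so its combinations are rejected, which is the behaviour the paper relies on.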

We then refine this estimate by using a targeted version of the Hough transform with an angular resolution of 1/128 degree, whose angular space is limited to angles less than 0.5 degrees away from our initial estimate. This allows us to
get a better angular estimate of the lines simply by running
the Hough line detection algorithm again, at no extra cost
compared to the one run earlier in the detection algorithm.
The results of the detector are then used to compute a
homography between the found borders and a rectangle with
the dimensions of a card, which is in turn used to perspective-
rectify cards. Those rectified images are then sent to the
classifier.
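The rectification step can be illustrated with a small direct linear transform (DLT) solver. This pure-Python sketch (the solver and helper names are ours, not the project's code) computes the 3x3 homography mapping the four detected border corners onto a card-shaped rectangle:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for an n x n system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def homography(src, dst):
    """3x3 homography mapping 4 src points onto 4 dst points (DLT, h33 = 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_h(H, p):
    """Apply a homography to a 2D point (homogeneous divide included)."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

In the real system the resulting homography would be used to warp every pixel of the card region, producing the rectified image handed to the classifier.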

C. Card classifier
Image classification was not the focus of this project, so in the interest of time, few of our resources were put into it. As such, the chosen classifier is not particularly

Fig. 3. Sample run of the detection pipeline. From top to bottom: original image, color-filtered image, edge map, "border map", filtered connected components (each color represents a different component).

well adapted to the problem; it was mostly made as a proof-of-concept, as well as to take advantage of the geometry consistency test's result in the detector and to make the final augmenter's output look nicer without having to manually pick the types of the detected cards.

The method which was implemented is a simple version of a bag-of-words classifier. First, we extract SURF features from training images in order to cluster them into a visual vocabulary. Then, we train a 1-versus-all SVM for each of the target images based on its histogram response to the full visual vocabulary. With that done, classification can be performed by taking the rectified card candidate query image extracted by the detector, extracting SURF features from it, matching those features to the vocabulary, computing the histogram response and running it through all the SVMs. We then get a score indicating the confidence in the query image matching each of the given training images. We take all n cards which have a confidence score above a certain threshold, and then run a geometry consistency test on them, by matching SURF features from the query image with those from the training images directly and computing a homography using RANSAC. Finally, the training image whose homography is computed with the highest number of inliers is chosen as the one which matches the query image. Doing this geometry consistency test is especially useful because it gives us the query image's absolute orientation, which in practice is necessary information for our pose estimator.

As previously mentioned, classification is not the focus of this project; as a result, this approach is fairly slow and not immediately scalable to large numbers of card types. In practice, given the high number of individual Pokemon (over 700, each of which requires its own 3D model, textures, animations, sounds...) and of individual Pokemon cards (on the order of 10,000), one could imagine hosting the classifier as a service on a remote server to which the client would send query images, and which would in return send information on the card accompanied by the assets necessary for augmentation.

D. Card tracker

The goal of the card tracker is to determine the movement of a card between two frames so that neither the detector nor the classifier has to be run again each frame, which is of course desirable in order to achieve good performance and smooth augmentation. The tracker works separately for each card in order to make independent card movement possible, though tasks which can be batched together are, for performance reasons. The tracker is first initialized for a card when it receives an initial position from the card detector. When that happens, the tracker detects Shi-Tomasi "Good Features to Track" [4] as implemented in OpenCV's goodFeaturesToTrack method and saves those which are located within the initial quadrilateral estimate.

Fig. 4. Sample product of a connected component by the edge map. Left: original edge map. Right: product of the edge map by the connected component border.

When a new frame is received, the first thing done by the tracker is computing KLT optical flow for the card's features. This is done to get an initial estimate of the motion between the two frames: since our target is planar, a homography is computed between the frames using those feature matches (with RANSAC for outlier resilience), and this homography is then applied to the card's four corners. Unfortunately, doing just this is not enough: this kind of optical-flow based tracking is extremely prone to drift, especially in low resolution and noisy environments. While it might be able to maintain a good estimate of the card's location, the card's shape slowly changes over time, which is extremely problematic given that the card's shape being accurate is essential to good pose estimation.

Due to this, the optical flow tracking is only used as an initial estimate each frame. Once this initial estimate is done, it is used as a search region for the actual new position of the card. A quad of slightly larger size is used as region of interest in an edge map (again computed using the Canny edge detector), which is then fed to the same quad fitting algorithm as the one previously described in the card detector, with a few differences. First, the edge map has to be filtered. Instead of relying on color (which can be unreliable, partly due to glare) to isolate the border, we filter out the card's contents by only keeping the left-most and right-most pixels on each row, as well as the top-most and bottom-most pixels on each column. This filtering can be done efficiently and in practice gives completely usable results, as can be seen in Figure 5. Additionally, instead of having the quad fitting algorithm look for the quadrilateral with the largest area, it instead looks for the one that minimizes the cumulative distance between the old and new quads. In case the tracker is unable to fit a quad, it sticks with the initial estimate based on optical flow. An overview of the card tracker can be found in Figure 6.

Fig. 5. Sample product of the edge map border filter. Left: original edge map. Right: border-filtered edge map.

Fig. 6. Diagram representing the card tracking system.

Another problem which the tracker has to deal with is sudden camera movements, both big and small, which throw off the KLT tracker. Indeed, a problem with the approach presented above is that if the optical flow tracker's initial estimate is too far off, then the quad fitting algorithm will fail to recover from it. While smooth movements typically result in good initial estimates, quick camera jerks (typically unintentional) can ruin the tracker's accuracy. In order to deal with this, we first attempt to detect those instances by computing the variance of the distance between corresponding card corners as estimated by the optical flow tracker.
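This corner-variance test can be sketched as follows; the threshold value here is hypothetical (the paper tunes it empirically), as are the function names:

```python
import math
from statistics import pvariance

def corner_motion_variance(old_quad, new_quad):
    """Variance of per-corner displacement between two quad estimates."""
    dists = [math.dist(a, b) for a, b in zip(old_quad, new_quad)]
    return pvariance(dists)

def is_camera_jerk(old_quad, new_quad, threshold=4.0):
    """Flag frames where the optical flow estimate likely failed: a rigid,
    smooth motion moves all four corners by similar amounts (low variance),
    while a bad estimate drags corners by very different amounts."""
    return corner_motion_variance(old_quad, new_quad) > threshold
```

When the test fires, the system described above falls back from a homography to a RANSAC-fitted affine transform, which preserves the card's shape.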

The idea is that if the movement is smooth and the timestep is small enough, then the shape of the card will have changed relatively little between two sequential frames, meaning that the variance will be low. On the other hand, if the movement is not smooth and the optical flow tracker produces an estimate which is largely off, then the variance will be high. In practice, we were able to find a threshold on that variance above which we determine that a problem occurred with the optical flow tracker. When this happens, we replace the homography computation with a (still RANSAC-based) estimation of an affine transformation between the matching features. This way, we are able to keep a sane shape for the card without sacrificing the knowledge the optical flow tracker gives us about the card's likely translation and rotation between the two frames. Implementing this largely helped improve our tracker's robustness.

In practice, we observed that this tracker performs quite well, is robust to a number of potentially problematic situations (partial occlusion, glare, camera jerk, slight motion blur), and has a relatively low performance overhead thanks to optimizations which are described below.

E. Pose estimator

For every frame, once we are confident that we have found a reasonably good estimate of a card's corners in the image plane, we need to determine the card's pose in world space, which entails finding its position (represented by a translation vector) as well as its orientation (represented by a 3x3 rotation matrix). There are many so-called PnP (Perspective-n-Point) methods which can be used to estimate the camera's pose relative to an object given the object's three dimensional shape and its corresponding image plane points. These methods can in fact be directly applied to this problem (the baseline we used for this part of the system was OpenCV's solvePnP method); however, they are not ideal for multiple reasons. The first reason is that, being geared towards a more general problem (that of an arbitrary 3D object), they are less than optimal for our problem, which only involves planar objects, and performance is definitely a concern for real-time mobile applications. The second reason is that the best PnP methods are iterative and only converge towards a single solution, which is problematic as, in the case of planar objects, it was shown that there are in fact two local minima which each yield a separate solution pose to the problem [2]. In order to deal with this ambiguity, it is necessary to know what both of these solutions are, making generic PnP solutions inadequate. An example of the pose ambiguity problem can be seen in Figure 7.

Fig. 7. An example of pose ambiguity; both images represent correct mathematical solutions to the pose estimation problem.

Thankfully, there exists a method made to estimate the camera's pose relative to a square marker which deals with both of those issues [3]. Essentially, the method introduces a parameterization of the pose which depends on a single variable, the primary angle β, which both allows us to find the two ambiguous solutions and to make extensive use of look-up tables (LUTs) to accelerate the process. Making use of both the original paper and a Matlab implementation of the algorithm provided by the authors, we were able to recreate the algorithm
in C++ and integrate it into our system. Of course, because the algorithm is made to work with square targets while ours are rectangular, we apply the right scaling transform before calling the pose estimator. This makes it especially critical that the card's orientation be known, as applying the scaling along the wrong direction would yield a wrong pose.

In order to deal with pose ambiguities, we rely on having more than a single card tracked by our system. Each card pose yields 4 points in 3D, on which we can use RANSAC to fit a plane. Given the nature of the ambiguities, this should allow us to find the plane the cards are actually located on, since we assume that all cards are coplanar. Then, using this plane's normal vector, we can, for each card, find the pose which is closest in orientation to that plane, which should be the pose we are after.

In addition to this, we make use of the fact that all cards are located in the same plane to deal with potential outlier cards. For example, there are situations in which 3 cards will have been tracked properly, but the fourth's corners are improperly registered for a few frames. By default, the fourth card's pose would be computed as being completely different from the others'. In order to deal with this, we cluster the plane normals extracted from each card's pose, determine which normals might be outliers, and then compute the plane normal as the average of the inlier normals. We then reinject that average normal into each card's pose by setting it and then applying Gram-Schmidt orthonormalization to the rotation matrix.

F. Scene augmenter

The scene augmenter module was made using OpenGL ES 2.0 for rendering. It works by first transferring the current frame to the GPU as a texture and rendering it as a flat image, clearing the depth buffer at the same time. Then, a projection matrix is computed for the actual augmentation. This matrix is based on the camera matrix K, but is modified in order to preserve depth information, which is needed for OpenGL rendering. If we have:

K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}

then the projection matrix is as follows:

P = \begin{pmatrix} 2f_x/w & 0 & 2c_x/w - 1 & 0 \\ 0 & 2f_y/h & 1 - 2c_y/h & 0 \\ 0 & 0 & (n+f)/(n-f) & nf/(n-f) \\ 0 & 0 & 1 & 0 \end{pmatrix}

where w and h are the frame's width and height respectively, n is the near plane's distance and f is the far plane's distance. The models used for augmentation are rendered using an OpenGL ES 2.0 OBJ parser and renderer written for the occasion.

G. Platform-specific optimizations

In order to get this complex system to run at acceptable speeds on a mobile device, many optimizations specific to the platform had to be made. The first one is of course the use of look-up tables for pose estimation, which was mentioned above. The second one has to do with the quad fitting algorithm, also described above. As it is used in the tracker, it is important that it be very efficient. Its initial implementation, based on the HoughLines function defined in OpenCV, was found to perform very poorly. After investigation, it turned out that while the function was based on look-up tables, those were re-generated each time the function was called. Additionally, parts of the function made use of doubles, which are far slower than single precision floats on our target device. In order to improve this design, the Hough line function was reimplemented, based on the OpenCV code, but changing the way look-up tables are generated (only once per resolution value), removing all the code which used double precision floats, and actually doing away with floats in some parts of the function by switching the look-up tables to fixed point math. No loss in precision was measured, while the average call to the function was made 6 times faster.

Another costly operation in the tracking pipeline was found to be the sparse KLT optical flow calls. First of all, the individual calls made for each card's tracking were mutualized into a single call for all tracked features at once, which helped make the use of optical flow more tractable. Additionally, the optical flow computation was completely moved from the CPU to the GPU using CUDA, which made the process twice as fast. Similarly, the computation of the Canny edge map (after the Gaussian kernel) was changed to be done using CUDA, multiplying the speed threefold. A related design choice had all "fundamental" image transformations, such as the edge map computation, color correction and conversions, be stored in a single "frame sequence" object in order to make sure that the same computation would never be done more times than needed each frame.

Finally, in order to make use of the fact that our target device has not only powerful GPGPU capabilities but also a quad core CPU, the card tracking process was changed to make good use of multithreading by spawning 4 worker threads, each able to process an individual card's tracking in parallel to the others. In most cases, this multiplied the tracker's overall speed by 4. A visualization of the final tracking pipeline can be seen in Figure 8.

III. RESULTS

A. Performance

As is indicated by the previous section, performance was a major concern while making this application. In the end, we were able to achieve an overall average of 14 frames per second in most situations with 5 or fewer cards, which is more than enough to be considered a real-time application. In Table I, the average time taken for each task of the main loop can be found.
Fig. 9. Graph representing the number of frames in which the tracker successfully fit a quad, over time, in a scene with a total of 6 cards being tracked.

Fig. 8. Diagram representing the current tracking pipeline's distribution across computing devices.

TABLE I
PERFORMANCE OF THE MAIN LOOP, RECORDED WHEN TRACKING 4 CARDS.

Task                        | Device       | Average time taken
Sparse optical flow         | GPU          | 19.6 ms
Canny edge map              | GPU          | 19.3 ms
Single Hough transform      | CPU          | 10.2 ms
Quad fitting                | CPU          | 23.6 ms
All cards tracking update   | GPU and CPU  | 45.5 ms
Single card pose estimate   | CPU          | 0.58 ms
All cards pose estimate     | CPU          | 3.2 ms
Processing entire frame     | GPU and CPU  | 73.5 ms

What we can see is that the optimizations made to the system's various modules were very effective. For instance, each pose estimate is done in only half a millisecond; similarly, the multithreading used to parallelize card tracking results in very little overhead, allowing the application to process 4 cards almost as fast as a single core is able to process a single one.
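As a quick sanity check, the per-frame total in Table I is consistent with the reported framerate:

```python
frame_time_ms = 73.5            # "Processing entire frame" from Table I
fps = 1000.0 / frame_time_ms    # ~13.6 frames per second
assert 13.5 < fps < 14.0        # matches the reported average of ~14 fps
```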
B. Quality of augmentation

It is difficult to accurately gauge the accuracy of the augmentation as we do not have any kind of ground truth, especially in the case of orientation. A (flawed) metric which can be used to somewhat assess the quality of the tracker is the number of cards for which the tracker is able to fit a new quad each frame. A graph representing this metric can be found in Figure 9; interpreting it, we can see that most cards don't seem to go long without getting a new quad fit to prevent drift, except for one which apparently doesn't get a new fit for the entire middle section of the sequence.

Qualitatively, we can say that the tracker manages to be robust to a number of problematic situations. For example, it is able to be robust to glare in certain situations, as can be seen in Figure 10. Additionally, it is able to withstand partial occlusions for short periods of time, as can be seen in Figure 11. It is also robust enough that a variety of angles work well, even very low angles which let the user look at augmented models from the side, as can be seen in Figure 12. More impressively, it is robust enough that having a user move cards with their finger while the camera itself is being moved around is not a problem, be it rotations or translations, as can be seen in Figure 13.

Fig. 10. An example of the tracker showing robustness to glare (bottom left card).

That being said, while the tracking is overall robust and accurate, the final result is not perfect. Unfortunately, the output still suffers from intermittent jitters, typically due to poor quad fits happening in certain frames. This indicates that tracking could still be improved, as could the normal vector outlier rejection mentioned previously.

A video showing the augmenter running under a variety of conditions (including the ones described above) can be found attached to this paper.

IV. FUTURE WORK

While this project's results are very encouraging, they do not make a full product. Due to lack of time, not all potential optimizations and features could be included in this project. For example, the tracking pipeline presented in Figure 8, while already producing good enough performance, leaves something to be desired in that it does not make optimal use of all computing devices at all times. Instead, one could imagine a tracking pipeline closer to that described in Figure 14, which has the GPU processing data for frame n while the CPU is still processing data from frame n − 1; such a pipeline could as much as double the currently observed framerate.

Fig. 14. Diagram representing a possibly better tracking pipeline's distribution across computing devices.

Fig. 11. An example of the tracker showing robustness to partial occlusion (bottom right card).
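The pipelining idea above (GPU working on frame n while the CPU consumes frame n − 1) can be sketched with two threads and a bounded hand-off queue; the stage functions and names here are toy stand-ins, not the project's code:

```python
import threading
import queue

def run_pipeline(frames, gpu_stage, cpu_stage):
    """Two-stage pipeline: the 'GPU' stage works on frame n while the
    'CPU' stage is still consuming its output for frame n - 1."""
    handoff = queue.Queue(maxsize=1)   # at most one frame in flight between stages
    results = []

    def producer():
        for f in frames:
            handoff.put(gpu_stage(f))  # e.g. edge map + optical flow on the GPU
        handoff.put(None)              # sentinel: no more frames

    def consumer():
        while True:
            item = handoff.get()
            if item is None:
                break
            results.append(cpu_stage(item))  # e.g. quad fitting on the CPU

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start(); t1.join(); t2.join()
    return results

# toy stages standing in for the edge-map/optical-flow and quad-fitting work
out = run_pipeline(range(5), gpu_stage=lambda f: f * 2, cpu_stage=lambda e: e + 1)
```

With stages of comparable cost, the two threads overlap almost fully, which is where the potential doubling of the framerate comes from.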
Optimization is not the only part of the project which could
be expanded upon. Indeed, although the project runs on a
mobile device and requires an accurate estimate of the device’s
location relative to its environment for proper augmentation, it
currently makes no use of any of the on-board inertial sensors.
One could imagine that integrating signals from those sensors
could help make tracking even more robust, as well as give a
good heuristic to help decide between ambiguous poses in the
pose estimator.
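One such heuristic can be sketched directly: if the cards lie on a roughly horizontal surface, the true pose among the two ambiguous solutions should be the one whose plane normal points opposite to gravity as measured by the accelerometer. This is a speculative sketch with hypothetical names, not something implemented in the project:

```python
import math

def pick_pose(candidate_normals, gravity):
    """Choose between ambiguous pose solutions using an IMU reading:
    for cards lying flat, the card plane's normal should be close to
    the negated gravity direction ('up')."""
    def unit(v):
        n = math.sqrt(sum(c * c for c in v))
        return [c / n for c in v]

    up = unit([-g for g in gravity])

    def alignment(normal):
        n = unit(normal)
        return sum(a * b for a, b in zip(n, up))  # cosine of angle to 'up'

    return max(candidate_normals, key=alignment)
```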
Finally, a big part of trading card games is actually playing with them. While augmenting cards with 3D models already helps make the game more engaging, it is not enough for an application which aims to fully augment card game duels, by for example animating Pokemon attacking other Pokemon. While the underlying rendering part is trivial, detecting that a player is commanding one of their creatures to attack is not. Doing so would require maintaining a coherent map of the battlefield, including attached semantics such as card ownership and card status (for example, detecting that a card is sideways, which in a lot of card games is indicative of an action). One could imagine implementing some kind of card-based gesture recognition system to allow the application to fully follow the duel.

Fig. 12. An example of the tracker showing robustness to a low viewing angle.
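The sideways-card check mentioned above is one of the simpler pieces of such a system, since the tracker already provides each card's quad. This is a speculative sketch (the names, corner ordering convention and tolerance are ours):

```python
import math

def card_angle(quad):
    """Orientation (degrees, mod 180) of a card quad's long edge.
    quad: 4 corners in order (top-left, top-right, bottom-right, bottom-left)."""
    def edge_angle(p, q):
        return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

    top, right = (quad[0], quad[1]), (quad[1], quad[2])
    long_edge = top if math.dist(*top) > math.dist(*right) else right
    return edge_angle(*long_edge) % 180.0

def is_tapped(quad, upright_angle=90.0, tol=20.0):
    """A card rotated roughly 90 degrees from the upright orientation is
    'tapped' (sideways), which in many card games signals an action."""
    delta = abs(card_angle(quad) - upright_angle) % 180.0
    return min(delta, 180.0 - delta) > 90.0 - tol
```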

V. CONCLUSION
With this paper, we demonstrated that it is entirely possible
to augment a trading card game in real time even in spite
of the absence of specially designed markers on cards, of a
fixed camera or of heavy computation capabilities. We paid
close attention to code and algorithm optimization in order
to get the system running smoothly on a mobile device, the
Nvidia SHIELD tablet. We leveraged both the CPU and
the GPU for computation, and found that our card tracker is
robust enough to handle users moving cards while the camera
is also moving independently. We made suggestions for ways
to expand on the project in both purely technical ways and
more feature-oriented ones.

Fig. 13. An example of the tracker keeping up with a variety of user card movements.

REFERENCES

[1] Jun Rekimoto and Yuji Ayatsuka, CyberCode: Designing Augmented Reality Environments with Visual Tags, 2001.
[2] G. Schweighofer and A. Pinz, Robust Pose Estimation from a Planar Target, 2006.
[3] Shiqi Li and Chi Xu, Efficient Lookup Table Based Camera Pose Estimation for Augmented Reality, 2011.
[4] Jianbo Shi and Carlo Tomasi, Good Features to Track, 1994.
[5] Pokepedia, list of Pokemon cards. http://www.pokepedia.net/
