An Indoor Navigation System For Smartphones
Abhijit Chandgadkar
Department of Computer Science
Imperial College London
Navigation entails the continuous tracking of the user’s position and his
surroundings for the purpose of dynamically planning and following a route
to the user’s intended destination. The Global Positioning System (GPS)
made the task of navigating outdoors relatively straightforward, but due to
the lack of signal reception inside buildings, navigating indoors remains
a very challenging task. However, increasing smartphone capabilities have
now given rise to a variety of new techniques that can be harnessed to solve
this problem of indoor navigation.
In this report, we propose a navigation system for smartphones capable
of guiding users accurately to their destinations in an unfamiliar indoor en-
vironment, without requiring any expensive alterations to the infrastructure
or any prior knowledge of the site’s layout.
We begin by introducing a novel optical method to represent data in
the form of markers that we designed and developed with the sole purpose
of obtaining the user’s position and orientation. Our application incorpo-
rates the scanning of these custom-made markers using various computer
vision techniques such as the Hough transform and Canny edge detection.
In between the scanning of these position markers, our application uses
dead reckoning to continuously calculate and track the user’s movements.
We achieved this by developing a robust step detection algorithm, which
processes the inertial measurements obtained from the smartphone’s motion
and rotation sensors. Then we programmed a real-time obstacle detector us-
ing the smartphone camera in an attempt to identify all the boundary edges
ahead and to the side of the user. Finally, we combined these three com-
ponents together in order to compute and display easy-to-follow navigation
hints so that our application can effectively direct the user to their desired
destination.
Extensive testing of our prototype in the Imperial College library revealed
that, on most attempts, users were successfully navigated to their destina-
tions within an average error margin of 2.1m.
Acknowledgements
I would like to thank Dr. William J. Knottenbelt for his continuous support
and guidance throughout the project. I would also like to thank Prof. Duncan
Gillies for his initial feedback and assistance on computer vision. I would
also like to thank Tim Wood for his general advice on all aspects of the
project. I would also like to thank all the librarians on the third floor of the
Imperial College central library for allowing me to use their area to conduct
my experiments. Finally, I would like to thank all my family and friends who
helped me test my application.
Contents
1 Introduction 3
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Report outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Background 7
2.1 Smartphone development overview . . . . . . . . . . . . . . . 7
2.2 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Computer vision . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.1 Hough Transform . . . . . . . . . . . . . . . . . . . . . 9
2.3.2 Gaussian smoothing . . . . . . . . . . . . . . . . . . . 10
2.3.3 Canny edge detection . . . . . . . . . . . . . . . . . . . 11
2.3.4 Colour . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.5 OpenCV . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Positioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.1 Barcode scanning . . . . . . . . . . . . . . . . . . . . . 14
2.4.2 Location fingerprinting . . . . . . . . . . . . . . . . . . 15
2.4.3 Triangulation . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.4 Custom markers . . . . . . . . . . . . . . . . . . . . . . 15
2.5 Obstacle detection . . . . . . . . . . . . . . . . . . . . . . . . 16
2.6 Dead reckoning . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.6.1 Inertial sensors . . . . . . . . . . . . . . . . . . . . . . 17
2.6.2 Ego-motion . . . . . . . . . . . . . . . . . . . . . . . . 19
2.7 Digital signal filters . . . . . . . . . . . . . . . . . . . . . . . . 20
3 Position markers 21
3.1 Alternate positioning systems . . . . . . . . . . . . . . . . . . 21
3.2 Marker design . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3 Image gathering . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4 Circle detection . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.5 Angular shift . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.6 Data extraction . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4 Obstacle detection 32
4.1 Boundary detection . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2 Obstacle detection . . . . . . . . . . . . . . . . . . . . . . . . 33
5 Dead reckoning 38
5.1 Initial approach . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.2 Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.2.1 Linear acceleration . . . . . . . . . . . . . . . . . . . . 42
5.2.2 Rotation vector . . . . . . . . . . . . . . . . . . . . . . 42
5.3 Signal filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.4 Footstep detection . . . . . . . . . . . . . . . . . . . . . . . . 45
5.5 Distance and direction mapping . . . . . . . . . . . . . . . . . 47
6 Integration of navigation system 48
7 Evaluation 55
7.1 Evaluating position markers . . . . . . . . . . . . . . . . . . . 55
7.2 Evaluating our obstacle detection algorithm . . . . . . . . . . 58
7.3 Evaluating our dead reckoning algorithm . . . . . . . . . . . . 59
7.3.1 Pedometer accuracy . . . . . . . . . . . . . . . . . . . 59
7.3.2 Positioning accuracy . . . . . . . . . . . . . . . . . . . 61
7.4 Evaluating the integration of navigation system . . . . . . . . 63
7.4.1 Test location setup . . . . . . . . . . . . . . . . . . . . 63
7.4.2 Quantitative analysis . . . . . . . . . . . . . . . . . . . 64
7.4.3 Qualitative analysis . . . . . . . . . . . . . . . . . . . . 66
7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
8 Conclusion 69
8.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
8.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Chapter 1
Introduction
1.1 Motivation
Our motivation for this project stems from the fact that people are increas-
ingly relying upon their smartphones to solve some of their common daily
problems. One such problem that smartphones have not yet completely
solved is indoor navigation. At the time of writing, there is not a single low-cost,
scalable mobile phone solution available on the market that successfully
navigates a user from one position to another indoors.
An indoor navigation app would certainly benefit users who are unfamil-
iar with a place. Tourists, for instance, would have a better experience if
they could navigate confidently inside a tourist attraction without any as-
sistance. In places such as museums and art galleries, the application could
be extended to plan for the most optimal or ‘popular’ routes. Such a system
could also be integrated at airports to navigate passengers to their boarding
gates. Similarly, an indoor navigation system could also benefit local users
who have previously visited a location but are still unaware of the whereabouts
of some of the items they seek. Such locations include supermarkets, libraries
and shopping malls. The application could also benefit clients who install the
system by learning user behaviours and targeting advertisements at specific
locations.
1.2 Objectives
The objective of this project was to build a robust and flexible smartphone-based
indoor navigation system that met the following four criteria:
1.3 Contributions
In this report we present an indoor navigation system for smartphones, which
uses a combination of computer vision based techniques and inertial sensors
to accurately guide users to their desired destinations. Our solution entails
the scanning of custom-made markers in order to calibrate the user’s position
during navigation. Then it employs a dead reckoning algorithm to approx-
imate user movements from the last known point. Finally our application
uses this information along with an integrated vision based obstacle detector
to display correct directions in real-time leading to the user’s destination.
Our indoor navigation solution required the study and development of
three individual components prior to their integration:
1. Position markers: These are custom markers that our application is ca-
pable of scanning from any angle using the smartphone camera. Colour
is used to encode position data along with a direction indicator to ob-
tain the angle of scanning. These markers were used to calibrate the
user’s position and orientation. OpenCV functions were used to detect
circles and other features from the camera preview frames to decode
these markers.
Figure 1.1: The image shows how all the main components integrate to
make the final indoor navigation system
Chapter 2
Background
2.2 Related work
In the past few years, a great amount of interest has been shown in developing
indoor navigation systems for the common user. Researchers have explored
possibilities of indoor positioning systems that use Wi-Fi signal intensities
to determine the subject’s position[14][4]. Other wireless technologies, such
as Bluetooth[14], ultra-wideband (UWB)[9] and radio-frequency identifica-
tion (RFID)[31], have also been proposed. Another innovative approach uses
geo-magnetism to create magnetic fingerprints to track position from distur-
bances of the Earth’s magnetic field caused by structural steel elements in the
building[7]. Although some of these techniques have achieved fairly accurate
results, they are either highly dependent on fixed-position beacons or have
been unsuccessful in porting the implementation to a ubiquitous hand-held
device.
Many have approached the problem of indoor localisation by means of
inertial sensors. A foot-mounted unit has recently been developed to track
the movement of a pedestrian[35]. Some have also exploited the smart-
phone accelerometer and gyroscope to build a reliable indoor positioning
system. Last year, researchers at Microsoft claimed to have achieved metre-
level positioning accuracy on a smartphone device without any infrastructure
assistance[17]. However, this system relies upon a pre-loaded indoor floor
map and does not yet support any navigation.
An altogether different approach applies vision. In robotics, simultaneous
localisation and mapping (SLAM) is used by robots to navigate in unknown
environments[8]. In 2011, a thesis considered the SLAM problem using in-
ertial sensors and a monocular camera[32]. It also looked at calibrating an
optical see-through head mounted display with augmented reality to overlay
visual information. Recently, a smartphone-based navigation system was de-
veloped for wheelchair users and pedestrians using a vision concept known
as ego-motion[19]. Ego-motion estimates a camera’s motion by calculating
the displacement in pixels between two image frames. Besides requiring the
application to be provided with an indoor map of the location, the method only
works well under the assumption that the environment has plenty of distinct features.
Localisation using markers has also been proposed. One such technique
uses QR codes1 to determine the current location of the user[13]. There is
also a smartphone solution, which scans square fiducial markers in real time
to establish the user’s position and orientation for indoor positioning[24].
Some have even looked at efficient methods to assign markers to locations
for effective navigation[6]. Although scanning markers provides high-precision
1 www.qrcode.com
positioning information, none of the existing techniques have exploited the
idea for navigation.
Finally, we also looked at existing commercial indoor navigation systems
available on the smartphone. Aisle411 (aisle411.com) provided a scalable
indoor location and commerce platform for retailers, but only displayed in-
door store maps of where items were located to the users without any sort
of navigation hints. The American Museum of Natural History also released
a mobile app (amnh.org/apps/explorer) for visitors to act as their personal
tour guide. Although the application provides the user with turn-by-turn
directions, it uses expensive Cisco mobility services engines to triangulate
the device’s position.
Hough circle transform
The Hough circle transform is similar to the Hough transform for detect-
ing straight lines. A circle is characterised by the following equation:

\[ (x - x_c)^2 + (y - y_c)^2 = r^2 \]
In order to detect circles in a given image, the centre coordinate (x_c, y_c) of
the circle and its radius r have to be identified. As three different parameters,
x_c, y_c and r, are modelled, the graph would be three-dimensional. Each non-
zero pixel in the binary image will produce a conical surface as shown in
figure 2.1.
Figure 2.1: The image shows a cone formed by modelling all the possible
radii of a circle with the centre point at a 2D coordinate
Once again, this process will be repeated for every non-zero pixel point
and will result in several such cones plotted on the graph. This can con-
veniently be represented in a three-dimensional matrix. When the number
of intersections exceeds a certain threshold, we consider the detected three-
dimensional coordinate as our centre and radius.
filters, Gaussian filters are perhaps the most useful in our application. They
are typically used to reduce image noise prior to edge detection.
The theory behind Gaussian filters stems from the following two-dimensional
Gaussian function, studied in statistics, where µ is the mean and σ is the
standard deviation for the variables x and y.

\[ f(x, y) = A \exp\left(-\left(\frac{(x - \mu_x)^2}{2\sigma_x^2} + \frac{(y - \mu_y)^2}{2\sigma_y^2}\right)\right) \]
Figure 2.2: The image shows a plot of a two dimensional Gaussian function
2. Finding the intensity gradient - To determine the gradient strength
and direction, convolution masks used by edge detection operators such
as Sobel (shown below) are applied to every pixel in the image. This
yields the approximate gradient in the horizontal and vertical direc-
tions.
\[ G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \qquad G_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \]
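For illustration, a minimal OpenCV (Java) sketch of this step is given below; the class and method names are ours, and OpenCV's default 3x3 Sobel kernels correspond to the masks shown above.

```java
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Hypothetical helper: computes the gradient strength of a grayscale image using the
// Sobel masks shown above (OpenCV's default 3x3 kernels).
public class GradientDemo {
    static Mat gradientMagnitude(Mat gray) {
        Mat gx = new Mat(), gy = new Mat();
        Imgproc.Sobel(gray, gx, CvType.CV_32F, 1, 0); // horizontal gradient (Gx)
        Imgproc.Sobel(gray, gy, CvType.CV_32F, 0, 1); // vertical gradient (Gy)

        Mat magnitude = new Mat(), direction = new Mat();
        Core.cartToPolar(gx, gy, magnitude, direction, true); // strength and direction (degrees)
        return magnitude;
    }
}
```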
2.3.4 Colour
Colours have been previously used to encode data. Microsoft’s High Capacity
Color Barcode (HCCB) technology encodes data using clusters of coloured
triangles and is capable of decoding them in real-time from a video stream[36].
Although their implementation is very complex, we can use the basic concept
behind HCCB in our application.
Each distinct colour can be used to represent a certain value. Colours can
be grouped together in a set format to encode a series of values. We have to
take into account that smartphone cameras cannot distinguish between small
variations of a certain colour in non-ideal situations, such as light green or
dark green. Therefore, we would be limited in the number of discrete values
we can encode. Colour is typically defined using the “Hue Saturation Value”
(HSV) model or the “Red Green Blue” (RGB) model.
Figure 2.3: The left image shows the HSV model and the right image shows
the RGB model. They both describe the same thing but with different
parameters
The HSV model is more appropriate for the identification and comparison
of colours. The difference in the hue component makes it easier to determine
which range a colour belongs to. For example, the colour red has a hue
component of 0 ± 15 while green has a hue component of 120 ± 15.
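As a small illustration of such range checks, the sketch below classifies a hue value into one of three colour ranges; the names are ours, the blue range is an assumption (only red and green values are quoted above), and note that OpenCV's 8-bit HSV images store hue as 0-179, so such values would need to be doubled before applying these thresholds.

```java
// Hypothetical hue classifier based on the ranges described above. Hue is taken in
// degrees (0-359); the blue range (240 +/- 15) is an assumed value.
public class HueClassifier {
    enum Colour { RED, GREEN, BLUE, UNKNOWN }

    static final int TOLERANCE = 15;

    static Colour classify(double hueDegrees) {
        if (hueDegrees <= TOLERANCE || hueDegrees >= 360 - TOLERANCE) return Colour.RED;   // 0 +/- 15
        if (Math.abs(hueDegrees - 120) <= TOLERANCE) return Colour.GREEN;                  // 120 +/- 15
        if (Math.abs(hueDegrees - 240) <= TOLERANCE) return Colour.BLUE;                   // assumed 240 +/- 15
        return Colour.UNKNOWN;
    }
}
```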
2.3.5 OpenCV
Open Source Computer Vision (OpenCV) is a library of programming func-
tions for real time computer vision. It is released under a BSD license allowing
us to use many of their optimised algorithms for academic and commercial
purposes[27]. The library is cross-platform and ports to all the major mobile
operating systems. For Android, the OpenCV Manager app needs to be installed
on the testing device prior to development. It is an Android service
targeted at managing OpenCV library binaries on end users’ devices[26].
The library supports the calculation of Hough transforms, Canny edge
detection and optical flow. It also provides various smoothing operations
including Gaussian smoothing, as well as image conversion between RGB,
HSV and grayscale. These algorithms are highly optimised and efficient, but
they only produce real-time performance for low resolution images.
2.4 Positioning
In order to develop a navigation system, the application needs to be aware
of the user’s position. There are numerous methods available that solve
the indoor positioning problem, but we only considered those that were
accessible on a smartphone device and would minimise the number of
infrastructure changes.
2.4.2 Location fingerprinting
Location fingerprinting is a technique that compares the received signal
strength (RSS) from each wireless access point in the area with a set of
pre-recorded values taken from several locations. The location with the clos-
est match is used to calculate the position of the mobile unit. This technique
is usually broken down into two phases[36]: an offline phase, in which signal
strength fingerprints are recorded at several known locations to build a radio
map of the site, and an online phase, in which live measurements are matched
against this radio map to estimate the user’s position.
With a great deal of calibration, this solution can yield very accurate
results. However, this process is time-consuming and has to be repeated at
every new site.
2.4.3 Triangulation
Location triangulation involves calculating the relative distance of a mobile
device from a base station and using these estimates to triangulate the user’s
position[16]. Distance estimates are made based on the signal strength re-
ceived from each base station. In order to resolve ambiguity, a minimum of
three base stations are required.
In free space, the received signal strength (s) is inversely proportional
to the square of the distance (d) from the station to the device.
\[ s \propto \frac{1}{d^2} \]
Signal strength is affected by numerous factors such as interference from
objects in the environment, walking, multipath propagation, etc. There-
fore, in non-ideal conditions, different models of path attenuation need to be
considered.
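As a toy illustration of the free-space relationship above, the distance can be recovered from a measured signal strength once the constant of proportionality has been fixed by a measurement at a known reference distance; this is only a sketch under the free-space assumption, not a model of indoor propagation, and the names are ours.

```java
// Toy illustration of s ∝ 1/d^2: given the signal strength s0 measured at a known
// reference distance d0, an unknown distance d is estimated from a new measurement s.
// Real indoor environments require more elaborate path-attenuation models.
public class FreeSpaceRange {
    static double estimateDistance(double s, double s0, double d0) {
        // s = k / d^2 and s0 = k / d0^2  =>  d = d0 * sqrt(s0 / s)
        return d0 * Math.sqrt(s0 / s);
    }
}
```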
Figure 2.4: The image shows the trilateration of a device using the signal
strength from three nearby cell towers
to the absence of a general geometric pattern and the irregular positioning
of challenging structures.
An interesting approach taken by a group in the 2003 RoboCup involved
avoiding obstacles using colour[12]. Although this is a relatively straight-
forward solution and achieves a fast and accurate response, it restricts the
use of an application to a certain location and is prone to ambiguous errors
caused by other similarly coloured objects in the environment. Another
impressive piece of work combines three visual cues from a mobile robot to
detect horizontal edges in a corridor to determine whether they belong to a
wall-floor boundary[18]. However, the algorithm fails when strong textures
and patterns are present on the floor.
There has been very little emphasis on solving the problem on a smart-
phone mainly due to the high computational requirements. There is neverthe-
less one mobile application tailored for the visually impaired that combines
colour histograms, edge cues and pixel-depth relationships, but it works under
the assumption that the floor is a clear region without any similarities
present in the surrounding environment[29].
There is currently a vast amount of research being conducted in this area.
However, our focus was driven towards building a navigation system and not
a well-defined free space detector. Therefore, for our application, we have
adopted some of the vision concepts mentioned in literature such as boundary
detection.
\[ v_f = v_i + a t \]
\[ d = v_f t - \tfrac{1}{2} a t^2 \]
However, due to the random fluctuations in the sensor readings, it is not
yet possible to get an accurate measure of displacement even with filtering.
Nevertheless, the accelerometer data can be analysed to detect the number
of footsteps. In that case, a rough estimate of the distance travelled can be
made, provided the user’s stride length is known. Furthermore, the orienta-
tion sensor can be employed simultaneously to determine the direction the
user is facing. Using this information, the new position of the user can be
calculated on each step as follows:
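A sketch of such an update, assuming a known stride length l and an azimuth θ measured clockwise from the site's local north (the notation is ours):

\[ x_{new} = x_{old} + l \sin\theta, \qquad y_{new} = y_{old} + l \cos\theta \]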
constraints and cross-correlation validations need to be applied to eliminate
erroneous detections.
To calculate the direction of movement, we need to also consider the
orientation of the device. This can be calculated using geo-magnetic field
sensors and gyroscopes. However, we need to also convert this orientation
from the world’s frame of reference to the site’s frame of reference.
2.6.2 Ego-motion
An alternate solution to dead reckoning uses a vision concept known as ego-
motion. It is used to estimate the three-dimensional motion relative to the
static environment from a given sequence of images. Our application can
use the smartphone camera to feed in the live images and process them in
real-time to derive an estimate of the distance travelled.
There has been some interesting work published, in recent times, relat-
ing to the application of ego-motion in the field of navigation. A robust
method for calculating the ego-motion of the vehicle relative to the road has
been developed for the purpose of autonomous driving and assistance[34]. It
also integrates other vision based algorithms for obstacle and lane detection.
Ego-motion has also been employed in robotics. A technique that combines
stereo ego-motion and a fixed orientation sensor has been proposed for long
distance robot navigation[25]. The orientation sensor attempts to reduce the
error growth to a linear complexity as the distance travelled by the robot
increases. However there has not been a great amount of work in this topic
using smartphone technology. The only published work that we came across
proposed a self-contained navigation system for wheelchair users with the
smartphone attached to the armrest[19]. For pedestrians it uses step detec-
tion instead of ego-motion to measure their movement.
Technically, to compute the ego-motion of the camera, we first estimate
the two-dimensional motion taken from two consecutive image frames. This
process is known as the optical flow. We can use this information to extract
motion in the real-world coordinates. There are several methods to estimate
optical flow amongst which the Lucas-Kanade method[20] is widely used.
In our application, the smartphone camera will be used to take a series of
images for feature tracking. This typically involves detecting all the strong
corners in a given image. Then the optical flow will be applied to find these
corners in the next frame. Usually the corner points do not remain in the
same position, so a new measure has to be introduced, which models all the
points within a certain distance of the corner. The point with the lowest value
of this measure is then regarded as that corner in the second image. Template matching will
then be applied to compare and calculate the relative displacement between
the set of corners in the two images. This information can be used to roughly
estimate the distance travelled by the user.
Figure 2.5: The image shows the three types of digital signal filters
Signal data can be analysed in the temporal domain to see the variation in
signal amplitude with time. Alternatively, a signal can be represented in the
frequency-domain to analyse all the frequencies that make up the signal. This
can be useful for filtering certain frequencies of a signal. The transformation
from the time-domain to the frequency-domain is typically obtained using
the discrete Fourier transform (DFT). The fast Fourier transform (FFT) is
an algorithm to compute the DFT and the inverse DFT.
Chapter 3
Position markers
We decided to develop our own custom markers with the purpose of obtaining
the position of the user. Several of these markers would be placed on the floor
and spread across the site. In particular, they would be situated at all the
entrances and other points of interest such that the application can easily identify
them. Upon scanning, the application would start displaying directions from
that position to the user’s destination.
In this chapter, we start by discussing some of the other alternatives we
considered before deciding to use custom markers and detail our reasons for
not choosing any of these options. Then we proceed to describe
the design of the marker specifying what data it encodes and how this data
is represented. Then we start explaining our implementation for the smart-
phone scanner. Firstly, we explain how we detect the marker boundary using
the Hough circle transform. Then we describe how the angular shift encoded
in the marker helps us to calculate the orientation of the user. Finally, we
explain the process of extracting the position data from the marker.
of interference need to be considered along with the position of each access
point. Since every site is structured differently, complex models for signal
attenuation would need to be developed independently. Reference [1] describes some
further problems with triangulation.
The advantages of location fingerprinting are similar to triangulation.
However, to achieve accurate results, fingerprinting requires a great amount
of calibration work. This is a tedious process and would need to be replicated
on every new site. In addition, several people have already raised privacy
concerns for Wi-Fi access points[11].
At first, we strongly considered the option of placing barcodes around
the site encoded with their respective positions. We even tested a few open-
source barcode scanning libraries available on Android. However, we quickly
realised that using an external library would affect its future integration with
other features. Since Android only permits the use of the camera resource
by a single view, we would have been unable to execute the obstacle detec-
tion mechanism simultaneously, unless we developed our own scanner. We
could have also potentially extended the barcode scanning library by further
studying and modifying a considerable amount of their codebase. The other
major drawback with using barcodes was the inability to encode direction
data needed to calibrate our application with the site’s frame of reference.
See section 3.2 for further information on this requirement.
Developing custom markers would give us complete control over the de-
sign of the marker, the scanning and its integration with the rest of the
system. These custom markers would not only be designed to encode posi-
tion data but also the direction. The only drawback would be that it takes a
considerable amount of time to develop a bespoke scanner that gives highly
accurate results. Nevertheless, we decided to take this approach as the ben-
efits outweighed the disadvantages.
locally or elsewhere. This gives us the additional flexibility of changing
the position of the marker offline without the need to physically move
the marker. Upon scanning this feature, our application will be able to
determine the position of the user.
Colours are used to encode the UID. Currently our marker only supports
three distinct colours - red, blue and green are used to represent the values 0,
1 and 2 respectively. We decided to choose these three colours because they
are the furthest apart from each other in the HSV model. This will reduce
erroneous detection as there will be a lower chance of an overlap.
Each marker encodes six data digits with one extra digit for validation.
Therefore, using the ternary numeral system, a number between 0 and 728
(3^6 − 1) can be encoded by our marker. This allows for a total of 729 unique
identifiers. The validation digit provides an extra level of validation and to a
certain extent reduces incorrect detections. The first six data values detected
are used to calculate the validation digit (v) as follows:
\[ v = \left( \sum_{i=1}^{6} \mathrm{DataValue}_i \right) \bmod 3 \]
If v is not equal to the extra validation digit detected from the marker,
then the scanner discards that image frame and tries again.
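A minimal sketch of this check (the method and array names are ours):

```java
// Computes the validation digit from the six decoded data values (each 0, 1 or 2) and
// compares it against the seventh digit read from the marker. If the check fails, the
// current preview frame is discarded and the scanner tries again.
public class MarkerValidator {
    static boolean isValid(int[] dataValues, int validationDigit) {
        int sum = 0;
        for (int i = 0; i < 6; i++) {
            sum += dataValues[i];
        }
        return (sum % 3) == validationDigit;
    }
}
```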
The marker encodes the direction indicator using two parallel lines joined
by a perpendicular line to form a rectangle. Figure 3.1 shows the structure
of the marker with each colour section containing a number corresponding
to the digit it represents in the UID’s ternary representation.
As mentioned previously, the marker’s direction indicator is also required
to align the smartphone’s orientation with respect to the site’s frame of refer-
ence. From the built-in orientation sensors, our application can estimate the
direction the device is facing. This direction does not necessarily represent
the user’s direction with respect to the site. For example, suppose the user
is standing at the coordinate position (0,0) and the desired destination is at
position (0,1). We can say that the destination is ‘north’ of the user with
respect to the site’s Cartesian grid. Let us assume that the orientation sensor
tells the user that north is 180◦ away from the local ‘north’. This would result
in the application navigating the user in the opposite direction. To solve this
problem, the angular shift obtained from the marker’s direction indicator is
used to align the device’s orientation with the site’s frame of reference.
Figure 3.1: The left image shows the structure of the marker, and the right
image shows a marker encoded with UID 48
Figure 3.2: The image shows four markers placed such that their direction
indicator is pointing to the local north
The markers could be of any reasonable size, but to be able to scan them
while in a standing position they should be printed such that they occupy a
complete A4 sheet of paper.
3.3 Image gathering
The Android documentation recommended that our client view implement
the SurfaceHolder.Callback interface in order to receive information upon
changes to the surface. This allowed us to set up our camera configurations
on surface creation and subsequently display a live preview of the camera
data. The next step was to obtain the data corresponding to the image
frames for analysis. The Camera.PreviewCallback callback interface was de-
signed specifically for delivering copies of preview frames in bytes to the
client. On every callback, we performed analysis on the preview frame re-
turned and used a callback buffer to prevent overwriting incomplete image
processing operations. For development, we did not initially use the preview
callback interface. Instead, we took individual pictures of the markers to
test our algorithm, and only implemented the callback once our algorithm
achieved real-time performance.
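A minimal sketch of this wiring, using the android.hardware.Camera API that was current at the time (since deprecated); buffer sizing and camera setup are omitted, and the class name is ours:

```java
import android.hardware.Camera;

// Receives copies of the camera preview frames through a reusable callback buffer, so
// that a new frame is only delivered once the previous one has been handed back.
public class PreviewFrameReceiver implements Camera.PreviewCallback {
    private final byte[] buffer;

    PreviewFrameReceiver(Camera camera, int bufferSize) {
        buffer = new byte[bufferSize]; // must be large enough for one YUV preview frame
        camera.setPreviewCallbackWithBuffer(this);
        camera.addCallbackBuffer(buffer);
    }

    @Override
    public void onPreviewFrame(byte[] data, Camera camera) {
        analyse(data);                    // marker scanning / boundary detection on this frame
        camera.addCallbackBuffer(buffer); // hand the buffer back for the next frame
    }

    private void analyse(byte[] yuvFrame) {
        // image processing on the preview frame goes here
    }
}
```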
The default camera resolution for modern smartphones is too high for an
application to achieve real-time image processing. Therefore, we
had to decrease the resolution of the preview frames prior to processing while
still maintaining a certain level of quality. We decided upon a resolution of
640 x 480 as it provided a good compromise between performance and image
quality. We also ensured that if a smartphone camera did not support this
resolution, our application would select the one closest to it.
Parameter    Description
image        Grayscale input image
circles      Output array containing the centre coordinates and radii of all the detected circles
method       Method used for detecting circles, i.e. using Hough transforms
dp           Inverse ratio of the accumulator resolution to the image resolution
minDist      The minimum distance between the centres of two circles
param1       The upper Canny threshold
param2       The accumulator threshold
minRadius    The minimum radius of a circle
maxRadius    The maximum radius of a circle
Prior to the Hough circle detection, the input image had to be converted
to grayscale to detect the gradient change and filtered to remove noise. The
image data from the smartphone camera is received as bytes in the YUV format.
This is first wrapped into a YUV image matrix and then converted to grayscale. OpenCV
contains several image processing functions, allowing the conversion of an
image between different colour models. OpenCV also provides a function to
apply the Gaussian blur to an image with a specified window size. Figure 3.3
shows the process of marker detection from the original image to the detection
of the centre of the circle.
Parameter    Description
src          Input image
dst          Output image
ksize        Gaussian kernel size
sigmaX       Standard deviation in the horizontal direction for the Gaussian kernel
sigmaY       Standard deviation in the vertical direction for the Gaussian kernel
borderType   Method for pixel extrapolation
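Putting the two functions together, a hedged sketch of the circle-detection step might look as follows; the threshold values are illustrative rather than the ones used in the project, and the HOUGH_GRADIENT constant is named CV_HOUGH_GRADIENT in older OpenCV releases.

```java
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

// Sketch of the circle-detection step: the NV21 preview bytes are wrapped into a YUV
// matrix, converted to grayscale, blurred and passed to the Hough circle transform.
public class CircleDetector {
    static Mat detectCircles(byte[] nv21, int width, int height) {
        Mat yuv = new Mat(height + height / 2, width, CvType.CV_8UC1);
        yuv.put(0, 0, nv21);

        Mat gray = new Mat();
        Imgproc.cvtColor(yuv, gray, Imgproc.COLOR_YUV2GRAY_NV21);
        Imgproc.GaussianBlur(gray, gray, new Size(5, 5), 2);

        Mat circles = new Mat(); // each detected circle is stored as (xc, yc, r)
        Imgproc.HoughCircles(gray, circles, Imgproc.HOUGH_GRADIENT,
                1, gray.rows() / 4, 100, 50, 20, 0); // illustrative thresholds
        return circles;
    }
}
```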
Figure 3.3: The image shows the process of circle detection - original input
image, grayscaled, Gaussian blurred and centre detection
Figure 3.4: The image shows the angular shift θ
We first measure the distance from the centre point of the marker to all the detected line segments. The closest line
from the centre is selected as the first direction indicator. Then, we search
for the second parallel line by checking the gradient of the line against all the
detected line segments that are a certain distance away from the first direction
indicator. If either of these indicators is not found, we abandon further
processing and wait for the next preview frame. Otherwise, we continue to
search for the perpendicular line allowing us to distinguish between the two
possible direction scenarios (positive or negative angular shift). We then
combine this information with the line equations of the direction indicators
to calculate the angular shift. To reduce erroneous detections, we applied
two further heuristics to our algorithm: (1) Minimum line length; and (2)
Maximum distance from the centre.
Prior to the Hough line transform, the input image has to be first con-
verted to a binary image with all the boundaries highlighted where a strong
gradient change occurs. To achieve this, Canny edge detection can be per-
formed on the filtered grayscale image obtained from the previous circle
detection step. OpenCV provides us with a Canny function taking in pa-
rameters that define the upper and lower thresholds. We have given the
specification of this function in section 4.1.
The next step involves rotating the image anticlockwise by the angular
shift to obtain the natural orientation of the marker for data extraction. We
first calculate the affine matrix for two-dimensional transformations using
OpenCV’s getRotationMatrix2D(). Then, we use this matrix to actually
perform the transformation on the image using warpAffine(). Figure 3.5
illustrates the entire process of angular shift transformation.
Figure 3.5: The image shows the process of angular shift transformation -
original input image, Canny edge detection, line detection, direction
indicator detection and rotation
Parameter Description
center Center of rotation
angle Angle of rotation
scale Scale factor
return Output 2x3 affine matrix
Parameter Description
src Input image
dst Output image
M Affine transformation matrix
dsize Size of output image
flags Method of interpolation
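A minimal sketch of this rotation step using the two functions above (the class and method names are ours):

```java
import org.opencv.core.Mat;
import org.opencv.core.Point;
import org.opencv.imgproc.Imgproc;

// Rotates the marker image anticlockwise about the detected circle centre by the
// angular shift, restoring the marker to its natural orientation for data extraction.
public class MarkerRotation {
    static Mat undoAngularShift(Mat marker, Point centre, double angularShiftDegrees) {
        Mat rotation = Imgproc.getRotationMatrix2D(centre, angularShiftDegrees, 1.0);
        Mat upright = new Mat();
        Imgproc.warpAffine(marker, upright, rotation, marker.size());
        return upright;
    }
}
```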
Figure 3.6: The image shows the seven border positions used to calculate
the colour regions
Figure 3.7: The image shows the path to calculate the value encoded by
each colour region
all the seven regions. Our algorithm accumulates the hue values for each pixel
in the specified path. Then the modal colour is calculated by counting the
number of pixels recorded in each colour range. This process is repeated for
all the seven regions. At present, we only consider the modal colour values if
they are either red, blue or green. We use the colour encodings from the six
data regions to calculate the UID and the seventh data region for validation.
Chapter 4
Obstacle detection
The purpose of the obstacle detector was to avoid giving directions to the
user that led to an immediate obstacle. The only plausible solution to detect
obstacles from a smartphone was to use the camera to roughly detect the
object boundaries.
In this chapter, we discuss the process of boundary detection using the
Hough line transform and the Canny edge detector. We also explain how
we achieved real-time performance using OpenCV libraries. The last sec-
tion describes how this boundary information is used to identify obstacles
surrounding the user.
Parameter      Description
image          Binary input image
lines          Output array containing the two coordinate points of the detected line segments
rho            The distance resolution, usually 1 pixel for preciseness
theta          The angle resolution
threshold      Minimum number of intersections required for line detection
minLineLength  The minimum length of a line
maxLineGap     The maximum distance between two points belonging to the same line
The process of retrieving the camera preview frames was exactly the same
as for scanning position markers (section 3.3). In fact, we used the same class
as before to also incorporate boundary detection. As a result, we were able to
simultaneously compute results for both these tasks. Once again, prior to the
Hough line transform, we applied Gaussian blur and Canny edge detection
to these preview frames. While the function call to GaussianBlur remained
unchanged, the thresholds for the Canny edge detector were modified such
that only the strong edges were detected. Figure 4.1 summarises the entire
process of boundary detection.
Figure 4.1: The image shows the process of boundary detection - original
input image, grayscaled, Gaussian blurred, Canny edge detection and
Hough line transform
We then examined the detected boundaries to see if there was an edge on the left, right and in front of the
user. To achieve this, we used the Java Line2D API to check for line segment
intersections.
We first considered detecting obstacles straight ahead of the user. We
Parameter    Description
image        Grayscale input image
edges        Binary output with the edges highlighted
threshold1   Lower threshold used for Canny edge detection
threshold2   Upper threshold used for Canny edge detection
noticed that the object boundaries formed by the obstacles in front of the
user were almost always horizontal. Therefore, we searched through all the
detected boundary lines and only retained those that had an angle of 180◦ ±
25◦ or 0◦ ± 25◦ . Figure 4.2 illustrates this process. We then used an accumu-
lator to count the number of intersections between these lines and vertical
lines. The vertical lines signify the user walking straight ahead. If this
number of intersections is high, it would indicate that there is an obstacle
in front of the user, assuming that he is holding the phone in the direction
of movement.
For detecting obstacles on the left and the right side of the user, we
used a similar approach. We noticed that the object boundaries on the side
were slightly slanted and very close to forming a vertical line. Therefore, for
detecting obstacles on the left hand side, we decided to only retain boundary
lines that had an angle of 75◦ ± 25◦ or 255◦ ± 25◦ . For the right side, we
only looked at lines with an angle of 105◦ ± 25◦ or 285◦ ± 25◦ . Figure 4.3
illustrates this process. Then, we checked for the number of line intersections
horizontally, representing the user’s side movements. However, for detecting
obstacles on the left, we only checked the left half portion of the image and
similarly the right half for detecting obstacles on the right.
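A condensed sketch of the angle-based filtering described above (the helper names and angle conventions are ours; the actual implementation works on the OpenCV line output together with the Line2D intersection checks):

```java
// Keeps a detected line segment as a candidate obstacle boundary based on its angle.
// Roughly horizontal lines indicate an obstacle ahead; slightly slanted near-vertical
// lines indicate obstacles to the left or right. Angles are in degrees, [0, 360).
public class BoundaryFilter {
    static final double TOL = 25.0;

    static boolean isForwardCandidate(double angle) {
        return near(angle, 0, TOL) || near(angle, 180, TOL);
    }

    static boolean isLeftCandidate(double angle) {
        return near(angle, 75, TOL) || near(angle, 255, TOL);
    }

    static boolean isRightCandidate(double angle) {
        return near(angle, 105, TOL) || near(angle, 285, TOL);
    }

    private static boolean near(double angle, double target, double tolerance) {
        double diff = Math.abs(angle - target) % 360;
        return Math.min(diff, 360 - diff) <= tolerance;
    }
}
```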
Our current implementation of obstacle detection has two key limitations.
One is that the user has to hold the smartphone with a slight tilt (35◦ ± 20◦ )
such that the back camera is always facing the floor, as shown in figure 4.4.
The reason is that if the phone is held perpendicular to the ground it is not
possible to determine the depth of an obstacle just by looking at an image.
Therefore, we would not be able to conclude whether an obstacle lies right
in front of the user or further away. By forcing the user to hold the phone
in the desired position, we can almost guarantee that an obstacle detected is
immediately ahead or to the side of the user.
The second limitation is that the floor should not contain any patterns.
This is mainly due to the fact that our algorithm falsely interprets the patterns
on the floor as object boundaries.
Figure 4.2: The images show the preservation of horizontal lines for
detecting obstacles ahead of the user in two different scenarios
Figure 4.3: The image shows the preservation of side lines for detecting
obstacles left and right of the user
Figure 4.4: The image shows the correct way to hold the phone for obstacle
detection
Chapter 5
Dead reckoning
Parameter     Description
image         Grayscale input image
corners       An output array containing the coordinate positions of all the corners
maxCorners    Maximum limit to the number of strong corners detected
qualityLevel  Defines the minimum quality level for each potential corner
minDistance   Minimum distance between two corners
mask          Specifies a particular region from the image to extract features from
blockSize     The size used for calculating the covariance matrix of derivatives over the neighbourhood of each pixel
From the corners detected in the first image, we can use optical flow to
find the new positions of these corners in the second image. This would
give us an estimate of the feature displacement for that time frame. The
OpenCV implementation for the optical flow uses the iterative Lucas-Kanade
algorithm with pyramids. Figure 5.1 shows the results we obtained by moving
the camera slowly to the left.
At this stage, we realised that the feature tracking and the optical flow
algorithms required a large amount of computation time to achieve the de-
sired results. This would prevent our application to perform real-time opera-
tions as well as quickly drain the smartphone’s battery. A potential solution
Figure 5.1: The first two images show the movement of the camera from
left to right and the third image shows the feature displacement calculated
by the optical flow
Parameter  Description
prevImg    First input image
nextImg    Second input image
prevPts    An array containing the coordinate positions of all the corners detected in the first image
nextPts    An output array containing the new coordinate positions of the corners detected from the first image
status     An output status array
err        An output error array
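For reference, a minimal sketch combining the two functions above (parameter values are illustrative, not the tuned ones):

```java
import org.opencv.core.Mat;
import org.opencv.core.MatOfByte;
import org.opencv.core.MatOfFloat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.MatOfPoint2f;
import org.opencv.imgproc.Imgproc;
import org.opencv.video.Video;

// Finds strong corners in the first grayscale frame and tracks them into the second
// frame with the pyramidal Lucas-Kanade optical flow.
public class FeatureTracker {
    static MatOfPoint2f track(Mat prevGray, Mat nextGray) {
        MatOfPoint corners = new MatOfPoint();
        Imgproc.goodFeaturesToTrack(prevGray, corners, 100, 0.01, 10);

        MatOfPoint2f prevPts = new MatOfPoint2f(corners.toArray());
        MatOfPoint2f nextPts = new MatOfPoint2f();
        MatOfByte status = new MatOfByte();  // set to 1 where a corner was found again
        MatOfFloat err = new MatOfFloat();
        Video.calcOpticalFlowPyrLK(prevGray, nextGray, prevPts, nextPts, status, err);
        return nextPts; // compare with prevPts to estimate the feature displacement
    }
}
```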
distance estimations using an ego-motion based approach. Therefore, we
decided to explore the inertial method of dead reckoning. Since a pedometer
algorithm had been previously achieved by many, we were more confident
with this approach.
5.2 Sensors
Nowadays, all smartphones are equipped with a wide variety of sensors. For
our application, we only considered the accelerometer, the geo-magnetic field
and the gyroscope sensors. The accelerometer enables us to monitor motion
relative to the world’s frame of reference. The remaining two sensors can be
combined with the accelerometer to give an accurate reading of the device’s
orientation also relative to the world’s frame of reference. Our aim is to make
use of these sensors to track the user’s position relative to the site’s frame of
reference.
The Android API for sensor events is well documented, making it simple to
integrate with our application. Figure 5.2 from the Android documentation1
illustrates the coordinate system used by the sensors. To receive data from a
particular sensor, our client registers itself to listen to the changes from that
sensor type using a SensorEventListener interface. Our application registers with
two sensor types: linear acceleration and rotation vector.
Figure 5.2: The image shows the coordinate system used by the Android
sensor API
1 http://developer.android.com/reference/android/hardware/SensorEvent.html
5.2.1 Linear acceleration
This sensor type is used to measure the acceleration of the device along all
three axes, excluding the force of gravity. As stated previously in the
background chapter, we cannot apply double integration on the linear accel-
eration data to obtain distance as the accuracy deteriorates significantly with
increasing distance. Instead, we can use this data to detect footsteps. The
accelerometer sensor returns the acceleration of the device (Ad ) influenced
by the force of gravity.
\[ A_d = -g - \sum \frac{F}{m} \]
individual orientation angle for each axis, we need to first calculate the rota-
tion matrix from the rotation vector. The Android sensor API provides us
with the function getRotationMatrixFromVector(), and also getOrientation()
to calculate these angles.
Figure 5.3: The image shows the coordinate system used by the rotation
vector
Figure 5.4: The image shows the coordinate system used by the orientation
vector
This function returns an array of size three containing the angle of ro-
tation, in radians, around the z-axis (azimuth), the x-axis (pitch), and the
y-axis (roll). Here we are only concerned with the azimuth, which represents
the horizontal angle formed by the difference in the y-axis from the magnetic
north. The azimuth lies between 0◦ and 359◦ , with 0 corresponding to the
north. Therefore, this value can be used to monitor the direction of the user’s
movement whenever a user takes a step. However, to be useful indoors, this
direction has to be first calibrated with the site’s frame of reference. This can
be done by scanning any marker on the site, providing the marker’s direction
indicator is parallel to the site’s y-axis. In other words, the marker points to
the north of the site.
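A minimal sketch of this calculation (the angular shift handling reflects the calibration described above; class and method names are ours):

```java
import android.hardware.SensorManager;

// Converts a rotation-vector reading into a heading relative to the site's local north.
// angularShiftDegrees is the offset obtained from the last scanned marker.
public class HeadingCalculator {
    static float siteHeadingDegrees(float[] rotationVector, float angularShiftDegrees) {
        float[] rotationMatrix = new float[9];
        float[] orientation = new float[3];
        SensorManager.getRotationMatrixFromVector(rotationMatrix, rotationVector);
        SensorManager.getOrientation(rotationMatrix, orientation);

        float azimuthDegrees = (float) Math.toDegrees(orientation[0]); // angle from magnetic north
        return ((azimuthDegrees - angularShiftDegrees) % 360f + 360f) % 360f;
    }
}
```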
Figure 5.5: The image shows the plot of a sinc function
The value of ALPHA was used for calibrating the low pass filter while
the value of BETA was used for the high pass filter, such that the relevant
properties of the signal were preserved. We applied these filters to data
received from both the sensor types. Figure 5.6 shows the attenuation of the
high frequency components from the original signal.
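A minimal sketch of the kind of first-order filters this describes; the ALPHA and BETA values are illustrative rather than the calibrated ones used in the project:

```java
// First-order low-pass and high-pass filters applied to the incoming sensor samples.
public class SignalFilters {
    static final float ALPHA = 0.1f; // low-pass smoothing factor (assumed value)
    static final float BETA = 0.9f;  // high-pass smoothing factor (assumed value)

    private float lowPassed;  // running low-pass output
    private float highPassed; // running high-pass output
    private float lastInput;

    float lowPass(float input) {
        lowPassed = lowPassed + ALPHA * (input - lowPassed);
        return lowPassed;
    }

    float highPass(float input) {
        highPassed = BETA * (highPassed + input - lastInput);
        lastInput = input;
        return highPassed;
    }
}
```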
Figure 5.6: The image shows the accelerometer data before and after
filtering
The basis of our pedometer algorithm relied upon detecting the peaks
and troughs from the time-domain waveform. Essentially, when a peak was
detected, we started searching for the next trough. Similarly, after a trough
was detected, we started searching for the next peak. This sinusoidal at-
tribute of the signal along with other amplitude characteristics constituted
an average footstep. Taking the first derivative of the accelerometer data al-
lowed us to calculate and distinguish between a peak and a trough. At times,
when a second peak was detected while looking for a trough, we discarded
the first peak and used this new peak as the starting reference for identifying
a footstep.
We applied three further simple heuristics to reduce false positives.
1. Maximum time duration for one step - If a step took longer than 1.5
seconds, we would not consider it.
3. Minimum number of initial steps - The user has to take 2-3 initial steps
before the algorithm starts detecting individual footsteps.
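A simplified sketch of the peak and trough alternation at the core of this algorithm; the amplitude thresholds are illustrative, and the timing and initial-step heuristics listed above are omitted:

```java
// Counts footsteps from filtered linear-acceleration magnitude samples by alternating
// between searching for a peak and searching for the matching trough. The sign of the
// first derivative (difference between consecutive samples) marks the turning points.
public class StepCounter {
    private static final float MIN_PEAK = 1.5f;    // assumed minimum peak amplitude (m/s^2)
    private static final float MAX_TROUGH = -1.5f; // assumed maximum trough amplitude (m/s^2)

    private boolean lookingForTrough = false;
    private float previous;
    private int steps;

    void onSample(float value) {
        float derivative = value - previous;
        if (!lookingForTrough && derivative < 0 && previous > MIN_PEAK) {
            lookingForTrough = true;  // a peak was found, now search for the trough
        } else if (lookingForTrough && derivative > 0 && previous < MAX_TROUGH) {
            lookingForTrough = false; // a matching trough completes one footstep
            steps++;
        }
        previous = value;
    }

    int getSteps() {
        return steps;
    }
}
```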
Chapter 6
Integration of navigation
system
The final stage of development involved combining the data obtained from
position markers, dead reckoning and the obstacle detector to calculate the
appropriate direction that would gradually lead the user to their destination.
This is the final step needed to complete our indoor navigation system.
In this chapter, we briefly explain how the location where our system
will be employed should be set up. Then, we discuss how we integrated the
three individual components together to display the appropriate directions.
Finally, we give a high level overview of our entire system through a class
diagram and a sequence diagram.
be recorded offline. Essentially, these are all the possible indoor destinations
that our application navigates the user to. For our testing, we stored this data
locally as there were not many destination points. However, to decouple this
task and make it scalable, this data needs to be stored in a file or database.
An internet connection would be required to fetch this data when necessary.
Figure 6.1: The image shows how all the main components integrate to
make the final indoor navigation system
Position markers
On every successful scan of a position marker, the encoded UID is retrieved
and checked for the corresponding position coordinate of the marker. When
this is known, the scanner would notify the central module about this new
position. Then, the central module would immediately update its current
estimated position to the one just received and recalculate the directions to
display.
The markers also provide the central module with the angular shift to the
site’s local north. As previously mentioned, this offset is used to calibrate
the device’s orientation with respect to the site’s frame of reference. This is
important for correctly tracking a user’s movements through dead reckoning.
Dead reckoning
There were two aspects to integrating this component. One was the orienta-
tion (azimuth), which gave the direction the user was walking towards and
the other aspect was the pedometer algorithm. This orientation was not only
important for dead reckoning but also for adjusting the currently displayed
direction. For example, if the direction currently displayed told the user to
walk straight but instead he turned right, the direction would need to change
to tell the user to turn left. Also note that when the sensor detected a change
in the device’s orientation, we would straightaway subtract the angular shift
offset retrieved from the position marker from the measurement.
Obstacle detector
Integrating this component ensured that the direction displayed by our appli-
cation was feasible. Whenever there was a change in the navigation direction
displayed, we would check with the obstacle detector whether there was
an obstacle in that direction. If there were no obstacles, then the direc-
tion displayed would remain the same. However, if it did detect an obstacle
in that direction, we would ensure that the application navigated the user
around the obstacle. We achieved this by updating the direction hint to point
90◦ clockwise, i.e. to the right. While the user follows this direction, our ap-
plication checks whether the obstacle (now on the left side) still exists. If it
does not exist, the direction indicator is returned to its original angle. If it
does, then the user would continue till it eventually disappears. In case there
is another obstacle on the new direction path, then once again we repeat the
same step by further updating the direction hint to point 90◦ clockwise till
the obstacle disappears. In the future, this algorithm can be improved to
take into account the user’s past movements and learn to not take a path
with long obstacles.
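A rough sketch of this detour rule (the direction representation and method names are ours):

```java
// Keeps rotating the suggested heading 90 degrees clockwise until the obstacle detector
// reports the way as clear; the normal navigation logic restores the original direction
// once the obstacle is no longer detected on the side being checked.
public class DetourPlanner {
    interface ObstacleDetector {
        boolean isBlocked(float headingDegrees); // true if an obstacle lies in that direction
    }

    static float adjustHeading(float desiredHeadingDegrees, ObstacleDetector detector) {
        float heading = desiredHeadingDegrees;
        for (int turns = 0; turns < 3 && detector.isBlocked(heading); turns++) {
            heading = (heading + 90f) % 360f; // side-step the obstacle to the right
        }
        return heading;
    }
}
```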
Figure 6.3: The image on the left shows the diagonal fluctuations in the
navigation direction and the image on the right shows how we fix this
problem
Figure 6.4: The image shows the class diagram of our proposed navigation
system with the packages highlighted in grey and the classes in white
Figure 6.5: The image shows the sequence diagram of the entire system
Chapter 7
Evaluation
Results
Table 7.1 shows the results obtained from scanning 10 different UID-encoded
position markers. The lighting conditions were used as the main criteria. We
altered the distance and the angle parameters on every test (figure 7.1).
Lighting conditions Total scans Correct Wrong Timeout
Bright 20 1 3 16
Normal 20 17 1 2
Low-light 20 0 0 20
Shadow 20 16 2 2
Figure 7.1: The image shows some of the tests we conducted by altering
parameters such as distance, angle and light
Analysis
Our application failed to detect any of the markers when the lighting con-
ditions were poor. This was partly due to the low resolution of the camera
as well as the conversion of the original image into grayscale for the Hough
circle transform. This made it impossible for the algorithm to detect the
centre of the circle. Our results also showed that the scanner failed to de-
tect markers when there was a lot of light falling on the paper. In fact,
the paper reflected so much light that the colours were slightly challenging
to recognise even for the naked eye. As a result, our algorithm timed out
most of the time as it could not decipher some of the colour regions. There
were also a few wrong results, mainly caused by the incorrect detection
of the bright colour regions. Our application performed well under normal
and shadowy lighting conditions. Nevertheless, there were a few errors and
timeouts on both occasions, but these only occurred when the camera was
held at a slanted angle and far away from the marker. The reason some of
the results were incorrect was that the application failed to detect the
right marker boundaries. However, we noticed that the errors quickly auto-
corrected themselves on the next preview frame due to the real-time nature
of the application.
From our findings, we quickly realised that printing colour-encoded markers
on paper was not a good idea as it was susceptible to abnormal lighting
conditions. Therefore, in a real-world scenario, we recommend that position
markers be printed on a more durable matte surface and that the area around
the marker always receives adequate light.
We decided to compare our results with some of the state-of-the-art bar-
code scanners available for smartphones. Online research1 shows that most
of these scanning applications achieve around 81%-96% accuracy. Under
normal lighting conditions, we achieved 85% accuracy. Although this is a
favourable result, we have to consider that our position markers cannot store
a large amount of data, and therefore we should be making fewer errors. How-
ever, we can also claim that our application is capable of decoding markers
from a larger distance and can enable the scanner to calculate the user’s
orientation.
Speed is also an important metric for scanning applications. However, we
decided not to analyse the speed of our scanner quantitatively as accuracy
was our highest priority. Nevertheless, under ideal lighting conditions and
within a certain distance, our application almost always decodes a position
marker within milliseconds. Besides, the only reason our application would
take a large amount of time would be because the user is either too far away
or at a really slanted angle making it too difficult to detect the centre of the
circle or the direction indicator. Since the scanning operates in real-time,
users are able to quickly alter the position of their device to get a better
view of the marker. Since this process is not time-consuming, the user can
scan some of these markers while they are being navigated in order to feed
1 http://www.scandit.com/barcode-scanner-sdk/features/performance/
Features presented Boundary detection Obstacle detection Total time
Few 20 1 3
Moderate 20 17 1
High 20 0 0
Results
We measured the time taken from the point when our implementation re-
ceived a preview frame till the successful identification of obstacles on the
left, right and ahead of the user. We held the device in a static position
while it was processing the preview frames from the camera. This allowed us
to calculate the average time from a sample of 100 frames. Table 7.2 shows
the time performance of our algorithm in milliseconds when presented with
different levels of features in the environment.
Analysis
Our results showed that the average time taken to detect obstacles was ap-
proximately 173ms in the worst case. This allowed our algorithm to
process up to 6 frames per second (FPS). Under normal stress levels, our ap-
plication was capable of processing 9-10 FPS on average. Although this was
slightly less than the number of frames processed per second by the human
eye (10-12 FPS), this lag was not noticeable from our application. However,
we have to also consider that our test device, the Samsung Galaxy S4, has a
very powerful processor that not many smartphones currently possess.
Nevertheless, it is predicted that many other smartphones in the future will
follow this specification.
We briefly considered applying this algorithm to a setting which had a
textured carpet. We found that our algorithm falsely detects obstacles caused
by patterns on the floor. This is a very serious drawback of our system as
many sites do not always have a plain carpet. Further image processing would
need to be developed in order to distinguish between the floor patterns and
actual obstacles.
The results we obtained indicate that this mechanism is fast enough for a
real-time application. This is largely because our implementation used OpenCV
functions, which are arguably the most efficient vision routines currently
available on the Android platform. We also reduced the resolution of the image
significantly and altered other parameters, such as the angle resolution of the
Hough transform, to achieve faster performance.
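To make these parameter choices concrete, the following is a minimal sketch of
this kind of pipeline using the OpenCV Java bindings on Android. The target
resolution, Canny thresholds and Hough parameters shown here are illustrative
assumptions rather than the exact values used in our implementation.

// Sketch: downscale the preview frame, then run Canny + probabilistic Hough
// with a coarse angle resolution to trade a little precision for speed.
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;

public class EdgeSpeedSketch {
    public static Mat detectLines(Mat rgbaFrame) {
        Mat small = new Mat();
        // Reduce the resolution significantly; 320x240 is an assumed target.
        Imgproc.resize(rgbaFrame, small, new Size(320, 240));

        Mat gray = new Mat();
        Imgproc.cvtColor(small, gray, Imgproc.COLOR_RGBA2GRAY);

        Mat edges = new Mat();
        Imgproc.Canny(gray, edges, 80, 160);    // thresholds are illustrative

        Mat lines = new Mat();
        // rho = 1 px, theta = 2 degrees (coarser than the usual 1 degree).
        Imgproc.HoughLinesP(edges, lines, 1, 2 * Math.PI / 180, 50, 40, 5);
        return lines;                           // one detected segment per row
    }
}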
Results
We asked users to count their steps and stop when they reached the target number
of steps. Each user did this twice, and we averaged the results over all ten
samples. Table 7.3 shows the results we obtained from this investigation.
In addition to our own findings, we repeated the same experiment with fewer
samples using the most popular pedometer app on the Android market, Accupedo.
Table 7.4 shows the results obtained from this app.
Number of steps | Steps detected (average) | Range
5               | 5.5                      | 4-6
10              | 9.3                      | 6-13
25              | 22.3                     | 18-27
50              | 45.4                     | 38-51
Table 7.3: Step detection results for our pedometer algorithm
Analysis
Our pedometer algorithm did not recognise the exact number of steps on most of
the attempts, but the average results were not discouraging: they were very close
to the actual number of steps taken. We also noticed that the average margin of
error increased as the number of steps walked increased. This is to be expected,
since over a longer distance there are more opportunities for a step to go
undetected. To find the cause of this discrepancy, we analysed the accelerometer
signal to check where peaks were not successfully detected. We discovered that
our algorithm failed to identify steps when users made a sharp turn around an
obstacle; the waveform produced by this movement differs from that of normal
walking. To solve this issue, we would have needed to distinguish the two
waveforms and apply the appropriate analysis to each.
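For illustration, the sketch below shows the general shape of a threshold-based
peak detector of the kind discussed in this section. The filter coefficient, peak
threshold and minimum step interval are assumed values rather than our tuned
parameters, and the sharp-turn waveform problem described above is deliberately
left unhandled.

// Illustrative peak-based step detection on the accelerometer magnitude.
public class StepDetectorSketch {
    private static final double PEAK_THRESHOLD = 10.8;   // m/s^2, just above gravity (assumed)
    private static final long MIN_STEP_INTERVAL_MS = 300; // debounce between steps (assumed)

    private double smoothed = 9.81;   // low-pass filtered magnitude
    private long lastStepTime = 0;
    private int stepCount = 0;

    /** Feed one accelerometer sample (values as delivered by SensorEvent.values). */
    public int onAccelerometer(float ax, float ay, float az, long timestampMs) {
        double magnitude = Math.sqrt(ax * ax + ay * ay + az * az);

        // Simple exponential low-pass filter to suppress sensor noise.
        double filtered = 0.8 * smoothed + 0.2 * magnitude;

        // Count a step when the filtered signal crosses the threshold upwards
        // and enough time has passed since the previously detected step.
        boolean risingCross = smoothed < PEAK_THRESHOLD && filtered >= PEAK_THRESHOLD;
        if (risingCross && timestampMs - lastStepTime > MIN_STEP_INTERVAL_MS) {
            stepCount++;
            lastStepTime = timestampMs;
        }

        smoothed = filtered;
        return stepCount;
    }
}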
The average number of steps detected using Accupedo was very close to the actual
number taken; in comparison, Accupedo produced more accurate results than our
application. We also noticed that their algorithm generally over-approximates,
while ours failed to detect some of the steps and thus under-approximates. The
other significant difference was that their application did not function as
expected when users walked a small number of steps. In their description, they
mention that they wait for the first 4-12 steps before displaying the pedometer
count. In our case, it was vital that our application also detect and display
small movements; otherwise, the accuracy of our integrated application would
suffer greatly.
7.3.2 Positioning accuracy
We assessed the overall accuracy of our dead reckoning algorithm by mea-
suring the position drift between the estimated position of the user and their
actual position on the site. To accomplish this, we designed three fixed paths,
as shown in figure 7.2, and observed five test users walk along these paths.
We designed these paths such that they returned the user to their original
starting location. We were also mapping user movements in real-time and
displaying them to the user. This allowed us to calculate the position drift when
the user returned to the starting marker, measured as the distance between the
user's estimated position according to our application and their actual final
position.
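The drift measurement itself reduces to simple dead-reckoning bookkeeping; the
sketch below illustrates the calculation under the assumption of a fixed stride
length and a heading measured from the site's local north (names and values are
illustrative, not our exact implementation).

// Advance the estimate by one stride per detected step, then measure the
// drift once the user is back at the starting marker.
public class DriftSketch {
    private double x = 0.0, y = 0.0;          // estimated position (metres)
    private final double strideLength = 0.7;  // assumed per-user calibration

    /** Called once per detected step; heading is in radians from local north. */
    public void onStep(double headingRad) {
        x += strideLength * Math.sin(headingRad);
        y += strideLength * Math.cos(headingRad);
    }

    /** Position drift: distance between the estimate and the known
     *  coordinates (startX, startY) of the starting marker. */
    public double driftFrom(double startX, double startY) {
        return Math.hypot(x - startX, y - startY);
    }
}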
Figure 7.2: The image shows the fixed paths used for testing
Path type | Error margin (metres)
A         | 1.9
B         | 3.4
C         | 5.7
Table 7.5: Results for the positioning accuracy achieved using a single marker
Figure 7.3: The image shows the fixed paths setup with multiple markers
Path type | Error margin (metres)
A         | 0.7
B         | 1.3
C         | 2.3
Table 7.6: Results for the positioning accuracy achieved using multiple markers
We used the third floor of the Imperial College central library as a platform for
a dummy application that allows users to navigate their way through the library
to find their desired books.
We placed eight position markers at various locations around the site and ensured
that all the direction indicators pointed to the local north of the library. We
then introduced nine destinations to mimic the locations of books, and stored the
position coordinates for all of these locally in our application. Figure 7.4
shows the layout of the site and highlights the locations of the position markers
and the nine destinations.
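As an illustration of what storing these coordinates locally amounts to, the
sketch below shows one possible in-memory representation of the markers and
destinations. The identifiers and coordinate values are placeholders, not the
actual library layout.

// Hypothetical local store mapping marker ids and destination labels to
// positions (in metres) on the site plan.
import java.util.HashMap;
import java.util.Map;

public class SiteData {
    /** Marker id (decoded integer) -> position on the site plan. */
    static final Map<Integer, double[]> MARKERS = new HashMap<>();
    /** Destination label -> position on the site plan. */
    static final Map<String, double[]> DESTINATIONS = new HashMap<>();

    static {
        MARKERS.put(1, new double[] {0.0, 0.0});      // placeholder coordinates
        MARKERS.put(2, new double[] {12.5, 3.0});
        DESTINATIONS.put("A", new double[] {4.0, 9.5});
        DESTINATIONS.put("B", new double[] {18.0, 2.0});
    }
}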
Figure 7.4: The image shows the layout of the test site with position
markers (grey) and the destinations (red)
Destination | Error margin (metres)
A           | 1.7
B           | 2.6
C           | 1.1
D           | 2.6
E           | 1.2
F           | 3.0
G           | 1.8
H           | 1.5
I           | 3.3
Table 7.7: Distance offsets between each destination and the user's final
position
We also had to input a rough estimate of each user's stride length for
personalisation.
Results
Table 7.7 shows the distance offsets in metres for all the destinations. All
values were rounded to the nearest ten centimetres, as millimetre-level precision
was not required.
Analysis
The results from our investigation showed that our navigation system on average
guided users to within 2.1 metres of their intended destination. Although this
level of accuracy is generally sufficient for indoor navigation, we observed that
on many occasions users ended up in the wrong aisle. Due to the narrow passages
in the library, our dead reckoning algorithm often failed to place the user in
the correct aisle. This was a direct consequence of the inaccuracies discovered
in Section 7.3, and a more accurate step detection algorithm would have
eliminated most of these errors. In larger settings such as museums and fairs,
however, this would not be necessary.
We also noticed that the time taken by users to reach their destination varied.
This was partly due to users walking quickly and not following the directions
exactly, as well as the position drift resulting from our dead reckoning
algorithm; as a result, users did not always follow the optimal path. In any
case, our priority was always accuracy rather than the shortest route to the
destination. Having multiple markers greatly improved the accuracy, as it enabled
the application to readjust its position estimate whenever the user had the
chance to scan a nearby marker.
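This readjustment step can be illustrated with a small sketch: when a marker is
decoded, the dead-reckoned estimate is snapped back to the marker's known
coordinates and the heading is re-aligned using the direction indicator. The
structure below is an assumption about how this could be organised, not our exact
implementation.

// Re-anchor the dead-reckoning estimate whenever a marker is scanned mid-route.
public class PositionCorrectionSketch {
    private double x, y;            // current dead-reckoned estimate (metres)
    private double headingOffset;   // correction applied to the compass heading

    /** Called when the scanner decodes a marker id together with the angle of
     *  its direction indicator, which points to the site's local north. */
    public void onMarkerScanned(int markerId, double indicatorAngleRad,
                                java.util.Map<Integer, double[]> markerPositions) {
        double[] known = markerPositions.get(markerId);
        if (known == null) return;   // unknown marker: ignore

        // Snap the position estimate to the marker's surveyed coordinates,
        // discarding any drift accumulated since the last fix.
        x = known[0];
        y = known[1];

        // Re-align the heading so that the decoded indicator corresponds to
        // the site's local north.
        headingOffset = -indicatorAngleRad;
    }
}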
Solution      | Error margin (metres)
Ego-motion    | 2.7
Triangulation | 1.7
Inertial      | 1.5
Some test users suggested removing the obstacle detection component and relying
on their own judgement to avoid obstacles while following the given directions.
This would eliminate the bulk of the vision processing and would certainly
improve battery life. With respect to the user interface, some users suggested
displaying a map of the site, as it would help them visualise their surroundings
and their position within the site. This contradicted one of our key objectives,
which was not to pre-load an indoor map of the site. As a compromise, we decided
to extend our application interface to display a mini-map showing only the
orientation of the user and the direction of the destination. Although this did
not reveal the full picture, it certainly aided navigation without requiring the
application to store any indoor maps.
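The mini-map only needs the relative bearing of the destination, which can be
computed as sketched below. The coordinate convention (x towards east, y towards
local north) and the names are assumptions for illustration.

// Angle at which to draw the destination arrow, relative to "straight ahead".
public final class MiniMapSketch {
    /** Returns the arrow angle in radians, measured clockwise from the
     *  direction the user is currently facing. */
    public static double arrowAngle(double userX, double userY,
                                    double destX, double destY,
                                    double userHeadingRad) {
        // Bearing of the destination relative to the site's local north.
        double bearing = Math.atan2(destX - userX, destY - userY);

        // Subtract the user's heading so the arrow is drawn relative to where
        // the user is facing; normalise to (-pi, pi].
        double relative = bearing - userHeadingRad;
        while (relative <= -Math.PI) relative += 2 * Math.PI;
        while (relative > Math.PI) relative -= 2 * Math.PI;
        return relative;
    }
}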
7.5 Summary
Our results suggest that our application provides a reasonably accurate and
flexible means for users to navigate indoors without requiring many
infrastructural changes. Nevertheless, we also realised that the average accuracy
of our application was not satisfactory in certain indoor environments, such as
libraries and small supermarkets, which typically have narrow aisles. This was
largely due to our dead reckoning implementation failing to detect a step when
users made a sharp turn. In our evaluation, we managed to suppress this problem
to a certain extent by introducing more position markers around the site.
Although this improved the accuracy significantly, it resulted in more user
interaction. The lighting conditions at the site also largely influence the
accuracy and the time taken to decode position markers, which is an important
factor to consider when placing them. Printing these markers on a matt,
anti-reflective surface would allow them to be detected successfully even under
bright conditions; however, we had to avoid placing them under low levels of
light.
The application of obstacle detection to indoor navigation was rather innovative.
It allowed the navigation system to assess the surrounding environment to a
certain extent and verify whether the user could walk in a given direction from
their current position. Nevertheless, to successfully integrate and test this
mechanism, we had to impose an important restriction on the site: the floor could
not have any patterns. Furthermore, the user had to keep the phone in a set
position throughout their entire indoor journey, as our algorithm relied on a
constant stream of preview frames from the smartphone camera. This also
significantly increased the battery consumption of the device. Here, we have to
concede that the limitations of this component outweigh the benefits it brings to
navigation. A potential future solution could be to remove this component
completely and instead let users rely on their own intuition to avoid obstacles
while keeping up with the given direction hints. To aid navigation further, we
could supply our application with an indoor map of the site. Although this goes
against one of our key objectives, our qualitative analysis revealed that users
actually prefer to know the layout of the site and their current position with
respect to their destination.
Chapter 8
Conclusion
8.1 Summary
As part of building an indoor navigation system, we designed our own custom
markers, which encode integers as well as a direction indicator for obtaining the
user's position and orientation. We studied various computer vision techniques,
such as the Hough transform and Canny edge detection, to develop a camera-based
scanner to decode these markers. Our analysis showed that the encoded data could
almost always be extracted within milliseconds under normal lighting conditions.
Furthermore, our scanner was capable of decoding data from any angle, even when
the user was standing at a distance, which gave the user the advantage of
scanning position markers without having to bend down. However, we also noted
that the markers did not function as expected when the lighting conditions were
either too bright or too low.
We followed up on these vision techniques to develop an obstacle detector using
the smartphone camera. After assessing the time performance of this component, we
were able to validate its usability in a real-time application. We therefore
decided to include this mechanism in our final system to check the feasibility of
the user moving to a new position.
From here, we focussed our attention on building a pedometer algorithm using the
inertial sensors on the smartphone. This involved filtering and analysing the
accelerometer signal to correctly detect a step and then keeping track of the
user's movements as part of the dead reckoning process. Our initial evaluation of
this feature revealed a significant margin of error in the user's position that
grew with increasing distance. This prompted us to place more position markers
around the site and to request that users scan additional markers in proximity
while being navigated, which significantly improved the accuracy of our dead
reckoning algorithm.
The final step comprised combining these three components in order to display
directions that would eventually lead the user to their destination. Full
functional testing of the application demonstrated that our system was capable of
guiding the user to their intended destination within a small margin of error
(2.1 metres). In general, this level of accuracy was adequate for indoor
navigation.
8.2 Future work
Multiple floors
Extending our system to multi-storey buildings would require the step detection
algorithm to detect and distinguish between the steps taken to walk up and down a
staircase. This entails complex signal processing, and even then our
implementation may not achieve the desired result. A much simpler alternative is
for the user to scan a position marker on reaching the destination floor. For
instance, the application would navigate the user to the nearest staircase and
then tell them to walk up or down a certain number of floors; a position marker
would be placed at the entrance and the exit near the staircase on every floor.
When the user reaches the requested floor, they scan this position marker and
continue as before. Similarly, this approach could also be applied to lifts and
escalators.
Marker design
The current position markers are susceptible to poor lighting conditions because
they encode data using colours. If too much or too little light falls on a
marker, the camera preview cannot correctly identify the colours. A more robust
design could encode data using shapes and patterns, as QR codes do. The ability
to store more data in these markers would also be beneficial for larger sites.
Another problem with the current design is that the markers only work well when
placed on the floor. When stuck to a wall, they can only be scanned if the user
is directly in front of them rather than at a slanted angle, because the centre
of the circle is detected incorrectly when viewed at an angle. The marker design
could be changed to allow correct detection from every angle.
Obstacle detection
Our obstacle detection mechanism currently detects all the linear edges in the
environment. Due to time limitations, we were unable to extend it to distinguish
between floor patterns and actual obstacles. A straightforward analysis of the
line angles would help establish whether a detected edge has three-dimensional
properties, which could be used to eliminate some of the false positives; a
sketch of this idea is given below. However, more work is needed to handle
complex floor patterns.
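As a rough sketch of the suggested line-angle analysis, one could classify the
segments returned by the probabilistic Hough transform by their slope in the
image and treat only sufficiently steep segments as evidence of three-dimensional
structure. The 30-degree cut-off below is purely an assumed value for
illustration.

// Count Hough segments (rows of x1,y1,x2,y2) steeper than an assumed cut-off;
// near-horizontal segments are more likely to come from floor patterns.
import org.opencv.core.Mat;

public class LineAngleFilterSketch {
    public static int countSteepSegments(Mat houghSegments) {
        int steep = 0;
        for (int i = 0; i < houghSegments.rows(); i++) {
            double[] seg = houghSegments.get(i, 0);   // {x1, y1, x2, y2}
            double angle = Math.abs(Math.toDegrees(
                    Math.atan2(seg[3] - seg[1], seg[2] - seg[0])));
            if (angle > 90) angle = 180 - angle;      // fold into [0, 90] degrees
            if (angle > 30) steep++;                  // assumed threshold
        }
        return steep;
    }
}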
Path planning
Once our indoor navigation system is set up, we could extend it to employ some
form of optimal path planning that users can follow from the moment they enter
the site. For instance, users could create their shopping lists offline, and when
they enter the supermarket our application could use this data to plan and
display directions from start to finish. Museums and art galleries could also use
this application to offer visitors special tours, navigating them through all
their preferred exhibits. Overall, indoor navigation opens up a wide variety of
extensions tailored to specific industries.
Appendix A
The following example and the corresponding images are taken from the OpenCV
documentation for the Hough line transform.
Let us take an example where the binary image contains a point x = 8
and y = 6. If we plot all the possible values of θ and r for that point, we
obtain the following sinusoidal curve:
We repeat the process for every binary point in the image. So suppose
there are two other points in the image with coordinates (9, 4) and (12, 3),
we then obtain the following plot.
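For reference, the sinusoid for a point (x, y) comes from the normal-form line
equation r = x cos θ + y sin θ, so the three curves being plotted are:

r = 8 cos θ + 6 sin θ
r = 9 cos θ + 4 sin θ
r = 12 cos θ + 3 sin θ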
We can see that the three curves intersect at θ = 0.925 and r = 9.6. So there is
a high probability that there exists a line in the original image with the
equation 9.6 = x cos(0.925) + y sin(0.925).
There exist many such intersection points in the graph corresponding to
the different lines detected in the image. Furthermore, we can set a thresh-
old for the minimum number of intersections to eliminate any spurious line
detections.
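In OpenCV terms, this minimum number of intersections corresponds to the
accumulator threshold passed to the standard Hough transform. The sketch below
shows this mapping using the Java bindings; the resolution parameters are
illustrative.

// The `minIntersections` argument is the minimum number of sinusoid
// intersections (accumulator votes) a line needs before it is reported.
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

public class HoughThresholdSketch {
    public static Mat detect(Mat binaryEdges, int minIntersections) {
        Mat lines = new Mat();
        // rho resolution = 1 pixel, theta resolution = 1 degree.
        Imgproc.HoughLines(binaryEdges, lines, 1, Math.PI / 180, minIntersections);
        return lines;   // each row holds (r, theta) for one detected line
    }
}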