802.11 User Fingerprinting
MobiCom 2007
(Sept 9-14 2007 Montreal, Quebec, Canada)
Jeffrey Pang (Carnegie Mellon)
Ben Greenstein (Intel Research)
Ramakrishna Gummadi (U of Southern California)
Srinivasan Seshan (Carnegie Mellon)
David Wetherall (U of Washington)
Authors
The best practices for securing 802.11
networks, embodied in the 802.11i
standard, provide user authentication,
service authentication, data confidentiality,
and data integrity.
However, they do not provide anonymity, a
property essential to prevent location
tracking.
Problem Statement
Demonstrate that existing user
identification and tracking
countermeasures are ineffective
Highlight four previously unrecognized
unique traffic identifiers (implicit
identifiers)
Present an automated procedure to
uniquely identify wireless users
Objectives
The Implicit Identifier Problem
◦ SSID Broadcast with identifiable names
“MIT” or “UniversityofWashington”
◦ Traffic Patterns
Periodic IMAP or SMTP connections to an
identifiable mail server
Unique repetitive packet/frame sizes
◦ Design Flaws/Implementation Variances
Fingerprinting higher layers in the stack (nmap/p0f)
Timing Characteristics
Problem In Detail
Implicit identifiers map to physical locations (WiGLE.net)
Location Privacy
◦ RFID devices
◦ GPS enabled devices
Identity Hiding
◦ Using pseudonyms to mask MAC addresses
(Gruteser, Jiang, Stajano)
Implicit Identifiers
◦ Fingerprinting 802.11 driver timings (Franklin,
Kohno)
◦ Clickprints (Padmanabhan and Yang)
Related Work
The Adversary
◦ Passive monitoring (weak adversary)
◦ Using TCPDUMP only
The Environment
◦ Large and small wireless networks evaluated
2004 SIGCOMM Conference (4 days)
U.C. San Diego CS Building (1 day)
Apartment Building (19 days)
◦ Encrypted (WEP/WPA) and Unencrypted
Monitoring Scenario
◦ Assume pseudonyms are randomly chosen every hour
NOTE: Profiled users are those users who were present in both
the training set and the validation data.
Evaluation Criteria
Network Destinations (netdests)
◦ Set of IP <address, port> pairs
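As a minimal sketch, a netdests set for one traffic sample could be built as shown below. The `Packet` record and its field names (`dst_ip`, `dst_port`) are hypothetical and chosen only for illustration; the addresses are documentation-range placeholders.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Packet(NamedTuple):
    # Hypothetical minimal packet record; real traces carry many more fields.
    dst_ip: str
    dst_port: int


def netdests(packets: Iterable[Packet]) -> Set[Tuple[str, int]]:
    """Collect the distinct <address, port> pairs contacted in one sample."""
    return {(p.dst_ip, p.dst_port) for p in packets}


sample = [
    Packet("192.0.2.10", 993),    # repeated IMAPS connection
    Packet("192.0.2.10", 993),
    Packet("198.51.100.7", 443),
]
print(sorted(netdests(sample)))
```

Because the result is a set, repeated connections to the same destination collapse to a single element.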
Traffic Characteristics/Identifiers: [ssids], [netdests], [bcast], [frame]
Identifiers
Naïve Bayes Classifier
From Bayes’ Theorem:
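For reference, the standard form of Bayes' theorem, together with the independence assumption that makes the classifier "naïve", can be written as follows. The symbols ($U$ for a user, $f_1, \dots, f_n$ for the features of a traffic sample) are notation assumed here, not taken from the slide.

```latex
\[
\Pr[U \mid f_1, \dots, f_n]
  = \frac{\Pr[f_1, \dots, f_n \mid U]\,\Pr[U]}{\Pr[f_1, \dots, f_n]},
\qquad
\Pr[f_1, \dots, f_n \mid U] \approx \prod_{i=1}^{n} \Pr[f_i \mid U]
\]
```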
Classification Model
Feature Generation
To compute probabilities, implicit identifiers must be
converted to real-valued features
◦ [fields] – each field combination represents a different
value
◦ [ssids, bcast, netdests] – set of discrete elements
A weighted version of the Jaccard similarity index is used to
determine the real-valued feature
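A weighted Jaccard index between a user's profile set and a sample set can be sketched as below. The slide does not say how the weights are chosen, so the `weights` table here (rarer elements count for more) and the default weight of 1.0 are assumptions for illustration only.

```python
from typing import Dict, Hashable, Set


def weighted_jaccard(a: Set[Hashable], b: Set[Hashable],
                     weight: Dict[Hashable, float]) -> float:
    """Weight of the intersection divided by the weight of the union."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    total = sum(weight.get(e, 1.0) for e in union)
    shared = sum(weight.get(e, 1.0) for e in a & b)
    return shared / total if total > 0 else 0.0


# Hypothetical weights: a common default SSID is worth less than a rare one.
weights = {"linksys": 0.1, "home-net-42": 1.0, "cafe-wifi": 0.5}
profile = {"linksys", "home-net-42"}
sample = {"linksys", "home-net-42", "cafe-wifi"}
print(weighted_jaccard(profile, sample, weights))
```

With all weights equal, this reduces to the ordinary Jaccard index, the size of the intersection over the size of the union.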
Classification Model
Accuracy measured by two components
◦ True Positive Rate (TPR)
Fraction of validation samples that user U
generates that are correctly classified
◦ False Positive Rate (FPR)
Fraction of validation samples that user U does
not generate that are incorrectly classified
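The two rates can be computed from labeled validation outcomes as in this minimal sketch. The input layout, a list of `(from_user, classified_as_user)` pairs, and the sample outcomes are assumptions made for illustration.

```python
from typing import List, Tuple


def tpr_fpr(results: List[Tuple[bool, bool]]) -> Tuple[float, float]:
    """results holds (from_user, classified_as_user) per validation sample."""
    tp = sum(1 for actual, predicted in results if actual and predicted)
    fp = sum(1 for actual, predicted in results if not actual and predicted)
    positives = sum(1 for actual, _ in results if actual)
    negatives = len(results) - positives
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return tpr, fpr


# Hypothetical validation outcomes for one user U.
outcomes = [(True, True), (True, False), (True, True), (True, True),
            (False, False), (False, True), (False, False), (False, False)]
print(tpr_fpr(outcomes))  # (0.75, 0.25)
```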
Classifier Accuracy
Classifier Accuracy Metrics
Mean True Positive Rate (TPR) at false positive rates of 1/100 and 1/10, respectively, alongside the maximum expected TPR
[Figure: complementary cumulative distribution function (CCDF) on sigcomm users; panels (c) FPR = 0.01 and (d) FPR = 0.1]
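An empirical CCDF over per-user values can be computed as in the sketch below. It uses the "fraction of values at least x" convention, which is an assumption here (some definitions use strictly greater than), and the per-user TPR values are made up for illustration.

```python
from typing import List, Tuple


def ccdf(values: List[float]) -> List[Tuple[float, float]]:
    """Return (x, fraction of values >= x) for each distinct x, ascending."""
    n = len(values)
    return [(x, sum(1 for v in values if v >= x) / n)
            for x in sorted(set(values))]


# Hypothetical per-user true positive rates.
per_user_tpr = [0.2, 0.5, 0.5, 1.0]
print(ccdf(per_user_tpr))  # [(0.2, 1.0), (0.5, 0.75), (1.0, 0.25)]
```

Read this way, the point (0.5, 0.75) says that 75% of users have a TPR of at least 0.5.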
Constraints:
◦ Public Network: [netdests, ssids, fields, bcast]
◦ Home Network: [ssids, fields, bcast]
◦ Enterprise Network: [ssids, bcast]
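These constraints can be expressed as a small lookup table that restricts a sample to the identifiers usable in each scenario. This is a sketch only: the dictionary layout, the `restrict` helper, and the sample values are hypothetical.

```python
from typing import Any, Dict, List

# Identifiers available to the classifier in each scenario, per the list above.
VISIBLE_FEATURES: Dict[str, List[str]] = {
    "public": ["netdests", "ssids", "fields", "bcast"],
    "home": ["ssids", "fields", "bcast"],
    "enterprise": ["ssids", "bcast"],
}


def restrict(sample: Dict[str, Any], network_type: str) -> Dict[str, Any]:
    """Keep only the features usable for the given network type."""
    allowed = VISIBLE_FEATURES[network_type]
    return {name: value for name, value in sample.items() if name in allowed}


# Hypothetical feature values for one traffic sample.
sample = {"netdests": 0.8, "ssids": 0.6, "fields": 0.9, "bcast": 0.4}
print(restrict(sample, "enterprise"))  # {'ssids': 0.6, 'bcast': 0.4}
```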
Tracking
Testing the Classifier
In all scenarios the classifier is able to identify unique users with 90%+ accuracy
[Figure: classification accuracy using the ‘Public, Home, Enterprise’ constraints; complementary cumulative distribution function (CCDF), FPR = 0.01]
Conclusions
Ability to identify users is not uniform
◦ Some users do not display any characteristics that distinguish
them
◦ Majority of users can be tracked with 90% accuracy even when
unique names/addresses are removed
Any one implicit identifier can be highly discriminating
◦ An adversary may need only 1-3 samples of a user’s traffic to track them, on
average
Research assumptions serve to place a lower bound on the
findings
◦ A more advanced adversary may achieve significantly higher
accuracy
Applying existing best practices will fail to protect the
anonymity of a non-trivial fraction of users
◦ Pseudonyms alone are not enough to provide location privacy
Summary of Findings
Similar Research
◦ Dijiang Huang, “Traffic analysis-based unlinkability measure for IEEE 802.11b-based communication systems,” 5th ACM Workshop on Wireless Security, 2006.
◦ Y. Zhu, R. Bettati, “Compromising Privacy in Wireless Network Using Cheap Sensors,” Texas A&M University Tech Report, 2005.
Pseudonyms
◦ M. Gruteser and D. Grunwald, “Enhancing location privacy in wireless LAN through disposable interface identifiers: a quantitative analysis,” in WMASH, pp. 46–55, 2003.
RSS
◦ A. M. Ladd, K. E. Bekris, A. Rudys, G. Marceau, L. E. Kavraki, and D. S. Wallach, “Robotics-based location sensing using wireless Ethernet,” in Proceedings of the Eighth ACM International Conference on Mobile Computing and Networking (MOBICOM), (Atlanta, GA), Sept. 2002.
Angle of Arrival
◦ D. Niculescu and B. Nath, “VOR base stations for indoor 802.11 positioning,” in MobiCom ’04: Proceedings of the 10th annual international conference on Mobile computing and networking, (New York, NY, USA), pp. 58–69, ACM Press, 2004.
◦ D. Niculescu and B. R. Badrinath, “Ad hoc positioning system (APS) using AOA,” in INFOCOM, 2003.
Time of Arrival
◦ R. J. I. Guvenc, C. T. Abdallah and O. Dedeoglu, “Enhancements to RSS based indoor tracking systems using Kalman filters,” in GSPx & International Signal Processing Conference, (Dallas, TX), 2003.
Next Steps