Monitoring SIP Traffic Using Support Vector Machines


Mohamed Nassar, Radu State, Olivier Festor

(nassar, state, festor)@loria.fr

MADYNES Team
INRIA, Nancy Grand Est

17 September 2008
Outline

• Introduction to SIP
• Threats
• Monitoring system
• Experiments
• Future work and conclusion

• SIP (Session Initiation Protocol, RFC 3261): text-based, like HTTP
• Request + response = transaction
• URI = sip:user@host:port;parameters

Example call between a SIP hard phone and a soft phone (1000@192.168.1.12 and bob@192.168.1.10). Signaling uses port 5060 at both ends; RTP uses ports 10502 and 34154.

INVITE (SDP (U-Law))
100 Trying
180 Ringing
200 OK (SDP (A-Law))
ACK
RTP (A-Law)
BYE
200 OK

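The URI layout sip:user@host:port;parameters can be illustrated with a small parser. This is a minimal sketch, not a full RFC 3261 parser: the function name `parse_sip_uri` and the `transport=udp` parameter in the usage line are illustrative assumptions, and IPv6 hosts, passwords and URI headers are not handled.

```python
def parse_sip_uri(uri):
    """Split a URI of the form sip:user@host:port;parameters into its parts."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() not in ("sip", "sips"):
        raise ValueError("not a SIP URI: %r" % uri)
    # Everything after the first ';' is a list of name=value parameters.
    address, *raw_params = rest.split(";")
    params = {}
    for item in raw_params:
        if item:
            name, _, value = item.partition("=")
            params[name] = value
    # The user part is optional; rpartition leaves `at` empty when it is absent.
    user, at, hostport = address.rpartition("@")
    host, colon, port = hostport.partition(":")
    return {
        "scheme": scheme.lower(),
        "user": user if at else None,
        "host": host,
        "port": int(port) if colon else None,
        "params": params,
    }


# Usage with an address from the call flow (the parameter is made up).
print(parse_sip_uri("sip:bob@192.168.1.10:5060;transport=udp"))
```
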
SIP Trapezoid

Elements: Bob, two proxy servers, Alice, a DNS server and a database server.
• DNS server: IP address of the SIP service at berlin.org
• Database server: where is Alice registered?
• INVITE sip:Alice@berlin.org travels from Bob through the proxy servers to Alice

INVITE sip:Alice@berlin.org SIP/2.0
Via: SIP/2.0/UDP loria.nancy.org:5060;branch=z9hG4bKfw19b
Max-Forwards: 70
To: Alice <sip:Alice@berlin.org>
From: Bob <sip:Bob@nancy.org>;tag=76341
Call-ID: 123456789@loria.nancy.org
CSeq: 1 INVITE
Contact: <sip:Bob@nancy.org>
Content-Type: application/sdp

<SDP body not shown>

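Because SIP is text-based, the start line and headers of a message such as the INVITE above can be read with plain string operations. The sketch below is an assumption-laden illustration (the function name `parse_sip_message` is made up, and this is not the Jain SIP parser used by the authors): it ignores header folding, compact header names and repeated headers.

```python
def parse_sip_message(text):
    """Return the kind, start-line fields, headers and body of a SIP message."""
    head, _, body = text.replace("\r\n", "\n").partition("\n\n")
    lines = head.split("\n")
    start = lines[0].split(" ", 2)
    if start[0].startswith("SIP/"):
        # Status line of a response: SIP-Version Status-Code Reason-Phrase
        kind = "response"
        info = {"status": int(start[1]),
                "reason": start[2] if len(start) > 2 else ""}
    else:
        # Request line: Method Request-URI SIP-Version
        kind = "request"
        info = {"method": start[0], "uri": start[1]}
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"kind": kind, "info": info, "headers": headers, "body": body}


# The INVITE from the slide, without the Content-Type header and SDP body.
EXAMPLE = (
    "INVITE sip:Alice@berlin.org SIP/2.0\r\n"
    "Via: SIP/2.0/UDP loria.nancy.org:5060;branch=z9hG4bKfw19b\r\n"
    "Max-Forwards: 70\r\n"
    "To: Alice <sip:Alice@berlin.org>\r\n"
    "From: Bob <sip:Bob@nancy.org>;tag=76341\r\n"
    "Call-ID: 123456789@loria.nancy.org\r\n"
    "CSeq: 1 INVITE\r\n"
    "Contact: <sip:Bob@nancy.org>\r\n"
    "\r\n"
)
message = parse_sip_message(EXAMPLE)
print(message["info"]["method"], message["headers"]["call-id"])
```
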
Threats in the VoIP domain

• Unwanted calls for telemarketing and advertising
• Misrepresentation of identity to obtain personal information
• Discovering the users' extensions in a VoIP domain
• Messages not compliant with protocol specifications
• Attacks resulting in resource exhaustion of the target
• Displaying a number different from the originating one
• Brute-force voice-mail and register-account password cracking
• Attacks resulting in premature session teardown or service abuse

DoS

• Flooding attacks target the signaling plane elements (e.g. proxy, gateway) with the objective of taking them down or limiting their quality, reliability and availability

Strategy:
• Legitimate SIP messages
• Malformed SIP messages
• Invalid SIP messages
• Spoofed SIP messages
• CPU-based attacks targeting the authentication process

Destination:
• A valid URI in the target domain
• A nonexistent URI in the target domain
• A URI with an invalid domain or IP address
• An invalid URI in another domain
• A valid URI in another domain

[Figure: flooding using invalid destination domains at 100 INVITE/second]

SPIT or SPam over Internet Telephony
• Like SPAM (cost-free) but more annoying (phone ringing all day,
interruption of work)
• Expected to become a severe issue with the large deployment of
VoIP services
• SPIT transactions are technically correct
• We don’t know the content until the phone rings
• We need to be reachable
• SPAM filtering solutions are not directly applicable
• Current approaches: multi-level grey list, Turing tests, Trust
management, VoIP SEAL from NEC, VoIP SPAM detector from
University of North Texas

Monitoring Approach

[Figure: SIP flow, queue (processed when the queue is full), processor, vector of features, classifier, events, event correlator/decider, alarms. Learning updates the classifier with couples (vector, class ID). A timeline of Normal, Attack and Normal periods shows the border effect and false positives.]

Monitoring System

[Figure: the SIP flow feeds a short-term window and a long-term window. Each window goes through an analyser that produces a vector of (selected) features, then a classifier. The short-term chain performs flood detection and raises alarms; the long-term chain sends events to the event correlator/decider, which raises alarms. Other labels: learning with couples (vector, class ID), update, adjustment, recovery algorithm, start/stop.]

• Short-term/long-term monitoring
• Count-related/chronological windows
• Different classification and anomaly detection techniques
• Learning-updating/testing
• Defense against manipulation attacks (poisoning)
• Feature selection and extraction
• Event correlation
• Prevention

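The difference between count-related and chronological windows can be sketched as two ways of slicing the same message stream. This is only an illustration under stated assumptions: the function names, the `(timestamp, payload)` tuple layout and the choice to drop a trailing incomplete count window are not from the slides.

```python
def count_windows(messages, size):
    """Count-related windows: consecutive slices of exactly `size` messages.

    A trailing slice with fewer than `size` messages is dropped.
    """
    for start in range(0, len(messages) - size + 1, size):
        yield messages[start:start + size]


def time_windows(messages, seconds):
    """Chronological windows: slices covering `seconds` of traffic each.

    `messages` is a list of (timestamp, payload) tuples sorted by timestamp.
    Periods without traffic produce empty windows.
    """
    if not messages:
        return
    window, end = [], messages[0][0] + seconds
    for msg in messages:
        while msg[0] >= end:
            yield window
            window, end = [], end + seconds
        window.append(msg)
    if window:
        yield window


stream = [(0, "INVITE"), (1, "100"), (5, "BYE")]
print(list(count_windows(stream, 2)))
print(list(time_windows(stream, 2)))
```

A count-related window always holds the same number of messages, so its duration shrinks when traffic is heavy; a chronological window has a fixed duration but a variable number of messages.
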
Why SVM?

• Kernel function (radial basis, linear, polynomial, sigmoid)
• Known to process high-dimensional data
• Classification, regression and exploration of data
• High performance in many domains (bioinformatics, pattern recognition) and in network-based intrusion detection as well
• Unsupervised learning

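The four kernel functions named above can be written in a few lines. The sketch below uses the usual LibSVM-style forms; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions chosen for illustration, not settings reported in the slides.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear(u, v):
    return dot(u, v)


def polynomial(u, v, gamma=1.0, coef0=0.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def radial_basis(u, v, gamma=1.0):
    # exp(-gamma * squared Euclidean distance between u and v)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid(u, v, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
print(linear(u, v), polynomial(u, v, degree=2), radial_basis(u, v), sigmoid(u, v))
```
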
Feature Selection
• We have 38 features characterizing the SIP traffic
• Distributed over 5 groups:
1. General statistics
2. Call-ID based statistics
3. Dialog final state distribution
4. Request distribution
5. Response distribution
• We take into account inbound and outbound messages
• Other features can be investigated as well
• Features must be characterized by a small extraction complexity
• Our feature extraction tool is written in Java using the Jain SIP parser

[Figure: message sequence INVITE (SDP), 100, OPTIONS, 200 OK (SDP), 200 OK, ACK, annotated with inter request arrival, inter SDP arrival and inter response arrival]

Example features:
• Average inter request arrival
• Average inter response arrival
• Average inter SDP arrival
• Number of requests / total number of messages
• Number of responses / total number of messages
• Number of SDP / total number of messages
• Number of messages having the same Call-ID

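The general statistics listed above are cheap to compute over one window of messages. The following is a minimal sketch in Python rather than the authors' Java tool; the `(timestamp, is_request, has_sdp)` tuple layout and the function names are assumptions, while the feature names follow the annex.

```python
def mean_gap(times):
    """Average inter-arrival time of sorted timestamps (0.0 if fewer than two)."""
    if len(times) < 2:
        return 0.0
    # The gaps telescope, so their mean is (last - first) / (count - 1).
    return (times[-1] - times[0]) / (len(times) - 1)


def general_statistics(window):
    """Group 1 features for a non-empty, time-sorted window.

    Each message is a (timestamp, is_request, has_sdp) tuple.
    """
    total = len(window)
    requests = [t for t, is_request, _ in window if is_request]
    responses = [t for t, is_request, _ in window if not is_request]
    with_sdp = [t for t, _, has_sdp in window if has_sdp]
    return {
        "Duration": window[-1][0] - window[0][0],
        "NbReq": len(requests) / total,
        "NbResp": len(responses) / total,
        "NbSdp": len(with_sdp) / total,
        "AvInterReq": mean_gap(requests),
        "AvInterResp": mean_gap(responses),
        "AvInterSdp": mean_gap(with_sdp),
    }


window = [(0, True, True), (1, False, False), (2, True, False), (4, False, True)]
print(general_statistics(window))
```
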
Traces and testbed

[Figure: real-world VoIP service provider]

VoIP specific bots

Available from www.loria.fr/~nassar

[Figure: a malicious user (manager) sends commands to the VoIP bots through an IRC server/channel. The malicious user uploads exploit code to a web server with dynamic DNS, and a VoIP bot retrieves the exploit over HTTP. The bots launch attacks (DoS, SPIT) over SIP and RTP against victims: Asterisk, Cisco, Linksys, Thomson, Grandstream.]

Experiments

Classification time < 1 s

Trace            Normal   DoS    KIF    Unknown
SIP pkts         57960    6076   2305   7033
Duration (min)   8.6      3.1    50.9   83.7

Normal Data Coherence Test

[Figure: panels labeled Day 1, Day 1, Day 1 and Day 2]

Monitoring Window Size

The overall trace is about 8.6 minutes and the message arrival rate is about 147 msg/s.

Feature Selection
• A greater number of features does not mean
higher accuracy
• Feature selection increases the accuracy
and the performance of the system
• Selected features are highly dependent on
the underlying traffic and the attacks to be
detected
• A preliminary approach combines F-score
and SVM
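
The F-score mentioned in the last bullet can be computed per feature from labeled vectors. The sketch below uses the F-score as commonly defined for SVM feature selection (squared distances of the class means to the overall mean, divided by the sum of the within-class sample variances); whether the authors used exactly this variant is an assumption, and the function names are illustrative.

```python
def f_score(pos, neg):
    """F-score of one feature from its values on attack (pos) and normal (neg) samples.

    Each class needs at least two samples.
    """
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    within = (sum((x - mean_pos) ** 2 for x in pos) / (n_pos - 1)
              + sum((x - mean_neg) ** 2 for x in neg) / (n_neg - 1))
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


def rank_features(vectors, labels):
    """Feature indices sorted by decreasing F-score (labels: True = attack)."""
    scores = []
    for i in range(len(vectors[0])):
        pos = [v[i] for v, y in zip(vectors, labels) if y]
        neg = [v[i] for v, y in zip(vectors, labels) if not y]
        scores.append((f_score(pos, neg), i))
    return [i for _, i in sorted(scores, key=lambda s: -s[0])]


vectors = [[1, 5], [2, 6], [3, 7], [7, 5], [8, 6], [9, 7]]
labels = [True, True, True, False, False, False]
print(rank_features(vectors, labels))
```

A feature with a high F-score separates the two classes well on its own; combining the ranking with an SVM typically means training on the top-ranked features and keeping the subset that gives the best accuracy.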

Flooding Detection

Background traffic ~ 147 msg/s
Window = 30 messages

[Figure: classifier output (A/N) over time t, with the attack period marked]

Selected Features for Flooding / Short Term Monitoring

Number   Name
11       NbReceivers
14       NbCALLSET
20       NbInv
4        NbSdp
2        NbReq
3        NbResp
13       NbNOTACALL
12       AvMsg

[Figure label: F-score]

SPIT Detection

Background traffic ~ 147 msg/s
Window = 30 messages
False positive = 0 %

[Figure: classifier output (A/N) over time t, with the attack period marked]

Selected Features for SPIT / Long Term Monitoring

Number   Name
16       NbRejected
4        NbSdp
20       NbInv
23       NbAck
36       Nb4xx
34       Nb2xx
7        AvInterSdp
35       Nb3xx
13       NbNOTACALL

[Figure label: F-score]

Event Correlation

Predicate                                         SPIT Intensity
10 distributed positives in a 2-minute period     Low (stealthy)
Multiple series of 5 successive positives         Medium
Multiple series of 10 successive positives        High

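The three predicates above can be sketched as a small rule-based correlator over the classifier's per-window results. Several points are assumptions made only for this illustration: "multiple" is read as at least two series, a run of N or more consecutive positives counts as one series of N, the highest matching level wins, and the function names and `(timestamp, is_positive)` layout are made up.

```python
def successive_runs(flags):
    """Lengths of the maximal runs of consecutive positives."""
    runs, current = [], 0
    for flag in flags:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def spit_intensity(results, period=120.0):
    """Map time-sorted (timestamp_seconds, is_positive) results to an intensity."""
    runs = successive_runs([positive for _, positive in results])
    if sum(1 for r in runs if r >= 10) >= 2:
        return "High"
    if sum(1 for r in runs if r >= 5) >= 2:
        return "Medium"
    # Low: any 10 positives whose timestamps fit within `period` seconds.
    times = [t for t, positive in results if positive]
    for i in range(len(times) - 9):
        if times[i + 9] - times[i] <= period:
            return "Low"
    return None


# Ten isolated positives spread over 90 seconds.
stealthy = [(i * 5.0, i % 2 == 0) for i in range(20)]
print(spit_intensity(stealthy))
```
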
Conclusion and Future work
• An online monitoring methodology based on the SVM learning machine is proposed
• Offline experiments show real-time performance and high detection accuracy
• Anomaly detection and unsupervised learning approaches are future work
• Studying traces of other VoIP attacks
• More investigation of the set of features and the selection algorithms
• Extending the event correlation framework in order to reveal attack strategies and recognize attacker plans

Annex

Features
Group 1 - General Statistics
1   Duration      Total time of the slice
2   NbReq         # of requests / Total # of messages
3   NbResp        # of responses / Total # of messages
4   NbSdp         # of messages carrying SDP / Total # of messages
5   AvInterReq    Average inter arrival of requests
6   AvInterResp   Average inter arrival of responses
7   AvInterSdp    Average inter arrival of messages carrying SDP bodies

Features
Group 2 - Call-ID based statistics
8    NbSess        # of different Call-IDs
9    AvDuration    Average duration of a Call-ID
10   NbSenders     # of different senders / Total # of Call-IDs
11   NbReceivers   # of different receivers / Total # of Call-IDs
12   AvMsg         Average # of messages per Call-ID

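The Call-ID based statistics can be computed by grouping the messages of a window by Call-ID. This is a minimal sketch under stated assumptions: the `(timestamp, call_id, sender, receiver)` tuple layout and the function name are illustrative, and the duration of a Call-ID is taken to be the time between its first and last message inside the window.

```python
from collections import defaultdict


def call_id_statistics(window):
    """Group 2 features for a non-empty window.

    Each message is a (timestamp, call_id, sender, receiver) tuple.
    """
    times = defaultdict(list)
    senders, receivers = set(), set()
    for timestamp, call_id, sender, receiver in window:
        times[call_id].append(timestamp)
        senders.add(sender)
        receivers.add(receiver)
    sessions = len(times)
    durations = [max(ts) - min(ts) for ts in times.values()]
    return {
        "NbSess": sessions,
        "AvDuration": sum(durations) / sessions,
        "NbSenders": len(senders) / sessions,
        "NbReceivers": len(receivers) / sessions,
        "AvMsg": len(window) / sessions,
    }


window = [
    (0, "a", "bob", "alice"),
    (2, "a", "bob", "alice"),
    (3, "b", "bob", "carol"),
    (4, "b", "bob", "carol"),
]
print(call_id_statistics(window))
```
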
Features
Group 3 - Dialogs' Final State Distribution
13   NbNOTACALL    # of NOTACALL / Total # of Call-ID
14   NbCALLSET     # of CALLSET / Total # of Call-ID
15   NbCANCELED    # of CANCELED / Total # of Call-ID
16   NbREJECTED    # of REJECTED / Total # of Call-ID
17   NbINCALL      # of INCALL / Total # of Call-ID
18   NbCOMPLETED   # of COMPLETE / Total # of Call-ID
19   NbRESIDUE     # of RESIDUE / Total # of Call-ID

Features
Group 4 - Request Distribution
20   NbInv   # of INVITE / Total # of requests
21   NbReg   # of REGISTER / Total # of requests
22   NbBye   # of BYE / Total # of requests
23   NbAck   # of ACK / Total # of requests
24   NbCan   # of CANCEL / Total # of requests
25   NbOpt   # of OPTIONS / Total # of requests
26   NbRef   # of REFER / Total # of requests
27   NbSub   # of SUBSCRIBE / Total # of requests
28   NbNot   # of NOTIFY / Total # of requests
29   NbMes   # of MESSAGE / Total # of requests
30   NbInf   # of INFO / Total # of requests
31   NbPra   # of PRACK / Total # of requests
32   NbUpd   # of UPDATE / Total # of requests

Features
Group 5 - Response Distribution
33   Nb1xx   # of Informational responses / Total # of responses
34   Nb2xx   # of Success responses / Total # of responses
35   Nb3xx   # of Redirection responses / Total # of responses
36   Nb4xx   # of Client error responses / Total # of responses
37   Nb5xx   # of Server error responses / Total # of responses
38   Nb6xx   # of Global error responses / Total # of responses

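The request and response distributions (groups 4 and 5) reduce to counting methods and status-code classes. The sketch below assumes the window has already been parsed into a list of request method names and a list of integer status codes; the function name and that input layout are illustrative, while the feature names follow the tables above.

```python
from collections import Counter

# Feature name for each request method, as listed in group 4.
REQUEST_FEATURES = {
    "INVITE": "NbInv", "REGISTER": "NbReg", "BYE": "NbBye", "ACK": "NbAck",
    "CANCEL": "NbCan", "OPTIONS": "NbOpt", "REFER": "NbRef",
    "SUBSCRIBE": "NbSub", "NOTIFY": "NbNot", "MESSAGE": "NbMes",
    "INFO": "NbInf", "PRACK": "NbPra", "UPDATE": "NbUpd",
}


def distributions(methods, status_codes):
    """Share of each request method and of each response class (1xx to 6xx)."""
    features = {}
    method_counts = Counter(methods)
    for method, name in REQUEST_FEATURES.items():
        features[name] = method_counts[method] / len(methods) if methods else 0.0
    # Integer division by 100 maps e.g. 180 to class 1 and 486 to class 4.
    class_counts = Counter(code // 100 for code in status_codes)
    for k in range(1, 7):
        share = class_counts[k] / len(status_codes) if status_codes else 0.0
        features["Nb%dxx" % k] = share
    return features


print(distributions(["INVITE", "ACK", "BYE", "INVITE"], [100, 180, 200, 486]))
```
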
Phreaking by social engineering scheme

[Figure: Trudy, Bob, the IP network, the PSTN network and a SIP/PSTN gateway. Bob has a contract to make phone calls towards the PSTN.]

Trudy: "I am a technician doing a test, please transfer me to that operator by dialing 9 0 # and hang up"

Machine Learning
• Pros
– Better accuracy, small false alarm rate
– Compact representation
– Detecting Novelty
• Cons
– Embedding of network data in metric spaces
– Difficulty of getting labels
– Vulnerable to malicious noise
– Huge data volumes

Traces

• Call setup is a small fraction of the signaling traffic
• Some empty messages are used as ping or keep-alive for device management
• Some messages throw parsing exceptions

Traces

• OPTIONS and REGISTER messages are the most numerous
• MESSAGE, PRACK and UPDATE are absent
• The number of NOTIFY messages is constant over time (messages automatically generated at a fixed rate)
• #INVITE/#BYE = 2.15 (not every INVITE results in a BYE, e.g. callee is busy, retransmission, re-INVITE)
• #INVITE/#ACK = 0.92 (some INVITEs are acknowledged twice)

Traces

• The most numerous is the 2xx family (in response to REGISTER and OPTIONS messages)
• #INVITE/#1xx = 0.59 (probably a 100 Trying and a 180 Ringing for each INVITE)

Traces

• Average inter-request = average inter-response = 20 ms
• Average inter-request with SDP bodies is inversely proportional to the number of INVITE, BYE, ACK and 1xx messages (which are only used in call setup)
• Average inter-request carrying SDP reaches 3 s in quiet hours and 0.5 s in rush hours, which reveals high call-setup traffic

LibSVM
