It C Synopsis

This document presents a synopsis of internet traffic classification. It discusses challenges in traditional traffic classification techniques which typically achieve only 50-70% accuracy. The objectives are to develop a more accurate lightweight model for real-time classification using deep learning techniques. The methodology will involve data pre-processing steps like normalization, missing value imputation and feature selection, followed by classification using algorithms like Naive Bayes, Support Vector Machine (SVM) and Convolutional Neural Network (CNN). Literature on existing approaches is surveyed and challenges of traditional methods like PCA, Naive Bayes and SVM are outlined to motivate the proposed solution.

Uploaded by

UJJWAL KUMAR

JSS MAHAVIDYAPEETHA

JSS SCIENCE AND TECHNOLOGY UNIVERSITY


JSS Technical Institution Campus Mysore -- 570006

Synopsis Of
Internet Traffic Classification

Presented by :

Aditya Srivastava - 01JST16IS003


Ujjwal Kumar - 01JST16IS047
Ranjita Laxman Huded - 01JST17IS410

Under the guidance of


Prof. Manju N.
DEPARTMENT OF INFORMATION SCIENCE AND ENGINEERING
2018-2019

Contents

1 Introduction

2 Literature Survey

3 Challenges

4 Motivation

5 Problem Definition

6 Objective

7 Methodology

7.1 Data Pre-processing

7.1.1 Normalization

7.1.2 Filling missing values

7.1.3 Feature Selection

7.2 Classification Algorithm

7.2.1 Naive Bayesian Classifier

7.2.2 Support Vector Machine (SVM)

7.2.3 ConvNet (CNN)

8 References
1 Introduction
Accurate network traffic classification is fundamental to numerous network activities, from
security monitoring to accounting, and from Quality of Service to providing operators with useful
forecasts for long-term provisioning. Yet classification schemes are difficult to operate correctly
because the knowledge commonly available to the network, i.e. packet headers, often does not
contain sufficient information to allow for an accurate methodology. As a result, traditional
techniques for traffic classification are often no more accurate than 50-70% [1].

First, feature selection is one of the most critical steps in Internet traffic identification.
Different categories of applications can be distinguished because their behaviors differ in many
ways, but researchers hold different opinions on the importance of individual traffic-flow
features. Secondly, although a few features have been chosen to classify Internet traffic, not
every feature has the same importance. Therefore, in order to improve the recognition rate, each
selected feature can be given a weight value representing its importance. Thirdly, previous works
focused only on identifying TCP flows, so traffic flows using the UDP protocol could not be
identified. Therefore a Support Vector Machine (SVM) is proposed as an approach to identify both
TCP and UDP traffic flows. [2]

Accurate identification of network traffic is an essential step to improving a multitude of network
services: accounting, security monitoring, traffic forecasting and Quality of Service. However,
high classification accuracies often necessitate the collection of large amounts of either data or
metadata. Accurate traffic classification can quickly identify malicious flows and contribute to the
control of network attacks. The principal objective of this paper is the development of a
lightweight model with high classification accuracy capable of real-time operation. We also
investigate the spatial and temporal stability of this model and compare it to other established
classification schemes. Our model is built using a perceptron network, often referred to as a
neural network, implemented on the freely available TensorFlow software. TensorFlow uses
optimized kernels to prune the computation graph prior to running the neural network, vastly
improving computational time. A key advantage the neural network offers over traditional
machine learning schemes is the introduction of non-linearity, which can better capture
complexities within class structures. [3]

In this paper, we describe some new ideas for traffic identification and try to solve the
problems of the traditional procedure. First, we review some classic methods. Then we introduce
the framework of deep learning and focus on specific applications: automatic feature learning,
protocol classification, anomalous protocol detection, and unknown protocol identification. The
last parts are the conclusion and future work. [5]

2 Literature Survey
Zuev and Moore [1] applied a probabilistic, Bayes-based method to traffic classification.

Jun Tan, Xingshu Chen and Min Du [2] applied Particle Swarm Optimization-Support Vector
Machines (PSO-SVM) for classification, with PSO used as the optimization technique. For feature
selection they used a genetic algorithm, which reduced the number of attributes from 248 to 46.

Ang Kun Joo Michael, Emma Valla, Natinael Solomon Neggatu and Andrew W. Moore [3]
demonstrate the construction of a lightweight neural network capable of real-time network
traffic classification. In the process, they provide greater insight into the methodologies
used by different classification schemes. They discuss potential procedures for both data
processing and optimization which are generalizable to other supervised machine learning
methods. They also outline a fast method of identifying key attributes in the neural network
based on the connection weights, and show that there are fundamental differences between
the perceptron network and the Naive Bayes methods, which manifest in the key attributes
identified by both systems. The neural network achieved only 94%-97% accuracy when running on
the attributes identified by Symmetric Uncertainty rankings (Table 1). This increased to 99.0%
when using the attributes identified by their weights.

Table 1: Reduced list of attributes as obtained by Symmetric Uncertainty

Jun Hua SHU, Jiang JIANG and Jing Xuan SUN [4] compare the two feature extraction
techniques of Naive Bayesian and Decision Trees. To overcome the disadvantages of traditional
multilayer networks, they propose a new activation function (ReLU), a new weight
initialization method, a new loss function and methods to prevent overfitting (dropout,
regularization, etc.).

Zhanyi Wang [5] used a Stacked Auto-Encoder (SAE) and an Artificial Neural Network (ANN) for
feature extraction. The most notable difference between them is whether training is supervised:
the ANN model requires labels, while the SAE does not. Feature extraction by an ANN or SAE
model has two advantages. One is that it reduces manual workload: once the inputs of the model
and the stopping criterion of the iteration are determined, the model trains automatically. The
other is that when the training process is finished, the goal of dimensionality reduction is also
achieved: features are mapped to a new space and redundant information is filtered out.

3 Challenges

Principal Component Analysis


● PCA assumes that the principal components are a linear combination of the original
features. If this is not true, PCA will not give sensible results.
● PCA uses variance as the measure of how important a particular dimension is. High-variance
axes are therefore treated as principal components, while low-variance axes are treated
as noise.
● PCA assumes that the principal components are orthogonal.
● The genetic algorithm used for feature selection in [2] reduced the dimensionality only from
248 to 46, which is not very efficient.

Naive Bayes algorithm


● If a categorical variable has a category in the test data set that was not observed in the
training data set, the model assigns it a probability of 0 (zero) and is unable to make
a prediction. This is often known as “Zero Frequency”. To solve this, we can use a
smoothing technique; one of the simplest is Laplace estimation.
● Naive Bayes is also known to be a poor probability estimator, so the probability
outputs from predict_proba should not be taken too seriously.
● Another limitation of Naive Bayes is the assumption of independent predictors. In real
life, it is almost impossible to get a set of predictors that are completely
independent.
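
The zero-frequency problem and its Laplace correction can be illustrated with a minimal sketch. The function name `smoothed_probability`, the protocol values and the `alpha` parameter are assumptions made for illustration; they are not taken from the dataset used in this synopsis.

```python
from collections import Counter

def smoothed_probability(value, observed, categories, alpha=1.0):
    """Estimate p(value | class) with add-alpha (Laplace) smoothing.

    alpha=0 gives the plain relative frequency; alpha=1 is Laplace estimation.
    """
    counts = Counter(observed)
    return (counts[value] + alpha) / (len(observed) + alpha * len(categories))

# Hypothetical feature values seen for one class in the training set.
train_values = ["tcp", "tcp", "udp", "tcp"]
all_categories = ["tcp", "udp", "icmp"]

# "icmp" never occurs in training, so the unsmoothed estimate is zero.
unsmoothed = smoothed_probability("icmp", train_values, all_categories, alpha=0.0)
# With Laplace estimation the unseen category gets a small non-zero probability.
smoothed = smoothed_probability("icmp", train_values, all_categories, alpha=1.0)
```

With smoothing, the estimates over all categories still sum to one, so the unseen category no longer forces the whole product of probabilities to zero.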

Support Vector Machine


● It is difficult to monitor the network for unhealthy content, and P2P applications consume a
lot of network bandwidth, among other issues.
● SVM does not perform well on large data sets because the required training time is
high.
● It also does not perform well when the data set is noisy, i.e. when the target classes
overlap.
● SVM does not directly provide probability estimates; these are calculated using an
expensive five-fold cross-validation, as in the SVC method of the Python scikit-learn
library.

4 Motivation
● In recent years, with the rapid development of the Internet, network traffic has grown
explosively. While this brings people convenience, it also poses great challenges to
effective network management, security and the network environment, such as virus
flooding.
● SVM does not perform well on large data sets because the required training time is
high. We therefore propose an ensemble model that combines and compares the Naive Bayes,
SVM and ConvNet approaches.
● Traffic classification is also an important solution for the emerging requirement that ISP
networks provide lawful interception (LI) capabilities.
● Some of the most successful deep learning methods involve artificial neural networks,
such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Deep
Belief Networks (DBN) and Stacked Auto-Encoders (SAE). In recent years, deep learning has
become increasingly popular and has been applied to computer vision, speech recognition and
natural language processing. Studies show that deep learning surpasses traditional methods in
most of these areas. For example, the error rate fell from 26% to 15% in the ImageNet
Challenge 2012.

5 Problem Definition
The problem is to classify different applications based on Internet flows of data
packets. The dataset has 248 attributes and 10 classes, which have to be identified based on
Internet flows. A basic requirement of traffic classification is that the flow types are correctly
identified. An application class may contain different kinds of data; for example, the class Mail
includes SMTP and POP3. TCP/IP traffic flows are the fundamental objects for classification;
each is represented as a flow of one or more packets between two hosts of a network using
network communication protocols.

Table 2: The categories of network applications

6 Objective
We intend to identify various Internet flows on the basis of the dataset provided. The
objective is to achieve higher classification accuracy. We intend to reduce the dimensionality
for faster classification and higher accuracy. Training time can be reduced by selecting an
optimal number of nodes in the neural network.

7 Methodology
The proposed methodology uses PCA for feature selection and an ensemble model that
combines and compares the Naive Bayes, SVM and ConvNet approaches for classification.

7.1 Data Pre-processing


There are 3 steps in data pre-processing:

1. Normalization

2. Filling missing values

3. Feature Selection

7.1.1 Normalization
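The main aim of normalizing the data set is to bring all the feature values
onto a similar scale. This makes the task of comparing easier and less time
consuming. We use the standard score method for normalization, given below:

$$z = \frac{x - \mu}{\sigma}$$

where $x$ is a feature value, and $\mu$ and $\sigma$ are the mean and standard deviation of
that feature.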
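
The standard score computation can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `standardize` and the sample values are assumptions, and the population standard deviation is used, which is one of two common conventions.

```python
from statistics import mean, pstdev

def standardize(column):
    """Return the standard scores (x - mean) / std for one feature column."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (assumed convention)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Hypothetical values of one flow attribute, e.g. packet sizes in bytes.
packet_sizes = [40.0, 60.0, 1500.0, 400.0]
z_scores = standardize(packet_sizes)
```

After standardization every attribute has zero mean and unit standard deviation, so attributes measured in bytes and attributes measured in seconds become directly comparable.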

7.1.2 Filling Missing Values


We replace the missing values of each attribute with the computed average value of that
attribute.
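
A minimal sketch of this mean imputation is shown below. It assumes that missing entries are represented as `None`; the function name `impute_with_mean` and the sample column are illustrative only.

```python
from statistics import mean

def impute_with_mean(column):
    """Replace missing entries (None) with the mean of the observed entries."""
    present = [x for x in column if x is not None]
    if not present:
        raise ValueError("column has no observed values to average")
    fill = mean(present)
    return [fill if x is None else x for x in column]

# Hypothetical attribute column with two missing measurements.
durations = [2.0, None, 4.0, None, 6.0]
filled = impute_with_mean(durations)
```

The mean is computed per attribute from the observed values only, so one attribute's gaps never influence another attribute.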

7.1.3 Feature Selection

Table 3: Comparison of approaches for feature selection

Principal Component Analysis (PCA) is proposed for feature selection in the dataset, as it gives
the highest accuracy (Tables 4 and 5) and the lowest training and testing time (Table 3) among
the feature selection techniques compared. PCA is a technique used to emphasize variation and
bring out strong patterns in a dataset. It is often used to make data easy to explore and
visualize. [6]
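
To make the idea concrete, the sketch below finds the first principal component of a small data set using power iteration on the covariance matrix. It is an illustration under stated assumptions, not the method used in the project: the helper names, the two-attribute sample data and the choice of power iteration (instead of a full eigendecomposition from a numerical library) are all hypothetical.

```python
import math

def center(rows):
    """Subtract the per-attribute mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(rows, iterations=200):
    """Approximate the direction of largest variance by power iteration."""
    cov = covariance(center(rows))
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def project(rows, direction):
    """Reduce each centered row to a single score along the given direction."""
    return [sum(a * b for a, b in zip(r, direction)) for r in center(rows)]

# Hypothetical flows with two strongly correlated attributes.
flows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
pc1 = first_principal_component(flows)
scores = project(flows, pc1)
```

Here the two attributes are reduced to one score per flow while most of the variance is retained, which is the sense in which PCA lowers the dimensionality before classification.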

7.2 Classification Algorithm
An ensemble model is proposed that combines and compares the Naive Bayes, SVM and CNN
classifiers. Our methodology focuses on combining the SVM and Naive Bayes classification
algorithms [7] and comparing them with a deep learning approach (CNN) to obtain higher
accuracy.
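
The synopsis does not fix how the individual classifiers are combined. One simple possibility, shown here purely as an assumption, is majority voting over the labels predicted by each model. The function name `majority_vote`, the class labels and the hard-coded predictions are hypothetical stand-ins for real model outputs.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample."""
    combined = []
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        # On a tie, most_common keeps first-seen order, so earlier models win.
        combined.append(counts.most_common(1)[0][0])
    return combined

# Hypothetical predictions of three classifiers for the same three flows.
nb_pred = ["WWW", "MAIL", "P2P"]
svm_pred = ["WWW", "P2P", "P2P"]
cnn_pred = ["MAIL", "MAIL", "P2P"]
ensemble_pred = majority_vote([nb_pred, svm_pred, cnn_pred])
```

Other combination rules, such as averaging class probabilities, would fit the same interface; voting is used here only because it needs nothing beyond the predicted labels.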

7.2.1 Naive Bayesian Classifier


A Naive Bayes classier is a simple probabilistic classier based on Bayes’ theorem
and is particularly suited when the dimensionality of the inputs are high. It’s underlying
probability model can be described as an ”independent feature model”. The Naive Bayes (NB)
classier uses the Bayes’ rule equations -

Where, ​p(d)​ plays no role in selecting ​C∗​. To estimate the term ​p(d|c)​, Naive Bayes
decomposes it by assuming the ​f​i​’s​ are conditionally independent given ​d’s​ class as in equation
-

Where, ​m​ is the no of features and ​fi​ ​ is the feature vector. Consider a training method consisting
of a relative-frequency estimation ​p(c)​ and ​p(f​i​|c).
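
The decision rule can be sketched for categorical features as follows. This is a minimal illustration assuming plain relative-frequency estimates without smoothing; the function names, feature values and class labels are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(samples, labels):
    """Count classes and per-class feature values (relative-frequency training)."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts

def predict(features, class_counts, feature_counts):
    """Return the class maximizing log p(c) + sum_i log p(f_i | c).

    p(d) is the same for every class, so it is left out of the comparison.
    Returns None when every class has zero probability (zero frequency).
    """
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log p(c)
        for i, value in enumerate(features):
            p = feature_counts[(c, i)][value] / n_c  # estimate of p(f_i | c)
            score += math.log(p) if p > 0 else -math.inf
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical training flows described by (protocol, duration bucket).
samples = [("tcp", "short"), ("tcp", "short"), ("tcp", "long"), ("udp", "short")]
labels = ["MAIL", "MAIL", "WWW", "DNS"]
class_counts, feature_counts = train(samples, labels)
prediction = predict(("tcp", "short"), class_counts, feature_counts)
```

Working with logarithms turns the product over features into a sum, which avoids numerical underflow when the number of features is large, as it is for a dataset with 248 attributes.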

7.2.2 Support Vector Machine (SVM)


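The purpose of SVM classification is to find the optimal separating hyperplane by maximizing the
margin between the separating hyperplane and the data. Support vector machines were
introduced in Boser et al. (1992) and basically attempt to find the best possible surface to
separate positive and negative training samples. Support Vector Machines (SVMs) are
supervised learning methods used for classification. [7]
Given training vectors $x_i \in \mathbb{R}^n$, $i = 1, \dots, l$, in two classes, and an indicator
vector $y \in \mathbb{R}^l$ such that $y_i \in \{1, -1\}$, C-SVC solves the following primal
optimization problem (Chang et al., 2011) [7]:

$$\min_{w, b, \xi} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{l} \xi_i \quad \text{subject to} \quad y_i \left( w^{T} \phi(x_i) + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, \dots, l$$

where $\phi(x_i)$ maps $x_i$ into a higher-dimensional space and $C > 0$ is the regularization
parameter.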
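
As an illustration of the C-SVC primal problem, the sketch below evaluates its objective for a given hyperplane. It assumes a linear kernel, so each training vector is used directly without a feature map, and it sets every slack variable to its smallest feasible value. The function name `primal_objective` and the sample points are hypothetical.

```python
def primal_objective(w, b, xs, ys, C):
    """Value of 0.5 * w.w + C * sum(slack_i) for a linear decision function.

    Each slack is max(0, 1 - y_i * (w.x_i + b)), the smallest value that
    satisfies the margin constraint for sample i.
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slack_total = 0.0
    for x, y in zip(xs, ys):
        decision = sum(wi * xi for wi, xi in zip(w, x)) + b
        slack_total += max(0.0, 1.0 - y * decision)
    return margin_term + C * slack_total

# Hypothetical two-class training set that is linearly separable.
xs = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0]]
ys = [1, 1, -1, -1]
value = primal_objective([0.5, 0.5], 0.0, xs, ys, C=1.0)
```

A solver searches for the `w` and `b` that minimize this value: shrinking `w` widens the margin but eventually makes samples violate it, and `C` sets the price of each violation.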

7.2.3 ConvNet (CNN)
Machine learning algorithms such as Naive Bayesian and decision trees are highly
dependent on the choice of feature attributes. For example, without feature selection the
accuracy of Naive Bayesian classification is only about 65%, whereas the average
classification accuracy after feature selection is about 95%, which shows the importance of
feature selection for machine learning. Deep learning can automatically combine the low-order
features of the input, transform and rearrange them, and obtain high-order features, eliminating
the workload of manually constructing high-order features. [4]
In a broad sense, the deep learning network structure is also a multi-layer neural
network. A multi-layer neural network mainly includes an input layer, hidden layers and an
output layer, as shown in the figure below.

Fig 1 - ConvNet Architecture
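
The input, hidden and output layers described above can be sketched as a forward pass through a small fully connected network. This is not a convolutional network and not the architecture of Fig 1; it only illustrates how layers are chained. The weights, layer sizes, ReLU hidden activation and softmax output are assumptions chosen for illustration.

```python
import math

def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw output scores into class probabilities."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Apply ReLU after every hidden layer and softmax after the output layer."""
    activation = inputs
    for weights, biases in layers[:-1]:
        activation = relu(dense(activation, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activation, weights, biases))

# Hypothetical network: 2 input attributes -> 2 hidden units -> 3 classes.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
]
probabilities = forward([2.0, 1.0], layers)
predicted_class = probabilities.index(max(probabilities))
```

The non-linearity between layers is what lets the network build higher-order features from the raw attributes; without it the stacked layers would collapse into a single linear map.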

8 References

[1] Andrew W. Moore, Denis Zuev. Internet Traffic Classification Using Bayesian Analysis Techniques. 2005.

[2] Jun Tan, Xingshu Chen and Min Du. An Internet Traffic Identification Approach Based on GA and PSO-SVM. January 2010.

[3] Ang Kun Joo Michael, Emma Valla, Natinael Solomon Neggatu, Andrew W. Moore. Network traffic classification via neural networks. September 2017.

[4] Jun Hua SHU, Jiang JIANG, Jing Xuan SUN. Network Traffic Classification Based on Deep Learning. Beijing 102206, China.

[5] Zhanyi Wang. The Applications of Deep Learning on Traffic Identification.

[6] Principal Component Analysis (explained visually). http://setosa.io/ev/principal-component-analysis/

[7] Konstantinas KOROVKINAS, Paulius DANENAS, Gintautas GARSVA. SVM and Naive Bayes Classification Ensemble Method for Sentiment Analysis.
