It C Synopsis
Synopsis Of
Internet Traffic Classification
Presented by :
Contents
1 Introduction
2 Literature Survey
3 Challenges
4 Motivation
5 Problem Definition
6 Objective
7 Methodology
7.1.1 Normalization
8 References
1 Introduction
Accurate network traffic classification is fundamental to numerous network activities, from
security monitoring to accounting, and from Quality of Service to providing operators with useful
forecasts for long-term provisioning. Yet classification schemes are difficult to operate correctly
because the knowledge commonly available to the network, i.e. packet headers, often does not
contain sufficient information to allow for an accurate methodology. As a result, traditional
techniques for traffic classification are often no more accurate than 50-70% [1].
First of all, feature selection is one of the most critical steps in Internet traffic
identification. Different categories of applications can be distinguished because their behaviors
differ in many ways, but researchers hold different opinions on the importance of individual
traffic-flow features. Secondly, although a few features have been chosen to classify Internet
traffic, not every feature has the same importance. Therefore, in order to improve the
recognition rate, each selected feature can be given a weight value representing its
importance. Thirdly, previous works focused only on identifying TCP flows, so traffic flows using
the UDP protocol could not be identified. Therefore a Support Vector Machine (SVM) is proposed as
an approach to identify both TCP and UDP traffic flows. [2]
The work in [5] describes some new ideas for traffic identification that aim to solve the
problems of the traditional procedure. It first reviews some classic methods and then
introduces the framework of deep learning, focusing on specific applications: automatic
feature learning, protocol classification, anomalous protocol detection, and unknown
protocol identification. It closes with conclusions and future work. [5]
2 Literature Survey
Moore and Zuev [1] applied a Bayesian method based on a probabilistic model to traffic
classification.
Jun Tan, Xingshu Chen and Min Du [2] applied Particle Swarm Optimization-Support
Vector Machines (PSO-SVM) for classification, with PSO used as the optimization
technique. For feature selection they used a genetic algorithm, which reduced the attributes from
248 to 46.
Ang Kun Joo Michael, Emma Valla, Natinael Solomon Neggatu and Andrew W. Moore [3]
demonstrate the construction of a lightweight neural network capable of real-time network
traffic classification. In the process, they also provide greater insight into the methodologies
used by different classification schemes. They discuss potential procedures for both data
processing and optimization that are generalizable to other supervised machine learning
methods, and outline a fast method of identifying key attributes in the neural network
based on the connection weights. They show that there are fundamental differences between
the perceptron network and the Naive Bayes methods, which manifest in the key attributes
identified by each system. The neural network achieved only 94%-97% accuracy when running on the
attributes identified by Symmetric Uncertainty rankings (Table 1). This increased to 99.0% when
using the attributes identified by their weights.
Jun Hua SHU, Jiang JIANG and Jing Xuan SUN [4] compare the Naive Bayesian and decision tree
techniques. To overcome the drawbacks of traditional multilayer networks, they use the ReLU
activation function, a new weight initialization method, a new loss function and
anti-overfitting methods (Dropout, regularization, etc.).
Zhanyi Wang [5] used Stacked Auto-Encoder (SAE) and Artificial Neural Network (ANN) models for
feature extraction. The most remarkable difference between them is whether training is
supervised or unsupervised: the ANN model requires labels, while the SAE does not. Feature
extraction by an ANN or SAE model has two advantages. One is that it reduces the manual
workload: once the inputs of the model and the stopping criterion of the iteration are
determined, the model trains automatically. The other is that when the training process is
finished, the goal of dimensionality reduction is also achieved: features are mapped to a new
space, and redundant information is filtered out as well.
3 Challenges
● SVM does not perform well on large data sets because the required training time is high.
● It also does not perform well when the data set is noisy, i.e. when the target classes
overlap.
● SVM does not directly provide probability estimates; these are calculated using an
expensive five-fold cross-validation, as in the SVC class of the Python scikit-learn
library.
4 Motivation
● In recent years, with the rapid development of the Internet, network traffic data has
grown explosively. While this brings people convenience, it also poses great challenges
for effective network management and security, such as virus flooding.
● SVM does not perform well on large data sets because the required training time is
high. We therefore propose an ensemble model that combines and compares the Naive Bayes
algorithm, SVM and ConvNet.
● Traffic classification is also an important solution for the emerging requirement that ISP
networks have to provide lawful interception (LI) capabilities.
● Some of the most successful deep learning methods involve artificial neural networks,
such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Deep
Belief Networks (DBN) and Stacked Auto-Encoders (SAE). In recent years these methods have
become increasingly popular and have been applied to computer vision, speech recognition,
and natural language processing. Studies show that deep learning outperforms traditional
methods in many of these areas; for example, the error rate in the ImageNet Challenge 2012
fell from 26% to 15%.
5 Problem Definition
The problem is to classify different applications based on Internet flows of data
packets. The dataset has 248 attributes and 10 classes, which have to be identified based on
Internet flows. A basic requirement of traffic classification is that the flow types are correctly
identified. An application class may contain different kinds of data; for example, the class Mail
includes SMTP and POP3. TCP/IP traffic flows are the fundamental objects for classification.
Each is represented as a flow of one or more packets between two hosts of a network using
network communication protocols.
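To make the notion of a flow concrete, the following minimal sketch groups packets into bidirectional flows. It assumes a flow is keyed by the protocol plus the two (address, port) endpoints, which is a common convention but is not stated in this synopsis. The packet dictionaries, field names and function names are hypothetical and are not the dataset's format.

```python
from collections import defaultdict


def flow_key(packet):
    """Return a direction-independent key for the flow a packet belongs to."""
    a = (packet["src_ip"], packet["src_port"])
    b = (packet["dst_ip"], packet["dst_port"])
    # Sort the two endpoints so that both directions map to the same flow.
    first, second = sorted([a, b])
    return (packet["protocol"], first, second)


def group_into_flows(packets):
    """Group packets (hypothetical dicts) by their flow key."""
    flows = defaultdict(list)
    for packet in packets:
        flows[flow_key(packet)].append(packet)
    return dict(flows)


# Made-up packets: two directions of one TCP flow and one UDP flow.
packets = [
    {"src_ip": "10.0.0.1", "src_port": 40000,
     "dst_ip": "10.0.0.2", "dst_port": 25, "protocol": "TCP"},
    {"src_ip": "10.0.0.2", "src_port": 25,
     "dst_ip": "10.0.0.1", "dst_port": 40000, "protocol": "TCP"},
    {"src_ip": "10.0.0.3", "src_port": 50000,
     "dst_ip": "10.0.0.2", "dst_port": 53, "protocol": "UDP"},
]
flows = group_into_flows(packets)
```

Each resulting flow would then be summarized into a fixed-length attribute vector before classification.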
TABLE 2: THE CATEGORIES OF NETWORK APPLICATIONS
6 Objective
We intend to identify various Internet flows on the basis of the dataset provided. The
objective is to achieve higher classification accuracy. We intend to reduce the dimensionality
for faster classification and higher accuracy. Training time can be reduced by selecting an
optimal number of nodes in the neural network.
7 Methodology
The proposed methodology uses PCA for feature selection and, for classification, an ensemble
model that combines and compares the Naive Bayes algorithm, SVM and ConvNet.
1. Normalization
3. Feature Selection
7.1.1 Normalization
The main aim of normalizing the data set is to bring all the feature values
onto a similar scale. This makes comparison easier and less time
consuming. We use the standard score method for normalization:

$$z = \frac{x - \mu}{\sigma}$$

where x is a feature value and μ and σ are the mean and standard deviation of that feature.
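As an illustration, a minimal sketch of standard score normalization applied column by column is given here, using only the Python standard library. The sample rows and function names are made up. Using the population standard deviation and mapping a constant column to zeros are assumptions, not choices stated in this synopsis.

```python
from statistics import mean, pstdev


def standard_score(column):
    """Scale one feature column to zero mean and unit standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (assumption)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def normalize(rows):
    """Apply the standard score to every feature (column) of a row-major table."""
    columns = list(zip(*rows))
    scaled = [standard_score(col) for col in columns]
    return [list(row) for row in zip(*scaled)]


# Made-up flows with two features on very different scales.
rows = [[2.0, 100.0], [4.0, 300.0], [6.0, 500.0]]
normalized = normalize(rows)
```

After this step both features have the same scale, so neither dominates distance-based or variance-based computations.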
Principal Component Analysis (PCA) is proposed for feature selection in the dataset as it gives
the highest accuracy (Tables 4 and 5) and the lowest training and testing time (Table 3) among the
feature selection techniques compared. PCA is a technique used to emphasize variation and bring
out strong patterns in a dataset. It is often used to make data easy to explore and visualize. [6]
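As a rough illustration of the idea, the sketch below centers a small, made-up data set, builds its covariance matrix, and finds the first principal component by power iteration using only the Python standard library. The data, the function names and the choice of power iteration are assumptions made for illustration. This is not the procedure behind Tables 3 to 5, and in practice a library implementation would normally be used.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Project each centered row onto the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]


# Made-up data lying roughly along the line y = 2x.
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
centered = center(rows)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)  # one value per row instead of two
```

The direction `pc1` captures most of the variation in the two original features, so the one-dimensional `scores` can stand in for them with little loss.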
7.2 Classification Algorithm
An ensemble model is proposed that combines and compares Naive Bayes classifiers, SVM and
CNN. The proposed methodology focuses on combining the SVM and Naive Bayes classification
algorithms [7] and comparing them with a deep learning approach (CNN) to obtain higher
accuracy.
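One simple way to combine two classifiers is to average their per-class probability estimates and pick the class with the highest average. The sketch below shows only that idea. The combination rule, the equal weighting, the class names and the probability values are assumptions for illustration, and the rule actually used in [7] may differ.

```python
def average_probabilities(prob_a, prob_b, weight_a=0.5):
    """Weighted average of two per-class probability dictionaries."""
    classes = set(prob_a) | set(prob_b)
    return {c: weight_a * prob_a.get(c, 0.0)
               + (1.0 - weight_a) * prob_b.get(c, 0.0)
            for c in classes}


def ensemble_predict(prob_a, prob_b, weight_a=0.5):
    """Return the class with the highest averaged probability."""
    combined = average_probabilities(prob_a, prob_b, weight_a)
    return max(combined, key=combined.get)


# Hypothetical outputs of an SVM and a Naive Bayes classifier for one flow.
svm_probs = {"WWW": 0.55, "MAIL": 0.40, "P2P": 0.05}
nb_probs = {"WWW": 0.30, "MAIL": 0.65, "P2P": 0.05}
label = ensemble_predict(svm_probs, nb_probs)
```

Here the two classifiers disagree, and the averaged estimate favors the class on which Naive Bayes is more confident.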
For an instance d, Naive Bayes assigns the class c* = arg max_c p(c|d). By Bayes' rule,

$$p(c \mid d) = \frac{p(c)\, p(d \mid c)}{p(d)}$$

where p(d) plays no role in selecting c*. To estimate the term p(d|c), Naive Bayes
decomposes it by assuming the f_i's are conditionally independent given d's class:

$$p(d \mid c) = \prod_{i=1}^{m} p(f_i \mid c)$$

where m is the number of features and f_i is the i-th feature of d. Training consists of
relative-frequency estimation of p(c) and p(f_i|c).
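The training and selection steps can be sketched as follows for categorical features: p(c) and p(f_i|c) are estimated from relative frequencies, and the predicted class maximizes p(c) multiplied by the product of the p(f_i|c). The add-one smoothing, the use of log-probabilities, the toy samples and the class names are assumptions added for illustration and are not taken from this synopsis.

```python
import math
from collections import Counter, defaultdict


def train(samples, labels):
    """Count class and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)              # feature index -> observed values
    for features, c in zip(samples, labels):
        for i, f in enumerate(features):
            feature_counts[(c, i)][f] += 1
            values[i].add(f)
    return class_counts, feature_counts, values


def predict(model, features):
    """Return the class c maximizing log p(c) + sum_i log p(f_i | c)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # relative-frequency estimate of p(c)
        for i, f in enumerate(features):
            count = feature_counts[(c, i)][f]
            # Relative frequency with add-one smoothing (assumption).
            score += math.log((count + 1) / (n_c + len(values[i])))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Made-up flows described by (duration bucket, transport protocol).
samples = [("short", "tcp"), ("short", "tcp"), ("long", "tcp"),
           ("long", "udp"), ("long", "udp")]
labels = ["MAIL", "MAIL", "BULK", "BULK", "BULK"]
model = train(samples, labels)
```

Because p(d) is the same for every class, it is left out of the score entirely.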
7.2.3 ConvNet (CNN)
Machine learning algorithms such as Naive Bayesian and decision trees are highly
dependent on the choice of feature attributes. For example, without feature selection the
accuracy of Naive Bayesian classification is only about 65%, whereas the average
classification accuracy after feature selection is about 95%, which shows the importance of
feature selection for machine learning. Deep learning can automatically combine, transform
and arrange the low-order features of the input to obtain high-order features, eliminating
the workload of constructing high-order features manually. [4]
In a broad sense, the deep learning network structure is also a multi-layer neural
network. A multi-layer neural network mainly includes an input layer, hidden layers and an
output layer, as shown in the figure below.
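The layered structure can be sketched as a forward pass through one hidden layer and one output layer. The layer sizes, the weights and the softmax output are assumptions chosen for illustration, with ReLU as the hidden activation. A real network would learn its weights from the training data and would typically have many more units.

```python
import math


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer (ReLU) -> output layer (softmax)."""
    hidden = relu(dense(inputs, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))


# Hand-picked weights: 3 inputs, 2 hidden units, 2 output classes.
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0], [-1.0, 1.0]]
out_b = [0.0, 0.0]
probs = forward([1.0, 2.0, 3.0], hidden_w, hidden_b, out_w, out_b)
```

Adding more hidden layers between the input and the output is what makes such a network deep.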
8 References
[1] Andrew W. Moore, Denis Zuev. Internet Traffic Classification Using Bayesian Analysis Techniques. 2005.
[2] Jun Tan, Xingshu Chen, Min Du. An Internet Traffic Identification Approach Based on GA and PSO-SVM. January 2010.
[3] Ang Kun Joo Michael, Emma Valla, Natinael Solomon Neggatu, Andrew W. Moore. Network traffic classification via neural networks. September 2017.
[4] Jun Hua SHU, Jiang JIANG, Jing Xuan SUN. Network Traffic Classification Based on Deep Learning.
[5] Zhanyi Wang. The Applications of Deep Learning on Traffic Identification.
[6] Principal Component Analysis (explained visually). http://setosa.io/ev/principal-component-analysis/
[7] Konstantinas KOROVKINAS, Paulius DANENAS, Gintautas GARSVA. SVM and Naive Bayes Classification Ensemble Method for Sentiment Analysis.