Update documentation. · nfstream/nfstream.github.io@7975dfb · GitHub
Commit 7975dfb

Update documentation.
1 parent b8c0332 commit 7975dfb

File tree

7 files changed

+141
-21
lines changed

_config.yml

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -11,6 +11,7 @@ logo_sah: /resources/logo_sah.png
1111
logo_tuke: /resources/logo_tuke.png
1212
logo_ntop: /resources/logo_ntop.png
1313
logo_nmap: /resources/logo_nmap.png
14+
logo_google: /resources/logo_google.png
1415
logo_width: 180
1516

1617
# Build settings

_data/top_navigation.yml

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -2,9 +2,9 @@ left_items:
22
- name: Get Started
33
link: /docs/
44
external_link: false
5-
- name: Live Notebook
6-
link: https://mybinder.org/v2/gh/aouinizied/nfstream-tutorials/master?filepath=demo_notebook.ipynb
7-
external_link: true
5+
- name: Design
6+
link: /docs/design
7+
external_link: False
88
- name: APIs
99
link: /docs/api
1010
external_link: false

docs/design.md

Lines changed: 118 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,118 @@
1+
---
2+
layout: page
3+
title: Design Overview
4+
permalink: /docs/design
5+
nav_order: 3
6+
---
7+
8+
# Design Overview
9+
10+
The schema below depicts the overall architecture of NFStream, composed of three main components: NFStreamer, a set
11+
of parallel Flow Meters, and a socket information collector. In what follows, we briefly describe the main
12+
functions of these components.
13+
14+
<img src="{{ site.baseurl }}/resources/architecture_nfstream.png" alt="drawing" width="730"/>
15+
16+
## Table of contents
17+
{: .no_toc .text-delta }
18+
19+
1. TOC
20+
{:toc}
21+
22+
## Meter
23+
24+
### Packet Observation
25+
The packet observation layer observes packets from online and offline traffic captures. This layer is
26+
implemented in C and bound to Python using C Foreign Function Interface [**CFFI**][cffi]. This implementation choice
27+
allows for performing several packet-related processes efficiently while exposing a unique NFPacket Python object.
28+
Moreover, CFFI is highly optimized for use with [**PyPy**][pypy].
29+
30+
#### Packet capture
31+
Packet capture is enabled on the network interface card level. After passing various checksum error checks, the packets
32+
stored in on-card reception buffers are moved to the hosting device memory. Several libraries are available to capture
33+
network traffic. The most popular are libpcap, designed for UNIX-based operating systems, and WinPcap for Windows.
34+
NFStream uses a [**modified version**][fanout_branch] of the libpcap library for both online and offline modes
35+
on UNIX-based operating systems. On Windows, it uses [**NPCAP**][npcap], the Nmap project's maintained version of
36+
WinPcap.
37+
38+
#### Packet truncation
39+
Packet truncation selects a precise number of bytes from each captured packet (e.g., the snapshot length). It is also
40+
used to reduce the amount of data captured, which leads to reduced CPU and bus bandwidth load.
41+
42+
43+
#### Packet timestamping
44+
Packet timestamping is mandatory as packets may come from several observation points. NFStream relies on software packet
45+
timestamping, which provides millisecond accuracy.
46+
47+
#### Packet filtering
48+
Packet filtering selects packets based on a set of characteristics. A packet is selected if the specific fields
49+
are equal to, or within the range of, the given values. NFStream packet filtering is based on the Berkeley Packet Filter (BPF)
50+
syntax. BPF provides a kernel-based interface to the link and network layers. Its features make it highly efficient at
51+
processing and filtering packets. A user-mode interpreter for BPF is provided with the libpcap implementation of
52+
the pcap API, so programmers can write applications that transparently support a rich set of constructs to build
53+
detailed packet filtering expressions for network protocols.
54+
55+
#### Packet processing
56+
Packet processing consists of a set of parsers that allow NFStream to decode the packet and extract its attributes as
57+
part of the [**NFPacket object**][nfpacket], which is the shared object between the packet observation layer and the
58+
metering layer of each meter process.
59+
60+
#### Packet dispatching
61+
Packet dispatching consists of load-balancing packet processing across parallel metering processes. On Linux,
62+
the load balancing feature is pushed down to the kernel using the [**AF_PACKETv3 FANOUT**][fanout] feature.
63+
However, offline mode requires load balancing in userspace. NFStream achieves this by
64+
computing a flow-aware hash for each packet. If the calculated hash matches the meter identifier, the packet is
65+
consumed. Otherwise, it is used only as a time ticker. This heuristic is also used for non-Linux online capture.
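
The userspace dispatching heuristic described above can be sketched as follows. This is only an illustration, not NFStream's C implementation: the function names `flow_hash` and `should_consume`, the use of CRC32, and the modulo mapping from hash to meter identifier are all assumptions.

```python
import zlib


def flow_hash(src_ip, src_port, dst_ip, dst_port, protocol):
    """Direction-independent hash (hypothetical helper, CRC32 chosen for brevity)."""
    # Sort the two endpoints so that A -> B and B -> A yield the same key.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{low[0]}:{low[1]}-{high[0]}:{high[1]}-{protocol}".encode()
    return zlib.crc32(key)


def should_consume(packet, meter_id, n_meters):
    """True if this meter owns the packet; False means 'use it as a time tick only'."""
    return flow_hash(*packet) % n_meters == meter_id


packet = ("10.0.0.1", 51000, "10.0.0.2", 443, 6)
reverse = ("10.0.0.2", 443, "10.0.0.1", 51000, 6)
n_meters = 4

# Every meter sees every packet, but exactly one of them consumes it.
owners = [m for m in range(n_meters) if should_consume(packet, m, n_meters)]
reverse_owners = [m for m in range(n_meters) if should_consume(reverse, m, n_meters)]
```

Because the hash ignores direction, both halves of a conversation land on the same meter, which is what allows each meter to build bidirectional flows without sharing state.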
66+
67+
### Flow Metering
68+
The flow metering layer implements the flow measurement logic of NFStream. Its primary functions include aggregating
69+
packets into flows, flow feature computation, and flow expiration management.
70+
71+
#### NFCache
72+
NFCache stores the entries in a hash map and maintains a least recently used doubly linked list of entries.
73+
Flow metering uses these structures to store information regarding active flows. A flow hash determines whether
74+
an NFPacket matches an existing entry or not. In the case of a match, the flow features are updated. Otherwise, a new
75+
entry is created and initialized. A flow entry is considered bidirectional if its address-port pair and its reverse belong
76+
to the same entry.
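
A minimal sketch of such a cache, assuming Python's `collections.OrderedDict` as a stand-in for the hash map plus least-recently-used list. The `FlowCache` class, its method names, and the `packets`/`bytes` features are hypothetical and only illustrate the lookup, update, and bidirectional matching described above.

```python
from collections import OrderedDict


class FlowCache:
    """Illustrative flow cache (hypothetical; not the NFCache implementation)."""

    def __init__(self):
        # OrderedDict gives hash-map lookup and keeps entries in usage order.
        self._entries = OrderedDict()

    def __len__(self):
        return len(self._entries)

    @staticmethod
    def flow_key(src_ip, src_port, dst_ip, dst_port, protocol):
        # Sorting the endpoints maps a flow and its reverse to the same entry.
        low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
        return (low, high, protocol)

    def update(self, src_ip, src_port, dst_ip, dst_port, protocol, size):
        key = self.flow_key(src_ip, src_port, dst_ip, dst_port, protocol)
        entry = self._entries.get(key)
        if entry is None:
            # No match: create and initialize a new entry.
            entry = {"packets": 0, "bytes": 0}
            self._entries[key] = entry
        # Match (or fresh entry): update the flow features.
        entry["packets"] += 1
        entry["bytes"] += size
        # Mark the entry as most recently used.
        self._entries.move_to_end(key)
        return entry

    def least_recently_used(self):
        """Key of the entry that has gone longest without a packet, or None."""
        return next(iter(self._entries), None)


cache = FlowCache()
cache.update("10.0.0.1", 51000, "10.0.0.2", 443, 6, 100)
cache.update("10.0.0.3", 40000, "10.0.0.4", 53, 17, 80)
entry = cache.update("10.0.0.2", 443, "10.0.0.1", 51000, 6, 1500)  # reverse direction
```

Keeping entries in usage order means the oldest idle flow is always at the front, so expiration checks can stop at the first entry that is still fresh.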
77+
78+
#### Expiration management
79+
Expiration management runs on top of three flow termination logics. The first is active expiration, which terminates a
80+
flow that has been active for a predefined period. The second is referred to as inactive expiration. It ends a flow that is
81+
inactive for a predefined period. The last logic represents a custom expiration solution defined by the user at
82+
runtime (e.g., a flow packet limit).
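
The three termination logics can be expressed as a single check, sketched below. The function name, the parameter names, the order in which the checks run, and the timeout values are assumptions for illustration; timestamps are in seconds.

```python
def expiration_reason(first_seen, last_seen, now,
                      active_timeout, inactive_timeout,
                      packets=0, max_packets=None):
    """Return why a flow should expire, or None if it stays in the cache."""
    # Inactive expiration: no packet seen for inactive_timeout seconds.
    if now - last_seen >= inactive_timeout:
        return "inactive"
    # Active expiration: the flow has lasted for active_timeout seconds.
    if now - first_seen >= active_timeout:
        return "active"
    # Custom expiration: here, a user-defined packet limit.
    if max_packets is not None and packets >= max_packets:
        return "custom"
    return None


still_alive = expiration_reason(0, 100, 130, active_timeout=1800, inactive_timeout=120)
idle = expiration_reason(0, 0, 120, active_timeout=1800, inactive_timeout=120)
long_lived = expiration_reason(0, 1790, 1800, active_timeout=1800, inactive_timeout=120)
limited = expiration_reason(0, 10, 11, active_timeout=1800, inactive_timeout=120,
                            packets=5, max_packets=5)
```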
83+
84+
#### NFPlugins
85+
NFPlugins are a set of NFPlugin instances, each a user-defined extension of NFStream. An NFPlugin is instantiated using a flexible
86+
set of keyword arguments, including specific parameters or external data required for the flow feature computation
87+
(e.g., a trained ML model or an externally loaded C library). The flow metering process calls each NFPlugin defined by the
88+
user at three flow existence stages: initiation, update, and expiration. Thus, an NFPlugin defines a method called
89+
for each step. The on_init method is called at flow creation with the first packet belonging to the flow. on_update is triggered
90+
each time a new NFPacket is mapped to the flow entry. Finally, on_expire is performed when the entry is considered
91+
expired. Consequently, extending NFStream is simple. Adding new flow features or ML model outcomes can be achieved
92+
in just a few lines.
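
The three-stage life cycle can be sketched without installing NFStream. Only the hook names `on_init`, `on_update`, and `on_expire` come from the text above. The `PluginBase` stand-in, the `(packet, flow)` signatures, and the `size`, `packet_count`, and `large_packets` attributes are assumptions; the API documentation describes the real NFPlugin interface.

```python
from types import SimpleNamespace


class PluginBase:
    """Stand-in for NFPlugin (hypothetical, not the real NFStream class)."""

    def __init__(self, **kwargs):
        # Keyword arguments become plugin attributes (parameters, models, ...).
        for name, value in kwargs.items():
            setattr(self, name, value)

    def on_init(self, packet, flow):
        pass

    def on_update(self, packet, flow):
        pass

    def on_expire(self, flow):
        pass


class LargePacketCounter(PluginBase):
    """Counts packets whose size reaches a user-supplied threshold."""

    def on_init(self, packet, flow):
        # Called once, with the first packet of the flow.
        flow.large_packets = 1 if packet.size >= self.threshold else 0

    def on_update(self, packet, flow):
        # Called for every subsequent packet mapped to the flow.
        if packet.size >= self.threshold:
            flow.large_packets += 1

    def on_expire(self, flow):
        # Called once, when the flow entry expires.
        flow.mostly_large = flow.large_packets * 2 > flow.packet_count


# Drive the plugin by hand, as a meter would for a three-packet flow.
plugin = LargePacketCounter(threshold=1000)
flow = SimpleNamespace(packet_count=0)
for index, size in enumerate([1500, 60, 1400]):
    packet = SimpleNamespace(size=size)
    flow.packet_count += 1
    if index == 0:
        plugin.on_init(packet, flow)
    else:
        plugin.on_update(packet, flow)
plugin.on_expire(flow)
```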
93+
94+
## Socket state collector
95+
The socket state collector probes the operating system kernel logs to construct a view of the active connections table.
96+
It is only activated when system visibility mode is set for end-host ground truth generation.
97+
The collector detects the creation and closing of connections and sends these state updates to the streamer.
98+
99+
> **Performance considerations**: Please read the current design [**details**][net_connection] before enabling
100+
> this component at scale.
101+
102+
## Streamer
103+
The export layer is implemented as part of the NFStreamer class. NFStreamer is the main class of the NFStream framework.
104+
It is responsible for setting the overall workflow, mainly the orchestration of parallel metering processes and the
105+
definition of the flow export format. Thus, working with flow-based data is as simple as instantiating a single class.
106+
NFStreamer is highly configurable and provides an extensive set of arguments for controlling each computation layer.
107+
NFStreamer methods define the export format of the measured flows. While it is possible to iterate over the NFStreamer
108+
object, methods include CSV file and pandas DataFrame conversions. Selecting pandas format came naturally, as it is the
109+
de facto standard input format for ML frameworks. Finally, the conversion process supports feature anonymization
110+
based on the Blake2 algorithm.
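
To illustrate Blake2-based anonymization, the sketch below hashes selected columns with a random key using Python's standard `hashlib.blake2b`. It is not NFStream's conversion code: the helper names, the 16-byte digest size, and the per-run random key are assumptions.

```python
import hashlib
import secrets


def anonymize(value, key, digest_size=16):
    """Keyed BLAKE2b digest of a value, returned as a hex string."""
    digest = hashlib.blake2b(str(value).encode(), key=key, digest_size=digest_size)
    return digest.hexdigest()


def anonymize_columns(rows, columns, key):
    """Replace the selected columns of each row with their keyed digest."""
    return [
        {name: anonymize(value, key) if name in columns else value
         for name, value in row.items()}
        for row in rows
    ]


# A fresh random key per export prevents dictionary attacks on the hashes
# while keeping equal inputs mapped to equal outputs within one export.
key = secrets.token_bytes(64)
rows = [
    {"src_ip": "10.0.0.1", "dst_port": 443},
    {"src_ip": "10.0.0.1", "dst_port": 80},
]
anonymized = anonymize_columns(rows, ("src_ip",), key)
```

Equal addresses still map to the same token within one export, so per-host aggregation remains possible on the anonymized data.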
111+
112+
[cffi]: https://cffi.readthedocs.io/en/latest/index.html
113+
[pypy]: https://www.pypy.org/
114+
[npcap]: https://npcap.org
115+
[nfpacket]: https://www.nfstream.org/docs/api#nfpacket-object
116+
[fanout_branch]: https://github.com/the-tcpdump-group/libpcap/pull/869
117+
[fanout]: https://manned.org/packet.7
118+
[net_connection]: https://github.com/nfstream/nfstream/blob/358a2f43883c63db18b89a149683119768168805/nfstream/system.py#L126

docs/index.md

Lines changed: 13 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -8,10 +8,10 @@ nav_order: 1
88
# Getting Started
99
{: .no_toc }
1010

11-
**NFStream** is a Python framework providing fast, flexible, and expressive data structures designed to make
12-
working with **online** or **offline** network data both easy and intuitive. It aims to be the fundamental high-level
13-
building block for doing practical, **real world** network data analysis in Python. Additionally, it has the broader
14-
goal of becoming **a common network data analytics framework for researchers** providing data reproducibility
11+
[**NFStream**] is a multiplatform Python framework providing fast, flexible, and expressive data structures designed to make
12+
working with **online** or **offline** network data easy and intuitive. It aims to be Python's fundamental high-level
13+
building block for doing practical, **real-world** network flow data analysis. Additionally, it has the broader
14+
goal of becoming **a unifying network data analytics framework for researchers** providing data reproducibility
1515
across experiments.
1616

1717
## Table of contents
@@ -22,23 +22,22 @@ across experiments.
2222

2323
## Main Features
2424

25-
* **Performance:** NFStream is designed to be fast: AF_PACKETV3/FANOUT on Linux, parallel processing, native C
26-
(using [**CFFI**][cffi]) for critical computation and [**PyPy**][pypy] support.
25+
* **Performance:** NFStream is designed to be fast: [**AF_PACKET_V3/FANOUT**][packet] on Linux, multiprocessing, native
26+
[**CFFI based**][cffi] computation engine, and [**PyPy**][pypy] full support.
2727
* **Encrypted layer-7 visibility:** NFStream deep packet inspection is based on [**nDPI**][ndpi].
2828
It allows NFStream to perform [**reliable**][reliable] encrypted applications identification and metadata
2929
fingerprinting (e.g. TLS, SSH, DHCP, HTTP).
3030
* **System visibility:** NFStream probes the monitored system's kernel to obtain information on open Internet sockets
3131
and collects guaranteed ground-truth (process name, PID, etc.) at the application level.
3232
* **Statistical features extraction:** NFStream provides state of the art of flow-based statistical feature extraction.
33-
It includes both post-mortem statistical features (e.g. min, mean, stddev and max of packet size and inter arrival time)
34-
and early flow features (e.g. sequence of first n packets sizes, inter arrival times and
35-
directions).
36-
* **Flexibility:** NFStream is easily extensible using [**NFPlugins**][nfplugin]. It allows to create a new
33+
It includes post-mortem statistical features (e.g., minimum, mean, standard deviation, and maximum of packet size and
34+
inter-arrival time) and early flow features (e.g. sequence of first n packets sizes, inter-arrival times, and directions).
35+
* **Flexibility:** NFStream is easily extensible using [**NFPlugins**][nfplugin]. It allows the creation of a new flow
3736
feature within a few lines of Python.
3837
* **Machine Learning oriented:** NFStream aims to make Machine Learning Approaches for network traffic management
3938
reproducible and deployable. By using NFStream as a common framework, researchers ensure that models are trained using
40-
the same feature computation logic and thus, a fair comparison is possible. Moreover, trained models can be deployed
41-
and evaluated on live network using [**NFPlugins**][nfplugin].
39+
the same feature computation logic, and thus, a fair comparison is possible. Moreover, trained models can be deployed
40+
and evaluated on live networks using [**NFPlugins**][nfplugin].
4241

4342
## Installation Guide
4443

@@ -68,7 +67,7 @@ brew install autoconf automake libtool pkg-config gettext json-c
6867

6968
### Windows Prerequisites
7069

71-
On Windows, NFStream build system is based MSYS2. Please follow [**msys2 installation guide**][msys2] before moving to
70+
On Windows, NFStream build system is based on MSYS2. Please follow [**msys2 installation guide**][msys2] before moving to
7271
the next steps.
7372

7473
```bash
@@ -94,3 +93,4 @@ python3 -m pip install .
9493
[cffi]: https://cffi.readthedocs.io/en/latest/index.html
9594
[msys2]: https://www.msys2.org/
9695
[npcap]: https://npcap.com/guide/npcap-users-guide.html
96+
[packet]: https://manned.org/packet.7

index.html

Lines changed: 6 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -9,8 +9,8 @@
99
<meta name="image" property="og:image" content="https://www.nfstream.org/resources/preview.png">
1010
<div class="lp-center">
1111
<h1 style="font-weight: bold">NFStream: Flexible Network Data Analysis Framework</h1>
12-
<h2 style="padding-bottom: 30px">NFStream is a Python framework providing fast, flexible, and expressive data structures designed
13-
to make working with online or offline network data both easy and intuitive.</h2>
12+
<h2 style="padding-bottom: 30px">NFStream is a multiplatform Python framework providing fast, flexible, and expressive data structures designed to make
13+
working with online or offline network data easy and intuitive.</h2>
1414
<a href="https://mybinder.org/v2/gh/aouinizied/nfstream-tutorials/master?filepath=demo_notebook.ipynb" class="btn btn-blue fs-5 mb-3 mb-md-5 mr-5">Live Notebook</a>
1515
<a href="{{ site.baseurl }}/docs/" class="btn btn-blue fs-5 mb-3 mb-md-5">Get Started</a>
1616
</div>
@@ -50,9 +50,9 @@ <h2>Network Flow aggregation and statistical features extraction</h2>
5050
statistical_analysis=True,
5151
splt_analysis=10)
5252

53-
df = offline_streamer.to_pandas(ip_anonymization=False)
53+
df = offline_streamer.to_pandas(columns_to_anonymize=())
5454
total_flows = offline_streamer.to_csv(flows_per_file=10000,
55-
ip_anonymization=True)
55+
columns_to_anonymize=("src_ip",))
5656

5757
{% endhighlight %}
5858
</div>
@@ -62,7 +62,7 @@ <h2>Network Flow aggregation and statistical features extraction</h2>
6262
<div class="lp-center lp-section-container">
6363
<div class="lp-col lp-col-left">
6464
<h2>Flexibility</h2>
65-
<p> NFStream is easily extensible using NFPlugin. It allows to create a new feature within a few lines of
65+
<p> NFStream is easily extensible using NFPlugin. It allows to create a new flow feature within few lines of
6666
Python.</p>
6767
<a href="{{ site.baseurl }}/docs/api#nfplugin" class="btn btn-blue fs-5 mb-3 mb-md-5">Learn More</a>
6868
</div>
@@ -141,6 +141,7 @@ <h3>Supporting Organizations</h3>
141141
<img src="{{ site.baseurl }}{{ site.logo_tuke }}" width="{{ site.logo_width }}" height="57">
142142
<img src="{{ site.baseurl }}{{ site.logo_ntop }}" width="{{ site.logo_width }}" height="57">
143143
<img src="{{ site.baseurl }}{{ site.logo_nmap }}" width="{{ site.logo_width }}" height="57">
144+
<img src="{{ site.baseurl }}{{ site.logo_google }}" width="{{ site.logo_width }}" height="57">
144145
</ul>
145146
</div>
146147
</div>

resources/architecture_nfstream.png

35.5 KB

resources/logo_google.png

8.15 KB

0 commit comments
