CHAPTER - 1
INTRODUCTION
Internet of Things (IoT) devices are increasingly integrated into cyber-physical systems
(CPS), including in critical infrastructure sectors such as dams and utility plants. In these
settings, IoT devices (also referred to as Industrial IoT, or IIoT) are often part of an Industrial
Control System (ICS), tasked with the reliable operation of the infrastructure. ICS can be
broadly defined to include supervisory control and data acquisition (SCADA) systems,
distributed control systems (DCS), and systems built around programmable logic controllers
(PLCs) communicating over protocols such as Modbus.
Connecting ICS or IIoT-based systems to public networks, however, increases their
attack surface and the risk of being targeted by cyber criminals. One high-profile
example is the Stuxnet campaign, which reportedly targeted Iranian centrifuges for nuclear
enrichment in 2010, causing severe damage to the equipment [1], [2]. Another example is
the incident targeting a pump that resulted in the failure of an Illinois water plant in 2011
[3]. BlackEnergy3 was another campaign that targeted Ukraine's power grid in 2015, resulting
in a power outage that affected approximately 230,000 people [4]. In April 2018, there were also
reports of successful cyber-attacks affecting three U.S. gas pipeline firms, resulting in the
shutdown of electronic customer communication systems for several days [1]. Although
security solutions developed for information technology (IT) and operational technology (OT)
systems are relatively mature, they may not be directly applicable to ICSs, for example
because of the tight integration between the controlled physical environment and the cyber
systems.
Popular attack detection and attribution approaches include those based on signatures
and anomalies. To mitigate the known limitations of both signature-based and anomaly-based
detection and attribution approaches, there have been attempts to introduce hybrid
approaches [6]. Although hybrid approaches are effective at detecting unusual activities,
they are not reliable in the face of frequent network upgrades, which result in different Intrusion
Detection System (IDS) topologies [7]. Beyond this, conventional attack detection and attribution
techniques mainly rely on network metadata analysis (e.g., IP addresses, transmission ports,
traffic duration, and packet intervals). Therefore, there has been renewed interest in
attack detection and attribution solutions based on Machine Learning (ML) or Deep Neural
Networks (DNN). In addition, attack detection approaches can be categorized
into network-based and host-based approaches. Supervised clustering, single-class or multi-class
Support Vector Machines (SVM), fuzzy logic, Artificial Neural Networks (ANN), and DNNs are
commonly used techniques for attack detection in network traffic. These techniques analyze
real-time traffic data to detect malicious attacks in a timely manner. However, attack detection
that considers only network and host data may fail to detect sophisticated attacks or insider
attacks.
1.1 OVERVIEW
With the rise of IoT devices in cyber-physical systems (CPS), especially within
critical infrastructure such as dams, water treatment plants, and power grids, security has
become a pressing concern. These IoT components—collectively referred to as Industrial IoT
(IIoT)—often operate as part of Industrial Control Systems (ICS) like SCADA and DCS, which
are highly sensitive to disruptions. The integration of ICS/IIoT with public networks has
significantly expanded the attack surface, making them vulnerable to cyber threats, as
demonstrated by real-world incidents such as Stuxnet, BlackEnergy3, and attacks on U.S. gas
pipelines.
Traditional IT and OT security solutions are often not directly applicable to ICS due to their
tight coupling with physical processes and the prioritization of system availability over
confidentiality. This unique characteristic of ICS calls for specialized system-level security
methods that not only monitor cyber behavior but also analyze physical process data to
maintain reliable operation.
However, many models rely solely on network or host data, overlooking physical process data
and the issue of class imbalance in ICS datasets. Unsupervised learning and deep learning-
based feature extraction offer promising paths to improve detection accuracy, especially for
unseen or stealthy attacks. These advanced approaches aim to automate threat detection while
reducing reliance on handcrafted features.
Thus, this project focuses on designing effective detection and attribution mechanisms for IoT-
enabled CPS by leveraging ML/DL techniques that integrate both cyber and physical data to
enhance the resilience of critical infrastructure systems.
1.2 MOTIVATION
Notable cyber-attacks like Stuxnet, the Illinois water plant incident, and the Ukraine power
grid outage have demonstrated the potentially devastating consequences of ICS vulnerabilities.
Traditional IT and OT security methods are not fully suited to the unique operational
constraints of ICS, where system availability and physical safety are top priorities. As such,
there is a growing need for security mechanisms that can monitor both cyber and physical
behaviors in real time.
This project is motivated by the need to design intelligent, adaptive detection and attribution
systems using Machine Learning (ML) and Deep Learning (DL). By leveraging both network
and process-level data, and addressing issues like imbalanced datasets and false positives, these
approaches can better detect advanced, stealthy, or insider threats. Ensuring the safety and
reliability of IoT-enabled CPS is critical for protecting national infrastructure and public
welfare.
In [11], ML algorithms such as K-Nearest Neighbor (KNN), Random Forest (RF), DT,
Logistic Regression (LR), ANN, Naïve Bayes (NB), and SVM were compared in terms of
their effectiveness in detecting backdoor, command, and SQL injection attacks in water storage
systems. The comparative summary suggested that the RF algorithm has the best attack
detection, with a recall of 0.9744; the ANN is the fifth-best algorithm, with a recall of 0.8718;
and the LR is the worst-performing algorithm, with a recall of 0.4744. The authors also reported
that the ANN could not detect 12.82% of the attacks and considered 0.03% of the normal
samples to be attacks. In addition, LR, SVM, and KNN considered many attack samples as
normal samples, and these ML algorithms are sensitive to imbalanced data. In other words,
they are not suitable for attack detection in ICS. In [12], the authors presented a KNN algorithm
to detect cyber-attacks on gas pipelines. To minimize the effect of using an imbalanced dataset
in the algorithm, they performed oversampling on the dataset to achieve balance.
Using the KNN on the balanced dataset, they reported an accuracy of 97%, a precision of 0.98,
a recall of 0.92, and an f-measure of 0.95. In [13], the authors presented a Logical Analysis of
Data (LAD) method to extract patterns/rules from the sensor data and use these patterns/rules
to design a two-step anomaly detection system. In the first step, a system is classified as stable
or unstable, and in the second one, the presence of an attack is determined. They compared the
performance of the proposed LAD method with the DNN, SVM, and CNN methods. Based on
these experiments, the DNN outperformed the LAD method in the precision metric; however,
the LAD performed better in recall and f-measure.
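As an illustration of the oversampling strategy used in [12] above, the following is a minimal Python sketch (using scikit-learn and the imbalanced-learn library) that balances an imbalanced dataset with SMOTE, trains a KNN classifier, and reports the same metrics; the feature matrix and labels here are random placeholders, not a real gas-pipeline dataset.

# Sketch: balance an imbalanced ICS-style dataset with SMOTE, then train and
# evaluate a KNN classifier. X and y are random placeholders standing in for
# a preprocessed dataset (0 = normal, 1 = attack).
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))              # placeholder features
y = (rng.random(2000) < 0.1).astype(int)     # roughly 10% attack samples

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

# Oversample only the training split so the test set keeps the real imbalance.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_bal, y_bal)
y_pred = clf.predict(X_te)

print("accuracy :", accuracy_score(y_te, y_pred))
print("precision:", precision_score(y_te, y_pred, zero_division=0))
print("recall   :", recall_score(y_te, y_pred))
print("f-measure:", f1_score(y_te, y_pred))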
In [14], the authors used the DNN algorithm to detect false data injection attacks in power
systems. Findings of their evaluation using two datasets suggested 91.80% accuracy. In [15],
the authors proposed an autoencoder-based method to detect false data injection attacks and
clean them using denoising autoencoders. Their experiments showed that these methods
outperformed the SVM-based method. To handle the effect of imbalanced data on the
algorithm, they ignored attack data in training the autoencoder. In [16], the authors presented
a technique based on Extreme Learning Machine (ELM) for attack detection in CPS. To
address the imbalance challenge of neural networks, training was conducted using only
normal data. Based on these experiments, the proposed ELM-based method outperformed the
SVM attack detection method.
DISADVANTAGES:
• Not designed for IoT devices: Most existing security systems are made for normal
computers and networks, not for small IoT devices that have less power and memory.
• Cannot detect new attacks easily: Some systems can only detect known attacks and
may fail when a new type of attack happens.
• Weak in finding the attacker: It is difficult for current systems to accurately find out
who started the attack.
• Slow response time: Some systems take longer to detect and respond to attacks,
which can be dangerous for real-time cyber-physical systems.
• Difficult to update: Many devices cannot be easily updated with the latest security
protections, making them more vulnerable over time.
Moreover, removing the normal data from a dataset is not the right solution: the number
of attack samples in ICS/IIoT datasets is usually less than 10% of the dataset, so removing the
normal samples, which make up more than 80% of the data, discards most of the dataset's
knowledge.
To avoid the above-mentioned problems in handling imbalanced datasets, this study proposed
a new deep representation learning method to make the DNN able to handle imbalanced
datasets without changing, generating, or removing samples. This model consisted of two
unsupervised stacked autoencoders, each responsible for finding patterns from one class. Since
each model tried to extract abstract patterns of one class without considering the other, its
output represented its inputs well. The stacked autoencoders had three encoder and three
decoder layers, along with input and final representation layers. The encoder layers mapped
the input representation to a higher, 800-dimensional space, then a 400-dimensional space, and
finally a 16-dimensional space; the encoder function of an autoencoder can be written as
h = f(Wx + b). The decoder layers did the opposite and tried to reconstruct the input
representation, starting from the 16-dimensional new representation and mapping it to the
400-dimensional, 800-dimensional, and input representations; the corresponding decoder
function is x̂ = g(W'h + b'). These hyperparameters were selected using trial and error to
obtain the best performance in f-measure with the lowest architectural complexity.
ADVANTAGES:
• Handles imbalanced datasets without changing, generating, or removing samples.
• Robust to class imbalance and capable of detecting previously unseen attacks.
• Attributes detected attacks to a specific attack type, supporting targeted incident response.
• Detects and attributes attacks in a timely manner, with better recall and f-measure than comparable previous works.
1.5 OBJECTIVE
This project aims to leverage Machine Learning (ML) and Deep Learning (DL) techniques to
identify anomalies and intrusions by analyzing both cyber data (such as network traffic and
system logs) and physical data (such as sensor and actuator readings). By combining these data
sources, the system can detect even stealthy or insider threats that may not be visible through
conventional methods.
Another key objective is to implement an effective attribution mechanism that can accurately
determine the source, method, and impact of an attack. The system will also address common
challenges in ICS security, such as the imbalance of attack data in datasets, and strive to reduce
false positives. Ultimately, the goal is to evaluate the proposed framework using benchmark
datasets and performance metrics to ensure its effectiveness and reliability in real-world
scenarios.
CHAPTER - 2
LITERATURE SURVEY
In recent years, the integration of IoT (Internet of Things) devices into cyber-
physical systems (CPS) has grown rapidly, but this integration has also increased the risk
of cyber-attacks. Several researchers and developers have attempted to secure such
environments using various detection techniques, including rule-based systems, machine
learning models, and blockchain technology. However, most existing methods focus only
on detection and often ignore the crucial step of attribution (identifying the source of an
attack). This literature survey explores existing approaches and their limitations,
highlighting the need for improved systems that handle both detection and attribution in
resource-constrained IoT environments.
One of the earliest and most well-known intrusion detection systems is Snort,
developed by Martin Roesch in 1999. Snort is a rule-based intrusion detection and
prevention system designed to monitor network traffic and identify malicious patterns.
Although effective in traditional computer networks, Snort struggles with IoT systems due
to their limited memory and processing power. It is not designed to detect complex or
unknown (zero-day) attacks and offers no mechanisms for tracing the attacker. Thus, while
Snort provides a foundation for understanding network-level intrusion detection, it is not
well-suited for the lightweight and dynamic nature of IoT environments.
A more recent approach is Kitsune, an online network intrusion detection system built on
an ensemble of lightweight autoencoders, whose modest computational footprint makes it
ideal for IoT-based environments. While Kitsune demonstrates high accuracy in detecting
threats, it lacks the ability to perform attribution. This means while it can say "something
is wrong," it cannot tell "who caused it" — a gap that remains crucial for forensic analysis
and preventing future attacks.
From the review of existing systems, it is clear that most methods either focus
on detection only or are too heavy for IoT environments. Very few solutions handle both
attack detection and attribution efficiently, especially for low-resource IoT devices. The
proposed project aims to fill this gap by developing a lightweight, accurate, and fast system
that not only detects cyber-attacks but also traces the source of those attacks. This makes it
highly suitable for modern, real-time, and sensitive cyber-physical systems powered by
IoT.
CHAPTER - 3
PROBLEM DEFINITION
With the rapid growth of the Internet of Things (IoT) and its integration into cyber-physical
systems (CPS), there is an increasing risk of cyber-attacks that can affect both the digital and
physical components of these systems. Traditional security solutions such as Intrusion
Detection Systems (IDS) and antivirus programs are not well-suited for IoT devices due to
their limited computing power, memory, and energy resources. Furthermore, most existing
systems focus only on detecting known types of attacks and often fail to identify new or zero-
day threats. They also lack effective mechanisms for tracing or attributing the source of the
attack, which is essential for preventing future incidents and improving overall security.
Therefore, there is a need for a lightweight, real-time security system specifically designed for
IoT-enabled cyber-physical environments. This system should be capable of accurately
detecting abnormal activities (attacks) and identifying the origin or attacker behind them,
without overloading the limited resources of IoT devices. The ultimate goal is to enhance the
safety, reliability, and resilience of modern IoT systems through early detection and effective
attribution of cyber threats.
CHAPTER - 4
SOFTWARE AND HARDWARE REQUIREMENTS
Monitor: LED
CHAPTER - 5
DESIGN AND IMPLEMENTATION
5.1 Architecture of the Proposed System
5.1.1 Architectures
5.1.2 Module Description
The proposed two-level ensemble attack detection and attribution framework consists of
multiple modules, each responsible for a critical function in ensuring the security of IoT-
enabled cyber-physical systems (CPS), specifically within an industrial control system
(ICS) context.
The Data Acquisition Module is the initial component of the framework. It is responsible
for collecting data from various sources within the ICS environment, such as sensors,
actuators, programmable logic controllers (PLCs), and
supervisory control and data acquisition (SCADA) systems. This data may be real-time
or historical and is typically collected using communication protocols such as MQTT,
Modbus, or OPC UA. The collected data serves as the foundation for all subsequent
processing and analysis tasks. Next, the Data Preprocessing Module plays a crucial role
in ensuring the quality and consistency of the input
data. This module performs data cleaning to handle missing values, corrupted records,
and noise. It also includes feature engineering techniques, such as extracting statistical
and frequency-domain features, which are vital for effective learning. The data is then
segmented into fixed-size time windows suitable for sequential analysis and normalized
using techniques like min-max scaling or z-score standardization. In addition, this
module addresses class imbalance—a common issue in security datasets—using methods
such as Synthetic Minority Over-sampling Technique (SMOTE) or by adjusting class
weights during model training.
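To illustrate the preprocessing steps just described, the sketch below segments a multivariate time series into fixed-size windows, applies min-max scaling, and computes class weights; the window length, input arrays, and window-labeling rule are illustrative assumptions.

# Sketch: fixed-size windowing, min-max normalization, and class weighting.
# `series` (timesteps x features) and `labels` (one label per timestep) are
# random placeholders for data coming from the acquisition module.
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.utils.class_weight import compute_class_weight

def make_windows(series, labels, window=50):
    # Slice into non-overlapping windows; a window is labeled as attack (1)
    # if any timestep inside it is an attack.
    n = (len(series) // window) * window
    X = series[:n].reshape(-1, window, series.shape[1])
    y = labels[:n].reshape(-1, window).max(axis=1)
    return X, y

series = np.random.rand(10000, 8)                    # placeholder sensor data
labels = (np.random.rand(10000) < 0.05).astype(int)  # ~5% attack timesteps

series = MinMaxScaler().fit_transform(series)        # scale each feature to [0, 1]
X, y = make_windows(series, labels)

# Class weights counteract the imbalance during model training.
w = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
class_weight = {0: w[0], 1: w[1]}
print(X.shape, y.mean(), class_weight)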
The Level 1: Attack Detection Module is designed to identify whether a given data
instance represents a cyberattack or normal system behavior. This module employs a
hybrid ensemble approach. Initially, a Decision Tree classifier is used to provide a fast
and interpretable decision layer that can be deployed on resource-constrained edge
devices. In parallel, a more advanced ensemble deep learning component is
implemented. This includes a Convolutional Neural
Network (CNN) for capturing spatial correlations, a Long Short-Term Memory (LSTM)
or Gated Recurrent Unit (GRU) network for modeling temporal
dependencies, and a Transformer model for identifying long-range dependencies
using attention mechanisms. The outputs of these models are fused in a dedicated layer,
and a meta-classifier is used to make the final binary classification decision (attack or
normal). If an attack is detected, the data is forwarded to the next stage.
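A minimal sketch of this fusion idea follows: a CNN branch and an LSTM branch produce embeddings that are concatenated and passed to a small fully connected meta-head for the binary decision. The Transformer branch and the parallel Decision Tree are omitted for brevity, and all layer sizes are assumptions.

# Sketch: Level-1 detector with CNN and LSTM branches fused by a meta-classifier
# head. The Transformer branch and the parallel Decision Tree described in the
# text are omitted for brevity; layer sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

WINDOW, FEATURES = 50, 8  # must match the preprocessing windows

inp = layers.Input(shape=(WINDOW, FEATURES))

# CNN branch: local (spatial) correlations across neighboring timesteps.
c = layers.Conv1D(32, kernel_size=3, activation="relu")(inp)
c = layers.GlobalMaxPooling1D()(c)

# LSTM branch: temporal dependencies over the window.
l = layers.LSTM(32)(inp)

# Fusion layer plus meta-classifier for the binary attack/normal decision.
fused = layers.Concatenate()([c, l])
h = layers.Dense(32, activation="relu")(fused)
out = layers.Dense(1, activation="sigmoid")(h)

detector = Model(inp, out)
detector.compile(optimizer="adam", loss="binary_crossentropy",
                 metrics=[tf.keras.metrics.Recall()])
# detector.fit(X, y, class_weight=class_weight, epochs=20, batch_size=64)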
Finally, the Output and Feedback Module handles the system’s response to
detected threats. It generates real-time alerts for system operators, logs predictions and
supporting metadata for audit purposes, and presents information through a
visualization dashboard for better situational awareness. Importantly, this module also
supports a feedback loop, allowing human operators to correct false predictions or add
new labels, thereby enabling continual learning and model
improvement over time. In some deployments, the system also includes an
Optional Deployment Module, which allows components of the architecture to be
distributed across edge and cloud platforms. Lightweight models, such as the
Decision Tree and CNN, can be deployed at the edge for real-time processing,
while the full deep ensemble models and attribution logic can run on a centralized cloud or
server infrastructure. This division enables efficient resource utilization while maintaining
high accuracy and responsiveness.
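One plausible way to realize the edge side of this split, assuming the lightweight detector is a Keras model like the one sketched earlier, is to convert it to TensorFlow Lite for deployment on constrained devices; the output file name is arbitrary.

# Sketch: convert the lightweight Level-1 detector for edge deployment.
# Assumes `detector` is a trained Keras model.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(detector)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable weight quantization
tflite_model = converter.convert()

with open("level1_detector.tflite", "wb") as f:
    f.write(tflite_model)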
In operation, data acquisition begins at the ICS field level, where the system gathers sensor
readings, actuator states, and communication logs. The data is transmitted via standard
industrial protocols like MQTT,
Modbus, or OPC-UA, and forwarded to the data ingestion engine of the system.
Once collected, the data enters the preprocessing phase, which is essential for ensuring
data quality and consistency. Here, the data undergoes several transformation steps: noise
filtering, handling of missing values, and formatting into structured time windows suitable
for analysis. Features are extracted to represent both temporal behavior and system state
using statistical, domain-specific, and frequency-based techniques. Additionally, data
normalization is
performed to scale features into uniform ranges, and data imbalance is addressed using
synthetic sampling techniques or by assigning class weights.
The preprocessed data is then forwarded to the first-level detection module. This layer
consists of two major components operating in parallel. The first is a
lightweight Decision Tree classifier, which performs a fast and interpretable assessment to
quickly flag suspicious activity. The second component is an ensemble deep representation
learning model that leverages a combination of Convolutional Neural Networks (CNNs),
Long Short-Term Memory (LSTM) networks, and Transformers to extract complex
features from the input data. These models capture local signal patterns, temporal
dependencies, and long-range correlations, respectively. Their outputs are fused into a
unified representation and passed through a meta-classifier, which determines whether the
current input corresponds to normal behavior or a potential attack.
Finally, the system supports flexible deployment through an edge-cloud
architecture. The detection module, particularly the decision tree and lightweight CNN
components, can be deployed at the edge (near the data source) to ensure low-latency
responses. The more computationally intensive attribution module can be hosted on a
centralized cloud platform or dedicated server. This distributed deployment ensures that
the system remains scalable, responsive, and suitable for real-time industrial
environments.
The preprocessed data then serves as the input for the Level 1: Attack Detection Module.
Here, the decision tree classifier and the ensemble deep representation learners operate in
parallel to analyze the data. Their individual outputs are integrated within a fusion
mechanism that combines their predictions to generate a consensus decision. This
collaborative approach enhances detection robustness by leveraging both fast
interpretable rules and complex learned features.
When the Level 1 module identifies an anomaly indicative of a potential attack, the data,
along with the detection results, is transmitted to the Level 2: Attack Attribution Module.
This module further processes the data through its ensemble of classifiers—comprising
CNN, GRU, and MLP networks—operating simultaneously to analyze various feature
perspectives. The outputs of these classifiers are merged through a soft voting or stacking
mechanism to produce a final classification of the attack type. This interaction ensures that
the attribution is comprehensive and accounts for diverse characteristics of the detected
anomaly. The final attack detection and attribution results are forwarded to the Output and
Feedback Module, which is responsible for system response and continuous
learning. This module disseminates real-time alerts to system operators and logs
detailed event information for audit and forensic purposes. Importantly, it enables a
feedback mechanism where human analysts can review and annotate system
decisions, providing corrected labels or additional context. These annotations are fed back
into the preprocessing and model training pipelines to update the system’s knowledge
base, thereby improving future detection and attribution
performance.
Overall, the interaction among modules is orchestrated through well-defined data flows
and communication protocols, ensuring that the system acts as an integrated whole. This
modular yet interconnected design enables efficient, accurate, and
adaptive cybersecurity monitoring tailored for real-world ICS environments.
5.2 Algorithms
The proposed system leverages several algorithms designed to handle different stages of the attack
detection and attribution process within industrial control systems (ICS). These algorithms are
optimized to address the challenges of imbalanced data, complex temporal-spatial patterns, and
real-time constraints inherent in IoT-enabled cyber-physical systems.
Convolutional Neural Network (CNN): Extracts local spatial features from sensor readings and
control signals.
Long Short-Term Memory (LSTM) Network: Models temporal dependencies and sequences in
time-series data.
Each of these networks processes the input data in parallel, producing feature embeddings that are
fused via concatenation or averaging. A meta-classifier, typically a fully connected neural network
or gradient boosting model, then performs the final binary classification. This ensemble approach
improves generalization and robustness,
especially under imbalanced attack scenarios.
Outputs from these networks are combined using soft voting or stacking, allowing the system to
predict the attack type (e.g., Denial-of-Service, Replay, Man-in-the-Middle, Data Injection). This
ensemble strategy leverages the complementary strengths of each classifier to achieve high
attribution accuracy.
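The soft-voting step can be illustrated in a few lines: each attribution classifier outputs a probability distribution over attack types, the distributions are averaged, and the highest-scoring class is chosen. The three-classifier setup and the class names below are illustrative assumptions.

# Sketch: soft voting over the parallel attribution classifiers. Each
# predict_proba-style output has shape (n_samples, n_attack_types); the
# attack-type names are illustrative.
import numpy as np

ATTACK_TYPES = ["DoS", "Replay", "MitM", "DataInjection"]

def soft_vote(prob_list):
    # Average the class probabilities from all classifiers, then pick argmax.
    avg = np.mean(prob_list, axis=0)
    return avg.argmax(axis=1)

probs_cnn = np.array([[0.70, 0.10, 0.10, 0.10]])
probs_gru = np.array([[0.50, 0.30, 0.10, 0.10]])
probs_mlp = np.array([[0.60, 0.20, 0.10, 0.10]])

pred = soft_vote([probs_cnn, probs_gru, probs_mlp])
print([ATTACK_TYPES[i] for i in pred])  # -> ['DoS']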
5.3 System Design
The design of the proposed two-level ensemble attack detection and attribution framework is
centered around the unique challenges of securing IoT-enabled cyber-physical systems (CPS) in
industrial control system (ICS) environments. The system is architected to provide accurate,
real-time detection and precise attribution of cyberattacks, while addressing constraints such as
imbalanced datasets, noisy data, and resource limitations in edge devices.
At a high level, the system consists of several interconnected modules: data acquisition,
preprocessing, attack detection, attack attribution, output management, and deployment
orchestration. These modules interact through well-defined data flows and interfaces, forming a
cohesive pipeline.
Data Acquisition serves as the entry point of the system, where real-time and historical data is
gathered from ICS components including sensors, actuators, programmable logic
controllers (PLCs), and SCADA systems. This data is transmitted over standard industrial
communication protocols such as MQTT, Modbus, and OPC-UA, ensuring compatibility with
diverse ICS setups.
The Data Preprocessing Module is responsible for transforming raw sensor and network data into
a format suitable for machine learning models. It applies techniques such as noise filtering,
handling of missing or corrupted data, feature extraction (both statistical and domain-specific),
data normalization, and windowing of time-series data.
Furthermore, it tackles data imbalance by employing oversampling or class-weighting methods to
ensure minority attack classes are adequately represented during training.
Following preprocessing, data is fed into the Level 1 Attack Detection Module, which
incorporates a hybrid ensemble approach. A Decision Tree classifier provides a rapid and
interpretable initial classification. Simultaneously, an ensemble of deep learning
models—including CNNs to extract spatial features, LSTMs or GRUs to capture temporal
dependencies, and Transformer architectures for attention-based long-range feature extraction—
process the data. The outputs of these models are fused through a meta-classifier to produce a
robust binary decision indicating normal or attack behavior.
If an attack is detected, the system proceeds to the Level 2 Attack Attribution Module. This module
comprises an ensemble of deep neural networks specialized for categorizing the attack type. It
utilizes parallel CNN, GRU, and MLP models to analyze various
aspects of the data. Their combined outputs are integrated via soft voting or stacking to yield an
accurate attack class prediction, facilitating targeted incident response.
The Output and Feedback Module manages alerts, logs, and user interactions. It generates real-time
notifications for security personnel, stores detailed event records for auditing, and provides
visualization dashboards for situational awareness. Importantly, it enables a feedback loop wherein
human analysts can annotate and correct system outputs. These corrections serve as new training
data to improve model accuracy through periodic retraining or online learning.
Lastly, the system design supports flexible Deployment Architectures. Lightweight components
like the Decision Tree and CNN models can be deployed at the edge—close to data sources—for
low-latency inference. More computationally intensive modules,
including the full ensemble models for detection and attribution, can be hosted on cloud or
centralized servers. This hybrid deployment maximizes both responsiveness and scalability,
making the framework practical for real-world ICS environments.
Overall, the system design balances accuracy, efficiency, and adaptability to provide a
comprehensive cybersecurity solution tailored for IoT-enabled industrial control systems.
5.3.2 DFD Diagram
Level 0: (diagram) The Data Owner registers, logs in, and uploads data to the IoT Server. The
IoT Server logs in to view and authorize End Users and Data Owners, and to view file
transactions, encrypt-key details, uploaded files, file contents (with deletion), files, attackers,
audit files, and deduplication files. The End User and the Cloud Database complete the flow.
Level 1: (diagram) The Data Owner logs in to the Cloud Server to view secret-key requests,
and interacts with the IoT Sub-Server to view decrypt-key requests and download files.
5.3.3 Class / UML Diagram
(diagram) Data Owner class. Methods: Register and Login, Upload, Update, Delete, View
Files, Audit File, View Deduplication Files. Members: File Name, Contents, Trapdoor, Secret
Key, Rank, Date & Time, The Initial User, Encrypt Key, Data Owner. Associated with the
IoT Server.
5.3.4 Flow Chart : Owner User
(diagram) Start → User Register → credential check (Yes / No) → Logout.
5.3.5 ER-DIAGRAM
(diagram)
5.4 Sample Code
CHAPTER - 6
SYSTEM STUDY
6.1 FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase, and a business proposal is put forth
with a very general plan for the project and some cost estimates. During system analysis, the
feasibility study of the proposed system is carried out to ensure that the proposed system is
not a burden to the company. For feasibility analysis, some understanding of the major
requirements for the system is essential.
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can pour into the research and development
of the system is limited, and the expenditures must be justified. The developed system is well
within the budget, which was achieved because most of the technologies used are freely
available; only the customized products had to be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the
available technical resources, as this would lead to high demands being placed on the client.
The developed system must have modest requirements, as only minimal or no changes are
required for implementing this system.
SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the user. This
includes the process of training the user to use the system efficiently. The user must not feel
threatened by the system; instead, they must accept it as a necessity. The level of acceptance by
the users solely depends on the methods that are employed to educate users about the system
and to make them familiar with it. Their level of confidence must be raised so that they are also
able to offer constructive criticism, which is welcomed, as they are the final users of the system.
CHAPTER - 7
SYSTEM TESTING
7.1 TESTING METHODOLOGIES
o Unit Testing.
o Integration Testing.
o User Acceptance Testing.
o Output Testing.
o Validation Testing.
Unit testing focuses verification effort on the smallest unit of software design, that is,
the module. Unit testing exercises specific paths in a module's control structure to
ensure complete coverage and maximum error detection. This test focuses on each module
individually, ensuring that it functions properly as a unit; hence the name unit testing.
During this testing, each module is tested individually and the module interfaces are
verified for consistency with the design specification. All important processing paths are tested
for the expected results. All error-handling paths are also tested.
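As a concrete example of module-level unit testing, the following pytest sketch exercises a hypothetical make_windows() helper from the preprocessing module; the module path, function name, and expected behavior are assumptions for illustration.

# Sketch: pytest unit tests for a hypothetical make_windows() preprocessing
# helper; the module under test and its behavior are assumed.
import numpy as np
from preprocessing import make_windows  # assumed module under test

def test_window_shapes():
    series = np.zeros((100, 4))
    labels = np.zeros(100, dtype=int)
    X, y = make_windows(series, labels, window=10)
    assert X.shape == (10, 10, 4)   # 10 windows of 10 timesteps x 4 features
    assert y.shape == (10,)

def test_attack_window_labeling():
    series = np.zeros((20, 2))
    labels = np.zeros(20, dtype=int)
    labels[3] = 1                   # one attack timestep in the first window
    _, y = make_windows(series, labels, window=10)
    assert y[0] == 1 and y[1] == 0  # attack window flagged, clean window not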
Integration testing addresses the issues associated with the dual problems of verification
and program construction. After the software has been integrated, a set of high-order tests is
conducted. The main objective of this testing process is to take unit-tested modules and build
a program structure that has been dictated by design.
This method begins construction and testing with the modules at the lowest level in
the program structure. Since the modules are integrated from the bottom up, processing
required for modules subordinate to a given level is always available, and the need for stubs is
eliminated. The bottom-up integration strategy may be implemented with the following steps:
The low-level modules are combined into clusters that perform a specific software
sub-function.
A driver (i.e., a control program for testing) is written to coordinate test case input
and output.
The cluster is tested.
Drivers are removed and clusters are combined, moving upward in the program
structure.
The bottom-up approach tests each module individually, and then each module is
integrated with a main module and tested for functionality.
User Acceptance of a system is the key factor for the success of any system. The system
under consideration is tested for user acceptance by constantly keeping in touch with the
prospective system users at the time of developing and making changes wherever required. The
system developed provides a friendly user interface that can easily be understood even by a
person who is new to the system.
After performing the validation testing, the next step is output testing of the proposed
system, since no system can be useful if it does not produce the required output in the
specified format. The outputs generated or displayed by the system under consideration are
tested by asking the users about the format they require. The output format is considered in
two ways: on screen and in printed format.
Text Field:
The text field can contain only a number of characters less than or equal
to its size. The text fields are alphanumeric in some tables and alphabetic in other
tables. An incorrect entry always flashes an error message.
Numeric Field:
The numeric field can contain only numbers from 0 to 9. An entry of any other character
flashes an error message. The individual modules are checked for accuracy and for what they
have to perform. Each module is subjected to a test run along with sample data. The individually
tested modules are integrated into a single system. Testing involves executing the program with
real data; the existence of any program defect is inferred from the output. The testing should be
planned so that all the requirements are individually tested.
A successful test is one that brings out the defects for inappropriate data and
produces an output revealing the errors in the system.
The above testing is done using various kinds of test data. Preparation of test data plays
a vital role in system testing. After preparing the test data, the system under study is tested
using that test data. While testing the system with test data, errors are again uncovered and
corrected using the above testing steps, and the corrections are noted for future use.
Live test data are those that are actually extracted from organization files. After a
system is partially constructed, programmers or analysts often ask users to key in a set of data
from their normal activities. Then, the systems person uses this data as a way to partially test
the system. In other instances, programmers or analysts extract a set of live data from the files
and have them entered themselves.
It is difficult to obtain live data in sufficient amounts to conduct extensive testing. And,
although it is realistic data that will show how the system will perform for the typical processing
requirement, assuming that the live data entered are in fact typical, such data generally will not
test all combinations or formats that can enter the system. This bias toward typical values then
does not provide a true systems test and in fact ignores the cases most likely to cause system
failure.
Artificial test data are created solely for test purposes, since they can be generated to
test all combinations of formats and values. In other words, the artificial data, which can
quickly be prepared by a data-generating utility program in the information systems
department, make possible the testing of all logic and control paths through the program.
The most effective test programs use artificial test data generated by persons other than
those who wrote the programs. Often, an independent team of testers formulates a testing plan,
using the systems specifications.
The package "Detection and Attribution of Cyber-Attacks in IoT-enabled CPS" has satisfied
all the requirements specified in the software requirement specification and was accepted.
Whenever a new system is developed, user training is required to educate them about
the working of the system so that it can be put to efficient use by those for whom the system
has been primarily designed. For this purpose the normal working of the project was
demonstrated to the prospective users. Its working is easily understandable and since the
expected users are people who have good knowledge of computers, the use of this system is
very easy.
7.3 MAINTENANCE
This covers a wide range of activities including correcting code and design errors. To
reduce the need for maintenance in the long run, we have more accurately defined the user’s
requirements during the process of system development. Depending on the requirements, this
system has been developed to satisfy the needs to the largest possible extent. With development
in technology, it may be possible to add many more features based on the requirements in
future. The coding and designing is simple and easy to understand which will make
maintenance easier.
TESTING STRATEGY :
A strategy for system testing integrates system test cases and design techniques into a
well-planned series of steps that results in the successful construction of software. The testing
strategy must incorporate test planning, test case design, test execution, and the resultant data
collection and evaluation. A strategy for software testing must accommodate low-level tests
that are necessary to verify that a small source code segment has been correctly implemented,
as well as high-level tests that validate major system functions against user requirements.
Software testing is a critical element of software quality assurance and represents the
ultimate review of specification, design, and coding. Testing represents an interesting anomaly
for software. Thus, a series of tests is performed for the proposed system before the
system is ready for user acceptance testing.
40
SYSTEM TESTING:
Software, once validated, must be combined with other system elements (e.g., hardware,
people, databases). System testing verifies that all the elements are proper and that overall
system function and performance are achieved. It also tests to find discrepancies between the
system and its original objective, current specifications, and system documentation.
UNIT TESTING:
In unit testing, different modules are tested against the specifications produced during the
design of the modules. Unit testing is essential for verification of the code produced during
the coding phase; hence the goal is to test the internal logic of the modules. Using the
detailed design description as a guide, important control paths are tested to uncover errors
within the boundary of the modules. This testing is carried out during the programming stage
itself. In this testing step, each module was found to be working satisfactorily as regards
the expected output from the module.
Test Cases for Detection and Attribution of Cyber-Attacks in IoT-enabled CPS
1. Functional Test Cases
Test Case ID | Test Case Description | Input | Expected Output | Result (Pass/Fail)
TC_F01 | Upload file to IoT Server (Initial User) | Encrypted file + authenticators | File stored, deduplication performed if needed |
TC_F02 | Upload file to IoT Server (Subsequent User) | Duplicate file | Server links to existing file |
TC_F03 | Attack Detection (Unbalanced Data) | ICS network logs | Attack detected with class-specific representation |
TC_F04 | Attack Attribution | Attack data input | Specific attack type identified |
CHAPTER - 8
RESULTS AND OUTPUT SCREENS
CHAPTER - 9
CONCLUSION
This project proposed a novel two-stage ensemble deep learning-based attack detection
and attack attribution framework for imbalanced ICS data. The attack detection stage uses
deep representation learning to map the samples to a new higher-dimensional space and
applies a DT to detect the attack samples. This stage is robust to imbalanced datasets and
capable of detecting previously unseen attacks. The attack attribution stage is an ensemble of
several one-vs-all classifiers, each trained on a specific attack attribute. The entire model
forms a complex DNN with partially connected and fully connected components that can
accurately attribute cyberattacks, as demonstrated.
Despite the complex architecture of the proposed framework, the computational
complexities of the training and testing phases are O(n⁴) and O(n²) respectively (where n is
the number of training samples), which are similar to those of other DNN-based techniques in
the literature. Moreover, the proposed framework can detect and attribute the samples in a
timely manner, with better recall and f-measure than previous works. Future extensions include
the design of a cyber-threat hunting component to facilitate the identification of anomalies
invisible to the detection component, for example by building a normal profile over the entire
system and its assets.
CHAPTER - 10
REFERENCES
[8] C. Bellinger, S. Sharma, and N. Japkowicz, “One-class versus binary
classification: Which and when?” in 2012 11th International Conference on
Machine Learning and Applications, vol. 2, 2012, pp. 102–106.
[9] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT Press, 2016.
[Online]. Available: http://www.deeplearningbook.org
[10] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review
and new perspectives,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 35, no. 8, pp. 1798–1828, 2013.
[11] M. Zolanvari, M. A. Teixeira, L. Gupta, K. M. Khan, and R. Jain, “Machine
Learning-Based Network Vulnerability Analysis of Industrial Internet of
Things,” IEEE Internet of Things Journal, vol. 6, no. 4, pp. 6822–6834, 2019.
[12] I. A. Khan, D. Pi, Z. U. Khan, Y. Hussain, and A. Nawaz, “HML-IDS: A
hybrid-multilevel anomaly prediction approach for intrusion detection in SCADA
systems," IEEE Access, vol. 7, pp. 89507–89521, 2019.
[13] T. K. Das, S. Adepu, and J. Zhou, “Anomaly detection in industrial control
systems using logical analysis of data,” Computers & Security, vol. 96, p.
101935, 2020.
[14] J. J. Q. Yu, Y. Hou, and V. O. K. Li, “Online False Data Injection Attack
Detection With Wavelet Transform and Deep Neural Networks,” IEEE
Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3271–3280,
2018.
[15] M. M. N. Aboelwafa, K. G. Seddik, M. H. Eldefrawy, Y. Gadallah, and M.
Gidlund, "A Machine-Learning-Based Technique for False Data Injection Attacks
Detection in Industrial IoT," IEEE Internet of Things Journal, vol. 7, no. 9, pp.
8462–8471, 2020.
[16] W. Yan, L. K. Mestha, and M. Abbaszadeh, “Attack detection for securing
cyber physical systems," IEEE Internet of Things Journal, vol. 6, no. 5, pp. 8471–
8481, 2019.
[17] A. Cook, A. Nicholson, H. Janicke, L. Maglaras, and R. Smith, “Attribution
of Cyber Attacks on Industrial Control Systems,” EAI Endorsed Transactions on
Industrial Networks and Intelligent Systems, vol. 3, no. 7, p. 151158, 2016.
[18] L. Maglaras, M. Ferrag, A. Derhab, M. Mukherjee, H. Janicke, and S. Rallis,
“Threats, Countermeasures and Attribution of Cyber Attacks on Critical
Infrastructures,” ICST Transactions on Security and Safety, vol. 5, no. 16, p.
155856, 2018.
[19] M. Alaeiyan, A. Dehghantanha, T. Dargahi, M. Conti, and S. Parsa, “A
Multilabel Fuzzy Relevance Clustering System for Malware Attack Attribution
in the Edge Layer of Cyber-Physical Networks,” ACM Transactions on Cyber-
Physical Systems, vol. 4, no. 3, pp. 1–22, 2020.
[20] U. Noor, Z. Anwar, T. Amjad, and K.-K. R. Choo, “A machine learning-
based FinTech cyber threat attribution framework using high-level indicators of
compromise,” Future Generation Computer Systems, vol. 96, pp. 227–242, 2019.