FYP GROUP 2 Presentation-Proposal 1

Uploaded by

Lil Mrith
MALWARE CLASSIFICATION TOOL USING AI BASED ON WINDOWS API
GROUP MEMBERS
Sn  Name             Registration Number  Course
1   SUED MWAIGOMOLE  T/UDOM/2020/05222    BSc CNISE
2   JOSEPH KASOBOYE  T/UDOM/2020/05213    BSc CNISE
3   HAFIDH R MOKIWA  T/UDOM/2020/05265    BSc CNISE
4   MUSSA MRISI      T/UDOM/2020/05234    BSc CNISE
5   MSAFIRI MCHOMVU  T/UDOM/2020/05208    BSc CNISE


INTRODUCTION
• Malware continues to plague Windows systems,
causing data breaches, financial losses, and
operational disruptions. Traditional signature-based
detection methods struggle to keep pace with the
rapid evolution of malware, leaving systems exposed.
As a result, there is a pressing need for new
approaches to malware detection and classification.
PROBLEM STATEMENT
• The digital landscape is teeming with ever-evolving
malware, posing a constant threat to Windows systems.
Traditional signature-based detection methods are
becoming increasingly ineffective against sophisticated
malware that utilizes polymorphism and zero-day
exploits. This leaves countless systems vulnerable to data
breaches, financial losses, and operational disruptions.
Furthermore, the sheer volume of malware variants
renders manual analysis and classification impractical,
highlighting the need for automated and intelligent
solutions.
MAIN OBJECTIVE
• The objective of this project is to develop an innovative
Windows API-based malware classification tool that utilizes
artificial intelligence (AI) to overcome the limitations of
traditional methods.
SPECIFIC OBJECTIVES
1. To collect and process datasets of 10 malware families.
2. To train a malware classification AI model using deep learning.
3. To design the user interface of a Windows desktop application.
4. To evaluate tool performance by testing on various malware datasets.
RELATED WORKS
• Pirscoveanu et al. (2015) used Windows API calls to implement a malware classification
system that achieved a detection accuracy of 98%. Features were extracted from about
80,000 malware files spanning four malware categories (Trojans, rootkits, adware, and
potentially unwanted programs) downloaded from VirusTotal and VirusShare. Judging by
the detection results, their approach performs well only when detecting Trojans.
• An API Semantics-Aware Malware Detection Method Based on Deep Learning (2019): This
research proposes a method that uses word vectors and an LSTM to capture the semantic
meaning of API call sequences and classify malware. It reports an accuracy of 96.8%
on its dataset.
• Malbert: A novel pre-training method for malware detection (2021): This research
introduces a pre-trained model based on dynamic analysis for Windows malware detection,
emphasizing API call sequences. It reports an F1-score of 97.3% on its dataset.
• Singh and Singh (2020) extracted API calls from benign and malware executable files using
dynamic analysis in the Cuckoo sandbox.
• Morato et al. (2018) presented a method for detecting ransomware attacks based on
network traffic data.
Functional Requirements
• Malware Classification: The system should be able to
detect and classify various types of malware using Windows
API calls. This includes identifying different malware families
such as viruses, worms, Trojans, ransomware, and other
malicious software.
• User Interface: Provide a user-friendly interface for users to
interact with the tool. This could include options for scanning
files, monitoring processes, viewing classification results, and
configuring settings.
• Reporting: Generate reports detailing the results of
malware analysis, including classification outcomes,
detected threats, and any suspicious activity observed.
Non-Functional Requirements
• Performance: The system should have low latency and minimal
overhead to ensure efficient malware detection and classification
without significantly impacting system performance.
• Compatibility: Ensure compatibility with different versions of the
Windows operating system and various hardware configurations
to maximize usability and adoption.
• Documentation: Provide comprehensive documentation,
including user manuals, installation guides, and technical
specifications, to assist users in deploying and using the tool
effectively.
• Maintainability: Design the system with modularity and
maintainability in mind, allowing for easy updates, bug fixes, and
enhancements over time.
DATA COLLECTION

• The dataset was collected from Kaggle and GitHub
(imported from VirusShare). It contains API call
sequences and their corresponding malware family
labels. The total number of records is 7387.
• Malware types: the dataset covers 10 malware
families: Trojan, Backdoor, Downloader, Worms,
Spyware, Adware, Dropper, Virus, Agent, and
Ransomware.
Malware Family  Samples  Description
Spyware         832      Covertly gathers information about another user's computer activities by transmitting data from their hard drive.
Downloader      1001     Malware whose primary function is downloading further content.
Trojan          1001     Misleads users about its true intent.
Worms           1001     Spreads copies of itself from computer to computer.
Adware          379      Hides on the device and serves advertisements.
Dropper         891      Surreptitiously carries viruses, backdoors, and other malicious software so they can be executed on the compromised machine.
Virus           1001     Designed to spread from host to host; able to replicate itself.
Backdoor        1001     A technique in which a system security mechanism is bypassed undetectably to access a computer or its data.
Agent           165      Serves as a delivery mechanism for other malicious payloads or provides remote access; it can act as a backdoor or a dropper.
Ransomware      115      Designed to encrypt files or lock a victim's computer system, rendering it inaccessible until a ransom is paid.
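The per-family counts in the table can be checked against the stated total of 7387 records, and the same numbers make the class imbalance visible. The sketch below is only an illustration: the inverse-frequency weighting it computes is an assumption of this example (the slides do not say whether or how imbalance is handled), and the names `samples_per_family` and `class_weights` are hypothetical.

```python
# Sample counts per malware family, copied from the table above.
samples_per_family = {
    "Spyware": 832, "Downloader": 1001, "Trojan": 1001, "Worms": 1001,
    "Adware": 379, "Dropper": 891, "Virus": 1001, "Backdoor": 1001,
    "Agent": 165, "Ransomware": 115,
}

total = sum(samples_per_family.values())


def class_weights(counts):
    """Inverse-frequency weights: n_total / (n_classes * count).

    Assumption of this sketch: rarer families get a larger weight.
    The slides do not describe any weighting scheme.
    """
    n_classes = len(counts)
    n_total = sum(counts.values())
    return {name: n_total / (n_classes * c) for name, c in counts.items()}


family_weights = class_weights(samples_per_family)

# Print families from most to least frequent with their share of the data.
for family, count in sorted(samples_per_family.items(), key=lambda kv: -kv[1]):
    share = count / total
    print(f"{family:<11} {count:>5} {share:6.1%} weight={family_weights[family]:.2f}")
print("total", total)
```

Ransomware and Agent together account for fewer than 300 of the 7387 records, so per-family metrics are likely to be more informative than overall accuracy when the tool is evaluated.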
DATASET SHOWING THE SEQUENCES OF API CALLS WITH THEIR RESPECTIVE MALWARE FAMILY
LIST OF API CALLS WITH THEIR UNIQUELY ASSIGNED NUMBER
DICTIONARY TO MAP SPECIFIC API CALL TO INDEX NUMBER
FUNCTION TO MAP API CALL TO UNIQUE INDEX
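The original slides showed this step as screenshots that are not available here, so the following is a minimal sketch of what a dictionary and mapping function of this kind could look like, not the group's actual code. It assumes each record stores its API calls as one whitespace-separated string; the function names, the sample rows, and the choice to start indices at 1 (keeping 0 free for padding) are all assumptions of this example.

```python
def build_api_index(sequences):
    """Assign each distinct API call a unique integer, starting at 1.

    Index 0 is left unused so it can serve as a padding value later
    (an assumption of this sketch, not something the slides state).
    """
    api_to_index = {}
    for sequence in sequences:
        for call in sequence.split():
            if call not in api_to_index:
                api_to_index[call] = len(api_to_index) + 1
    return api_to_index


def encode_sequence(sequence, api_to_index):
    """Replace every API call name in a sequence with its index."""
    return [api_to_index[call] for call in sequence.split()]


# Hypothetical sample rows; real rows come from the collected dataset.
raw_sequences = [
    "LdrLoadDll NtCreateFile NtWriteFile NtClose",
    "LdrLoadDll RegOpenKeyExW NtClose",
]

api_to_index = build_api_index(raw_sequences)
encoded = [encode_sequence(s, api_to_index) for s in raw_sequences]

print(api_to_index)
print(encoded)
```

Building the dictionary from the training data alone means an API call first seen at prediction time would raise a `KeyError`; a real tool would need a policy for unknown calls.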
SHOWING COLUMN OF API CALLS MAPPED WITH A UNIQUE INDEX NUMBER
ASSIGNING UNIQUE INDEX TO MALWARE FAMILIES
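One way to assign an index to each of the 10 families is a plain lookup table, sketched below. The ordering follows the list given under DATA COLLECTION and is an assumption; the slides do not state which number each family actually received, and the names `FAMILIES`, `family_to_index`, and `encode_labels` are hypothetical.

```python
# The 10 families named in the data collection slide, in the order listed
# there. The resulting numbering is an assumption of this sketch.
FAMILIES = [
    "Trojan", "Backdoor", "Downloader", "Worms", "Spyware",
    "Adware", "Dropper", "Virus", "Agent", "Ransomware",
]

# Forward map (name -> index) for training labels.
family_to_index = {name: index for index, name in enumerate(FAMILIES)}

# Reverse map (index -> name) for turning predictions back into names.
index_to_family = {index: name for name, index in family_to_index.items()}


def encode_labels(labels):
    """Convert a list of family names into their integer indices."""
    return [family_to_index[label] for label in labels]


print(encode_labels(["Trojan", "Worms", "Ransomware"]))
print(index_to_family[4])
```

Keeping the reverse map next to the forward one lets the desktop application display a family name rather than a raw class number.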
REMOVAL OF EMPTY VALUES FROM API SEQUENCES AND CONVERSION INTO ARRAYS
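A rough sketch of this cleaning step follows. It assumes that an encoded row may contain `None`, empty strings, or NaN where a record had fewer API calls than the widest row, and it uses the standard-library `array` type as a stand-in; the slides do not show which array library was used, and `clean_sequence` is a hypothetical name.

```python
from array import array


def clean_sequence(values):
    """Drop missing entries and return the remaining indices as an int array.

    Treated as missing (an assumption of this sketch): None, the empty
    string, and floating-point NaN.
    """
    kept = []
    for value in values:
        if value is None or value == "":
            continue
        # NaN is the only float that is not equal to itself.
        if isinstance(value, float) and value != value:
            continue
        kept.append(int(value))
    return array("i", kept)


# Hypothetical row with gaps, as might come out of a tabular file.
row = [12, 7, None, 7, "", float("nan"), 3.0]
cleaned = clean_sequence(row)
print(list(cleaned))
```

The order of the remaining calls is preserved, which matters because the model is meant to learn from the sequence.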
DATASET FORMAT AFTER FEATURE ENGINEERING
MODEL SELECTION
• The deep learning model to be used is LSTM. LSTM (Long Short-Term Memory) is a type of
recurrent neural network (RNN) architecture that is well suited to learning from and making
predictions on sequential data. Unlike traditional feedforward neural networks, which process
each input independently, recurrent neural networks maintain a form of memory over sequences
by recursively passing information from one time step to the next.

• Advantages of using an LSTM model for sequence prediction include sequential data handling,
memory of previous events, feature extraction, modeling of temporal dynamics, and support
for transfer learning and pre-training.
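The recurrence described above, where information is passed from one time step to the next, can be illustrated with a toy LSTM cell. This is a sketch of the standard LSTM update equations using scalar inputs and states so that it runs with the standard library only; the weight names and values are arbitrary assumptions, and it is not the model the project trains.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell (input, hidden, cell all size 1)."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # cell state: keep part of the old, add some new
    h = o * math.tanh(c)     # hidden state handed to the next time step
    return h, c


def run_lstm(inputs, w):
    """Feed a sequence through the cell, carrying (h, c) between steps."""
    h, c = 0.0, 0.0
    hidden_states = []
    for x in inputs:
        h, c = lstm_step(x, h, c, w)
        hidden_states.append(h)
    return hidden_states


# Arbitrary toy weights, chosen only so the example runs.
toy_weights = {
    key: 0.5
    for key in ("wf", "uf", "bf", "wi", "ui", "bi",
                "wo", "uo", "bo", "wg", "ug", "bg")
}

# The two sequences differ only in their first element, yet the final
# hidden states differ: the cell still carries a trace of the first input.
states_a = run_lstm([1.0, 0.0, 0.0], toy_weights)
states_b = run_lstm([0.0, 0.0, 0.0], toy_weights)
print(states_a[-1], states_b[-1])
```

In the classification setting, each input would be a vector representing one API call, and the final hidden state would feed a layer that scores the 10 families.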
IMPORTING PACKAGES FOR MODEL TRAINING
PREPARING DATASETS FOR MODEL TRAINING BY PADDING
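Padding makes every encoded API sequence the same length so that sequences can be batched together. The sketch below shows the idea in plain Python; the slides do not show which library helper or settings were used, so the function name `pad_to_length`, the pad value 0, and padding at the end of the sequence are assumptions. Libraries such as Keras provide an equivalent helper.

```python
def pad_to_length(sequences, max_len, pad_value=0):
    """Pad at the end (or truncate) so every sequence has length max_len."""
    padded = []
    for sequence in sequences:
        trimmed = list(sequence[:max_len])
        padded.append(trimmed + [pad_value] * (max_len - len(trimmed)))
    return padded


# Hypothetical encoded sequences of different lengths.
encoded_sequences = [[1, 2, 3, 4], [1, 5, 4], [6], [7, 8, 9, 10, 11]]

padded_sequences = pad_to_length(encoded_sequences, max_len=4)
for row in padded_sequences:
    print(row)
```

Using 0 as the pad value only works if no real API call is mapped to index 0, and long sequences lose their tail when truncated, so `max_len` is a trade-off between coverage and training cost.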
MODEL TRAINING
