Dimensionality Reduction

This document reviews dimensionality reduction techniques for feature selection and feature extraction. It provides a comprehensive analysis of methods used in various papers, including algorithms, datasets, classifiers, and results. Dimensionality reduction is important for improving accuracy and efficiency by reducing irrelevant and redundant features in high-dimensional data. The paper discusses and compares different feature selection and feature extraction techniques, and how they can significantly reduce computational time while maintaining accuracy.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/341413445

A Comprehensive Review of Dimensionality Reduction Techniques for Feature Selection and Feature Extraction

Article in Journal of Applied Science and Technology Trends · May 2020

DOI: 10.38094/jastt1224


Vol. 01, No. 02, pp. 56 –70 (2020)
ISSN: 2708-0757

JOURNAL OF APPLIED SCIENCE AND TECHNOLOGY TRENDS

www.jastt.org

A Comprehensive Review of Dimensionality Reduction Techniques for Feature Selection and Feature Extraction

Rizgar R. Zebari 1,*, Adnan Mohsin Abdulazeez 2, Diyar Qader Zeebaree 3, Dilovan Asaad Zebari 4, Jwan Najeeb Saeed 5
1 IT Department, Technical College of Informatics Akre, Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, rizgar.ramadhan@dpu.edu.krd
2 Presidency of Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, Adnan.mohsin@dpu.edu.krd
3,4 Research Center of Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, dqszeebaree@dpu.edu.krd, dilovan.zebari@dpu.edu.krd
5 IT Department, Duhok Technical Institute, Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, jwan.najeeb@dpu.edu.krd
* Correspondence: rizgar.ramadhan@dpu.edu.krd

Abstract
Due to sharp increases in data dimensionality, data mining and machine learning (ML) tasks require more efficient techniques to obtain the desired results. Therefore, in recent years, researchers have proposed and developed many methods and techniques to reduce the high dimensionality of data while attaining the required accuracy. To improve learning accuracy and decrease training time, dimensionality reduction is used as a pre-processing step that can eliminate irrelevant data, noise, and redundant features. Dimensionality reduction (DR) is performed with two main methods: feature selection (FS) and feature extraction (FE). FS is considered an important method because data is generated continuously at an ever-increasing rate; it can mitigate serious dimensionality problems by decreasing redundancy effectively, eliminating irrelevant data, and improving the comprehensibility of results. FE, in turn, deals with the problem of finding the most distinctive, informative, and reduced set of features to improve the efficiency of both the processing and storage of data. This paper offers a comprehensive review of FS and FE in the scope of DR. The details of each reviewed paper, such as the algorithms/approaches used, datasets, classifiers, and achieved results, are comprehensively analyzed and summarized. In addition, all of the reviewed methods are discussed systematically to highlight authors' trends, to determine the method(s) that significantly reduced computational time, and to select the most accurate classifiers. As a result, the different types of both methods have been discussed and the findings analyzed.

Keywords: dimension reduction, dimension reduction techniques, feature selection, feature extraction.

Received: April 29, 2020 / Accepted: May 13, 2020 / Online: May 15, 2020

I. INTRODUCTION
Nowadays, data mining and knowledge discovery play a great role in many digital applications. Knowledge is discovered by processing and analyzing large amounts of previously collected data [1]. Data is generated in huge volumes in different fields, and it grows continuously in size, complexity, and dimensionality [2, 3]. A high-dimensional dataset is characterized by its numerous features, yet few samples have a direct relation to data mining and machine learning tasks [4, 5]. Therefore, these data issues pose a big challenge for extracting potentially useful and ultimately understandable patterns or information in almost every data mining task. Moreover, working with high-dimensional data increases the difficulty of knowledge discovery and pattern classification because there are many redundant and irrelevant features. Reducing a high-dimensional dataset to a low-dimensional one by filtering out or removing redundant and noisy information is a way to solve this problem, and it is known as dimensionality reduction [6].

Dimensionality reduction is a process that decreases the dimensionality of the features while the data is still retained. In the reduced, low-dimensional dataset, the crucial features remain even if some particular patterns vanish [7, 8]. It is also used
to reduce the size of the input data while preserving much of the variance of the essential features compared with the larger dataset. Real-world data thus becomes easier to explore and use in data mining applications, and high accuracy can be achieved [1, 9]. Moreover, the role of dimensionality reduction is to enhance the accuracy and efficiency of data mining computation, and it is considered a vital preprocessing step. Furthermore, it provides several advantages, such as eliminating irrelevant and redundant patterns in the dataset, which in turn reduces the time and memory required to process such data [1, 10]. By reducing the dataset, the quality of the data improves, the algorithm works efficiently and achieves better accuracy, and pattern design and examination become clearer for researchers [11]. Additionally, it reduces the cost of computing, improves the visualization of dimensions, and enhances the results [12, 13].

This work reviews more than forty articles on feature selection and feature extraction that were introduced and published in the last three years.

II. DIMENSIONALITY REDUCTION TECHNIQUES
Dimensionality reduction is the operation of transforming a high-dimensional representation of data into a low-dimensional representation. With the massive growth of high-dimensional data, the use of various dimensionality reduction techniques has become popular in many application areas. Moreover, several modern approaches are continually emerging. Dimensionality reduction techniques transform the original high-dimensional dataset into a new low-dimensional dataset while preserving the original meaning of the data as much as possible. The low-dimensional representation of the original data helps to solve the curse of dimensionality problem, and low-dimensional data can be easily analyzed, processed, and visualized [14]. Several benefits are obtained by applying dimensionality reduction techniques to a dataset: (i) As the number of dimensions comes down, data storage space can be reduced. (ii) Less computation time is needed. (iii) Redundant, irrelevant, and noisy data can be removed. (iv) Data quality can be improved. (v) Some algorithms do not perform well on a large number of dimensions, so reducing the dimensions helps an algorithm to work efficiently and improves accuracy. (vi) It is challenging to visualize data in higher dimensions, so reducing the dimension may allow us to design and examine patterns more clearly. (vii) It simplifies the process of classification and also improves efficiency [15, 16]. Generally, dimensionality reduction is achieved through two different groups of techniques: feature selection and feature extraction. In feature selection, information can be lost because some features must be excluded when the feature subset is chosen. In feature extraction, however, the dimension can be decreased without losing much of the information in the initial feature set [2, 10, 14, 17]. Table I provides a descriptive summary of the methods of dimension reduction.

TABLE I. THE SUMMARY OF DIMENSION REDUCTION TECHNIQUES
Method | Main concept | Pros | Cons
Feature extraction | Summarizes the dataset by creating linear combinations of the features | Preserves the original relative distance between objects; covers latent structure | Not sufficient in the presence of a huge number of irrelevant features
Feature selection | A sublist of relevant features can be selected depending on defined criteria | Strong against irrelevant features | Does not cover latent structure

A. Feature Selection
Feature selection is used to reduce the impact of dimensionality on the dataset by finding the subset of features that efficiently defines the data [18, 19]. It selects the features that are important and relevant to the mining task from the input data and removes redundant and irrelevant features [20, 21]. It is useful for detecting a good subset of features that is appropriate for the given problem [2, 22]. The main purpose of feature selection is to construct a subset of features that is as small as possible but still represents the vital features of the whole input data [11, 23]. Feature selection provides numerous advantages: it reduces the size of the data, decreases the storage needed, improves prediction accuracy, avoids overfitting, and reduces execution and training time with easily understandable variables. The feature selection algorithm is divided into two phases: (i) subset generation and (ii) subset evaluation. In subset generation, a subset is generated from the input dataset, and in subset evaluation, the generated subset is checked for whether it is optimal or not [24, 25]. “Fig. 1” shows the overall process of feature selection.

[Figure: flowchart with the elements Full Feature Set, Subset Generation, Subset Evaluation, Is stopping criteria met? (No / Yes), Selected Feature Set]
Fig. 1. Process of feature selection.

B. Feature Selection Problems
Various problems can benefit from the application of feature selection techniques. High-dimension, low-sample-size data are becoming more common in different fields, and many of the features in these problems do not help to achieve adequate classification. Moreover, the imbalance problem occurs when one of the classes has more samples than the others. Many algorithms neglect the minority samples while concentrating on classifying the majority samples, yet the minority samples are crucial even though they seldom occur. Moreover, in
machine learning, dataset shift is a common problem that occurs when the joint distribution of inputs and outputs differs between the training and test stages. A special case of dataset shift, which occurs when only the input distribution changes, is called covariate shift. Furthermore, dimensionality reduction, and consequently feature selection, is one of the most common techniques for eliminating noisy data. Finally, misclassification costs and test costs are the two most significant kinds of cost in cost-sensitive learning [26-28].

C. Feature Selection Methods
Feature selection aims to select a feature subset from the original set of features based on the features' relevance and redundancy. Originally, evaluation methods in feature selection were divided into four kinds: filter, wrapper, embedded [10, 14, 18], and hybrid [20, 29]. Recently, another type of evaluation method has been developed, namely ensemble feature selection [30, 31]. “Fig. 2” depicts the hierarchy of feature selection techniques.

[Figure: tree diagram. Root: Feature Selection Techniques. Branches: Filter, Embedded, Wrapper, Hybrid, Ensemble. Leaf labels: Pearson correlation, LDA, Forward selection, Backward elimination, Recursive Feature Elimination]
Fig. 2. Hierarchy of feature selection techniques.

Filter is considered the earliest method and is also known as an open-loop method. It examines the features based on their intrinsic characteristics prior to the learning task. It mainly measures feature characteristics using four kinds of measurement criteria: information, dependency, consistency, and distance [32]. In the filter method, the feature selection process is performed independently of the data mining algorithm, and statistical criteria are used to evaluate and rank the subset [17, 24]. Moreover, this technique offers good performance and high computational efficiency, scales easily to high-dimensional datasets, and outperforms the wrapper technique in computational cost. The primary downside of this method is that it neglects the interaction between the selected subset and the performance of the induction algorithm [10, 22, 26].

Wrapper, also called a closed-loop method, wraps the feature selection around the learning algorithm and uses the accuracy or the error rate of the classification process as the criterion for feature evaluation. By minimizing the estimation error of a specific classifier, it chooses the most discriminative subset of features. Because the wrapper method performs feature selection based on the performance of the learning algorithm, it selects the most suitable features for the prediction algorithm and hence achieves better performance and higher accuracy than the filter method [22, 27, 33]. The main disadvantages of this approach are its computational complexity and its greater exposure to overfitting in comparison with the filter approach. Most wrapper methods are multivariate; thus, they need extensive computation time to converge and can be intractable for large datasets [33, 34]. “Fig. 3” shows the steps involved in the wrapper method.

[Figure: block diagram with the elements Set of all features, Generate a subset, Learning algorithm, Performance, Selecting the Best Subset]
Fig. 3. Wrapper method for feature selection.

The embedded method is a built-in feature selection mechanism that embeds feature selection in the learning algorithm and uses its properties to guide feature evaluation. The embedded method is computationally more efficient and more tractable than the wrapper method while retaining similar performance. This is because the embedded method avoids the repeated execution of the classifier and the examination of every feature subset. The embedded method combines the qualities of both filter and wrapper methods. It selects features during the execution of the mining algorithm, and hence it is less computationally expensive [32, 35]. The steps involved in the embedded method are shown in “Fig. 4”.

[Figure: block diagram with the elements Set of all Features, Generate a subset, Learning Algorithm and Performance, Selecting the Best Subset]
Fig. 4. Embedded method for feature selection.

Hybrid and ensemble methods represent the recent developments in feature selection. A hybrid method can be developed by integrating two different methods (e.g., wrapper and filter), two methods with the same criteria, or two feature selection approaches. In the hybrid method, the advantages of both methods are inherited by combining their complementary strengths [36]. The combination of filter and wrapper methods is the most common hybrid method [37]. The ensemble method, in contrast, aims at building a group of feature subsets and then producing an aggregated result out of the group [38]. This method depends on various subsampling techniques, where a particular feature selection method is applied to a variety of subsamples and the obtained features are merged to create a more stable subset. Table II describes the advantages and disadvantages of each method.
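The difference between the filter and wrapper strategies described in this subsection can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure of any reviewed paper: the filter ranks features by absolute Pearson correlation with the class label, the wrapper performs greedy forward selection using the leave-one-out accuracy of a 1-nearest-neighbour classifier, and the function names and toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def filter_rank(X, y):
    """Filter: score every feature on its own, with no classifier involved."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `subset`."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in subset),
        )
        correct += int(y[nearest] == y[i])
    return correct / len(X)


def wrapper_forward(X, y, max_features):
    """Wrapper: greedy forward selection guided by classifier accuracy."""
    selected, best_acc = [], 0.0
    while len(selected) < max_features:
        candidates = [f for f in range(len(X[0])) if f not in selected]
        if not candidates:
            break
        # Subset generation and evaluation: try adding each remaining feature.
        acc, f = max((loo_accuracy(X, y, selected + [f]), f) for f in candidates)
        if acc <= best_acc:  # stopping criterion: no further improvement
            break
        selected.append(f)
        best_acc = acc
    return selected, best_acc


# Hypothetical toy data: feature 0 separates the two classes, 1 and 2 do not.
y = [i % 2 for i in range(20)]
X = [[4.0 * (i % 2) + 0.1 * (i % 5), (7 * i % 11) / 11, (3 * i % 7) / 7]
     for i in range(20)]

ranking = filter_rank(X, y)
selected, accuracy = wrapper_forward(X, y, max_features=2)
print("filter ranking:", ranking)
print("wrapper subset:", selected, "LOO accuracy:", accuracy)
```

The filter never calls the classifier, which is why it is cheap and classifier independent, whereas the wrapper re-evaluates the classifier for every candidate subset, which explains both its higher cost and its sensitivity to the chosen classifier.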


TABLE II. ADVANTAGES AND DISADVANTAGES OF FEATURE SELECTION METHODS
Method | Advantages | Disadvantages
Filter | Works faster than wrapper; scalable; classifier independent; better computational complexity than wrapper; better generalization property | The interaction between classifiers is neglected; the dependency among features is ignored
Wrapper | Interacts with the classifier; considers the dependence among features; higher performance accuracy than filter | More prone to over-fitting; classifier specific; needs expensive computation
Embedded | Interacts with the classifier; better computational complexity than wrapper; higher performance accuracy than filter; less prone to over-fitting than wrapper | Classifier specific
Hybrid | Considers the dependency among features; higher performance accuracy than filter; better computational complexity than wrapper; less prone to over-fitting than wrapper | Classifier specific
Ensemble | Less exposed to over-fitting; more flexible and robust on high-dimensional data | An ensemble of classifiers is not easy to understand

III. FEATURE EXTRACTION
Feature extraction extracts new features from the original dataset, and it is very beneficial when we want to decrease the resources required for processing without losing relevant feature information. Feature extraction can also decrease the number of additional features for a given study. It applies a transformation to the original features to create more significant features. Feature extraction is a process for creating new features that depend on the original input feature set in order to decrease the high dimensionality of the feature vector. The transformation is carried out by an algebraic transformation according to some optimization criteria [39, 40]. Feature extraction is also able to retain essential information when dealing with high-dimensional problems [41, 42]. These dimensionality reduction techniques aim to avoid losing a large amount of information during the feature transformation process by conserving the original relative distance between features and covering the latent structure of the original data [10]. Feature extraction is less exposed to overfitting and achieves good classification accuracy in comparison with feature selection methods. However, the description of the data is occasionally lost after the transformation, and the transformation is expensive for several datasets [43, 44].

Feature extraction algorithms are classified into linear and nonlinear algorithms [39, 43, 45]. However, the ideal feature-extraction-based dimensionality reduction methods are Principal Component Analysis (PCA) [46, 47], Multi-Dimensional Scaling (MDS), Isometric Mapping (ISOMAP), Locally Linear Embedding (LLE), Linear Discriminant Analysis (LDA), Latent Semantic Indexing (LSI), and clustering methods [48, 49]. “Fig. 5” depicts the overall process of the feature extraction method.

[Figure: block diagram with the elements Full Feature Set, Extracting new Features from Existing Features, New Feature Set]
Fig. 5. The process of feature extraction.

Compared with feature selection, feature extraction reduces the number of variables by transforming a considerable number of attributes into a reduced set of features. Feature extraction tries to find a meaningful low-dimensional representation of high-dimensional data. In other words, less information loss and higher discriminating power can be guaranteed using feature extraction rather than feature selection. However, the application of feature extraction in sentiment analysis has been hindered by several difficulties. First, many feature extraction algorithms that are classified as nonlinear methods cannot perform the mapping from a high-dimensional space to a low-dimensional space, thereby prohibiting the training of a practically usable classification model and resulting in a loss of data interpretability. An expert system that automatically determines the sentiment of documents must obtain a low-dimensional representation of new documents during the test phase. Furthermore, unlike most feature selection techniques, feature extraction methods are usually unsupervised; that is, label information is not utilized during the process of dimension reduction. When implementing machine learning algorithms for sentiment analysis, documents with label information must be used to train a predictive model. This same information can also help feature extraction methods create excellent features [45, 50].

In this paper, a comprehensive review is performed of the latest and most efficient methods that researchers have proposed in the past three years to reduce data dimensions in various areas of machine learning and data mining. The details of each method, such as the algorithms/approaches used, datasets, classifiers, and the results obtained, are summarized. Moreover, we performed a scientific analysis of the studied feature selection and feature extraction methods. Furthermore, we highlighted the most widely used approaches and the methods with the best dimension reduction, the greatest decrease in computational time, and the highest achieved accuracy. This paper is organized as follows: the introduction is given in section I, section II presents the dimensionality reduction techniques, a review of these techniques is given in section III, the broad discussion is presented in section V, and section IV contains the conclusion.
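To make the idea of feature extraction by linear combination more concrete, the following Python sketch computes the first principal component of a small dataset and projects the data onto it. It is a minimal illustration under stated assumptions rather than a reference PCA implementation: the leading eigenvector of the covariance matrix is approximated with plain power iteration, and the helper names and toy data are hypothetical.

```python
import math


def center(X):
    """Subtract the per-feature mean from every sample."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]


def covariance(Xc):
    """Sample covariance matrix of already centred data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
            for a in range(d)]


def leading_component(C, iters=200):
    """Power iteration: approximates the eigenvector with the largest eigenvalue."""
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def project(Xc, v):
    """New feature: a linear combination of the original (centred) features."""
    return [sum(r[j] * v[j] for j in range(len(v))) for r in Xc]


# Hypothetical toy data: feature 1 is exactly twice feature 0, feature 2 is
# a small alternating signal, so one extracted feature captures most variance.
X = [[float(i), 2.0 * i, 0.5 * (-1) ** i] for i in range(10)]

Xc = center(X)
pc1 = leading_component(covariance(Xc))
z = project(Xc, pc1)  # 10 samples, now described by a single extracted feature
print("first principal direction:", [round(x, 4) for x in pc1])
```

Unlike feature selection, the extracted feature `z` is not one of the original columns but a weighted mixture of all of them, which is why the interpretation of the data can be harder after the transformation.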


IV. A REVIEW ON DIMENSIONALITY REDUCTION METHODS
A. Feature Selection Methods
Dimensionality reduction uses feature selection methods to select relevant features. In this study, we discuss several recent works on feature selection. The literature on feature selection methods is summarized in Table III.

Churmonge and Jena [51] proposed a method to address the dimensionality issue based on clustering combined with correlation filter subset selection. The relevant features were found by the K-means clustering algorithm, and redundant features were found within clusters and removed using the correlation measure. The method was applied to 8 text and 4 microarray datasets, and the Naive Bayes (NB) classifier was used for classification. Furthermore, the authors compared the performance of their method with the ReliefF and information gain (IG) feature selection methods in terms of accuracy and computational time. The proposed method outperformed both methods in accuracy on all datasets except two, and it was faster than the other methods on all datasets.

Tan et al. [52] presented a feature selection method based on the evolutionary algorithm (EA) to reduce the dimensionality of motor imagery brain-computer interface data from electroencephalogram (EEG) signals. A subset of important features was generated in each iteration of the EA, while the redundant and insignificant features were eliminated. The experiments were performed on two kinds of data: an EEG dataset and several machine learning datasets. Three classifiers were used: support vector machine (SVM), K-Nearest Neighbor (KNN), and discriminant analysis (DA). The performance of the proposed EA feature selection method was compared with PCA, independent component analysis (ICA), neighborhood component analysis (NCA), and variable-length particle swarm optimization (VLPSO). The results showed that the introduced method outperformed all of the above methods and could achieve high accuracy even with a small subset of the features.

Hafiz et al. [53] investigated the feature selection issues in power quality events and proposed a two-dimensional PSO feature selection method. They relied on the two-dimensional approach to efficiently guide the search of the particle swarm. The effect of Gaussian noise on the reduced subset was also studied. The induction algorithms used in this study were KNN and Naïve Bayes. Moreover, the performance of the proposed method was compared with the Genetic Algorithm (GA), Ant Colony Optimization (ACO), Binary PSO (BPSO), Catfish BPSO, and Chaotic BPSO (CHBPSO). The results showed that the presented method could find an important and robust feature subset and achieve better accuracy than the above-mentioned methods.

Han et al. [54] addressed the limitations of the locally linear embedding (LLE) method to propose an unsupervised feature selection mechanism. They relied on low-dimensional space learning and graph matrix learning. The experiments were performed on 15 datasets (6 of them from the UCI machine learning repository and 9 from the feature selection dataset website). SVM and decision tree (DT) were used as classifiers. In addition, the presented method was compared with eight unsupervised feature selection methods. The results demonstrated that the proposed method achieved better accuracy with the SVM and DT classifiers on all but two datasets and better stability with both classifiers compared with the eight feature selection methods.

Niu et al. [55] presented a method to deal with the inherent nonlinearity of multivariate financial time series in order to improve forecasting accuracy and support better financial decisions. The proposed method involved a feature selection part, a deep learning framework, and an error correlation part. In the feature selection part, the RReliefF algorithm (an enhanced version of ReliefF) was combined with a wrapper-based method to remove redundant features. The deep learning part consisted of long short-term memory (LSTM), gated recurrent unit (GRU), and an optimizer based on adaptive moment estimation (Adam). The deep learning part was trained on the subset generated by the first part. Furthermore, the error correlation was used to enhance the accuracy of the method. The performance of the method was validated on 16 benchmarks and three datasets, and the results showed its superiority.

Jain and Singh [56] proposed a hybrid feature selection method that consisted of the ReliefF and PCA algorithms. First, the weight of each feature in the datasets was calculated, and a set of satisfactory features was generated by the first algorithm. The second algorithm was then applied to the generated set. Two types of datasets were considered (text and microarray), and the experiments were performed on ten datasets. The performance of the method was evaluated in terms of the number of selected features and computation time. The results indicated that the presented method could achieve better performance on low- and high-dimensional datasets and removed half of the irrelevant and redundant features.

Hosseini and Moattar [57] presented a feature subset selection method for the classification of high-dimensional imbalanced data. The authors focused on the feature space, and the method was based on interaction information to improve the search process. In each iteration of the method, multiple feature subsets were generated, and the best subset was carried over to the next iteration. In more detail, the candidate features were first selected by the Symmetric Uncertainty Algorithm (SUA). Then multivariate interaction information was used to test the candidate feature subset, and the best subset of features was selected based on the dominance relationship. KNN, Naïve Bayes, and CART were used as classifiers. The efficiency of the proposed method was assessed on 13 datasets from different repositories. The method outperformed 10 other feature selection methods in terms of accuracy and the number of reduced features.

Manbari et al. [58] proposed a hybrid unsupervised feature selection method based on clustering and the binary ant system. The method was executed in two stages. In the first,
the features were clustered, and in the second stage the best feature of each cluster was determined through the iterations of the ant process. The second stage was repeated several times until the dominant features were collected. The presented method was compared with seven other unsupervised feature selection methods on eight datasets from UCI and the Pablo de Olavide research group. The comparison was assessed using four classifiers (SVM, KNN, DT, and RF). The results demonstrated that the method performed better than the other methods and significantly reduced computation time.

Qu et al. [59] proposed a feature selection method for predicting colorectal cancer based on operational taxonomic units. Three feature selection methods were integrated into the proposed method. First, a subset of the most significant operational taxonomic units was generated by multiple dimension-reduction methods. Then, to reduce the dimensionality and increase efficiency, correlation-based feature selection (CFS) and maximum relevance–maximum distance (MRMD) were used as a combined method. Finally, the best features were selected according to the taxonomy file. The experiment was performed on two datasets, and three classifiers (RF, Naïve Bayes, and DT) were used for evaluation. The results showed that the correlation-based feature selection method achieved better reduction, while MRMD required more time and memory for computation. Among the classifiers used, RF achieved the best performance.

Umbarkar and Shukla [60] presented a method to improve the performance of the intrusion detection system. They used the IG, gain ratio (GR), and CFS algorithms to reduce the dimensionality of the dataset. To obtain the best reduced set of features, they divided the original set into several parts, applied each algorithm to those parts, and selected the most accurate one. The experiment was performed on the KDD-Cup 99 dataset, with DT as the classifier. The results indicated that the correlation-based feature selection method achieved better performance than the other algorithms used.

Duong and Hoang [62] presented a method based on the Histogram of Oriented Gradients (HOG) descriptor and a feature selection method to classify rice quality. They extracted HOG features from the rice image, and the score of each feature was calculated by the Fisher score feature selection method. The VNRICE dataset was used for the experiments, and the NN classifier was utilized. The results showed that the Fisher score method improved the classification accuracy by 42% and reduced the computation time.

Alharan et al. [63] proposed a method based on feature extraction and feature selection for texture image classification. First, a set of features was extracted from the datasets using three approaches: Gray Level Co-occurrence Matrix (GLCM), Local Binary Pattern (LBP), and Gabor filter. In the second stage, the extracted features were evaluated with five techniques (info gain, gain ratio, OneR, ReliefF, and symmetric). Then, based on this assessment, feature selection was accomplished with the K-means clustering algorithm. The experiment was carried out on three datasets, and SVM, NB, and KNN classifiers were used for classification. The results showed that NB and KNN achieved better accuracy on the first dataset, while SVM attained superior accuracy on the second and third datasets.

Osman et al. [64] proposed a model that combines color-based feature extraction and feature selection methods for identifying origin automatically. The method was performed in three stages. In the first, skin color information was extracted from human faces using a skin color detection technique. Next, the wrapper subset evaluator and the GA method were used to eliminate redundant and irrelevant features. The authors used 1550 face images from different regions, and six classifiers (NB, Bayes Net, KNN, SVM, RF, and Multilayer Perceptron (MLP)) were used for classification. The results showed that the accuracy of the individual color features was lower than that of the combined color features. Also, the accuracy of SVM, NB, and Bayes Net was very low, and hence they could not be used for the proposed method.
Farokhmanesh and Sadeghi [61] proposed a feature
selection method based on sparse feature selection and deep Arshak and Eesa [65] proposed a feature selection method
neural networks. Initially, Correntropy-induced, Discriminative for dimensionality reduction based on the cuttlefish algorithm
Least Squares Regression (DLSR), and Sparse Group Lasso (CFA) for gene classification. The cuttlefish was used to
(SGL) three methods of sparse were evaluated and compared. generate a subset of the optimal features. Also, the KNN was
Next, the SGL was integrated with a deep neural network, and used as a classifier for the evaluation and classification of the
the performance of this combination also assessed. Meanwhile, proposed method. The experiment was performed on eight
the K-means algorithm was used in the SGL method in order to different datasets from ELVIRA biomedical dataset repository.
group features. The nearest neighbor (NN) algorithm was used The performance of the proposed method was compared with
as a classifier in the performance evaluation of the used SVM and DT and the hidden Markov model in terms of
techniques. The experiment and valuation process of three accuracy and computational time. The result demonstrated that
methods of the sparse and combination method was performed the presented achieved better performance in five datasets
on the MNIST dataset. The results illustrated that the SGL compared to the other methods.
combined with a deep neural network achieved better accuracy.
Zeebaree et al.[66, 67] proposed a feature selection method
based on the Convolutional neural network (CNN) for

61
Zebari et al. / Journal of Applied Science and Technology Trends Vol. 01, No. 02, pp. 56 –70, (2020)

classifying and identifying the cancer type in microarray cancer data. In the presented method, the cancer data was transformed into an array after the data files were opened. Next, the cancer data was organized as matrix vectors, and then the CNN was applied for classification. The experiment was carried out on ten cancer datasets, and the performance of the method was compared with the mSVM-RFE-iRF and varSelRF methods. The results indicated that the proposed CNN method achieved better classification accuracy than the other methods. It also outperformed the other methods in terms of reducing the number of cancer genes.

Balasaraswathi [68] introduced a feature selection method for intrusion detection systems based on CFA and membrane computing (MC). In the proposed method, membrane computing was integrated with the cuttlefish algorithm to enhance the feature selection process. Two intrusion detection datasets were used for the experiments. The performance of the cuttlefish algorithm with and without MC was illustrated. Furthermore, the proposed method was compared with a number of other feature selection methods. The results showed that the accuracy and computation time of CFA combined with MC were better than those of all other methods.

Kaur and Singh [69] proposed a method for image steganalysis based on feature selection and PSO. First, the predominant features were selected by mutual information. Moreover, for selecting dominant features, adaptive PSO was used by the authors. For the experiment, 10000 stego images were taken from the BOSSbase dataset. The classification accuracy of the method was evaluated with SVM, KNN, and DT classifiers and compared with three other PSO-based methods. The results revealed that the proposed method outperformed the other methods.

Fatima et al. [70] proposed an optimized feature selection method for detecting malware on the Android platform. They used an evolutionary GA to reduce the feature dimensionality to 50% of the original dataset and then trained the classifier to detect malware features. They used two APK sets (malware and goodware) and two classifiers (SVM and neural network). The experimental results demonstrated that the SVM accuracy was 96.6% and the neural network accuracy was 95.2%. The authors concluded that the performance of the presented method could be enhanced by utilizing larger datasets.

Widiyanti and Endah [71] studied music emotion recognition based on feature selection algorithms. First, several features were extracted from the datasets. After that, three feature selection algorithms, namely Sequential Forward Selection (SFS), Sequential Backward Selection (SBS), and ReliefF, were used to identify emotional features. Several emotion classes were used, such as sad, angry, and happy. The experiment was performed on a songs dataset, and an SVM classifier was used to compare the performance of the algorithms. The results showed that the accuracy of the ReliefF algorithm was lower than that of the other algorithms, which obtained similar accuracy.
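
Sequential Forward Selection, one of the algorithms used in [71], can be illustrated with a short sketch. The Python code below is a minimal illustration under stated assumptions, not the implementation from [71]: the scoring function (leave-one-out accuracy of a 1-nearest-neighbour classifier), the function names, and the toy data are hypothetical choices made for this example.

```python
def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices (assumed scoring choice)."""
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = None, None
        for j in range(len(X)):
            if i == j:
                continue
            # Squared Euclidean distance over the candidate features only.
            dist = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, k, score=loo_1nn_accuracy):
    """Greedily add the feature that gives the best score until k are chosen."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best_feature = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected


# Toy data (hypothetical): feature 0 separates the classes,
# features 1 and 2 are unrelated to the label.
X = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 9.0],
    [0.3, 7.0, 4.0],
    [0.9, 5.1, 1.1],
    [1.0, 1.1, 9.1],
    [1.1, 7.1, 4.1],
]
y = [0, 0, 0, 1, 1, 1]
print(sequential_forward_selection(X, y, k=1))  # prints [0]
```

Sequential Backward Selection follows the same greedy idea in reverse: it starts from the full feature set and repeatedly removes the feature whose removal hurts the score least.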

TABLE III. FEATURE SELECTION METHODS SUMMARIZATION

Ref. | Year | Dataset | Technique(s) | Computation Time | Classifier(s) | Accuracy
[51] | 2018 | Text and Microarray | Correlation and K-means | 0.5 to 10.24 seconds | NB | The best 99.0.2% and the worst 68.02%
[53] | 2018 | Power quality | PSO | 20.9 seconds | KNN and NB | KNN: 99.68% and NB: 99.44%
[56] | 2018 | Text and Microarray | ReliefF and PCA | 1 to 29 seconds | - | -
[60] | 2018 | KDD-Cup 99 | IG, GR and correlation | - | DT | Correlation: 92.65%, IG: 92.33% and GR: 92.54%
[64] | 2018 | Human images | wrapper subset and GA | - | NB, KNN, SVM, RF and MLP | NB: 57%, SVM: 60%, KNN: 64%, RF: 75% and MLP: 71%
[65] | 2018 | ELVIRA | CFA | 0.049 to 2.11 seconds | KNN | 100%
[66] | 2018 | Different cancer datasets | CNN | - | - | 100%
[68] | 2018 | KDDCUP’99 | CFA and MC | 0.15 by CFA and 0.11 by CFAMC | j48 | 96.66%
[71] | 2018 | Song dataset | SFS, SBS and ReliefF | - | SVM | 43%
[57] | 2019 | UCI, KEEL and GitHub | SUA and interaction information | 126.57 seconds | KNN, NB and CART | 100% for KNN, NB and CART
[58] | 2019 | UCI | K-means and binary Ant | 9.939 seconds | SVM, KNN, DT and RF | -
[59] | 2019 | [72] and [73] | CFS and MRMD | - | RF, NB, and DT | RF achieved better accuracy
[61] | 2019 | MNIST | DLSR, SGL and Deep Learning | - | NN | 96.77%
[62] | 2019 | VNRICE | HOG and fisher score | - | NN | 93.34%
[63] | 2019 | Kelberg and Brodatz | IG, ReliefF, symmetric and K-means | - | SVM, NB and KNN | 99.95% by KNN and NB, and 99% by SVM
[70] | 2019 | APKs | GA | 3.76 to 8.57 seconds | SVM and neural network | SVM: 96.6% and neural network: 95.2%
[52] | 2020 | EEG | EA | - | SVM, KNN and DA | SVM: 99.25%, KNN: 99.13% and DA: 100%
[54] | 2020 | UGI | LLE | - | SVM and DT | SVM: 97.71% and DT: 97.20%
[55] | 2020 | SZCI and DJIA | RReliefF and LSTM | - | - | 98.62%
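
Information gain (IG) and gain ratio (GR), which appear as ranking criteria in [60] and [63], score a feature by how much knowing its value reduces the entropy of the class label. The following Python sketch shows the calculation for discrete feature values. It is an illustration only: the function names and the toy labels are assumptions for this example and are not taken from the reviewed papers.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature_values, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature_values, labels) / split_info


# Hypothetical toy data: one informative feature and one uninformative feature.
labels = ["attack", "attack", "normal", "normal"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```

Features are then ranked by their score, and the top-ranked ones are kept. Continuous features would first need to be discretised, which this sketch does not cover.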

B. Feature Extraction Methods
The literature reviewed above used feature selection methods to select the relevant features for dimensionality reduction. The remainder of this review covers the latest literature related to feature extraction and dimensionality reduction techniques. Table IV summarizes the recent literature.

Moghaddam et al. [41] proposed a method known as spectral segmentation and integration (SSI) as supervised feature extraction for hyperspectral images. The method divided the pixels' spectral signature curve into channels. Then a weighted mean operator was used to integrate the bands of each channel in order to extract new features that are far fewer than the original bands. Moreover, the PSO algorithm was used to merge the pixel segments of the spectral signature curve so as to reduce the dimensionality of the image and to increase the classification accuracy. SVM was used as the classifier, and two datasets were used. The experimental results confirmed that the SSI method outperformed other feature extraction methods such as PCA, SRS, NWFE, DAFE, SELD, BCC, and CBFE.

Berbar [74] worked on malignant masses in mammograms based on feature extraction. The researcher presented three hybrid methods for Gray Level Co-occurrence Matrix (GLCM) texture feature extraction, called Wavelet CT1, Wavelet CT2, and ST-GLCM. The region of interest of the image was divided into sub-images, and a contrast stretching stage was applied prior to feature extraction. The feature extraction methods were then applied to the sub-images. Next, seven texture features were extracted by the GLCM and merged with seven statistical features. Two image datasets and an SVM classifier were used in this research. The proposed methods outperformed the multi-resolution feature extraction methods in terms of the number of extracted features. Also, in terms of the Area under the Curve (AUC) measure, the researcher's methods were superior to other feature extraction methods.

Rahman et al. [75] worked on the emotion recognition task. They used PCA and the t-statistic to reduce the dimensionality of features extracted from emotional EEG signals. The proposed method was applied to the SJTU emotion EEG dataset. The emotional states were classified from the extracted features by four classifiers: SVM, ANN, LDA, and KNN. The results indicated that the proposed method outperformed several other emotion recognition methods.

Chu et al. [76] proposed a system for extracting features from machinery fault vibration signals. They used three feature extraction methods: Fourier transform frequency spectrum (FTFS), envelope analysis, and local mean decomposition (LMD). The vibration signals were analyzed by Fourier transform analysis to obtain amplitude and phase spectra. The results have given that the envelope analysis and mean decomposition methods could extract between cancerous and non-cancerous tumors in the breast. The authors used a new threshold to improve LBP texture features and the LBP descriptor for identifying the abnormal cases. In the proposed method, the features were extracted using CNN, and SVM was used for classification. The experimental results showed that the developed method could classify the ultrasound images with high accuracy and sensitivity.

Li et al. [77] worked on fault diagnosis and used a discriminative graph regularized autoencoder (DGAE) to design a feature extraction method. They used an advanced neural network structure to map process data to the feature space, to avoid the problems of manually designing features, and to ensure that the learned features truly reflect the data characteristics. Furthermore, the neural network model was integrated with a graph to learn the internal representation and to preserve locality. Also, to improve classification performance, the label information of the training samples was embedded in the graph. NN was used as the classifier. Compared with other fault diagnosis feature extraction methods, the proposed method achieved better performance.

Nagarajan et al. [78] used Empirical Mode Decomposition (EMD) to propose two feature extraction methods for mammogram images. The region of interest of the images was divided into a group of different frequency components. Their first method was based on Bi-dimensional Empirical Mode Decomposition (BEMD). From these groups, the GLCM and gray level run length matrix features were extracted. However, the features extracted by the first, BEMD-based method were less orthogonal to each other. Therefore, the researchers proposed a second feature extraction method, a modified version of the first one called MBEMD. SVM and LDA classifiers were used in this research. Furthermore, the proposed method was applied to different databases and obtained steady performance.
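
Several of the methods above ([74], [78]) extract texture features from a Gray Level Co-occurrence Matrix (GLCM). As a rough illustration of the idea, the Python sketch below counts how often a pixel with gray level i has a neighbour with gray level j at a fixed offset, normalises the counts, and derives two common texture features. The function names, the 4-level toy image, and the choice of offset are assumptions made for this example and do not reproduce any reviewed implementation.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised gray-level co-occurrence matrix for pixel pairs
    separated by the offset (dx, dy). Pairs are counted in one direction only."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Weights each co-occurrence by the squared gray-level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


def energy(p):
    """Sum of squared entries; higher for more uniform textures."""
    return sum(value ** 2 for row in p for value in row)


# Hypothetical 4x4 image quantised to 4 gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)  # horizontal neighbour, offset (1, 0)
print(contrast(p), energy(p))
```

In practice the matrix is usually computed for several offsets and angles, and the resulting features are concatenated into one feature vector before classification.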


Rabin et al. [79] used a short-time Fourier transform (STFT) to extract features from electromyography (EMG) signals of human hand movements. Because the space of the extracted features was large, they used PCA and diffusion maps (DM) to reduce its dimensionality. They also compared the performance of both methods with different training set sizes. The researchers used KNN as the classifier in this study. The results indicated that the DM technique outperformed the PCA method when the training set was limited. However, with a large training set, both methods achieved high classification performance.

Kuncan et al. [80] worked on bearing fault diagnosis for rotary machines. They extracted features from bearing vibration signals by applying their proposed method, called the one-dimensional ternary pattern (1D-TP). The signals were collected from three datasets that differ in size, speed, and parts. Moreover, Random Forest (RF), KNN, SVM, Bayes Net, and ANN were used as classifiers with the ternary pattern. The results indicated that the proposed method could extract effective features from vibration signals for classification. Also, all the classifiers used attained effective accuracy.

Liu et al. [81] built a feature extraction method that incorporates discriminant analysis and the low-rank representation of the original data samples. The proposed supervised feature extraction method is called discriminative low rank preserving projection (DLRPP). Its performance was compared with seven other feature extraction methods (LPP, LSDA, DPSR, LRRDP, LSPP, LRPP and FOLP) on images from six different datasets. The DLRPP method achieved better performance in terms of recognition rate.

Ma and Yuan [82] proposed a method for extracting features from images based on deep CNN and PCA. They used a neural network to extract features. Due to the high dimension of the extracted features, they improved and optimized the PCA algorithm by deep learning through simulation experiments. Then the researchers compared the performance of PCA before and after the improvement. Memory usage was more than 6000 MB before optimization and decreased to less than 1000 MB afterwards. There was also a big difference in the time consumed by the PCA algorithm before and after optimization. Not only was the improved PCA effective, but the accuracy of the SVM classifier was also enhanced.

Sellami and Farah [83] presented a combination of feature extraction and band selection methods to reduce the dimensionality of hyperspectral images. They used several linear and nonlinear feature extraction methods, such as PCA, TLPP, KPCA and LE. The researchers also used the supervised and unsupervised band selection methods MI, DM, CBS, and PA. The SVM algorithm was used as the classifier. They combined both groups of algorithms as follows: TLPP/CBS, PCA/CBS, KPCA/BS, PCA/MI, LE/CBS, TLPP/MI, LE/MI, and KPCA/MI. The authors used two datasets (the first dataset’s images were from Indian Pines and the second dataset’s images were from Pavia University). The results revealed that the classification accuracy of TLPP/CBS was better than that of all other combinations on both datasets.

Chen et al. [84] proposed a study on X-ray images for determining bone age. They used a deep neural network to extract features from the X-ray images. The extracted features include Gray Level Co-occurrence Matrix (GLCM) features and Local Binary Patterns (LBP) features of the image. In more detail, the features of the X-ray hand bone image were extracted automatically using deep learning, and the bone age was assessed automatically by the convolutional neural network. Moreover, the PCA algorithm was used to reduce the dimensionality of the extracted features, which were then classified by the SVM classifier. The test, training, and validation data were built from images captured from several males and females of different ages. The results showed that the presented study achieved better performance than other methods in this field.

Jin et al. [85] proposed a feature extraction algorithm for JPEG steganalysis based on the adaptive scale adjustment algorithm. In this algorithm, the scale of feature extraction was adjusted adaptively according to the quality of the JPEG images. They mainly relied on the Boss Base 1.01 database and applied their algorithm to the MD-CFR feature. The results showed that the proposed method improved steganalysis performance. Also, the presented method reduced the dimensionality of the extracted features, and hence it could be used in other steganalysis methods based on residual images.

Liu et al. [67] presented a feature extraction method that relies on a graph-based space to construct an optimal algorithm for semi-supervised learning. In particular, the presented method combined sparse representation, discriminative projection, and manifold learning for dimensionality reduction. They designed their method to obtain semi-supervised feature extraction and the sparse structure of local manifolds at the same time. Moreover, the optimal value was reached by modifying the similarity matrix in each iteration. The experiment was executed on five datasets, and the performance of the presented method was compared with six other methods (PCA, MSEC, DLSR, NLDLSR, SOGFS, and SDR). The presented method outperformed all other methods on all datasets.

Lin et al. [68] worked on discriminative graph signals to propose a feature extraction method that could extract good features for the desired classification. A graph containing all the training samples was established. Moreover, they used eigenvector decomposition to obtain the Fourier basis of the graph. Numerous discriminative signals were extracted concurrently to achieve high accuracy, especially in multi-class problems. The proposed method was evaluated in four different experiments with several datasets. The results indicated that the presented method could achieve encouraging performance, and it was considered to be more effective with supervised classification.

Kasongo and Sun [69] proposed a method for wireless intrusion detection systems based on the feed-forward Deep
Neural Network (DNN). They used a wrapper feature extraction unit with the DNN framework in order to extract the optimal feature vector. Then the extra trees algorithm was used as a classifier. Moreover, two intrusion detection datasets (UNSW-NB15 and AWID) were used to examine the efficiency of their method. The experiment was performed for two types of attack classification, binary and multiclass. Furthermore, the researchers compared their method with RF, KNN, SVM, DT, and NB. The detection accuracy of the proposed method outperformed the other methods, and attack classification by the feed-forward deep neural network was better than with all the above classifiers.

Liu and Sui [70] worked on different methods to minimize the dimensionality in content-based public cultural video retrieval. The content features of the public cultural videos were extracted using a combination of the deep learning framework Caffe and the AlexNet network model. Due to the high dimension of the extracted features, the researchers used PCA to reduce that dimensionality. The researchers tested their work on several videos. The results indicated that the video contents of the datasets could be compressed effectively by the PCA algorithm, and only minor contents of the video retrieval were lost while lowering the dimensionality of the extracted features.

Dehzangi and Sahu [71] worked on human activity recognition. They used spectral and temporal analysis to extract features from Inertial Measurement Unit data. Moreover, several feature extraction methods were evaluated, particularly methods based on the time and frequency domains such as power spectral density and autocorrelation. Also, a number of classifiers (DT, KNN, SVM, neural network, and Ensemble bagged) were used for human activity recognition in the proposed system. In addition, to reduce the dimensionality of the extracted features, the researchers used PCA and KPCA, as well as feature selection and transformation methods. The results showed that using the random forest feature selection method with the Ensemble bagged classifier, and using neighborhood component analysis with the Ensemble bagged classifier, achieved better accuracy.

Zhang et al. [86] proposed a system to minimize the dimensionality of hyperspectral images. They relied on an integrated sparse graph and spatial method. They used PCA and entropy rate to divide hyperspectral images into superpixel patches. Moreover, the training data of the graph was constructed using superpixel segmentation. Then, they extracted spatial-spectral information when the sparse and low-rank graphs were generated on the obtained data. After that, they used the kernel trick to transform the graph embedding to a nonlinear space and map the input data into a high-dimensional space. The proposed method was evaluated on two datasets (the Indian Pines data set and the University of Pavia (Pavia-U) data set), and the SVM algorithm was used as the classifier. The results showed that the accuracy of the presented method was higher than that of other methods.

Alipourfard et al. [87] worked on the high dimensionality of hyperspectral images and proposed a system to reduce it. The proposed system was a combination of CNN and the subspace feature extraction method. The authors reduced the dimensionality of the hyperspectral images with the subspace method in order to generate high-quality training samples for the convolutional neural network and for the logistic regression that they used as the classifier. Moreover, the researchers tested the presented method on two well-known datasets (the Indian Pines and Pavia University scenes). The experimental results showed that the proposed method improved accuracy and achieved higher scores, even with a limited number of training samples.
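
Many of the reviewed feature extraction pipelines apply PCA to high-dimensional extracted features. The Python sketch below illustrates the core of PCA in a dependency-free way: it centers the data, builds the covariance matrix, and finds the first principal component by power iteration. This is a simplified illustration under stated assumptions (only one component, a fixed number of iterations, data with non-zero variance, and a start vector that is not orthogonal to the dominant direction); the function names and toy data are hypothetical and do not correspond to any reviewed paper's code.

```python
def center(X):
    """Subtract the per-feature mean from every row."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]


def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(row[i] * row[j] for row in Xc) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_component(X, iterations=200):
    """Return the dominant eigenvector of the covariance matrix
    and the 1-D projection of each sample onto it."""
    Xc = center(X)
    C = covariance(Xc)
    d = len(C)
    v = [1.0] * d  # assumed start vector; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5  # assumes the data has some variance
        v = [x / norm for x in w]
    scores = [sum(row[j] * v[j] for j in range(d)) for row in Xc]
    return v, scores


# Hypothetical 2-D samples lying along the direction (2, 1).
X = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0], [8.0, 4.0]]
component, scores = first_principal_component(X)
print(component)
print(scores)
```

Here two correlated features are replaced by a single score per sample. A full PCA would extract further components by deflation or an eigendecomposition, and KPCA would apply the same idea to a kernel matrix instead of the covariance matrix.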

TABLE IV. FEATURE EXTRACTION METHODS SUMMARY

Ref. | Year | Dataset | Technique(s) | Computation Time | Classifier(s) | Accuracy
[74] | 2018 | DDSM and MIAS | Wavelet and GLCM | - | SVM | 97.89%
[76] | 2018 | - | FTFS and LMD | - | - | -
[83] | 2018 | Indian Pines and Pavia University | PCA, TLPP, KPCA, and LE | PCA: 10, TLPP: 12, KPCA: 12 and LE: 28 | SVM | 96.96%
[88] | 2018 | public cultural | deep learning, Alex net, and PCA | - | - | -
[89] | 2018 | UCI | Spectral, temporal analysis and PCA | - | DT, KNN, SVM, Neural network and Ensemble bagged | KNN: 95.4%, DT: 90.9%, SVM: 93.2%, Neural Network: 90.6% and Ensemble bagged: 96.9%
[90] | 2018 | Indian Pines and Pavia | sparse graph, spatial and PCA | - | SVM | 93.01%
[91] | 2018 | Indian Pines and the Pavia | CNN and subspace feature | - | logistic regression | 98.3%
[92] | 2019 | US breast datasets | LBP and CNN | - | SVM | 96%
[77] | 2019 | TE process | DGAE and neural network | 2.47 seconds | NN | 83.95%
[78] | 2019 | MIAS, DDSM and MGM Hospital | EMD | - | SVM and LDA | SVM: 96.2% and LDA: 92.59%
[80] | 2019 | Artificial Fault Type | 1D-TP | 0.352 second | RF, KNN, SVM, Bayes Net and ANN | 100% for all classifiers
[81] | 2019 | public image databases | DLRPP | - | NN | -
[82] | 2019 | - | CNN and PCA | 100 to 1300 seconds | SVM | 91.3%
[79] | 2020 | UCI | STFT, PCA and DM | - | KNN | 94.8%
[85] | 2020 | Boss Base 1.01 | adaptive scale adjustment | - | - | -
[93] | 2020 | [94], [95], [96], and [97] | sparse representation and discriminative projection | - | - | -
[86] | 2020 | UCI | eigenvector decomposition | - | NN, LDA, NB and SVM | 97.5%
[87] | 2020 | UNSW-NB15 and AWID | wrapper feature extraction and DNN | - | Extra trees | 99.77%
[41] | 2020 | ROSIS and AVIRIS | SSI and PSO | - | SVM | 84.83%
[75] | 2020 | SJTU | PCA and t-statistical | - | SVM, ANN, LDA and KNN | ANN: 86.57%, SVM: 85.85%, LDA: 82.50% and KNN: 73.42%

V. DISCUSSION
In the reviewed feature selection methods, different techniques and algorithms were used to reduce the dimensionality of the dataset, minimize computation time, and improve classification accuracy. In the literature and Table III, three methods ([51, 58, 63]) relied on clustering with K-means. The authors in [51] used K-means to remove non-relevant features, in [58] the similarity value was used to separate the features into multiple clusters, and in [63] the algorithm was used to divide the features into the most relevant and noisy clusters. These three methods were applied to high-dimensional data such as text, microarray, and texture image classification datasets. In [51] the computational time was reduced by more than 50% compared with the ReliefF and IG methods. Also, in [58], the authors compared the performance of their method with three other ant colony-based feature selection methods and showed that the computation time was 5-8 times better than that of the other methods.

Also, about 50% of the reviewed feature selection methods were based on optimization approaches. These methods used several optimization algorithms, such as PSO in [53] and [69], EA in [52], Ant in [58], GA in [64] and [70], CFA in [65] and [68], and deep learning in [61] and [66]. In [53] and [69] the PSO algorithm was used to incorporate the feature information into the search space and hence to select the most desired features and remove those not required. In [52] EA was used to reduce the dimensionality of the search space by eliminating unnecessary features in each iteration, while the influential features were selected at the same time. In [58], after the features were clustered, the binary Ant was used to rank them within each cluster. Then, from each iteration of the Ant process, the desired set of important features was constructed. The GA was used due to its capability for reducing dataset dimensionality. According to [64] the average reduction was 93%, and in [70] the feature dataset was reduced by 50%. Also, the CFA played a great role in reducing dimensionality and computational time: in [65] the dataset was reduced by up to 93%, and in [68] the time was lower than that of the other optimization algorithms with which the authors compared it. The deep neural network was used in order to improve classification accuracy. In the literature, two methods ([61] and [66]) were based on neural networks, and their accuracy was enhanced by about 19% and 8.75%, respectively.

The traditional feature selection methods were also used in different studies: ReliefF was utilized in [55], [56], and [63], and Information Gain was used in [60] and [63]. The main task of ReliefF in these methods was to select relevant features, and in most studies it was used together with other techniques. For example, in [55] it was used with LSTM, in [56] the PCA algorithm was used with it, and in [63] IG, symmetric and K-means were used with it. Similarly, the IG algorithm was used to reduce the dimensionality of the datasets, together with other approaches such as GR, correlation-based feature selection, and symmetric techniques.

In general, the accuracy of the reviewed methods varies from one approach to another. However, the optimization-based feature selection methods achieved better performance than the traditional methods. The most used classifiers are SVM and KNN, each used in 8 methods, while NB was used in 6 studies. In some articles, more than one classifier was used. Nevertheless, SVM obtained better accuracy than the others in those studies that used multiple classifiers. There is also variety in the computational time, and clustering and optimization algorithms played a significant role in reducing the computational complexity. In [51] the execution time was about 50% to 70% better compared with the IG and ReliefF techniques. Moreover, in [65] and [70], execution time was reduced by 40% and 50% by the CFA and GA, respectively. However, the best computational time was achieved when the Ant system was integrated with the clustering algorithm in [58], where it was 5-8 times better than the other three ACO algorithms.

On the other hand, 21 feature extraction methods have been reviewed in this study and summarized in Table IV. Among them, 7 methods ([92], [77], [82], [84], [87], [88], and [91]) relied on deep learning (CNN and DNN). CNN and LBP were used in [92] and achieved high accuracy. In [77] the neural network was integrated with a graph autoencoder; in particular, the neural network was used to map process data to the feature space. In [82] the PCA algorithm was optimized by CNN to efficiently reduce dimensionality in big
data. As a result, about 70% of the features were removed, execution time and memory usage decreased, and hence accuracy increased by 3%. In [84], accuracy improved by around 7% when CNN and PCA were used together. In [87] a DNN framework was used with a wrapper feature extraction algorithm; the dimensionality of the datasets used was reduced efficiently, and accuracy increased by more than 6%. In [88] the dimensionality of public culture videos was reduced by a deep learning framework and the PCA algorithm. In [91], classification accuracy was enhanced by about 4% by integrating CNN with feature subspace reduction.

Moreover, PCA was used in 8 of the feature extraction methods in the literature ([70], [79], [82], [83], [84], [88], [89] and [90]). In [55] the role of PCA was to reduce redundant information rather than to extract features. In [79], [83], [84], [88] and [89], PCA reduced the dimensionality of the extracted features, while in [90] PCA was used to extract the initial component when converting HSI images into superpixel patches. In [41] PSO was used to minimize dimensionality and enhance classification accuracy. The wavelet transform was used to decrease the number of extracted features in [74]. The study [76] showed that, under strong noise, only FTFS succeeded in extracting the features when compared with envelopment analysis and LMD. Further, SVM was the most used classifier among the feature extraction methods summarized in this paper; it was used in 12 methods.

Among the reviewed methods, and among those that depended on PCA, the best accuracy was achieved in [82], because the authors optimized the algorithm with deep learning. The CNN-enhanced PCA in [82] also greatly reduced computational time, from 1300 without CNN to 100 when PCA was improved by the CNN. In [41], however, the computational complexity of SSI combined with PSO was higher than that of spectral region splitting (SRS).

VI. CONCLUSION

The high dimensionality of data has a direct impact on the learning algorithm, computational time, computer resources (memory), and model accuracy. Therefore, reducing dimensionality and tackling its curse has become an exciting topic in research and development, with the aim of providing the most reliable, flexible, and highly accurate computerized tools and applications. Hence, several methods and techniques based on feature selection and feature extraction have been developed in the last two decades.

This paper reviews the most recent studies in several fields, such as medical disease analysis, ethnicity identification, emotion recognition, gene classification, text classification, image steganalysis, data visualization, hyperspectral image classification, network malware detection, and several engineering tasks. Moreover, the techniques/algorithms, datasets, and classifiers used by the authors, together with the attained results for accuracy and computational time, are summarized for each of the feature selection and feature extraction methods. We observed that the trend among researchers for reducing dimensionality with feature selection is to use optimization algorithms: about half of the reviewed studies relied on different optimization techniques. The most used classifiers are SVM and KNN, and the best accuracy was achieved with the SVM algorithm. On the other hand, for feature extraction methods, CNN and DNN techniques play a major role and were used in 7 of the studied methods, while PCA is still widely used in feature extraction work and was used in 8 methods. Additionally, the optimized PCA achieved better performance in terms of accuracy, computational time, and the number of reduced features.

REFERENCES

[1] N. Sharma and K. Saroha, "Study of dimension reduction methodologies in data mining," in International Conference on Computing, Communication & Automation, 2015, pp. 133-137: IEEE.
[2] S. Ayesha, M. K. Hanif, and R. Talib, "Overview and comparative study of dimensionality reduction techniques for high dimensional data," Information Fusion, vol. 59, pp. 44-58, 2020.
[3] D. Q. Zeebaree, H. Haron, A. M. Abdulazeez, and S. R. Zeebaree, "Combination of K-means clustering with Genetic Algorithm: A review," International Journal of Applied Engineering Research, vol. 12, no. 24, pp. 14238-14245, 2017.
[4] Z. Cheng and Z. Lu, "A novel efficient feature dimensionality reduction method and its application in engineering," Complexity, vol. 2018, 2018.
[5] D. A. Zebari, H. Haron, D. Q. Zeebaree, and A. M. Zain, "A Simultaneous Approach for Compression and Encryption Techniques Using Deoxyribonucleic Acid," in 2019 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), 2019, pp. 1-6: IEEE.
[6] M. Li, H. Wang, L. Yang, Y. Liang, Z. Shang, and H. Wan, "Fast hybrid dimensionality reduction method for classification based on feature selection and grouped feature extraction," Expert Systems with Applications, vol. 150, p. 113277, 2020.
[7] A. P. Pandian, R. Palanisamy, and K. Ntalianis, Proceeding of the International Conference on Computer Networks, Big Data and IoT (ICCBI-2019). Springer Nature, 2020.
[8] M. A. Mohammed, B. Al-Khateeb, A. N. Rashid, D. A. Ibrahim, M. K. A. Ghani, and S. A. Mostafa, "Neural network and multi-fractal dimension features for breast cancer classification from ultrasound images," Computers & Electrical Engineering, vol. 70, pp. 871-882, 2018.
[9] O. Saini and S. Sharma, "A review on dimension reduction techniques in data mining," Computer engineering and intelligent systems, vol. 9, pp. 7-14, 2018.
[10] N. Abd-Alsabour, "On the Role of Dimensionality Reduction," JCP, vol. 13, no. 5, pp. 571-579, 2018.
[11] S. Velliangiri and S. Alagumuthukrishnan, "A Review of Dimensionality Reduction Techniques for Efficient Computation," Procedia Computer Science, vol. 165, pp. 104-111, 2019.
[12] W. Wang, W.-g. Shen, Y.-x. Sun, B. Chen, and R. Zhu, "Dimensionality reduction via adjusting data distribution density," in 2018 5th International Conference on Systems and Informatics (ICSAI), 2018, pp. 1052-1055: IEEE.
[13] J. Stuckman, J. Walden, and R. Scandariato, "The effect of dimensionality reduction on software vulnerability prediction models," IEEE Transactions on Reliability, vol. 66, no. 1, pp. 17-37, 2016.


[14] M. Verleysen and D. François, "The curse of dimensionality in data mining and time series prediction," in International Work-Conference on Artificial Neural Networks, 2005, pp. 758-770: Springer.
[15] L. Liu and M. T. Özsu, Encyclopedia of database systems. Springer New York, NY, USA, 2009.
[16] A. Juvonen, T. Sipola, and T. Hämäläinen, "Online anomaly detection using dimensionality reduction techniques for HTTP log analysis," Computer Networks, vol. 91, pp. 46-56, 2015.
[17] X. Huang, L. Wu, and Y. Ye, "A Review on Dimensionality Reduction Techniques," International Journal of Pattern Recognition and Artificial Intelligence, vol. 33, no. 10, p. 1950017, 2019.
[18] D. L. Padmaja and B. Vishnuvardhan, "Comparative study of feature subset selection methods for dimensionality reduction on scientific data," in 2016 IEEE 6th International Conference on Advanced Computing (IACC), 2016, pp. 31-34: IEEE.
[19] M. B. Abdulrazzaq and J. N. Saeed, "A Comparison of Three Classification Algorithms for Handwritten Digit Recognition," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 58-63: IEEE.
[20] A. S. Eesa, Z. Orman, and A. M. A. Brifcani, "A novel feature-selection approach based on the cuttlefish optimization algorithm for intrusion detection systems," Expert Systems with Applications, vol. 42, no. 5, pp. 2670-2679, 2015.
[21] A. S. Eesa, A. M. A. Brifcani, and Z. Orman, "Cuttlefish algorithm-a novel bio-inspired optimization algorithm," International Journal of Scientific & Engineering Research, vol. 4, no. 9, pp. 1978-1986, 2013.
[22] P. Jindal and D. Kumar, "A review on dimensionality reduction techniques," International journal of computer applications, vol. 173, no. 2, pp. 42-46, 2017.
[23] A. S. Eesa, Z. Orman, and A. M. A. Brifcani, "A new feature selection model based on ID3 and bees algorithm for intrusion detection system," Turkish Journal of Electrical Engineering & Computer Sciences, vol. 23, no. 2, pp. 615-622, 2015.
[24] U. M. Khaire and R. Dhanalakshmi, "Stability of feature selection algorithm: A review," Journal of King Saud University-Computer and Information Sciences, 2019.
[25] S. Visalakshi and V. Radha, "A literature review of feature selection techniques and applications: Review of feature selection in data mining," in 2014 IEEE International Conference on Computational Intelligence and Computing Research, 2014, pp. 1-6: IEEE.
[26] C. M. Teng, "Combining noise correction with feature selection," in International Conference on Data Warehousing and Knowledge Discovery, 2003, pp. 340-349: Springer.
[27] H. Zhao, F. Min, and W. Zhu, "Cost-sensitive feature selection of numeric data with measurement errors," Journal of Applied Mathematics, vol. 2013, 2013.
[28] J. N. Saeed, "A SURVEY OF ULTRASONOGRAPHY BREAST CANCER IMAGE SEGMENTATION TECHNIQUES," Academic Journal of Nawroz University, vol. 9, no. 1, pp. 1-14, 2020.
[29] Y. Leung and Y. Hung, "A multiple-filter-multiple-wrapper approach to gene selection and microarray data classification," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 1, pp. 108-117, 2008.
[30] C. Lazar et al., "A survey on filter techniques for feature selection in gene expression microarray analysis," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 4, pp. 1106-1119, 2012.
[31] M. R. Mahmood and A. M. Abdulazeez, "A Comparative Study of a New Hand Recognition Model Based on Line of Features and Other Techniques," in International Conference of Reliable Information and Communication Technology, 2017, pp. 420-432: Springer.
[32] M. Dash and H. Liu, "Feature selection for classification," Intelligent data analysis, vol. 1, no. 3, pp. 131-156, 1997.
[33] D. Jain and V. Singh, "Feature selection and classification systems for chronic disease prediction: A review," Egyptian Informatics Journal, vol. 19, no. 3, pp. 179-189, 2018.
[34] D. Q. Zeebaree, H. Haron, A. M. Abdulazeez, and D. A. Zebari, "Machine learning and Region Growing for Breast Cancer Segmentation," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 88-93: IEEE.
[35] D. Zebari, H. Haron, and S. Zeebaree, "Security Issues in DNA Based on Data Hiding: A Review," International Journal of Applied Engineering Research, vol. 12, no. 24, ISSN, pp. 0973-4562, 2017.
[36] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16-18, pp. 3273-3283, 2010.
[37] Y. Peng, Z. Wu, and J. Jiang, "A novel feature selection approach for biomedical data classification," Journal of Biomedical Informatics, vol. 43, no. 1, pp. 15-23, 2010.
[38] Q. Shen, R. Diao, and P. Su, "Feature Selection Ensemble," Turing-100, vol. 10, pp. 289-306, 2012.
[39] M. K. Elhadad, K. M. Badran, and G. I. Salama, "A novel approach for ontology-based dimensionality reduction for web text document classification," International Journal of Software Innovation (IJSI), vol. 5, no. 4, pp. 44-58, 2017.
[40] D. A. Zebari, H. Haron, S. R. Zeebaree, and D. Q. Zeebaree, "Enhance the Mammogram Images for Both Segmentation and Feature Extraction Using Wavelet Transform," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 100-105: IEEE.
[41] S. H. A. Moghaddam, M. Mokhtarzade, and B. A. Beirami, "A feature extraction method based on spectral segmentation and integration of hyperspectral images," International Journal of Applied Earth Observation and Geoinformation, vol. 89, p. 102097, 2020.
[42] D. M. Sulaiman, A. M. Abdulazeez, H. Haron, and S. S. Sadiq, "Unsupervised Learning Approach-Based New Optimization K-Means Clustering for Finger Vein Image Localization," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 82-87: IEEE.
[43] R. Aziz, C. Verma, and N. Srivastava, "Dimension reduction methods for microarray data: a review," AIMS. Bioengineering, vol. 4, no. 1, pp. 179-197, 2017.
[44] A. S. Eesa, A. M. Abdulazeez, and Z. Orman, "A DIDS Based on The Combination of Cuttlefish Algorithm and Decision Tree," Science Journal of University of Zakho, vol. 5, no. 4, pp. 313-318, 2017.
[45] Z. M. Hira and D. F. Gillies, "A review of feature selection and feature extraction methods applied on microarray data," Advances in bioinformatics, vol.170. 2015, 2015.
[46] A. M. Abdulazeez and A. S. Issa, "Intrusion detection system based on neural networks using bipolar input with bipolar sigmoid activation function," AL-Rafidain Journal of Computer Sciences and Mathematics, vol. 8, no. 2, pp. 79-86, 2011.
[47] D. A. Zebari, H. Haron, S. R. Zeebaree, and D. Q. Zeebaree, "Multi-Level of DNA Encryption Technique Based on DNA Arithmetic and Biological Operations," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 312-317: IEEE.
[48] O. M. S. Hassan, A. M. Abdulazeez, and V. M. TİRYAKİ, "Gait-based human gender classification using lifting 5/3 wavelet and principal component analysis," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 173-178: IEEE.
[49] F. P. Shah and V. Patel, "A review on feature selection and feature extraction for text classification," in 2016 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2016, pp. 2264-2268: IEEE.


[50] H. Sadeeq, A. Abdulazeez, N. Kako, and A. Abrahim, "A Novel Hybrid Bird Mating Optimizer with Differential Evolution for Engineering Design Optimization Problems," in International Conference of Reliable Information and Communication Technology, 2017, pp. 522-534: Springer.
[51] S. Chormunge and S. Jena, "Correlation based feature selection with clustering for high dimensional data," Journal of Electrical Systems and Information Technology, vol. 5, no. 3, pp. 542-549, 2018.
[52] P. Tan, X. Wang, and Y. Wang, "Dimensionality reduction in evolutionary algorithms-based feature selection for motor imagery brain-computer interface," Swarm and Evolutionary Computation, vol. 52, p. 100597, 2020.
[53] F. Hafiz, A. Swain, C. Naik, and N. Patel, "Efficient feature selection of power quality events using two dimensional (2D) particle swarms," Applied Soft Computing, vol. 81, p. 105498, 2019.
[54] X. Han, P. Liu, L. Wang, and D. Li, "Unsupervised feature selection via graph matrix learning and the low-dimensional space learning for classification," Engineering Applications of Artificial Intelligence, vol. 87, p. 103283, 2020.
[55] T. Niu, J. Wang, H. Lu, W. Yang, and P. Du, "Developing a deep learning framework with two-stage feature selection for multivariate financial time series forecasting," Expert Systems with Applications, vol. 148, p. 113237, 2020.
[56] D. Jain and V. Singh, "An efficient hybrid feature selection model for dimensionality reduction," Procedia Computer Science, vol. 132, pp. 333-341, 2018.
[57] E. S. Hosseini and M. H. Moattar, "Evolutionary feature subsets selection based on interaction information for high dimensional imbalanced data classification," Applied Soft Computing, vol. 82, p. 105581, 2019.
[58] Z. Manbari, F. AkhlaghianTab, and C. Salavati, "Hybrid fast unsupervised feature selection for high-dimensional data," Expert Systems with Applications, vol. 124, pp. 97-118, 2019.
[59] K. Qu, F. Gao, F. Guo, and Q. Zou, "Taxonomy dimension reduction for colorectal cancer prediction," Computational biology and chemistry, vol. 83, p. 107160, 2019.
[60] S. Umbarkar and S. Shukla, "Analysis of heuristic based feature reduction method in intrusion detection system," in 2018 5th International Conference on Signal Processing and Integrated Networks (SPIN), 2018, pp. 717-720: IEEE.
[61] F. Farokhmanesh and M. T. Sadeghi, "Deep Feature Selection using an Enhanced Sparse Group Lasso Algorithm," in 2019 27th Iranian Conference on Electrical Engineering (ICEE), 2019, pp. 1549-1552: IEEE.
[62] H.-T. Duong and V. T. Hoang, "Dimensionality Reduction Based on Feature Selection for Rice Varieties Recognition," in 2019 4th International Conference on Information Technology (InCIT), 2019, pp. 199-202: IEEE.
[63] A. F. Alharan, H. K. Fatlawi, and N. S. Ali, "A cluster-based feature selection method for image texture classification," Indonesian Journal of Electrical Engineering and Computer Science, vol. 14, no. 3, pp. 1433-1442, 2019.
[64] M. Z. Osman, M. A. Maarof, M. F. Rohani, K. Moorthy, and S. Awang, "Multi-Scale Skin Sample Approach for Dynamic Skin Color Detection: An Analysis," Advanced Science Letters, vol. 24, no. 10, pp. 7662-7667, 2018.
[65] Y. Arshak and A. Eesa, "A New Dimensional Reduction Based on Cuttlefish Algorithm for Human Cancer Gene Expression," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 48-53: IEEE.
[66] D. Q. Zeebaree, H. Haron, and A. M. Abdulazeez, "Gene selection and classification of microarray data using convolutional neural network," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 145-150: IEEE.
[67] O. Ahmed and A. Brifcani, "Gene Expression Classification Based on Deep Learning," in 2019 4th Scientific International Conference Najaf (SICN), 2019, pp. 145-149: IEEE.
[68] V. Balasaraswathi, "Enhanced Cuttle Fish Algorithm Using Membrane Computing for feature selection of intrusion detection," vol. 10, special issue, 2018.
[69] J. Kaur and S. Singh, "Feature selection using mutual information and adaptive particle swarm optimization for image steganalysis," in 2018 7th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), 2018, pp. 538-544: IEEE.
[70] A. Fatima, R. Maurya, M. K. Dutta, R. Burget, and J. Masek, "Android Malware Detection Using Genetic Algorithm based Optimized Feature Selection and Machine Learning," in 2019 42nd International Conference on Telecommunications and Signal Processing (TSP), 2019, pp. 220-223: IEEE.
[71] E. Widiyanti and S. N. Endah, "Feature Selection for Music Emotion Recognition," in 2018 2nd International Conference on Informatics and Computational Sciences (ICICoS), 2018, pp. 1-5: IEEE.
[72] G. Zeller et al., "Potential of fecal microbiota for early-stage detection of colorectal cancer," Molecular systems biology, vol. 10, no. 11, 2014.
[73] J. Zackular, M. Rogers, M. Ruffin 4th, and P. D. Schloss, "The human gut microbiome as a screening tool for colorectal cancer," Cancer Prev Res (Phila), vol. 7, no. 11, pp. 1112-21, 2014.
[74] M. A. Berbar, "Hybrid methods for feature extraction for breast masses classification," Egyptian informatics journal, vol. 19, no. 1, pp. 63-73, 2018.
[75] M. A. Rahman, M. F. Hossain, M. Hossain, and R. Ahmmed, "Employing PCA and t-statistical approach for feature extraction and classification of emotion from multichannel EEG signal," Egyptian Informatics Journal, vol. 21, no. 1, pp. 23-35, 2020.
[76] C. Chu, Z. Zuo-xi, K. Xin-rong, and G. Yun-zhi, "The Research of Machinery Fault Feature Extraction Methods Based On Vibration Signal," IFAC-PapersOnLine, vol. 51, no. 17, pp. 346-352, 2018.
[77] Y. Li, Y. Chai, H. Zhou, and H. Yin, "A novel feature extraction method based on discriminative graph regularized autoencoder for fault diagnosis," IFAC-PapersOnLine, vol. 52, no. 24, pp. 272-277, 2019.
[78] V. Nagarajan, E. C. Britto, and S. M. Veeraputhiran, "Feature extraction based on empirical mode decomposition for automatic mass classification of mammogram images," Medicine in Novel Technology and Devices, vol. 1, p. 100004, 2019.
[79] N. Rabin, M. Kahlon, S. Malayev, and A. Ratnovsky, "Classification of human hand movements based on EMG signals using nonlinear dimensionality reduction and data fusion techniques," Expert Systems with Applications, vol. 149, p. 113281, 2020.
[80] M. Kuncan, K. Kaplan, M. R. Minaz, Y. Kaya, and H. M. Ertunç, "A novel feature extraction method for bearing fault classification with one dimensional ternary patterns," ISA transactions, vol. 100, pp. 346-357, 2020.
[81] Z. Liu, J. Wang, G. Liu, and L. Zhang, "Discriminative low-rank preserving projection for dimensionality reduction," Applied Soft Computing, vol. 85, p. 105768, 2019.
[82] J. Ma and Y. Yuan, "Dimension reduction of image deep feature using PCA," Journal of Visual Communication and Image Representation, vol. 63, p. 102578, 2019.
[83] A. Sellami and M. Farah, "Comparative study of dimensionality reduction methods for remote sensing images interpretation," in 2018 4th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), 2018, pp. 1-6: IEEE.


[84] X. Chen, J. Li, Y. Zhang, Y. Lu, and S. Liu, "Automatic feature extraction in X-ray image based on deep learning approach for determination of bone age," Future Generation Computer Systems, 2019 Oct 31.
[85] Z. Jin, G. Feng, Y. Ren, and X. Zhang, "Feature Extraction Optimization of JPEG Steganalysis Based on Residual Images," Signal Processing, vol. 170, p. 107455, 2020.
[86] W. Lin, J. Huang, C. Y. Suen, and L. Yang, "A feature extraction model based on discriminative graph signals," Expert Systems with Applications, vol. 139, p. 112861, 2020.
[87] S. M. Kasongo and Y. Sun, "A deep learning method with wrapper based feature extraction for wireless intrusion detection system," Computers & Security, vol. 92, p. 101752, 2020.
[88] Y. Liu and A. Sui, "Research on Feature Dimensionality Reduction in Content Based Public Cultural Video Retrieval," in 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), 2018, pp. 718-722: IEEE.
[89] O. Dehzangi and V. Sahu, "IMU-Based Robust Human Activity Recognition using Feature Analysis, Extraction, and Reduction," in 2018 24th International Conference on Pattern Recognition (ICPR), 2018, pp. 1402-1407: IEEE.
[90] X. Zhang et al., "Spatial-Spectral Graph-Based Nonlinear Embedding Dimensionality Reduction for Hyperspectral Image Classification," in IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, 2018, pp. 8472-8475: IEEE.
[91] T. Alipourfard, H. Arefi, and S. Mahmoudi, "A novel deep learning framework by combination of subspace-based feature extraction and convolutional neural networks for hyperspectral images classification," in IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, 2018, pp. 4780-4783: IEEE.
[92] D. Q. Zeebaree, H. Haron, A. M. Abdulazeez, and D. A. Zebari, "Trainable Model Based on New Uniform LBP Feature to Identify the Risk of the Breast Cancer," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 106-111: IEEE.
[93] Z. Liu, Z. Lai, W. Ou, K. Zhang, and R. Zheng, "Structured optimal graph based sparse feature extraction for semi-supervised learning," Signal Processing, vol. 170, p. 107456, 2020.
[94] A. M. Martinez, "The AR face database," CVC Technical Report 24, 1998.
[95] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, "The FERET database and evaluation procedure for face-recognition algorithms," Image and vision computing, vol. 16, no. 5, pp. 295-306, 1998.
[96] Y. Xu, X. Li, J. Yang, Z. Lai, and D. Zhang, "Integrating conventional and inverse representation for face recognition," IEEE transactions on cybernetics, vol. 44, no. 10, pp. 1738-1746, 2013.
[97] F. S. Samaria and A. C. Harter, "Parameterisation of a stochastic model for human face identification," in Proceedings of 1994 IEEE workshop on applications of computer vision, 1994, pp. 138-142: IEEE.

12 pages
Movie Recommendation System
No ratings yet
Movie Recommendation System
22 pages
BigData Assessment2 26230605
No ratings yet
BigData Assessment2 26230605
14 pages
APS1070 Lecture (3) Slides
No ratings yet
APS1070 Lecture (3) Slides
70 pages
Predicting Brain Age using ml algorithms
No ratings yet
Predicting Brain Age using ml algorithms
9 pages




