Dimensionality Reduction
All content following this page was uploaded by Rizgar R. Zebari on 16 May 2020.
Rizgar R. Zebari1,*, Adnan Mohsin Abdulazeez2, Diyar Qader Zeebaree3, Dilovan Asaad Zebari4, Jwan Najeeb Saeed5
1 IT Department, Technical College of Informatics Akre, Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, rizgar.ramadhan@dpu.edu.krd
2 Presidency of Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, Adnan.mohsin@dpu.edu.krd
3,4 Research Center of Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, dqszeebaree@dpu.edu.krd, dilovan.zebari@dpu.edu.krd
5 IT Department, Duhok Technical Institute, Duhok Polytechnic University, Duhok, Kurdistan Region, Iraq, jwan.najeeb@dpu.edu.krd
* Correspondence: rizgar.ramadhan@dpu.edu.krd
Abstract
Due to the sharp increase in data dimensionality, every data mining or machine learning (ML) task requires more efficient techniques to get the desired results. Therefore, in recent years, researchers have proposed and developed many methods and techniques to reduce the high dimensionality of data while attaining the required accuracy. To improve the accuracy of learned features and to decrease training time, dimensionality reduction is used as a pre-processing step that can eliminate irrelevant data, noise, and redundant features. Dimensionality reduction (DR) is performed with two main methods: feature selection (FS) and feature extraction (FE). FS is considered an important method because data is generated continuously at an ever-increasing rate; it can mitigate some serious dimensionality problems by effectively decreasing redundancy, eliminating irrelevant data, and improving the comprehensibility of results. FE, in turn, deals with the problem of finding the most distinctive, informative, and reduced set of features to improve the efficiency of both the processing and storage of data. This paper offers a comprehensive approach to FS and FE in the scope of DR. The details of each paper, such as the algorithms/approaches, datasets, and classifiers used and the results achieved, are comprehensively analyzed and summarized. In addition, all of the reviewed methods are systematically discussed to highlight the authors' trends, determine the method(s) that significantly reduced computational time, and select the most accurate classifiers. As a result, the different types of both methods have been discussed and the findings analyzed.
Keywords: dimension reduction, dimension reduction techniques, feature selection, feature extraction.
Received: April 29, 2020 / Accepted: May 13, 2020 / Online: May 15, 2020
doi: 10.38094/jastt1224
Zebari et al. / Journal of Applied Science and Technology Trends Vol. 01, No. 02, pp. 56 –70, (2020)
[...] to reduce the size of the input data while preserving as much of the variance of the essential features as possible compared with the larger dataset. Real-world data then become easier to explore and use in data mining applications, and high accuracy can be attained [1, 9]. Moreover, dimensionality reduction enhances the accuracy and efficiency of data mining computation, and it is considered a vital preprocessing step. Furthermore, it provides several advantages, such as eliminating irrelevant and redundant patterns in the dataset and, as a result, reducing the time and memory required to process such data [1, 10]. By reducing the dataset, the quality of the data improves, algorithms work more efficiently and achieve better accuracy, and pattern design and examination become clearer for researchers [11]. Additionally, it reduces the cost of computing, improves the visualization of dimensions, and enhances the results [12, 13].
This work reviews more than forty articles on feature selection and feature extraction that were introduced and published in the last three years.
II. DIMENSIONALITY REDUCTION TECHNIQUES
Dimensionality reduction is the operation of transforming a high-dimensional representation of data into a low-dimensional representation. With the massive growth in high-dimensional data, the use of various dimensionality reduction techniques has become popular in many application areas, and modern approaches are continually emerging. Dimensionality reduction techniques transform the original high-dimensional dataset into a new low-dimensional dataset while maintaining the original meaning of the data as much as possible. The low-dimensional representation of the original data contributes to solving the curse of dimensionality problem, and low-dimensional data can be easily analyzed, processed, and visualized [14]. Applying dimensionality reduction techniques to a dataset brings several benefits. (i) As the number of dimensions comes down, data storage space can be reduced. (ii) Less computation time is needed. (iii) Redundant, irrelevant, and noisy data can be removed. (iv) Data quality can be improved. (v) Some algorithms do not perform well on a large number of dimensions, so reducing these dimensions helps an algorithm work efficiently and improves accuracy. (vi) It is challenging to visualize data in higher dimensions, so reducing the dimension may allow us to design and examine patterns more clearly. (vii) It simplifies the process of classification and also improves efficiency [15, 16].
Generally, dimensionality reduction is achieved through two different techniques: feature selection and feature extraction. In feature selection, information can be lost because some features are excluded when the feature subset is chosen. In feature extraction, however, the dimension can be decreased without losing much of the information in the initial feature set [2, 10, 14, 17]. Table I provides a descriptive summary of the methods of dimension reduction.
TABLE I. THE SUMMARY OF DIMENSION REDUCTION TECHNIQUES
Method | Main concept | Pros | Cons
Feature extraction | Summarize the dataset by creating linear combinations of the features | Preserves the original, relative distance between objects; covers latent structure | Not sufficient in the presence of a huge number of irrelevant features
Feature selection | A sublist of relevant features can be selected depending on defined criteria | Strong against irrelevant features | Does not cover latent structure
A. Feature Selection
Feature selection is utilized to reduce the impact of dimensionality on the dataset by finding the subset of features that efficiently defines the data [18, 19]. It selects the features that are important and relevant to the mining task from the input data and removes redundant and irrelevant features [20, 21]. It is useful for detecting a good subset of features that is appropriate for the given problem [2, 22]. The main purpose of feature selection is to construct a subset of features that is as small as possible but still represents the vital features of the whole input data [11, 23]. Feature selection provides numerous advantages: it reduces the size of the data, decreases the needed storage, improves prediction accuracy, avoids overfitting, and reduces execution and training time with variables that are easier to understand. The feature selection algorithm is divided into two phases: (i) subset generation and (ii) subset evaluation. In subset generation, a subset is generated from the input dataset, and in subset evaluation, the generated subset is checked to determine whether it is optimal or not [24, 25]. “Fig. 1” shows the overall method of the feature selection process.
[Flowchart: Full Feature Set; Subset Generation; Subset Evaluation; "Is stopping criteria met?" (No / Yes); Selected Feature Set]
Fig. 1. Process of feature selection.
B. Feature Selection Problems
Various problems can benefit from the application of feature selection techniques. High-dimension, low-sample-size data are becoming more common in different fields, and many of the features in these problems do not contribute to an adequate classification. Moreover, the class imbalance problem happens when one of the two classes has more samples than the other class.
Many algorithms neglect the minority samples while concentrating on classifying the majority samples. However, the minority samples are crucial even though they occur seldom. Moreover, in
machine learning, dataset shift is a common problem that happens when the joint distribution of inputs and outputs varies between the training and test stages. A special case of dataset shift, which happens when only the input distribution changes, is called covariate shift. Furthermore, dimensionality reduction, and consequently feature selection, is one of the most common techniques for eliminating noisy data. Finally, misclassification costs and test costs are the two most significant kinds of cost in cost-sensitive learning [26-28].
C. Feature Selection Methods
Feature selection aims to select a feature subset from the original set of features based on each feature's relevance and redundancy. Originally, evaluation methods in feature selection were divided into four kinds: filter, wrapper, embedded [10, 14, 18], and hybrid [20, 29]. Recently, another type of evaluation method has been developed, i.e., ensemble feature selection [30, 31]. “Fig. 2” depicts the hierarchy of feature selection techniques.
[Hierarchy diagram: Feature Selection Techniques; branches: Filter, Embedded, Wrapper, Hybrid, Ensemble; leaf nodes: Pearson correlation, LDA, Forward selection, Backward elimination, Recursive Feature Elimination]
[...] thus, they need extensive computation times to achieve convergence and can be intractable for large datasets [33, 34]. “Fig. 3” shows the steps involved in the wrapper method.
[Flowchart: Set of all features; Generate a subset; Learning algorithm; Performance; Selecting the Best Subset]
Fig. 3. Wrapper method for feature selection.
The embedded method is a built-in feature selection mechanism that embeds feature selection in the learning algorithm and uses its properties to guide feature evaluation. The embedded method is computationally more effective and more tractable than the wrapper method while retaining similar performance. This is because the embedded method avoids the repeated execution of the classifier and the examination of every feature subset. The embedded method combines the qualities of both filter and wrapper methods. It selects features during the execution of the mining algorithm, and hence it is less computationally expensive [32, 35]. The steps involved in the embedded method are shown in “Fig. 4”.
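To make the wrapper loop of “Fig. 3” concrete, the following minimal sketch performs a greedy forward search in which every candidate subset is scored by actually running a learning algorithm. The leave-one-out 1-nearest-neighbour scorer, the function names, and the toy data are illustrative assumptions and are not taken from any of the cited works. The repeated classifier runs inside the loop are what make wrapper methods costly compared with embedded ones.

```python
from math import dist


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier that
    only looks at the given feature indices (illustrative learner)."""
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue
            d = dist([xi[f] for f in features], [xj[f] for f in features])
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def forward_wrapper_selection(X, y, max_features=None):
    """Greedy forward search: add the feature that most improves the
    learner's accuracy and stop when no candidate improves it."""
    n_features = len(X[0])
    max_features = max_features or n_features
    selected, best_score = [], 0.0
    while len(selected) < max_features:
        # Subset generation: extend the current subset by one feature.
        candidates = [f for f in range(n_features) if f not in selected]
        # Subset evaluation: run the learner on every candidate subset.
        scored = [(loo_1nn_accuracy(X, y, selected + [f]), f) for f in candidates]
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:  # stopping criterion: no improvement
            break
        selected.append(feature)
        best_score = score
    return selected, best_score


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
selected, score = forward_wrapper_selection(X, y)
print(selected, score)
```

On this toy data the search keeps only the informative feature, because adding the noisy one lowers the leave-one-out accuracy.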
IV. A REVIEW ON DIMENSIONALITY REDUCTION METHODS
A. Feature Selection Methods
Dimensionality reduction utilizes feature selection methods to select relevant features. In this study, we discuss several recent works on feature selection. The literature on feature selection methods is summarized in Table III.
Churmonge and Jena [51] proposed a method to address the dimensionality issue based on clustering combined with correlation filter subset selection. The relevant features were found by the K-means clustering algorithm, and redundant features within the clusters were found and removed by the correlation measure. The presented method was applied to 8 text and 4 microarray datasets, and the Naive Bayes (NB) classifier was used for the classification. Furthermore, the authors compared the performance of their method with the ReliefF and information gain (IG) feature selection methods in terms of accuracy and computational time. The accuracy of the proposed method outperformed both methods in all but two datasets, and in computational time, the proposed method was faster than the other methods in all datasets.
Tan et al. [52] presented a feature selection method based on the evolutionary algorithm (EA) to reduce the dimensionality of motor imagery brain-computer interface data from electroencephalogram (EEG) signals. A subset of important features was generated in each iteration of the EA, while the redundant and insignificant features were eliminated. The experiments were performed on two different kinds of data: an EEG dataset and several machine learning datasets. Three classifiers were used: support vector machine (SVM), K-Nearest Neighbor (KNN), and discriminant analysis (DA). The performance of the proposed EA feature selection method was compared with PCA, independent component analysis (ICA), neighborhood component analysis (NCA), and variable-length particle swarm optimization (VLPSO). The results showed that the introduced method outperformed all the above methods and could achieve high accuracy even with a small subset of the features.
Hafiz et al. [53] investigated the feature selection issues in power quality events and proposed a two-dimensional PSO feature selection method. They relied on the two dimensions in order to efficiently guide the particle swarm through the search space. The behavior of the reduced subset under noise was studied with Gaussian noise. The induction algorithms used in this study were KNN and Naïve Bayes. Moreover, the performance of the proposed method was compared with the Genetic Algorithm (GA), Ant Colony Optimization (ACO), Binary PSO (BPSO), Catfish BPSO, and Chaotic BPSO (CHBPSO). The results showed that the presented method could find an important and robust feature subset and achieve better accuracy than the above-mentioned methods.
Han et al. [54] addressed the limitations of the local linear embedding (LLE) method to propose an unsupervised feature selection mechanism. They relied on low-dimensional space learning and graph matrix learning. The experiments were performed on 15 datasets (6 of them from the UCI machine learning repository and 9 from the feature selection dataset website). The SVM and decision tree (DT) were used as classifiers. In addition, the presented method was compared with eight unsupervised feature selection methods. The results demonstrated that the proposed method accomplished better accuracy with the SVM and DT classifiers except in two datasets and better stability with both classifiers compared to the 8 feature selection methods.
Niu et al. [55] presented a method to deal with the nonlinearity inherent in multivariate financial time series in order to improve forecasting accuracy and support better financial decisions. The proposed method involved a feature selection part, a deep learning framework, and an error correlation part. In the feature selection part, the RReliefF algorithm (an enhanced version of ReliefF) cooperated with a wrapper-based method to remove redundant features. The deep learning part consisted of long short-term memory (LSTM), gated recurrent unit (GRU), and an optimizer based on adaptive moment estimation (Adam). The deep learning part was trained on the subset generated by the first part. Furthermore, the error correlation was used to enhance the accuracy of the method. The performance of the method was validated on 16 benchmarks and three datasets, and the results showed its superiority.
Jain and Singh [56] proposed a hybrid feature selection method that consisted of the ReliefF and PCA algorithms. First, the weight of each feature in the used datasets was calculated, and a set of satisfactory features was generated by the first algorithm. The second algorithm was then applied to the generated set. Two types of datasets were considered (text and microarray), and the experiments were performed on ten datasets. The performance of the method was evaluated in terms of the number of selected features and computation time. The results indicated that the presented method could achieve better performance on low- and high-dimensional datasets and removed half of the irrelevant and redundant features.
Hosseini and Moattar [57] presented a feature subset selection method for high-dimensional imbalanced data classification. The authors focused on the feature space, and the method was based on interaction information to improve the search process. In each iteration of the method, multiple subsets of the features were generated, and the best subset was carried into the next iteration. In more detail, the candidate features were first selected by the Symmetric Uncertainty Algorithm (SUA). Then multivariate interaction information was used to test the candidate feature subsets, and the best subset of features was selected based on the dominance relationship. Furthermore, KNN, Naïve Bayes, and CART were used as classifiers. The efficiency of the proposed method was assessed on 13 datasets from different repositories. Additionally, the method outperformed 10 other feature selection methods in terms of accuracy and the number of reduced features.
Manbari et al. [58] proposed a hybrid unsupervised feature selection method based on clustering and the binary ant system. The procedure of the method was executed in two stages: In the first,
the clustering was performed in order to cluster the features, and in the second stage, the best feature from each cluster was calculated through the iterations of the ant process. The second stage of the proposed method was repeated several times until the dominant features were collected. The presented method was compared with seven other unsupervised feature selection methods on eight datasets from UCI and the Pablo de Olavide research group. The comparison was assessed using four classifiers (SVM, KNN, DT, and RF). The results demonstrated that the method performed better than the other methods and significantly reduced computation time.
Qu et al. [59] proposed a feature selection approach for predicting colorectal cancer based on operational taxonomic units. Three feature selection methods were integrated into the proposed method. First, a subset of the most significant operational taxonomic units was generated by multiple dimension-reduction methods. Then, to reduce the dimensionality and increase the efficiency, correlation-based feature selection (CFS) and maximum relevance–maximum distance (MRMD) were used as a combined method. Moreover, the best features were selected according to the taxonomy file. The experiment was performed on two datasets, and three classifiers (RF, Naïve Bayes, and DT) were used for evaluation. The results illustrated that the correlation-based feature selection method achieved a better reduction, while MRMD required more time and memory for computation. Among all the classifiers used, the RF achieved the best performance.
Umbarkar and Shukla [60] presented a method to improve the performance of the intrusion detection system. They used the IG, gain ratio (GR), and CFS algorithms to reduce the dimensionality of the used dataset. To obtain the best reduced set of features, they divided the original set into several parts, applied each algorithm to those parts, and selected the most accurate. The experiment was performed on the KDD-Cup 99 dataset, and DT was used as the classifier. The results indicated that the correlation-based feature selection method achieved better performance than the other algorithms used.
Farokhmanesh and Sadeghi [61] proposed a feature selection method based on sparse feature selection and deep neural networks. Initially, three sparse methods, Correntropy-induced, Discriminative Least Squares Regression (DLSR), and Sparse Group Lasso (SGL), were evaluated and compared. Next, the SGL was integrated with a deep neural network, and the performance of this combination was also assessed. The K-means algorithm was used in the SGL method in order to group features. The nearest neighbor (NN) algorithm was used as a classifier in the performance evaluation of the techniques. The experiments and evaluation of the three sparse methods and the combined method were performed on the MNIST dataset. The results illustrated that the SGL combined with a deep neural network achieved better accuracy.
Duong and Hoang [62] presented a method based on the Histogram of Oriented Gradient (HOG) descriptor and a feature selection method to classify rice quality. They extracted HOG features from the rice image, and the score of each feature was calculated by the Fisher score feature selection method. The VNRICE dataset was used for the experiment, and the NN classifier was utilized. The results showed that the Fisher score method improved the classification accuracy by 42% and reduced the computation time.
Alharan et al. [63] proposed a method based on feature extraction and feature selection methods for texture image classification. First, a set of features was extracted from the used datasets with three approaches (Gray Level Co-occurrence Matrix (GLCM), Local Binary Pattern (LBP), and Gabor filter). In the second stage, the extracted features were evaluated with five techniques (information gain, gain ratio, OneR, ReliefF, and symmetric). Then, based on this assessment, the feature selection was accomplished with the K-means clustering algorithm. The experiment was done on three datasets, and the SVM, NB, and KNN classifiers were used for the classification. The results showed that the NB and KNN achieved better accuracy on the first dataset, while the SVM attained superior accuracy on the second and third datasets.
Osman et al. [64] proposed a model involving color-based feature extraction and feature selection methods for identifying origin automatically. The presented method was performed in three stages. First, skin color information was extracted from human faces using a skin color detection technique. Next, the wrapper subset evaluator and GA method were used to eliminate the redundant and irrelevant features. Moreover, the authors used 1550 face images from different regions, and six classifiers (NB, Bayes Net, KNN, SVM, RF, and Multilayer Perceptron (MLP)) were used for the classification. The results illustrated that the accuracy of individual color features was lower than that of the combined color features. Also, the accuracy of the SVM, NB, and Bayes Net was very low, and hence they could not be used for the proposed method.
Arshak and Eesa [65] proposed a feature selection method for dimensionality reduction based on the cuttlefish algorithm (CFA) for gene classification. The cuttlefish algorithm was used to generate a subset of the optimal features, and the KNN was used as a classifier for the evaluation and classification of the proposed method. The experiment was performed on eight different datasets from the ELVIRA biomedical dataset repository. The performance of the proposed method was compared with SVM, DT, and the hidden Markov model in terms of accuracy and computational time. The results demonstrated that the presented method achieved better performance in five datasets compared to the other methods.
Zeebaree et al. [66, 67] proposed a feature selection method based on the convolutional neural network (CNN) for
classifying and identifying the cancer type in microarray cancer data. In the presented method, the cancer data was transformed into an array after the data files were opened. Next, the cancer data was organized as matrix vectors, and then the CNN was applied for the classification. The experiment was carried out on ten cancer datasets, and the performance of the method was compared with the mSVM-RFE-iRF and varSelRF methods. The results indicated that the proposed CNN method achieved better classification accuracy compared to the other methods. Also, it outperformed the other methods in terms of reducing the number of cancer genes.
Balasaraswathi [68] introduced a feature selection method for intrusion detection systems based on CFA and membrane computing (MC). In the proposed method, membrane computing was integrated with the cuttlefish algorithm to enhance the feature selection process. Two intrusion detection datasets were used for the experiment. The performance of the cuttlefish algorithm with and without MC was illustrated. Furthermore, the performance of the proposed method was compared with a number of other feature selection methods. The results showed that the accuracy and computation time of the CFA combined with MC were better than those of all other methods.
Kaur and Singh [69] proposed a method for image steganalysis based on feature selection and PSO. First, the predominant features were selected by mutual information. Moreover, for selecting dominant features, adaptive PSO was used by the authors. For the experiment, 10000 stego images were taken from the BOSSbase dataset. The classification accuracy of the method was evaluated with SVM, KNN, and DT classifiers and compared with three other PSO-based methods. The results revealed that the proposed method outperformed the other methods.
Fatima et al. [70] proposed an optimized feature selection method for detecting malware on the Android platform. They used an evolutionary GA to reduce the feature dimensionality to 50% of the original dataset and then trained the classifier to detect the malware features. They used two APK sets (malware/goodware) and two classifiers (SVM and neural network). The experimental results demonstrated that the SVM accuracy was 96.6% and the neural network accuracy was 95.2%. The authors concluded that the performance of the presented method could be enhanced by utilizing larger datasets.
Widiyanti and Endah [71] studied music emotion recognition based on feature selection algorithms. First, several features were extracted from the used datasets. After that, three feature selection algorithms, namely Sequential Forward Selection (SFS), Sequential Backward Selection (SBS), and ReliefF, were used to identify emotional features. Emotion classes such as sad, angry, and happy were used. The experiment was performed on a song dataset, and the SVM classifier was used to compare the performance of the algorithms. The results showed that the accuracy of the ReliefF algorithm was lower than that of the other algorithms, which obtained similar accuracy.
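ReliefF, which appears in several of the works reviewed above, ranks features by how strongly they differ between a sample and its nearest neighbour of another class, compared with its nearest neighbour of the same class. The sketch below is a simplified two-class Relief, not the full ReliefF algorithm (which averages over k neighbours and handles multiple classes). The function name, the range scaling, and the toy data are assumptions made for illustration only.

```python
def relief_weights(X, y):
    """Simplified two-class Relief over all instances (not full ReliefF).
    Assumes every class has at least two samples."""
    n, m = len(X), len(X[0])
    # Scale feature differences by the feature's value range, so each lies in [0, 1].
    spans = []
    for f in range(m):
        column = [row[f] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def nearest(i, same_class):
        """Index of the nearest hit (same class) or nearest miss (other class)."""
        best, best_d = None, float("inf")
        for j in range(n):
            if j == i or (y[j] == y[i]) != same_class:
                continue
            d = sum(diff(f, X[i], X[j]) for f in range(m))
            if d < best_d:
                best, best_d = j, d
        return best

    weights = [0.0] * m
    for i in range(n):
        hit, miss = nearest(i, True), nearest(i, False)
        for f in range(m):
            # Reward features that differ on the miss, penalise those that differ on the hit.
            weights[f] += (diff(f, X[i], X[miss]) - diff(f, X[i], X[hit])) / n
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
print(weights)
```

A filter built on these weights would keep the features whose weight exceeds a chosen threshold; here the informative feature receives a positive weight and the noisy one a negative weight.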
Ref. | Year | Dataset | Technique(s) | Computation Time | Classifier(s) | Accuracy
[51] | 2018 | Text and Microarray | Correlation and K-means | 0.5 to 10.24 seconds | NB | The best 99.0.2% and the worst 68.02%
[53] | 2018 | Power quality | PSO | 20.9 seconds | KNN and NB | KNN: 99.68% and NB: 99.44%
[56] | 2018 | Text and Microarray | ReliefF and PCA | 1 to 29 seconds | - | -
[60] | 2018 | KDD-Cup 99 | IG, GR and correlation | - | DT | Correlation: 92.65%, IG: 92.33% and GR: 92.54%
[64] | 2018 | Human images | wrapper subset and GA | - | NB, KNN, SVM, RF and MLP | NB: 57%, SVM: 60%, KNN: 64%, RF: 75% and MLP: 71%
[65] | 2018 | ELVIRA | CFA | 0.049 to 2.11 seconds | KNN | 100%
[66] | 2018 | Different cancer datasets | CNN | - | - | 100%
[68] | 2018 | KDDCUP’99 | CFA and MC | 0.15 by CFA and 0.11 CFAMC | j48 | 96.66%
[71] | 2018 | Song dataset | SFS, SBS and ReliefF | - | SVM | 43%
[57] | 2019 | UCI, KEEL and GitHub | SUA and interaction information | 126.57 seconds | KNN, NB and CART | 100% for KNN, NB and CART
[58] | 2019 | UCI | K-means and binary Ant | 9.939 seconds | SVM, KNN, DT and RF | -
[59] | 2019 | [72] and [73] | CFS and MRMD | - | RF, NB, and DT | RF achieved better accuracy
[61] | 2019 | MNIST | DLSR, SGL and Deep Learning | - | NN | 96.77%
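Information gain (IG) and gain ratio (GR), used in [60], are entropy-based filter scores: IG measures how much knowing a discrete feature reduces the entropy of the class label, and GR divides IG by the entropy of the feature itself. The sketch below computes both for categorical data. The function names and the four-record toy example are hypothetical and are not drawn from the KDD-Cup 99 dataset.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a discrete feature X and labels Y."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [l for v, l in zip(feature_values, labels) if v == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(feature_values, labels):
    """IG normalised by the entropy of the feature (its split information)."""
    split_info = entropy(feature_values)
    return information_gain(feature_values, labels) / split_info if split_info else 0.0


# Hypothetical records: "protocol" determines the label, "flag" carries no information.
labels = ["attack", "attack", "normal", "normal"]
protocol = ["udp", "udp", "tcp", "tcp"]
flag = ["a", "b", "a", "b"]

for name, column in [("protocol", protocol), ("flag", flag)]:
    print(name, information_gain(column, labels), gain_ratio(column, labels))
```

Ranking features by either score and keeping the top-ranked ones gives a filter method, since no classifier is trained during selection.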
B. Feature Extraction Methods The result indicated that the prepared method outperformed
In previous literature, the dimensionality reduction uses the several other emotion recognition methods. Chu et al. [76]
feature selection methods to select the relevant features have proposed a system for extracting features from the machinery
been presented. The remainder of the aim of this paper reviews fault-based vibration signals. They used three feature extraction
is to review the latest literature related to feature extraction and methods: Fourier transform frequency spectrum (FTFS),
dimensionality reduction techniques. Table IV illustrates the envelopment analysis, and local mean decomposition (LMD).
summary of the recent literature. The vibration signals were analyzed by the Fourier transform
analysis to get amplitude and phase spectrums. The results have
Moghaddam et al. [41] proposed a method known as given that the envelope analysis and mean decomposition
spectral segmentation and integration (SSI) as supervised methods could extract between cancerous and non-cancerous
feature extraction for hyperspectral images. The developed tumors in the breast. The authors used a new threshold to
method divided pixels’ spectral signature curve to channels. improve LBP texture features and the LBP descriptor for
Then a mean weighted operator was used for integration of each identifying the abnormal cases. In the proposed method, the
channel band in order to extract new features in a very minimal features extracted by using CNN and SVM were used for the
number compared to the original bands. Moreover, the PSO classification. The experimental results have explained that the
algorithm was used to merge spectral signature curve pixel segments so as to reduce the dimensionality
of the image and to increase the class accuracy. In the proposed technique, the SVM was used as a
classifier, and two datasets were used. The experimental results confirmed that the SSI method
outperformed other feature extraction methods such as PCA, SRS, NWFE, DAFE, SELD, BCC, and CBFE.

Berbar [74] worked on the classification of malignant masses in mammograms based on feature
extraction. The researcher presented three hybrid methods for Gray Level Co-occurrence Matrix (GLCM)
texture feature extraction, called Wavelet CT1, Wavelet CT2, and ST-GLCM. The region of interest of
the image was divided into sub-images, and a contrast stretching stage was applied prior to feature
extraction. The feature extraction methods were then applied to the sub-images. Next, seven texture
features extracted by the GLCM were merged with seven statistical features. Two image datasets and an
SVM classifier were used in this research. The proposed methods outperformed the multi-resolution
feature extraction methods in terms of the number of extracted features. Also, on the Area under the
Curve (AUC) measure, the researcher's methods were superior to other feature extraction methods.

Rahman et al. [75] worked on the emotion recognition task. They used PCA and the t-statistic to
reduce the dimensionality of features extracted from emotional EEG signals. The proposed method was
applied to the SJTU emotion EEG dataset. The emotional states were classified from the extracted
features by four classifiers: SVM, ANN, LDA, and KNN.

developed method could classify the ultrasound images with high accuracy and sensitivity.

Li et al. [77] worked on fault diagnosis and used a discriminative graph regularized autoencoder
(DGAE) to design a feature extraction method. They used an advanced neural network structure to map
process data to the feature space, to avoid the problems of manually designing features, and to
ensure that the learned features truly reflect the data characteristics. Furthermore, the neural
network model was integrated with a graph to learn the internal representation and to preserve
locality. Also, to improve classification performance, the label information of the training samples
was embedded in the graph. An NN was used as a classifier. In comparison with other feature
extraction methods for fault diagnosis, the proposed method achieved better performance.

Nagarajan et al. [78] used Empirical Mode Decomposition (EMD) to propose two feature extraction
methods for mammogram images. The region of interest of each image was divided into a group of
different frequency components. Their first method was based on Bi-dimensional Empirical Mode
Decomposition (BEMD). From these groups, the GLCM and gray level run-length matrix features were
extracted. However, the features extracted by the first, BEMD-based method were less orthogonal to
each other. Therefore, the researchers proposed a second feature extraction method, a modified
version of the first one called MBEMD. The SVM and LDA classifiers were used in this research.
Furthermore, the proposed method was applied to different databases and obtained steady performance.
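The GLCM texture features that Berbar [74] and Nagarajan et al. [78] rely on can be illustrated with a
minimal sketch. It assumes a tiny grayscale image already quantized to four gray levels, a single
horizontal offset of one pixel, and only three common descriptors (contrast, energy, homogeneity).
The function names `glcm` and `texture_features` are hypothetical and are not taken from the reviewed
papers, which extract more features and apply their own preprocessing.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                if symmetric:
                    # Count the pair in both directions so the matrix is symmetric.
                    counts[j][i] += 1.0
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[value / total for value in row] for row in counts]


def texture_features(p):
    """Three illustrative texture descriptors computed from a normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "energy": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1.0 + (i - j) ** 2) for i, j in cells),
    }


# Toy 4x4 image with gray levels 0..3 (assumed data, for illustration only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
features = texture_features(p)
print(features)
```

In a pipeline like the ones summarized above, a feature vector of this kind would be computed per
sub-image and passed to a classifier such as an SVM.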
Zebari et al. / Journal of Applied Science and Technology Trends Vol. 01, No. 02, pp. 56 –70, (2020)
Rabin et al. [79] used a short-time Fourier transform (STFT) to extract features from
electromyography (EMG) signals of human hand movements. Because the space of the extracted features
was large, they used PCA and diffusion maps (DM) to reduce the dimensionality of the extracted
features, and they compared the performance of both methods with different sizes of the training set.
The researchers used KNN as a classifier in this study. The results indicated that the DM technique
outperformed the PCA method when the training set was limited. However, with a large training set,
both methods achieved high classification performance.

Kuncan et al. [80] worked on the diagnosis of bearing faults in rotary machines. They extracted
features from the vibration signals of bearings by applying their proposed method, called the
one-dimensional ternary pattern (1D-TP). The signals were collected from three datasets that differ
in size, speed, and parts. Moreover, Random Forest (RF), KNN, SVM, Bayes Net, and ANN were used as
classifiers with the ternary pattern. The results indicated that the proposed method could extract
efficient features from vibration signals for classification. Also, all the classifiers used attained
effective accuracy.

Liu et al. [81] built a feature extraction method that incorporates discriminant analysis and the
low-rank representation of the original data samples. The proposed supervised feature extraction
method is called discriminative low rank preserving projection (DLRPP). The performance of the
presented method was compared with seven other feature extraction methods (LPP, LSDA, DPSR, LRRDP,
LSPP, LRPP and FOLP) on images from six different datasets. The DLRPP method achieved better
performance in terms of recognition rate.

Ma and Yuan [82] proposed a method for extracting features from images based on a deep CNN and PCA.
They used a neural network to extract features. Due to the high dimension of the extracted features,
they improved and optimized the PCA algorithm by deep learning through simulation experiments. Then
the researchers compared the performance of the PCA before and after improvement. The memory usage
was more than 6000 MB before optimizing the algorithm and decreased to less than 1000 MB after
optimizing it. Also, there was a big difference in the time consumed by the PCA algorithm before and
after optimization. Besides the improved PCA being effective, the accuracy of the classification,
which was done by the SVM algorithm, was also enhanced.

Sellami and Farah [83] presented a combination of feature extraction and band selection methods to
reduce the dimensionality of hyperspectral images. They used several linear and nonlinear feature
extraction methods such as PCA, TLPP, KPCA and LE. Also, the researchers utilized the supervised and
unsupervised band selection methods MI, DM, CBS, and PA. The SVM algorithm was used as a classifier
in the proposed implementation. They combined both groups of algorithms as follows: TLPP/CBS,
PCA/CBS, KPCA/BS, PCA/MI, LE/CBS, TLPP/MI, LE/MI, and KPCA/MI. The authors used two datasets (the
first dataset's images were from Indian Pines and the second dataset's images were from Pavia
University). The results revealed that the classification accuracy of TLPP/CBS was better than that
of all other combinations on both datasets.

Chen et al. [84] proposed a study on X-ray images for determining bone age. They used a deep neural
network to extract features from the X-ray images. The extracted features include Gray Level
Co-occurrence Matrix (GLCM) features and Local Binary Patterns (LBP) features of the image. In more
detail, the features of the X-ray hand bone image could be extracted automatically using deep
learning, and the bone age was assessed automatically by the convolutional neural network. Moreover,
the PCA algorithm was used to reduce the dimensionality of the extracted features. The extracted
features were classified by the SVM classifier. The training, test, and verification data were built
from images captured from several males and females of different ages. The results showed that the
presented study achieved better performance than other methods in this field.

Jin et al. [85] proposed a feature extraction algorithm for JPEG steganalysis based on an adaptive
scale adjustment algorithm. In this algorithm, the scale of feature extraction was adjusted
adaptively according to the quality of the JPEG images. They mainly relied on the BOSSbase 1.01
database, and they applied their algorithm to the MD-CFR feature. The results showed that the
performance of the steganalysis was improved by the proposed method. Also, the dimensionality of the
extracted features was reduced by the presented method, and hence it could be used in other
steganalysis methods based on residual images.

Liu et al. [93] presented a graph-based feature extraction method that constructs an optimal
algorithm for semi-supervised learning. In particular, the presented method was a combination of
sparse representation, discriminative projection, and manifold learning for dimensionality reduction.
They designed their method to obtain semi-supervised feature extraction and the sparse structure of
local manifolds at the same time. Moreover, the optimal value was reached by modifying the similarity
matrix in each iteration. The experiment was executed on five datasets, and the performance of the
presented method was compared with six other methods (PCA, MSEC, DLSR, NLDLSR, SOGFS, and SDR). The
proposed method outperformed all other methods on all the datasets used.

Lin et al. [86] worked on discriminative graph signals to propose a feature extraction method that
could extract good features for the desired classification. A graph containing all the training
samples was established. Moreover, they used eigenvector decomposition in order to obtain the Fourier
basis of the graph. Numerous discriminative signals were extracted concurrently to achieve high
accuracy, especially in problems with multiple classes. The proposed method was evaluated in four
different experiments with several datasets. The results indicated that the presented method could
achieve encouraging performance, and it was considered to be more effective with supervised
classification.

Kasongo and Sun [87] proposed a method for wireless intrusion detection systems based on the
feed-forward Deep Neural Network (DNN). They used a wrapper feature extraction unit with the DNN
framework in order to extract the optimal feature vector. Then the extra trees algorithm was used as
a classifier. Moreover, two intrusion detection datasets (UNSW-NB15 and AWID) were used to examine
the efficiency of their method. The experiments were performed for two types of attack
classification, binary and multiclass. Furthermore, the researchers compared their method with RF,
KNN, SVM, DT, and NB. The detection accuracy of the proposed method outperformed the other methods,
and the attack classification by the feed-forward deep neural network was better than that of all
the above classifiers.

Liu and Sui [88] worked on different methods to minimize the dimensionality in content-based public
cultural video retrieval. The content features of the public cultural videos were extracted by using
a combination of the deep learning framework Caffe and the AlexNet network model. Due to the high
dimension of the extracted features, the researchers used PCA to reduce that dimensionality. The
researchers examined their work on several videos. The results indicated that the video contents of
the datasets used could be compressed effectively by the PCA algorithm, and only a minor part of the
video retrieval content was lost while lowering the dimensionality of the extracted features.

Dehzangi and Sahu [89] worked on human activity recognition. They used spectral and temporal analysis
to extract features from the Inertial Measurement Unit. Moreover, several feature extraction methods
were evaluated, particularly methods based on the time and frequency domains such as power spectral
density and autocorrelation. Also, a number of classifiers (DT, KNN, SVM, Neural network, and
Ensemble bagged) were utilized for human activity recognition in the proposed system. To reduce the
dimensionality of the extracted features, the researchers used PCA and KPCA. They also used feature
selection and transformation methods to reduce dimensionality. The results showed that using the
Random forest feature selection method with the Ensemble bagged classifier, and using Neighborhood
component analysis along with the Ensemble bagged classifier, achieved better accuracy.

Zhang et al. [90] proposed a system to minimize the dimensionality of hyperspectral images. They
relied on integrating a sparse graph with spatial information. They utilized PCA and entropy rate in
order to divide hyperspectral images into superpixel patches. Moreover, the training data of the
graph was constructed by using superpixel segmentation. Then, they extracted spatial-spectral
information when the sparse and low-rank graphs were generated on the obtained data. After that, they
used the kernel trick to transform the graph embedding to a nonlinear space and to map the input data
into a high-dimensional space. The proposed method was evaluated on two datasets (the Indian Pines
data set and the University of Pavia (Pavia-U) data set), and the SVM algorithm was used as a
classifier. The results show that the accuracy of the presented method is higher than that of other
methods.

Alipourfard et al. [91] worked on the high dimensionality of hyperspectral images and proposed a
system to reduce it. The proposed system was a combination of CNN and the subspace feature extraction
method. The authors reduced the dimensionality of the hyperspectral images by the subspace method in
order to generate high-quality training samples for the convolutional neural network and for the
logistic regression that they used as the classifier. Moreover, the presented method was examined by
the researchers on two well-known datasets (the Indian Pines and the Pavia University scenes). The
experimental results showed that the accuracy of the proposed method improved and reached higher
values, even with a limited number of training samples.
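Several of the studies summarized on these pages (Rabin et al., Ma and Yuan, Chen et al., Liu and
Sui, and Dehzangi and Sahu) apply PCA to shrink a large extracted-feature vector before
classification. The following is a minimal sketch of that step, not any of those authors'
implementations. It assumes a small in-memory feature matrix and finds only the first principal
component with power iteration; the function names `pca_first_component` and `project` and the toy
data are hypothetical.

```python
import math


def pca_first_component(samples, iterations=200):
    """Return (mean, unit direction, explained-variance ratio) of the top principal component."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[k] for row in samples) / n for k in range(d)]
    centered = [[row[k] - mean[k] for k in range(d)] for row in samples]
    # Sample covariance matrix of the centered features.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration. Assumption: the start vector is not orthogonal to the
    # leading eigenvector, which holds for the toy data below.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    total_variance = sum(cov[k][k] for k in range(d))
    ratio = eigenvalue / total_variance if total_variance > 0 else 0.0
    return mean, v, ratio


def project(samples, mean, direction):
    """Map each sample to its one-dimensional score along the component."""
    d = len(direction)
    return [sum((row[k] - mean[k]) * direction[k] for k in range(d)) for row in samples]


# Toy "extracted features": the second column is roughly twice the first,
# the third is low-variance noise (assumed data, for illustration only).
features = [
    [1.0, 2.0, 0.1],
    [2.0, 4.1, 0.0],
    [3.0, 5.9, 0.2],
    [4.0, 8.0, 0.1],
    [5.0, 10.1, 0.0],
]
mean, direction, ratio = pca_first_component(features)
scores = project(features, mean, direction)
print(direction, ratio, scores)
```

In this toy case one component keeps almost all of the variance, so three features can be replaced
by a single score before a classifier such as KNN or SVM is trained. Real feature sets usually need
more components, which a full eigendecomposition or a library PCA would provide.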
data. As a result, about 70% of the features were reduced, the execution time and the memory used
decreased, and the accuracy increased by 3%. Also, in research [84] the accuracy was improved by
around 7% by using CNN and PCA together. In [87] a DNN framework was used with a wrapper feature
extraction algorithm. In the datasets used, the dimensionality was reduced efficiently, and accuracy
increased by more than 6%. In [88] the dimensionality of the public culture videos was reduced
proficiently by the deep learning framework and the PCA algorithm. The classification accuracy was
enhanced by about 4% by the integrated CNN and feature subspace reduction in research [91].

Moreover, the PCA algorithm was used in 8 feature extraction methods in the literature: [70], [79],
[82], [83], [84], [88], [89] and [90]. In [55] the role of PCA was to reduce the redundant
information rather than to extract the features. Also, in [79], [83], [84], [88] and [89], the
dimensionality of the extracted features was reduced by the PCA, while in research [90] the PCA
algorithm was used for extracting the initial component in the process of converting the HSI images
into superpixel patches. In research [41] the PSO was used to minimize the dimensionality and enhance
the classification accuracy. The wavelet transform was utilized to decrease the number of extracted
features in [74]. Also, the study [76] showed that extracting the features under strong noise was
only accomplished by the FTFS when compared with envelope analysis and LMD. Further, the SVM algorithm
was the most used classifier in the feature extraction methods summarized in this paper; it was used
in 12 methods.

Among the reviewed methods that depend on the PCA algorithm, the best accuracy was achieved in
research [82], because the authors optimized the algorithm by deep learning. Also, the PCA enhanced
by the CNN in research [82] greatly reduced the computational time: it was 1300 without the CNN and
became 100 when the PCA algorithm was improved by the CNN. But in research [41] the computational
complexity of the SSI combined with PSO was higher than that of spectral region splitting (SRS).

selection and feature extraction methods. We observed that the trend among researchers for reducing
the dimensionality with feature selection methods is to use optimization algorithms, and about half
of the reviewed studies were relying on different optimization techniques. Also, the most used
classifiers are the SVM and KNN, and the best accuracy was achieved by the SVM algorithm. On the
other hand, for feature extraction methods, CNN and DNN techniques play a great role and have been
used in 7 methods of the studied research, while the PCA is still a widely used algorithm in feature
extraction works and has been used in 8 methods. Additionally, the optimized PCA could achieve better
performance in terms of accuracy, computational time, and the number of reduced features.

REFERENCES
[1] N. Sharma and K. Saroha, "Study of dimension reduction methodologies in data mining," in International Conference on Computing, Communication & Automation, 2015, pp. 133-137: IEEE.
[2] S. Ayesha, M. K. Hanif, and R. Talib, "Overview and comparative study of dimensionality reduction techniques for high dimensional data," Information Fusion, vol. 59, pp. 44-58, 2020.
[3] D. Q. Zeebaree, H. Haron, A. M. Abdulazeez, and S. R. Zeebaree, "Combination of K-means clustering with Genetic Algorithm: A review," International Journal of Applied Engineering Research, vol. 12, no. 24, pp. 14238-14245, 2017.
[4] Z. Cheng and Z. Lu, "A novel efficient feature dimensionality reduction method and its application in engineering," Complexity, vol. 2018, 2018.
[5] D. A. Zebari, H. Haron, D. Q. Zeebaree, and A. M. Zain, "A Simultaneous Approach for Compression and Encryption Techniques Using Deoxyribonucleic Acid," in 2019 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), 2019, pp. 1-6: IEEE.
[6] M. Li, H. Wang, L. Yang, Y. Liang, Z. Shang, and H. Wan, "Fast hybrid dimensionality reduction method for classification based on feature selection and grouped feature extraction," Expert Systems with Applications, vol. 150, p. 113277, 2020.
[7] A. P. Pandian, R. Palanisamy, and K. Ntalianis, Proceeding of the International Conference on Computer Networks, Big Data and IoT (ICCBI-2019). Springer Nature, 2020.
[14] M. Verleysen and D. François, "The curse of dimensionality in data mining and time series prediction," in International Work-Conference on Artificial Neural Networks, 2005, pp. 758-770: Springer.
[15] L. Liu and M. T. Özsu, Encyclopedia of database systems. Springer New York, NY, USA, 2009.
[16] A. Juvonen, T. Sipola, and T. Hämäläinen, "Online anomaly detection using dimensionality reduction techniques for HTTP log analysis," Computer Networks, vol. 91, pp. 46-56, 2015.
[17] X. Huang, L. Wu, and Y. Ye, "A Review on Dimensionality Reduction Techniques," International Journal of Pattern Recognition and Artificial Intelligence, vol. 33, no. 10, p. 1950017, 2019.
[18] D. L. Padmaja and B. Vishnuvardhan, "Comparative study of feature subset selection methods for dimensionality reduction on scientific data," in 2016 IEEE 6th International Conference on Advanced Computing (IACC), 2016, pp. 31-34: IEEE.
[19] M. B. Abdulrazzaq and J. N. Saeed, "A Comparison of Three Classification Algorithms for Handwritten Digit Recognition," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 58-63: IEEE.
[20] A. S. Eesa, Z. Orman, and A. M. A. Brifcani, "A novel feature-selection approach based on the cuttlefish optimization algorithm for intrusion detection systems," Expert Systems with Applications, vol. 42, no. 5, pp. 2670-2679, 2015.
[21] A. S. Eesa, A. M. A. Brifcani, and Z. Orman, "Cuttlefish algorithm-a novel bio-inspired optimization algorithm," International Journal of Scientific & Engineering Research, vol. 4, no. 9, pp. 1978-1986, 2013.
[22] P. Jindal and D. Kumar, "A review on dimensionality reduction techniques," International journal of computer applications, vol. 173, no. 2, pp. 42-46, 2017.
[23] A. S. Eesa, Z. Orman, and A. M. A. Brifcani, "A new feature selection model based on ID3 and bees algorithm for intrusion detection system," Turkish Journal of Electrical Engineering & Computer Sciences, vol. 23, no. 2, pp. 615-622, 2015.
[24] U. M. Khaire and R. Dhanalakshmi, "Stability of feature selection algorithm: A review," Journal of King Saud University-Computer and Information Sciences, 2019.
[25] S. Visalakshi and V. Radha, "A literature review of feature selection techniques and applications: Review of feature selection in data mining," in 2014 IEEE International Conference on Computational Intelligence and Computing Research, 2014, pp. 1-6: IEEE.
[26] C. M. Teng, "Combining noise correction with feature selection," in International Conference on Data Warehousing and Knowledge Discovery, 2003, pp. 340-349: Springer.
[27] H. Zhao, F. Min, and W. Zhu, "Cost-sensitive feature selection of numeric data with measurement errors," Journal of Applied Mathematics, vol. 2013, 2013.
[28] J. N. Saeed, "A SURVEY OF ULTRASONOGRAPHY BREAST CANCER IMAGE SEGMENTATION TECHNIQUES," Academic Journal of Nawroz University, vol. 9, no. 1, pp. 1-14, 2020.
[29] Y. Leung and Y. Hung, "A multiple-filter-multiple-wrapper approach to gene selection and microarray data classification," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 1, pp. 108-117, 2008.
[30] C. Lazar et al., "A survey on filter techniques for feature selection in gene expression microarray analysis," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 4, pp. 1106-1119, 2012.
[31] M. R. Mahmood and A. M. Abdulazeez, "A Comparative Study of a New Hand Recognition Model Based on Line of Features and Other Techniques," in International Conference of Reliable Information and Communication Technology, 2017, pp. 420-432: Springer.
[32] M. Dash and H. Liu, "Feature selection for classification," Intelligent data analysis, vol. 1, no. 3, pp. 131-156, 1997.
[33] D. Jain and V. Singh, "Feature selection and classification systems for chronic disease prediction: A review," Egyptian Informatics Journal, vol. 19, no. 3, pp. 179-189, 2018.
[34] D. Q. Zeebaree, H. Haron, A. M. Abdulazeez, and D. A. Zebari, "Machine learning and Region Growing for Breast Cancer Segmentation," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 88-93: IEEE.
[35] D. Zebari, H. Haron, and S. Zeebaree, "Security Issues in DNA Based on Data Hiding: A Review," International Journal of Applied Engineering Research, vol. 12, no. 24, ISSN 0973-4562, 2017.
[36] M. M. Kabir, M. M. Islam, and K. Murase, "A new wrapper feature selection approach using neural network," Neurocomputing, vol. 73, no. 16-18, pp. 3273-3283, 2010.
[37] Y. Peng, Z. Wu, and J. Jiang, "A novel feature selection approach for biomedical data classification," Journal of Biomedical Informatics, vol. 43, no. 1, pp. 15-23, 2010.
[38] Q. Shen, R. Diao, and P. Su, "Feature Selection Ensemble," Turing-100, vol. 10, pp. 289-306, 2012.
[39] M. K. Elhadad, K. M. Badran, and G. I. Salama, "A novel approach for ontology-based dimensionality reduction for web text document classification," International Journal of Software Innovation (IJSI), vol. 5, no. 4, pp. 44-58, 2017.
[40] D. A. Zebari, H. Haron, S. R. Zeebaree, and D. Q. Zeebaree, "Enhance the Mammogram Images for Both Segmentation and Feature Extraction Using Wavelet Transform," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 100-105: IEEE.
[41] S. H. A. Moghaddam, M. Mokhtarzade, and B. A. Beirami, "A feature extraction method based on spectral segmentation and integration of hyperspectral images," International Journal of Applied Earth Observation and Geoinformation, vol. 89, p. 102097, 2020.
[42] D. M. Sulaiman, A. M. Abdulazeez, H. Haron, and S. S. Sadiq, "Unsupervised Learning Approach-Based New Optimization K-Means Clustering for Finger Vein Image Localization," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 82-87: IEEE.
[43] R. Aziz, C. Verma, and N. Srivastava, "Dimension reduction methods for microarray data: a review," AIMS Bioengineering, vol. 4, no. 1, pp. 179-197, 2017.
[44] A. S. Eesa, A. M. Abdulazeez, and Z. Orman, "A DIDS Based on The Combination of Cuttlefish Algorithm and Decision Tree," Science Journal of University of Zakho, vol. 5, no. 4, pp. 313-318, 2017.
[45] Z. M. Hira and D. F. Gillies, "A review of feature selection and feature extraction methods applied on microarray data," Advances in bioinformatics, vol. 2015, 2015.
[46] A. M. Abdulazeez and A. S. Issa, "Intrusion detection system based on neural networks using bipolar input with bipolar sigmoid activation function," AL-Rafidain Journal of Computer Sciences and Mathematics, vol. 8, no. 2, pp. 79-86, 2011.
[47] D. A. Zebari, H. Haron, S. R. Zeebaree, and D. Q. Zeebaree, "Multi-Level of DNA Encryption Technique Based on DNA Arithmetic and Biological Operations," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 312-317: IEEE.
[48] O. M. S. Hassan, A. M. Abdulazeez, and V. M. TİRYAKİ, "Gait-based human gender classification using lifting 5/3 wavelet and principal component analysis," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 173-178: IEEE.
[49] F. P. Shah and V. Patel, "A review on feature selection and feature extraction for text classification," in 2016 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2016, pp. 2264-2268: IEEE.
[50] H. Sadeeq, A. Abdulazeez, N. Kako, and A. Abrahim, "A Novel Hybrid Bird Mating Optimizer with Differential Evolution for Engineering Design Optimization Problems," in International Conference of Reliable Information and Communication Technology, 2017, pp. 522-534: Springer.
[51] S. Chormunge and S. Jena, "Correlation based feature selection with clustering for high dimensional data," Journal of Electrical Systems and Information Technology, vol. 5, no. 3, pp. 542-549, 2018.
[52] P. Tan, X. Wang, and Y. Wang, "Dimensionality reduction in evolutionary algorithms-based feature selection for motor imagery brain-computer interface," Swarm and Evolutionary Computation, vol. 52, p. 100597, 2020.
[53] F. Hafiz, A. Swain, C. Naik, and N. Patel, "Efficient feature selection of power quality events using two dimensional (2D) particle swarms," Applied Soft Computing, vol. 81, p. 105498, 2019.
[54] X. Han, P. Liu, L. Wang, and D. Li, "Unsupervised feature selection via graph matrix learning and the low-dimensional space learning for classification," Engineering Applications of Artificial Intelligence, vol. 87, p. 103283, 2020.
[55] T. Niu, J. Wang, H. Lu, W. Yang, and P. Du, "Developing a deep learning framework with two-stage feature selection for multivariate financial time series forecasting," Expert Systems with Applications, vol. 148, p. 113237, 2020.
[56] D. Jain and V. Singh, "An efficient hybrid feature selection model for dimensionality reduction," Procedia Computer Science, vol. 132, pp. 333-341, 2018.
[57] E. S. Hosseini and M. H. Moattar, "Evolutionary feature subsets selection based on interaction information for high dimensional imbalanced data classification," Applied Soft Computing, vol. 82, p. 105581, 2019.
[58] Z. Manbari, F. AkhlaghianTab, and C. Salavati, "Hybrid fast unsupervised feature selection for high-dimensional data," Expert Systems with Applications, vol. 124, pp. 97-118, 2019.
[59] K. Qu, F. Gao, F. Guo, and Q. Zou, "Taxonomy dimension reduction for colorectal cancer prediction," Computational biology and chemistry, vol. 83, p. 107160, 2019.
[60] S. Umbarkar and S. Shukla, "Analysis of heuristic based feature reduction method in intrusion detection system," in 2018 5th International Conference on Signal Processing and Integrated Networks (SPIN), 2018, pp. 717-720: IEEE.
[61] F. Farokhmanesh and M. T. Sadeghi, "Deep Feature Selection using an Enhanced Sparse Group Lasso Algorithm," in 2019 27th Iranian Conference on Electrical Engineering (ICEE), 2019, pp. 1549-1552: IEEE.
[62] H.-T. Duong and V. T. Hoang, "Dimensionality Reduction Based on Feature Selection for Rice Varieties Recognition," in 2019 4th International Conference on Information Technology (InCIT), 2019, pp. 199-202: IEEE.
[63] A. F. Alharan, H. K. Fatlawi, and N. S. Ali, "A cluster-based feature selection method for image texture classification," Indonesian Journal of Electrical Engineering and Computer Science, vol. 14, no. 3, pp. 1433-1442, 2019.
[64] M. Z. Osman, M. A. Maarof, M. F. Rohani, K. Moorthy, and S. Awang, "Multi-Scale Skin Sample Approach for Dynamic Skin Color Detection: An Analysis," Advanced Science Letters, vol. 24, no. 10, pp. 7662-7667, 2018.
[65] Y. Arshak and A. Eesa, "A New Dimensional Reduction Based on Cuttlefish Algorithm for Human Cancer Gene Expression," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 48-53: IEEE.
[66] D. Q. Zeebaree, H. Haron, and A. M. Abdulazeez, "Gene selection and classification of microarray data using convolutional neural network," in 2018 International Conference on Advanced Science and Engineering (ICOASE), 2018, pp. 145-150: IEEE.
[67] O. Ahmed and A. Brifcani, "Gene Expression Classification Based on Deep Learning," in 2019 4th Scientific International Conference Najaf (SICN), 2019, pp. 145-149: IEEE.
[68] V. Balasaraswathi, "Enhanced Cuttle Fish Algorithm Using Membrane Computing for feature selection of intrusion detection," vol. 10, special issue, 2018.
[69] J. Kaur and S. Singh, "Feature selection using mutual information and adaptive particle swarm optimization for image steganalysis," in 2018 7th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), 2018, pp. 538-544: IEEE.
[70] A. Fatima, R. Maurya, M. K. Dutta, R. Burget, and J. Masek, "Android Malware Detection Using Genetic Algorithm based Optimized Feature Selection and Machine Learning," in 2019 42nd International Conference on Telecommunications and Signal Processing (TSP), 2019, pp. 220-223: IEEE.
[71] E. Widiyanti and S. N. Endah, "Feature Selection for Music Emotion Recognition," in 2018 2nd International Conference on Informatics and Computational Sciences (ICICoS), 2018, pp. 1-5: IEEE.
[72] G. Zeller et al., "Potential of fecal microbiota for early-stage detection of colorectal cancer," Molecular systems biology, vol. 10, no. 11, 2014.
[73] J. Zackular, M. Rogers, M. Ruffin 4th, and P. D. Schloss, "The human gut microbiome as a screening tool for colorectal cancer," Cancer Prev Res (Phila), vol. 7, no. 11, pp. 1112-21, 2014.
[74] M. A. Berbar, "Hybrid methods for feature extraction for breast masses classification," Egyptian Informatics Journal, vol. 19, no. 1, pp. 63-73, 2018.
[75] M. A. Rahman, M. F. Hossain, M. Hossain, and R. Ahmmed, "Employing PCA and t-statistical approach for feature extraction and classification of emotion from multichannel EEG signal," Egyptian Informatics Journal, vol. 21, no. 1, pp. 23-35, 2020.
[76] C. Chu, Z. Zuo-xi, K. Xin-rong, and G. Yun-zhi, "The Research of Machinery Fault Feature Extraction Methods Based On Vibration Signal," IFAC-PapersOnLine, vol. 51, no. 17, pp. 346-352, 2018.
[77] Y. Li, Y. Chai, H. Zhou, and H. Yin, "A novel feature extraction method based on discriminative graph regularized autoencoder for fault diagnosis," IFAC-PapersOnLine, vol. 52, no. 24, pp. 272-277, 2019.
[78] V. Nagarajan, E. C. Britto, and S. M. Veeraputhiran, "Feature extraction based on empirical mode decomposition for automatic mass classification of mammogram images," Medicine in Novel Technology and Devices, vol. 1, p. 100004, 2019.
[79] N. Rabin, M. Kahlon, S. Malayev, and A. Ratnovsky, "Classification of human hand movements based on EMG signals using nonlinear dimensionality reduction and data fusion techniques," Expert Systems with Applications, vol. 149, p. 113281, 2020.
[80] M. Kuncan, K. Kaplan, M. R. Minaz, Y. Kaya, and H. M. Ertunç, "A novel feature extraction method for bearing fault classification with one dimensional ternary patterns," ISA Transactions, vol. 100, pp. 346-357, 2020.
[81] Z. Liu, J. Wang, G. Liu, and L. Zhang, "Discriminative low-rank preserving projection for dimensionality reduction," Applied Soft Computing, vol. 85, p. 105768, 2019.
[82] J. Ma and Y. Yuan, "Dimension reduction of image deep feature using PCA," Journal of Visual Communication and Image Representation, vol. 63, p. 102578, 2019.
[83] A. Sellami and M. Farah, "Comparative study of dimensionality reduction methods for remote sensing images interpretation," in 2018 4th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), 2018, pp. 1-6: IEEE.
[84] X. Chen, J. Li, Y. Zhang, Y. Lu, and S. Liu, "Automatic feature extraction in X-ray image based on deep learning approach for determination of bone age," Future Generation Computer Systems, 2019 Oct 31.
[85] Z. Jin, G. Feng, Y. Ren, and X. Zhang, "Feature Extraction Optimization of JPEG Steganalysis Based on Residual Images," Signal Processing, vol. 170, p. 107455, 2020.
[86] W. Lin, J. Huang, C. Y. Suen, and L. Yang, "A feature extraction model based on discriminative graph signals," Expert Systems with Applications, vol. 139, p. 112861, 2020.
[87] S. M. Kasongo and Y. Sun, "A deep learning method with wrapper based feature extraction for wireless intrusion detection system," Computers & Security, vol. 92, p. 101752, 2020.
[88] Y. Liu and A. Sui, "Research on Feature Dimensionality Reduction in Content Based Public Cultural Video Retrieval," in 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), 2018, pp. 718-722: IEEE.
[89] O. Dehzangi and V. Sahu, "IMU-Based Robust Human Activity Recognition using Feature Analysis, Extraction, and Reduction," in 2018 24th International Conference on Pattern Recognition (ICPR), 2018, pp. 1402-1407: IEEE.
[90] X. Zhang et al., "Spatial-Spectral Graph-Based Nonlinear Embedding Dimensionality Reduction for Hyperspectral Image Classificaiton," in IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, 2018, pp. 8472-8475: IEEE.
[91] T. Alipourfard, H. Arefi, and S. Mahmoudi, "A novel deep learning framework by combination of subspace-based feature extraction and convolutional neural networks for hyperspectral images classification," in IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, 2018, pp. 4780-4783: IEEE.
[92] D. Q. Zeebaree, H. Haron, A. M. Abdulazeez, and D. A. Zebari, "Trainable Model Based on New Uniform LBP Feature to Identify the Risk of the Breast Cancer," in 2019 International Conference on Advanced Science and Engineering (ICOASE), 2019, pp. 106-111: IEEE.
[93] Z. Liu, Z. Lai, W. Ou, K. Zhang, and R. Zheng, "Structured optimal graph based sparse feature extraction for semi-supervised learning," Signal Processing, vol. 170, p. 107456, 2020.
[94] A. M. Martinez, "The AR face database," CVC Technical Report 24, 1998.
[95] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, "The FERET database and evaluation procedure for face-recognition algorithms," Image and Vision Computing, vol. 16, no. 5, pp. 295-306, 1998.
[96] Y. Xu, X. Li, J. Yang, Z. Lai, and D. Zhang, "Integrating conventional and inverse representation for face recognition," IEEE Transactions on Cybernetics, vol. 44, no. 10, pp. 1738-1746, 2013.
[97] F. S. Samaria and A. C. Harter, "Parameterisation of a stochastic model for human face identification," in Proceedings of 1994 IEEE Workshop on Applications of Computer Vision, 1994, pp. 138-142: IEEE.