Samantaray2022 Article PredictionOfGroundwater-levelU
https://doi.org/10.1007/s12517-022-09900-y
ORIGINAL PAPER
Received: 17 August 2021 / Accepted: 15 March 2022 / Published online: 7 April 2022
© Saudi Society for Geosciences 2022
Abstract
Accurate and reliable prediction of groundwater level (GWL) fluctuation is vital and significant in water resources planning
and management. Because of intricacies in underground geological arrangement, efficiency of real-time GWL prediction is
inadequate. In the present study, a hybrid support vector machine (SVM) incorporated with ant lion optimizer (SVM-ALO)
was employed to estimate monthly GWL using data collected from two observational wells in Purba-Medinipur, India,
considering ten input combinations. The accuracy of hybrid SVM-ALO model is assessed against hybrid SVM-FOA (fruit
fly optimization algorithm), SVM-FFA (firefly algorithm), and conventional SVM models using five statistical performance
indices, i.e. root mean-squared error (RMSE), Willmott Index (WI), Pearson’s correlation coefficient (PCC), coefficient of
determination (R2), and graphical analysis. The results indicated that the SVM-ALO model outperformed the SVM-FOA
and SVM-FFA models for all scenarios at both stations. The SVM-ALO-M10 model had the lowest RMSE (7.6638/8.838) and
the highest PCC (0.9815/0.98001) and WI (0.98215/0.98067) for the testing period at Sherkhanchawk/Basantia high
school, respectively. The developed SVM-ALO model is more efficient and appropriate than the SVM-FOA, SVM-FFA, and
standalone SVM models for estimating monthly groundwater levels in the study region.
Keywords Ant lion optimizer · Fruit fly optimization · Firefly algorithm · Ground water · India
723 Page 2 of 22 Arab J Geosci (2022) 15: 723
2012). A suitable alternative to these models is machine learning models, which can provide precise prediction
outcomes with less cost and effective calibration time where availability of data is scarce and the physical
process is not the main study motivation (Mohanty et al. 2015). Among several data-driven techniques, artificial
neural networks (ANNs) and SVM are considered conventional nonlinear predictors, surpassing barriers of physical
models (Behzad et al. 2010; Samantaray et al. 2020b; Das et al. 2019). However, ANN models are known to suffer
from overfitting and from local minima. To overcome these shortcomings, the SVM model can be utilised (Hosseini
and Mahjouri 2016). Comparative studies of ANN and SVM have been conducted for groundwater level prediction.

Previous studies

Performances of several data-driven techniques were evaluated for GWL forecasting at different locations for
different prediction periods (Jalalkamali et al. 2011; Moosavi et al. 2013; Shirmohammadi et al. 2013; Samantaray
et al. 2020a). Yoon et al. (2016) utilised a weighted error function method for improving performances of ANN and
SVM to predict GWL on a long-term basis in South Korea with respect to rainfall. Outcomes revealed that the
weighted error function improved accuracy and stability of the prediction models, with the SVM model outperforming
the ANN model. Nie et al. (2017) applied radial basis function network (RBFN) and SVM for simulating GWL
variations in the Da’an region of China. Based on assessment of results using statistical indices, it was found
that the SVM model had superior capability to simulate and predict GWL at a specified location. Sattari et al.
(2018) utilised M5 decision tree and SVM for GWL prediction in Ardebil plain, Iran. Results indicated that both
models performed well for GWL prediction at the proposed location. Tang et al. (2019) proposed a new two-stage
data-driven structure for modelling time-series GWL in north county of UK with spatiotemporal study and least
square SVM (LS-SVM). Results of LS-SVM were compared with ANN, k-nearest neighbour, random forest, and
conventional SVM methods. They concluded that LS-SVM outperformed other data-driven models in GWL prediction.
Wang et al. (2018) established a customary decision-tree-based model for GWL prediction appropriate for conditions
where low-dimension input data are obtainable. Findings from their study indicated that the proposed model
produced higher accuracy in forecasting GWL variations.

When deciding on water management, hydrologists consider results from various kinds of techniques, which help them
in achieving their goal. Relying on a single approach can be risky, particularly in hydrology, since outcomes of
the decisions affect people’s livelihood in an area. Non-stationarity and non-linearity of hydrological
applications cannot be captured by conventional linear models. To improve the accuracy of predicted results, a
common practice is to integrate conventional techniques with a meta-heuristic optimization algorithm, forming a
robust hybrid model in which distinctive features of all models are integrated for capturing various patterns in
data. Several nature-inspired algorithms such as particle swarm optimisation (PSO), FFA, FOA, Multi-Verse
Optimiser (MVO), Grey Wolf Optimiser (GWO), ALO, and Whale Optimisation Algorithm (WOA) have successfully been
employed in fields of hydrology and water resources for optimization, modelling, prediction, and forecasting
(Samantaray and Sahoo 2020; Ferreira and Cunha 2020; Malik et al. 2020; Mattar 2018; Samantaray et al. 2020c;
Granata 2019; Sanikhani et al. 2018; Karbasi 2018). Assessments based on theoretical and empirical outcomes
suggest that hybrid techniques can be efficient and effective in increasing forecasting capability (Malekzadeh
et al. 2019). In the current study, the SVM method is coupled with the ALO optimisation algorithm to increase the
efficacy of the model.

Dash et al. (2010) proposed an integrated ANN-GA (genetic algorithm) model to accurately predict GWL in lower
Mahanadi river of Orissa, India, and compared its performance with other ANN algorithms. Results revealed that
the hybrid ANN-GA algorithm produced better performance and accuracy in medium and high GWL prediction than
conventional ANN methods. Suryanarayana et al. (2014) proposed wavelet-based SVM (W-SVM) and ANN (W-ANN) models
for predicting monthly GWL fluctuations in Visakhapatnam city situated in Andhra Pradesh, India. Results indicated
that the W-SVM model provided better accuracy in GWL prediction than conventional ANN, SVM, and auto-regressive
integrated moving average (ARIMA) models. Al-Shammari et al. (2016) proposed an SVM-FFA model for predicting
dew-point temperature (Tdew) of Isfahan Province in Iran on a daily basis and compared its prediction capability
against ANN, genetic programming (GP), and SVM models. Ghorbani et al. (2017) applied a hybrid SVM-FFA model for
predicting permanent wilting point and soil field capacity utilising certain easily accessible soil properties.
Zare and Koch (2018) comparatively evaluated the potential of ANFIS and Wavelet-ANFIS for simulating and
predicting GWL fluctuations in Miandarband plain, Iran. Results showed that the applied models can be utilised
with suitable accuracy for GWL prediction, with W-ANFIS performing slightly better. Huang et al. (2017) applied an
SVM-PSO model based on chaotic theory for predicting GWLs at two locations in China and assessed its performance
by comparing linear SVM-PSO and chaotic BPNN models. Results revealed that chaotic SVM-PSO model had
higher prediction accuracy compared to other models. Ebrahimi and Rajaee (2017) explored the potential of
wavelet-based ANN (W-ANN), SVM (W-SVM), and multi linear regression (W-MLR) models in simulating GWL considering
groundwater data from two wells located in Qom floodplain, Iran. They observed that implementation of wavelet
analysis improved performance of simple neural network models in GWL simulation. Moazenzadeh et al. (2018) aimed
at predicting evaporation on a daily scale at two meteorological stations in northern Iran using hybrid SVM-FFA
and simple SVM models. Mehr et al. (2019) proposed a hybrid SVM-FFA technique for 1-month ahead rainfall forecasts
at two rain gauging stations situated in the north-western region of Iran. Results from the above studies
indicated that the proposed SVM-FFA model provided the most encouraging predictions, with more accuracy and
reliability, outperforming other applied models. Wang et al. (2019) developed a water quality mechanism model
based on FOA which focuses on studying eutrophication risk assessment techniques of reservoirs and lakes. Moayedi
et al. (2019) applied ANN-SHO (spotted hyena optimizer) and ANN-ALO hybrid models for simulating shear strength of
soil. Comparison of results revealed that the ANN-ALO model performed more effectively than the ANN-SHO model in
the prediction of soil shear strength. Samadianfard et al. (2019) proposed SVM, SVM-FOA, and M5 model tree for
forecasting river flow in Urmia basin, Iran. Performance of the applied models was evaluated based on different
statistical indicators, and results showed that the SVM-FOA model had the best forecasting performance of river
flow at the proposed study location. Chen et al. (2020) developed a model for predicting dam deformation by
applying a hybrid neural network model integrating LS-SVM with ALO to assess the operational state of concrete
dams. Results confirmed that the proposed hybrid model outperformed other applied models in dam deformation
prediction. Tikhamarine et al. (2020) employed hybrid SVM-WOA, SVM-ALO, and SVM-MVO (Multi-Verse Optimizer) for
estimating monthly ETo at Tlemcen and Algiers meteorological stations situated in northern Algeria. Based on
assessment of the proposed model performances, they found that the SVM-WOA model outperformed the SVM-ALO and
SVM-MVO models in all scenarios. Cao et al. (2020) aimed at predicting GWL of Tangjiao landslide, China by
applying single-factor SVM-GA, multi-factor back propagation neural network (BPNN), and SVM-GA. Findings suggested
that multi-factor SVM-GA performed better than single-factor SVM-GA and multi-factor BPNN. Banadkooki et al.
(2020) applied genetic programming (GP), hybrid RBF-WOA, and hybrid multilayer perceptron (MLP-WOA) models for
predicting GWL at Yazd province in Iran. Results revealed that the MLP-WOA model performed the best among all in
GWL prediction. Seidu et al. (2021) applied WT-SaDE-ELM (wavelet transform-self adaptive differential
evolutionary-extreme learning machine), variational mode decomposition-self adaptive differential-ELM, and
empirical wavelet transform-self adaptive differential ELM to estimate and predict GWLs in sub-Saharan Africa.
Rezaei et al. (2021) integrated MODFLOW with shark smell optimization (SSO), PSO, and FFA to estimate monthly GWL
in Dezful-Andimeshk plain, Iran. Results revealed that MODFLOW-SSO estimated GWL better than MODFLOW-PSO,
MODFLOW-FFA, and MODFLOW. Cui et al. (2021) proposed the ANFIS-IAGWO (Improved Alpha-Guided Grey Wolf
optimisation) algorithm for predicting GWL reliably in a severely irrigated area of Northwest Bangladesh.
Effectiveness of the proposed model was compared with ANN, ANFIS, and ANFIS-PSO models, and it was found that
ANFIS-IAGWO showed better performance in GWL prediction.

This study aims to predict monthly GWL at the coastal region of Purba Medinipur situated in West Bengal, India,
using a novel SVM-ALO model. The efficiency of the hybrid SVM-ALO model is evaluated by comparing it with hybrid
SVM-FOA, SVM-FFA, and conventional SVM models. Strength of SVM lies in its capability to optimise SVM parameters
like bias values and weight connections. In the present work, an innovative ALO algorithm was utilised for
training the SVM model. It is worth stating that in the present work, performance of the applied hybrid SVM models
was studied for the first time for monthly GWL prediction at the selected study area. The working procedure of the
present study is illustrated in the form of a flowchart, as shown in Fig. 1.

Study area

The district of Purba Medinipur (Fig. 2) is a part of the lower Eastern coastal plains and Indo-Gangetic Plain. It
is surrounded by South 24 Parganas district and River Hooghly to the east; Bay of Bengal in the south; Paschim
Medinipur district at the north and west border; and Odisha state to the southwest. Elevation of this district is
within 10 m above mean sea level (MSL). The coastal region of East Medinipur district is considered for this
study, which is situated near the Bay of Bengal and at the foothill of Rajmahal and Singhbhum. Precise location of
the wells (Sherkhanchawk and Basantia high school) considered for the present study in East Medinipur is
illustrated in Fig. 2, which falls between geographical location 21°38′ N and 22°30′ N latitudes to 87°27′ E and
88°11′ E longitudes. The coastal region of East Medinipur was mostly built by stratified arranged layers formed by
steady deposition of boulders, silts, cobbles, and gravels transported through Ganga, Bhagirathi, and other
tenacious rivers. Basic statistical indicators of applied data are specified in Table 1. Figure 3 shows the
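$$
\text{subject to}\quad
\begin{cases}
d_i - \left(\omega\,\varphi(x_i) + b_i\right) \le \varepsilon + \xi_k \\
\omega\,\varphi(x_i) + b_i - d_i \le \varepsilon + \xi_k^{*} \\
\xi_k,\ \xi_k^{*} \ge 0, \quad i = 1, \dots, l
\end{cases}
$$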
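To make the role of the slack variables concrete, the following minimal Python sketch computes the smallest ξ and ξ* that satisfy the constraints above for a given tolerance ε. The function name, the sample numbers, and the use of a precomputed prediction in place of ωφ(x_i) + b_i are illustrative assumptions and are not taken from the study.

```python
def slack_variables(targets, predictions, eps):
    """Smallest slacks satisfying the epsilon-insensitive constraints.

    targets      -- observed values d_i
    predictions  -- model outputs, standing in for w*phi(x_i) + b_i (assumed given)
    eps          -- half-width of the insensitive tube
    Returns two lists: xi (target above the tube) and xi_star (target below it).
    """
    xi = [max(0.0, d - y - eps) for d, y in zip(targets, predictions)]
    xi_star = [max(0.0, y - d - eps) for d, y in zip(targets, predictions)]
    return xi, xi_star


# Hypothetical groundwater levels (m) and predictions, tube half-width 0.1 m
d = [3.0, 3.0, 3.0]
y_hat = [3.05, 2.5, 3.6]
xi, xi_star = slack_variables(d, y_hat, eps=0.1)
print(xi, xi_star)
```

Points inside the tube (the first sample) incur no slack, while under- and over-predictions beyond ε are penalised through ξ and ξ*, respectively.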
FFA
Table 1 Basic statistics of training, testing, and total available datasets

Statistical parameters   Training set (480)   Testing set (120)   Total data set (600)
Sherkhanchawk
  Min                    0.02                 1.0090576           0.02
  Max                    7.913582903          6.986976797         7.913582903
  Mean                   3.581144709          3.320436032         3.529002974
  Kurt                   −0.767513811         −0.956915936        −0.764489914
  Skew                   0.372053368          0.417876741         0.393185553
  SD                     1.789036933          1.586420352         1.752236204
Basantia high school
  Min                    0.056                1.02826832          0.056
  Max                    7.889394405          7.839543172         7.889394405
  Mean                   3.560599631          3.493485156         3.547176736
  Kurt                   −0.850979953         −0.80434208         −0.847847889
  Skew                   0.368161153          0.54065221          0.400512799
  SD                     1.841958962          1.815904858         1.835466534

Table 2 Initial parameter settings for the algorithms

Algorithm   Parameter                          Value
FFA         Population (β)                     50, 100, 150, 200, 250, 300
            Attractiveness (β0)                0.9
            Light absorption coefficient (γ)   0.7
            Randomisation parameter (α)        0.3
            Number of iterations               500
FOA         Location range (LR)                [0, 1]
            maxgen                             300
            sizepop                            30
            Flight range (FR)                  [−10, 10]
ALO         Number of antlions                 50
            Maximum number iterations          100

FOA

The fruit fly is an insect found widely in tropical climatic regions around the world. It is superior to other
species in terms of osphresis (sense of smell) and vision. In the course of food hunting, a fruit fly first smells
a specific odour using its osphresis organs, sends and obtains information from its neighbours, and compares
fitness with the present best position. Fitness values are identified by flies using their taste, and they then
fly towards the location having improved fitness. They utilise their delicate vision for finding food and fly
further in that direction. Pan (2012) introduced an evolutionary computational technique called FOA, based on the
food-finding characteristics of a fruit fly swarm. The process involved in FOA can be described in the following
steps.

Step 1: Essential factors for initialisation comprise size of population (Sizepop), number of iterations
(MaxFEs), and location of preliminary population (Xaxis, Yaxis).

$$
X_{axis} = \mathrm{rand}(), \qquad Y_{axis} = \mathrm{rand}() \tag{8}
$$

Step 2: Entities perform global search in arbitrary direction and arbitrary distance.

$$
X_i = X_{axis} + \mathrm{RandValue}, \qquad Y_i = Y_{axis} + \mathrm{RandValue} \tag{9}
$$

Step 3: Compute the concentration of the food’s taste in air.

$$
Dist_i = \sqrt{X_i^2 + Y_i^2} \tag{10}
$$

$$
S_i = \frac{1}{Dist_i} \tag{11}
$$

Step 4: Decision value of every individual’s position is substituted into the fitness function for figuring taste
value.
Step 6: All entities utilise their visual organs for flying to the best individual location.

$$
Smellbest = bestSmell, \qquad X_{axis} = X(bestIndex), \qquad Y_{axis} = Y(bestIndex) \tag{14}
$$

Step 7: For iterative search repeat steps 2 to 5. When the concentration of smell reaches the preset accuracy
value or the iteration count reaches the maximum iteration number, the circulation halts.

ALO

where rand is the arbitrary number between 0 and 1. Hunting procedure of ALO can be mathematically explained by
the following equation:

$$
X_i(t) = \frac{\left(X_i(t) - \alpha_i\right) \times \left(d_i(t) - c_i(t)\right)}{b_i - \alpha_i} + c_i(t) \tag{17}
$$

where bi and αi are the maximum and minimum of the random walk corresponding to the ith variable, respectively;
di(t) and ci(t) are the maximum and minimum of the ith variable at the tth iteration, which can be described as:

$$
c_i(t) = \mathrm{Antlion}_j(t) + c(t) \tag{18}
$$
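The rescaling in Eq. 17 can be illustrated with a short Python sketch. It assumes, following the usual description of ALO (Mirjalili 2015), that an ant's random walk is the cumulative sum of +1/−1 steps chosen by comparing rand with 0.5; that walk definition, the function names, and the bounds used below are assumptions made for illustration only.

```python
import random


def random_walk(n_steps, rng):
    """Cumulative sum of +1/-1 steps starting at 0 (assumed walk form)."""
    walk = [0.0]
    for _ in range(n_steps):
        step = 1.0 if rng.random() > 0.5 else -1.0
        walk.append(walk[-1] + step)
    return walk


def normalise_walk(walk, c_t, d_t):
    """Eq. 17: map a walk with range [alpha, b] onto the bounds [c_t, d_t]."""
    alpha, b = min(walk), max(walk)
    if b == alpha:  # degenerate walk; avoid division by zero
        return [c_t for _ in walk]
    return [(x - alpha) * (d_t - c_t) / (b - alpha) + c_t for x in walk]


rng = random.Random(0)            # fixed seed so the sketch is repeatable
walk = random_walk(50, rng)
scaled = normalise_walk(walk, c_t=2.0, d_t=5.0)   # hypothetical bounds
print(min(scaled), max(scaled))
```

After rescaling, every position of the walk lies inside the current bounds of the variable, which is what keeps the ants within the search space as the bounds shrink around an antlion.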
ant by Antlion and reconstructing pit for catching new prey can be defined using the following formula:

$$
\mathrm{Antlion}_j(t) =
\begin{cases}
\mathrm{Ant}_i(t) & \text{if } f(\mathrm{Ant}_i(t)) > f(\mathrm{Antlion}_j(t)) \\
\mathrm{Antlion}_j(t) & \text{if } f(\mathrm{Ant}_i(t)) \le f(\mathrm{Antlion}_j(t))
\end{cases} \tag{23}
$$

where Anti(t) is the location of the specified ith ant at the tth iteration.

Figures 5, 6, and 7 show the pseudo-code of FFA, FOA, and ALO, respectively. Figure 8 represents a general
execution procedure of SVM-based models optimised by FFA, FOA, and ALO optimisation techniques. In addition, it
shows optimisation processes employed by each optimisation technique, i.e. FFA, FOA, and ALO, and their roles in
three hybrid models of SVM-FFA, SVM-FOA, and SVM-ALO for GWL prediction.

Model construction and performance measures

The time series was divided into two groups: the first set corresponds to the period 1970–2009, which is 80% of
the total data and was used for training, and the second set, corresponding to the period from 2010 to 2019 (20%
of the data), was used for testing the network, permitting inside cross-validation on a part of the data set. GWL
data (1970–2019) was collected from the Central Ground Water Board (CGWB), Eastern Region, Kolkata. In the present
investigation, to forecast GWL using the proposed methods (SVM, SVM-FFA, SVM-FOA, and SVM-ALO), various time
delays of historic GWL data collected from observation wells are taken into consideration as input combinations.
Number of variables is from 1 to 6 because of statistical lagging effect of monthly and annually varying patterns
in the historic GWL dataset. Considered models based on input combinations are given below:

M1 (Model 1): h(t + 1) = f(h(t − 1))
M2 (Model 2): h(t + 1) = f(h(t − 1), h(t − 2))
M3 (Model 3): h(t + 1) = f(h(t − 1), h(t − 2), h(t − 3))
M4 (Model 4): h(t + 1) = f(h(t − 1), h(t − 2), h(t − 3), h(t − 4))
M5 (Model 5): h(t + 1) = f(h(t − 1), h(t − 2), h(t − 3), h(t − 4), h(t − 5))
M6 (Model 6): h(t + 1) = f(h(t − 1), h(t − 12))
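The input combinations listed above can be turned into training matrices with a few lines of Python. The sketch below is a hedged illustration: the helper names, the synthetic series, and the fraction-based split are assumptions (the study splits by calendar period, 1970–2009 versus 2010–2019), and only the six lag sets that are legible in the text are included.

```python
# Lag sets for the input combinations M1..M6 as listed in the text
LAGS = {
    "M1": [1],
    "M2": [1, 2],
    "M3": [1, 2, 3],
    "M4": [1, 2, 3, 4],
    "M5": [1, 2, 3, 4, 5],
    "M6": [1, 12],
}


def make_lagged(series, lags):
    """Build rows ([h(t - lag) for lag in lags], h(t + 1)) wherever all terms exist."""
    max_lag = max(lags)
    inputs, targets = [], []
    for t in range(max_lag, len(series) - 1):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(series[t + 1])
    return inputs, targets


def chronological_split(inputs, targets, train_fraction=0.8):
    """Keep time order: the first part trains, the remainder tests (no shuffling)."""
    n_train = int(len(inputs) * train_fraction)
    return (inputs[:n_train], targets[:n_train]), (inputs[n_train:], targets[n_train:])


# Synthetic stand-in for 600 monthly GWL values (1970-2019); not real data
series = [float(i) for i in range(600)]
X, y = make_lagged(series, LAGS["M6"])
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
print(len(train_X), len(test_X), X[0], y[0])
```

Keeping the split chronological avoids leaking future observations into training, which matters for lag-based forecasting.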
Fig. 8 Flowchart of hybrid models based on SVM optimised by FFA, FOA, and ALO techniques (flowchart labels: data
interpretation; initialization of parameter; SVM-FFA model; SVM-FOA model; SVM-ALO model; prediction of GWL)
$$
WI = 1 - \left[ \frac{\sum_{i=1}^{N} (f_o - f_i)^2}{\sum_{i=1}^{N} \left( \left| f_i - \bar{f}_o \right| + \left| f_o - \bar{f}_o \right| \right)^2} \right], \qquad 0 < WI \le 1 \tag{26}
$$

$$
PCC = \frac{\sum_{i=1}^{N} (f_o - \bar{f}_o)(f_i - \bar{f}_i)}{\sqrt{\sum_{i=1}^{N} (f_o - \bar{f}_o)^2 \, \sum_{i=1}^{N} (f_i - \bar{f}_i)^2}}, \qquad -1 < PCC < 1 \tag{27}
$$

$$
R^2 = 1 - \frac{\sum_{i=1}^{N} (f_i - f_o)^2}{\sum_{i=1}^{N} (f_i - \bar{f}_i)^2}, \qquad 0 < R^2 < 1 \tag{28}
$$

where f_i is the predicted GWL value,
f_o is the observed GWL value,
\bar{f}_i is the mean of predicted GWL values,
\bar{f}_o is the mean of observed monthly GWL values, and
N is the number of data points.

The hybrid SVM models with the lowest RMSE value and the highest values of PCC, NSE, R2, and WI are considered the
best performing for monthly GWL estimation.

Results

For assessing the impact of various input lag times on performance of the prediction models, a primary
investigation of GWL prediction on 2 wells is reported in this section. Based on the results of the preliminary
investigation, a more in-depth regional analysis was carried out where the two observational wells are situated.
For this purpose, hybrid neural network models were applied as a nonlinear modelling system for producing a model
which can predict GWL based on input combinations. Efficiency of the models was assessed by estimating RMSE, WI,
PCC, NSE, and R2 values. Five performance measures (Eqs. 25, 26, 27, and 28) were implemented for assessing
performance of the proposed SVM models. GWL prediction results of hybrid SVM models at Sherkhanchawk and Basantia
high school wells are presented in Tables 3 and 4. It was observed that there was a significant improvement in the
performance of the conventional SVM model when its parameters were optimised utilising ALO in comparison to FOA
and FFA.

In the testing period, the RMSE value was found to be 42.7583–39.5638 for the SVM model, while it ranged from
38.6627 to 28.7893 for SVM-FFA, from 29.9672 to
17.3726 for the SVM-FOA models, and from 13.2872 to 7.6638 for the SVM-ALO models, respectively, at Sherkhanchawk
station. Similarly, the best values of WI, PCC, and R2 were found to be 0.98215, 0.9815, and 0.97846,
respectively, with model SVM-ALO-M10. By comparison, the best values of WI, PCC, and R2 were found with
SVM-FFA-M10 (0.9558, 0.95211, and 0.94347) and SVM-FOA-M10 (0.97031, 0.96891, and 0.96223).

Likewise, the results obtained in the testing period for Basantia high school are provided in Table 4. For the ten
models, the RMSE value ranges from 43.5728 to 39.7724 for the SVM modelling approach, while for the SVM-FFA,
SVM-FOA, and SVM-ALO models, it ranged from 39.0275 to 32.3358, from 31.6894 to 18.9063, and from 17.0025 to
8.838, respectively. The best values of WI, PCC, and R2 were found to be 0.98067, 0.98001, and 0.97628,
respectively, considering model SVM-ALO-M10. By comparison, the best values of RMSE, WI, PCC,
and R2 obtained with SVM-FFA-M10 and SVM-FOA-M10 are 32.3358, 0.9519, 0.94893, and 0.94076 and 18.9063, 0.9693,
0.9671, and 0.9602, respectively. The best performing models during the testing phase at Sherkhanchawk and
Basantia high school stations are highlighted in bold in Tables 3 and 4.

Robustness of the SVM-ALO model was assessed in the regional investigation, attaining suitable values of
statistical indicators in the training and testing phases, which proved to be the most challenging circumstances
for GWL prediction. It can be understood from Tables 3 and 4 that SVM-ALO performs best for all models (different
input combinations) with minimum RMSE value, with SVM showing the worst performance in GWL prediction. Based on
Tables 3 and 4, it is evident that SVM-ALO contributes to higher prediction accuracy than the SVM-FOA and SVM-FFA
algorithms. The overall computational outcomes demonstrate that SVM-ALO is more efficient compared to the other
hybrid and conventional models tested in monthly GWL prediction. It is evident that
as the time delay increases, the prediction accuracy of the SVM-ALO models (M1–M10) increases. Hence, SVM-ALO has
proven to be a reliable tool to investigate fluctuations in wells positioned on various hydrogeological
arrangements. In particular, the sub-regional investigation emphasised differences among wells located in arid
regions.

Figure 9 illustrates the coefficient of determination (R2) plot for the best performing model of the four proposed
techniques at Sherkhanchawk and Basantia high school. Firstly, R2 values for the four applied algorithms in ten
different scenarios/models were found. It is observed that R2 for the SVM-ALO-M10 model is superior to the other
nine models (Fig. 9). Closeness of the R2 value to 1 suggests that observed and predicted values are fitted well.
From Fig. 9, it is observed that the SVM-ALO model fits the data well with precise GWL predictions. The high value
of R2 and low value of RMSE obtained by the SVM-ALO model indicate that the data are well fitted. Outcomes from
the assessment of data indicated that the SVM-ALO model is the best fitting for GWL prediction. This suggests that
the proposed SVM-ALO model can be used in other regions of similar geology.

Figure 10 demonstrates the plot between observed and simulated GWLs for the best results computed with GWL data
Fig. 9 Regression plots of predicted over observed GWL data using SVM, SVM-FFA, SVM-FOA, and SVM-ALO at (a)
Sherkhanchawk and (b) Basantia high school stations for testing phases
by SVM, SVM-FFA, SVM-FOA, and SVM-ALO in two observational wells. However, the datasets utilised were different
for training and testing. Outcomes reveal that the simulated peak GWL is 8.562671 m, 8.774913 m, 8.949114 m, and
9.099785 m for SVM, SVM-FFA, SVM-FOA, and SVM-ALO, against the observed peak of 9.30068 m for Sherkhanchawk
station. The estimated peak GWL is 7.982215 m, 8.161093 m, 8.329735 m, and 8.468535 m for SVM, SVM-FFA, SVM-FOA,
and SVM-ALO against the observed peak of 8.675 m for Basantia high school division. Time series plots of observed
versus simulated GWL for the proposed models at the specified locations are presented in Fig. 10.

Comparison between observed and estimated GWL for the two reference wells is presented using the box plot
representation. The best performance was obtained for “SVM-ALO,” and SVM-FOA gives slightly lower predictions but
performs pre-eminently for the two reference wells. Figure 11 shows the boxplot comparison between results
obtained by the SVM, SVM-FFA, SVM-FOA, and SVM-ALO algorithms at both well stations. Representations indicated
that SVM-ALO was the most accurate model, illustrating the effectiveness of the applied ALO algorithm. This is due
to improvement in the updating mechanism and combination of tournament selection of ALO. It is also seen from
Fig. 10 that this fluctuation is negligible. Several studies reveal that the impact of arbitrary collection of
initial population and investigation space is negligible in meta-heuristic algorithms.

Figure 12 illustrates the histogram plot of GWL estimated with the hybrid SVM models. Based on the assessment of
the above results, it is indicated that the hybrid SVM-ALO model provides better and more consistent predictions
of average GWL at the proposed well stations. Also, it shows reliable results owing to low information content and
standard deviation values in comparison to the other models. Even though average GWL is a primary and suitable
standard for assessment of variations all over the plain, spatial variations across the piezometer network must be
known across the plain. For that purpose, hybrid models have been employed for simulating GWL fluctuations in all
wells.

Comparison of the various models with respect to the proposed algorithms at the two well stations is given in
Fig. 13. As discussed earlier, we consider ten models for each applied technique based on different input
combinations. It is quite evident from Fig. 13 that for all models (M1–M10), SVM-ALO shows the best results
compared to the other hybrid and simple techniques, with model SVM-ALO-M10 showing the highest accuracy in
groundwater level prediction at both wells. There are several reasons for the selection and better performance of
the ALO algorithm over FOA and FFA: a small number of tuning parameters, suitability for various kinds of
optimisation problems, ease of hybridisation with data-driven techniques, simple implementation, fast convergence
speed, an accelerated process to find optimal solutions, a lesser chance of getting stuck in local optima, an
efficient global process with an extensive search space that makes it more feasible and efficient in reaching
global optima, and reasonable execution time.
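The indices used throughout this section to rank the models can be computed directly from paired observed and predicted series. The Python sketch below uses the standard textbook definitions of RMSE, the Willmott index (WI), and Pearson's correlation coefficient (PCC); these may differ in detail from the formulas as printed in the paper, and the function names and sample values are illustrative assumptions.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    """Root mean-squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def willmott_index(obs, pred):
    """Willmott index of agreement (standard form); 1 means perfect agreement."""
    o_bar = _mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


def pearson_cc(obs, pred):
    """Pearson's correlation coefficient between observed and predicted values."""
    o_bar, p_bar = _mean(obs), _mean(pred)
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_p = sum((p - p_bar) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical series: predictions offset from observations by a constant 1 m
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, predicted),
      willmott_index(observed, predicted),
      pearson_cc(observed, predicted))
```

In this toy case the constant offset leaves the correlation perfect while RMSE and WI both register the bias, which is one reason several indices are reported together rather than relying on a single one.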
Fig. 11 Comparison of box plot at station a Sherkhanchawk and b Basantia high school
that several climatic factors can have an impact on GWL. Integration of data-driven techniques with pre-processing
of data can ascertain the value and time of different GWLs. Management of groundwater with precise prediction
results assists decision makers in allocating water equitably to stakeholders. In addition, groundwater and its
index can be utilised to predict drought periods. While investigating these outcomes, it must be noted that,
because the models are developed considering water level from wells with no observation of either water bodies or
streams, the models seem to estimate drier scenarios than anticipated. Future work may focus more on implementing
an integrated monitoring design for evaluating surface and groundwater resources using field and remotely sensed
data. Moreover, the influence of variables such as seasonal precipitation, pumping from agricultural wells, and
runoff from seasonal rivers can be investigated for other case studies where these would impact GWLs.

Conclusions

For sustainable water resources management, prediction of groundwater level fluctuation plays a crucial role.
However, producing precise GWL forecasts is challenging because of the nonlinear relationship between descriptive
variables and GWL and their multiscale behaviour, which varies with time. For developing an accurate ML model, one
of the prerequisites is the selection of optimal input variables as well as the optimisation of model parameters.
To address these problems, the present research investigated the capability of meta-heuristic optimization
algorithms (such as ALO, FOA, and FFA) that
can internally perform selection of optimal input variables. The proposed hybrid models, capable of capturing
nonlinear relationships between model inputs, were provided with multiple information to enhance their forecasting
ability.

The present study aimed to examine the accuracy of a meta-heuristic algorithm, ALO, for enhancing performance of
SVM in monthly GWL estimation at the coastal region of Purba Medinipur district located in West Bengal state of
India. The proposed algorithm was compared against FOA and FFA to assess its prediction performance. The developed
hybrid SVM-ALO, SVM-FOA, and SVM-FFA models were trained and tested using ten different input combinations.
Results of the SVM-ALO model were appraised and compared with the SVM-FOA, SVM-FFA, and conventional SVM models on
the basis of statistical indices and graphical interpretation. Comparison of results indicated that the SVM-ALO
model outperformed the SVM-FOA, SVM-FFA, and simple SVM models for all input combinations at both stations.
Consequently, the most efficient model was SVM-ALO-M10, with the lowest value of RMSE = 7.6638/8.838 and the
highest value of NSE = 0.98006/0.97896, PCC = 0.9815/0.98001, and WI = 0.98215/0.98067 for the testing period at
Sherkhanchawk/Basantia high school, respectively. Hydrologists, agronomists, and agriculturists can construct a
reliable decision support system with the outcomes of the hybrid SVM-ALO model for precise prediction of GWL in
the selected study area.
issues and applications. Environ Model Softw 15:101–124. groundwater level. Arab J Geosci 14:1–15. https://doi.org/10.
https://doi.org/10.1016/S1364-8152(99)00007-9 1007/s12517-021-07349-z
Malekzadeh M, Kardar S, Saeb K et al (2019) A novel approach Salih SQ, Sharafati A, Ebtehaj I et al (2020) Integrative stochastic
for prediction of monthly ground water level using a hybrid model standardization with genetic algorithm for rainfall pattern
wavelet and non-tuned self-adaptive machine learning model. forecasting in tropical and semi-arid environments. Hydrol Sci J
Water Resour Manag 33:1609–1628. https://doi.org/10.1007/ 65:1145–1157. https://doi.org/10.1080/02626667.2020.1734813
s11269-019-2193-8 Samadianfard S, Jarhan S, Salwana E et al (2019) Support vector
Malik A, Tikhamarine Y, Souag-Gamane D et al (2020) Support vec- regression integrated with fruit fly optimization algorithm for
tor regression optimized by meta-heuristic algorithms for daily river flow forecasting in Lake Urmia Basin. Water 11:1934.
streamflow prediction. Stoch Environ Res Risk Assess 34:1755– https://doi.org/10.3390/w11091934
1773. https://doi.org/10.1007/s00477-020-01874-1 Samantaray S, Ghose DK (2020a) Modelling runoff in an arid water-
Mattar MA (2018) Using gene expression programming in monthly shed through integrated support vector machine. H2Open Journal,
reference evapotranspiration modeling: a case study in Egypt. IWA Publishing 3(1):256–275
Agric Water Manag 198:28–38. https://doi.org/10.1016/j.agwat. Samantaray S, Ghose DK (2020b) Assessment of suspended sediment
2017.12.017 load with neural networks in arid watershed. Journal of The Insti-
Mehr AD, Nourani V, Khosrowshahi VK, Ghorbani MA (2019) A tution of Engineers (India): Series A 101:371–380
hybrid support vector regression–firefly model for monthly rain- Samanataray S, Sahoo A (2021) A comparative study on prediction
fall forecasting. Int J Environ Sci Technol 16:335–346. https://d oi. of monthly streamflow using hybrid ANFIS-PSO approaches.
org/10.1007/s13762-018-1674-2 KSCE J Civ Eng 25:4032–4043. https:// d oi. o rg/ 1 0. 1 007/
Mirjalili S (2015) The ant lion optimizer. Adv Eng Softw 83:80–98. s12205-021-2223-y
https://doi.org/10.1016/j.advengsoft.2015.01.010 Samantaray S, Sahoo A, Ghose DK (2020) Assessment of sediment
Moayedi H, Tien Bui D, Anastasios D, Kalantar B (2019) Spotted load concentration using SVM, SVM-FFA and PSR-SVM-
hyena optimizer and ant lion optimization in predicting the shear FFA in arid watershed, India: a case study. KSCE J Civ Eng
strength of soil. Appl Sci 9:4738. https://doi.org/10.3390/app92 24:1944–1957
24738 Samantaray S, Sahoo A (2020) Assessment of sediment concentration
Moazenzadeh R, Mohammadi B, Shamshirband S, Chau K (2018) Cou- through RBNN and SVM-FFA in Arid Watershed, India. In Smart
pling a firefly algorithm with support vector regression to pre- intelligent computing and applications (pp. 701–709). Springer,
dict evaporation in northern Iran. Eng Appl Comput Fluid Mech Singapore. https://doi.org/10.1007/978-981-13-9282-5_67
12:584–597. https://doi.org/10.1080/19942060.2018.1482476 Samantaray S Sahoo A (2021) Prediction of suspended sediment con-
Mohanta NR, Panda SK, Singh UK, Sahoo A, Samantaray S (2022) centration using hybrid SVM-WOA approaches. Geocarto Int
MLP-WOA is a successful algorithm for estimating sediment load :1-27. https://doi.org/10.1080/10106049.2021.192063
in Kalahandi Gauge Station, India. In Proceedings of International Samantaray S Sahoo A Ghose DK (2019) June. Assessment of ground-
Conference on Data Science and Applications (pp. 319–329). water potential using neural network: a case study. In International
Springer, Singapore. https://doi.org/10.1007/978-981-16-5120- Conference on Intelligent Computing and Communication (pp.
5_25 655–664). Springer, Singapore. https://doi.org/10.1007/978-981-
Mohanty S, Jha MK, Raul SK et al (2015) Using artificial neural net- 15-1084-7_63
work approach for simultaneous forecasting of weekly groundwa- Samantaray S Sahoo A Ghose DK (2020a) Infiltration loss affects
ter levels at multiple sites. Water Resour Manag 29:5521–5532. toward groundwater fluctuation through CANFIS in arid water-
https://doi.org/10.1007/s11269-015-1132-6 shed: a case study. In Smart Intelligent Computing and Applica-
Moosavi V, Vafakhah M, Shirmohammadi B, Behnia N (2013) A wave- tions (pp. 781–789). Springer, Singapore. https://d oi.o rg/1 0.1 007/
let-ANFIS hybrid model for groundwater level forecasting for dif- 978-981-13-9282-5_76
ferent prediction periods. Water Resour Manag 27:1301–1321. Samantaray S, Tripathy O, Sahoo A, Ghose DK (2020c) Rainfall fore-
https://doi.org/10.1007/s11269-012-0239-2 casting through ANN and SVM in Bolangir Watershed, India.
Mouassa S, Bouktir T, Salhi A (2017) Ant lion optimizer for solving In Smart intelligent computing and applications (pp. 767–774).
optimal reactive power dispatch problem in power systems. Eng Springer, Singapore. https://doi.org/10.1007/978-981-13-9282-
Sci Technol an Int J 20:885–895. https://doi.org/10.1016/j.jestch. 5_74
2017.03.006 Sanikhani H, Deo RC, Samui P et al (2018) Survey of different data-
Nie S, Bian J, Wan H et al (2017) Simulation and uncertainty analysis intelligent modeling strategies for forecasting air temperature
for groundwater levels using radial basis function neural network using geographic information as model predictors. Comput Elec-
and support vector machine models. J Water Supply Res Technol tron Agric 152:242–260. https://doi.org/10.1016/j.compag.2018.
66:15–24. https://doi.org/10.2166/aqua.2016.069 07.008
Onyutha C (2022) A hydrological model skill score and revised Sattari MT, Mirabbasi R, Sushab RS, Abraham J (2018) Prediction of
R-squared. Hydrol Res 53:51–64. https://doi.org/10.2166/nh. groundwater level in Ardebil plain using support vector regression
2021.071 and M5 tree model. Groundwater 56:636–646. https://doi.org/10.
Pan W-T (2012) A new fruit fly optimization algorithm: taking the 1111/gwat.12620
financial distress model as an example. Knowledge-Based Syst Seidu J Ewusi A Kuma JSY Ziggah YY Voigt HJ (2021) A hybrid
26:69–74. https://doi.org/10.1016/j.knosys.2011.07.001 groundwater level prediction model using signal decomposition
Quilty J, Adamowski J, Khalil B, Rathinasamy M (2016) Bootstrap and optimised extreme learning machine. Model Earth Syst Envi-
rank-ordered conditional mutual information (broCMI): a nonlin- ron :1-18. https://doi.org/10.1007/s40808-021-01319-w
ear input variable selection method for water resources modeling. Sharafati A, Tafarojnoruz A, Shourian M, Yaseen ZM (2020) Simula-
Water Resour Res 52:2299–2326. https://doi.org/10.1002/2015W tion of the depth scouring downstream sluice gate: the validation
R016959 of newly developed data-intelligent models. J Hydro Environ Res
Rezaei M, Mousavi SF, Moridi A, EshaghiGordji M, Karami H 29:20–30. https://doi.org/10.1016/j.jher.2019.11.002
(2021) A new hybrid framework based on integration of optimi- Shirmohammadi B, Vafakhah M, Moosavi V, Moghaddamnia A (2013)
zation algorithms and numerical method for estimating monthly Application of several data-driven techniques for predicting
13
723 Page 22 of 22 Arab J Geosci (2022) 15: 723
groundwater level. Water Resour Manag 27:419–432. https://doi. combination of random features. Appl Water Sci 8:1–12. https://
org/10.1007/s11269-012-0194-y doi.org/10.1007/s13201-018-0742-6
Singh A, Malik A, Kumar A, Kisi O (2018) Rainfall-runoff modeling in Wang X, Zhou Y, Zhao Z et al (2019) A novel water quality mechanism
hilly watershed using heuristic approaches with gamma test. Arab modeling and eutrophication risk assessment method of lakes and
J Geosci 11:1–12. https://doi.org/10.1007/s12517-018-3614-3 reservoirs. Nonlinear Dyn 96:1037–1053. https://d oi.o rg/1 0.1 007/
Sridharam S, Sahoo A, Samantaray S, Ghose DK (2021) Estimation s11071-019-04837-6
of water table depth using wavelet-ANFIS: a case study. In Com- Yadav B, Eliza K (2017) A hybrid wavelet-support vector machine
munication Software and Networks (pp. 747–754). Springer, Sin- model for prediction of lake water level fluctuations using hydro-
gapore. https://doi.org/10.1007/978-981-15-5397-4_76 meteorological data. Measurement 103:294–301. https://doi.org/
Suryanarayana C, Sudheer C, Mahammood V, Panigrahi BK (2014) An 10.1016/j.measurement.2017.03.003
integrated wavelet-support vector machine for groundwater level Yang X-S (2009) Firefly algorithms for multimodal optimization. In:
prediction in Visakhapatnam, India. Neurocomputing 145:324– International symposium on stochastic algorithms. Springer, pp
335. https://doi.org/10.1016/j.neucom.2014.05.026 169–178. https://doi.org/10.1007/978-3-642-04944-6_14
Tang Y, Zang C, Wei Y, Jiang M (2019) Data-driven modeling of Yang X-S, Hosseini SSS, Gandomi AH (2012) Firefly algorithm for
groundwater level with least-square support vector machine and solving non-convex economic dispatch problems with valve load-
spatial–temporal analysis. Geotech Geol Eng 37:1661–1670. ing effect. Appl Soft Comput 12:1180–1186. https://doi.org/10.
https://doi.org/10.1007/s10706-018-0713-6 1016/j.asoc.2011.09.017
Taormina R, Chau K-W, Sethi R (2012) Artificial neural network simu- Yaseen ZM, Ebtehaj I, Bonakdari H et al (2017) Novel approach for
lation of hourly groundwater levels in a coastal aquifer system of streamflow forecasting using a hybrid ANFIS-FFA model. J
the Venice lagoon. Eng Appl Artif Intell 25:1670–1676. https:// Hydrol 554:263–276. https://doi.org/10.1016/j.jhydrol.2017.09.
doi.org/10.1016/j.engappai.2012.02.009 007
Tharwat A, Hassanien AE (2018) Chaotic antlion algorithm for param- Yoon H, Hyun Y, Ha K et al (2016) A method to improve the stability
eter optimization of support vector machine. Appl Intell 48:670– and accuracy of ANN-and SVM-based time series models for
686. https://doi.org/10.1007/s10489-017-0994-0 long-term groundwater level predictions. Comput Geosci 90:144–
Tikhamarine Y, Malik A, Pandey K et al (2020) Monthly evapotranspi- 155. https://doi.org/10.1016/j.cageo.2016.03.002
ration estimation using optimal climatic parameters: efficacy of Zare M, Koch M (2018) Groundwater level fluctuations simulation and
hybrid support vector regression integrated with whale optimiza- prediction by ANFIS-and hybrid Wavelet-ANFIS/Fuzzy C-Means
tion algorithm. Environ Monit Assess 192:1–19. https://doi.org/ (FCM) clustering models: application to the Miandarband plain. J
10.1007/s10661-020-08659-7 Hydro-Environment Res 18:63–76. https://doi.org/10.1016/j.jher.
Wang X, Liu T, Zheng X et al (2018) Short-term prediction of ground- 2017.11.004
water level using improved random forest regression with a