Normalization of Zones
ARTICLE INFO

Keywords: Standardization; Fuzzy C-Means; Precision agriculture; Smoothness index; Variance reduction; Euclidean distance

ABSTRACT

Management zones (MZs) are an economically viable alternative to variable-rate application (VRA) based on prescription maps; unlike the latter, MZs can employ conventional machinery. Their use is attractive because of the low initial cost and the high return in economic and environmental benefits. Data clustering techniques, particularly the Fuzzy C-Means algorithm, are the most widely used processes for delineating MZs. The most common similarity measure is the Euclidean distance; however, because the algorithm is sensitive to the range of the input variables, these variables are typically normalized by dividing the value by the standard deviation, maximum value, average, or data set range. The objective of this study was to assess the influence of data normalization methods on the delineation of MZs. The experiment was conducted in three experimental fields of 9.9, 15.0, and 19.8 ha, located in Southern Brazil, between 2010 and 2014. The variables used for delineating MZs were selected using spatial correlation statistics, and data were normalized using the standard score, range, and average methods. The MZs were delineated using the Fuzzy C-Means algorithm, which created two, three, and four clusters. The normalization methods were evaluated by five indices (modified partition entropy [MPE], fuzziness performance index [FPI], variance reduction [VR], smoothness index [SI], and kappa) and ANOVA. It was found that when MZ delineation uses more than one variable with different scales in a clustering process based on Euclidean distance, normalization is required. The range method was considered the overall best normalization method.
1. Introduction

The study of the spatial distribution of soil and plant variables is important for establishing appropriate management zones (MZs) to be used in fertilizer application, soil management, and irrigation. Appropriate MZs may maximize yield while reducing costs and minimizing potential environmental damage (Tilman et al., 2011; Li et al., 2013; Bansod and Pandey, 2013; Hedley, 2015).

A MZ is defined as a subregion of a field that exhibits similar combinations of yield-limiting factors (Tagarakis et al., 2013). This facilitates the application of precision agriculture (PA) techniques by reducing the costs of their adoption and implementation, since MZs can use constant-rate equipment and may reduce the number of samples needed to characterize soil nutrient availability. Delineating MZs is not a simple task because numerous variables may influence crop yield. Considering that a MZ is often used for several years, the considered variables should be temporally stable (Doerge, 2000) and correlated to the yield. Among the variables identified in the literature as having good potential to delineate temporally stable MZs are elevation (Bazzi et al., 2015; Fraisse et al., 2001; Jaynes et al., 2005; Peralta and Costa, 2013; Farid et al., 2016; Schepers et al., 2004), soil electrical conductivity (ECa) (Li et al., 2007; Farid et al., 2016), soil penetration resistance (Gavioli et al., 2016), and soil texture (Farid et al., 2016).

Techniques such as principal component analysis (PCA) (Bansod and Pandey, 2013) and the Moran's bivariate spatial autocorrelation statistic proposed by Czaplewski and Reich (1993), and used by Reich et al. (1994) and Bonham et al. (1995), can be used to create (when PCA is used) or select layers for delineating MZs. When more than one crop is cultivated in the same field during the year, which is a common practice in Brazil, normalizing yield data makes it possible to create a more representative variable (Bunselmeyer and Lauer, 2015) to be used in ANOVA and Tukey's test.

Several techniques to delineate MZs are proposed in the literature (Pedroso et al., 2010; Xiang et al., 2007); however, the most used is
⁎
Corresponding author at: Computer Science Department, Technological Federal University of Paraná, Rua Cerejeira, s/n – Bairro São Luiz, 85892-000 Santa Helena, Paraná State,
Brazil.
E-mail addresses: kschenatto@utfpr.du.br (K. Schenatto), humberto.beneduzzi@ifpr.edu.br (H.M. Beneduzzi).
http://dx.doi.org/10.1016/j.compag.2017.10.017
Received 28 February 2017; Received in revised form 18 October 2017; Accepted 21 October 2017
Available online 04 November 2017
0168-1699/ © 2017 Elsevier B.V. All rights reserved.
K. Schenatto et al. Computers and Electronics in Agriculture 143 (2017) 238–248
cluster analysis (Li et al., 2007; Iliadis et al., 2010). The most commonly used clustering methods to delineate MZs are the K-means algorithm (Rodrigues Jr. et al., 2011; Ortega and Santibáñez, 2007) and fuzzy C-means (Li et al., 2007, 2013; Fu et al., 2010; Zhang et al., 2013; Moral et al., 2010). The fuzzy C-means algorithm, which incorporates fuzzy logic theory into the partitioning procedure, uses a weighting exponent to control the degree of sharing between classes (Bezdek, 1981), allowing individuals to exhibit partial membership in each of the classes, which is important when dealing with the continuous variability of natural phenomena (Burrough, 1989). Before clusters can be formed, it is necessary to establish an appropriate measure of similarity. Euclidean distance is the most commonly used; this measure gives equal weight to all measured variables and is sensitive to correlated variables (Bezdek, 1981). In geometrical terms, the Euclidean distance creates clusters with a spherical shape, which rarely occur in soil (Odeh et al., 1992). Fridgen et al. (2004) report that Euclidean distance should be used only for statistically independent variables with equal variances. Therefore, when the Euclidean distance is used for clustering, data normalization can be a very important step before creating MZs.
The standard score, or Z-score, normalization method (Eq. (1)) has been used by many researchers for the delineation of MZs (Anderberg, 1973; Romesburg, 1984; Larscheid and Blackmore, 1996; Stafford et al., 1996; Molin, 2002; Kitchen et al., 2005). This method transforms normally distributed variables to standard scores, so that the transformed variable has a mean of 0.0 and a variance of 1.00.

$$Z = \frac{X - \bar{X}}{s} \tag{1}$$

where $X$ is the original data value; $\bar{X}$ is the sample average; and $s$ is the standard deviation.
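As a minimal sketch of Eq. (1), the snippet below standardizes a small list of values in Python. The elevation readings are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator is meant.

```python
from statistics import mean, stdev


def standard_score(values):
    """Eq. (1): Z = (X - mean(X)) / s."""
    avg = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("standard score is undefined for a constant variable")
    return [(x - avg) / s for x in values]


# Hypothetical elevation readings in meters (not taken from the study).
elevation_m = [452.0, 455.5, 460.0, 463.5, 469.0]
z_scores = standard_score(elevation_m)
```

With this estimator the transformed values have zero mean and unit sample variance, whatever the original unit of measurement.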
Several studies reported the use of the average method (Eq. (2)) for the delineation of MZs (Stafford et al., 1996; Molin, 2002; Kitchen et al., 2005), assuming that the average represents the dataset well; however, the average is sensitive: it is modified by adding any constant to the data, which can easily change the distribution of the normalized data (Anderberg, 1973).

$$Z = \frac{X}{\bar{X}} \tag{2}$$

Fig. 1. Step-by-step flowchart of the methodology used to evaluate the normalization methods for delineation of MZs.
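The sensitivity of the average method to an added constant can be illustrated with a short Python sketch of Eq. (2). The clay values and the constant of 100 are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_normalize(values):
    """Eq. (2): Z = X / mean(X)."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("average normalization is undefined for a zero mean")
    return [x / avg for x in values]


# Hypothetical clay contents (%) and the same data shifted by a constant.
clay_pct = [40.0, 50.0, 60.0]
clay_shifted = [x + 100.0 for x in clay_pct]

z_original = average_normalize(clay_pct)
z_shifted = average_normalize(clay_shifted)

# Spread (max - min) of the normalized values before and after the shift.
spread_original = max(z_original) - min(z_original)
spread_shifted = max(z_shifted) - min(z_shifted)
```

Both normalized series have an average of 1, but the spread of the shifted series is smaller, so the same field pattern would carry less weight in a Euclidean distance after the shift.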
Good results were also reported by Milligan and Cooper (1988), Bazzi et al. (2013), Gavioli et al. (2016), and Schenatto et al. (2016) using the range normalization method (Eq. (3)). The values produced by this method are bounded by 0.0 and 1.0, with at least one observed value at each of these end points. The Min(X) value used in Eq. (3) can be replaced by Median(X) (Mielke and Berry, 2007) with the same behavior, because Min(X) and Median(X) are constants and do not change the data distribution.

$$Z = \frac{X - \mathrm{Min}(X)}{\mathrm{Max}(X) - \mathrm{Min}(X)} \tag{3}$$

The goal of this study was to evaluate the performance of these methods, which are frequently used in the data clustering process by the Fuzzy C-Means algorithm to delineate MZs.

2. Materials and methods

A step-by-step flowchart (Fig. 1) was created to show the methodology used.

2.1. Datasets

This research was conducted in three fields (Fig. 2) located in Paraná State, Brazil: Field A (15 ha), located in the municipality of Céu Azul (central geographic location of 25°06′32″S, 53°49′55″W, and average elevation of 460 m); Field B (9.9 ha), located in the municipality of Serranópolis do Iguaçu (central geographic location of 25°24′28″S, 54°00′17″W, and average elevation of 355 m); and Field C (19.8 ha), located in the municipality of Cascavel (central geographic location of 24°57′08″S, 53°33′59″W, and average elevation of 650 m).

For the delineation of MZs, only variables considered temporally stable, collected between 2010 and 2014 (Table 1), were used, to meet the recommendation of Doerge (2000). To meet the constraints of geostatistical analysis (Journel and Huijbregts, 1978) in terms of the minimum number of pairs (30) to calculate the semivariances of the semivariogram, a dense sampling grid (Table 1) was used, with 2.7 points ha−1 for Field A, 4.2 points ha−1 for Field B, and 3.4 points ha−1 for Field C. The irregular sampling grids were defined taking into account an imaginary central line between the elevation contour lines of each field (Fig. 2).

Elevation was determined with a total station (Topcon GPT-7505, Topcon Corporation, Tokyo, Japan), and soil penetration resistance (SPR) was determined with a soil penetrometer (penetroLOG PGL1020, Falker Automação Agrícola, Porto Alegre, Brazil). Soil samples were collected at a depth of 0–0.2 m and sent to the laboratory for analysis. Soybean yield for Field A was determined with a yield monitor (AFS PRO 600, Case IH, Racine, USA) coupled to a harvester (CASE IH® model 2388, Sorocaba, Brazil). For Fields B and C, yield was determined by hand-harvesting a sample (from an area of approximately 1 m²) at each soil sampling point (42 points in Field B and 68 in Field C). Yield values of all fields were adjusted to 13% water content. To reduce the temporal variability of yield data, which is strongly influenced by the weather and rainfall, and to create a single variable (Jaynes et al., 2003, 2005) for each field, the standard score normalization technique (Eq. (1)) was used (Kitchen et al., 2005; Milani et al., 2006; Suszek et al., 2011).

Table 1
Identification of the types of variables and collection periods for each experimental field.
Variables Field A (40 samples) Field B (42 samples) Field C (68 samples)

2.2. Variable selection

The Moran's bivariate spatial autocorrelation statistic (Eq. (4)) (Czaplewski and Reich, 1993) was calculated among all the variables by using SDUM (Software for definition management zones; Bazzi et al., 2013). Variables were selected by the procedure proposed by Bazzi et al. (2013): (a) removal of variables with no significant spatial autocorrelation at 95% significance; (b) removal of the variables that were not correlated with yield; (c) sorting of the remaining variables in decreasing order of their correlation with yield; and (d) removal of variables that are correlated with each other, preferentially discarding those with lower correlation with yield. The idea is to keep only the variables that are most correlated with yield and remove the variables that are less influential, even though they are correlated with yield. Since these management zones are delineated using parameters selected according to their relationship with yield, they are referred to as yield-based management zones or productivity zones.

$$I_{XY} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij}\, X_i\, Y_j}{W \sqrt{m_X^2\, m_Y^2}} \tag{4}$$

where $W_{ij}$ is the spatial association matrix, calculated by $W_{ij} = 1/(1 + D_{ij})$; $D_{ij}$ is the distance between points $i$ and $j$; $X_i$ is the transformed value of variable $X$ at point $i$; $Y_j$ is the transformed value of variable $Y$ at point $j$; $W$ corresponds to the sum of the degrees of spatial association, obtained from the $W_{ij}$ matrix, for $i \neq j$; $m_X^2$ corresponds to the sample variance of $X$; and $m_Y^2$ corresponds to the sample variance of $Y$. Note that the transformation of a variable $Z$ should be interpreted as the procedure performed on its values so that their average is equal to zero, applying $Z_k = (z_k - \bar{Z})$, wherein $\bar{Z}$ is the sample average of $Z$.

2.3. Interpolation of the selected variables

In the geostatistical analysis of the selected variables, spherical, exponential, and Gaussian models were fitted to the experimental semivariogram (Table 4), and the best-fitting model was determined through cross-validation statistics (Sun et al., 2009; Arslan, 2012). The data were then interpolated by ordinary kriging onto a 5 × 5 m grid, in order to obtain a denser set of points per area and therefore smoother MZs.
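The bivariate Moran statistic of Eq. (4), used for variable selection in Section 2.2, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the SDUM implementation: the denominator is taken as W multiplied by the square root of the product of the two sample variances, pairs with i = j are excluded from both sums, and the coordinates and values are hypothetical.

```python
from math import hypot, sqrt


def bivariate_moran(coords, x, y):
    """Sketch of Eq. (4) with W_ij = 1 / (1 + D_ij)."""
    n = len(coords)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Transformed (mean-centred) variables, Z_k = z_k - mean(Z).
    xt = [v - mean_x for v in x]
    yt = [v - mean_y for v in y]
    # Assumption: sample variances use the n - 1 denominator.
    var_x = sum(v * v for v in xt) / (n - 1)
    var_y = sum(v * v for v in yt) / (n - 1)
    numerator, w_total = 0.0, 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: self-pairs are excluded
            d_ij = hypot(coords[i][0] - coords[j][0],
                         coords[i][1] - coords[j][1])
            w_ij = 1.0 / (1.0 + d_ij)
            numerator += w_ij * xt[i] * yt[j]
            w_total += w_ij
    return numerator / (w_total * sqrt(var_x * var_y))


# Hypothetical sampling points (m), elevation (m), and yield (t/ha).
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
elevation = [400.0, 405.0, 410.0, 420.0]
crop_yield = [3.1, 3.4, 3.6, 4.0]
i_xy = bivariate_moran(coords, elevation, crop_yield)
```

Because both variables are centred and the result is divided by the square root of their variances, the statistic does not change when a variable is expressed in another unit, which is convenient when screening variables measured on very different scales.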
Fig. 3. Representation of the difference that occurs in the calculation of the Euclidean distance when different units of measurement are used, with the input data: clay (%) and elevation (in meters (a) and in kilometers (b)).
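The unit effect illustrated in Fig. 3 can be reproduced numerically. The Python sketch below uses hypothetical clay and elevation values (not those of the figure): the same point is assigned to a different centroid when elevation is expressed in meters or in kilometers, whereas after range normalization (Eq. (3)) the assignment no longer depends on the unit.

```python
from math import dist


def range_normalize(values):
    """Eq. (3): Z = (X - Min(X)) / (Max(X) - Min(X))."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("range normalization is undefined for a constant variable")
    return [(v - lo) / (hi - lo) for v in values]


def assign(clay, elevation):
    """Index of the centroid closest to the third observation.

    Assumption for illustration: the first two observations act as
    centroid 1 (index 0) and centroid 2 (index 1).
    """
    points = list(zip(clay, elevation))
    centroids, target = points[:2], points[2]
    return min(range(2), key=lambda i: dist(target, centroids[i]))


# Hypothetical data: clay (%) and elevation in two different units.
clay_pct = [30.0, 50.0, 33.0]
elevation_m = [400.0, 420.0, 419.0]
elevation_km = [e / 1000.0 for e in elevation_m]

raw_m = assign(clay_pct, elevation_m)
raw_km = assign(clay_pct, elevation_km)
norm_m = assign(range_normalize(clay_pct), range_normalize(elevation_m))
norm_km = assign(range_normalize(clay_pct), range_normalize(elevation_km))
```

With the raw data, elevation dominates the distance when given in meters and clay dominates when elevation is given in kilometers, so the assignment flips; the normalized inputs give the same assignment in both cases.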
2.4. Data normalization methods

Euclidean distance should not be used in clustering methods when the variables are not statistically independent with equal variances (Fridgen et al., 2004), because the distance between each n-dimensional point and the centroid (also n-dimensional) is calculated on the basis of the values of each element in the input data matrix. When these elements have different measurement units and scales (e.g., elevation in meters; SPR in kilopascals or megapascals; and clay, sand, and silt in %), the calculated Euclidean distance can lead to incorrect results because of the different metrics of each element (Fig. 3). Fig. 3 shows that if elevation is reported in meters, vector V1, which indicates the elements closest to centroid 1, receives points A and E, since these elements are closer to centroid 1; the other elements are closer to centroid 2 and are assigned to vector V2. When elevation data are input in kilometers, despite the identical distribution of data, the change in units makes points C and D the elements nearest to centroid 1. This demonstrates the importance of applying data normalization methods before inputting the data to the clustering algorithm.

For normalization of the selected variables, after interpolation by kriging, we used three methods: standard score, average, and range.

2.5. Delineation of the management zones (MZs)

Considering the widespread acceptance of the fuzzy C-means algorithm (Iliadis et al., 2010; Arno et al., 2011; Valente et al., 2012; Li et al., 2013), it was used to delineate the MZs. This algorithm yields good results (Jipkate and Gohokar, 2012; Mingoti and Lima, 2006), performs zoning automatically and in a non-subjective way (Fridgen et al., 2004), and allows the division of a dataset into c clusters with reference to a center of mass, or centroid, for each cluster built (Fridgen et al., 2004).

Statistically, the Fuzzy C-Means technique minimizes the sum of squared errors within each class following some criteria, and the data are grouped iteratively to the nearest class using the minimum distance criterion. The method assumes a dataset $X = \{x_1, x_2, \ldots, x_n\}$, where $x_k$ corresponds to a feature vector $x_k = \{x_{k1}, x_{k2}, \ldots, x_{kp}\} \in R^p$ for each $k \in \{1, 2, \ldots, n\}$, and $R^p$ is the p-dimensional space. The aim is to find a fuzzy pseudo-partition, corresponding to a family of $c$ fuzzy sets of $X$, that best represents the data structure and is denoted by $P = \{A_1, A_2, \ldots, A_c\}$, which satisfies $\sum_{i=1}^{c} A_i(x_k) = 1$ and $0 < \sum_{k=1}^{n} A_i(x_k) < n$, where $k \in \{1, 2, \ldots, n\}$ and $n$ represents the number of elements of $X$. The algorithm is controlled by parameters such as the number of groups, a weighting exponent applied to the distance between the points and the centroid ($m \in (1, \infty)$), and an error used as a stopping criterion ($\varepsilon > 0$) (Bezdek, 1981).

The position of each centroid is calculated considering the exponent defined as a parameter. The $c$ centers $v_1^{(t)}, \ldots, v_c^{(t)}$ are calculated (Eq. (5)) for the partition $P^{(t)}$, where $t = 1, 2, \ldots$ is the iteration. The vector $v_i$ corresponds to the center of cluster $A_i$ and is the weighted average of the data in $A_i$. The weight of the datum $x_k$ is the m-th power of its membership degree in the fuzzy set $A_i$.

$$v_i = \frac{\sum_{k=1}^{n} [A_i(x_k)]^m\, x_k}{\sum_{k=1}^{n} [A_i(x_k)]^m} \tag{5}$$

The membership degree of the element $x_k$ in the class $A_i$ (Eq. (6)) is calculated for each $x_k \in X$ and for all $i \in \{1, 2, \ldots, c\}$, if $\|x_k - v_i^{(t)}\|^2 > 0$.

$$A_i^{(t+1)}(x_k) = \left[\sum_{j=1}^{c}\left(\frac{\|x_k - v_i^{(t)}\|^2}{\|x_k - v_j^{(t)}\|^2}\right)^{\frac{1}{m-1}}\right]^{-1} \tag{6}$$

where $\|x_k - v_i^{(t)}\|^2$ represents the distance between $x_k$ and $v_i$.

After performing the normalization of all variables, the MZs were delineated considering an error parameter equal to 0.0001 and a weighting exponent equal to 1.3 in the Fuzzy C-Means algorithm, thus creating 2, 3, and 4 zones for each of Fields A, B, and C. MZs were also delineated without applying the normalization processes to the input data, in order to compare with the other evaluated methods.

2.6. Evaluation of MZs

The performance of the normalization methods in the delineation of MZs was assessed using:

(a) Variance Reduction Index (VR, Eq. (7)) (Dobermann et al., 2003; Xiang et al., 2007). This index was computed for the normalized average yield variable, with the expectation that the sum of the variances for each MZ will be smaller than the total variance.
$$VR = \left(1 - \frac{\sum_{i=1}^{c} W_i\, V_{mz_i}}{V_{field}}\right) \times 100 \tag{7}$$

where $c$ is the number of MZs; $W_i$ is the proportion of the area in each management zone; $V_{mz_i}$ is the variance of the data from each management zone; and $V_{field}$ is the variance of the sample of data for the entire area.

(b) Fuzziness Performance Index (FPI, Eq. (8)) (Fridgen et al., 2004). This index allows for the determination of the degree of separation (i.e., confusion) between the fuzzy c-clusters of a dataset X. When the FPI values approach 0, distinct classes are indicated with only a small degree of sharing among members (data), whereas values close to 1 indicate no distinct classes, with a high degree of sharing among members of classes.

$$FPI = 1 - \frac{c}{(c-1)}\left[1 - \sum_{j=1}^{n}\sum_{i=1}^{c} (u_{ij})^2 / n\right] \tag{8}$$

where $c$ is the number of clusters; $n$ is the number of observations; and $u_{ij}$ is the element $ij$ of the fuzzy membership matrix.

(c) Modified Partition Entropy Index (MPE, Eq. (9)) (Boydell and McBratney, 2002). This index estimates the amount of disorganization created by a specific number of clusters. MPE values close to 1 indicate that disorganization predominates, whereas values approaching 0 indicate better organization.

$$MPE = \frac{-\sum_{j=1}^{n}\sum_{i=1}^{c} u_{ij}\log(u_{ij}) / n}{\log c} \tag{9}$$

where $c$ is the number of clusters; $n$ is the number of observations; and $u_{ij}$ is the element $ij$ of the fuzzy membership matrix.

(d) Smoothness Index (SI, Eq. (10)) (Gavioli et al., 2016). This index calculates the frequency of class changes in the thematic map in the horizontal, vertical, and diagonal directions. It characterizes the smoothness of the contour curves by pixel. If a hypothetical map consisted of a single uniform area, the smoothness index would be 100% because of the lack of class changes. Likewise, if a map was created with random values, the smoothness index would be near zero.

$$SI = 100 - \left[\left(\frac{\sum_{i=1}^{k} NMH_i}{4P_H} + \frac{\sum_{j=1}^{k} NMV_j}{4P_V} + \frac{\sum_{l=1}^{k} NMD_{d,l}}{4P_{Dd}} + \frac{\sum_{m=1}^{k} NMD_{e,m}}{4P_{De}}\right) \times 100\right] \tag{10}$$

where $NMH_i$ is the number of changes in row $i$ (horizontal); $NMV_j$ is the number of changes in column $j$ (vertical); $NMD_{d,l}$ is the number of changes in diagonal $l$ (right diagonal, Dd); $NMD_{e,m}$ is the number of changes in diagonal $m$ (left diagonal, De); $k$ is the maximum number of pixels in the row, column, or diagonal; $P_H$ is the possibility of changing pixels horizontally; $P_V$ is the possibility of changing pixels vertically; $P_{Dd}$ is the possibility of changes in the right diagonal (Dd); and $P_{De}$ is the possibility of changes in the left diagonal (De).

(e) Analysis of Variance (ANOVA): The yield values were compared between MZs by using the normalized average yield and performing Tukey's range test to identify whether the delineated sub-regions showed significant differences (significance level of 0.05) in normalized average yield (assuming that there was no spatial dependence within each MZ).

(f) Kappa index (K, Eq. (11)) (Cohen, 1960). The MZs delineated from non-normalized data and from data normalized by the three methods (standard score, range, and average) were compared using the K index. K evaluates the level of agreement, where 0 < K ≤ 0.2 indicates no agreement, 0.2 < K ≤ 0.4 weak agreement, 0.4 < K ≤ 0.6 moderate agreement, 0.6 < K ≤ 0.8 strong agreement, and 0.8 < K ≤ 1 very strong agreement (Landis and Koch, 1977).

$$K = \frac{n\sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+}\, x_{+i})}{n^2 - \sum_{i=1}^{r} (x_{i+}\, x_{+i})} \tag{11}$$

where $K$ is the Kappa concordance index; $n$ is the total number of observations (sample points); $r$ is the number of error matrix classes; $x_{ii}$ is the number of observations on the diagonal (row $i$, column $i$); $x_{i+}$ is the total of observations in row $i$; and $x_{+i}$ is the total of observations in column $i$.

Table 2
Summary of the descriptive statistics calculated for the original yield data (sampling data) of each considered year and for the normalized yield data.

Field  Year/crop  Mean    Median  Max.    Min.    SD
Original yield (sampling data) (t ha−1)
A      2012/S     3.984   4.067   5.068   2.541   0.472
A      2013/S     3.941   4.012   6.166   2.057   0.518
A      2014/S     4.525   4.553   5.354   3.635   0.281
B      2012/S     5.255   5.255   6.980   2.563   0.749
B      2013/Co    8.945   9.195   10.799  6.678   0.992
B      2013/S     5.079   5.107   6.808   3.366   0.586
B      2014/Co    10.276  10.415  13.342  6.763   1.095
B      2014/S     3.888   3.819   4.759   2.934   0.496
C      2010/S     2.638   2.565   4.340   1.550   0.606
C      2011/S     3.243   3.263   4.644   2.300   0.484
Normalized yield
A      –          0.000   −0.045  1.813   −1.184  0.601
B      –          0.000   −0.017  0.635   −1.538  0.398
C      –          0.000   0.005   1.759   −1.680  0.071
Co: corn; S: soybean; SD: standard deviation.

Table 3
Scheme for selection and disposal of variables for the generation of management zones.

Variables Field A Field B Field C
2012 2013 2014 2012 2013 2014 2010
SPR 0.0–0.1 m (MPa) ₣ X ₣ X* X* X
SPR 0.1–0.2 m (MPa) X X X X X †
SPR 0.2–0.3 m (MPa) X X* † X X †
Elevation (m) ₣ ₣ ₣
Slope (°) X X*
Density (g cm−3) X X
Sand (%) † X †
Silt (%) X X* X
Clay (%) † X* †
OM (%) X
[X] - Eliminated for not having spatial autocorrelation; [X*] - Eliminated for not having spatial correlation with yield; [†] - Eliminated for being redundant; [₣] - Selected to generate the MZs.
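The iteration described by Eqs. (5) and (6) in Section 2.5 can be sketched in plain Python as follows. This is a minimal illustration, not the software used in the study: the initialization from evenly spaced observations, the `max_iter` safeguard, the handling of observations that coincide with a center, and the sample points are all assumptions. The defaults m = 1.3 and ε = 0.0001 follow the values reported in the text.

```python
def memberships(data, centers, m):
    """Eq. (6): membership degree of every observation in every cluster."""
    c, n = len(centers), len(data)
    u = [[0.0] * n for _ in range(c)]
    for k, x in enumerate(data):
        d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centers]
        zero = [i for i, d in enumerate(d2) if d < 1e-12]
        if zero:
            # Eq. (6) requires a positive distance; an observation lying on
            # a centre is given full membership there (assumption).
            for i in zero:
                u[i][k] = 1.0 / len(zero)
            continue
        for i in range(c):
            total = sum((d2[i] / d2[j]) ** (1.0 / (m - 1.0)) for j in range(c))
            u[i][k] = 1.0 / total
    return u


def update_centers(data, u, m):
    """Eq. (5): centres as averages weighted by membership to the power m."""
    p = len(data[0])
    centers = []
    for row in u:
        weights = [a ** m for a in row]
        total = sum(weights)
        centers.append([sum(w * x[d] for w, x in zip(weights, data)) / total
                        for d in range(p)])
    return centers


def fuzzy_c_means(data, c, m=1.3, eps=1e-4, max_iter=100):
    """Alternate Eqs. (5) and (6) until memberships change by at most eps."""
    n = len(data)
    if c < 2 or c > n:
        raise ValueError("c must be between 2 and the number of observations")
    # Assumption: initial centres are c evenly spaced observations.
    centers = [list(data[i * (n - 1) // (c - 1)]) for i in range(c)]
    u = memberships(data, centers, m)
    for _ in range(max_iter):
        centers = update_centers(data, u, m)
        new_u = memberships(data, centers, m)
        change = max(abs(a - b) for ra, rb in zip(new_u, u)
                     for a, b in zip(ra, rb))
        u = new_u
        if change <= eps:
            break
    return centers, u


# Hypothetical, already normalized observations forming two groups.
points = [(0.00, 0.05), (0.10, 0.00), (0.05, 0.10),
          (0.90, 1.00), (1.00, 0.90), (0.95, 0.95)]
centers, u = fuzzy_c_means(points, c=2)
zones = [max(range(2), key=lambda i: u[i][k]) for k in range(len(points))]
```

Each observation is assigned to the zone in which it has the highest membership. Because the distances in `memberships` are plain Euclidean distances, the inputs are expected to be normalized beforehand when they come from variables with different scales.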
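Three of the evaluation indices of Section 2.6 (VR, Eq. (7); MPE, Eq. (9); and Kappa, Eq. (11)) can be sketched with the Python standard library. The inputs are hypothetical, and two choices are assumptions: the area proportion W_i is approximated by the share of observations in each zone (reasonable for a regular grid), and variances are computed as population variances.

```python
from math import log
from statistics import pvariance


def variance_reduction(values, zones):
    """Eq. (7): VR = (1 - sum(W_i * V_mz_i) / V_field) * 100."""
    n = len(values)
    v_field = pvariance(values)
    within = 0.0
    for zone in set(zones):
        members = [v for v, z in zip(values, zones) if z == zone]
        # Assumption: W_i is the share of observations in the zone.
        within += (len(members) / n) * pvariance(members)
    return (1.0 - within / v_field) * 100.0


def modified_partition_entropy(u):
    """Eq. (9); u[i][j] is the membership of observation j in cluster i."""
    c, n = len(u), len(u[0])
    # Terms with u_ij = 0 contribute nothing and are skipped.
    entropy = -sum(v * log(v) for row in u for v in row if v > 0) / n
    return entropy / log(c)


def kappa(map_a, map_b):
    """Eq. (11) computed from two zone maps given as equal-length label lists."""
    n = len(map_a)
    classes = set(map_a) | set(map_b)
    agree = sum(1 for a, b in zip(map_a, map_b) if a == b)       # sum of x_ii
    chance = sum(map_a.count(k) * map_b.count(k) for k in classes)  # x_i+ * x_+i
    return (n * agree - chance) / (n * n - chance)


# Hypothetical normalized yields and two hypothetical zone maps.
yields = [1.0, 2.0, 9.0, 10.0]
zones_range = [1, 1, 2, 2]
zones_raw = [1, 1, 2, 1]

vr = variance_reduction(yields, zones_range)
mpe_crisp = modified_partition_entropy([[1.0, 0.0], [0.0, 1.0]])
mpe_uniform = modified_partition_entropy([[0.5, 0.5], [0.5, 0.5]])
k_index = kappa(zones_range, zones_raw)
```

In this toy case the two maps agree on three of four points, which gives K = 0.5, a moderate agreement on the scale of Landis and Koch (1977). The `kappa` function is undefined when both maps contain a single identical class, since the denominator is then zero.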
all three fields (Table 3) and SPR at 0.0–0.1 m for Fields A and B. The choice of these variables is consistent with other studies (Peralta et al., 2013a), which reported spatial association of elevation and soil physical properties with the yield of soybean and wheat. Good results in the delineation of MZs were also obtained in fields cultivated with soybean and corn, using elevation and soil electrical conductivity data (Jaynes et al., 2005).

Table 4
Geostatistical analysis of the variables selected for the generation of management zones.

Variable                Field  Model        Nugget  Sill   Range
Elevation               A      Spherical    0       54.75  221
SPR 0.0–0.1 m (2013)    A      Exponential  8538    1872   571
Elevation               B      Exponential  12.68   37.49  356
SPR 0.0–0.1 m (2012)    B      Exponential  4523    16544  123
Elevation               C      Exponential  0.19    12.53  450

3.2. Thematic maps and MZs
Fig. 4. Management zones for Field A, delineated using altitude (m) and SPR 0–0.1 m (MPa) collected in 2013, as input variables, considering the original data (a), and data normalized
by the methods of standard score (b), range (c), and average (d).
Fig. 5. Management zones for Field B, delineated using altitude (m) and SPR 0–0.1 m (MPa) collected in 2012, as input variables, considering the original data (a), and data normalized
by the methods of standard score (b), range (c), and average (d).
Fig. 6. Management zones for Field C, delineated using altitude (m) as input variable, considering the original data (a), and data normalized by the methods of standard score (b), range
(c), and average (d).
MZs had different average yields in Fields A and C; however, for Field A, this was only when the data were normalized by standard score or range. In Field C, when more than two MZs were delineated, despite the reduction in variance, it was not possible to verify significantly different yields. In Field B, although there were differences in the variance when data were normalized by the standard score and range methods (VR = 2.9% and VR = 6.5%, respectively), average yields were equal. MPE and FPI showed diverse results for each set of normalized data,
Fig. 7. Management zones for Field B, delineated using altitude (m) and SPR 0–0.1 m (kPa) collected in 2012, as input variables, considering the original data (a), and data normalized by
the methods of standard score (b), range (c), and average (d).
Table 5
Evaluation indices calculated for different methods of normalization in each field.
Field N° Zones Normalization Method ANOVA and Tukey's test* VR% FPI MPE SI%
MZ 1 MZ 2 MZ 3 MZ 4
Fig. 8. Statistics of variance reduction index (VR), fuzziness performance index (FPI), and modified partition entropy index (MPE) obtained after clustering process in the Fields A, B, and
C.
Fig. 9. Smoothness Index calculated for Fields A, B, and C, as a function of normalization methods.
Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.compag.2017.10.017.

References

Anderberg, M.R., 1973. Cluster Analysis for Applications. Academic Press, New York, NY.
Arno, J., Martinez-Casasnovas, J.A., Ribes-Dasi, M., Rosell, J.R., 2011. Clustering of grape yield maps to delineate site-specific management zones. Spanish J. Agric. Res. 9, 721–729.
Arslan, H., 2012. Spatial and temporal mapping of groundwater salinity using ordinary kriging and indicator kriging: the case of Bafra Plain, Turkey. Agric. Water Manage. 113, 57–63.
Bansod, B.S., Pandey, O.P., 2013. An application of PCA and fuzzy C-Means to delineate management zones and variability analysis of soil. Eurasian Soil Sci. 46, 556–564.
Bazzi, C.L., Souza, E.G., Konopatzki, M.R., Nobrega, L.H.P., Uribe-Opazo, M.A., 2015. Management zones applied to pear orchard. Int. J. Food Agric. Environm. 13, 86–92.
Bazzi, C.L., Souza, E.G., Uribe-Opazo, M.A., Nobrega, L.H.P., Rocha, D.M., 2013. Management zones definition using soil chemical and physical attributes in a soybean area. Engenharia Agrícola 33, 952–964.
Bezdek, J.C., 1981. Pattern Recognition With Fuzzy Objective Function Algorithms. Springer US, New York, NY.
Bonham, C.D., Reich, R.M., Leader, K.K., 1995. Spatial cross-correlation of Bouteloua gracilis with site factor. Grassland Sci. 41, 196–201.
Boydell, B., McBratney, A.B., 2002. Identifying potential within-field management zones from cotton yield estimates. Precision Agric. 3, 9–23.
Bunselmeyer, H.A., Lauer, J.G., 2015. Using corn and soybean yield history to predict subfield yield response. Agron. J. 107, 558–562.
Burrough, P.A., 1989. Fuzzy mathematical methods for soil survey and land evaluation. Eur. J. Soil Sci. 40, 477–492.
Cohen, J.A., 1960. Coefficient of agreement for nominal scales. Educ. Psychol. Measur. 20, 37–46.
Czaplewski, R.L., Reich, R.M., 1993. Expected value and variance of Moran's bivariate spatial autocorrelation statistic under permutation. Fort Collins, CO: Research Paper.
Dobermann, A., Ping, J.L., Adamchuk, V.I., Simbahan, G.C., Ferguson, R.B., 2003. Classification of crop yield variability in irrigated production fields. Agron. J. 95, 1105–1120.
Doerge, T.A., 2000. Management Zone Concepts. Potash & Phosphate Institute, Norcross, GA.
Embrapa, Brazilian Agricultural Research Corporation, 2006. Brazilian System of Soil Classification. Rio de Janeiro, RJ: CNPSO.
Farid, H.U., Bakhsh, A., Ahmad, N., Ahmad, A., Mahmood-Khan, Z., 2016. Delineating site-specific management zones for precision agriculture. J. Agric. Sci. 154, 273–286.
Fraisse, C.W., Sudduth, K.A., Kitchen, N.R., 2001. Delineation of site-specific management zones by unsupervised classification of topographic attributes and soil electrical conductivity. Trans. ASAE 44, 155–166.
Fridgen, J.J., Kitchen, N.R., Sudduth, K.A., Drummond, S.T., Wiebold, W.J., Fraisse, C.W., 2004. Management zone analyst (MZA): software for subfield management zone delineation. Agron. J. 96, 100–108.
Fu, Q., Wang, Z., Jiang, Q., 2010. Delineating soil nutrient management zones based on fuzzy clustering optimized by PSO. Math. Comput. Modell. 51, 1299–1305.
Gavioli, A., Souza, E.G., Bazzi, C.L., Guedes, L.P.C., Schenatto, K., 2016. Optimization of management zone delineation by using spatial principal components. Comput. Electron. Agric. 127, 302–310.
Hedley, C., 2015. The role of precision agriculture for improved nutrient management on farms. J. Sci. Food Agric. 95, 12–19.
Iliadis, L.S., Vangeloudh, M., Spartalis, S., 2010. An intelligent system employing an enhanced fuzzy C-Means clustering model: application in the case of forest fires. Comput. Electron. Agric. 70, 276–284.
Jaynes, D.B., Colvin, T.S., Kaspar, T.C., 2005. Identifying potential soybean management zones from multi-year yield data. Comput. Electron. Agric. 46, 309–327.
Jaynes, D.B., Kaspar, T.C., Colvin, T.S., James, D.E., 2003. Cluster analysis of spatio-temporal corn yield patterns in an Iowa field. Agron. J. 95, 574–586.
Jipkate, B.R., Gohokar, V.V., 2012. A Comparative analysis of Fuzzy C-Means clustering and K-Means clustering algorithms. Int. J. Comput. Eng. 2, 737–739.
Journel, A.G., Huijbregts, C.J., 1978. Mining Geostatistics. The Blackburn Press, New York, NY.
Kitchen, N.R., Sudduth, K.A., Myers, D.B., Drummond, S.T., Hong, S.Y., 2005. Delineating productivity zones on claypan soil fields using apparent soil electrical conductivity. Comput. Electron. Agric. 46, 285–308.
Landis, J.R., Koch, G.G., 1977. The measurement of observer agreement for categorical data. Biometrics 33, 159–174.
Larscheid, G., Blackmore, B.S., 1996. Interactions between farm managers and information systems with respect to yield mapping. In: Int. Conf. Precision Agric. American Society of Agronomy, Minneapolis, pp. 1153–1163.
Li, Y., Shi, Z., Li, F., Li, H., 2007. Delineation of site-specific management zones using fuzzy clustering analysis in a coastal saline land. Comput. Electron. Agric. 56, 174–186.
Li, Y., Shi, Z., Wu, H., Li, F., Li, H., 2013. Definition of management zones for enhancing cultivated land conservation using combined spatial data. Environ. Manage. 52, 792–806.
Mielke, P.W., Berry, K.J., 2007. Permutation Methods: A Distance Function Approach. Springer, New York, NY.
Milani, L., Souza, E.G., Uribe-Opazo, M.A., Gabriel Filho, A.G., Johann, J.A., Pereira, J.O., 2006. Determination of management zones using yield data. Acta Scientiarum Agron. 28, 591–598.
Milligan, G.W., Cooper, M.C., 1988. A study of standardization of variables in cluster analysis. J. Classif. 5, 181–204.
Mingoti, S.A., Lima, J.O., 2006. Comparing SOM neural network with Fuzzy C-means, K-means and traditional hierarchical clustering algorithms. Eur. J. Oper. Res. 174, 1742–1759.
Molin, J.P., 2002. Definição de unidades de manejo a partir de mapas de produtividade. Engenharia Agrícola 22, 83–92.
Moral, F.J., Terrón, J.M., Silva, J.R.M., 2010. Delineation of management zones using mobile measurements of soil apparent electrical conductivity and multivariate geostatistical techniques. Soil Tillage Res. 106, 335–343.
Odeh, I.O.A., McBratney, A.B., Chittleborough, D.J., 1992. Soil pattern recognition with fuzzy c-means: application to classification and soil-landform interrelationships. Soil Sci. Soc. Am. J. 56, 505–516.
Ortega, R.A., Santibáñez, O.A., 2007. Determination of management zones in corn (Zea mays L.) based on soil fertility. Comput. Electron. Agric. 58, 49–59.
Pedroso, M., Taylor, J., Tisseyre, B., Charnomordic, B., Guillaume, S., 2010. A segmentation algorithm for the delineation of agricultural management zones. Comput. Electron. Agric. 70, 199–208.
Peralta, N., Costa, J.L., Castro, F., Balzarini, M., 2013a. Delimitación de zonas de manejo con modelos de elevación digital y profundidad de suelo. Interciencia 38, 418–424.
Peralta, N.R., Costa, J.L., 2013. Delineation of management zones with soil apparent electrical conductivity to improve nutrient management. Comput. Electron. Agric. 99, 218–226.
Reich, R.M., Czaplewski, R.L., Bechtold, W.A., 1994. Spatial cross-correlation of undisturbed, natural shortleaf pine stands in northern Georgia. Environ. Ecol. Stat. 1, 201–217.
Rodrigues Junior, F.A., Vieira, L.B., Queiroz, D.M., Santos, N.T., 2011. Geração de zonas de manejo para cafeicultura empregando-se sensor SPAD e análise foliar. Revista Brasileira de Engenharia Agrícola e Ambiental 15, 778–787.
Romesburg, H.C., 1984. Cluster Analysis for Researchers. Lifetime Learning Publications, Belmont, CA, 333 pp.
Schenatto, K., Souza, E.G., Bazzi, C.L., Bier, V.A., Betzek, N.M., Gavioli, A., 2016. Data Interpolation in the definition of management zones. Acta Scientiarum Technol. 38, 31–40.
Schepers, A.R., Shanahan, J.F., Liebig, M.A., Schepers, J.S., Johnson, S.H., Luchiari, A., 2004. Appropriateness of management zones for characterizing spatial variability of soil properties and irrigated corn yields across years. Agron. J. 96, 195–203.
Stafford, J.V., Ambler, B., Lark, R.M., Catt, J., 1996. Mapping and interpreting the yield variation in cereal crops. Comput. Electron. Agric. 14, 101–119.
Sun, Y., Kang, S., Li, F., Zhang, L., 2009. Comparison of interpolation methods for depth to groundwater and its temporal and spatial variations in the Minqin oasis of northwest China. Environm. Modell. Softw. 24, 1163–1170.
Suszek, G., Souza, E.G., Uribe-Opazo, M.A., Nobrega, L.H.P., 2011. Determination of management zones from normalized and standardized equivalent productivity maps in the soybean culture. Engenharia Agrícola 31, 895–905.
Tagarakis, A., Liakos, V., Fountas, S., Koundouras, S., Gemtos, T.A., 2013. Management zones delineation using fuzzy clustering techniques in grapevines. Precision Agric. 14, 18–39.
Tilman, D., Balzer, C., Hill, J., Befort, B., 2011. Global food demand and the sustainable intensification of agriculture. Proc. Natl. Acad. Sci. 108, 20260–20264.
Valente, D.S.M., Queiroz, D.M., Pinto, F.A.C., Santos, N.T., Santos, F.L., 2012. Definition of management zones in coffee production fields based on apparent soil electrical conductivity. Scientia Agricola 69, 173–179.
Xiang, L., Pan, Y., Ge, Z., Zhao, C., 2007. Delineation and scale effect of precision agriculture management zones using yield monitor data over four years. Agric. Sci. Chin. 6, 180–188.
Zhang, Z., Lü, X., Lv, N., Chen, J., Feng, B., Li, X.W., 2013. Defining agricultural management zones using GIS techniques: Case study of Drip-irrigated cotton fields. Inform. Technol. J. 12, 6241–6246.