6151 978-1-5386-7150-4/18/$31.00 ©2018 Ieee Igarss 2018

INTRODUCING A FRAMEWORK OF SELF-ORGANIZING MAPS FOR REGRESSION OF SOIL MOISTURE WITH HYPERSPECTRAL DATA

Felix M. Riese, Sina Keller

Karlsruhe Institute of Technology (KIT)


Institute of Photogrammetry and Remote Sensing (IPF),
Englerstraße 7, D-76131 Karlsruhe, Germany

ABSTRACT

In this paper, we introduce a framework to solve regression problems based on high-dimensional and small datasets. This framework involves two self-organizing maps (SOM) and combines unsupervised with supervised learning. We investigate the impacts of SOM hyperparameters on the regression performance and compare the results of the SOM framework with two established regressors on a measured dataset. The derived results reveal the potential of the SOM framework. Finally, we propose further research aspects for the SOM framework to analyze its capabilities and limitations. We have published our dataset in [1] to ensure the reproducibility of the results.

Index Terms: Self-organizing maps, machine learning, regression, hyperspectral data, soil moisture

1. INTRODUCTION

Hyperspectral remote sensing provides non-invasive techniques to monitor ecological and hydrological processes via the soil surface. With the appropriate combination of spatial coverage as well as spatial and temporal resolution, it closes gaps that are left by in situ point measurements. The modeling of these processes with hyperspectral data is a high-dimensional and non-linear problem. In the context of modeling subsurface soil moisture, the input datasets are of limited size since the data mostly derive from field campaigns or laboratory measurements.

In this paper, we introduce a framework which addresses the regression of high-dimensional datasets. As an example, this framework is applied to a measured dataset to estimate soil-moisture values. We focus on an entirely data-driven approach without model biases and without the engineering of features by combining spectral bands (e.g. see [2]). Two well-established data-driven methods for regression problems based on hyperspectral data are Random Forest (RF) and Support Vector Regressor (SVR).

The introduced framework involves two self-organizing maps (SOM). The majority of SOM implementations perform as unsupervised learning approaches with the objective to cluster or to visualize data. Most of the supervised learning approaches involving SOMs and using hyperspectral data solve classification problems [3, 4, 5]. A SOM-based modeling for non-linear regression was recently applied in the field of robotics [6]. In the proposed framework, SOMs solve a regression problem of modeling subsurface soil moisture solely with hyperspectral data by combining unsupervised and supervised machine learning. This approach is model- and data-independent.

We explain the individual components of our framework for the regression of hyperspectral data in Section 2. Subsequently, we describe the self-recorded dataset used for the evaluation in Section 3 and compare the results of the SOM framework with those of two established regression methods in Section 4. Finally, we conclude our studies in Section 5 and give an overview of future applications of SOM-based modeling in the field of remote sensing.

2. SOM FRAMEWORK FOR REGRESSION

Kohonen introduced self-organizing maps (SOM) in the early 1980s; a comprehensive description is given in [7]. They have been applied since then in various ways, mostly for the visualization of a dataset, unsupervised clustering as input for other estimators, and simple classification. In this paper, we present our idea of a framework consisting of two SOMs, described as the input and the output SOM (see Fig. 1), to solve regression problems based on high-dimensional input data. The following two subsections describe the training process of the SOM framework.

2.1. Unsupervised input SOM

The input SOM is the first part of the presented framework and is trained in an unsupervised manner. It clusters the input data without ground-truth information. To estimate soil-moisture values, the hyperspectral data serve as the sole SOM input. A SOM of the size M × N consists of M · N neurons, called nodes. Each node is characterized by two different attributes: a 2D position on the map and a weight vector with the same dimension n as the input vector. Two main algorithms for SOMs exist: the online algorithm which iterates

978-1-5386-7150-4/18/$31.00 ©2018 IEEE 6151 IGARSS 2018

Authorized licensed use limited to: TU Delft Library. Downloaded on July 23,2020 at 14:11:27 UTC from IEEE Xplore. Restrictions apply.
[Figure 1: diagram of the SOM framework. Labels: training subset (x1, x2, ..., xn; y), test subset (x1, x2, ..., xn; y), input SOM (unsupervised, M × N × n), output SOM (supervised, M × N × 1), training, test, y estimation, evaluation score.]

Fig. 1: Evaluation schema of the proposed SOM framework. The dataset is split into a training and a test subset. The solid lines
represent the training of the framework and the dashed lines represent the test and evaluation part. The evaluation outcome is a
score such as the coefficient of determination $R^2$.
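The evaluation schema of Fig. 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code: the helper names `split_indices`, `r2_score`, `rmse` and `evaluate` are hypothetical, and `MeanRegressor` is only a placeholder estimator standing in for the SOM framework or any other regressor with `fit` and `predict` methods.

```python
import math
import random


def split_indices(n_samples, n_train, seed=0):
    """Shuffle the sample indices and split them into a training and a test part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between measured and estimated values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


class MeanRegressor:
    """Placeholder estimator (not the SOM framework): predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def evaluate(model, X, y, n_train, seed=0):
    """Train on the training subset, estimate on the test subset, return (R^2, RMSE)."""
    train, test = split_indices(len(X), n_train, seed)
    model.fit([X[i] for i in train], [y[i] for i in train])
    y_test = [y[i] for i in test]
    y_est = model.predict([X[i] for i in test])
    return r2_score(y_test, y_est), rmse(y_test, y_est)
```

With the subset sizes reported in Section 4.2, `split_indices(679, 339)` yields 339 training and 340 test indices. The scores are computed on the test subset only, so that the training data never enter the evaluation.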

randomly through the dataset, and the batch algorithm which relies on the entire dataset in every iteration. Without loss of generality, we have implemented the online algorithm [7, 8]. Its structure is defined by four steps:

1. Random initialization of the weight vectors.

2. Calculation of the best matching unit (BMU) on the map by searching for the node with the minimal Euclidean distance to the current input vector.

3. Calculation of the BMU's neighborhood on the map based on a decreasing neighborhood radius $\sigma(t)$ with $\sigma_{\min} \le \sigma(t) \le \sigma_{\max}$ and the current iteration $t$.

4. Adaptation of all neighboring nodes to the input vector based on a decreasing learning rate $\alpha(t)$ with $0 \le \alpha(t) \le 1$. The weight vector $w_i(t)$ of node $i$ at iteration $t$ is adapted as follows:

$$w_i(t+1) = w_i(t) + h_{ci}(t) \cdot \bigl(x(t) - w_i(t)\bigr). \tag{1}$$

We choose a pseudo-Gaussian neighborhood function $h_{ci}(t)$ defined as

$$h_{ci}(t) = \alpha(t) \cdot \exp\left(-\frac{d_{ci}^2}{2 \cdot \sigma(t)^2}\right), \tag{2}$$

with the map distance $d_{ci}$ between the nodes $c$ and $i$.

Steps 2 to 4 are repeated with randomly selected input vectors until the maximum number of iterations is reached. This completes the unsupervised learning process. From this point on, the input SOM remains unchanged.

2.2. Supervised output SOM

The output SOM is the second part of the framework and is trained in a supervised manner. Its shape is identical to the shape of the input SOM. The weight vectors of the output SOM are one-dimensional, in contrast to the n-dimensional weight vectors of the input SOM, and contain soil-moisture values. The algorithm of the output SOM proceeds similarly to the input SOM algorithm described in Section 2.1, but the second step differs: the search for the BMU is performed within the finalized input SOM. This BMU is used for steps 3 and 4 of the output SOM algorithm. As before, the algorithm stops when reaching the maximum number of iterations. This completes the supervised learning part and the training of the whole SOM framework. After the training, the framework is able to estimate, e.g., soil-moisture values based on hyperspectral input data.

3. SOIL MOISTURE BENCHMARK DATASET

The dataset [1] to benchmark the introduced SOM framework was measured during a field campaign with five measurement days spread over two weeks in May 2017. We chose a field campaign to ensure real-world conditions and transferability. By conducting a dedicated campaign, we were able to control, e.g., the sensor settings and to select the weather conditions (daytime, clear sky). As a result, we obtain a well-defined and sophisticated dataset. In future work, we will apply our SOM framework to existing datasets to demonstrate its generic capability.

An undisturbed soil sample is the centerpiece of our measurement setup. The soil sample consists of bare soil without any vegetation and was taken in the area near Waldbronn, Germany. The composition type is defined as strongly clayey silt. The soil sample has a radius of 15 cm and a height of 20 cm. It is irrigated according to a defined schema. The resulting variation of soil moisture ranges from 25 % to 42 % (cf. Fig. 2b). The soil moisture is measured with time-domain reflectometry (TDR) sensors that are mounted parallel to the

[Figure 2: panels (a) and (b). Axis labels: Sensor columns, Sensor rows, Sensor counts; Soil moisture in %, Number of datapoints, Percentage of datapoints.]

Fig. 2: (a) Example of a hyperspectral image at the wavelength 950 nm recorded by the Cubert UHD 285. The circle-like pattern in the image illustrates the soil sample and the spectralon is shown partly in the upper left. (b) Histogram and box plot of the measured soil-moisture values.

[Figure 3: panels (a) and (b). Axis labels: SOM columns, SOM rows, Soil moisture in %.]

Fig. 3: (a) Datapoint distribution from the training subset per input SOM node. (b) Soil-moisture distribution of the output SOM. The horizontal and vertical axes show the columns and rows of the map, respectively.
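The two maps of Fig. 3 are the products of the trained input and output SOM. The following is a minimal sketch of how such maps could be computed with the online algorithm of Section 2, not the authors' implementation. Several points are assumptions: the linear decay of $\sigma(t)$ and $\alpha(t)$, the update of all nodes weighted by Eq. (2) instead of a hard neighborhood cutoff, the estimation rule in `estimate` (read the output SOM weight at the BMU of the input SOM), the small map size, and all function names. The data are synthetic and only stand in for the benchmark dataset.

```python
import math
import random


def find_bmu(weights, x):
    """Step 2: grid position of the node with minimal Euclidean distance to x."""
    best_pos, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wj - xj) ** 2 for wj, xj in zip(w, x))
            if dist < best_dist:
                best_pos, best_dist = (r, c), dist
    return best_pos


def linear_decay(start, end, t, t_max):
    """Assumed schedule: decrease linearly from start (t = 0) to end (t = t_max - 1)."""
    return start + (end - start) * t / max(t_max - 1, 1)


def neighborhood(bmu, rows, cols, sigma, alpha):
    """Eq. (2): h_ci = alpha * exp(-d_ci^2 / (2 * sigma^2)) for every node i."""
    return [[alpha * math.exp(-((r - bmu[0]) ** 2 + (c - bmu[1]) ** 2) / (2 * sigma ** 2))
             for c in range(cols)] for r in range(rows)]


def train_som_framework(X, y, rows=5, cols=5, it_in=200, it_out=400,
                        sigma=(2.0, 0.5), alpha=(0.5, 0.01), seed=0):
    rng = random.Random(seed)
    n = len(X[0])
    # Step 1: random initialization of both maps.
    w_in = [[[rng.random() for _ in range(n)] for _ in range(cols)] for _ in range(rows)]
    w_out = [[rng.random() for _ in range(cols)] for _ in range(rows)]

    # Unsupervised input SOM: steps 2 to 4 use the spectra only.
    for t in range(it_in):
        x = X[rng.randrange(len(X))]
        h = neighborhood(find_bmu(w_in, x), rows, cols,
                         linear_decay(*sigma, t, it_in), linear_decay(*alpha, t, it_in))
        for r in range(rows):
            for c in range(cols):
                w_in[r][c] = [wj + h[r][c] * (xj - wj) for wj, xj in zip(w_in[r][c], x)]

    # Supervised output SOM: the BMU is searched in the finalized input SOM,
    # while the update of Eq. (1) acts on the one-dimensional target weights.
    for t in range(it_out):
        i = rng.randrange(len(X))
        h = neighborhood(find_bmu(w_in, X[i]), rows, cols,
                         linear_decay(*sigma, t, it_out), linear_decay(*alpha, t, it_out))
        for r in range(rows):
            for c in range(cols):
                w_out[r][c] += h[r][c] * (y[i] - w_out[r][c])
    return w_in, w_out


def estimate(w_in, w_out, x):
    """Assumed estimation rule: output SOM weight at the BMU of the input SOM."""
    r, c = find_bmu(w_in, x)
    return w_out[r][c]


def hit_map(w_in, X):
    """Number of datapoints per input SOM node (the quantity shown in Fig. 3a)."""
    counts = [[0] * len(w_in[0]) for _ in w_in]
    for x in X:
        r, c = find_bmu(w_in, x)
        counts[r][c] += 1
    return counts


# Synthetic stand-in data (not the benchmark dataset): 3 "bands", target in [0, 1).
data_rng = random.Random(1)
targets = [data_rng.random() for _ in range(60)]
spectra = [[v, v ** 2, 1.0 - v] for v in targets]
w_in, w_out = train_som_framework(spectra, targets)
counts = hit_map(w_in, spectra)
```

Here `counts` corresponds to the datapoint distribution per input SOM node and `w_out` to the soil-moisture distribution of the output SOM. Because every update is a convex combination of the old weight and the target, the output weights stay within the range spanned by the initial weights and the targets.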
soil surface at various depths from 2 cm to 18 cm. For our studies, we use the uppermost sensor at a depth of 2 cm since it provides the best estimate of the subsurface soil moisture.

The hyperspectral remote sensing data is captured with a Cubert UHD 285 hyperspectral snapshot camera (Cubert GmbH, Ulm, Germany). The hyperspectral images consist of 50 × 50 pixels. Each pixel is characterized by n = 125 spectral channels between 450 nm and 950 nm with a spectral resolution of 4 nm. Fig. 2a shows an example of a hyperspectral image.

The camera is installed on a tripod at 1.70 m height. The field of view contains the entire soil sample as well as parts of the spectralon. For each image, one datapoint is calculated as a mean spectrum of the soil surface and calibrated with the spectralon spectrum. No further preprocessing or filtering is applied to the data. The dataset consists of 679 high-dimensional datapoints with 125 hyperspectral bands and one soil-moisture value as ground truth. The distribution of the latter is shown in Fig. 2b.

4. RESULTS AND EVALUATION

We describe the setup of the hyperparameters of the SOM framework in Section 4.1 and compare the performance of the SOM framework with other regressors in Section 4.2.

4.1. Impact of the SOM hyperparameters

Hyperparameters are parameters which are set before the training process of an estimator. We examine the impacts of these hyperparameters on the regression performance with a grid-search approach. Examples of such hyperparameters are the maximum number of iterations and the size of the SOM. Regarding the former, we find that the number of output SOM iterations has a more significant impact than the number of input SOM iterations. In general, the regression error of the SOM framework decreases, firstly, with increasing size of the SOM and, secondly, with increasing number of iterations. On the other hand, increasing these hyperparameters leads to a time-consuming training of the SOM framework. Therefore, we propose a SOM size that keeps the training time low while still yielding a well-performing regression. In summary, each choice of SOM hyperparameters affects the regression results and requires detailed analysis.

In the following, the SOM framework is set up on a rectangular 50 × 50 grid with 5000 iterations of the input SOM and 10 000 iterations of the output SOM. The choice of this SOM size is independent of the hyperspectral image size; coincidentally, both sizes are equal. Fig. 3a shows the datapoint distribution on the grid of the training subset after the input SOM training. The nearly uniform distribution on the grid verifies the consistent adaptation of the input SOM to the training subset.

Fig. 3b presents the soil-moisture distribution of the output SOM. The distribution is characterized by several smooth areas with similar soil-moisture values. An additional finding is that the closer $\sigma_{\min}$ is to zero, the smoother the transition between the areas with similar soil-moisture values of the output SOM becomes.

4.2. Assessment of the SOM performance

For the following evaluation, the dataset is split into a training subset with 339 datapoints and a test subset with 340 datapoints. All regressors are trained on the training subset and perform the estimation on the test subset. Evaluating on a separate test subset prevents overfitting from inflating the reported scores.

The SOM framework is compared to two standard regressors, RF [9] and SVR [10, 11]. Both are implemented in scikit-learn [12]. The hyperparameters of SVR are optimized with a grid search approach leading to the hyperparameters C = 100

and γ = 100. The RF is implemented with the default parameters and 1000 estimators. The coefficient of determination ($R^2$) and the root mean square error (RMSE) serve as the quality metrics. The results of the regression performed on the dataset introduced in Section 3 are shown in Table 1.

Table 1: Results of the evaluation on the test subset with the coefficient of determination $R^2$ and RMSE.

Model    $R^2$ in %    RMSE in % soil moisture
SOM      96.78         0.66
SVR      96.03         0.74
RF       93.60         0.94

The SOM framework outperforms the two regressors in both quality metrics. The differences are marginal: all three regressors achieve an $R^2$ score of more than 93 %. Two important aspects have to be considered to interpret these results. First, the tuning of the SOM framework is time-consuming and can be extended to all hyperparameters. We expect the SOM performance to increase after applying intensive tuning techniques. At the same time, the RF regression results are expected to remain similar after tuning the RF regressor. Second, the SOM framework performs a model-independent regression: no model bias such as physical relations affects the regression results.

5. CONCLUSION

We introduce a SOM framework that combines unsupervised with supervised machine learning to solve regression problems based on high-dimensional data. We compare the regression results of the SOM framework with RF and SVR on a measured benchmark dataset for the soil-moisture regression based on hyperspectral data. The SOM framework already outperforms both regressors with basic tuning techniques and without any feature selection. Many tuning possibilities exist to further enhance the SOM framework: for example, a more effective initialization based on a principal component analysis, more sophisticated distance measures than the Euclidean distance, or further neighborhood and learning rate functions.

SOMs cluster the input data in a manner that can provide an enhanced understanding of the input data and the underlying model. Moreover, the SOM framework seems to be able to handle incomplete or corrupt datasets for regression. Consequently, we propose further detailed research on the basis of this SOM framework to evaluate its prospects in the context of high-dimensional remote sensing data and also to identify its limitations. We consider publishing the SOM framework code and the benchmark dataset in future studies.

6. ACKNOWLEDGEMENTS

We thank Hubert B. Keller for his ideas on the SOM framework, Stefan Hinz as well as Conrad Jackisch for the technical support with the TDR sensors and his help with the field measurements.

7. REFERENCES

[1] Felix M. Riese and Sina Keller, “Hyperspectral benchmark dataset on soil moisture,” doi.org/10.5281/zenodo.1227837, 2018.
[2] S. Fabre, X. Briottet, and A. Lesaignoux, “Estimation of soil moisture content from the spectral reflectance of bare soils in the 0.4 − 2.5 µm domain,” Sensors (MDPI), vol. 15, pp. 3262–3281, 2015.
[3] P. Martinez, J.A. Gualtieri, P.L. Aguilar, A. Plaza, R.M. Pérez, and J.C. Preciado, “Hyperspectral image classification using a self-organizing map,” in Proceedings of the Tenth JPL Airborne Earth Science Workshop, 2001, vol. 10, pp. 267–274.
[4] N. Zaccarelli, G. Zurlini, G. Rizzo, E. Blasi, and M. Palazzo, Sensors for Environmental Control, chapter Spectral Self-Organizing Map for hyperspectral image classification, pp. 218–223, World Scientific Publishing Company Incorporated, 2003.
[5] Y. Zhong, L. Zhang, B. Huang, and P. Li, “An unsupervised artificial immune classifier for multi/hyperspectral remote sensing imagery,” IEEE TGRS, vol. 44, pp. 420–431, 2006.
[6] T. Hecht, M. Lefort, and A. Gepperth, “Using self-organizing maps for regression: the importance of the output function,” in ESANN, 2015, pp. 107–112.
[7] T. Kohonen, “The self-organizing map,” Proceedings of the IEEE, vol. 78, no. 9, pp. 1464–1480, 1990.
[8] T. Kohonen, “Essentials of the self-organizing map,” Neural Networks, vol. 37, pp. 52–65, 2013.
[9] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[10] V.N. Vapnik, The Nature of Statistical Learning Theory, 1995.
[11] G. Camps-Valls, L. Bruzzone, J.L. Rojo-Álvarez, and F. Melgani, “Robust support vector regression for biophysical variable estimation from remotely sensed images,” IEEE GRSL, vol. 3, pp. 339–343, 2006.
[12] T. E. Oliphant, “Python for scientific computing,” Computing in Science & Engineering, vol. 9, no. 3, pp. 10–20, 2007.
