978-1-5386-7150-4/18/$31.00 ©2018 IEEE, IGARSS 2018
Authorized licensed use limited to: TU Delft Library. Downloaded on July 23,2020 at 14:11:27 UTC from IEEE Xplore. Restrictions apply.
[Figure 1 (diagram): the training subset (x1, x2, ..., xn with y) is used to train the input SOM (unsupervised, M × N × n) and the output SOM (supervised, M × N × 1). The test subset (x1, x2, ..., xn) is passed through both SOMs to obtain y_estimation, which is compared with y to yield the evaluation score.]
Fig. 1: Evaluation schema of the proposed SOM framework. The dataset is split into a training and a test subset. The solid lines represent the training of the framework and the dashed lines represent the test and evaluation part. The evaluation outcome is a score such as the coefficient of determination $R^2$.
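The flow in Fig. 1 can be illustrated with a minimal sketch that follows the four steps of Section 2.1 and the modified BMU search of Section 2.2. It is not the authors' implementation. The function names, the linear decay of $\sigma(t)$ and $\alpha(t)$, the Euclidean grid distance used for $d_{ci}$, the uniform random initialization in [0, 1), the default hyperparameter values and the toy data are all assumptions made for illustration.

```python
import math
import random


def find_bmu(som, x):
    """Step 2: grid index (row, col) of the node closest to x (squared Euclidean distance)."""
    best_index, best_dist = (0, 0), float("inf")
    for r, row in enumerate(som):
        for c, w in enumerate(row):
            dist = sum((wk - xk) ** 2 for wk, xk in zip(w, x))
            if dist < best_dist:
                best_index, best_dist = (r, c), dist
    return best_index


def linear_decay(start, end, t, t_max):
    """Assumed schedule: decrease linearly from start (t = 0) to end (t = t_max - 1)."""
    return start + (end - start) * t / max(t_max - 1, 1)


def adapt(som, center, target, sigma, alpha):
    """Steps 3 and 4: Eq. (1) with the neighborhood function of Eq. (2), applied to every node."""
    for r, row in enumerate(som):
        for c, w in enumerate(row):
            d2 = (r - center[0]) ** 2 + (c - center[1]) ** 2  # assumed map distance, squared
            h = alpha * math.exp(-d2 / (2.0 * sigma ** 2))
            for k in range(len(w)):
                w[k] += h * (target[k] - w[k])


def train_framework(X, y, rows, cols, iters_in, iters_out,
                    sigma=(3.0, 0.5), alpha=(0.5, 0.05), seed=0):
    """Train the input SOM (unsupervised) and then the output SOM (supervised)."""
    rng = random.Random(seed)
    n = len(X[0])
    # Step 1: random initialization (n-dimensional input nodes, 1-dimensional output nodes).
    som_in = [[[rng.random() for _ in range(n)] for _ in range(cols)] for _ in range(rows)]
    som_out = [[[rng.random()] for _ in range(cols)] for _ in range(rows)]
    for t in range(iters_in):
        x = rng.choice(X)
        adapt(som_in, find_bmu(som_in, x), x,
              linear_decay(sigma[0], sigma[1], t, iters_in),
              linear_decay(alpha[0], alpha[1], t, iters_in))
    # The input SOM is now fixed; the BMU for the output SOM is searched in the input SOM.
    for t in range(iters_out):
        i = rng.randrange(len(X))
        adapt(som_out, find_bmu(som_in, X[i]), [y[i]],
              linear_decay(sigma[0], sigma[1], t, iters_out),
              linear_decay(alpha[0], alpha[1], t, iters_out))
    return som_in, som_out


def estimate(som_in, som_out, x):
    """Estimate y for x: look up the BMU in the input SOM and read the output SOM there."""
    r, c = find_bmu(som_in, x)
    return som_out[r][c][0]


# Toy usage with two artificial datapoints (not the soil-moisture dataset).
X_toy = [[0.0, 0.0], [1.0, 1.0]]
y_toy = [0.1, 0.9]
som_in, som_out = train_framework(X_toy, y_toy, rows=5, cols=5, iters_in=400, iters_out=800)
print(estimate(som_in, som_out, [0.0, 0.0]), estimate(som_in, som_out, [1.0, 1.0]))
```

In this sketch every node is updated in each iteration, and the Gaussian factor of Eq. (2) makes the update negligible for nodes far from the BMU. A hard cut-off at the neighborhood radius would be an equally plausible reading of step 3.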
randomly through the dataset, and the batch algorithm which relies on the entire dataset in every iteration. In the following, we implement the online algorithm [7, 8] without loss of generality. Its structure is defined by four steps:

1. Random initialization of the weight vectors.

2. Calculation of the best matching unit (BMU) on the map by searching for the node with the minimal Euclidean distance to the current input vector.

3. Calculation of the BMU's neighborhood on the map based on a decreasing neighborhood radius $\sigma(t)$ with $\sigma_{\min} \le \sigma(t) \le \sigma_{\max}$ and the current iteration $t$.

4. Adaptation of all neighboring nodes to the input vector based on a decreasing learning rate $\alpha(t)$ with $0 \le \alpha(t) \le 1$. The weight vector $w_i(t)$ of node $i$ at iteration $t$ is adapted as follows:

\[ w_i(t+1) = w_i(t) + h_{ci}(t) \cdot \left( x(t) - w_i(t) \right). \tag{1} \]

We choose a pseudo-Gaussian neighborhood function $h_{ci}(t)$ defined as

\[ h_{ci}(t) = \alpha(t) \cdot \exp\left( -\frac{d_{ci}^2}{2 \cdot \sigma(t)^2} \right), \tag{2} \]

with the map distance $d_{ci}$ between the nodes $c$ and $i$.

Steps 2 to 4 are repeated with randomly selected input vectors until the maximum number of iterations is reached. This completes the unsupervised learning process. From this point on, the input SOM remains unchanged.

2.2. Supervised output SOM

The output SOM forms the second part of the framework and is trained in a supervised manner. Its shape is identical to the shape of the input SOM. The weight vectors of the output SOM are one-dimensional, in contrast to the n-dimensional weight vectors of the input SOM, and contain soil-moisture values. The algorithm of the output SOM proceeds similarly to the input SOM algorithm described in Section 2.1. Only the second step differs: the search for the BMU is performed within the finalized input SOM. This BMU is then used for steps 3 and 4 of the output SOM algorithm. As before, the algorithm stops when the maximum number of iterations is reached. This completes the supervised learning part and the training of the whole SOM framework. After the training, the framework is able to estimate, e.g., soil-moisture values based on hyperspectral input data.

3. SOIL MOISTURE BENCHMARK DATASET

The dataset [1] used to benchmark the introduced SOM framework was measured in a field campaign comprising five measurement days over two weeks in May 2017. We chose a field campaign to ensure real-world conditions and transferability. By conducting a dedicated campaign, we are able to control, e.g., the sensor settings and the weather conditions (daytime, clear sky). As a result, we obtain a well-defined dataset. In future work, we will apply the SOM framework to existing datasets to demonstrate its general applicability.

An undisturbed soil sample is the centerpiece of our measurement setup. The soil sample consists of bare soil without any vegetation and was taken in the area near Waldbronn, Germany. The soil type is classified as strongly clayey silt. The soil sample has a radius of 15 cm and a height of 20 cm. It is irrigated according to a defined scheme. The resulting soil moisture ranges from 25 % to 42 % (cf. Fig. 2b). The soil moisture is measured with time-domain reflectometry (TDR) sensors that are mounted parallel to the
soil surface at depths from 2 cm to 18 cm. For our studies, we use the uppermost sensor at a depth of 2 cm since it provides the best estimate of the subsurface soil moisture.

The hyperspectral remote sensing data is captured with a Cubert¹ UHD 285 hyperspectral snapshot camera. The hyperspectral images consist of 50 × 50 pixels. Each pixel is characterized by n = 125 spectral channels between 450 nm and 950 nm with a spectral resolution of 4 nm. Fig. 2a shows an example of a hyperspectral image.

[Figure 2 panels: (a) sensor columns vs. sensor rows, color scale: sensor counts; (b) soil moisture in % vs. percentage of datapoints.]
Fig. 2: (a) Example of a hyperspectral image at the wavelength 950 nm recorded by the Cubert UHD 285. The circle-like pattern in the image is the soil sample, and the spectralon is partly visible in the upper left. (b) Histogram and box plot of the measured soil-moisture values.

The camera is installed on a tripod at a height of 1.70 m. The field of view contains the entire soil sample as well as parts of the spectralon. For each image, one datapoint is calculated as the mean spectrum of the soil surface and calibrated with the spectralon spectrum. No further preprocessing or filtering is applied to the data. The dataset consists of 679 high-dimensional datapoints with 125 hyperspectral bands and one soil-moisture value as ground truth. The distribution of the latter is shown in Fig. 2b.

4. RESULTS AND EVALUATION

We describe the setup of the hyperparameters of the SOM framework in Section 4.1 and compare the performance of the SOM framework with other regressors in Section 4.2.

4.1. Impact of the SOM hyperparameters

Hyperparameters are parameters which are set before the training process of an estimator. We examine the impact of these hyperparameters on the regression performance with a grid-search approach. Examples of such hyperparameters are the maximum number of iterations and the size of the SOM. Regarding the former, we find that the number of output SOM iterations has a larger impact than the number of input SOM iterations. In general, the regression error of the SOM framework decreases, firstly, with increasing size of the SOM and, secondly, with increasing number of iterations. On the other hand, increasing these hyperparameters leads to a time-consuming training of the SOM framework. Therefore, we propose a SOM size that minimizes the training time while maintaining good regression performance. In summary, each choice of SOM hyperparameters affects the regression results and requires detailed analysis.

In the following, the SOM framework is set up on a rectangular 50 × 50 grid² with 5000 iterations of the input SOM and 10 000 iterations of the output SOM. Fig. 3a shows the distribution of the training datapoints on the grid after the input SOM training. The nearly uniform distribution on the grid indicates that the input SOM has adapted consistently to the training subset.

Fig. 3b presents the soil-moisture distribution of the output SOM. The distribution is characterized by several smooth areas with similar soil-moisture values. An additional finding is that the closer $\sigma_{\min}$ is to zero, the smoother the transitions between the areas with similar soil-moisture values of the output SOM become.

[Figure 3 panels: (a) SOM columns vs. SOM rows, color scale: number of datapoints; (b) SOM columns vs. SOM rows, color scale: soil moisture in %.]
Fig. 3: (a) Datapoint distribution from the training subset per input SOM node. (b) Soil-moisture distribution of the output SOM. The horizontal and vertical axes show the columns and rows of the map, respectively.

4.2. Assessment of the SOM performance

For the following evaluation, the dataset is split into a training subset with 339 datapoints and a test subset with 340 datapoints. All regressors are trained on the training subset and perform the estimation on the test subset. This procedure limits the influence of overfitting on the reported results.

The SOM framework is compared to two standard regressors, RF [9] and SVR [10, 11]. Both are implemented in scikit-learn [12]. The hyperparameters of SVR are optimized with a grid search approach leading to the hyperparameters C = 100

¹ Cubert GmbH, Ulm, Germany
² The choice of this SOM size is independent of the hyperspectral image size. Coincidentally, both sizes are equal.
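The holdout evaluation of Section 4.2 and the two scores reported in Table 1 can be sketched as follows. This is an illustration only: the helper names are hypothetical, the random shuffling is an assumption (the text does not state how the 339/340 split was drawn), and the three example values are invented to show the arithmetic. Note that Table 1 reports $R^2$ in percent, i.e. the value below multiplied by 100.

```python
import math
import random


def split_dataset(n_samples, n_train, seed=0):
    """Shuffle the sample indices and split them into training and test indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random assignment to the subsets
    return indices[:n_train], indices[n_train:]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the unit of y (here: % soil moisture)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Split sizes as stated in the text: 679 datapoints, 339 for training.
train_idx, test_idx = split_dataset(679, 339)
print(len(train_idx), len(test_idx))  # 339 340

# Invented example values to show how the scores are computed.
y_true = [30.0, 34.0, 38.0]
y_pred = [31.0, 34.0, 37.0]
print(r2_score(y_true, y_pred))  # 0.9375
print(rmse(y_true, y_pred))
```

Any regressor, including the SOM framework, RF and SVR, would be fitted on the datapoints in `train_idx` and scored with both functions on the datapoints in `test_idx`.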
Table 1: Results of the evaluation on the test subset with the coefficient of determination $R^2$ and RMSE.

Model   $R^2$ in %   RMSE in % soil moisture
SOM     96.78        0.66
SVR     96.03        0.74
RF      93.60        0.94

6. ACKNOWLEDGEMENTS

We thank Hubert B. Keller for his ideas on the SOM framework, Stefan Hinz as well as Conrad Jackisch for the technical support with the TDR sensors and his help with the field measurements.

7. REFERENCES