Seismic Inversion by Newtonian Machine Learning: Yuqing Chen and Gerard T. Schuster
King Abdullah University of Science and Technology, Department of Earth Science and Engineering, Thuwal 23955-6900, Saudi Arabia. E-mail: yuqing.chen@kaust.edu.sa (corresponding author); gerard.schuster@kaust.edu.sa.
10.1190/GEO2019-0434.1
INTRODUCTION

Full-waveform inversion (FWI) has been shown to accurately invert seismic data for high-resolution velocity models (Lailly and Bednar, 1983; Tarantola, 1984; Virieux and Operto, 2009). However, the success of FWI heavily relies on an initial model that is close to the true model; otherwise, cycle-skipping problems will trap the FWI in a local minimum (Bunks et al., 1995).

To mitigate the cycle-skipping problem, Bunks et al. (1995) propose a multiscale inversion approach that initially inverts low-pass-filtered seismic data and then gradually admits higher frequencies as the iterations proceed. AlTheyab and Schuster (2015) remove the mid- and far-offset cycle-skipped seismic traces before inversion and gradually incorporate them into the iterative solutions as the inverted velocity model becomes closer to the true model. Alternatively, Wu et al. (2014) use the envelope of the seismic traces to invert for the subsurface model because they claim that the envelope carries the ultra-low-frequency information of the seismic data. Ha and Shin (2012) invert the data in the Laplace domain, which is less sensitive to the lack of low frequencies than conventional FWI. Sun and Schuster (1993), Fu et al. (2018), and Chen et al. (2019) use an amplitude replacement method to focus the inversion on reducing the phase mismatch instead of the waveform mismatch. In addition, they use a multiscale approach by temporally integrating the traces to boost the low frequencies and mitigate cycle-skipping problems, and then they gradually introduce the higher frequencies as the iterations proceed.
Nonlinear inversion often gets stuck in a local minimum because the objective function is very complex and is characterized by many local minima. To avoid this problem, Luo and Schuster (1991a, 1991b) suggest a skeletonized inversion method that combines the skeletonized representation of seismic data with the implicit function theorem to accelerate convergence to the vicinity of the global minimum (Lu et al., 2017). Simplification of the data by skeletonization reduces the complexity of the misfit function as well as the number of local minima. Examples of wave-equation inversion of skeletonized data include the following. Luo and Schuster (1991a, 1991b) use the solutions to the wave equation to invert the first-arrival traveltimes for the low-to-intermediate wavenumber details of the background velocity model. Feng and Schuster (2019) use the traveltime misfit function to invert for the subsurface velocity and anisotropic parameters in a vertical transverse isotropic medium. Instead of minimizing the traveltime misfit function, Li et al. (2016) find the optimal S-velocity model that minimizes the differences between the observed and predicted dispersion curves associated with surface waves. Liu et al. (2018) extend 2D dispersion inversion of surface waves to the 3D case. Li et al. (2018) invert the data recorded over near-surface waveguides using the dispersion-curve misfit function. Instead of inverting for the velocity model, Dutta and Schuster (2016) develop a wave-equation inversion method that inverts for the subsurface Qp distribution. Here, they find the optimal Qp model by minimizing the misfit between the observed and the predicted peak/centroid-frequency shifts of the early arrivals. Similarly, Li et al. (2017) use the peak-frequency shift of the surface waves to invert for the Qs model. A tutorial for skeletonized inversion is given in Lu et al. (2017).

One of the key problems with skeletonized inversion is that the skeletonized data must be picked from the original data, which can be labor intensive for large data sets. To overcome this problem, we propose computing the skeletonized data with an autoencoder and then using solutions to the wave equation to invert such data for the model of interest (Schuster, 2018). The skeletonized data correspond to the latent codes in the latent space of the autoencoder, which has a reduced dimension and retains significant information related to the model.

The autoencoder neural network is an unsupervised machine-learning method that is trained for dimensionality reduction (Schmidhuber, 2015). An autoencoder maps the data into a lower dimensional space by extracting the data's most important features. It encodes the original data into a condensed representation, also denoted as the skeletonized representation, of the input data. The input data can be reconstructed by a decoder applied to the encoded latent-space vector.

In this paper, we first use the observed seismic traces as the training set to train the autoencoder neural network. Once the autoencoder is well trained, we feed the observed and synthetic traces into the autoencoder to get the corresponding low-dimension representations of the seismic data. We compute the misfit function as the sum of the squared differences between the observed and the predicted encoded values. To calculate the gradient with respect to the model parameters, such as the velocity in each pixel, we use the implicit function theorem to compute the perturbation of the skeletonized information with respect to the velocity. The high-level strategy for inverting the skeletonized latent variables is summarized in Figure 1, where L corresponds to the forward modeling operator of the governing equations, such as the wave equation. Any machine-learning method, such as principal component analysis (PCA), a variational autoencoder (VAE), or a regularized autoencoder, can be used to approximate the original data by a lower dimensional representation.

Figure 1. The strategy for inverting the skeletonized latent variables.

This paper is organized into four sections. After the introduction, we explain the theory of the Newtonian machine-learning (NML) inversion method. This theory includes the formulation first presented in Luo and Schuster (1991a, 1991b), where the implicit function theorem is used to employ numerical solutions to the wave equation for generating the Fréchet derivative of the skeletal data. Then, we present the numerical results for the synthetic data and field data. The last section provides a discussion and a summary of our work and its significance.

THEORY

Conventional FWI inverts for the subsurface velocity distribution by minimizing the $l_2$ norm of the waveform difference between the observed and synthetic data. However, this misfit function is highly nonlinear, and the iterative solution often gets stuck in a local minimum (Bunks et al., 1995). To mitigate this problem, skeletonized inversion methods simplify the objective function by combining the skeletonized representation of data, such as the traveltimes, with the implicit function theorem, to give a gradient optimization method that quickly converges to the vicinity of the global minimum. Instead of manually picking the skeletonized data, we allow the unsupervised autoencoder to generate such picks.
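Before the formal derivation, the following minimal Python sketch shows the latent-space misfit described above: both the observed and synthetic traces are encoded, and the objective is the sum of squared differences of their encoded values. The helper `encode` is a hypothetical stand-in for the encoder of a trained autoencoder; this is an illustration of the objective, not the authors' implementation.

```python
import numpy as np

def latent_misfit(encode, d_obs, d_syn):
    """Sum of squared latent-space residuals over all traces.

    encode : callable mapping an (nt,) trace to a latent vector of length C
    d_obs, d_syn : (ntraces, nt) arrays of observed and synthetic traces
                   for the same source-receiver pairs
    """
    z_obs = np.array([encode(t) for t in d_obs])  # (ntraces, C)
    z_syn = np.array([encode(t) for t in d_syn])  # (ntraces, C)
    dz = z_syn - z_obs                            # latent residual
    return 0.5 * np.sum(dz ** 2), dz
```

The residual `dz` plays the role of the skeletal-data residual that drives the gradient derived in the theory section.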
PCA is generally used to represent input data using a smaller dimensional space than is originally present (Hotelling, 1933). However, PCA is a linear operation that is restricted to finding the optimal rotation of the original data axes that maximizes its projections onto the principal component axes. In comparison, an autoencoder with a sufficient number of layers can find almost any nonlinear sparse mapping between the input and output images. A typical autoencoder architecture is shown in Figure 2 and generally includes three parts: the encoder, the latent space, and the decoder.

• Encoder: Unsupervised learning by an autoencoder uses a set of training data consisting of $N$ training samples $\{x^{(1)}, x^{(2)}, \ldots, x^{(N)}\}$, where $x^{(i)}$ is the $i$th feature vector with dimension $D \times 1$ and $D$ represents the number of features for each feature vector. The encoder network, indicated by the pink box in Figure 2, encodes the high-dimension input data $x^{(i)}$ into a low-dimension latent space with dimension $C \times 1$ using a series of neural layers with a decreasing number of neurons; here, $C$ is typically much smaller than $D$. This encoding operation at the first hidden layer can be mathematically described as $z^{(i)} = g(W_1 x^{(i)} + b_1)$, where $W_1$ and $b_1$ represent the network parameters and the vector of bias terms for the first layer, and $g(\cdot)$ indicates the activation function, such as a sigmoid, ReLU, or tanh.

• Latent space: The compressed data $z^{(i)}$ with dimension $C \times 1$ in the latent-space layer (denoted by the green box in Figure 2) lie in the lowest dimensional space in which the input data are reduced and the key information about the data is preserved. The latent space usually has a few neurons, which forces the autoencoder neural network to create effective low-dimensional representations of the high-dimensional input data. These low-dimensional attributes can be used by the decoder to reconstruct the original input.

• Decoder: The decoder portion of the neural network, represented by the purple box, reconstructs the input data from the latent-space representation $z^{(i)}$ by a series of neural network layers with an increasing number of neurons. For a decoder with one hidden layer, the reconstructed data $\tilde{x}^{(i)}$ are calculated by $\tilde{x}^{(i)} = W_2 z^{(i)} + b_2$, where the coefficients of the matrix $W_2$ and the vector $b_2$ represent the network parameters for the decoder layer. The optimal network parameters are found by minimizing the objective function

$$J(W_1, b_1, W_2, b_2) = \sum_{i=1}^{N} \left(\tilde{x}^{(i)} - x^{(i)}\right)^2 = \sum_{i=1}^{N} \left(W_2\, g(W_1 x^{(i)} + b_1) + b_2 - x^{(i)}\right)^2. \tag{1}$$

In practice, training by a preconditioned steepest-descent method is employed with minibatch inputs. The above equations are for a two-layer autoencoder only; however, this representation can be easily extended to the N-layer case.
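As an illustration of equation 1, the following PyTorch sketch builds a two-layer autoencoder with a sigmoid encoder $z = g(W_1 x + b_1)$ and a linear decoder $\tilde{x} = W_2 z + b_2$, and minimizes the summed squared reconstruction error over minibatches. The paper trains with a preconditioned steepest-descent method; the Adam optimizer below is a convenient stand-in, and the sizes `D` and `C` are placeholders.

```python
import torch
import torch.nn as nn

class TwoLayerAutoencoder(nn.Module):
    """Encoder z = g(W1 x + b1); linear decoder x~ = W2 z + b2 (equation 1)."""
    def __init__(self, D, C):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(D, C), nn.Sigmoid())
        self.decoder = nn.Linear(C, D)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def train(model, loader, epochs=100, lr=1e-3):
    """Minimize J = sum_i ||x~(i) - x(i)||^2 over minibatches."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # stand-in optimizer
    loss_fn = nn.MSELoss(reduction="sum")
    for _ in range(epochs):
        for (x,) in loader:
            opt.zero_grad()
            x_hat, _ = model(x)
            loss_fn(x_hat, x).backward()
            opt.step()
    return model

# Hypothetical usage:
# ds = torch.utils.data.TensorDataset(torch.as_tensor(traces, dtype=torch.float32))
# loader = torch.utils.data.DataLoader(ds, batch_size=64, shuffle=True)
# model = train(TwoLayerAutoencoder(D=traces.shape[1], C=1), loader)
```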
Skeletonized representation of seismic data by autoencoder

In this section, we show how the autoencoder computes the low-dimensional skeletonized representation of seismic data. The input data consist of seismic traces, each represented by an $n_t \times 1$ vector. Each seismic trace represents one training example in the training set. For the crosswell experiment, there are $N_s$ sources in the source well and $N_r$ receivers in the receiver well. We mainly focus on the inversion of the transmitted arrivals by windowing the input data around the early arrivals.

Figure 3a shows a homogeneous velocity model with a Gaussian anomaly in the center. Figure 3b is the initial velocity model having the same background velocity as the true velocity model. A crosswell acquisition system with two 1570 m deep cased wells separated by 1350 m describes the source and receiver wells. The finite-difference method is used to compute 77 acoustic shot gathers for the observed and synthetic data with a 20 m shot interval. Each shot is recorded with 156 receivers that are evenly distributed along the depth at a spacing of 10 m. To train the autoencoder network, we use the following workflow.

Figure 3. (a) A homogeneous velocity model with a Gaussian velocity anomaly in the center and (b) the homogeneous background model.

1) Construct the training set. For every five observed shots, we randomly select one shot gather as part of the training set, which consists of a total of 2496 training examples, or seismic traces. We did not use all of the shot gathers for training because of the increased computational cost.
2) Data processing. Each seismic trace is Hilbert transformed to get its envelope, and then the transformed data are subtracted by their mean and divided by their variance (see the processing sketch below). Figure 4a and 4b shows a seismic trace before and after processing, respectively.

Figure 4. The (a) original and (b) processed seismic traces.

Figure 6. (a-c) Three shot gathers and (d-f) their corresponding encoded data.
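A minimal sketch of the data processing in step 2, assuming each trace is a 1D NumPy array: the envelope is obtained from the analytic signal via scipy's Hilbert transform and then standardized by removing the mean and dividing by the variance, as stated above.

```python
import numpy as np
from scipy.signal import hilbert

def preprocess_trace(trace):
    """Envelope via the Hilbert transform, then remove the mean and
    divide by the variance (workflow step 2)."""
    env = np.abs(hilbert(trace))   # envelope = |analytic signal|
    env = env - env.mean()
    return env / env.var()
```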
Figure 6a-6c shows three shot gathers, which are not included in the training set; their encoded values are shown in Figure 6d-6f and are the skeletonized representations of the input seismic traces. The encoded values do not have any units and can be considered as a skeletonized attribute of the data.

We compare the traveltime differences and the differences of the latent variables for the observed and synthetic data in Figure 7. The black and red curves represent the observed and synthetic data, respectively. Figure 7b shows larger traveltime differences than Figure 7a and 7c, as its propagating waves are affected more by the Gaussian velocity anomaly than those of the other two shots. However, the misfit functions for the low-dimensional representations of the seismic data exhibit a pattern similar to that of the traveltime misfit functions. Both reveal a large misfit at the traces affected by the velocity anomaly. Similar to the traveltime misfit values, the encoded values are also sensitive to the velocity changes. In this case, we can conclude that (1) the autoencoder network is able to estimate effective low-dimensional representations of the input data and (2) the encoded low-dimensional representations can be used as skeletonized features sensitive to changes in the velocity model.

Figure 7. (a-c) Comparisons of the traveltimes for different shot gathers. (d-f) Comparisons of the encoded values for different shot gathers. The black and red curves represent the observed and synthetic data, respectively.

Theory of the NML inversion

To invert for the velocity model from the skeletonized data, we use the implicit function theorem to compute the perturbation of the skeletonized data with respect to the velocity.

Connective function

A crosscorrelation function is defined as the connective function that connects the skeletonized data with the pressure field. This connective function measures the similarity between the observed and synthetic traces as

$$f_{z_1}(x_r, t; x_s) = \int dt\, p_{z-z_1}(x_r, t; x_s)_{\mathrm{obs}}\, p_z(x_r, t; x_s)_{\mathrm{syn}}, \tag{2}$$

where $p_z(x_r,t;x_s)_{\mathrm{syn}}$ represents a synthetic trace for a given background velocity model recorded at the receiver location $x_r$ due to a source excited at location $x_s$. The subscript $z$ is the skeletonized feature (a low-dimensional representation of the seismic trace) that is encoded by a well-trained autoencoder network. Similarly, $p_{z-z_1}(x_r,t;x_s)_{\mathrm{obs}}$ denotes the observed trace with encoded skeletonized feature equal to $z - z_1$ that has the same source and receiver locations as $p_z(x_r,t;x_s)_{\mathrm{syn}}$, and $z_1$ is the distance between the synthetic and observed latent-space variables.

For an accurate velocity model, the observed and synthetic traces will have the same encoded values in the latent space. Therefore, we seek to minimize the distance between the synthetic and observed latent-space variables. This can be done by finding the shift value $z_1 = \Delta z$ that maximizes the crosscorrelation function in equation 2. If $\Delta z = 0$, it indicates that the correct velocity model has been found and the synthetic and observed traces have the same encoded values in the latent space. The $\Delta z$ that maximizes the crosscorrelation function in equation 2 should satisfy the condition that the derivative of $f_{z_1}(x_r,t;x_s)$ with respect to $z_1$ is equal to zero. Therefore,

$$\dot{f}_{\Delta z} = \left.\frac{\partial f_{z_1}(x_r,t;x_s)}{\partial z_1}\right|_{z_1 = \Delta z} = \int dt\, \dot{p}_{z-\Delta z}(x_r,t;x_s)_{\mathrm{obs}}\, p_z(x_r,t;x_s)_{\mathrm{syn}} = 0, \tag{3}$$

where $\dot{p}_{z-\Delta z}(x_r,t;x_s)_{\mathrm{obs}} = \left.\partial p_{z-z_1}(x_r,t;x_s)_{\mathrm{obs}}/\partial z_1\right|_{z_1=\Delta z}$. Equation 3 is the connective function that acts as an intermediate equation to connect the seismogram with the skeletonized data, which are the encoded values of the seismograms (Luo and Schuster, 1991a, 1991b). Such a connective function is necessary because there is no wave equation that relates the skeletonized data to a single type of model parameter (Dutta and Schuster, 2016). The connective function will later be used to derive the derivative of the skeletonized data with respect to the velocity.
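The maximizing shift $\Delta z$ in equations 2 and 3 can be estimated with a simple scan over candidate latent shifts. The sketch below uses a trained decoder as a proxy for the latent-shifted observed trace; `decode` is a hypothetical helper, and in practice $\Delta z$ can also be read off directly as the difference between the synthetic and observed encoded values.

```python
import numpy as np

def find_dz(decode, z_obs, trace_syn, dz_grid):
    """Scan candidate latent shifts and return the one maximizing the
    zero-lag crosscorrelation of equation 2 (a sketch, not the authors'
    implementation)."""
    best_dz, best_corr = 0.0, -np.inf
    for z1 in dz_grid:
        p_shift = decode(z_obs + z1)         # trace with latent code z_obs + z1
        corr = np.sum(p_shift * trace_syn)   # crosscorrelation over time
        if corr > best_corr:
            best_dz, best_corr = z1, corr
    return best_dz                           # = z_syn - z_obs at the optimum
```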
For the first-order acoustic wave equation, the Fréchet derivative $\partial p/\partial v$ can be computed explicitly, so that equation 10 can be rewritten as

$$\begin{aligned} \gamma(x) &= -2\rho v \sum_s \sum_r \int dt\, \nabla\cdot\mathbf{v}(x,t;x_s)\,\big(g_p(x_r,-t;x,0)\,\Delta p_z(x_r,t;x_s)\big) \\ &= -2\rho v \sum_s \int dt\, \nabla\cdot\mathbf{v}(x,t;x_s)\,\sum_r \big(g_p(x_r,-t;x,0)\,\Delta p_z(x_r,t;x_s)\big). \end{aligned}$$

Figure 8. Encoded value misfit versus velocity.

The velocity model is then updated by the steepest-descent formula

$$v(x)^{k+1} = v(x)^k + \alpha_k \gamma(x)^k, \tag{13}$$

where $k$ indicates the iteration number and $\alpha_k$ represents the step length.
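A minimal sketch of the iteration in equation 13, with a backtracking line search for the step length $\alpha_k$; `compute_gradient` and `misfit` are hypothetical wrappers around the gradient $\gamma(x)$ and the encoded-value misfit function.

```python
import numpy as np

def invert(v0, compute_gradient, misfit, n_iter=20, alpha0=1.0):
    """Iterate v^{k+1} = v^k + alpha_k * gamma^k (equation 13)."""
    v = v0.copy()
    for _ in range(n_iter):
        g = compute_gradient(v)            # gamma(x) points downhill in misfit
        m0, alpha = misfit(v), alpha0
        while misfit(v + alpha * g) >= m0 and alpha > 1e-6:
            alpha *= 0.5                   # backtracking line search
        v = v + alpha * g
    return v
```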
Checkerboard tests

We first test the NML method on data generated for checkerboard models with three different acquisition geometries.

The crosswell checkerboard model is shown in Figure 9a. A source well is located at x = 10 m, which includes 89 shots evenly distributed along the well with a shot interval of 20 m. Each shot gather is recorded by 179 receivers evenly deployed along the receiver well located at x = 590 m. A 15 Hz Ricker wavelet is used as the source wavelet. The initial model is the homogeneous model shown in Figure 9b. Figure 9c shows the first iteration of the NML gradient, which …

Figure 9. The (a) true and (b) initial velocity models. (c) The NML gradient after the first iteration.

… method is shown in Figure 10d, which successfully recovers the shallow velocity perturbations visited by the diving waves.

Reflection energy surface geometry checkerboard test
… shown in Figure 11a and 11b, respectively. Each shot is recorded with 239 receivers that are evenly distributed on the surface at a …

Figure 12. The (a) true velocity and (b) initial velocity models.

Figure 14. The (a) FWI, (b) WT, and (c) NML tomograms. (d) The FWI tomogram using the NML tomogram as the initial model.

Figure 16. The (a) true and (b) initial velocity models.

Figure 18. The (a) raw data, (b) data after band-pass filtering, (c) data after tube-wave removal, (d) upgoing waves, (e) data after wavefield separation, and (f) final processed data.
… shown in Figure 17a, and the comparison of their vertical profiles at x = 0.5 and x = 0.8 is shown in Figure 17b and 17c, respectively. The blue, red, and black curves represent the velocity profiles of the initial, true, and inverted velocity models, respectively. They show that the inverted model can only reconstruct the low-wavenumber information in the true velocity model. To get a high-resolution inversion result, a hybrid approach, such as skeletonized inversion + FWI, can be used (Luo and Schuster, 1991a, 1991b).

Friendswood crosswell field data

We now test our method on the Friendswood crosswell field data set. Two 305 m deep cased wells separated by 183 m are used as the source and receiver wells. Downhole explosive charges are fired at intervals of 3 m from 9 to 305 m in the source well, and the receiver well has 96 receivers placed at depths ranging from 3 to 293 m. The seismic data are recorded with a sampling interval of 0.25 ms for a total recording time of 0.375 s. We apply the following processing steps to the raw data, which are similar to the processing workflow in Dutta and Schuster (2014) and Cai and Schuster (1993):

1) The raw data are scaled by $\sqrt{t}$ to correct for the 3D geometric spreading effects. We multiply the data spectrum with the filter $\sqrt{i/\omega}$ to correct the phase (a sketch of this step follows the list).
2) A band-pass filter of 80-400 Hz is applied to the observed data to remove the noise shown in Figure 18a. The filtered data have a peak frequency of 190 Hz.
3) To remove the tube waves shown in Figure 18b, we first flatten the tube waves and then apply a nine-point median filter along the horizontal direction to remove all other arrivals except the tube waves. The filtered tube waves are then shifted back to their original time positions and subtracted from the original data. Figure 18c shows the data after tube-wave removal.
4) Because our goal is velocity inversion rather than imaging, we use an FK method to separate the upgoing waves from the downgoing waves. Figure 18d shows the pure upgoing waves after wavefield separation, and Figure 18e shows the data that contain the downgoing waves only. We interpolate the data to a 0.1 ms sampling interval to ensure numerically stable solutions. A final processed shot gather is shown in Figure 18f.
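A sketch of processing step 1, assuming the shot gather is an (nt, ntraces) array: the $\sqrt{t}$ gain corrects for 3D geometric spreading, and the $\sqrt{i/\omega}$ filter applied in the frequency domain corrects the phase. This is an illustration under those assumptions, not the authors' exact workflow.

```python
import numpy as np

def correct_spreading_and_phase(data, dt):
    """sqrt(t) gain plus sqrt(i/omega) spectral phase filter (step 1)."""
    nt, _ = data.shape
    t = np.arange(nt) * dt
    gained = data * np.sqrt(t)[:, None]            # sqrt(t) amplitude gain
    spec = np.fft.rfft(gained, axis=0)
    omega = 2.0 * np.pi * np.fft.rfftfreq(nt, dt)
    omega[0] = omega[1]                            # avoid division by zero at DC
    spec *= np.sqrt(1j / omega)[:, None]           # sqrt(i/omega) phase filter
    return np.fft.irfft(spec, n=nt, axis=0)
```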
The autoencoder architecture used here is almost the same as in the previous two cases, except that the dimensions of the input and output layers are changed to 3750 × 1. A linearly increasing velocity model is used as the initial model and is shown in Figure 19a. Figure 19b shows the inverted velocity model after 10 iterations. Two high-velocity zones at depth ranges between 85-115 m and 170-300 m appear in the inverted result. However, some source artifacts appear near the source well. Figure 20a shows the encoded value map of the observed data, where the vertical and horizontal axes represent the source and receiver indices, respectively. It shows that the near-offset traces have large positive values and that the encoded values decrease as the offset increases.

Figure 20b and 20c shows the encoded value maps of the seismic data generated from the initial and inverted velocity models, respectively, where the latter map is much more similar to the encoded value map of the observed data. To measure the distance between the true and the initial models, we plot the values of the encoded residuals in Figure 20d. It shows that there are relatively larger residuals at the near-offset traces than at the far-offset traces. However, these residuals are largely reduced with the inverted tomogram, as shown in Figure 20e. This clearly demonstrates that our inverted tomogram is much closer to the true velocity model than the initial model.
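The encoded value maps of Figure 20 can be assembled by encoding one trace per source-receiver pair, as in the following sketch; `encode` is again a hypothetical stand-in for the trained encoder mapping a trace to its scalar latent value.

```python
import numpy as np

def encoded_value_map(encode, data):
    """Build the N_s x N_r map of encoded values (cf. Figure 20).

    data : (ns, nr, nt) array holding one trace per source-receiver pair.
    """
    ns, nr, _ = data.shape
    zmap = np.zeros((ns, nr))
    for i in range(ns):
        for j in range(nr):
            zmap[i, j] = encode(data[i, j])
    return zmap

# Encoded residual map (cf. Figure 20d and 20e):
# residual = encoded_value_map(encode, d_obs) - encoded_value_map(encode, d_syn)
```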
DISCUSSION

Tests on the synthetic and observed data demonstrate that wave-equation inversion of seismic data skeletonized by an autoencoder can invert for the low- to intermediate-wavenumber details of the subsurface velocity model. We next test the method's sensitivity to noisy data and discuss the overfitting problem.

The test setup is the same as that in Figure 7, except that we add random noise to the input data. Different levels of noise are added to the observed and synthetic data. Figure 21a, 21d, 21g, and 21j shows four shot gathers, and their 80th traces are displayed in Figure 21c, 21f, 21i, and 21l. Their encoded results are shown in Figure 21b, 21e, 21h, and 21k, where the black and red curves represent the encoded values from the observed and synthetic data, respectively. It appears that the range of encoded values decreases as the noise level increases. Moreover, the encoded residual also decreases, which indicates that the encoded values become less sensitive to the velocity changes as the data noise level increases.

Figure 21. Shot gathers with (a) SNR = 30 dB, (d) SNR = 11 dB, (g) SNR = 4 dB, and (j) SNR = 1 dB. The 80th trace is displayed in (c), (f), (i), and (l), respectively. Panels (b), (e), (h), and (k) display the encoded values from the observed and synthetic data, respectively. The numbers along the horizontal axes of the encoded value graphs correspond to the trace indices, and the numbers along the vertical axes correspond to the values of the latent variable z.

Figure 22 shows magnified views of the encoded values in Figure 21, where some oscillations appear in the noisy data. These oscillations could further affect the accuracy of the inverted result, especially if a small velocity perturbation is omitted. Therefore, good data quality with less noise is preferred for the autoencoder method to recover an accurate subsurface velocity model.

Figure 22. (a-d) Magnified views of Figure 21b, 21e, 21h, and 21k.
Overfitting problem

In our examples, the number of seismic traces in the training set is usually smaller than the number of parameters in the autoencoder, which might result in an overfitting problem. If the data are overfitted, the network learns the intricacies of the training data set at the expense of its ability to represent unseen examples (in the test data) (Valentine and Trampert, 2012). In other words, at some point during training, the reconstruction error of the training set keeps decreasing, while the reconstruction error of the testing set either stays stable or becomes worse. Figure 23a and 23b shows the reconstruction errors of the training and testing sets versus the iteration number, respectively. They clearly show that the reconstruction errors of both data sets decrease rapidly within the first 10 iterations and then gradually become stable. The similar pattern for our training and testing sets demonstrates that we do not suffer from the overfitting problem during training.

Figure 23. The reconstruction error of the training set and the testing set versus the iteration number.
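Overfitting of this kind is commonly detected by monitoring the held-out reconstruction error, as in the following sketch, which flags the iteration at which the testing error stops improving while training continues. This is a generic early-stopping criterion, not a procedure described in the paper.

```python
def detect_overfitting(test_err, patience=5):
    """Return the iteration at which the testing reconstruction error has
    not improved for `patience` iterations (cf. Figure 23), or None."""
    best, stale = float("inf"), 0
    for k, err in enumerate(test_err):
        if err < best:
            best, stale = err, 0
        else:
            stale += 1
        if stale >= patience:
            return k
    return None
```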
The connection between the encoded value and the decoded waveform

An ideal autoencoder neural network seeks to identify the common features in the training set and encapsulate them within the encoder and decoder functions. The latent variables contain the information that distinguishes each individual example in the data set from the others (Valentine and Trampert, 2012). To illustrate this point, we perturb the encoded values in the latent space and observe how the decoded waveform changes. Figure 24 displays the changes in the encoded values and the decoded waveforms. It clearly shows that changes in the encoded values result in temporal shifts of the waveforms, but the shape of the waveform barely changes. Therefore, in this case, the latent-space information is mainly related to the traveltimes, which are necessary to distinguish the different examples in the data set.
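The perturbation experiment of Figure 24 amounts to decoding a trace at shifted latent values, as in this sketch; `decode` is a hypothetical stand-in for the trained decoder.

```python
import numpy as np

def probe_latent(decode, z0, dz_values):
    """Decode a trace at perturbed latent values to see what the latent
    variable controls; for the data in Figure 24, the decoded waveforms
    are time-shifted but nearly identical in shape."""
    return {dz: decode(z0 + dz) for dz in dz_values}

# Example: waveforms = probe_latent(decode, z0, np.linspace(-0.5, 0.5, 5))
```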
Multidimensional NML inversion

An autoencoder with a single latent-space neuron can sometimes be incapable of fully capturing the important information in the traces. For a 2D latent space, the new misfit function can be written as

$$\epsilon = \frac{1}{2}\sum_s \sum_r \big(\Delta z_1(x_r, x_s)^2 + \Delta z_2(x_r, x_s)^2\big), \tag{14}$$

where $\Delta z_1$ and $\Delta z_2$ are the encoded value differences of the first and second latent-space coordinates. The gradient $\gamma(x)$ is

$$\gamma(x) = -\frac{\partial \epsilon}{\partial v(x)} = -\left(\frac{\partial z_1}{\partial v(x)}\Delta z_1 + \frac{\partial z_2}{\partial v(x)}\Delta z_2\right). \tag{15}$$

Similarly, the connective function is

$$f_{z_1,z_2}(x_r,t;x_s) = \int dt\, p_{(z_1^{\mathrm{obs}}-z_1,\, z_2^{\mathrm{obs}}-z_2)}(x_r,t;x_s)_{\mathrm{obs}}\, p_{z^{\mathrm{syn}}}(x_r,t;x_s)_{\mathrm{syn}}, \tag{16}$$

which connects $\partial z_1/\partial v(x)$ and $\partial z_2/\partial v(x)$ with the Fréchet derivative $\partial p/\partial v(x)$. Using the multivariable implicit function theorem, we can get

$$\begin{bmatrix} \dfrac{\partial z_1}{\partial v(x)} \\[2mm] \dfrac{\partial z_2}{\partial v(x)} \end{bmatrix} = -\begin{bmatrix} \dfrac{\partial^2 f}{\partial z_1^2} & \dfrac{\partial^2 f}{\partial z_2 \partial z_1} \\[2mm] \dfrac{\partial^2 f}{\partial z_1 \partial z_2} & \dfrac{\partial^2 f}{\partial z_2^2} \end{bmatrix}^{-1} \begin{bmatrix} \dfrac{\partial^2 f}{\partial z_1 \partial v(x)} \\[2mm] \dfrac{\partial^2 f}{\partial z_2 \partial v(x)} \end{bmatrix}. \tag{17}$$

We apply the multidimensional NML inversion method to the same data that were generated for the crosswell Marmousi model. The same data set is used for training, except that there are two latent-space neurons in the autoencoder. The autoencoder with two latent-space neurons converges to a smaller residual than the single-neuron autoencoder, which means that more waveform information is preserved in the latent space. The initial model is the linearly increasing model shown in Figure 16b. Figure 25b and 25c shows the single-neuron and double-neuron NML tomograms, where the latter recovers more detail, especially at depths between z = 0.1 and z = 0.8 km. The velocity profile comparisons at x = 0.5 and x = 0.9 km are shown in Figure 25d and 25e, respectively, which show that the double-neuron NML profile (the red solid line) agrees more closely with the true velocity profile (the blue solid line). This improvement suggests that two latent-space neurons contain more information about the subsurface than does a single neuron. However, inverting a greater number of neurons will likely lead to a greater chance of getting stuck in a local minimum. To avoid this, we suggest a multiscale strategy that initially inverts for the velocity model from a low-dimensional latent-space representation. This low-wavenumber model is then used as the starting velocity model for inverting higher-dimensional latent-space variables.
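A minimal sketch of equations 14 and 15 for a 2D latent space: the misfit sums the squared residuals of both latent coordinates, and the gradient accumulates the two Fréchet derivatives weighted by their residuals. The `dz_dv` array stands in for the derivatives $\partial z_i/\partial v(x)$ obtained from equation 17.

```python
import numpy as np

def multidim_latent_misfit(dz):
    """Equation 14: eps = 0.5 * sum of (dz1^2 + dz2^2) over all traces.
    dz : (ntraces, 2) latent residuals."""
    return 0.5 * np.sum(dz ** 2)

def multidim_gradient(dz_dv, dz):
    """Equation 15 summed over traces: gamma = -(dz1/dv * dz1_res +
    dz2/dv * dz2_res). dz_dv : (ntraces, 2, nx); dz : (ntraces, 2)."""
    return -np.einsum("ick,ic->k", dz_dv, dz)
```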
CONCLUSION

We presented a wave-equation method that finds the velocity model that minimizes the misfit function in the autoencoder's latent space. The autoencoder can compress a high-dimensional seismic trace to a smaller dimension that best represents the original data in the latent space. In this case, measuring the encoded residuals largely reduces the nonlinearity compared with measuring the waveform differences. Therefore, the inverted result will be less prone to getting stuck in a local minimum. The implicit function theorem is used to connect the perturbation of the encoded values with the velocity perturbations to calculate the gradient. Numerical results with synthetic and field data demonstrate that skeletonized inversion with the autoencoder network can accurately estimate the background velocity model. The inverted result can be used as a good initial model for FWI.

The most significant contribution of this paper is that it provides a general framework for using solutions to the governing partial differential equation to invert skeletal data generated by any type of neural network. The governing equation can be that for gravity, seismic waves, electromagnetic fields, or magnetic fields. The input data can be records from different types of surveys, as long as the skeletal data are sensitive to the model perturbations. The skeletal data can be the latent-space variables of a regularized autoencoder, a VAE, a feature map from a CNN, or PCA features. That is, we have combined the deterministic features of forward and backward modeling in Newtonian physics with the dimensionality-reduction capabilities of machine learning to invert seismic data by NML.
ACKNOWLEDGMENTS

The research reported in this paper was supported by the King Abdullah University of Science and Technology (KAUST) in Thuwal, Saudi Arabia. We are grateful to the sponsors of the Center for Subsurface Imaging and Modeling (CSIM) Consortium for their financial support. For computer time, this research used the resources of the Supercomputing Laboratory at KAUST, and we thank them for providing the computational resources required for carrying out this work. We also thank Exxon for the Friendswood crosswell data.

DATA AND MATERIALS AVAILABILITY

Data associated with this research are confidential and cannot be released.

REFERENCES

AlTheyab, A., and G. Schuster, 2015, Reflection full-waveform inversion for inaccurate starting models: Workshop on Depth Model Building: Full-waveform Inversion Workshop, 18–22.
Bunks, C., F. M. Saleck, S. Zaleski, and G. Chavent, 1995, Multiscale seismic waveform inversion: Geophysics, 60, 1457–1473, doi: 10.1190/1.1443880.
Cai, W., and G. T. Schuster, 1993, Processing Friendswood cross-well seismic data for reflection imaging: 63rd Annual International Meeting, SEG, Expanded Abstracts, 92–94, doi: 10.1190/1.1822658.
Chen, S., L. Zimmerman, and J. Tugnait, 1990, Subsurface imaging using reversed vertical seismic profiling and crosshole tomographic methods: Geophysics, 55, 1478–1487, doi: 10.1190/1.1442795.
Chen, Y., Z. Feng, L. Fu, A. AlTheyab, S. Feng, and G. Schuster, 2019, Multiscale reflection phase inversion with migration deconvolution: Geophysics, 85, no. 1, R55–R73, doi: 10.1190/geo2018-0751.1.
Dutta, G., and G. T. Schuster, 2014, Attenuation compensation for least-squares reverse time migration using the viscoacoustic-wave equation: Geophysics, 79, no. 6, S251–S262, doi: 10.1190/geo2013-0414.1.
Dutta, G., and G. T. Schuster, 2016, Wave-equation Q tomography: Geophysics, 81, no. 6, R471–R484, doi: 10.1190/geo2016-0081.1.
Feng, S., and G. T. Schuster, 2019, Transmission + reflection anisotropic wave-equation traveltime and waveform inversion: Geophysical Prospecting, 67, 423–442, doi: 10.1111/1365-2478.12733.
Fu, L., B. Guo, and G. T. Schuster, 2018, Multiscale phase inversion of seismic data: Geophysics, 83, no. 2, R159–R171, doi: 10.1190/geo2017-0353.1.
Ha, W., and C. Shin, 2012, Laplace-domain full-waveform inversion of seismic data lacking low-frequency information: Geophysics, 77, no. 5, R199–R206, doi: 10.1190/geo2011-0411.1.
Hotelling, H., 1933, Analysis of a complex of statistical variables into principal components: Journal of Educational Psychology, 24, 417–441, doi: 10.1037/h0071325.
Lailly, P., and J. Bednar, 1983, The seismic inverse problem as a sequence of before stack migrations: Conference on Inverse Scattering: Theory and Application, 206–220.
Li, J., G. Dutta, and G. Schuster, 2017, Wave-equation Qs inversion of skeletonized surface waves: Geophysical Journal International, 209, 979–991, doi: 10.1093/gji/ggx051.
Li, J., Z. Feng, and G. Schuster, 2016, Wave-equation dispersion inversion: Geophysical Journal International, 208, 1567–1578, doi: 10.1093/gji/ggw465.
Li, J., S. Hanafy, and G. Schuster, 2018, Wave-equation dispersion inversion of guided P waves in a waveguide of arbitrary geometry: Journal of Geophysical Research: Solid Earth, 123, 7760–7774, doi: 10.3997/2214-4609.201801961.
Liu, Z., J. Li, S. M. Hanafy, and G. Schuster, 2018, 3D wave-equation dispersion inversion of surface waves: 88th Annual International Meeting, SEG, Expanded Abstracts, 4733–4737, doi: 10.1190/segam2018-2997521.1.
Lu, K., J. Li, B. Guo, L. Fu, and G. Schuster, 2017, Tutorial for wave-equation inversion of skeletonized data: Interpretation, 5, no. 3, SO1–SO10, doi: 10.1190/INT-2016-0241.1.
Luo, Y., and G. T. Schuster, 1991a, Wave equation inversion of skeletalized geophysical data: Geophysical Journal International, 105, 289–294, doi: 10.1111/j.1365-246X.1991.tb06713.x.
Luo, Y., and G. T. Schuster, 1991b, Wave-equation traveltime inversion: Geophysics, 56, 645–653, doi: 10.1190/1.1443081.
Plessix, R.-E., 2006, A review of the adjoint-state method for computing the gradient of a functional with geophysical applications: Geophysical Journal International, 167, 495–503, doi: 10.1111/j.1365-246X.2006.02978.x.
Schmidhuber, J., 2015, Deep learning in neural networks: An overview: Neural Networks, 61, 85–117, doi: 10.1016/j.neunet.2014.09.003.
Schuster, G., 2018, Machine learning and wave equation inversion of skeletonized data: 80th Annual International Conference and Exhibition, EAGE, Extended Abstracts, WS01.
Sun, Y., and G. T. Schuster, 1993, Time-domain phase inversion: 63rd Annual International Meeting, SEG, Expanded Abstracts, 684–687, doi: 10.1190/1.1822588.
Tarantola, A., 1984, Inversion of seismic reflection data in the acoustic approximation: Geophysics, 49, 1259–1266, doi: 10.1190/1.1441754.
Valentine, A. P., and J. Trampert, 2012, Data space reduction, quality assessment and searching of seismograms: Autoencoder networks for waveform data: Geophysical Journal International, 189, 1183–1202, doi: 10.1111/j.1365-246X.2012.05429.x.
Virieux, J., and S. Operto, 2009, An overview of full-waveform inversion in exploration geophysics: Geophysics, 74, no. 6, WCC1–WCC26, doi: 10.1190/1.3238367.
Wu, R.-S., J. Luo, and B. Wu, 2014, Seismic envelope inversion and modulation signal model: Geophysics, 79, no. 3, WA13–WA24, doi: 10.1190/geo2013-0294.1.