
SPECIAL TOPIC: MACHINE LEARNING

Leveraging deep learning for seismic image denoising
Elena Klochikhina1*, Sean Crawley1, Sergey Frolov1, Nizar Chemingui1 and Tony Martin1 describe
a supervised machine learning approach for the attenuation of noise formed by suboptimal
destructive interference within the migration process. The authors outline the training and
validation approach of their deep convolutional neural network, and demonstrate its
application on field data sets.

The impact of noise
Noise can affect the quality of seismic data, damaging the geological integrity of the final migrated image; therefore, we should minimize its impact. If not properly attenuated it may affect amplitude-related attributes, lead to difficulties during quantitative interpretation (Cambois, 2001; Ball et al., 2011), and result in an inaccurate appraisal of the reservoir.

Noise can take many forms. In this paper, we consider the noise generated during the migration stage. The noise sources might be as diverse as residual impulsive noise, multiple energy, mispositioned primary energy due to errors in estimation of earth properties, or insufficient illumination caused by limitations in the data acquisition and/or complex media-related propagation effects. In each case the migrated image may be affected by suboptimal destructive interference of the migration isochrones (Gardner and Canning, 1994), resulting in contamination of the data by coherent noise. We concentrate on the latter case, where the resulting noise is due to inadequate illumination in complex media.

An experienced geophysicist may easily differentiate signal and noise in a seismic section. It can be challenging to remove the noise without affecting the signal, because the coherent noise often has similar seismic characteristics to the desirable components of the data. Several approaches can minimize the migration artefacts. These include data regularization prior to migration, filtering during migration, post-processing after migration, and least-squares migration methods.

Data reconstruction is often used to overcome irregularities in data coverage (Chemingui and Biondi, 2002; Schonewille et al., 2009). Depending on the method and migration algorithm, regularized data may produce less noise, but can require significant effort in the data preparation. Moreover, regularization may impact the resolution of the final image. Aperture optimization may also reduce the impact of noise generated in the migration process (Alerini and Ursin, 2009; Klokov and Fomel, 2012), but it requires knowledge of local structural dips, and results depend on the accuracy of such dip information. There are pragmatic alternatives, such as filters designed to attenuate the noise in the image domain (Hale, 2011). As the noise is often coherent, sharing seismic characteristics with the signal, it can be challenging to design filters that remove only the unwanted components while preserving the useful energy.

The adoption of artificial intelligence
In the early days, computers learnt how to solve problems that were intellectually difficult for humans by following a sequence of strict mathematical rules. The true challenge was to create a machine that could solve problems that humans solve intuitively, problems that are hard to describe in formal rules. This knowledge somehow needs to be captured by a computer for it to behave in an intelligent way (Goodfellow et al., 2017); this is the challenge of creating an artificial intelligence (AI). To overcome the problem, the field of machine learning (ML) was born: AI systems were given the ability to acquire knowledge by extracting patterns from data and to gather knowledge from experience. Classical ML methods are highly dependent on features prepared by humans, and it can be time consuming and difficult to extract and provide the right set of features, free of human bias, for an ML algorithm to perform well. Deep learning (DL) overcomes this problem by extracting information from raw data: complex representations can be learnt from the input by decomposing the data into simpler intermediate representations. DL models look at the data on different scale levels, layer by layer. Deep learning can perform automatic feature extraction from raw data without depending completely on human-crafted features. Together with advanced architectures and optimized training approaches, increasing the amount of training data can help DL algorithms reach human performance on complex tasks by learning from a vast variety of examples.

Data is not in short supply within the seismic processing business; however, the adoption of artificial intelligence has not been as extensive as in other data-rich industries. There is evidence that this is changing, as seismic companies seek to augment decision making and reduce project cycle times. The number of papers and manuscripts describing applications of artificial intelligence and data analytics has grown at geophysical conferences and in the geophysical journals.

1 PGS
* Corresponding author, E-mail: elena.klochikhina@pgs.com
DOI: 10.3997/1365-2397.fb2020048

FIRST BREAK I VOLUME 38 I JULY 2020 41



The trend in paper numbers suggests that methods invoking artificial intelligence-enabled automation may be the future of our industry. The combination of data and computer science with geophysics may be applicable to every aspect of a seismic processing project. From unsupervised (Martin et al., 2015) to supervised (Farmani and Pedersen, 2020) classification of denoising workflows, and from support vector regression for data interpolation (Jia and Ma, 2017) to parabolic dictionary learning for data reconstruction (Turquais et al., 2019), most aspects of data domain processing are being tested. Using a variety of neural network approaches, efforts are being made to compare and contrast velocity model building with conventional inversion-based schemes (Øye and Dahl, 2019; Yang and Ma, 2019; Zheng et al., 2019). The reported results look encouraging.

Using a deep convolutional neural network for image denoising
For coherent noise attenuation, rather than explicitly formulating filters as in conventional methods, we present an AI approach that utilizes a deep convolutional neural network (CNN) to achieve the same goal. The main components of CNNs are convolutional filters that are iteratively adjusted during the training step to handle the artefacts and produce clean outputs from the noisy inputs. The trained models are then used to denoise the seismic images from field experiments.

A neural network may act as a universal function approximator to mimic the characteristics of a complex function F that maps a noisy input x to a noise-free output y:

y = F(x)  (1)

The goal of the training is to find a transformation that maps x into a set of corresponding y. To do this we minimize a cost function J, defined as the difference between the transformed inputs y' and the desired clean outputs y, with respect to the parameters θ of the trained network in the L2 sense:

y' = F_θ(x)  (2)

J(θ) = ||y' - y||₂²  (3)

The stability and quality of predictions depends on the network architecture, the hyperparameters and the training data set. In our example, the training data were created using noisy images as an input to the CNN and clean images (noise-free) as an output. The architecture of a convolutional neural network contains a number of different operations; the goal of the trained network is to replicate human endeavour, in our case to identify and remove coherent noise from a seismic image. A typical CNN architecture for computer vision problems consists of a number of different components that include convolutional layers and activation functions. It may also contain other operations such as downsampling (or pooling), upsampling, batch normalization, etc. All network components are connected in the form of a graph.

The essential components of CNN architectures are convolutional layers separated by non-linear activation functions. Convolutional layers use filters, which are also known as kernels. Each filter is an array consisting of a sequence of numbers, or weights. The filter slides over the image, only being exposed to a small number of input pixels at any time. The operation used is a dot product of the input values with the filter weights; the output is a single number per sliding window. The result over the entire input is called an activation map or feature map. Depending on the filter, each activation map identifies distinguishing features; with each progressive convolutional layer, more complex features can be determined. Non-linearity is a crucial part that allows the neural network to approximate the complex operations necessary for solving a given task; the so-called activation functions are used for this purpose.

There are many other components that could be used in CNN architectures. In our denoising architecture, we use a pooling step for downsampling. There are several types of pooling; we used a maxpooling operation, which selects the largest number from the neighbouring cells during the downsampling process. This reduces the spatial dimensionality of the input data, limiting computational cost and increasing the exposure of the input for the next convolutional layer. The opposite operation is upsampling, which is used to refine the spatial sampling of the feature map.

The hyperparameters of a convolutional neural network are its structure, components, and training specifications.
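The sliding-window dot product and the L2 cost of equations (1)-(3) can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the function names (`conv2d_valid`, `l2_cost`) and the small 4x4 test patch are invented for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image; at each position take the dot
    product of the window with the filter weights ('valid' mode,
    stride 1). The collected outputs form one activation/feature map."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    fmap = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(fmap.shape[0]):
        for j in range(fmap.shape[1]):
            fmap[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return fmap

def l2_cost(y_pred, y_clean):
    """Eq. (3): squared L2 difference between the network output y'
    and the desired clean output y."""
    return float(np.sum((y_pred - y_clean) ** 2))

image = np.arange(16.0).reshape(4, 4)   # toy 'noisy' patch
kernel = np.ones((3, 3)) / 9.0          # 3x3 smoothing filter
fmap = conv2d_valid(image, kernel)      # 2x2 feature map
```

During training, the filter weights are the free parameters: they are iteratively adjusted so that the L2 cost between the network output and the clean target decreases over many noisy/clean pairs.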

Figure 1 The architecture used for the deep learning process.


Figure 2 Examples of data used in the training of the convolutional neural network.

To achieve the best performance of the convolutional neural network, we optimize the hyperparameters through testing. In practice, the network is trained using user-defined data. This is critical to the success of the neural network in achieving its goals: the data needs to be representative of the problem we are trying to solve. The training happens over the course of multiple epochs. Each epoch consists of multiple iterations, and each iteration uses a subset of the input data set, called a batch. On average, every epoch passes through all input data samples once. Data augmentation enables a modification to the pool of training data. It is one way to increase the number and variability of the data set, enabling a more robust prediction and resulting in an increase in the level of sophistication of the trained network.

Specifics of the architecture
Among the wide variety of commonly used network architectures for image denoising in computer vision, we considered the U-net architecture (Ronneberger et al., 2015) to be most suitable for this problem. During our testing phase, it showed better convergence and faster training, and it solves the problem naturally as it enables operations on different feature resolutions (Figure 1).

The architecture consists of three parts: the contraction (left branch), the bottleneck (bottom) and the expansion (right branch). Each convolutional layer receives an input and applies a set of 3x3 filters, followed by a nonlinear activation function.

The contracting path consists of four blocks; each block has two convolutional layers followed by a downsampling procedure (maxpooling). The number of filters in the convolutional layers doubles each time the resolution decreases, so the architecture retains the ability to explain complex features present in the input.

The bottom layer takes an input from the left branch and applies two convolutional layers.

The expansion path receives the input from the bottleneck and also consists of four blocks; each block has two convolutional layers followed by an upsampling procedure. After each upsampling step, the number of filters in the convolutional layers halves.

The corresponding blocks of the contraction and expansion paths are connected by 'skip connections'. This helps to solve the problem of a vanishing gradient during the training stage and simplifies the prediction task, as there is no need to reconstruct the image at full resolution from its compressed representation.

To accommodate the challenge, we modified the convolutional blocks of U-net and fine-tuned the hyperparameters of the network during the training process to achieve better performance of the neural network. In order to reduce the likelihood of overfitting, we added dropout layers.

We modelled synthetic shot gathers, which were migrated to form the noisy inputs and clean outputs used in the training step. We subsampled and migrated the synthetic data to generate the coherent noise in the images. The noise-free output consisted of clean images from the migration of appropriately sampled data. The image patch size for training was 256x256 pixels (Figure 2). We carefully selected the data set so that it included variations in the following: frequency content, structural dip, amplitude, and noise character and level. To increase the variability of the input, we used data augmentation, which included horizontal flips, random crops, sign reversal, and filtering and scaling with depth, resulting in approximately 100,000 total input samples.
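The maxpooling and upsampling steps of the contraction and expansion paths can be illustrated with a minimal NumPy sketch. This is a toy assuming even patch dimensions, not the authors' code; the helper names `maxpool2x2` and `upsample2x2` are invented for the example.

```python
import numpy as np

def maxpool2x2(x):
    """Downsample: keep the largest value in each non-overlapping 2x2
    neighbourhood, halving both spatial dimensions."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2x2(x):
    """Upsample (nearest neighbour): repeat each value into a 2x2
    block, refining the spatial sampling of the feature map."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

patch = np.array([[1., 2., 5., 3.],
                  [4., 0., 1., 2.],
                  [7., 8., 2., 1.],
                  [3., 6., 0., 9.]])
pooled = maxpool2x2(patch)      # -> [[4., 5.], [8., 9.]]
restored = upsample2x2(pooled)  # back to 4x4, with coarser detail
```

In a real U-net the upsampled map is also concatenated with the corresponding contraction-path map through a skip connection, so full-resolution detail does not have to be rebuilt from the compressed representation alone.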


Hyperparameters such as the learning rate schedule, dropout rate and batch size were adjusted during the training phase to minimize the prediction error. We trained the network for 50 epochs on a single GPU with 32 GB of memory.

The challenge of overfitting
A common challenge for machine learning algorithms is overfitting. This occurs when the trained network's performance shows great promise on the data used for training, but has a poor success rate when attempting to generalize to previously unseen data. This happens when the capacity of the model is too large compared to the diversity of the data set used for building the model. With neural networks, this occurs when there are too many parameters. The model may provide great flexibility and approximation power, but the amount and variability of the data given to it is not enough to constrain the weights within the network, at least not without regularization. As a result, the network makes unreasonable predictions for any data that differs from the training set, in our case in frequency, amplitude or noise level. As a precaution, the input data set is split into two subsets, one for training and the remainder for validation. We then monitor the trained model's performance on the latter. A gradual decrease of the loss function for both training and validation data sets implies reasonable generalization, assuming fair selection of the validation data set.

To reduce the overfitting problem, we used a dropout technique. During the training step, we carefully monitored the behaviour of the objective function for both the training and validation data sets, assuring proper behaviour whilst preserving effective convergence (Figure 3).

Case studies
The denoising capabilities of the neural network were tested on two field data sets, one from a deep water survey offshore Brazil, the other from a shallow water example in the North Sea. In both cases, insufficient illumination and complex media cause coherent noise in the images that has seismic characteristics similar to the signal we want to preserve. The case studies demonstrate the ability of the trained network to attenuate the noise from data sets that represent two geologically different settings. We used only synthetic data to train the neural network; therefore, the case studies demonstrate the ability of the network to generalize outside the training data set.

We compared the performance of the CNN-based denoising tool with an application of a commonly used structure-oriented filter (Hale, 2011). We could parameterize the filter differently to preserve the primary energy; however, here we focused primarily on the noise attenuation aspect. More aggressive filter settings can better eliminate the noise, at the cost of damaging image resolution.

Example one – offshore Brazil
In the first example, from Brazil, there is strong and pervasive coherent noise. The upper yellow arrow in Figure 4 highlights where this is most evident. The middle yellow arrow in the same figure shows coherent noise above a high-contrast and rugose surface. The migration noise directly overlying the unconformity distorts the seismic events, making interpretation challenging. In all cases, the noise shares seismic characteristics with the signal that we want to use and preserve, such as dipping fault planes and the flanks of the deeper steep-sided body. Figure 5 shows the result of using the CNN on the input data. The blue arrows show the removal of the coherent noise. The reflectivity above the rugose unconformity is more continuous, no longer disrupted by the noise forms, and there is no noise contamination of the data abutting the deep steep-sided body.

The denoised section is much cleaner, and reflectivity is easier to track. Steep dipping energy, such as fault planes, is still present. The difference of the application (Figure 6) demonstrates the impact of the CNN: a large amount of noise has been removed. There are indications that the process has attenuated some steep dipping energy that correlates with the noise; however, the output section (Figure 5) shows that much of this energy remains unscathed.

Figure 3 Objective functions for both training and validation processes. The two were monitored during the process to confirm equivalent convergence for both.


Figure 4 Input seismic data. The yellow arrows show the migration-related noise the CNN model is attempting to remove.

Figure 5 CNN denoised data. Blue arrows indicate that the model has removed almost all the coherent noise from the seismic section.

Figure 6 The application difference shows the energy removed from the seismic section. Orange arrows indicate some primary energy has been attenuated.


Figure 7 The application of a conventional denoise process. The turquoise arrows indicate the residual noise left in the data, whilst the orange arrows highlight the attenuation of desirable signal.

Figure 7 demonstrates an application of a structure-oriented filter. The upper turquoise arrows indicate locations where the noise is still present, whilst the lower orange arrows show where the process has removed the complementary steep dipping signal we would ideally preserve. It is also important to emphasize that the application of the structure-oriented filter affected image resolution. This conventional method has not been as effective as the CNN approach.

Example two – North Sea
In the second example, poor illumination of a single high-contrast and undulating event causes localized migration-related noise (yellow arrows – Figure 8). The noise swings upwards, disrupting the events directly above the rugose event and making interpretation challenging.

The conventional denoise approach using structure-oriented filters mitigates some noise (turquoise arrows – Figure 9), but also smears some of the isolated injectites located in the shallower layer (orange arrows – Figure 9). The injectites have localized reservoir potential. In Figure 10 (blue arrows), we see that the trained neural network has removed more noise; the events overlying the source of the noise are easier to interpret. Figure 11 shows the impact of the deep-learning approach. The noise has almost been eradicated. It is worth noting the orange arrows in Figure 11. They show that the injectites are affected by the denoise process. Their shape is very similar to the coherent noise created in the migration process, as each migration-related noise form has an apex. Consequently, the process does attenuate some energy from the injectites, but no more than the conventional approach, which underperforms on the migration-related noise attenuation.

Discussion
The goal of this work is to demonstrate the denoising capabilities of a convolutional neural network, in particular the attenuation of coherent noise formed during the migration process.

Figure 8 North Sea input seismic data. Yellow arrows highlight the offending coherent noise.


Figure 9 The structure-oriented filtering result shows residual noise (turquoise arrows), and smearing of the shallower injectite energy (orange arrows).

Figure 10 The output from the application of the convolutional neural network's model. The blue arrows highlight the effectiveness of the process; almost all coherent noise has been removed.

Figure 11 The difference of the deep learning approach shows the removal of the offending noise. Some injectite energy has also been attenuated, but not fully removed.


The network was trained using approximately 100,000 input samples, and benefited from an augmentation process. The noise forms have a 3D nature; however, each input sample to the neural network training was 2D, as was the training process. Despite this, the transferability of the trained network to unseen field data shows significant potential, as the noise forms we were attempting to remove had a consistent nature. How generalizable the tool is to other forms of noise is still to be determined.

Improving the training process in a 3D sense would improve the results for noise that is three dimensional. In the current training process, the majority of the work was collecting an appropriate population of input samples, whereas the neural network training took an insignificant time once the optimum combination of hyperparameters was established. To extend the training to 3D would require considerably more effort. The network would become larger, and the tuning of the hyperparameters would be far more convoluted. Whilst this would be a more memory- and compute-intensive process, the trained network should be able to differentiate noise and signal more effectively.

It is also worth noting that we have focused on the effectiveness of the CNN at removing noise in post-stack images. The extension of the approach to pre- or sub-stack image domain data, for attribute analysis, still needs to be confirmed. Empirical evidence from our existing tests suggests that an application to pre-stack data will also assist amplitude-related attribute generation and analysis.

Conclusions
We have demonstrated a deep learning approach for attenuating migration noise from seismic images. Historically, removing this type of noise from seismic data has been challenging, as it shares many characteristics with the signal that we would like to preserve. We have trained a convolutional neural network to differentiate the noise from geological structures.

The application of the trained model to two field data sets demonstrates the potential of the solution for successfully attenuating migration noise without compromising the resolution or structural integrity of the seismic image; we see improvements in both structure and amplitude fidelity of the seismic image.

Acknowledgements
The authors wish to thank PGS for permission to publish the paper and PGS MultiClient for providing the field data examples.

References
Alerini, M. and Ursin, B. [2009]. Adaptive focusing window for seismic angle migration. Geophysics, 74 (1), S1-S10.
Ball, V., Blangy, J.P., Pringle, K. and Schwark, J. [2011]. Seismic rock physics in the presence of attribute noise. 81st SEG Annual Meeting, Extended Abstracts, 355-359.
Cambois, G. [2001]. AVO processing: Myths and reality. 71st SEG Annual Meeting, Extended Abstracts.
Chemingui, N. and Biondi, B. [2002]. Seismic data reconstruction by inversion to common offset. Geophysics, 67, 1575-1585.
Farmani, B. and Pedersen, M. [2020]. Extended attributes for machine learning denoise process: First step towards automation. 82nd EAGE Conference and Exhibition, Extended Abstracts.
Gardner, G.H.F. and Canning, A. [1994]. Effects of irregular sampling on 3-D prestack migration. 64th SEG Annual International Meeting, Expanded Abstracts, 1553-1556.
Goodfellow, I., Bengio, Y. and Courville, A. [2017]. Deep Learning. MIT Press, Cambridge, MA.
Hale, D. [2011]. Structure-oriented bilateral filtering of seismic images. 81st SEG Annual Meeting, Expanded Abstracts, 3596-3600.
Jia, Y. and Ma, J. [2017]. What can machine learning do for seismic processing? An interpolation application. Geophysics, 82 (3), 163-177.
Klokov, A. and Fomel, S. [2012]. Optimal migration aperture for conflicting dips. 82nd SEG Annual Meeting, Expanded Abstracts, 1-6.
Martin, T., Saturni, C. and Ashby, P. [2015]. Using machine learning to produce a global automated quantitative QC for noise attenuation. 85th SEG Annual Meeting, Expanded Abstracts, 4790-4794.
Øye, O.K. and Dahl, E.K. [2019]. Velocity model building from raw shot gathers using machine learning. 81st EAGE Conference and Exhibition, Extended Abstracts.
Ronneberger, O., Fischer, P. and Brox, T. [2015]. U-Net: Convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention (MICCAI), 9351, 234-241.
Schonewille, M., Klaedtke, A., Vigner, A., Brittan, J. and Martin, T. [2009]. Seismic data regularization with the anti-alias anti-leakage Fourier transform. First Break, 27, 85-92.
Turquais, P., Söllner, W. and Pedersen, M. [2019]. Parabolic dictionary learning: A method for seismic data reconstruction beyond the linearity assumption. 81st EAGE Conference and Exhibition, Extended Abstracts.
Yang, F. and Ma, J. [2019]. Deep-learning inversion: A next-generation seismic velocity model building method. Geophysics, 84 (4), 583-599.
Zheng, Y., Zhang, Q., Yusifov, A. and Shi, Y. [2019]. Applications of supervised deep learning for seismic interpretation and inversion. The Leading Edge, 38 (7), 526-533.

