
SPE-197932-MS

Decline Curve Analysis Using Artificial Intelligence

Shashipal Reddy Masini, Shubham Goswami, Aditya Kumar, and Balaji Chennakrishnan, Telesto Energy

Copyright 2019, Society of Petroleum Engineers

This paper was prepared for presentation at the Abu Dhabi International Petroleum Exhibition & Conference held in Abu Dhabi, UAE, 11-14 November 2019.

This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.

Abstract
Decline curve analysis (DCA) is the most widely used and practical approach for short-term production forecasting. The objective of this paper is to demonstrate automated DCA using artificial intelligence (AI) techniques.
Clustering techniques were used to identify the data points to which the decline curve is applicable. As a well's pressure depends on factors such as its own production rate and the production rates of the field and surrounding wells, multivariate regression was used to make the pressure prediction more robust. Because production forecasting is a time-series problem, models such as XGBoost (Extreme Gradient Boosting) and RNNs (Recurrent Neural Networks) (Lipton 2015) were chosen to perform this regression. After training on the data, the model's predictions were combined with well-modelling outputs to assess the well's flowing ability at each timestep.
The solution removes anomalous data and performs automated DCA in a scalable manner. The observed and calculated data agree closely (a very low root mean square error and a deviation of less than 5%). Probable well cessation times were predicted with minimal uncertainty (an average deviation of 3 months).
The novelty of this approach lies in the ability of AI and machine learning algorithms to perform DCA automatically and at scale. It frees engineers from data gathering, anomaly removal and rate-break identification, leading to quicker interpretation and more effective decision making. The approach can also be combined with AI-based well modelling to predict probable well shut-in timelines.

Introduction
DCA is the most widely used method for predicting oil or gas well production from historic performance. Usually, DCA is done on rate versus time or rate versus cumulative oil/gas data. Based on how the nominal decline rate changes with time, the production decline trend of a well is categorised into three types: exponential, hyperbolic and harmonic. Using these decline trends, production scenarios are estimated for planning the field's future operations and finances.
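For reference, the three Arps decline trends can be written as follows (a standard summary, not reproduced from this paper; $q_i$ is the initial rate, $D_i$ the initial nominal decline rate and $b$ the hyperbolic exponent):

$$q(t) = q_i \, e^{-D_i t} \quad \text{(exponential, } b = 0\text{)}$$
$$q(t) = q_i \, (1 + b D_i t)^{-1/b} \quad \text{(hyperbolic, } 0 < b < 1\text{)}$$
$$q(t) = q_i \, (1 + D_i t)^{-1} \quad \text{(harmonic, } b = 1\text{)}$$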
One of the main assumptions in DCA is that no operational changes are made in the field during the period covered by the data. For this reason, identifying rate breaks and operational changes is critical in DCA. With frequent changes in operational parameters (e.g. choke, back pressure, separator pressure), it is very difficult and time consuming in practice to identify all such rate breaks and operational changes manually. In most cases, these rate breaks are either ignored or go unnoticed because the process is so laborious. If the assumption is not verified, the resulting forecasts carry tangible uncertainty.
Our objective was to use advanced machine learning clustering techniques to solve this problem and to design a solution that can accurately identify rate breaks and points with similar operational constraints. Various techniques, such as Random Forest and density-based clustering, were tested and used in the study.
This paper summarises the process and algorithms adopted to solve this issue, which will help petroleum engineers identify rate breaks due to operational parameter changes quickly and automatically. We took advantage of recent advances in machine learning clustering techniques to solve this problem effectively. Random Forest and density-based clustering were used to identify points with the same operational conditions, so that DCA can be applied and the reliability of the analysis improved. In cases where operational constraints were not similar, multivariate pressure prediction algorithms were used to estimate well shut-in timelines.

Method and Process followed


For demonstration purposes, the solution was tested on the open-source Volve field dataset. Volve is an oil field located offshore Norway. To improve recovery and maintain reservoir pressure above the bubble point, water was injected through two injection wells; in total, the field had five production wells and two water injection wells. Daily production data and pressure data (for most producing days) were available for each well.
The dataset contained many blank cells as well as "NaN" entries. Since DCA is not possible without rates, all rows with missing well rates were removed.
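A minimal sketch of this cleaning step in Python with pandas is shown below; the file name and column names are assumptions for illustration, not the exact Volve schema:

import pandas as pd

# Load the daily production data (file and column names are assumed).
df = pd.read_csv("volve_daily_production.csv", parse_dates=["DATEPRD"])

# Treat literal "Nan" strings the same as genuinely blank cells.
df = df.replace("Nan", pd.NA)

# DCA is impossible without rates, so drop rows with a missing oil rate.
df = df.dropna(subset=["BORE_OIL_VOL"]).reset_index(drop=True)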
Daily choke data was available in the dataset. As the decline curve is only applicable to dates on which no operational changes were made, DBSCAN (density-based spatial clustering of applications with noise), aided by OPTICS (ordering points to identify the clustering structure), was used to identify the days on which the choke was constant and to exclude the days on which temporary operational changes were made.
DBSCAN is a clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu (1996). It is a non-parametric machine learning algorithm that clusters points based on density, separating low-density regions from high-density ones. DBSCAN groups points into neighbourhoods by counting the points that fall within a sphere of radius ε against a specified minimum number of points (MinPts). Points are classified as follows:

• Core points: points whose neighbourhood within radius ε contains at least the specified minimum number of points.
• Border points: points that lie inside a cluster (within ε of a core point) but whose own neighbourhood contains fewer than the specified minimum number of points.
• Outliers: points that belong to no cluster and whose neighbourhood also contains fewer than the specified minimum number of points.
The advantage of density-based clustering is that one need not specify the number of clusters in advance; this is the main reason the algorithm was chosen. A limitation of DBSCAN, however, is the need to specify parameters such as the radius ε and the minimum number of points for every new dataset. These parameters influence the effectiveness of the algorithm, and ideally the value of ε is chosen well by well.
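As a sketch of how this clustering step might look with scikit-learn's DBSCAN (the eps and min_samples values and the toy choke series below are illustrative only):

import numpy as np
from sklearn.cluster import DBSCAN

# Toy (day, choke) data: a 60-day period at one choke setting
# followed by a 40-day period at another.
days = np.arange(100, dtype=float)
choke = np.r_[np.full(60, 40.0), np.full(40, 70.0)]
X = np.column_stack([days, choke])

# eps (radius) and min_samples (MinPts) are illustrative values.
labels = DBSCAN(eps=20.0, min_samples=30).fit_predict(X)
print(np.unique(labels))  # -1 marks outliers; other labels are cluster ids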
Generally, the minimum number of points can be derived from the number of dimensions D of the dataset as MinPts ≥ D + 1 (and it should always be greater than 1). Larger values are usually better for noisy datasets and yield more significant clusters. As a rule of thumb, MinPts = 3 can be used; however, since the dataset consists of daily data and it is good practice to base decline curve forecasts on clusters spanning at least 30 days, a value above 30 may be more useful.
A suitable value of ε was identified by plotting the sorted distance of each point to its (MinPts − 1)-th nearest neighbour. The best value usually lies near the sharp inflection point (the "elbow") of this plot. If ε is chosen too large, all data points fall into a single cluster; if too small, most points remain unclustered. Choosing the right value of ε is therefore very important.
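One common way to build this plot, sketched with scikit-learn (note that the query point is returned as its own nearest neighbour, so asking for MinPts neighbours yields the distance to the (MinPts − 1)-th other point):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import NearestNeighbors

min_pts = 30
X = np.random.rand(200, 2)  # stand-in for the (day, choke) feature matrix

# Distance from each point to its MinPts nearest neighbours (self included).
dist, _ = NearestNeighbors(n_neighbors=min_pts).fit(X).kneighbors(X)

# Sort the largest of those distances; the elbow of this curve suggests eps.
plt.plot(np.sort(dist[:, -1]))
plt.xlabel("points sorted by distance")
plt.ylabel("distance to (MinPts - 1)-th neighbour")
plt.show()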
To avoid this manual parameter input, the OPTICS algorithm was used. OPTICS finds density-based clusters in spatial data; it was presented by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel and Jörg Sander in 1999 (reviewed in Wowczko 2013). OPTICS is an improved version of DBSCAN that removes the need to iterate towards an optimal ε value, so optimal clustering can be achieved even in data with varying densities. It does this by ordering the data points by spatially closest distance; the minimum number of points then essentially becomes the minimum cluster size to find.
The ε distance estimated by OPTICS was then used in DBSCAN to cluster the data further, with the specified minimum number of points defining how many points are required to form a cluster. To account for more densely packed clusters, OPTICS assigns each point a core distance describing the distance to its MinPts-th nearest point. The parameter ε can usually be set to a maximum value; when a spatial index is available, it also affects the computational complexity.
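scikit-learn ships an OPTICS implementation that fits this workflow; the sketch below (on synthetic data) builds the ordering once and then extracts DBSCAN-style flat clusters at a chosen ε:

import numpy as np
from sklearn.cluster import OPTICS, cluster_optics_dbscan

X = np.random.rand(300, 2)  # stand-in for the per-well feature matrix

# min_samples acts as the minimum cluster size; no eps has to be supplied.
optics = OPTICS(min_samples=30).fit(X)

# Extract DBSCAN-equivalent clusters from the OPTICS ordering at eps=0.1
# (an illustrative value).
labels = cluster_optics_dbscan(
    reachability=optics.reachability_,
    core_distances=optics.core_distances_,
    ordering=optics.ordering_,
    eps=0.1,
)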
The clusters obtained by the methods above were fitted with linear or polynomial curves and, based on the curve coefficients, the clusters with frequent choke changes were removed. This leaves the data for days on which very few production changes were made; per the decline curve assumption, the oil production data on these days is suitable for DCA.
The oil production rates of the clustered data points were then used for DCA. Two algorithms were built and assessed: conventional curve fitting using exponential, harmonic and hyperbolic curves, and an RNN. Because the data frequency was low, the RNNs were not used, and for demonstration purposes only exponential decline was applied to the clustered data.
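A sketch of the exponential fit on one cluster with SciPy; the synthetic rate series stands in for a real clustered well:

import numpy as np
from scipy.optimize import curve_fit

def exp_decline(t, qi, di):
    # Arps exponential decline: q = qi * exp(-di * t)
    return qi * np.exp(-di * t)

# Synthetic stand-in for one cluster's daily oil rates.
t = np.arange(120, dtype=float)
q = 5000.0 * np.exp(-0.004 * t) * (1.0 + 0.02 * np.random.randn(t.size))

(qi, di), _ = curve_fit(exp_decline, t, q, p0=(q[0], 0.001))
print("qi = %.0f stb/d, nominal decline = %.4f per day" % (qi, di))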

Method for Pressure Prediction


When frequent operational changes are made in the field, traditional decline curve forecasting cannot be applied. In such cases, a multivariate regression approach was used to predict future pressures, following the steps below.
A different anomaly removal process was used for the pressure prediction. DBSCAN was used mainly to remove points with zero pressure values, as well as points where both pressure and production are zero. All remaining points were kept and stored in an anomaly-free data frame.
The analysis was built on the logic that the pressure change in the reservoir/well is caused by the production of fluids. All production rates were converted into reservoir barrels using the available PVT data. The input features were the cumulative production of all wells (cumulative gas, oil and water in reservoir barrels) and the total production rate of the well being analysed (in reservoir barrels); pressure was the output feature. Different combinations of these features were tested to check the prediction efficiency.
All features were scaled individually using the MinMax scaler of sklearn (a Python machine learning module), so that each can be inverse-transformed independently at a later stage.
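A sketch of this per-feature scaling, keeping one scaler per column so each can be inverse-transformed on its own later (column names are illustrative):

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({  # toy stand-in for the feature frame
    "cum_oil_rb": np.cumsum(np.random.rand(100)),
    "well_rate_rb": np.random.rand(100),
    "pressure": 250.0 - 0.5 * np.arange(100),
})

scalers, scaled = {}, df.copy()
for col in df.columns:
    scalers[col] = MinMaxScaler()
    scaled[col] = scalers[col].fit_transform(df[[col]])

# Predictions made in scaled space can later be mapped back per feature:
pressure = scalers["pressure"].inverse_transform(scaled[["pressure"]])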
Once the different feature combinations had been identified, machine learning methods such as linear regression (statisticssolutions.com), XGBoost, Random Forest and LSTM (long short-term memory) networks (Gers 1999) were used to build different models.

Random Forest
Random Forests are an advanced version of decision trees. Instead of taking the output of a single tree, the outputs of all the decision trees are averaged (as in bagging). To ensure that each tree is different, every tree is trained on data randomly sampled from the training set. This greatly reduces the variance of the model and avoids overfitting. Random Forest is an ensemble algorithm.
The first algorithm for random decision forests was created by Tin Kam Ho (Livingston 2005) using the random subspace method, which, in Ho's formulation, implements the "stochastic discrimination" approach to classification proposed by Eugene Kleinberg.
Usually, prediction accuracy and robustness increase with the number of trees, so in a random forest the more trees there are, the lower the RMSE (root mean square error) of the model. The algorithm uses both bootstrapping and bagging. Because a single tree's prediction is sensitive to noise, averaging over many trees reduces both the noise and the variance of the prediction without increasing bias. However, training all of these trees on a single dataset would produce a number of correlated trees; bootstrap sampling is done to reduce or remove this correlation (decorrelation).
The number of trees required for good training varies with the dataset; with the dataset used here, 800-1200 trees gave acceptable results. The optimal number of trees was tuned by observing the out-of-bag error and the results of cross-validation. The out-of-bag error is the mean prediction error of each sample, computed using only the trees that did not see that sample during training.
Apart from the bagging described above, Random Forest adds one more operation: a modified learning procedure that selects a random subset of features at each candidate split. This decorrelates the individual trees and is known as feature bagging.
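A sketch of such a configuration with scikit-learn, using a tree count in the 800-1200 range reported above and the out-of-bag score for tuning (the data is synthetic):

import numpy as np
from sklearn.ensemble import RandomForestRegressor

X = np.random.rand(500, 6)                           # stand-in feature matrix
y = 250.0 - 100.0 * X[:, 0] + np.random.randn(500)   # stand-in pressure target

rf = RandomForestRegressor(
    n_estimators=1000,    # within the 800-1200 range found acceptable
    max_features="sqrt",  # feature bagging: random subset at each split
    oob_score=True,       # out-of-bag estimate, used to tune n_estimators
    random_state=0,
).fit(X, y)
print("out-of-bag score:", rf.oob_score_)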

XGBoost
XGBoost (Chen 2016) was chosen because it is a well-regarded, go-to algorithm for building strong regression and classification models on structured datasets. The underlying technique also goes by names such as gradient boosting, stochastic gradient boosting and multiple additive regression trees; XGBoost is an implementation of gradient-boosted trees designed for improved performance.
Boosting is an ensemble technique in which new models are added to correct the errors made by existing models; models are added sequentially until no further improvement can be made.
Gradient boosting is an approach in which new models are created to predict the residuals (errors) of prior models and are then added to make the final prediction. It is called gradient boosting because it uses a gradient descent algorithm to minimise the loss when adding new models.
As opposed to bagging, where all trees are built independently and their outputs averaged, in boosting the trees are computed in sequence. After the error of each tree is calculated, the gradient of the loss function (comprising an error metric and a complexity metric) is minimised to reduce the error produced by the previous tree. Each tree thus updates the residual errors of the previous model.
Because predicting pressure from features such as production rates and choke data is a time-series problem, different iterations with different lagged features were run to assess the prediction loss.
In the final model, the features were the lagged bottom-hole pressures, the well's cumulative production (in reservoir barrels), the well's production rates (in reservoir barrels) and the field's cumulative water injection (in reservoir barrels). Instead of predicting directly on the test data, recursive prediction was used: the prediction for the current time step becomes an input for the next step. Recursive prediction simulates the real-life scenario as closely as possible, so it was used for all predictions.
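A sketch of the recursive loop, assuming a trained regressor whose first input is the lagged bottom-hole pressure and whose remaining inputs are known exogenous features (names and shapes are illustrative):

import numpy as np
import xgboost as xgb

def recursive_predict(model, exog_test, last_bhp):
    # Feed each step's predicted pressure back in as the next step's lag.
    preds, bhp = [], last_bhp
    for row in exog_test:  # known exogenous features per time step
        x = np.concatenate(([bhp], row)).reshape(1, -1)
        bhp = float(model.predict(x)[0])
        preds.append(bhp)
    return np.array(preds)

# Toy demonstration with synthetic data.
model = xgb.XGBRegressor(n_estimators=200).fit(
    np.random.rand(300, 4), np.random.rand(300))
print(recursive_predict(model, np.random.rand(50, 3), last_bhp=0.5)[:5])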
Sensitivities on different lags and features were run. Cross-validation for time series differs from standard machine-learning cross-validation because of the time ordering involved: in standard machine learning, a random subset of the data is held out as a validation set to estimate accuracy, whereas in a time-series problem a future value is predicted, so the validation data must always occur after the training data. Two schemes, sliding window and forward chaining, can be used for time-series cross-validation. Here, forward chaining was used because it is closer to the real-life estimation situation.
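scikit-learn's TimeSeriesSplit implements exactly this forward chaining scheme: each fold trains on data from the start up to a cut-off and validates on the block that follows. A sketch on synthetic data:

import numpy as np
import xgboost as xgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

X = np.random.rand(400, 4)  # stand-in, time-ordered feature matrix
y = np.random.rand(400)     # stand-in pressure target

for i, (train_idx, val_idx) in enumerate(TimeSeriesSplit(n_splits=4).split(X)):
    model = xgb.XGBRegressor(n_estimators=200).fit(X[train_idx], y[train_idx])
    rmse = mean_squared_error(y[val_idx], model.predict(X[val_idx])) ** 0.5
    print("iteration %d: validation RMSE = %.3f" % (i + 1, rmse))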

Presentation of Data and Results


DCA Data and Results. The initial choke and oil production data are shown in Figure 1 (a) and (b).

Figure 1—(a) Actual Choke Data (b) Actual Oil Production Data

If the conventional method is followed, different decline-rate regimes can be identified for each well over time, as shown in Figure 2. This happens because changes in operational constraints affect the apparent decline rates, which in turn increases the forecast uncertainty. Manually identifying periods with similar operational constraints is difficult, owing to the number of parameters to be examined and the number of wells to be analysed. This is where the proposed method helps engineers, by reducing both the analysis time and the forecast uncertainty.

Figure 2—Decline rates derived without validating the similar operating parameters

After applying DBSCAN with OPTICS and curve fitting to the choke data, the choke data clusters shown in Figure 3 were obtained. The advantage of this method is that it clearly identifies the points at which the decline curve method is applicable, reducing the uncertainty that arises from forecasting on the wrong part of the curve. It also gives a fair idea of historic decline rates, which can be used in other analyses such as history matching.

Figure 3—(a) After clustering Choke Data (b) After clustering Oil Production Data

The difference between the decline rates assessed over time by the conventional method and by the new clustering method is shown in Figure 4 (a) and (b) (the left-hand side represents the traditional method; the right-hand side, the new clustering method).

Figure 4—(a) Derived decline-rate phases without considering similar operating constraints
(b) Decline phases identified from the data honoring decline curve theory assumptions

Owing to a limitation of the dataset, which contains only choke data, clustering was done on choke data versus time. The algorithm can, however, cluster multi-dimensional data (e.g. separator pressure, multiple chokes). An added advantage of this method is that, if operational parameters are not constant over a specified timeline, the algorithm displays a warning that "the data is not suitable for DCA".
Pressure Prediction Data and Results. For wells where conventional DCA is not applicable, the pressure prediction algorithm was useful in assessing the well's producing ability and shut-in timelines.
A separate anomaly removal process was used for pressure prediction: points with allocation issues or zero pressure values were removed first using DBSCAN. After clustering, as shown in Figure 5, the data was used to build different machine learning models, such as linear regression, XGBoost, Random Forest and RNN.

Figure 5—Data before and after clustering for the pressure prediction algorithm

Different loss functions were used to estimate the error; to penalise large-magnitude errors more heavily, MSE (mean square error) and RMSE were used in the iterations, and RMSE was found to give the best results. Results were assessed against the following criteria:

• RMSE of the fit and out-of-bag error
• Train vs test RMSE and R² difference
• Average of forward chaining validation scores and errors
• Whether the trends are followed in the prediction
• Testing the pressure prediction trends by applying multipliers to well and field rates

These criteria were used to select the most suitable model for the prediction. The XGBoost model performed better than the other models tested and was selected as the final model; Table 1 shows its parameters.

Table 1—Parameters used in Selected XGboost Model

The following features were used in the model:

• Well cumulative oil volume in reservoir barrels
• Well cumulative water volume in reservoir barrels
• Field cumulative oil volume in reservoir barrels
• Field cumulative water volume in reservoir barrels
• Well oil rate in reservoir barrels
• Well water rate in reservoir barrels
• Choke size of the well
• Lagged values of bottom-hole pressure, to convert the model into a time-series model

The parameters used in the final XGBoost model and the model statistics are shown in Table 1.
As shown in Table 2 and Table 3, the train and test RMSEs were satisfactory. To further check prediction reliability, given that this is a time-series problem, recursive prediction was done to simulate the real-life scenario: the model adds each time step's predicted pressure to the next time step's test inputs. Figure 6 shows the outcome of the recursive prediction.

Table 2—Model metrics of XGBoost Recursive Model-Train

Table 3—Model metrics of XGBoost Recursive Model-Test

Figure 6—Recursive Prediction results on the test Data

To further validate the model, a forward chaining prediction algorithm was used. In each iteration, the algorithm takes the dataset from the well start date up to a different specified point in the history and predicts the future values. Forward chaining was used instead of k-fold cross-validation because it better represents real-world production forecasting. The results are shown in Figure 7 to Figure 10.

Figure 7—Forward chaining Prediction results on Train and Test data-iteration 1



Figure 8—Forward chaining Prediction results on Train and Test data-iteration 2

Figure 9—Forward chaining Prediction results on Train and Test data-iteration 3

Figure 10—Forward chaining Prediction results on Train and Test data-iteration 4



Table 4—Model Metrics- Forward chaining prediction-Train Data

Table 5—Model Metrics- Forward chaining prediction-Test Data

Both the recursive prediction and the forward chaining validation statistics suggest that the model fit was good.

Conclusions
Identifying changes in operating parameters over time has been a bottleneck in DCA and, if not done properly, causes tangible uncertainty. The proposed solution helps reduce this uncertainty by utilising the power of newly available technology in the following ways:

• Clustering points with similar operational conditions makes DCA faster and more accurate without compromising scalability. This is a major value addition for fields with a large number of wells.
• Automated pressure prediction using multivariate regression forecasts future bottom-hole pressures, giving an idea of the well's shut-in time. In the absence of a history-matched model, this process is very useful for wells with frequent operational changes.
• Both solutions are scalable and can be deployed on a real-time or near-real-time basis.

Acknowledgements
The authors would like to thank Telesto Energy for providing permission to publish this work.

Nomenclature
ε = Maximum distance between two points for them to be considered part of the same cluster
R² = Coefficient of determination

Abbreviations
AI Artificial Intelligence
DBSCAN Density-based spatial clustering of applications with noise
LSTM Long short-term memory
MAPE Mean Absolute Percentage Error
MAE Mean Absolute Error
MSE Mean Square Error
OPTICS Ordering points to identify the clustering structure
RMSE Root Mean Square Error
RNN Recurrent Neural Network
sklearn A Python machine learning module
XGBoost Extreme Gradient Boosting

References
Lipton, Z. C., Berkowitz, J., and Elkan, C. 2015. A Critical Review of Recurrent Neural Networks for Sequence Learning. https://arxiv.org/abs/1506.00019
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. 1996. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. https://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf
Wowczko, I. A. 2013. Density Based Clustering with DBSCAN and OPTICS - Literature Review. https://www.academia.edu/8142139/Density_Based_Clustering_with_DBSCAN_and_OPTICS_-Literature_Review
statisticssolutions.com. Linear Regression implementation. https://www.statisticssolutions.com/what-is-linear-regression/
Gers, F. A., and Schmidhuber, J. 1999. Learning to Forget: Continual Prediction with LSTM. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.55.5709&rep=repl&type=pdf
Livingston, F. 2005. Implementation of Breiman's Random Forest Machine Learning Algorithm. ECE591Q Machine Learning Journal Paper. https://datajobs.com/data-science-repo/Random-Forest-%5bFrederick-Livingston%5d.pdf
Chen, T., and Guestrin, C. 2016. XGBoost: A Scalable Tree Boosting System. https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf

Appendix

Figure (a)—Clustering output for Well1

Figure (b)—Clustering output for Well2



Figure (c)—Clustering output for Well0
