2 Data Analysis
Big Data Analytics & Internet of Things in Upstream Oil and Gas Industry
• Characteristics – 6 V’s
• Volume, Variety, Velocity, Veracity, Value, and Variability
1/26/2023
Linear regression
• Measures of location
– Mean
– Median
– Mode
• Measures of spread
– Variance
– Standard deviation
– Interquartile range
• Measures of shape
– Coefficient of skewness
– Coefficient of variation
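The measures listed above can be computed with a few lines of Python. This is a minimal sketch using only the standard library; the function name `describe`, the moment-based skewness definition, and the porosity values are illustrative assumptions, not taken from the slides.

```python
import statistics


def describe(values):
    """Return common univariate summary statistics for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    std = statistics.stdev(values)                 # sample standard deviation (n - 1)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    # Moment-based skewness (assumed definition): mean cubed deviation
    # divided by the cube of the population standard deviation.
    pstd = statistics.pstdev(values)
    skew = sum((v - mean) ** 3 for v in values) / (n * pstd ** 3)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),   # sample variance (n - 1)
        "std": std,
        "iqr": q3 - q1,
        "skewness": skew,
        "cv": std / mean,                          # coefficient of variation
    }


# Hypothetical porosity fractions, for illustration only.
porosity = [0.10, 0.12, 0.12, 0.15, 0.18, 0.22, 0.30]
summary = describe(porosity)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A positive skewness indicates a longer tail toward high values, which is the case for the sample values used here.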
Measures of Spread
Measures of Shape
Bivariate Analysis
Graphing Univariate Data
Correlation
• There are three patterns one can observe on a scatterplot: the
variables are either positively correlated, negatively
correlated, or uncorrelated.
• Two variables are positively correlated if the larger values of
one variable tend to be associated with larger values of the
other variable, and similarly with the smaller values of each
variable. In porous rocks, porosity and permeability are
typically positively correlated.
• Two variables are negatively correlated if the larger values of
one variable tend to be associated with the smaller values of
the other. In geological data sets, the concentrations of two
major elements are often negatively correlated.
• The final possibility is that the two variables are not related. An
increase in one variable has no apparent effect on the other.
• Covariance:
• Correlation Coefficient:
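The slide formulas for covariance and the correlation coefficient did not survive extraction. As a sketch, assuming the usual sample covariance (dividing by n - 1) and the Pearson correlation coefficient, they can be computed as follows; the function names are illustrative.

```python
import math


def covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Small made-up series showing the positive and negative patterns.
x = [1, 2, 3, 4, 5]
print(pearson(x, [2, 4, 6, 8, 10]))   # positively correlated
print(pearson(x, [10, 8, 6, 4, 2]))   # negatively correlated
```

The correlation coefficient is dimensionless and lies between -1 and 1, whereas the covariance carries the units of both variables.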
Fig. 2.5A shows the porosity-permeability scatterplot for this dataset, indicating an
apparent exponential relationship.
In contrast, the same data after rank transformation show a much stronger linear
trend.
The Pearson correlation coefficient (CC) for these data is 0.789, which reflects the strength of the linear trend.
The Spearman CC is 0.916, which reflects the strength of the rank-transformed linear trend.
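The difference between the two coefficients can be illustrated with a short sketch. The Spearman CC is the Pearson CC applied to the ranks of the data. The porosity values and the exponential permeability relation below are made up for illustration; they are not the dataset of Fig. 2.5 and will not reproduce the values quoted above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data with an exact exponential relationship.
porosity = [0.05, 0.08, 0.10, 0.14, 0.18, 0.22, 0.25]
permeability = [math.exp(30 * p) for p in porosity]

print("Pearson :", pearson(porosity, permeability))
print("Spearman:", spearman(porosity, permeability))
```

Because the made-up relationship is monotonic, the ranks line up perfectly and the Spearman CC is higher than the Pearson CC, mirroring the behavior described for the rank-transformed data.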
Linear Regression
• Seismic attributes are divided into two categories:
– Horizon-based attributes: the average properties of the seismic
trace between two boundaries, generally defined by picked horizons
– Sample-based attributes: transforms of the input trace that
produce an output trace with the same number of
samples as the input.
• The coefficients a and b in this equation may be derived by
minimizing the mean-squared prediction error:
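The regression equation itself is not recoverable from the slide. As a sketch, assuming it is the straight line y = a + b x, the coefficients that minimize the mean-squared prediction error have a closed form; the function name `fit_line` and the sample data are illustrative.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the assumed model y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope that minimizes the squared error
    a = mean_y - b * mean_x    # intercept: the line passes through the means
    return a, b


def mean_squared_error(x, y, a, b):
    """Mean-squared prediction error of the line y = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


# Made-up attribute/target pairs, for illustration only.
attribute = [0.0, 1.0, 2.0, 3.0, 4.0]
target = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = fit_line(attribute, target)
print(a, b, mean_squared_error(attribute, target, a, b))
```

Any other choice of a and b gives a larger mean-squared error on the same data, which is what minimizing the prediction error means here.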