Group6 EMD6M4A AS2.docx (Report)
ASSIGNMENT 2
LECTURER NAME
DR. WAN SULAIMAN WAN MOHAMAD
No Name Student ID
1. NUR AYUNI BINTI MOHD SAWAL 2018801192
2. ABDUL HAKIM NADZMI BIN ABD RAHMAN 2018259958
3. MIOR MUHAMMAD MUAZ BIN MIOR HANIP 2018660092
4.
5.
TOTAL 20%
REMARKS:
MARKING RUBRIC CO3
Scale: 5 = Excellent, 4 = Good, 3 = Satisfactory, 2 = Poor, 1 = Very Poor

Criteria: Defining the problem
Excellent (5): Student states the problems and objectives clearly and identifies underlying issues.
Good (4): Student states the problems and objectives clearly.
Satisfactory (3): Student adequately defines the problems and objectives.
Poor (2): Student fails to define the problems and objectives adequately.
Very Poor (1): Student does not identify the problems and objectives.
Marks:

Criteria: Developing a plan to solve the problem
Excellent (5): Student develops a clear and concise plan to solve the problem, offering alternative strategies, and follows the plan to conclusion and is able to solve the problem.
Good (4): Student develops a single plan and follows it to conclusion and is able to solve the problem.
Satisfactory (3): Student develops an adequate plan and follows it to conclusion and is able to solve the problem.
Poor (2): Student develops a marginal plan and does not follow it to conclusion and is not able to solve the problem.
Very Poor (1): Student does not develop a coherent plan to solve the problem and is not able to solve the problem.
Marks:

Criteria: Collecting and analyzing information
Excellent (5): Student collects information from multiple sources and analyzes the information in-depth.
Good (4): Student collects and analyzes adequate information.
Satisfactory (3): Student collects information and performs basic analyses.
Poor (2): Student collects inadequate information to perform meaningful analyses.
Very Poor (1): Student collects no viable information.
Marks:

Criteria: Interpreting findings and solving the problem
Excellent (5): Student provides a logical interpretation of the findings and clearly solves the problem, offering alternative solutions.
Good (4): Student provides a logical interpretation of the findings and solves the problem but fails to provide alternatives.
Satisfactory (3): Student provides an adequate interpretation of the findings and solves the problem but fails to provide alternatives.
Poor (2): Student provides an inadequate interpretation of the findings and does not derive a logical solution to the problem.
Very Poor (1): Student does not interpret the findings/reach a conclusion.
Marks:
TOTAL
LIST OF TABLES
Table 1 Linear regression data
Table 2 Polynomial regression data
1. INTRODUCTION
We will often have occasion to fit curves to data points. The techniques developed for this
purpose can be divided into two general categories: regression and interpolation. Regression is
employed where there is a significant degree of error associated with the data. Experimental
results are often of this kind. For these situations, the strategy is to derive a single curve that
represents the general trend of the data without necessarily matching any individual points. In
contrast, interpolation is used where the objective is to determine intermediate values between
relatively error-free data points. Such is usually the case for tabulated information. For these
situations, the strategy is to fit a curve directly through the data points and use the
curve to predict the intermediate values.
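The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the function name `interpolate_linear` and the sample points are assumptions made for this sketch and are not data from this report. The function passes a straight segment through two neighbouring tabulated points and reads off an intermediate value, which is what interpolation does. Regression, by comparison, would fit one trend line that need not pass through any individual point.

```python
def interpolate_linear(xs, ys, x):
    """Linearly interpolate y at x from at least two tabulated points (xs ascending)."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the tabulated range")
    # Find the pair of neighbouring points that brackets x.
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            y0, y1 = ys[i], ys[i + 1]
            # Straight line through (x0, y0) and (x1, y1), evaluated at x.
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical tabulated values, assumed to be error-free (illustration only).
table_x = [1.0, 2.0, 3.0]
table_y = [10.0, 20.0, 40.0]
print(interpolate_linear(table_x, table_y, 2.5))
```

Because the curve is forced through every tabulated point, any measurement error in the table is carried straight into the predicted value, which is why interpolation suits relatively error-free data.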
Regression analysis is a reliable method of identifying which variables have an impact on
a topic of interest. The process of performing a regression allows you to confidently determine
which factors matter most, which factors can be ignored, and how these factors influence each
other. In order to understand regression analysis fully, it’s essential to comprehend the
following terms:
Dependent Variable: This is the main factor that you’re trying to understand or predict.
Independent Variables: These are the factors that you hypothesize have an impact on your
dependent variable.
In this assignment, we use two types of regression: linear regression and polynomial
regression. Linear regression relates a predictor variable and a dependent variable through a
best-fit straight line, and it is used when the variables are related linearly, for example when
forecasting the effect of increased advertising spend on sales. However, a least-squares line is
sensitive to outliers, so the data should be checked for extreme values before the fit is
trusted. Polynomial regression is a procedure for determining the coefficients of a polynomial
of second degree or higher such that the polynomial best fits a given set of data points. As in
linear regression, the equations used to determine the coefficients are derived by minimizing
the total error. It is used when the data points follow a non-linear trend.
We applied these methods to data on fatal construction site accidents. Each day, on average,
two construction workers die of work-related injuries in Malaysia. In fact, one in five
workplace fatalities is construction-related. The top causes of construction-related fatalities
are falls, being struck by an object, electrocution and being caught between objects. Because
of the numerous hazards on construction sites, many safety precautions exist to protect
workers, and the industry is subject to safety laws and codes. Accidents still happen,
however, and when they do the worker can pay the price with a life-changing injury.
Plotting the data points on an x-y graph is the first stage of the regression analysis. A line
of best fit is then built using the least squares method to show the likely relationship
between the variables.
2. PROBLEM STATEMENT
The construction industry is growing rapidly and is highly profitable. It is one of the leading
industries contributing to a country's economy. It provides employment opportunities and
supports economic development, especially in developing countries such as China, India,
Indonesia, and Malaysia. However, construction is a risky activity in which different parties
face many challenges in one environment. An analysis of secondary data from the Malaysia
Social Security Organization (SOCSO) found that 2,822 casual occupational injuries occurred in
Malaysia, with an average annual incidence of 9.2 fatal job-related injuries per 100,000
workers. Recently published data from the Department of Occupational Safety and Health (DOSH)
revealed that 1,116 work-related accidents occurred over the period 2011 to 2016 and that
37.85%–51.50% of accidents resulting in non-permanent disability, permanent disability, and
death occurred on construction sites.
Linear regression
For this assignment, the number of deaths during construction from 2005 to 2019 was
recorded in Excel and fitted using the linear regression method. Linear regression
fits a straight line to a set of n data points. The equation of the line is
$f(x) = a_0 + a_1 x + e$, where $a_0$ is the intercept, $a_1$ is the slope and $e$ is
the residual. Three criteria can be considered for a best-fit line: minimizing the sum
of the residuals, minimizing the sum of the absolute values of the residuals, and
minimizing the maximum error of any individual point.
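The straight-line fit described above can be sketched in Python as follows. This is a minimal illustration and not the Excel workbook used for the report: the function names `fit_line` and `residuals` and the sample data are assumptions introduced here. The sketch uses the least squares criterion (minimizing the sum of the squared residuals), which is the criterion this report names for building the line of best fit.

```python
def fit_line(xs, ys):
    """Least-squares straight line f(x) = a0 + a1*x; returns (a0, a1)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical")
    a1 = (n * sum_xy - sum_x * sum_y) / denom  # slope
    a0 = sum_y / n - a1 * sum_x / n            # intercept
    return a0, a1


def residuals(xs, ys, a0, a1):
    """Residual e_i = y_i - (a0 + a1*x_i) for every data point."""
    return [y - (a0 + a1 * x) for x, y in zip(xs, ys)]


# Hypothetical data for illustration only (not the values used in this report).
years = [1, 2, 3, 4, 5]
deaths = [2.0, 4.1, 5.9, 8.2, 9.8]
a0, a1 = fit_line(years, deaths)
print(a0, a1)
print(sum(e ** 2 for e in residuals(years, deaths, a0, a1)))
```

Using a year index (1, 2, 3, ...) rather than the calendar year keeps the sums small, which is a design choice of this sketch to limit round-off error.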
Polynomial regression
In MATLAB, polynomial regression was used to plot the number of deaths during
construction against year. This method minimizes the sum of the squares of the
estimated residuals. The first-order polynomial is simply a straight line, which
leaves large residuals. The second-order polynomial reduces the residuals and
produces a curved graph. The formula for second-order polynomial regression is
$f(x) = a_0 + a_1 x + a_2 x^2 + e$.
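A second-order fit of this kind can be sketched in Python by setting up and solving the least-squares normal equations. This is an illustrative sketch under stated assumptions, not the MATLAB script used for the report: the function names `solve_linear_system` and `fit_polynomial` are hypothetical, and the sample data are generated from a known quadratic so the result can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's lists are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_polynomial(xs, ys, order=2):
    """Least-squares polynomial of the given order; returns [a0, a1, ..., a_order]."""
    if len(xs) != len(ys) or len(xs) <= order:
        raise ValueError("need more data points than the polynomial order")
    size = order + 1
    # Normal equations: entry (i, j) is sum(x^(i+j)); right-hand side i is sum(x^i * y).
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(a, b)


# Hypothetical data generated from y = 1 + 2x + 3x^2 (illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 6.0, 17.0, 34.0, 57.0]
print(fit_polynomial(xs, ys, order=2))
```

Calling `fit_polynomial(xs, ys, order=1)` gives the first-order (straight-line) fit from the same routine, so the two models can be compared on identical data.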
4.3 Coding
Linear regression
In the table below, x and y represent the year and the number of deaths during construction, respectively, for Malaysia from 2005 to 2019.
First-order fit:
$\bar{y}$ = 7.2
$S_t$ = 24368
$S_r$ = 9359.071429
$r^2$ = 0.615928

Second-order fit:
$\bar{y}$ = 7.2
$S_t$ = 24368
$S_r$ = 7720.415966
$r^2$ = 0.683174
The first-order polynomial denotes a linear equation, whereas the second-order polynomial
denotes a quadratic equation. When the linear regression model does not capture the pattern of
the data points, a higher-order polynomial is used. An $r^2$ of exactly one indicates a
perfect fit, and $r^2$ cannot exceed one; the closer it is to one, the better the fit. For this
data set, raising the polynomial order from one to two improved the fit: $r^2$ is 0.615928 for
the first order and 0.683174 for the second order, so the second-order model explains about
68% of the variation in the data and is the better of the two.
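The $r^2$ values can be checked from the reported $S_t$ and $S_r$ using the standard definition $r^2 = (S_t - S_r)/S_t$. The Python sketch below is an illustration only; the function names `total_sum_of_squares` and `r_squared` are assumptions introduced here, while the numerical inputs are the $S_t$ and $S_r$ values reported in this section.

```python
def total_sum_of_squares(ys):
    """S_t: sum of squared deviations of the data about their mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(s_t, s_r):
    """Coefficient of determination r^2 = (S_t - S_r) / S_t."""
    if s_t <= 0:
        raise ValueError("S_t must be positive")
    return (s_t - s_r) / s_t


# S_t and S_r as reported in this section for the first- and second-order fits.
S_T = 24368
r2_first = r_squared(S_T, 9359.071429)
r2_second = r_squared(S_T, 7720.415966)
print(round(r2_first, 6), round(r2_second, 6))
```

Both results agree with the reported 0.615928 and 0.683174 to six decimal places. Since $S_r$ can never be negative, the formula also shows why $r^2$ cannot exceed one.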
Finally, there are advantages and disadvantages to using polynomial regression to fit the
data. The advantage is that it works on a dataset of any size and works very well on
non-linear problems. The disadvantage is that the user must select the right polynomial degree
to achieve a good bias-variance trade-off.
5. CONCLUSIONS AND RECOMMENDATION
CONCLUSION
Advantage
Disadvantage
Recommendation