Guru Nanak Dev Engineering College, Ludhiana
CANDIDATE'S DECLARATION
I, “NAME OF THE STUDENT”, hereby declare that I have undertaken six-month training in partial fulfillment of the requirements for the award of the degree of B.Tech (Electronics and Communication Engineering) at GURU NANAK DEV ENGINEERING COLLEGE, LUDHIANA. The work presented in this training report, submitted to the Department of Electronics and Communication Engineering, is an authentic record of the training undertaken.
The department's areas of specialization include optical communication, among others. Ever since its inception, the department has been the hub of academic excellence through some great teachers who have trained generations of engineers from the region, engineers who have spread their wings all over the globe. The alumni of the department are not only excelling in India but also in Silicon Valley, and the current students are proving to be worthy emulators of the legacy of their seniors for the glory of their Institute.
SUMMER TRAINING CERTIFICATE
https://eict.iitr.ac.in/cert/cloudxlab/jpg/cert-L4WR6.jpg
1.) INTRODUCTION
The term Machine Learning was first coined by Arthur Samuel in the year 1959. Looking
back, that coinage turned out to be one of the most significant milestones in the development of modern computing.
If you browse through the net about ‘what is Machine Learning’, you’ll get at least 100
different definitions. However, the very first formal definition was given by Tom M.
Mitchell:
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
In simple terms, Machine Learning is a subset of Artificial Intelligence that provides machines the ability to learn automatically and improve from experience without being explicitly programmed to do so. In that sense, it is the practice of getting machines to solve problems by learning patterns from data.
Algorithm: A Machine Learning algorithm is a set of rules and statistical techniques used to learn patterns from data and draw significant information from it. It is the logic behind a Machine Learning model; an example is the Linear Regression algorithm.
Model: A model is the main component of Machine Learning and is trained using a Machine Learning Algorithm. An algorithm maps all the decisions that a model is supposed to take based on the given input, in order to get the correct output.
Predictor Variable: It is a feature (or set of features) of the data that can be used to predict the output.
Response Variable: It is the feature or the output variable that needs to be predicted by using the predictor variable(s).
Training Data: The Machine Learning model is built using the training data. The training
data helps the model to identify key trends and patterns essential to predict the output.
Testing Data: After the model is trained, it must be tested to evaluate how accurately it can predict an outcome; this is done with the testing data.
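To make these terms concrete, the following short sketch (written for this report with scikit-learn on made-up data, not taken from the training material) trains a model on training data and evaluates it on testing data:

# Minimal sketch of the terminology above (synthetic data; scikit-learn assumed installed).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Predictor variable X and response variable y (here y = 3x + noise)
X = np.arange(100).reshape(-1, 1)
y = 3 * X.ravel() + np.random.randn(100)

# Split into training data (used to fit the model) and testing data (used to evaluate it)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()      # the algorithm
model.fit(X_train, y_train)     # the trained model
print("R^2 on test data:", model.score(X_test, y_test))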
Traditional Approach
In the traditional approach, a programmer hand-codes the rules that map inputs to outputs; in the machine learning approach, the rules are learned automatically from data.
Supervised Learning
Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. Each training example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal).
Data Preparation
Training Step: design the algorithmic logic and train the model with the training data (Train X) to derive the relationship between x and y, that is, y = f(x).
Evaluation or Test Step: evaluate or test the learned function on held-out data. If the accuracy score is high, you have the final learned algorithm; if the accuracy score is low, go back to the training step.
Production Deployment: use the learned algorithm y = f(x) to predict on production data. The algorithm can be improved with more training data, more capacity, or an algorithm redesign.
Once the algorithm is trained, test it with test data (a set of data instances that the model has not seen during training). A model that fits the training data very closely may not work well on test data, and retraining may be needed to find a better fit. Overfitting describes the situation where the algorithm is not able to handle new testing data that it has not seen before. The technique used to keep the learned function generic is called regularization.
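As a hedged illustration of regularization (again a sketch on synthetic data, not project code), ridge regression penalizes large weights so that a flexible model stays generic enough to handle test data it has not seen before:

# Sketch: plain linear regression vs. a regularized (Ridge) model on the same features.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=30)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# High-degree polynomial features make it easy to overfit the small training set
overfit = make_pipeline(PolynomialFeatures(degree=12), LinearRegression())
regular = make_pipeline(PolynomialFeatures(degree=12), Ridge(alpha=1.0))

for name, model in [("no regularization", overfit), ("ridge", regular)]:
    model.fit(X_train, y_train)
    print(name, "test R^2:", round(model.score(X_test, y_test), 3))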
Common applications of machine learning in everyday life include:
Voice Assistants
Gmail Filters
Weather Apps
Supervised learning problems are typically divided into Regression and Classification problems.
Unsupervised Learning
Unsupervised learning is the training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance.
Clustering Algorithms
Clustering algorithms find data clusters so that each cluster has the most closely matched data.
Visualization Algorithms
Visualization algorithms take complex, unlabeled data and display this data in an intuitive 2D or 3D format. The data is separated into groups so that it can be plotted and understood easily.
Anomaly Detection
Anomaly detection algorithms identify data points that deviate significantly from the rest of the data set.
Semi-supervised Learning
Semi-supervised learning algorithms also make use of unlabeled data for training – typically a small amount of labeled data together with a large amount of unlabeled data. Semi-supervised learning falls between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data).
Google Photos automatically detects the same person in multiple photos from a vacation trip (unsupervised clustering). One has to just name the person once (supervised), and the name tag gets attached to that person in every photo.
Reinforcement Learning
Reinforcement learning allows a system to observe the environment and learn the ideal behavior based on trying to maximize a cumulative reward. It differs from supervised learning in that labelled input/output pairs need not be presented, and sub-optimal actions need not be explicitly corrected. Instead, the focus is on finding a balance between exploration of uncharted territory and exploitation of current knowledge (with partially supervised approaches used in certain cases).
Data Preprocessing
Data Preparation
Types of Data
Training Data
Test Data
Validation Data
Feature Engineering
The transformation stage in the data preparation process includes an important step known as Feature Engineering. Feature Engineering refers to selecting and extracting the right features from the data, that is, the features most relevant to the problem being solved.
Feature Selection: the most useful and relevant features are selected from the available data.
Feature Addition: new features are created by gathering new data.
Feature Extraction: existing features are combined to develop more useful ones.
Feature Filtering: irrelevant features are filtered out to make modeling easier.
Feature Scaling
Standardization: features are rescaled so that they have the properties of a standard normal distribution (a common requirement of many machine learning estimators and the underlying mathematical model). The mean of each feature is centered at zero, and the feature values have unit variance.
Normalization: in most cases, normalization refers to rescaling of data features between 0 and 1, which is a special case of min-max scaling.
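A small sketch of both techniques, assuming scikit-learn is available (the numbers are arbitrary):

# Standardization (zero mean, unit variance) vs. normalization (rescale to [0, 1]).
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.array([[50.0], [60.0], [80.0], [110.0]])   # one feature, arbitrary values

standardized = StandardScaler().fit_transform(X)  # mean 0, standard deviation 1
normalized = MinMaxScaler().fit_transform(X)      # values rescaled between 0 and 1

print(standardized.ravel())
print(normalized.ravel())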
1.1.1. Regression
It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables (or 'predictors').
More specifically, regression analysis helps one understand how the typical value
of the dependent variable (or 'criterion variable') changes when any one of the
independent variables is varied, while the other independent variables are held
fixed.
Linear Regression
Linear regression is a linear approach for modeling the relationship between
parameters.
y = wx + b
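For illustration only, the weight w and bias b can be estimated from data with ordinary least squares, for example using NumPy (this is not the project code):

# Recovering w and b of y = wx + b from noisy data with least squares (NumPy only).
import numpy as np

rng = np.random.RandomState(1)
x = np.linspace(0, 10, 50)
y = 2.5 * x + 4.0 + rng.normal(scale=1.0, size=x.size)   # true w = 2.5, b = 4.0

w, b = np.polyfit(x, y, deg=1)   # degree-1 polynomial fit = linear regression
print("estimated w:", round(w, 2), "estimated b:", round(b, 2))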
Polynomial Regression
Polynomial regression is used when the relationship between the variables cannot be captured by a straight line; a higher-degree polynomial is fitted instead.
Decision Trees and Random Forests
Decision trees split the data on feature values to predict a continuous output, and averaging many trees reduces the variance of a model. Random Forests use an ensemble of decision trees to perform regression tasks.
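A brief, hypothetical scikit-learn sketch of random forest regression (synthetic data, not related to the IPL project):

# Random forest regression: an ensemble of decision trees averaged together.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("test R^2:", round(forest.score(X_test, y_test), 3))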
Classification
Classification is the task of predicting a discrete class label (category) for an input variable. It is best used when the output has finite and discrete values.
Logistic Regression is one of the most widely used algorithms for binary classification problems; the output is squashed by the logistic (sigmoid) function, which has a characteristic S-curve.
Support Vector Machines (SVMs) are used for tasks such as face detection and work best when the classes are linearly separable.
Naïve Bayes classifiers apply Bayes' theorem with a strong (naïve) independence assumption between the features.
Decision Trees split the data using measures such as entropy and information gain (IG).
Random Forests operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes predicted by the individual trees.
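As a hedged example of a classifier, the sketch below trains a logistic regression model on scikit-learn's bundled iris data (illustrative only, not part of the project):

# Logistic regression classifier: predicts a discrete class label.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000)   # sigmoid/S-curve based classifier
clf.fit(X_train, y_train)
print("test accuracy:", round(clf.score(X_test, y_test), 3))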
Clustering means grouping data points into clusters based on similarity, so that the points within a cluster are as similar as possible. K-Means is a commonly used clustering algorithm: the number of clusters K can be chosen with the Elbow Method, and the similarity between points is measured with the Euclidean Distance in n-dimensional space. A typical example is classifying high-risk and low-risk patients from a patient pool.
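A small sketch of K-Means and the Elbow Method on synthetic data (scikit-learn assumed, not project code):

# K-Means clustering with the elbow method: inertia drops sharply up to the "elbow" K.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1))   # plot these values and look for the elbow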
Introduction to Deep Learning
ML Vs Deep Learning
Deep learning is a subset of machine learning that uses multi-layered artificial neural networks to learn features directly from large amounts of data, whereas classical ML usually relies on hand-engineered features.
1.2.2. Artificial Neural Networks
An Artificial Neural Network (ANN) is a computational model inspired by the way biological neural networks in the human brain process inputs. It is an interconnected group of nodes, akin to the vast network of neurons in the human brain.
1.2.3. TensorFlow
TensorFlow is an open-source library developed by the Google Brain team for numerical computation and large-scale machine learning; it is widely used to build and train deep learning models.
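A minimal, hypothetical TensorFlow/Keras sketch of a small neural network (not taken from the training project; it assumes the tensorflow package is installed):

# A tiny feed-forward neural network in TensorFlow/Keras.
import numpy as np
import tensorflow as tf

# Toy data: learn y = 2x + 1
X = np.linspace(-1, 1, 200).reshape(-1, 1).astype("float32")
y = 2 * X + 1

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(8, activation="relu"),   # hidden layer of nodes
    tf.keras.layers.Dense(1),                      # output layer
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=50, verbose=0)
print(model.predict(np.array([[0.5]], dtype="float32")))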
4.) Project
1.) First, a data set is required (an IPL data set of first innings); the link of the data set is:
Since the dawn of the IPL in 2008, it has attracted viewers all around the globe. High level
of uncertainty and last-moment nail-biters have urged fans to watch the matches. Within a
short period, IPL has become the highest revenue generating league of cricket. Data
Analytics has been a part of sports entertainment for a long time. In a cricket match, we
might have seen the score line showing the probability of the team winning based on the current match situation. Machine learning problems are broadly divided into the Regression problem and the Classification problem. The Regression problem deals with the kind of problems having continuous values as output, while in the Classification problem the outputs are categorical values. Since the output of score prediction is a continuous value, this project is treated as a regression problem.
2.) Apply the machine learning approach and choose the best-fit algorithm (a hedged sketch of these steps follows the list):
o Converting the column 'date' from string into date time object
o Ridge Regression
o Visualization using seaborn module
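The snippet below is only a hedged sketch of the steps listed above; the file name 'ipl.csv', the column names ('date', 'bat_team', 'bowl_team', 'total') and the saved model file name are assumptions, not details confirmed by this report:

# Sketch of the preprocessing/training steps listed above (column names are assumed).
import pickle
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

df = pd.read_csv('ipl.csv')                       # the IPL first-innings data set (path assumed)

# Convert the 'date' column from string into a datetime object
df['date'] = pd.to_datetime(df['date'])

# One-hot encode the batting and bowling teams (column names assumed)
df = pd.get_dummies(df, columns=['bat_team', 'bowl_team'], dtype=int)

# Visualization using the seaborn module (e.g., a correlation heatmap of numeric features)
sns.heatmap(df.select_dtypes('number').corr())
plt.show()

# Ridge regression on the numeric features; the target column is assumed to be 'total'
X = df.drop(columns=['total']).select_dtypes('number')
y = df['total']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

regressor = Ridge(alpha=1.0).fit(X_train, y_train)
print("test R^2:", round(regressor.score(X_test, y_test), 3))

# Save the trained model so the Flask app can load it (file name assumed)
pickle.dump(regressor, open('first-innings-score-model.pkl', 'wb'))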
Flask is a web framework. This means Flask provides you with tools, libraries and technologies that allow you to build a web application. This web application can be as simple as a few web pages or as large as a commercial website. Flask is a micro-framework, which means it has little to no dependencies on external libraries. This has pros and cons: the pros are that the framework is light and there are few dependencies to update and watch for security bugs; the cons are that sometimes you will have to do more work yourself or increase the list of dependencies by adding plugins.
Templating
Templates are files that contain static data as well as placeholders for dynamic
data. A template is rendered with specific data to produce a final
document. Flask uses the Jinja template library to render templates. In your
application, you will use templates to render HTML which will display in the
user's browser.
o Code of the Flask application
import pickle                      # used to load the trained model (assumed)
import numpy as np
from flask import Flask, render_template, request

app = Flask(__name__)

# Load the trained regression model (file name assumed; use the model saved during training)
regressor = pickle.load(open('first-innings-score-model.pkl', 'rb'))

@app.route('/')
def home():
    return render_template('index.html')

@app.route('/predict', methods=['POST'])
def predict():
    temp_array = list()
    if request.method == 'POST':
        # One-hot encode the batting team (order must match the training data columns)
        batting_team = request.form['batting-team']
        if batting_team == 'Chennai Super Kings':
            temp_array = temp_array + [1,0,0,0,0,0,0,0]
        elif batting_team == 'Delhi Daredevils':
            temp_array = temp_array + [0,1,0,0,0,0,0,0]
        elif batting_team == 'Kings XI Punjab':
            temp_array = temp_array + [0,0,1,0,0,0,0,0]
        elif batting_team == 'Kolkata Knight Riders':
            temp_array = temp_array + [0,0,0,1,0,0,0,0]
        elif batting_team == 'Mumbai Indians':
            temp_array = temp_array + [0,0,0,0,1,0,0,0]
        elif batting_team == 'Rajasthan Royals':
            temp_array = temp_array + [0,0,0,0,0,1,0,0]
        elif batting_team == 'Royal Challengers Bangalore':
            temp_array = temp_array + [0,0,0,0,0,0,1,0]
        elif batting_team == 'Sunrisers Hyderabad':
            temp_array = temp_array + [0,0,0,0,0,0,0,1]

        # One-hot encode the bowling team
        bowling_team = request.form['bowling-team']
        if bowling_team == 'Chennai Super Kings':
            temp_array = temp_array + [1,0,0,0,0,0,0,0]
        elif bowling_team == 'Delhi Daredevils':
            temp_array = temp_array + [0,1,0,0,0,0,0,0]
        elif bowling_team == 'Kings XI Punjab':
            temp_array = temp_array + [0,0,1,0,0,0,0,0]
        elif bowling_team == 'Kolkata Knight Riders':
            temp_array = temp_array + [0,0,0,1,0,0,0,0]
        elif bowling_team == 'Mumbai Indians':
            temp_array = temp_array + [0,0,0,0,1,0,0,0]
        elif bowling_team == 'Rajasthan Royals':
            temp_array = temp_array + [0,0,0,0,0,1,0,0]
        elif bowling_team == 'Royal Challengers Bangalore':
            temp_array = temp_array + [0,0,0,0,0,0,1,0]
        elif bowling_team == 'Sunrisers Hyderabad':
            temp_array = temp_array + [0,0,0,0,0,0,0,1]

        # Numeric features from the form
        overs = float(request.form['overs'])
        runs = int(request.form['runs'])
        wickets = int(request.form['wickets'])
        runs_in_prev_5 = int(request.form['runs_in_prev_5'])
        wickets_in_prev_5 = int(request.form['wickets_in_prev_5'])

        # Append the numeric features in the same order used during training
        temp_array = temp_array + [overs, runs, wickets, runs_in_prev_5, wickets_in_prev_5]

        data = np.array([temp_array])
        my_prediction = int(regressor.predict(data)[0])

        # Render the result page (template name and context variable are assumed)
        return render_template('result.html', prediction=my_prediction)

if __name__ == '__main__':
    app.run(debug=True)
o Template of the frontend (index.html)
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>First Innings Score Predictor</title>
<link rel="shortcut icon" href="{{ url_for('static', filename='ipl-favicon.ico') }}">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='styles.css') }}">
<script src="https://kit.fontawesome.com/5f3f547070.js" crossorigin="anonymous"></script>
<link href="https://fonts.googleapis.com/css2?family=Open+Sans:wght@300&display=swap" rel="stylesheet">
</head>
<body background="static/back.jpg">
<!-- Website Title -->
<div class="container">
<h2 class='container-heading'><span class='heading_first'>First Innings Score Predictor for </span><span class="heading_second">Indian Premier League (IPL)</span></h2>
<div class='description'>
<p>A Machine Learning Web App, Built with Flask, Deployed using Heroku.</p>
</div>
</div>
<div class="slider-frame">
<div class="slide-images-up">
</div>
</div>
<div class="slider-frame">
<div class="slide-images-down">
</div>
</div>
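<!-- Note added for clarity (not in the original excerpt): the input form that posts
     batting-team, bowling-team, overs, runs, wickets, runs_in_prev_5 and
     wickets_in_prev_5 to the /predict route is omitted from this excerpt. -->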
</body>
</html>
3.) Result and discussion
By filling in the required fields of the form, we get the predicted score, and the result is shown as below.
The project predicts the score by applying the linear regression algorithm and displays the result.
4.) CONCLUSION AND FUTURE SCOPE
Conclusion
We have only limited data, so we could not build very powerful models. We strongly believe that the model performance would increase if we were given larger amounts of data.
We also can't compare two projects based only on their MSE. For example, if in one project the output values are in the range (1, 10), and we predicted the value to be 1 while the actual value is 10, the error is 9. In another project, where the output values are in the range (300, 800), if we predicted the value to be 740 while the actual value is 790, the error is 50. But practically, we are performing better in the second project. So, we can't say which project is better based on the MSEs alone.
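To make the point concrete, a quick sketch comparing the two errors relative to their output ranges (numbers taken from the example above):

# Relative error puts the two errors on a comparable scale.
error_1, range_1 = 9, (1, 10)       # first project: predicted 1, actual 10
error_2, range_2 = 50, (300, 800)   # second project: predicted 740, actual 790

print(error_1 / (range_1[1] - range_1[0]))   # 1.0 -> 100% of the output range
print(error_2 / (range_2[1] - range_2[0]))   # 0.1 -> 10% of the output range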
Future scope
Moreover, setting aside the concern that AI/ML will steadily and inevitably take over large sectors of the workforce, AI/ML is also expected to pave the way for close to 2.3 million jobs by the year 2020, which points to a very promising future for the field of Machine Learning.
Reason for choosing Machine Learning
The job market for machine learning engineers is not just hot; it is sizzling.
References