AIDI - 1010 - WEEK2 - Google Colab - v1.2
WEEK2
Jahanzeb Abbas (JB)
Week Objectives
(A) Review of Python/Modules
Modules
BootCamp Video
https://youtu.be/rfscVS0vtbw
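As a quick refresher on modules, the sketch below shows three common import forms. The modules used (math, statistics, random) and the sample values are illustrative choices from the standard library, not taken from the course material.

```python
# Three common ways to bring module functionality into a script.
import math                      # import the whole module
import statistics as stats       # import a module under an alias
from random import Random        # import a single name from a module

radius = 2.0
area = math.pi * radius ** 2     # names are accessed through the module

scores = [70, 80, 90]
average = stats.mean(scores)     # names are accessed through the alias

rng = Random(42)                 # seeded generator, so runs are repeatable
pick = rng.choice(scores)        # one element of scores

print(f"area={area:.2f}, average={average}, pick={pick}")
```

The same three forms apply to third-party modules once they are installed.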
(B) Review of IDE (Local)
Anaconda
Anaconda is a distribution of the Python and R programming languages for scientific computing that
aims to simplify package management and deployment.
https://www.anaconda.com/distribution/
IPython
“Interactive Python” is an interactive shell to run Python commands.
<ENTER> key executes the code.
Great for prototyping your code
Install through Anaconda
Have to install dependencies and modules separately
Jupyter Notebook
Web-based interactive shell to run Python commands. (IPython is the shell and kernel for Jupyter)
Output is produced when the code is run, and ‘markdown cells’ are used to explain what the code means
Markdown is a popular lightweight markup language that is converted to HTML (inline HTML is also allowed)
Install through Anaconda
Have to import libraries/modules separately
(B) Review of IDE (Virtual)
Google Colab
Check WEEK2’s video that can help you step through Google Colab
This will be the preferred IDE for the course
Web-based interactive shell to run Python commands
Architecture of Jupyter notebooks hosted on Google machines!
https://colab.research.google.com/
No messy installations; required packages already installed/imported
Run this code to check installed packages: !pip list -v
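The same check can be done from Python code instead of a shell command. This is a minimal sketch using only the standard library; the helper name is_installed and the module names in the loop are hypothetical examples, so swap in whatever your notebook needs.

```python
from importlib import metadata, util

def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return util.find_spec(module_name) is not None

# Example module names only (an assumption, not a course requirement).
for name in ["json", "numpy", "pandas", "sklearn"]:
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")

# List a few installed distributions with versions, similar to `pip list`.
installed = sorted(
    (str(dist.metadata["Name"]), str(dist.version))
    for dist in metadata.distributions()
)
for dist_name, version in installed[:5]:
    print(dist_name, version)
```

On Colab most of the names above should report as available; on a fresh local install some may be missing until installed.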
(C) Review of Machine Learning
• “Machine learning (ML) is the study of computer algorithms
that improve automatically through experience and by the
use of data” – Wikipedia
• Ask a question that might be answered with data (as a
hypothesis)
• Extract the data & understand it automatically
• Create decisions or insight, from a computer, that would
typically require a human
(C) Review of Machine Learning
• What question are we trying to answer?
• Write a program that can scan an image and differentiate between a
boy and a girl; without a model, the program would be extremely long
• If someone then asks to detect other objects in the same picture as
well, incorporating that change without a model would take far longer
https://wordstream-files-prod.s3.amazonaws.com/s3fs-public/styles/simple_image/public/images/machine-learning1.png?SnePeroHk5B9yZaLY7peFkULrfW8Gtaf&itok=yjEJbEKD
(C) Review of Machine Learning (Steps)
1. “Import” the dataset
• The dataset can come from modules, a database, Excel, CSV, etc.
2. “Clean” the dataset
• Remove duplicates, noise, and irrelevant or incomplete records; convert values to the proper types
3. “Approach” the dataset with an “Estimator/Data Strategy” (dependent on objective)
• Choose an approach category: Supervised Learning vs Unsupervised Learning
• Apply a data strategy by splitting the data into sets
• Training Set – to train model (determine a split; 80%)
• Test Set – to test the model (determine a split; 20%)
4. “Analyze” the dataset through a “Model/Algorithm”
• Select an estimator’s model/algorithm that can analyze your dataset
• Many algorithms/models exist, e.g., Decision Trees, Neural Nets, etc.
• Libraries/Modules can facilitate algorithm/model selection, e.g., SciKit-Learn
5. “Train” your Model to find “Patterns”
• Model will look for patterns in the dataset
6. “Test” your Model to make “Predictions”
• Ask the Model to answer your question (e.g., differentiating between a car and a motorcycle)
• Predictions are not always accurate
7. “Measure” accuracy of “Predictions” and “Improve”
• Evaluate, measure your accuracy
• Revisit the predictions, choose a different model, or fine-tune the model’s parameters to improve accuracy
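The seven steps above can be sketched end to end in plain Python. Everything specific here is an assumption made for illustration: the data values are made up, the feature names (weight_kg, wheels) are hypothetical, and the nearest-centroid classifier is just one simple model choice. In practice a library such as scikit-learn would typically replace the hand-written split and model; the standard library is used here so nothing needs installing.

```python
import csv
import io
import random

# Made-up illustrative data: one duplicate row and one incomplete row included.
RAW_CSV = """weight_kg,wheels,label
1200,4,car
1500,4,car
1350,4,car
1350,4,car
180,2,motorcycle
220,2,motorcycle
,2,motorcycle
1600,4,car
200,2,motorcycle
250,2,motorcycle
1100,4,car
210,2,motorcycle
"""

# 1. Import the dataset (here from an in-memory CSV string).
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# 2. Clean: drop incomplete rows and exact duplicates, convert text to numbers.
seen, clean = set(), []
for row in rows:
    key = (row["weight_kg"], row["wheels"], row["label"])
    if "" in key or key in seen:
        continue
    seen.add(key)
    clean.append(((float(row["weight_kg"]), float(row["wheels"])), row["label"]))

# 3. Approach: supervised learning with an 80/20 train/test split.
random.Random(0).shuffle(clean)          # fixed seed keeps the split repeatable
cut = int(len(clean) * 0.8)
train, test = clean[:cut], clean[cut:]

# 4-5. Model and training: a nearest-centroid classifier learns the mean
# feature vector (the "pattern") of each label in the training set.
def fit(samples):
    sums, counts = {}, {}
    for features, label in samples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

# 6. Test: predict the label whose centroid is closest to the observation.
def predict(centroids, features):
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

centroids = fit(train)
predictions = [predict(centroids, features) for features, _ in test]

# 7. Measure: accuracy is the fraction of correct predictions on the test set.
correct = sum(pred == label for pred, (_, label) in zip(predictions, test))
accuracy = correct / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

The toy classes are widely separated, so this model scores perfectly on the held-out rows; real datasets rarely behave this well, which is why step 7 loops back to model choice and parameter tuning.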
(C) Review of Machine Learning (Terms)
• Each row in your data is known as an “observation”
• Also known as: sample, example, instance, record
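To make the term concrete, here is a minimal sketch of a table held as a list of rows, where each row is one observation. The column names and values are made up for illustration.

```python
# Made-up table: each dict is one row, i.e. one observation
# (also called a sample, example, instance, or record).
dataset = [
    {"weight_kg": 1200, "wheels": 4, "label": "car"},
    {"weight_kg": 180, "wheels": 2, "label": "motorcycle"},
    {"weight_kg": 1500, "wheels": 4, "label": "car"},
]

n_observations = len(dataset)        # number of rows in the table
first_observation = dataset[0]       # a single row

print(f"{n_observations} observations; first one: {first_observation}")
```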
Any Questions?