AIDI - 1010 - WEEK2 - Google Colab - v1.2

This document provides an overview and objectives for Week 2 of an introduction to emerging technologies course. It covers: (A) A review of Python programming language fundamentals including modules and libraries like NumPy, Pandas, and Scikit-learn. (B) A review of integrated development environments (IDEs) for Python like Anaconda, Jupyter Notebook, and Google Colab, emphasizing Colab for the course. (C) A high-level overview of machine learning concepts including supervised vs. unsupervised vs. reinforcement learning and the typical machine learning process of importing and cleaning data, selecting models/algorithms, training and testing models, and measuring accuracy.

Uploaded by

Shafat Khan
Copyright
© All Rights Reserved
AIDI 1010 – Introduction to Emerging Technologies

WEEK2
Jahanzeb Abbas (JB)
Week Objectives
• (A) Review of Python/Modules
  • Modules
  • BootCamp Video
• (B) Review of Integrated Development Environment (IDE)
  • Local & Virtual
  • Introduction to Google Colaboratory (supported with video)
    • Setup
    • Access
    • Usage
• (C) Review of Machine Learning
  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning
(A) Review of Python/Modules
• Established 1991; widely popular from 2005 onward
• Python is an interpreted language:
  • Executes instructions without a separate compile step, unlike compiled languages (Java, C++, etc.)
  • Interpreter (reads and runs one statement at a time) vs. compiler (translates the entire program before it runs)
  • The “Python interpreter” executes code directly, translating each statement into instructions for your machine; other interpreted languages include Perl, Ruby, and JavaScript
• Python is also referred to as a “glue language”: it connects components written in other languages (Java, C++, etc.)
• Python is critical for data science, machine learning, and general software development in academia and industry
• Its association with scikit-learn and pandas has made Python very popular relative to R, MATLAB, SAS, Stata, etc.
• Typically runs slower than compiled-language code, but debugging is faster
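A minimal sketch of what "interpreted" means in practice, using only the standard library. The function name `greet` and the deliberately undefined name `undefined_variable` are made up for illustration. Python does not reject the faulty line up front; the error only appears if execution actually reaches it.

```python
def greet(name):
    if name:
        return "Hello, " + name
    # This line refers to a name that does not exist. Python only
    # notices when (and if) execution actually reaches this line.
    return undefined_variable


# Runs fine: the faulty line is never reached.
result = greet("Ada")
print(result)

# The error is raised at run time, only on the path that hits the faulty line.
try:
    greet("")
except NameError as err:
    print("Raised only at run time:", err)
```

A compiler for a statically checked language would typically refuse to build a program like this at all, which is one reason debugging feels different in the two worlds.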
(A) Review of Python/Modules
• Python Libraries
  • Collections of pre-written (often pre-compiled) code designed to work with Python (the Standard Library ships with Python itself)
  • Existing code that can be imported into your program (modules, modular components)
  • Packaged into archives that carry metadata and other parameters
• NumPy: Numerical computing BFF for Python (Library)
• Pandas: Data Manipulator BFF for Python (Library)
• Matplotlib: Plots BFF for Python (Library)
• SciPy: Scientific computing BFF for Python (Library)
• Scikit-Learn: Machine Learning toolkit BFF for Python (Library)
• Statsmodels: All things stats BFF for Python (Library)
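A short sketch of importing and using two of the libraries listed above, assuming NumPy and pandas are installed (they are in Colab). The numbers and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: fast numerical arrays
values = np.array([1.0, 2.0, 3.0, 4.0])
mean_value = values.mean()

# pandas: labelled, table-like data built on top of NumPy arrays
df = pd.DataFrame({"x": values, "x_squared": values ** 2})
total = df["x_squared"].sum()

print("mean of x:", mean_value)
print("sum of x squared:", total)
```

The `as np` and `as pd` aliases are conventions, not requirements; most tutorials and documentation use them.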
(A) Review of Python/Modules
• Check out this excellent bootcamp video by Mike Dane (from freeCodeCamp.org) for an introduction to Python syntax.

https://youtu.be/rfscVS0vtbw
(B) Review of IDE (Local)
• Anaconda
  • Anaconda is a distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment.
  • https://www.anaconda.com/distribution/
• IPython
  • “Interactive Python” is an interactive shell for running Python commands.
  • The <ENTER> key executes the code.
  • Great for prototyping your code
  • Install through Anaconda
  • Additional dependencies and modules have to be installed separately
• Jupyter Notebook
  • Web-based interactive environment for running Python commands (IPython provides the shell and kernel for Jupyter)
  • Output is produced when a code cell is run, and “Markdown cells” are used to explain what the code means
  • Markdown is a popular lightweight markup language that converts to HTML (raw HTML can also be embedded in it)
  • Install through Anaconda
  • Libraries/modules still have to be imported separately
(B) Review of IDE (Virtual)
• Google Colab
  • Check WEEK2’s video, which steps you through Google Colab
  • This will be the preferred IDE for the course
  • Web-based interactive environment for running Python commands
  • Jupyter notebook architecture, hosted on Google machines!
  • https://colab.research.google.com/
  • The next slide guides you further
  • No messy installations; most required packages are already installed (you still import them in your notebook)
  • Run this code to check installed packages: !pip list -v
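The `!pip list -v` command above is a shell command run from a notebook cell. As a rough Python-only alternative, the sketch below checks a few specific packages using the standard library's `importlib.metadata`. The helper name `installed_version` and the package list are assumptions chosen for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Packages used in this course; adjust the list as needed.
for name in ["numpy", "pandas", "matplotlib", "scikit-learn"]:
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```

Note that the name used for installation (`scikit-learn`) can differ from the name used in `import` statements (`sklearn`).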
(B) Review of IDE (Virtual)
(C) Review of Machine Learning
• “Machine learning (ML) is the study of computer algorithms
that improve automatically through experience and by the
use of data” – Wikipedia
• Ask a question that might be answered with data (as a
hypothesis)
• Extract the data & understand it automatically
• Create decisions or insight, from a computer, that would
typically require a human
(C) Review of Machine Learning
• What question are we trying to answer?
  • Example: write a program that can scan an image and differentiate between a boy and a girl; without a model, the program would be extremely lengthy
  • If someone then asks to detect other objects in the same picture as well, incorporating the changes without a model would take forever

• How to solve the problem? In simple terms:
  • Build a model that looks at lots of data and uses it to make predictions
  • Generally, the more (good-quality) input data, the more accurate your model

• Applications of Machine Learning
  • Robotics/Process Automation
  • Language Processing
  • Vision Processing
  • Stock Market Trends
(C) Review of Machine Learning (Overall)

https://wordstream-files-prod.s3.amazonaws.com/s3fs-public/styles/simple_image/public/images/machine-learning1.png?SnePeroHk5B9yZaLY7peFkULrfW8Gtaf&itok=yjEJbEKD
(C) Review of Machine Learning (Steps)
1. “Import” the dataset
• Dataset can come from modules, a database, Excel, CSV, etc.

2. “Clean” the dataset
• Remove duplicate, noisy, irrelevant, or incomplete records, or convert values to the proper format

3. “Approach” the dataset with an “Estimator/Data Strategy” (dependent on objective)
• Choose an approach category: Supervised Learning vs. Unsupervised Learning
• Apply a data strategy by splitting the data into sets
  • Training set: used to train the model (determine a split, e.g. 80%)
  • Test set: used to test the model (determine a split, e.g. 20%)

4. “Analyze” the dataset through a “Model/Algorithm”
• Select a model/algorithm for your estimator that can analyze your dataset
• Many algorithms/models exist, e.g. decision trees, neural networks, etc.
• Libraries/modules can facilitate algorithm/model selection, e.g. scikit-learn

5. “Train” your model to find “Patterns”
• The model will look for patterns in the training set

6. “Test” your model to make “Predictions”
• Ask the model to answer your question on data it has not seen (e.g. differentiating between a car and a motorcycle)
• Predictions are not always accurate

7. “Measure” the accuracy of the “Predictions” and “Improve”
• Evaluate and measure your accuracy
• Go back to the predictions, re-choose a model, or fine-tune the model’s parameters to optimize accuracy
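
The seven steps above can be sketched end to end with scikit-learn. This is only an illustration under stated assumptions, not the course's prescribed solution: the built-in iris dataset, the decision tree model, and `random_state=42` are arbitrary choices made here so the example is self-contained and repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Import: a small dataset that ships with scikit-learn (assumed for this sketch)
df = load_iris(as_frame=True).frame  # feature columns plus a "target" column

# 2. Clean: drop duplicate and incomplete rows
df = df.drop_duplicates().dropna()

# 3. Approach: supervised learning, with an 80% / 20% train/test split
X = df.drop(columns="target")  # features
y = df["target"]               # response
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 4. Analyze: choose a model/algorithm (a decision tree here)
model = DecisionTreeClassifier(random_state=42)

# 5. Train: the model looks for patterns in the training set
model.fit(X_train, y_train)

# 6. Test: make predictions on data the model has not seen
predictions = model.predict(X_test)

# 7. Measure: compare predictions with the known answers
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
```

To "improve", you would loop back from step 7: try a different model in step 4, or adjust its parameters, and measure again.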
(C) Review of Machine Learning (Terms)
• Each row in your data is known as an “observation”
  • Aka: sample, example, instance, record
• Each column in your data is known as a “feature”
  • Aka: input, predictor, attribute, regressor, covariate, data, independent variable
  • Feature names are the column headers
• Each value in your data that we will be predicting is known as a “response”
  • Aka: target, outcome, output, label, dependent variable
https://miro.medium.com/max/3000/1*gDF_bo9keqJnKskiEMGHyg.png
• Knowing your “response” variable is key to understanding which model to use: classification, regression, etc.
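
A small sketch of how these terms map onto a pandas DataFrame, assuming pandas is installed. The table contents and column names are made up for illustration.

```python
import pandas as pd

# Made-up data: each row is one observation
data = pd.DataFrame({
    "height_cm": [150, 160, 170, 180],
    "weight_kg": [50, 60, 70, 80],
    "label": ["A", "A", "B", "B"],
})

feature_names = ["height_cm", "weight_kg"]  # column headers of the features
X = data[feature_names]                     # features (inputs)
y = data["label"]                           # response (what we predict)

n_observations, n_features = X.shape
print("observations:", n_observations)
print("features:", n_features)
print("response values:", list(y))
```

Here the response holds categories, so this would be a classification problem; a numeric response would point toward regression.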
(C) Review of Machine Learning (Toolkit)
• You should be aware of the following modules in this course:
• Pandas
• NumPy
• Matplotlib
• Scikit-learn; Keras/TensorFlow
Disclaimer
Due to the nature of the course, various materials have been compiled from different open-source resources with some moderation.
The course designer (slide creator) sincerely acknowledges their hard work and contribution; credit will be given wherever necessary.
Thank You Very Much

Any Questions?
