
CS436 CS5310 EE513 L01 Introduction

The document outlines the course CS436/CS5310/EE513 on Computer Vision Fundamentals, taught by Murtaza Taj at LUMS, covering topics such as feature detection, visual recognition, and geometric transformations. It includes a tentative course outline, reading materials, and evaluation criteria for assignments and projects. The goal of the course is to enable students to make useful decisions about real physical objects and scenes based on sensed images.

Uploaded by

Rao aafaq

CS436/CS5310/EE513

Computer Vision Fundamentals

Murtaza Taj
murtaza.taj@lums.edu.pk

Lecture 1: Introduction
Mon, 04th Sep 2023
Introduction
! Murtaza Taj (PhD)
! PhD from Queen Mary University of London in Dec 2009
! Joined LUMS in Jan 2011
! Faculty Director Technology for People Initiative
! Director Computer Vision & Graphics Lab
! Research: 2D & 3D Scene Understanding, Image Processing,
Computer Graphics, Machine Learning

tpi.lums.edu.pk cvlab.lums.edu.pk
Computer Vision & Graphics Lab (CVG Lab)
https://cvlab.lums.edu.pk

! Remote Sensing
! Image matching
! Retrieval, classification, object detection

! Digital Cultural Heritage


! 2D/3D Scene Understanding
! 3D Object Retrieval, Point cloud segmentation

! Medical Imaging
! Generative Adversarial Networks
! Image classification

! Others
! Self-driving cars (End-to-end, How well am I driving)
! Visual Question Generation (NLP), Pose Estimation
Murtaza Taj
Co-founder Groopic Inc. CA
What is the core business of Google & Facebook?
Murtaza Taj
Co-founder Ingrain Media Inc. CA
Course Outline
Reading
! Text
! Computer Vision: Algorithms and Applications:
This is the draft of a textbook recently written by Richard Szeliski
Available in PDF form at http://szeliski.org/Book/

! Introductory Techniques for 3D Computer Vision:
by Emanuel Trucco and Alessandro Verri,
is very useful, especially for topics related to geometry

! Journals
! IEEE Transactions on Pattern Analysis and Machine Intelligence
! ACM Transactions on Graphics (TOG)

! Conferences
! IEEE CVPR, ICCV, ECCV
! SIGGRAPH
Tentative Course Outline
Topic                                                  Lectures  Reading
Introduction                                           1         Szeliski Ch 1
• Course Introduction, policies, etc.
• Overview of Computer Vision
• Why are computer vision problems hard?
• Examples of successful computer vision applications
• Overview of course topics
Feature Detection                                      2         Szeliski Ch 4,
• Edge Detection/Convolution                                     Trucco Ch 4-5
• Filter Banks
Visual Recognition                                     4
• Deep Learning & CNN (Tutorial)
• Object Classification (ImageNet, LeNet, etc.)
• Object Localization
Slide annotation: Overlap with CS437 DL? Around 4-5 lectures.
Tentative Course Outline
Topic                                        Lectures  Reading
Geometric Transformations and Camera Models  10-18     Szeliski Ch 2
• 2D transformations
• Estimating 2D Transformation
• 3D transformations
• Camera Models
• Camera Calibration
Dense Motion Estimation and Image Stitching  19-21     Szeliski Ch 8-9
• Optical Flow
• Pyramids
• Parametric Methods for Image Alignment
Structure from Motion                        22-23     Szeliski Ch 7
• Rigid SFM (Factorization Method)
Stereo                                       24-26     Trucco Ch 7-8
• Basic Formulation
• Epipolar Constraint
• Estimation of Fundamental Matrix
Slide annotation on the outline above: Overlap with CS452 CG? Around 4-5 lectures.
Course Introduction
! Programming Environment
! Python (OpenCV, TensorFlow, Keras, PyTorch)

! Assignments
! Written and Programming assignments (approx. 2+3)
! Associated report (discussion on results)

! Project
! Project is a simple extension of assignments with some room
for innovation
! Project evaluation meetings at regular intervals
! In a group of 2

Instrument    Weight
Assignments   30%
Quizzes       5%
Project       20%
Mid-term      20%
Exam          25%
Any Questions?
Introduction
Slide Credits
! CS 436 - LUMS, Dr. Sohaib Khan
! CS131 - Stanford, Fei-Fei Li
! CS231 - Stanford, A. Karpathy
! UC Berkeley, Jitendra Malik
! and many more
Introduction
! Sight is our primary sensation
! 80% of our first 12 years of learning is
through vision
! 30% of neurons in the brain’s cortex are
dedicated to vision, compared to 8% for
touch and 2% for hearing

! Human Experience
What is the goal of Computer Vision?

“The goal of Computer Vision is to make useful decisions about
real physical objects and scenes based on sensed images”

[Diagram: Image Processing, Computer Graphics, Computer Vision; 2D & 3D Scene Understanding]

! Image Processing: Image IN, Image OUT
! Computer Graphics: Symbolic Info IN, Image OUT
! Computer Vision: Image IN, Symbolic Decision OUT
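The input/output distinction between the three fields can be sketched with toy functions. This is only an illustration under stated assumptions: the function names (`render_rectangle`, `invert`, `bright_bounding_box`) are hypothetical, and the "vision" step is a deliberately trivial threshold, not a real recognition method.

```python
import numpy as np

def render_rectangle(height, width, top, left, bottom, right):
    """Computer graphics (toy): symbolic description in, image out."""
    image = np.zeros((height, width), dtype=np.uint8)
    image[top:bottom, left:right] = 255  # paint the rectangle white
    return image

def invert(image):
    """Image processing (toy): image in, image out."""
    return 255 - image

def bright_bounding_box(image, threshold=128):
    """Computer vision (toy): image in, symbolic description out.

    Returns (top, left, bottom, right) of the bright region, or None.
    """
    rows, cols = np.nonzero(image >= threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()) + 1, int(cols.max()) + 1

scene = render_rectangle(8, 10, top=2, left=3, bottom=5, right=7)
negative = invert(scene)          # still an image
box = bright_bounding_box(scene)  # a symbolic description of the image
```

Here `bright_bounding_box` recovers the same rectangle parameters that `render_rectangle` was given, which is the sense in which vision runs in the opposite direction to graphics.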
What is Computer Vision?

Slide acknowledgement: Prof. Fei-Fei Li’s CS131 class at Stanford


Interpretation
! Scene understanding

! Multi-view Geometry
Scene Understanding
What we would like to infer…

Will Person B put some money into Person C’s tip bag?
What kind of information can we extract from an image?
! 3D Information
! Semantic Information
Safe City Project
Traffic E-Challan
Multi-view Geometry
Camera Projection
! 3D to 2D projection

[Figure: 3D real object, camera with optical centre, laws of optics, 2D photograph]
Slide credit: Kenton Anderson
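As a rough illustration of 3D-to-2D projection, the sketch below uses the ideal pinhole model, x = f X / Z and y = f Y / Z, for points expressed in the camera frame. The function name `project_points` and its parameters are assumptions made for this example; lens distortion and the camera's pose in the world are ignored.

```python
import numpy as np

def project_points(points_3d, focal_length, principal_point=(0.0, 0.0)):
    """Project Nx3 camera-frame points to Nx2 image coordinates (ideal pinhole)."""
    points_3d = np.asarray(points_3d, dtype=float)
    X, Y, Z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    if np.any(Z <= 0):
        raise ValueError("points must lie in front of the camera (Z > 0)")
    cx, cy = principal_point
    x = focal_length * X / Z + cx  # perspective division by depth
    y = focal_length * Y / Z + cy
    return np.stack([x, y], axis=1)

# Two points on the same ray through the optical centre.
points = np.array([[1.0, 2.0, 4.0],
                   [2.0, 4.0, 8.0]])
pixels = project_points(points, focal_length=2.0)
```

Both points land on the same image location, which shows why depth is lost in a single photograph.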
Geometric Transformations
! 2D-to-2D (image-to-image)
! 3D-to-3D (world-to-world)
! 3D-to-2D (camera model)
! 2D-to-3D (3D reconstruction)
! Shape from Stereo
! Structure from Motion
! Single View Reconstruction
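A 2D-to-2D transformation can be sketched with homogeneous coordinates, where rotation and translation combine into a single 3x3 matrix. The helper names `rigid_transform_2d` and `apply_transform` are hypothetical, and a rigid transform is only one member of the 2D family (others include similarity, affine and projective).

```python
import numpy as np

def rigid_transform_2d(angle_rad, tx, ty):
    """3x3 homogeneous matrix: rotate by angle_rad, then translate by (tx, ty)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

def apply_transform(T, points_2d):
    """Apply a 3x3 homogeneous transform to Nx2 points."""
    points_2d = np.asarray(points_2d, dtype=float)
    ones = np.ones((points_2d.shape[0], 1))
    homogeneous = np.hstack([points_2d, ones])  # (x, y) -> (x, y, 1)
    mapped = homogeneous @ T.T
    return mapped[:, :2] / mapped[:, 2:3]       # back to Cartesian coordinates

T = rigid_transform_2d(np.pi / 2, tx=1.0, ty=0.0)
moved = apply_transform(T, [[1.0, 0.0], [0.0, 1.0]])
```

The final division by the third coordinate is a no-op for rigid transforms but keeps `apply_transform` valid for projective (homography) matrices as well.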
Image plane-to-Image plane
Pakistan Super League
Vision for Robotics, Space exploration

! Vision system used for:
! Panorama stitching
! 3D terrain modelling
! Obstacle detection, position tracking
! …
Shape from Stereo

Source: http://www-robotics.jpl.nasa.gov
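For a rough idea of how stereo yields shape, the sketch below assumes two identical, rectified cameras separated by a horizontal baseline B. A point that shifts by disparity d pixels between the two views then lies at depth Z = f B / d. The function name and the numeric values are assumptions for illustration; real systems must first find the matching points.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    disparity_px = np.asarray(disparity_px, dtype=float)
    depth = np.full(disparity_px.shape, np.inf)  # zero disparity: point at infinity
    valid = disparity_px > 0
    depth[valid] = focal_length_px * baseline_m / disparity_px[valid]
    return depth

# Hypothetical rig: 500 px focal length, 10 cm baseline.
depths = depth_from_disparity([50.0, 10.0, 0.0],
                              focal_length_px=500.0, baseline_m=0.1)
```

Larger disparities correspond to nearer points, so depth resolution degrades for distant parts of the scene.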
Multi-camera Surveillance System
Computer Vision
[Topic map: Convolution, Stereo, Transformations, Feature extraction, Image Recognition, Camera Model, Multi-View Geometry, Scene Understanding, Face Recognition, Machine Learning, 3D Reconstruction, Mobile Apps, Photogrammetry, Structure from Motion, Shop Analytics, Surveillance, Object Detection, Animated Movies, Innovation, Startups]
Key Questions
! How is a 3D world projected onto a 2D
image by a camera?

! How are multiple images of the same
world related to each other?

! How can we reconstruct the 3D world
from images?

! What objects are present in the scene
and where?
Next …
! Edge Detection
Additional Slides
The Complexity of Perception
Why is computer vision hard?
! Computers are good at numerical processing

! Humans are good at perceptual processing

! We want to use a computer to mimic human perception…
which is complex to understand
Perception

Ref: Light and Vision: LIFE Science Library


Perception
What is this?
Recognition Helps Reorganization
Any Questions?
Writing Programs that “See”

An Example
Representing a Digital Image
! It is natural to represent an image as a matrix
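A minimal sketch of the matrix representation, assuming NumPy and 8-bit grayscale intensities (0 = black, 255 = white). The pixel values are made up for illustration.

```python
import numpy as np

# A tiny 4x5 grayscale "image": rows index y (top to bottom), columns index x.
image = np.array(
    [
        [0,   0,   0,   0, 0],
        [0, 255, 255, 255, 0],
        [0, 255, 128, 255, 0],
        [0,   0,   0,   0, 0],
    ],
    dtype=np.uint8,
)

height, width = image.shape  # (rows, columns)
centre_pixel = image[2, 2]   # indexed as image[row, column], i.e. image[y, x]

# A colour image adds a third axis for the channels (here 3 identical channels).
colour = np.stack([image, image, image], axis=-1)
```

Note the index order: the row (y) comes first, which is the opposite of the usual (x, y) convention for point coordinates.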
The goal of Computer Vision - Image Understanding

Slide acknowledgement: Prof. Fei-Fei Li’s CS131 class at Stanford


What kind of information can we extract from an image?
