
CS5670: Intro to Computer Vision

(Cornell Tech)
Depth from a single image
Visualizing scenes from tourist photos
Reconstructing dynamic 3D scenes

DynIBaR: Neural Dynamic Image-Based Rendering [https://dynibar.github.io/]
Zhengqi Li, Qianqian Wang, Forrester Cole, Richard Tucker, Noah Snavely
CVPR 2023
Today
1. What is computer vision?

2. Why study computer vision?

3. Course overview

4. Images & image filtering [time permitting]


Today
• Readings
– Szeliski, Chapter 1 (Introduction)
Every image tells a story
• Goal of computer vision: perceive the “story” behind the picture
• Compute properties of the world
– 3D shape
– Names of people or objects
– What happened?
The goal of computer vision
Can computers match human perception?
• Yes and no (mainly no)
– computers can be better at “easy” things
– humans are better at “hard” things

• But huge progress
– Accelerating in the last five years due to deep learning
– What is considered “hard” keeps changing
Human perception has its shortcomings

https://twitter.com/pickover/status/1460275132958662657/
But humans can tell a lot about a scene from a little information…

Source: “80 million tiny images” by Torralba, et al.


The goal of computer vision
• Compute the 3D shape of the world

ZED 2i Camera
The goal of computer vision
• Recognize objects and people

Terminator 2, 1991
slide credit: Fei-Fei, Fergus & Torralba
Labeled scene: sky, building, flag, face, banner, wall, street lamp, bus, bus, cars
slide credit: Fei-Fei, Fergus & Torralba


The goal of computer vision
• “Enhance” images
The goal of computer vision
• Forensics

Source: Nayar and Nishino, “Eyes for Relighting”


The goal of computer vision
• Improve photos (“Computational Photography”)

Super-resolution (source: 2d3)
Depth of field on cell phone camera (source: Google Research Blog)
Removing objects (Google Magic Eraser)
Low-light photography (credit: Hasinoff et al., SIGGRAPH ASIA 2016)
April 10, 2019
Why study computer vision?
• Billions of images/videos captured per day

• Huge number of potential applications


• The next slides show the current state of
Optical character recognition (OCR)
• If you have a scanner, it probably came with OCR software

Digit recognition, AT&T labs (1990’s)
http://yann.lecun.com/exdb/lenet/

License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Sudoku grabber
http://sudokugrab.blogspot.com/

Automatic check processing


Face detection

• Nearly all cameras detect faces in real time
– (Why?)
Face analysis and recognition
Vision-based biometrics

Who is she? Source: S. Seitz


Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” Read the story

Source: S. Seitz
Login without a password

Fingerprint scanners on many new smartphones and other devices
Face unlock on Apple iPhone X
See also http://www.sensiblevision.com/
New York Times, Jan. 18, 2020
by Kashmir Hill
Bird identification

Merlin Bird ID (based on Cornell Tech technology!)


Special effects: shape capture

The Matrix movies, ESC Entertainment, XYZRGB, NRC


Source: S. Seitz
Special effects: motion capture

Pirates of the Caribbean, Industrial Light and Magic
Source: S. Seitz


3D face tracking w/ consumer cameras

Snapchat Lenses

Face2Face system (Thies et


Image synthesis

Karras, et al., Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR
Which face is real?

https://www.whichfaceisreal.com/
Image synthesis

“An astronaut riding a horse in a photorealistic style” – DALL-E 2
“A photo of a Corgi dog riding a bike in Times Square. It is wearing sunglasses and a beach hat” – Imagen
Sports

Sportvision first down line


Explanation on www.howstuffworks.com

Source: S. Seitz
Smart cars

• Mobileye
• Tesla Autopilot
• Safety features in many cars
Self-driving cars

Waymo
Robotics

NASA’s Mars Curiosity Rover
https://en.wikipedia.org/wiki/Curiosity_(rover)

Amazon Picking Challenge
http://www.robocup2016.org/en/events/amazon-picking-challenge/

Amazon Prime Air
Amazon Scout


Medical imaging

3D imaging (MRI, CT)
Skin cancer classification with deep learning
https://cs.stanford.edu/people/esteva/nature/
Virtual & Augmented Reality

6DoF head tracking
Hand & body tracking
3D scene understanding
3D-360 video capture


Current state of the art
• You just saw many examples of current systems.
– Many of these are less than 5 years old

• Computer vision is an active research area, and rapidly changing
– Many new apps in the next 5 years
– Deep learning and generative methods powering many modern applications

• Many startups across a dizzying array of areas
– Generative AI, robotics, autonomous vehicles, medical imaging, construction, inspection, VR/AR, …
Why is computer vision difficult?

Viewpoint variation

Credit: Flickr user michaelpaul

Scale
Illumination
Why is computer vision difficult?

Motion (Source: S. Lazebnik)


Intra-class variation

Background clutter
Occlusion


Challenges: local ambiguity

slide credit: Fei-Fei, Fergus & Torralba


But there are lots of visual cues we can use…

Source: S. Lazebnik
Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a given 2D image
– We often must use prior knowledge about the world’s structure

Artist Julian Beever with his anamorphic Coke bottle
Image source: F. Durand
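
The claim that many different 3D scenes could have given rise to the same 2D image can be checked with a few lines of Python. This is a minimal sketch that assumes an ideal pinhole camera; the `project` helper, the focal length, and the two example squares are illustrative choices, not material from the slides.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the image plane.

    Assumes an ideal pinhole camera looking down the +Z axis (hypothetical helper).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# A square 2 units from the camera ...
near_square = [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (1.0, 1.0, 2.0), (-1.0, 1.0, 2.0)]
# ... and a square three times larger, three times farther away.
far_square = [(3 * x, 3 * y, 3 * z) for (x, y, z) in near_square]

near_image = [project(p) for p in near_square]
far_image = [project(p) for p in far_square]

# Both scenes land on exactly the same image coordinates.
print(near_image == far_image)
```

Because projection divides by depth, scaling a scene and its distance by the same factor leaves the image unchanged, so size and distance cannot be separated from a single view without prior knowledge.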
CS5670: Introduction to Computer Vision

• Project-based course whose goal is to teach you the basics of computer vision – image processing, geometry, recognition – in a hands-on way
Course requirements
• Prerequisites
– Data structures
– Good working knowledge of Python programming
– Linear algebra
– Vector calculus

• Course does not assume prior imaging experience
– computer vision, image processing, graphics, etc.
Course overview
(tentative)
1. Low-level vision
– image processing, edge detection, feature detection, cameras, image formation

2. Geometry & appearance
– projective geometry, stereo, structure from motion, optimization, lighting & materials

3. Recognition & generative models
– object classification, deep learning,
1. Low-level vision
• Basic image processing and image formation

Filtering, edge detection
Feature extraction
Image formation


Project: Hybrid images
Project: Feature detection and matching
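
As a rough illustration of the filtering and edge detection topics listed above, the sketch below slides a small kernel over a grayscale image in pure Python. The `filter2d` helper, the zero padding at the borders, and the box-blur and Sobel kernels are standard textbook choices assumed here for illustration; they are not taken from the slides or from the course projects.

```python
def filter2d(image, kernel):
    """Cross-correlate a grayscale image (list of rows) with an odd-sized kernel.

    Pixels outside the image are treated as zero. For a symmetric kernel such as
    the box blur this equals convolution; true convolution would flip the kernel.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]          # averages a 3x3 neighborhood
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]      # responds to horizontal intensity changes

# Toy image with a vertical step edge: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

blurred = filter2d(image, BOX_BLUR)
edges = filter2d(image, SOBEL_X)

print(blurred[2])
print(edges[2])
```

In the middle row, the blur softens the step into a ramp, while the Sobel response is large only at the two columns that straddle the edge. The response in the last column is a side effect of the zero padding rather than a real edge.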
2. Geometry & appearance

Image credit: IDS Imaging

Projective geometry
Stereo vision
Multi-view stereo
Structure from motion


Project: Creating panoramas
Project: 3D reconstruction
3. Recognition, Deep Learning & Generative Models

“dog”

Image classification
Convolutional Neural Networks

“a class watching a computer vision lecture at Cornell Tech”

Image generation
Project: Neural Radiance Fields (NeRFs)
Questions?
