Ch1 TDMA Image Processing

The document outlines the fundamentals and applications of computer vision, including topics such as interpreting pixel intensities, correspondence and alignment, grouping and segmentation, and object recognition. It discusses various real-world applications of computer vision in areas like smartphones, medical imaging, and transportation. Additionally, it covers advanced topics like action recognition and the use of image filters for extracting meaningful information from images.

Uploaded by haythem.elhadj

28/10/2021

WIDED SOUIDENE MSEDDI

COMPUTER VISION

OUTLINE
• Introduction
• Interpreting Intensities
– What determines the brightness and color of a pixel?
– How can we use image filters to extract meaningful information from the image?

• Correspondence and Alignment
– How can we find corresponding points in objects or scenes?
– How can we estimate the transformation between them?
• Grouping and Segmentation
– How can we group pixels into meaningful regions?
• Categorization and Object Recognition
– How can we represent images and categorize them?
– How can we recognize categories of objects?
• Advanced Topics
– Action recognition, 3D scenes and context, human-in-the-loop vision…


INSTRUCTOR: WIDED SOUIDENE MSEDDI

COMPUTER VISION AND YOU….

Have you ever used computer vision?


How? Where?
Reconstruction? Recognition? (Re)organization?


Laptop: Biometrics auto-login (face recognition, 3D), OCR

Smartphones: QR codes, computational photography (Android Lens Blur, iPhone Portrait Mode), panorama
construction (Google Photo Spheres), face detection, expression detection (smile), Snapchat filters (face tracking),
Google Tango (3D reconstruction), Night Sight (Pixel)

Web: Image search, Google photos (face recognition, object recognition, scene recognition, geolocalization from
vision), Facebook (image captioning), Google maps aerial imaging (image stitching), etc.

VR/AR: Outside-in tracking (HTC VIVE), inside out tracking (simultaneous localization and mapping, HoloLens),
object occlusion (dense depth estimation)

Motion: Kinect, full body tracking of skeleton, gesture recognition, etc.

MOREOVER…
Medical imaging: CAT / MRI reconstruction, assisted diagnosis, endoscopic surgery
Industry: Vision-based robotics (marker-based), machine-assisted router (jig),
automated post, surveillance, drones, etc.
Transportation: Assisted driving (everything), face tracking/iris dilation for
drunkenness/drowsiness, etc.

Media: Visual effects for film, TV (reconstruction), virtual sports replay (reconstruction), semantics-based auto edits (reconstruction, recognition)


OPTICAL CHARACTER RECOGNITION (OCR)
Technology to convert images of text into text
If you have a scanner, it probably came with OCR software

Live camera translation

Mail digit recognition, AT&T labs


http://www.research.att.com/~yann/

License plate readers


http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

FACE DETECTION

• Almost all digital cameras detect faces


• Snapchat face filters


SMILE DETECTION/EMOTION DETECTION

Sony Cyber-shot® T70 Digital Still Camera

VISION-BASED BIOMETRICS

“How the Afghan Girl was Identified by Her Iris Patterns”


Read the story (Wikipedia)


FACIAL LOGIN WITHOUT A PASSWORD…


OBJECT RECOGNITION (IN MOBILE PHONES)
e.g., Google Lens


3D FROM IMAGES

Building Rome in a Day: Agarwal et al. 2009



SPECIAL EFFECTS: SHAPE CAPTURE

Star Wars: Rogue One – Peter Cushing / Admiral Tarkin


INTERACTIVE GAMES
Object Recognition:
http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY


MEDICAL IMAGING

Image guided surgery


3D imaging
Grimson et al., MIT
MRI, CT


AUTOCARS - UBER BOUGHT CMU’S LAB


MOBILE ROBOTS
http://www.robocup.org/

Saxena et al. 2008


STAIR at Stanford

Skydio 2 drone
6x fisheye cameras for
obstacle avoidance
Onboard NVIDIA GPU


WHAT IS EACH PART OF AN IMAGE?
• Pixel -> picture element

[Figure: image with a y axis; the marked pixel I(x,y) has the value ‘138’]


IMAGE AS A 2D SAMPLING OF SIGNAL

• Signal: function depending on some variable with physical meaning.

• Image: sampling of that function.


• 2 variables: xy coordinates
• 3 variables: xy + time (video)
• ‘Brightness’ is the value of the function for visible light

• Can be other physical values too: temperature, pressure, depth …


EXAMPLE 2D IMAGES


SAMPLING IN 1D

Sampling in 1D takes a function and returns a vector whose elements are values of that function at the sample points.
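
A minimal sketch of 1D sampling in NumPy. The sine wave used as the underlying signal and the choice of ten evenly spaced sample points are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def signal(t):
    """Hypothetical continuous signal: one period of a sine wave on [0, 1)."""
    return np.sin(2 * np.pi * t)

# Sample points: 10 evenly spaced positions in [0, 1).
sample_points = np.linspace(0.0, 1.0, num=10, endpoint=False)

# Sampling: evaluate the function at each sample point, giving a vector.
samples = signal(sample_points)
print(samples.shape)  # (10,)
```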


SAMPLING IN 2D

Sampling in 2D takes a function and returns a matrix.
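
The 2D case can be sketched the same way. The function `brightness` and the 4 x 5 sampling grid are hypothetical choices made only for illustration.

```python
import numpy as np

def brightness(x, y):
    """Hypothetical continuous 2D signal built from a sine and a cosine."""
    return 0.5 + 0.5 * np.sin(x) * np.cos(y)

xs = np.linspace(0.0, np.pi, num=5)   # 5 sample columns
ys = np.linspace(0.0, np.pi, num=4)   # 4 sample rows
X, Y = np.meshgrid(xs, ys)            # X and Y both have shape (4, 5)

# Sampling: evaluate the function on the grid, giving a matrix.
image = brightness(X, Y)
print(image.shape)  # (4, 5)
```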


GRAYSCALE DIGITAL IMAGE

[Figure: brightness or intensity plotted over x and y]


WHAT IS EACH PART OF A PHOTOGRAPH?
• Pixel -> picture element

[Figure: photograph with a y axis; the marked pixel I(x,y) has the value ‘127’]


INTEGRATING LIGHT OVER A RANGE OF ANGLES

[Figure: camera sensor and output image]


RESOLUTION – GEOMETRIC VS. SPATIAL RESOLUTION

Both images are ~500x500 pixels


QUANTIZATION

[Figure: underlying signal and quantized values]


QUANTIZATION EFFECTS –
RADIOMETRIC RESOLUTION

8 bit – 256 levels; 4 bit – 16 levels; 2 bit – 4 levels; 1 bit – 2 levels
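
The relation between bit depth and number of levels (2 to the power of the bit count) can be checked with a short sketch. The helper `quantize` and the intensity ramp are hypothetical, and intensities are assumed to lie in [0, 1].

```python
import numpy as np

def quantize(image, bits):
    """Quantize intensities in [0, 1] to 2**bits evenly spaced levels."""
    levels = 2 ** bits
    # Map [0, 1] to integer level indices 0 .. levels-1, then back to [0, 1].
    indices = np.clip(np.floor(image * levels), 0, levels - 1)
    return indices / (levels - 1)

ramp = np.linspace(0.0, 1.0, num=256)   # smooth ramp from black to white
for bits in (8, 4, 2, 1):
    n_levels = np.unique(quantize(ramp, bits)).size
    print(bits, "bit:", n_levels, "levels")
```

With fewer bits, neighboring intensities collapse onto the same level, which is what produces the banding visible at low radiometric resolution.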


IMAGES IN PYTHON NUMPY


N x M grayscale image “im”
– im[0,0] = top-left pixel value
– im[y, x] = y pixels down, x pixels to right
– im[N-1, M-1] = bottom-right pixel

Rows run top to bottom (index y), columns left to right (index x):
0.92 0.93 0.94 0.97 0.62 0.37 0.85 0.97 0.93 0.92 0.99
0.95 0.89 0.82 0.89 0.56 0.31 0.75 0.92 0.81 0.95 0.91
0.89 0.72 0.51 0.55 0.51 0.42 0.57 0.41 0.49 0.91 0.92
0.96 0.95 0.88 0.94 0.56 0.46 0.91 0.87 0.90 0.97 0.95
0.71 0.81 0.81 0.87 0.57 0.37 0.80 0.88 0.89 0.79 0.85
0.49 0.62 0.60 0.58 0.50 0.60 0.58 0.50 0.61 0.45 0.33
0.86 0.84 0.74 0.58 0.51 0.39 0.73 0.92 0.91 0.49 0.74
0.96 0.67 0.54 0.85 0.48 0.37 0.88 0.90 0.94 0.82 0.93
0.69 0.49 0.56 0.66 0.43 0.42 0.77 0.73 0.71 0.90 0.99
0.79 0.73 0.90 0.67 0.33 0.61 0.69 0.79 0.73 0.93 0.97
0.91 0.94 0.89 0.49 0.41 0.78 0.78 0.77 0.89 0.99 0.93
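
A small sketch of the indexing convention, using only the first three rows of the matrix above to keep the example short. The variable names follow the slide (`im`, `N`, `M`).

```python
import numpy as np

# First three rows of the example matrix, stored as an N x M array.
im = np.array([
    [0.92, 0.93, 0.94, 0.97, 0.62, 0.37, 0.85, 0.97, 0.93, 0.92, 0.99],
    [0.95, 0.89, 0.82, 0.89, 0.56, 0.31, 0.75, 0.92, 0.81, 0.95, 0.91],
    [0.89, 0.72, 0.51, 0.55, 0.51, 0.42, 0.57, 0.41, 0.49, 0.91, 0.92],
])
N, M = im.shape            # N rows, M columns

print(im[0, 0])            # top-left pixel: 0.92
print(im[1, 4])            # y = 1 pixel down, x = 4 pixels to the right: 0.56
print(im[N - 1, M - 1])    # bottom-right pixel of this excerpt: 0.92
```

Note that the row index (y) comes first and the column index (x) second, the reverse of the usual (x, y) order.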


GRAYSCALE INTENSITY

0.92 0.93 0.94 0.97 0.62 0.37 0.85 0.97 0.93 0.92 0.99
0.95 0.89 0.82 0.89 0.56 0.31 0.75 0.92 0.81 0.95 0.91
0.89 0.72 0.51 0.55 0.51 0.42 0.57 0.41 0.49 0.91 0.92
0.96 0.95 0.88 0.94 0.56 0.46 0.91 0.87 0.90 0.97 0.95
0.71 0.81 0.81 0.87 0.57 0.37 0.80 0.88 0.89 0.79 0.85
0.49 0.62 0.60 0.58 0.50 0.60 0.58 0.50 0.61 0.45 0.33
0.86 0.84 0.74 0.58 0.51 0.39 0.73 0.92 0.91 0.49 0.74
0.96 0.67 0.54 0.85 0.48 0.37 0.88 0.90 0.94 0.82 0.93
0.69 0.49 0.56 0.66 0.43 0.42 0.77 0.73 0.71 0.90 0.99
0.79 0.73 0.90 0.67 0.33 0.61 0.69 0.79 0.73 0.93 0.97
0.91 0.94 0.89 0.49 0.41 0.78 0.78 0.77 0.89 0.99 0.93


IMAGE FILTERING


THREE VIEWS OF FILTERING


• Image filters in spatial domain
– Filter is a mathematical operation on a grid of numbers
– Smoothing, sharpening, measuring texture

• Image filters in the frequency domain


– Filtering is a way to modify the frequencies of images
– Denoising, sampling, image compression

• Image pyramids
– Scale-space representation allows coarse-to-fine operations


IMAGE FILTERING
Compute function of local neighborhood at each position:

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$


IMAGE FILTERING

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$

h = output, f = filter, I = image; k, l are the 2D coordinates in the filter and m, n are the 2D coordinates in the image.
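
The formula can be transcribed almost literally into NumPy. This is a sketch under two assumptions that the slide does not state: the filter has odd size with offsets k, l centred on zero, and pixels outside the image count as zero. The function name `filter_image` is hypothetical.

```python
import numpy as np

def filter_image(I, f):
    """Compute h[m, n] = sum over k, l of f[k, l] * I[m + k, n + l].

    k and l run over offsets centred on zero (for a 3x3 filter: -1, 0, 1).
    Pixels outside the image are treated as zero (an assumption).
    """
    rk, rl = f.shape[0] // 2, f.shape[1] // 2   # filter half-sizes
    padded = np.pad(I.astype(float), ((rk, rk), (rl, rl)))
    h = np.zeros(I.shape, dtype=float)
    for m in range(I.shape[0]):
        for n in range(I.shape[1]):
            for k in range(-rk, rk + 1):
                for l in range(-rl, rl + 1):
                    # f is stored with array index 0 at offset -rk / -rl.
                    h[m, n] += f[k + rk, l + rl] * padded[m + k + rk, n + l + rl]
    return h

I = np.arange(12, dtype=float).reshape(3, 4)
f = np.zeros((3, 3))
f[1, 1] = 1.0                                   # only the centre tap is non-zero
print(np.array_equal(filter_image(I, f), I))    # True
```

The four nested loops are slow but mirror the sum term by term; in practice a vectorized or library routine would be used instead.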


Example: box filter

f[·,·] = (1/9) ×
1 1 1
1 1 1
1 1 1

Slide credit: David Lowe (UBC)


Image filtering

f[·,·] = (1/9) ×
1 1 1
1 1 1
1 1 1

I[·,·]
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

h[·,·] so far (first output row): 0

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$, with $m = 1$, $n = 1$ and $k, l \in \{-1, 0, 1\}$

Credit: S. Seitz


Image filtering

Same f[·,·] and I[·,·] as in the previous step.

h[·,·] so far (first output row): 0 10

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$, with $m = 2$, $n = 1$ and $k, l \in \{-1, 0, 1\}$

Credit: S. Seitz


Image filtering

Same f[·,·] and I[·,·] as in the previous step.

h[·,·] so far (first output row): 0 10 20

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$, with $m = 3$, $n = 1$ and $k, l \in \{-1, 0, 1\}$

Credit: S. Seitz


Image filtering

Same f[·,·] and I[·,·] as in the previous step.

h[·,·] so far (first output row): 0 10 20 30

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$, with $m = 4$, $n = 1$ and $k, l \in \{-1, 0, 1\}$

Credit: S. Seitz


Image filtering

Same f[·,·] and I[·,·] as in the previous step.

h[·,·] so far (first output row): 0 10 20 30 30

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$, with $m = 5$, $n = 1$ and $k, l \in \{-1, 0, 1\}$

Credit: S. Seitz


Image filtering

Same f[·,·] and I[·,·] as in the previous step.

h[·,·] so far (first output row): 0 10 20 30 30
Value at $m = 4$, $n = 6$: ?

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$, with $m = 4$, $n = 6$ and $k, l \in \{-1, 0, 1\}$

Credit: S. Seitz


Image filtering

Same f[·,·] and I[·,·] as in the previous step.

h[·,·] so far (first output row): 0 10 20 30 30
Value at $m = 4$, $n = 6$: 50
Value at $m = 6$, $n = 4$: ?

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$, with $m = 6$, $n = 4$ and $k, l \in \{-1, 0, 1\}$

Credit: S. Seitz


Image filtering

f[·,·] = (1/9) ×
1 1 1
1 1 1
1 1 1

I[·,·]
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

h[·,·]
0 10 20 30 30 30 20 10
0 20 40 60 60 60 40 20
0 30 60 90 90 90 60 30
0 30 50 80 80 90 60 30
0 30 50 80 80 90 60 30
0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10 0 0 0 0 0

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$

Credit: S. Seitz
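
The worked example can be reproduced in a few lines of NumPy. The sketch assumes the box filter carries a 1/9 normalization (consistent with the outputs 10, 20, 30 for inputs of 90) and evaluates only the positions where the 3x3 window fits inside the image. In NumPy the first index is the row, so the slide's n corresponds to the row index plus one and m to the column index plus one; that mapping is inferred from the order in which the slide fills in the outputs.

```python
import numpy as np

# Input image from the slides.
I = np.zeros((10, 10))
I[2:7, 3:8] = 90      # bright square
I[5, 4] = 0           # hole inside the square
I[8, 2] = 90          # isolated bright pixel

f = np.ones((3, 3)) / 9.0   # box filter, assumed normalized to sum to one

# Evaluate only where the 3x3 window fits inside the image (8 x 8 outputs).
h = np.zeros((8, 8))
for r in range(8):
    for c in range(8):
        h[r, c] = np.sum(f * I[r:r + 3, c:c + 3])

print(np.round(h).astype(int))
```

The printed array matches the h[·,·] values on the slide, including the 50 at m = 4, n = 6 and the 90 at m = 6, n = 4.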


Box Filter

f[·,·] = (1/9) ×
1 1 1
1 1 1
1 1 1

What does it do?
• Replaces each pixel with an average of its neighborhood
• Achieves smoothing effect (removes sharp features)

Slide credit: David Lowe (UBC)


Box Filter

f[·,·] = (1/9) ×
1 1 1
1 1 1
1 1 1

What does it do?
• Replaces each pixel with an average of its neighborhood
• Achieves smoothing effect (removes sharp features)
• Why does it sum to one?

Slide credit: David Lowe (UBC)
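
One way to explore the question is to compare a box of ones with its normalized version on a flat patch. The patch value 0.5 is an arbitrary choice for illustration.

```python
import numpy as np

patch = np.full((3, 3), 0.5)          # a flat 3x3 neighborhood of brightness 0.5

box_sum = np.ones((3, 3))             # weights sum to 9
box_avg = np.ones((3, 3)) / 9.0       # weights sum to 1

# Unnormalized: the response is nine times brighter than the input.
print(np.sum(box_sum * patch))                      # 4.5
# Normalized: the response is the local average, so brightness is preserved.
print(round(float(np.sum(box_avg * patch)), 6))     # 0.5
```

A filter whose weights sum to one leaves flat regions unchanged, so the overall brightness of the image is preserved.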


Smoothing with box filter

f[·,·] = (1/9) ×
1 1 1
1 1 1
1 1 1


IMAGE FILTERING

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$

• Really important!
• Enhance images
– Denoise, resize, increase contrast, etc.
• Extract information from images
– Texture, edges, distinctive points, etc.
• Detect patterns
– Template matching

James Hays


THINK-PAIR-SHARE TIME

1.
0 0 0
0 1 0
0 0 0

2.
0 0 0
0 0 1
0 0 0

3.
1 0 -1
2 0 -2
1 0 -1

4.
0 0 0           1 1 1
0 2 0  - (1/9)  1 1 1
0 0 0           1 1 1


1. PRACTICE WITH LINEAR FILTERS

0 0 0
0 1 0 ?
0 0 0

Original


1. PRACTICE WITH LINEAR FILTERS

0 0 0
0 1 0
0 0 0

Original Filtered
(no change)

Source: D. Lowe


2. PRACTICE WITH LINEAR FILTERS

0 0 0
0 0 1 ?
0 0 0

Original

Source: D. Lowe


2. PRACTICE WITH LINEAR FILTERS

0 0 0
0 0 1
0 0 0

Original | Shifted left by 1 pixel

Source: D. Lowe
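
The first two practice filters can be checked numerically. The helper `correlate` below is a hypothetical implementation of the filtering formula h[m, n] = sum of f[k, l] I[m + k, n + l], assuming a square odd-sized filter and zeros outside the image; the small test image is arbitrary.

```python
import numpy as np

def correlate(I, f):
    """h[m, n] = sum_{k,l} f[k, l] * I[m + k, n + l], zero outside the image."""
    r = f.shape[0] // 2                       # assumes a square, odd-sized filter
    padded = np.pad(I.astype(float), r)
    h = np.zeros(I.shape)
    for k in range(f.shape[0]):
        for l in range(f.shape[1]):
            # Each filter tap adds a shifted, weighted copy of the image.
            h += f[k, l] * padded[k:k + I.shape[0], l:l + I.shape[1]]
    return h

img = np.arange(1, 13, dtype=float).reshape(3, 4)
identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
shift = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], dtype=float)

same = correlate(img, identity)     # filter 1: output equals the input
shifted = correlate(img, shift)     # filter 2: each pixel takes its right neighbor
print(shifted)
```

Each output pixel copies the value one column to its right, so the content moves left by one pixel and the last column is filled with the zero padding.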


3. PRACTICE WITH LINEAR FILTERS

1 0 -1
2 0 -2
1 0 -1
Sobel

Vertical Edge
(absolute value)


3. PRACTICE WITH LINEAR FILTERS

1 2 1
0 0 0
-1 -2 -1
Sobel

Horizontal Edge
(absolute value)
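
A sketch of both Sobel kernels applied to a synthetic image with a single vertical edge. The helper `correlate` and the test image are hypothetical; zeros are assumed outside the image, so only interior pixels are inspected.

```python
import numpy as np

def correlate(I, f):
    """h[m, n] = sum_{k,l} f[k, l] * I[m + k, n + l], zero outside the image."""
    r = f.shape[0] // 2                       # assumes a square, odd-sized filter
    padded = np.pad(I.astype(float), r)
    h = np.zeros(I.shape)
    for k in range(f.shape[0]):
        for l in range(f.shape[1]):
            h += f[k, l] * padded[k:k + I.shape[0], l:l + I.shape[1]]
    return h

sobel_vertical = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
sobel_horizontal = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)

# Dark left half, bright right half: one vertical edge between columns 2 and 3.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

gx = np.abs(correlate(img, sobel_vertical))
gy = np.abs(correlate(img, sobel_horizontal))

print(gx[2, 1:5])       # [0. 4. 4. 0.]: strong response next to the edge
print(gy[1:-1].max())   # 0.0: no horizontal edges in the interior rows
```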


4. PRACTICE WITH LINEAR FILTERS

0 0 0           1 1 1
0 2 0  - (1/9)  1 1 1
0 0 0           1 1 1
?
(Note that filter sums to 1)
Original


4. PRACTICE WITH LINEAR FILTERS

0 0 0           1 1 1
0 2 0  - (1/9)  1 1 1
0 0 0           1 1 1

Original
Sharpening filter
- Accentuates differences with local average
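
A sketch of the sharpening filter on a synthetic step edge, assuming the kernel is twice the identity minus a box filter normalized by 1/9 (which is what makes the weights sum to 1). The helper `correlate` and the test image are hypothetical.

```python
import numpy as np

def correlate(I, f):
    """h[m, n] = sum_{k,l} f[k, l] * I[m + k, n + l], zero outside the image."""
    r = f.shape[0] // 2                       # assumes a square, odd-sized filter
    padded = np.pad(I.astype(float), r)
    h = np.zeros(I.shape)
    for k in range(f.shape[0]):
        for l in range(f.shape[1]):
            h += f[k, l] * padded[k:k + I.shape[0], l:l + I.shape[1]]
    return h

identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
sharpen = 2.0 * identity - np.ones((3, 3)) / 9.0   # weights sum to 1

# Step edge: three dark columns followed by three bright columns.
img = np.tile([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], (5, 1))
out = correlate(img, sharpen)

print(np.round(out[2, 1:5], 3))   # [ 0.    -0.333  1.333  1.   ]
```

Flat regions keep their value, while the pixels on either side of the edge undershoot and overshoot, which is the accentuated difference from the local average.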


4. PRACTICE WITH LINEAR FILTERS


FILTERING IN THE FREQUENCY DOMAIN


WAVELET TRANSFORM


PROJECT ANNOUNCEMENT

• Topic: Machine learning for foreground/background separation in computer vision
• WEEK 3: Read the article …
• WEEK 3: Quiz on: Background subtraction article + Course content Weeks 1 & 2


SUPERHUMAN STATE OF THE ART?

Deep learning is an enormous disruption to the field.
Since 2012, rapid expansion and commercialization.
Why?

“With enough data, computer vision matches or even outperforms human vision at most recognition tasks.”

WHAT.


VISION AND SOCIETY


Lots of data = lots of potential bias in the data.

Needs understanding of possible failures.


+
Responsible approach.
+
Techniques to overcome bias.
