 import numpy as np

 try:
-    import urllib.request as urllib # for backwards compatibility
+    import urllib.request as urllib # for backwards compatibility
 except ImportError:
     import urllib

@@ -231,33 +231,36 @@ def fetch_lfw_people(data_home=None, funneled=True, resize=0.5,
     picture of a face, find the name of the person given a training set
     (gallery).

+    The original images are 250 x 250 pixels, but the default slice and resize
+    arguments reduce them to 62 x 74.
+
     Parameters
     ----------
-    data_home: optional, default: None
+    data_home : optional, default: None
         Specify another download and cache folder for the datasets. By default
         all scikit learn data is stored in '~/scikit_learn_data' subfolders.

-    funneled: boolean, optional, default: True
+    funneled : boolean, optional, default: True
         Download and use the funneled variant of the dataset.

-    resize: float, optional, default 0.5
+    resize : float, optional, default 0.5
         Ratio used to resize the each face picture.

-    min_faces_per_person: int, optional, default None
+    min_faces_per_person : int, optional, default None
         The extracted dataset will only retain pictures of people that have at
         least `min_faces_per_person` different pictures.

-    color: boolean, optional, default False
+    color : boolean, optional, default False
         Keep the 3 RGB channels instead of averaging them to a single
         gray level channel. If color is True the shape of the data has
         one more dimension than than the shape with color = False.

-    slice_: optional
+    slice_ : optional
         Provide a custom 2D slice (height, width) to extract the
         'interesting' part of the jpeg files and avoid use statistical
         correlation from the background

-    download_if_missing: optional, True by default
+    download_if_missing : optional, True by default
         If False, raise a IOError if the data is not locally available
         instead of trying to download the data from the source site.

@@ -267,11 +270,13 @@ def fetch_lfw_people(data_home=None, funneled=True, resize=0.5,

     dataset.data : numpy array of shape (13233, 2914)
         Each row corresponds to a ravelled face image of original size 62 x 47
-        pixels.
+        pixels. Changing the ``slice_`` or resize parameters will change the shape
+        of the output.

     dataset.images : numpy array of shape (13233, 62, 47)
         Each row is a face image corresponding to one of the 5749 people in
-        the dataset.
+        the dataset. Changing the ``slice_`` or resize parameters will change the shape
+        of the output.

     dataset.target : numpy array of shape (13233,)
         Labels associated to each face image. Those labels range from 0-5748
@@ -389,36 +394,39 @@ def fetch_lfw_pairs(subset='train', data_home=None, funneled=True, resize=0.5,

     .. _`README.txt`: http://vis-www.cs.umass.edu/lfw/README.txt

+    The original images are 250 x 250 pixels, but the default slice and resize
+    arguments reduce them to 62 x 74.
+
     Parameters
     ----------
-    subset: optional, default: 'train'
+    subset : optional, default: 'train'
         Select the dataset to load: 'train' for the development training
         set, 'test' for the development test set, and '10_folds' for the
         official evaluation set that is meant to be used with a 10-folds
         cross validation.

-    data_home: optional, default: None
+    data_home : optional, default: None
         Specify another download and cache folder for the datasets. By
         default all scikit learn data is stored in '~/scikit_learn_data'
         subfolders.

-    funneled: boolean, optional, default: True
+    funneled : boolean, optional, default: True
         Download and use the funneled variant of the dataset.

-    resize: float, optional, default 0.5
+    resize : float, optional, default 0.5
         Ratio used to resize the each face picture.

-    color: boolean, optional, default False
+    color : boolean, optional, default False
         Keep the 3 RGB channels instead of averaging them to a single
         gray level channel. If color is True the shape of the data has
         one more dimension than than the shape with color = False.

-    slice_: optional
+    slice_ : optional
         Provide a custom 2D slice (height, width) to extract the
         'interesting' part of the jpeg files and avoid use statistical
         correlation from the background

-    download_if_missing: optional, True by default
+    download_if_missing : optional, True by default
         If False, raise a IOError if the data is not locally available
         instead of trying to download the data from the source site.

@@ -427,12 +435,14 @@ def fetch_lfw_pairs(subset='train', data_home=None, funneled=True, resize=0.5,
     The data is returned as a Bunch object with the following attributes:

     data : numpy array of shape (2200, 5828)
-        Each row corresponds to 2 ravel'd face images of original size 62 x 67
-        pixels.
+        Each row corresponds to 2 ravel'd face images of original size 62 x 47
+        pixels. Changing the ``slice_`` or resize parameters will change the shape
+        of the output.

-    pairs : numpy array of shape (2200, 2, 62, 67)
+    pairs : numpy array of shape (2200, 2, 62, 47)
         Each row has 2 face images corresponding to same or different person
-        from the dataset containing 5749 people.
+        from the dataset containing 5749 people. Changing the ``slice_`` or resize
+        parameters will change the shape of the output.

     target : numpy array of shape (13233,)
         Labels associated to each pair of images. The two label values being
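
Note on the shapes quoted in these docstrings: the arithmetic that takes a 250 x 250 image down to the documented default size can be sketched as below. This is a minimal illustration, not the loader's actual code. The helper name `lfw_output_shape` is hypothetical, and the default crop `(slice(70, 195), slice(78, 172))` and the truncate-after-scaling rule are assumptions about the function's defaults that do not appear in this diff.

```python
def lfw_output_shape(slice_=(slice(70, 195), slice(78, 172)), resize=0.5):
    """Return (height, width) of one face image after cropping and resizing.

    Hypothetical helper: the default crop window and the truncation rule are
    assumptions about fetch_lfw_people / fetch_lfw_pairs, not taken from the diff.
    """
    h_slice, w_slice = slice_
    # Size of the cropped window; a missing slice step is treated as 1.
    h = (h_slice.stop - h_slice.start) // (h_slice.step or 1)
    w = (w_slice.stop - w_slice.start) // (w_slice.step or 1)
    if resize is not None:
        # Scale each side by the ratio and truncate to an integer.
        h = int(resize * h)
        w = int(resize * w)
    return h, w


height, width = lfw_output_shape()
print(height, width)       # 62 47
print(height * width)      # 2914
print(2 * height * width)  # 5828
```

Under these assumptions the default output is 62 x 47 (height x width), which agrees with the `(13233, 62, 47)`, `(13233, 2914)` and `(2200, 5828)` shapes documented in the return sections, whereas the two added paragraphs state 62 x 74.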