Commit 8159c65

DOC fix typo in output shape of fetch_lfw_pairs (and minor additions)
1 parent 5ff34b2 commit 8159c65

1 file changed, +31 -21 lines changed

sklearn/datasets/lfw.py

Lines changed: 31 additions & 21 deletions
@@ -30,7 +30,7 @@
 import numpy as np

 try:
-    import urllib.request as urllib #for backwards compatibility
+    import urllib.request as urllib # for backwards compatibility
 except ImportError:
     import urllib

@@ -231,33 +231,36 @@ def fetch_lfw_people(data_home=None, funneled=True, resize=0.5,
     picture of a face, find the name of the person given a training set
     (gallery).

+    The original images are 250 x 250 pixels, but the default slice and resize
+    arguments reduce them to 62 x 74.
+
     Parameters
     ----------
-    data_home: optional, default: None
+    data_home : optional, default: None
         Specify another download and cache folder for the datasets. By default
         all scikit learn data is stored in '~/scikit_learn_data' subfolders.

-    funneled: boolean, optional, default: True
+    funneled : boolean, optional, default: True
         Download and use the funneled variant of the dataset.

-    resize: float, optional, default 0.5
+    resize : float, optional, default 0.5
         Ratio used to resize the each face picture.

-    min_faces_per_person: int, optional, default None
+    min_faces_per_person : int, optional, default None
         The extracted dataset will only retain pictures of people that have at
         least `min_faces_per_person` different pictures.

-    color: boolean, optional, default False
+    color : boolean, optional, default False
         Keep the 3 RGB channels instead of averaging them to a single
         gray level channel. If color is True the shape of the data has
         one more dimension than than the shape with color = False.

-    slice_: optional
+    slice_ : optional
         Provide a custom 2D slice (height, width) to extract the
         'interesting' part of the jpeg files and avoid use statistical
         correlation from the background

-    download_if_missing: optional, True by default
+    download_if_missing : optional, True by default
         If False, raise a IOError if the data is not locally available
         instead of trying to download the data from the source site.

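The sentence this hunk adds about image size can be checked with a little arithmetic. The sketch below assumes the default `slice_` is `(slice(70, 195), slice(78, 172))`, a value that does not appear in this diff, and that the scaled dimensions are truncated to integers. `sliced_resized_shape` is a hypothetical helper written for illustration, not part of scikit-learn.

```python
def sliced_resized_shape(original=(250, 250),
                         slice_=(slice(70, 195), slice(78, 172)),
                         resize=0.5):
    """Return (height, width) after cropping with slice_ and scaling by resize.

    The default slice_ is an assumption; it is not shown in the diff.
    """
    h_slice, w_slice = slice_
    # slice.indices() resolves each slice against the length of its axis.
    height = len(range(*h_slice.indices(original[0])))
    width = len(range(*w_slice.indices(original[1])))
    # Assumption: scaled sizes are truncated to integers.
    return int(resize * height), int(resize * width)


default_shape = sliced_resized_shape()
full_res_shape = sliced_resized_shape(resize=1.0)
print(default_shape, full_res_shape)
```

Under these assumptions the default shape comes out as 62 x 47, which agrees with the shapes listed in the Returns section of the same docstring rather than the "62 x 74" in the added sentence, so the new text may have transposed the digits.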
@@ -267,11 +270,13 @@ def fetch_lfw_people(data_home=None, funneled=True, resize=0.5,

     dataset.data : numpy array of shape (13233, 2914)
         Each row corresponds to a ravelled face image of original size 62 x 47
-        pixels.
+        pixels. Changing the ``slice_`` or resize parameters will change the shape
+        of the output.

     dataset.images : numpy array of shape (13233, 62, 47)
         Each row is a face image corresponding to one of the 5749 people in
-        the dataset.
+        the dataset. Changing the ``slice_`` or resize parameters will change the shape
+        of the output.

     dataset.target : numpy array of shape (13233,)
         Labels associated to each face image. Those labels range from 0-5748
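
The two shapes in this hunk describe the same pixels: each row of `dataset.data` is the matching `dataset.images` entry flattened. A minimal sketch of that relationship in plain Python follows. It assumes row-major flattening (NumPy's default for `ravel`), and `flat_length` and `flat_index` are hypothetical helpers introduced only for illustration.

```python
import math

IMAGE_SHAPE = (62, 47)  # (height, width) as documented in the docstring


def flat_length(shape):
    """Number of values in one flattened image of the given shape."""
    return math.prod(shape)


def flat_index(row, col, width=IMAGE_SHAPE[1]):
    """Position of pixel (row, col) in a row-major flattened image."""
    return row * width + col


n_features = flat_length(IMAGE_SHAPE)   # columns in dataset.data
last_pixel = flat_index(61, 46)         # bottom-right pixel of one image
print(n_features, last_pixel)
```

The same product explains why a different `slice_` or `resize` changes the second dimension of `dataset.data`: any change to the image height or width changes the flattened length.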
@@ -389,36 +394,39 @@ def fetch_lfw_pairs(subset='train', data_home=None, funneled=True, resize=0.5,

     .. _`README.txt`: http://vis-www.cs.umass.edu/lfw/README.txt

+    The original images are 250 x 250 pixels, but the default slice and resize
+    arguments reduce them to 62 x 74.
+
     Parameters
     ----------
-    subset: optional, default: 'train'
+    subset : optional, default: 'train'
         Select the dataset to load: 'train' for the development training
         set, 'test' for the development test set, and '10_folds' for the
         official evaluation set that is meant to be used with a 10-folds
         cross validation.

-    data_home: optional, default: None
+    data_home : optional, default: None
         Specify another download and cache folder for the datasets. By
         default all scikit learn data is stored in '~/scikit_learn_data'
         subfolders.

-    funneled: boolean, optional, default: True
+    funneled : boolean, optional, default: True
         Download and use the funneled variant of the dataset.

-    resize: float, optional, default 0.5
+    resize : float, optional, default 0.5
         Ratio used to resize the each face picture.

-    color: boolean, optional, default False
+    color : boolean, optional, default False
         Keep the 3 RGB channels instead of averaging them to a single
         gray level channel. If color is True the shape of the data has
         one more dimension than than the shape with color = False.

-    slice_: optional
+    slice_ : optional
         Provide a custom 2D slice (height, width) to extract the
         'interesting' part of the jpeg files and avoid use statistical
         correlation from the background

-    download_if_missing: optional, True by default
+    download_if_missing : optional, True by default
         If False, raise a IOError if the data is not locally available
         instead of trying to download the data from the source site.

@@ -427,12 +435,14 @@ def fetch_lfw_pairs(subset='train', data_home=None, funneled=True, resize=0.5,
     The data is returned as a Bunch object with the following attributes:

     data : numpy array of shape (2200, 5828)
-        Each row corresponds to 2 ravel'd face images of original size 62 x 67
-        pixels.
+        Each row corresponds to 2 ravel'd face images of original size 62 x 47
+        pixels. Changing the ``slice_`` or resize parameters will change the shape
+        of the output.

-    pairs : numpy array of shape (2200, 2, 62, 67)
+    pairs : numpy array of shape (2200, 2, 62, 47)
         Each row has 2 face images corresponding to same or different person
-        from the dataset containing 5749 people.
+        from the dataset containing 5749 people. Changing the ``slice_`` or resize
+        parameters will change the shape of the output.

     target : numpy array of shape (13233,)
         Labels associated to each pair of images. The two label values being
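
The width corrected in this hunk can be cross-checked against the feature count, which the commit leaves unchanged: a row of `data` holds two flattened images, so 5828 should equal 2 x height x width. A short check follows, where `width_from_features` is a hypothetical helper and the height of 62 is taken from the docstring.

```python
def width_from_features(n_features, n_images=2, height=62):
    """Recover the image width implied by a flattened row length."""
    per_image, remainder = divmod(n_features, n_images)
    if remainder:
        raise ValueError("row length is not divisible by the image count")
    width, remainder = divmod(per_image, height)
    if remainder:
        raise ValueError("per-image length is not divisible by the height")
    return width


implied_width = width_from_features(5828)
old_doc_features = 2 * 62 * 67  # row length the old "62 x 67" text would imply
print(implied_width, old_doc_features)
```

The implied width is 47, matching the corrected docstring, while the previous "62 x 67" would have required rows of 8308 values.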
