Commit 1f0815b

arjoly authored and ogrisel committed

ENH more explicit name for auc + consistency for scorer, fix scikit-learn#2096

Conflicts:
	sklearn/metrics/tests/test_metrics.py

1 parent c69201a · commit 1f0815b

File tree

9 files changed: +148 -44 lines changed

doc/modules/classes.rst
Lines changed: 1 addition & 1 deletion

@@ -715,7 +715,6 @@ details.
 
    metrics.accuracy_score
    metrics.auc
-   metrics.auc_score
   metrics.average_precision_score
   metrics.classification_report
   metrics.confusion_matrix
@@ -730,6 +729,7 @@ details.
   metrics.precision_recall_fscore_support
   metrics.precision_score
   metrics.recall_score
+   metrics.roc_auc_score
   metrics.roc_curve
   metrics.zero_one_loss

doc/modules/model_evaluation.rst
Lines changed: 15 additions & 11 deletions

@@ -56,7 +56,7 @@ Scoring Function
 'f1' :func:`sklearn.metrics.f1_score`
 'precision' :func:`sklearn.metrics.precision_score`
 'recall' :func:`sklearn.metrics.recall_score`
-'roc_auc' :func:`sklearn.metrics.auc_score`
+'roc_auc' :func:`sklearn.metrics.roc_auc_score`
 
 **Clustering**
 'adjusted_rand_score' :func:`sklearn.metrics.adjusted_rand_score`
@@ -182,11 +182,11 @@ Some of these are restricted to the binary classification case:
 .. autosummary::
    :template: function.rst
 
-   auc_score
   average_precision_score
   hinge_loss
   matthews_corrcoef
   precision_recall_curve
+   roc_auc_score
   roc_curve
 
 
@@ -268,21 +268,21 @@ and with a list of labels format:
   for an example of accuracy score usage using permutations of
   the dataset.
 
-Area under the curve (AUC)
-...........................
+Area under the ROC curve
+.........................
 
-The :func:`auc_score` function computes the 'area under the curve' (AUC) which
-is the area under the receiver operating characteristic (ROC) curve.
+The :func:`roc_auc_score` function computes the area under the receiver
+operating characteristic (ROC) curve.
 
-This function requires the true binary value and the target scores, which can
+This function requires the true binary value and the target scores, which can
 either be probability estimates of the positive class, confidence values, or
 binary decisions.
 
   >>> import numpy as np
-  >>> from sklearn.metrics import auc_score
+  >>> from sklearn.metrics import roc_auc_score
  >>> y_true = np.array([0, 0, 1, 1])
  >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
-  >>> auc_score(y_true, y_scores)
+  >>> roc_auc_score(y_true, y_scores)
  0.75
 
 For more information see the
@@ -812,12 +812,16 @@ Wikipedia) <http://en.wikipedia.org/wiki/Receiver_operating_characteristic>`_:
 Here a small example of how to use the :func:`roc_curve` function::
 
     >>> import numpy as np
-    >>> from sklearn import metrics
+    >>> from sklearn.metrics import roc_curve
    >>> y = np.array([1, 1, 2, 2])
    >>> scores = np.array([0.1, 0.4, 0.35, 0.8])
-    >>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
+    >>> fpr, tpr, thresholds = roc_curve(y, scores, pos_label=2)
    >>> fpr
    array([ 0. ,  0.5,  0.5,  1. ])
+    >>> tpr
+    array([ 0.5,  0.5,  1. ,  1. ])
+    >>> thresholds
+    array([ 0.8 ,  0.4 ,  0.35,  0.1 ])
 
 
 The following figure shows an example of such ROC curve.
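As a sanity check on the value 0.75 in the doctest above: AUC equals the probability that a randomly chosen positive sample outscores a randomly chosen negative one. Here the positives score 0.35 and 0.8 and the negatives score 0.1 and 0.4, so three of the four positive/negative pairs are ranked correctly (0.35 > 0.1, 0.8 > 0.1, 0.8 > 0.4) and one is not (0.35 < 0.4), giving 3/4 = 0.75.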

doc/whats_new.rst
Lines changed: 2 additions & 0 deletions

@@ -212,6 +212,8 @@ Changelog
 API changes summary
 -------------------
 
+- The :func:`auc_score` was renamed :func:`roc_auc_score`.
+
 - Testing scikit-learn with `sklearn.test()` is deprecated. Use
   `nosetest sklearn` from the command line.

sklearn/metrics/__init__.py
Lines changed: 5 additions & 2 deletions

@@ -6,7 +6,7 @@
 from .metrics import (accuracy_score,
                       average_precision_score,
                       auc,
-                      auc_score,
+                      roc_auc_score,
                      classification_report,
                      confusion_matrix,
                      explained_variance_score,
@@ -31,6 +31,9 @@
 from .metrics import zero_one
 from .metrics import zero_one_score
 
+# Deprecated in 0.16
+from .metrics import auc_score
+
 from .scorer import make_scorer, SCORERS
 
 from . import cluster
@@ -54,7 +57,7 @@
           'adjusted_mutual_info_score',
           'adjusted_rand_score',
           'auc',
-           'auc_score',
+           'roc_auc_score',
           'average_precision_score',
           'classification_report',
           'cluster',
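A quick note on what the deprecated re-export means for downstream code: the old name remains importable, but calling it now emits a warning before delegating to the new function. A minimal doctest-style sketch, assuming a checkout at this commit (the warning category follows scikit-learn's deprecated decorator):

    >>> import warnings
    >>> import numpy as np
    >>> from sklearn.metrics import auc_score  # deprecated alias
    >>> y_true = np.array([0, 0, 1, 1])
    >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
    >>> with warnings.catch_warnings(record=True) as caught:
    ...     warnings.simplefilter("always")
    ...     result = auc_score(y_true, y_scores)  # warns, then delegates
    >>> result
    0.75
    >>> any(issubclass(w.category, DeprecationWarning) for w in caught)
    True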

sklearn/metrics/metrics.py
Lines changed: 54 additions & 7 deletions

@@ -133,7 +133,7 @@ def auc(x, y, reorder=False):
     """Compute Area Under the Curve (AUC) using the trapezoidal rule
 
     This is a general function, given points on a curve. For computing the
-    area under the ROC-curve, see :func:`auc_score`.
+    area under the ROC-curve, see :func:`roc_auc_score`.
 
     Parameters
     ----------
@@ -163,7 +163,10 @@ def auc(x, y, reorder=False):
 
     See also
     --------
-    auc_score : Computes the area under the ROC curve
+    roc_auc_score : Computes the area under the ROC curve
+
+    precision_recall_curve :
+        Compute precision-recall pairs for different probability thresholds
 
     """
     x, y = check_arrays(x, y)
@@ -292,7 +295,7 @@ def average_precision_score(y_true, y_score):
 
     See also
     --------
-    auc_score : Area under the ROC curve
+    roc_auc_score : Area under the ROC curve
 
     precision_recall_curve :
        Compute precision-recall pairs for different probability thresholds
@@ -310,7 +313,8 @@ def average_precision_score(y_true, y_score):
     precision, recall, thresholds = precision_recall_curve(y_true, y_score)
     return auc(recall, precision)
 
-
+@deprecated("Function 'auc_score' has been renamed to "
+            "'roc_auc_score' and will be removed in release 0.16.")
 def auc_score(y_true, y_score):
     """Compute Area Under the Curve (AUC) from prediction scores
 
@@ -344,10 +348,53 @@ def auc_score(y_true, y_score):
     Examples
     --------
     >>> import numpy as np
-    >>> from sklearn.metrics import auc_score
+    >>> from sklearn.metrics import roc_auc_score
+    >>> y_true = np.array([0, 0, 1, 1])
+    >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
+    >>> roc_auc_score(y_true, y_scores)
+    0.75
+
+    """
+    return roc_auc_score(y_true, y_score)
+
+
+def roc_auc_score(y_true, y_score):
+    """Compute Area Under the Curve (AUC) from prediction scores
+
+    Note: this implementation is restricted to the binary classification task.
+
+    Parameters
+    ----------
+
+    y_true : array, shape = [n_samples]
+        True binary labels.
+
+    y_score : array, shape = [n_samples]
+        Target scores, can either be probability estimates of the positive
+        class, confidence values, or binary decisions.
+
+    Returns
+    -------
+    auc : float
+
+    References
+    ----------
+    .. [1] `Wikipedia entry for the Receiver operating characteristic
+            <http://en.wikipedia.org/wiki/Receiver_operating_characteristic>`_
+
+    See also
+    --------
+    average_precision_score : Area under the precision-recall curve
+
+    roc_curve : Compute Receiver operating characteristic (ROC)
+
+    Examples
+    --------
+    >>> import numpy as np
+    >>> from sklearn.metrics import roc_auc_score
    >>> y_true = np.array([0, 0, 1, 1])
    >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
-    >>> auc_score(y_true, y_scores)
+    >>> roc_auc_score(y_true, y_scores)
    0.75
 
    """
@@ -593,7 +640,7 @@ def roc_curve(y_true, y_score, pos_label=None):
 
     See also
     --------
-    auc_score : Compute Area Under the Curve (AUC) from prediction scores
+    roc_auc_score : Compute Area Under the Curve (AUC) from prediction scores
 
     Notes
     -----
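The body of roc_auc_score is not shown in the hunk because it is unchanged from the old auc_score: only the name and docstring moved. In this version of scikit-learn the computation amounts to composing roc_curve with the trapezoidal auc above. A simplified sketch of that composition (it omits the binary-class validation the real function performs):

    >>> import numpy as np
    >>> from sklearn.metrics import auc, roc_curve
    >>> def roc_auc_sketch(y_true, y_score):
    ...     fpr, tpr, _ = roc_curve(y_true, y_score)
    ...     return auc(fpr, tpr)  # trapezoidal area under the ROC curve
    >>> roc_auc_sketch(np.array([0, 0, 1, 1]), np.array([0.1, 0.4, 0.35, 0.8]))
    0.75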

sklearn/metrics/scorer.py
Lines changed: 4 additions & 4 deletions

@@ -23,7 +23,7 @@
 import numpy as np
 
 from . import (r2_score, mean_squared_error, accuracy_score, f1_score,
-               auc_score, average_precision_score, precision_score,
+               roc_auc_score, average_precision_score, precision_score,
               recall_score, log_loss)
 
 from .cluster import adjusted_rand_score
@@ -253,8 +253,8 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,
 f1_scorer = make_scorer(f1_score)
 
 # Score functions that need decision values
-auc_scorer = make_scorer(auc_score, greater_is_better=True,
-                         needs_threshold=True)
+roc_auc_scorer = make_scorer(roc_auc_score, greater_is_better=True,
+                             needs_threshold=True)
 average_precision_scorer = make_scorer(average_precision_score,
                                        needs_threshold=True)
 precision_scorer = make_scorer(precision_score)
@@ -269,7 +269,7 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,
 
 SCORERS = dict(r2=r2_scorer,
               mean_squared_error=mean_squared_error_scorer,
-               accuracy=accuracy_scorer, f1=f1_scorer, roc_auc=auc_scorer,
+               accuracy=accuracy_scorer, f1=f1_scorer, roc_auc=roc_auc_scorer,
               average_precision=average_precision_scorer,
               precision=precision_scorer, recall=recall_scorer,
               log_loss=log_loss_scorer,
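Because the scorer keeps its 'roc_auc' key in SCORERS, estimator-level code such as grid search and cross-validation is unaffected by the rename. A usage sketch, assuming the cross-validation API as it stood around this commit (sklearn.cross_validation, 3-fold by default) and an arbitrary synthetic dataset:

    >>> from sklearn.cross_validation import cross_val_score
    >>> from sklearn.datasets import make_classification
    >>> from sklearn.linear_model import LogisticRegression
    >>> X, y = make_classification(random_state=0)
    >>> scores = cross_val_score(LogisticRegression(), X, y, scoring='roc_auc')
    >>> len(scores)  # one ROC AUC value per fold
    3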

sklearn/metrics/tests/test_metrics.py
Lines changed: 61 additions & 13 deletions

@@ -46,6 +46,7 @@
     precision_score,
     recall_score,
     r2_score,
+    roc_auc_score,
     roc_curve,
     zero_one,
     zero_one_score,
@@ -106,7 +107,7 @@
 }
 
 THRESHOLDED_METRICS = {
-    "auc_score": auc_score,
+    "roc_auc_score": roc_auc_score,
     "average_precision_score": average_precision_score,
 }
 
@@ -299,14 +300,33 @@ def make_prediction(dataset=None, binary=False):
     return y_true, y_pred, probas_pred
 
 
+def _auc(y_true, y_score):
+    pos_label = np.unique(y_true)[1]
+
+    # Count the number of times positive samples are correctly ranked above
+    # negative samples.
+    pos = y_score[y_true == pos_label]
+    neg = y_score[y_true != pos_label]
+    diff_matrix = pos.reshape(1, -1) - neg.reshape(-1, 1)
+    n_correct = np.sum(diff_matrix > 0)
+
+    return n_correct / float(len(pos) * len(neg))
+
+
 def test_roc_curve():
     """Test Area under Receiver Operating Characteristic (ROC) curve"""
     y_true, _, probas_pred = make_prediction(binary=True)
 
     fpr, tpr, thresholds = roc_curve(y_true, probas_pred)
     roc_auc = auc(fpr, tpr)
-    assert_array_almost_equal(roc_auc, 0.90, decimal=2)
-    assert_almost_equal(roc_auc, auc_score(y_true, probas_pred))
+    expected_auc = _auc(y_true, probas_pred)
+    assert_array_almost_equal(roc_auc, expected_auc, decimal=2)
+    assert_almost_equal(roc_auc, roc_auc_score(y_true, probas_pred))
+
+    with warnings.catch_warnings(record=True):
+        assert_almost_equal(roc_auc, auc_score(y_true, probas_pred))
+
+
     assert_equal(fpr.shape, tpr.shape)
     assert_equal(fpr.shape, thresholds.shape)
 
@@ -461,26 +481,47 @@ def test_auc_errors():
 
 
 def test_auc_score_non_binary_class():
-    """Test that auc_score function returns an error when trying to compute AUC
+    """Test that roc_auc_score function returns an error when trying to compute AUC
     for non-binary class values.
     """
     rng = check_random_state(404)
     y_pred = rng.rand(10)
     # y_true contains only one class value
     y_true = np.zeros(10, dtype="int")
     assert_raise_message(ValueError, "AUC is defined for binary "
-                         "classification only", auc_score, y_true, y_pred)
+                         "classification only", roc_auc_score, y_true, y_pred)
     y_true = np.ones(10, dtype="int")
     assert_raise_message(ValueError, "AUC is defined for binary "
-                         "classification only", auc_score, y_true, y_pred)
+                         "classification only", roc_auc_score, y_true, y_pred)
     y_true = -np.ones(10, dtype="int")
     assert_raise_message(ValueError, "AUC is defined for binary "
-                         "classification only", auc_score, y_true, y_pred)
+                         "classification only", roc_auc_score, y_true, y_pred)
     # y_true contains three different class values
     y_true = rng.randint(0, 3, size=10)
     assert_raise_message(ValueError, "AUC is defined for binary "
-                         "classification only", auc_score, y_true, y_pred)
+                         "classification only", roc_auc_score, y_true, y_pred)
 
+    with warnings.catch_warnings(record=True):
+        rng = check_random_state(404)
+        y_pred = rng.rand(10)
+        # y_true contains only one class value
+        y_true = np.zeros(10, dtype="int")
+        assert_raise_message(ValueError, "AUC is defined for binary "
+                             "classification only", auc_score,
+                             y_true, y_pred)
+        y_true = np.ones(10, dtype="int")
+        assert_raise_message(ValueError, "AUC is defined for binary "
+                             "classification only", auc_score, y_true,
+                             y_pred)
+        y_true = -np.ones(10, dtype="int")
+        assert_raise_message(ValueError, "AUC is defined for binary "
+                             "classification only", auc_score, y_true,
+                             y_pred)
+        # y_true contains three different class values
+        y_true = rng.randint(0, 3, size=10)
+        assert_raise_message(ValueError, "AUC is defined for binary "
+                             "classification only", auc_score, y_true,
+                             y_pred)
 
 def test_precision_recall_f1_score_binary():
     """Test Precision Recall and F1 Score for binary classification task"""
@@ -871,16 +912,23 @@ def test_precision_recall_curve_errors():
 
 
 def test_score_scale_invariance():
-    # Test that average_precision_score and auc_score are invariant by
+    # Test that average_precision_score and roc_auc_score are invariant by
     # the scaling or shifting of probabilities
     y_true, _, probas_pred = make_prediction(binary=True)
 
-    roc_auc = auc_score(y_true, probas_pred)
-    roc_auc_scaled = auc_score(y_true, 100 * probas_pred)
-    roc_auc_shifted = auc_score(y_true, probas_pred - 10)
+    roc_auc = roc_auc_score(y_true, probas_pred)
+    roc_auc_scaled = roc_auc_score(y_true, 100 * probas_pred)
+    roc_auc_shifted = roc_auc_score(y_true, probas_pred - 10)
     assert_equal(roc_auc, roc_auc_scaled)
     assert_equal(roc_auc, roc_auc_shifted)
 
+    with warnings.catch_warnings():
+        roc_auc = auc_score(y_true, probas_pred)
+        roc_auc_scaled = auc_score(y_true, 100 * probas_pred)
+        roc_auc_shifted = auc_score(y_true, probas_pred - 10)
+        assert_equal(roc_auc, roc_auc_scaled)
+        assert_equal(roc_auc, roc_auc_shifted)
+
     pr_auc = average_precision_score(y_true, probas_pred)
     pr_auc_scaled = average_precision_score(y_true, 100 * probas_pred)
     pr_auc_shifted = average_precision_score(y_true, probas_pred - 10)
@@ -912,7 +960,7 @@ def test_losses():
                  1 - zero_one_loss(y_true, y_pred))
 
     with warnings.catch_warnings(record=True):
-        # Throw deprecated warning
+        # Throw deprecated warning
         assert_equal(zero_one_score(y_true, y_pred),
                      1 - zero_one_loss(y_true, y_pred))
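The _auc helper introduced above recomputes AUC from first principles, as the fraction of (positive, negative) pairs in which the positive sample receives the higher score, so test_roc_curve no longer hard-codes the value 0.90. A standalone check that the pairwise count, the trapezoidal rule, and roc_auc_score agree on a tie-free toy example (illustrative, not part of the test suite):

    >>> import numpy as np
    >>> from sklearn.metrics import auc, roc_curve, roc_auc_score
    >>> y_true = np.array([0, 0, 1, 1])
    >>> y_score = np.array([0.1, 0.4, 0.35, 0.8])
    >>> pos, neg = y_score[y_true == 1], y_score[y_true == 0]
    >>> pairwise = np.sum(pos.reshape(1, -1) - neg.reshape(-1, 1) > 0)
    >>> pairwise / float(pos.size * neg.size)  # 3 of 4 pairs ranked correctly
    0.75
    >>> fpr, tpr, _ = roc_curve(y_true, y_score)
    >>> auc(fpr, tpr) == roc_auc_score(y_true, y_score) == 0.75
    True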
