
Commit 2ddb503

Cosmit readability improvements and better whatsnew.
1 parent 123a8c9 commit 2ddb503

3 files changed: 15 additions and 12 deletions

doc/whats_new.rst

Lines changed: 4 additions & 2 deletions
@@ -56,8 +56,10 @@ Enhancements
   :class:`linear_model.LogisticRegression`, by avoiding loss computation.
   By `Mathieu Blondel`_ and `Tom Dupre la Tour`_.
 
-- Improved heuristic for ``class_weight="auto"`` for classifiers supporting
-  ``class_weight`` by Hanna Wallach and `Andreas Müller`_
+- The ``class_weight="auto"`` heuristic in classifiers supporting
+  ``class_weight`` was deprecated and replaced by the ``class_weight="balanced"``
+  option, which has a simpler formula and interpretation.
+  By Hanna Wallach and `Andreas Müller`_.
 
 Bug fixes
 .........
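
Note on the changelog entry: the "simpler formula" behind ``class_weight="balanced"`` is the one the updated check in this commit spells out, `n_samples / (count_of_class * n_classes)`. A minimal plain-Python sketch of that formula follows; the helper name `balanced_class_weight` is hypothetical and is not part of scikit-learn.

```python
from collections import Counter


def balanced_class_weight(labels):
    """Hypothetical helper: weight = n_samples / (n_classes * class count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Six samples of class 0 and two of class 1: the rarer class gets the larger weight.
weights = balanced_class_weight([0, 0, 0, 0, 0, 0, 1, 1])
print(weights)
```

Under this formula, weight times class count is the same for every class (`n_samples / n_classes`), which is one way to read "simpler interpretation".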

sklearn/utils/estimator_checks.py

Lines changed: 6 additions & 5 deletions
@@ -1087,7 +1087,7 @@ def check_class_weight_balanced_linear_classifier(name, Classifier):
     """Test class weights with non-contiguous class labels."""
     X = np.array([[-1.0, -1.0], [-1.0, 0], [-.8, -1.0],
                   [1.0, 1.0], [1.0, 0.0]])
-    y = [1, 1, 1, -1, -1]
+    y = np.array([1, 1, 1, -1, -1])
 
     with warnings.catch_warnings(record=True):
         classifier = Classifier()
@@ -1102,10 +1102,11 @@ def check_class_weight_balanced_linear_classifier(name, Classifier):
     coef_balanced = classifier.fit(X, y).coef_.copy()
 
     # Count each label occurrence to reweight manually
-    class_weight = {
-        1: 5. / (2 * 3),
-        -1: 5. / (2 * 2)
-    }
+    n_samples = len(y)
+    n_classes = len(np.unique(y))
+
+    class_weight = {1: n_samples / (np.sum(y == 1) * n_classes),
+                    -1: n_samples / (np.sum(y == -1) * n_classes)}
     classifier.set_params(class_weight=class_weight)
     coef_manual = classifier.fit(X, y).coef_.copy()
 
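
Note on this hunk: the computed weights should equal the hard-coded values they replace, and the switch from a list to `np.array` matters because `y == 1` on a plain list is a single `False`, not a per-element mask. The sketch below checks both points in plain Python; `count` is a hypothetical stand-in for `np.sum(y == label)` on an array and is not code from the commit.

```python
y = [1, 1, 1, -1, -1]

# A list compared with a scalar yields one bool, so counting via `y == 1`
# only works once y is an array with element-wise comparison.
list_comparison = (y == 1)

n_samples = len(y)
n_classes = len(set(y))


def count(label):
    """Plain-Python stand-in for np.sum(y == label) on an array."""
    return sum(1 for value in y if value == label)


# New expression from the diff, evaluated without NumPy.
computed = {label: n_samples / (count(label) * n_classes) for label in (1, -1)}

# Old hard-coded values from the removed lines.
hard_coded = {1: 5. / (2 * 3), -1: 5. / (2 * 2)}

print(list_comparison, computed == hard_coded)
```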
11111112

sklearn/utils/tests/test_class_weight.py

Lines changed: 5 additions & 5 deletions
@@ -39,15 +39,15 @@ def test_compute_class_weight_not_present():
 
 
 def test_compute_class_weight_invariance():
-    # test that results with class_weight="balanced" is invariant against
-    # class imbalance if the number of samples is identical
-    # the test uses a balanced two class dataset with 100 datapoints.
-    # it then creates three versions, one where class 1 is duplicated
+    # Test that results with class_weight="balanced" is invariant wrt
+    # class imbalance if the number of samples is identical.
+    # The test uses a balanced two class dataset with 100 datapoints.
+    # It creates three versions, one where class 1 is duplicated
     # resulting in 150 points of class 1 and 50 of class 0,
     # one where there are 50 points in class 1 and 150 in class 0,
     # and one where there are 100 points of each class (this one is balanced
     # again).
-    # with balancing class weights, all three should give the same model.
+    # With balancing class weights, all three should give the same model.
     X, y = make_blobs(centers=2, random_state=0)
     # create dataset where class 1 is duplicated twice
     X_1 = np.vstack([X] + [X[y == 1]] * 2)
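
Note on the invariance claim in the comment: one way to see why all three datasets should give the same model is that, under the balanced formula, each original point ends up with the same total weight (copies times class weight) in every version. The sketch below does only that bookkeeping in plain Python, using the class sizes stated in the comment; it does not fit any estimator, and the names `versions` and `effective_point_weights` are hypothetical.

```python
import math

BASE_PER_CLASS = 50  # per the comment: balanced two-class dataset, 100 points

# Number of copies of each original point, per class, in the three versions.
versions = {
    "class 1 duplicated": {0: 1, 1: 3},  # 50 of class 0, 150 of class 1
    "class 0 duplicated": {0: 3, 1: 1},  # 150 of class 0, 50 of class 1
    "both duplicated": {0: 2, 1: 2},     # 100 of each class
}


def effective_point_weights(copies):
    """Total weight carried by one original point of each class."""
    counts = {label: BASE_PER_CLASS * c for label, c in copies.items()}
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: copies[label] * n_samples / (n_classes * counts[label])
            for label in counts}


results = {name: effective_point_weights(c) for name, c in versions.items()}
for name, per_class in results.items():
    print(name, per_class)
```

All three versions have 200 samples, which matches the comment's condition that the number of samples is identical.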
