Commit 742505d

jnothman authored and ogrisel committed
FIX remove duplicates in MultiLabelBinarizer
1 parent 18c399e commit 742505d

2 files changed: +10 -1 lines changed

sklearn/preprocessing/label.py

Lines changed: 1 addition & 1 deletion
@@ -557,7 +557,7 @@ def _transform(self, y, class_mapping):
         indices = array.array('i')
         indptr = array.array('i', [0])
         for labels in y:
-            indices.extend(class_mapping[label] for label in labels)
+            indices.extend(set(class_mapping[label] for label in labels))
             indptr.append(len(indices))
         data = np.ones(len(indices), dtype=int)
         return sp.csr_matrix((data, indices, indptr),
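
To illustrate why the one-line change matters, here is a minimal stdlib-only sketch; it is not the scikit-learn code itself. `build_csr_parts` and `to_dense` are hypothetical helpers written for this note. `to_dense` assumes that repeated column indices within a row are summed when the sparse matrix is densified, which is how duplicate CSR entries behave. Without `set()`, a label repeated three times would show up as 3 rather than 1 in the indicator row.

```python
import array


def build_csr_parts(y, class_mapping, dedupe):
    """Mimic the loop in the hunk above; `dedupe` toggles the set() call."""
    indices = array.array('i')
    indptr = array.array('i', [0])
    for labels in y:
        cols = (class_mapping[label] for label in labels)
        indices.extend(set(cols) if dedupe else cols)
        indptr.append(len(indices))
    return indices, indptr


def to_dense(indices, indptr, n_classes):
    """Densify row by row, summing repeated column entries (assumed CSR behaviour)."""
    rows = []
    for start, stop in zip(indptr, indptr[1:]):
        row = [0] * n_classes
        for col in indices[start:stop]:
            row[col] += 1
        rows.append(row)
    return rows


y = [(1, 1, 1, 0)]            # same input as the new test
class_mapping = {0: 0, 1: 1}  # label -> column index

before = to_dense(*build_csr_parts(y, class_mapping, dedupe=False), n_classes=2)
after = to_dense(*build_csr_parts(y, class_mapping, dedupe=True), n_classes=2)
print(before)  # [[1, 3]]
print(after)   # [[1, 1]]
```

The `[[1, 3]]` row produced without deduplication is the kind of non-binary output the fix is meant to prevent.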

sklearn/preprocessing/tests/test_label.py

Lines changed: 9 additions & 0 deletions
@@ -371,3 +371,12 @@ def test_mutlilabel_binarizer_non_integer_labels():
 
     lsb = MultiLabelBinarizer()
     assert_raises(TypeError, lsb.fit_transform, [({}), ({}, {'a': 'b'})])
+
+
+def test_mutlilabel_binarizer_non_unique():
+    inp = [(1, 1, 1, 0)]
+    indicator_mat = np.array([[1, 1]])
+    lsb = MultiLabelBinarizer()
+    assert_array_equal(lsb.fit_transform(inp), indicator_mat)
+
+    assert_array_equal(lsb.inverse_transform(np.array([[1, 3]])), [(0, 1,)])
