Insights: huggingface/datasets
Overview
8 Pull requests merged by 7 people
-
Add `IterableDataset.push_to_hub()`
#7595 merged
Jun 6, 2025 -
fix string_to_dict usage for windows
#7598 merged
Jun 6, 2025 -
Fix broken link to albumentations
#7593 merged
Jun 5, 2025 -
Avoid multiple default config names
#7585 merged
Jun 5, 2025 -
Add missing property on `RepeatExamplesIterable`
#7581 merged
Jun 5, 2025 -
[MINOR:TYPO] Update save_to_disk docstring
#7575 merged
Jun 5, 2025 -
Fix regex library warnings
#7576 merged
Jun 5, 2025 -
Fixed typos
#7572 merged
Jun 5, 2025
2 Pull requests opened by 2 people
-
Remove scripts altogether
#7592 opened
Jun 4, 2025 -
Add albumentations to use dataset
#7596 opened
Jun 5, 2025
2 Issues closed by 1 person
-
Feature request: IterableDataset.push_to_hub
#5665 closed
Jun 6, 2025
5 Issues opened by 5 people
-
`push_to_hub` is not concurrency safe (dataset schema corruption)
#7600 opened
Jun 7, 2025 -
My already working dataset (when uploaded few months ago) now is ignoring metadata.jsonl
#7599 opened
Jun 6, 2025 -
Download datasets from a private hub in 2025
#7597 opened
Jun 6, 2025 -
Add option to ignore keys/columns when loading a dataset from jsonl(or any other data format)
#7594 opened
Jun 5, 2025 -
Add num_proc parameter to push_to_hub
#7591 opened
Jun 4, 2025
3 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Dataset lib seems to broke after fssec lib update
#7570 commented on
Jun 2, 2025 • 0 new comments -
`Sequence(Features(...))` causes PyArrow cast error in `load_dataset` despite correct schema.
#7590 commented on
Jun 4, 2025 • 0 new comments -
feat: use content defined chunking
#7589 commented on
Jun 5, 2025 • 0 new comments