JumperBot/whitespace-sifter


whitespace-sifter



use whitespace_sifter::WhitespaceSifter;
// This prints `1.. 2.. 3.. 4.. 5..`.
println!(
    "{}",
    "1.. \n2..  \n\r\n\n3..   \n\n\n4..    \n\n\r\n\n\n5..     \n\n\n\n\n".sift(),
);

// This prints `1..\n2..\n3..\n4..\r\n5..`.
println!(
    "{}",
    "1.. \n2..  \n\r\n3..   \n\n\n4..    \r\n\n\r\n\n5..     \n\n\n\n\n"
        .sift_preserve_newlines(),
);
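The second call above keeps one line break per run of blank lines. A minimal std-only sketch of that behavior follows; `collapse_keep_newlines` is a hypothetical name, and this is not the crate's implementation. It assumes that the first line break after each non-blank line is the one that survives, which agrees with the example output above but may differ from the crate on other inputs.

```rust
/// Hypothetical helper (not part of whitespace-sifter): collapses whitespace
/// inside each line, drops blank lines, and keeps one line break per run.
fn collapse_keep_newlines(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    // Line break that followed the last emitted line; written lazily so that
    // trailing breaks never reach the output.
    let mut pending_break = "";
    for raw in input.split_inclusive('\n') {
        // Separate the line body from its terminator ("\r\n", "\n", or none).
        let (body, line_break) = if let Some(b) = raw.strip_suffix("\r\n") {
            (b, "\r\n")
        } else if let Some(b) = raw.strip_suffix('\n') {
            (b, "\n")
        } else {
            (raw, "")
        };
        let words: Vec<&str> = body.split_ascii_whitespace().collect();
        if words.is_empty() {
            // Blank line: skip it and keep the earlier pending break.
            continue;
        }
        out.push_str(pending_break);
        out.push_str(&words.join(" "));
        pending_break = line_break;
    }
    out
}
```

Calling `collapse_keep_newlines("a  b\n\n c ")` yields `"a b\nc"`.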

✨ Sift Duplicate Whitespaces In One Function Call

This crate removes duplicate whitespace within a UTF-8 encoded string.
It also trims the whitespace at the start and end of the string.
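To make "sifting" concrete, here is a minimal std-only sketch of the idea. `collapse_whitespace` is a hypothetical name, not the crate's API, and the sketch assumes each whitespace run is replaced by a single space, which may not be exactly what the crate keeps.

```rust
/// Hypothetical helper (not part of whitespace-sifter): replaces every run of
/// ASCII whitespace with one space and drops leading and trailing runs.
fn collapse_whitespace(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut pending_space = false;
    for c in input.chars() {
        if matches!(c, '\t' | '\n' | '\x0C' | '\r' | ' ') {
            // Remember the run, but only if some text has already been written;
            // this is what trims leading whitespace.
            pending_space = !out.is_empty();
        } else {
            if pending_space {
                // A run is emitted only once more text follows it,
                // which is what trims trailing whitespace.
                out.push(' ');
                pending_space = false;
            }
            out.push(c);
        }
    }
    out
}
```

Calling `collapse_whitespace("  Hello    there!  \n")` yields `"Hello there!"`.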


📈 Crate Comparison

Crate                     Implementation
whitespace-sifter         Any AsRef<str> as input, CR-LF compatibility, preserve_newlines
collapse                  &str input only
fast_whitespace_collapse  &str input only, SIMD with fallback for any unsupported rustc target
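As an illustration of what "any AsRef<str> as input" can look like, here is a sketch of an extension trait with a blanket implementation. The trait name `CollapseExt` and its method are hypothetical, and the body is a stand-in built on `str::split_ascii_whitespace`; the crate's real trait is `WhitespaceSifter` and its internals may be quite different.

```rust
/// Hypothetical extension trait; the name is illustrative only.
trait CollapseExt {
    /// Returns a copy with whitespace runs collapsed to single spaces.
    fn collapse(&self) -> String;
}

// One blanket impl covers `str`, `String`, `Box<str>`, `Cow<str>`, and any
// other type that can be viewed as a `&str`.
impl<T: AsRef<str> + ?Sized> CollapseExt for T {
    fn collapse(&self) -> String {
        self.as_ref()
            .split_ascii_whitespace()
            .collect::<Vec<_>>()
            .join(" ")
    }
}
```

With the trait in scope, both `"a  b".collapse()` and `String::from(" x\t\ty \n").collapse()` compile without any conversion at the call site.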

Crate                     Whitespace Dictionary                              Time to Complete
whitespace-sifter         '\t' | '\n' | '\x0C' | '\r' | ' ' | "\r\n"         ~170 µs
collapse                  ' ' | '\x09'..='\x0d' | unicode::White_Space(c)    ~270 µs
fast_whitespace_collapse  ' ' | '\t'                                         ~160 µs

Disclaimers:

  1. I do not know the maintainers of these crates, nor did I ask for permission to include their crates here.

  2. As far as I know, there are only three crates dedicated to whitespace sifting/collapsing.

  3. fast_whitespace_collapse was not able to collapse CR-LF sequences or line feeds.
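The whitespace dictionaries in the comparison can be written as plain predicates, which makes the differences easier to see. The function names below are hypothetical and simply mirror the table; they are not taken from any of the three crates. The single-character entries listed for whitespace-sifter happen to be the same set that the standard library's `char::is_ascii_whitespace` accepts, while `char::is_whitespace` follows the Unicode `White_Space` property.

```rust
/// Hypothetical predicate mirroring the single-character entries listed for
/// whitespace-sifter (the "\r\n" pair is a sequence, so it is not covered here).
fn in_sifter_dictionary(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\x0C' | '\r' | ' ')
}

/// Hypothetical predicate mirroring the entries listed for
/// fast_whitespace_collapse.
fn in_fast_collapse_dictionary(c: char) -> bool {
    matches!(c, ' ' | '\t')
}
```

A vertical tab (`'\x0B'`) or a no-break space (`'\u{00A0}'`) is whitespace for `char::is_whitespace` but falls outside both predicates, and a line feed is outside the second one, which is consistent with disclaimer 3.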


⚡️ Benchmarks

Performance is a priority; most updates are performance improvements.
The benchmark uses a transcript of the Bee Movie.

Execute these commands to benchmark:

$ git clone https://github.com/JumperBot/whitespace-sifter.git
$ cd whitespace-sifter/bench
$ cargo bench

Only the results that look like the following are relevant:

Sift/Sift               time:   [178.69 µs 178.84 µs 179.03 µs]
Sift Preserved/Sift Preserved
                        time:   [179.61 µs 179.75 µs 179.90 µs]

That is under 0.0002 seconds. Pretty impressive, no?

Go try it on a better machine, I guess. Benchmark specifications:
  • Processor: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz 1.90 GHz
  • Memory: RAM 16.0 GB (15.8 GB usable)
  • System: GNU/Linux 5.15.153.1-microsoft-standard-WSL2 x86_64
  • Modified: v2.3.4
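
For a quick sanity check on your own machine without the benchmark suite, a rough timing loop using only the standard library could look like the sketch below. `mean_time` and `collapse` are hypothetical names; `collapse` is a stand-in workload rather than the crate's code, and a plain mean like this is far cruder than the statistics reported by `cargo bench`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: mean wall-clock time of `work` over `runs` calls.
fn mean_time<T>(runs: u32, mut work: impl FnMut() -> T) -> Duration {
    assert!(runs > 0, "runs must be positive");
    let start = Instant::now();
    for _ in 0..runs {
        // black_box keeps the optimizer from discarding the result.
        black_box(work());
    }
    start.elapsed() / runs
}

/// Stand-in workload (an assumption, not whitespace-sifter itself).
fn collapse(input: &str) -> String {
    input.split_ascii_whitespace().collect::<Vec<_>>().join(" ")
}
```

With the crate added as a dependency, the closure passed to `mean_time` could call `.sift()` on the input instead of the stand-in, for example `mean_time(1_000, || black_box(&text).sift())`.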

➕ Dependency

Add this to your project with:

$ cargo add whitespace-sifter

📦️ Installation

Install the binary with:

$ cargo install whitespace-sifter

Then pipe text through it:

$ echo "Hello    there!" | whitespace-sifter
$ cat document.txt | whitespace-sifter --preserve-newlines

🔊 Changelog

  • Improved Performance
  • Minimum Supported Rust Version set to v1.79.0 (starting v2.3.3)
  • Crate binary (starting v2.3.6)
  • Stricter Tests (starting v2.3.2)
    • Proper UTF-8/Unicode Encoding
    • Regular Sifting
    • Sifting With Leading Whitespaces
    • Documentation Assertion
    • MSRV Verification
    • Compliance Check for Old Versions
  • Crate Comparison (starting v2.3.4)
  • Benchmark Separation (starting v2.3.5)

📄 Licensing

whitespace-sifter is licensed under the MIT License; this is the summary.
