Implement detect restriction level #9

Merged
merged 2 commits into from
Jan 2, 2020
1 change: 1 addition & 0 deletions src/lib.rs
@@ -59,6 +59,7 @@ pub use tables::UNICODE_VERSION;

pub mod mixed_script;
pub mod general_security_profile;
pub mod restriction_level;

pub use mixed_script::MixedScript;
pub use general_security_profile::GeneralSecurityProfile;
1 change: 1 addition & 0 deletions src/mixed_script.rs
@@ -5,6 +5,7 @@ use unicode_script::{Script, ScriptExtension};
/// An Augmented script set, as defined by UTS 39
///
/// https://www.unicode.org/reports/tr39/#def-augmented-script-set
#[derive(Copy, Clone, PartialEq, Debug, Hash)]
pub struct AugmentedScriptSet {
    /// The base ScriptExtension value
    pub base: ScriptExtension,
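The augmented script set above follows UTS 39: each character contributes a set of scripts, and a string's resolved set is the intersection of those sets. The following is a minimal, self-contained sketch of that idea using a bitmask. It is not the crate's `AugmentedScriptSet`; the constants, the `augmented_script_set` and `resolved_script_set` helpers, and the tiny code point table are all hypothetical stand-ins for real Script_Extensions data.

```rust
// Hypothetical bit flags for a handful of scripts (not the crate's types).
const LATIN: u16 = 1 << 0;
const CYRILLIC: u16 = 1 << 1;
const HAN: u16 = 1 << 2;
const HIRAGANA: u16 = 1 << 3;
const KATAKANA: u16 = 1 << 4;
// Synthetic multi-script entries added by the UTS 39 augmentation step.
const HANB: u16 = 1 << 5;
const JPAN: u16 = 1 << 6;
const KORE: u16 = 1 << 7;
// Common/Inherited characters are treated as matching every script.
const ALL: u16 = u16::MAX;

/// Crude stand-in for real Unicode data: only a few ranges are classified,
/// and everything else is assumed to be Common (an assumption of this sketch).
fn augmented_script_set(ch: char) -> u16 {
    match ch {
        'A'..='Z' | 'a'..='z' => LATIN,
        '\u{0410}'..='\u{044F}' => CYRILLIC,
        '\u{3041}'..='\u{3096}' => HIRAGANA | JPAN,
        '\u{30A1}'..='\u{30FA}' => KATAKANA | JPAN,
        '\u{4E00}'..='\u{9FFF}' => HAN | HANB | JPAN | KORE,
        _ => ALL,
    }
}

/// Intersect the per-character sets, starting from the "all scripts" set.
fn resolved_script_set(s: &str) -> u16 {
    s.chars().fold(ALL, |acc, ch| acc & augmented_script_set(ch))
}

/// A string is single-script when its resolved set is non-empty.
fn is_single_script(s: &str) -> bool {
    resolved_script_set(s) != 0
}
```

In this sketch, a Latin word with one Cyrillic letter swapped in resolves to the empty set, while Han mixed with Hiragana still resolves to the synthetic `JPAN` entry, which is why the augmentation step exists.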
75 changes: 75 additions & 0 deletions src/restriction_level.rs
@@ -0,0 +1,75 @@
//! For detecting the [restriction level](https://www.unicode.org/reports/tr39/#Restriction_Level_Detection)
//! a string conforms to

use crate::mixed_script::AugmentedScriptSet;
use unicode_script::{Script, ScriptExtension};
use crate::GeneralSecurityProfile;

#[derive(Copy, Clone, PartialEq, PartialOrd, Eq, Ord, Debug, Hash)]
/// The [Restriction level](https://www.unicode.org/reports/tr39/#Restriction_Level_Detection)
/// a string conforms to
pub enum RestrictionLevel {
    /// https://www.unicode.org/reports/tr39/#ascii_only
    ASCIIOnly,
    /// https://www.unicode.org/reports/tr39/#single_script
    SingleScript,
    /// https://www.unicode.org/reports/tr39/#highly_restrictive
    HighlyRestrictive,
    /// https://www.unicode.org/reports/tr39/#moderately_restrictive
    ModeratelyRestrictive,
    /// https://www.unicode.org/reports/tr39/#minimally_restrictive
    MinimallyRestrictive,
    /// https://www.unicode.org/reports/tr39/#unrestricted
    Unrestricted,
}

/// Utilities for determining which [restriction level](https://www.unicode.org/reports/tr39/#Restriction_Level_Detection)
/// a string satisfies
pub trait RestrictionLevelDetection: Sized {
    /// Detect the [restriction level](https://www.unicode.org/reports/tr39/#Restriction_Level_Detection)
    ///
    /// This will _not_ check identifier well-formedness, as different applications may have different notions of well-formedness
    fn detect_restriction_level(self) -> RestrictionLevel;

    /// Check if a string satisfies the supplied [restriction level](https://www.unicode.org/reports/tr39/#Restriction_Level_Detection)
    ///
    /// This will _not_ check identifier well-formedness, as different applications may have different notions of well-formedness
    fn check_restriction_level(self, level: RestrictionLevel) -> bool {
        self.detect_restriction_level() <= level
    }
}

impl RestrictionLevelDetection for &'_ str {
    fn detect_restriction_level(self) -> RestrictionLevel {
        let mut ascii_only = true;
        // Resolved script set of the whole string.
        let mut set = AugmentedScriptSet::default();
        // Resolved script set over characters whose scripts do not include Latin.
        let mut exclude_latin_set = AugmentedScriptSet::default();
        for ch in self.chars() {
            if !GeneralSecurityProfile::identifier_allowed(ch) {
                return RestrictionLevel::Unrestricted;
            }
            // A single non-ASCII character rules out ASCIIOnly.
            if !ch.is_ascii() {
                ascii_only = false;
            }
            let ch_set = ch.into();
            set.intersect_with(ch_set);
            if !ch_set.base.contains_script(Script::Latin) {
                exclude_latin_set.intersect_with(ch_set);
            }
        }

        if ascii_only {
            return RestrictionLevel::ASCIIOnly;
        } else if !set.is_empty() {
            return RestrictionLevel::SingleScript;
        } else if exclude_latin_set.kore || exclude_latin_set.hanb || exclude_latin_set.jpan {
            return RestrictionLevel::HighlyRestrictive;
        } else if let ScriptExtension::Single(script) = exclude_latin_set.base {
            if script.is_recommended() && script != Script::Cyrillic && script != Script::Greek {
                return RestrictionLevel::ModeratelyRestrictive;
            }
        }
        return RestrictionLevel::MinimallyRestrictive;
    }
}
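The `check_restriction_level` default method relies on the derived `Ord` of `RestrictionLevel`: for an enum, the derive orders variants by declaration order, so stricter levels compare as smaller. The sketch below reproduces that comparison in isolation. `Level` and `satisfies` are hypothetical names for illustration and are not part of the crate.

```rust
// Hypothetical copy of the variant list, declared from strictest to loosest.
// The derived PartialOrd/Ord follow this declaration order.
#[derive(Copy, Clone, PartialEq, PartialOrd, Eq, Ord, Debug, Hash)]
enum Level {
    ASCIIOnly,
    SingleScript,
    HighlyRestrictive,
    ModeratelyRestrictive,
    MinimallyRestrictive,
    Unrestricted,
}

/// A detected level satisfies an allowed level when it is at least as strict,
/// which is the same `<=` comparison the trait's default method uses.
fn satisfies(detected: Level, allowed: Level) -> bool {
    detected <= allowed
}

/// Sorting puts the strictest levels first because of the derived ordering.
fn strictest_first(mut levels: Vec<Level>) -> Vec<Level> {
    levels.sort();
    levels
}
```

One consequence of this design is that reordering the enum variants would silently change what `check_restriction_level` accepts, so the declaration order is part of the API's behaviour.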