Tested | ✓ |
Language | Swift |
License | MIT |
Last Release | Sep 2017 |
Swift Version | 4.0.0 |
Supports SPM | ✓ |
Maintained by ukitaka.
An extension of the Swift String API for dealing with East Asian Width. The most common use case is classifying a Unicode scalar value as Fullwidth (全角) or Halfwidth (半角).
// Halfwidth Katakana (半角カナ)
"ｱｲｳｴｵ".unicodeScalars.forEach { (u: UnicodeScalar) in
u.isHalfwidth // true
}
// Fullwidth Katakana (全角カナ)
"アイウエオ".unicodeScalars.forEach { (u: UnicodeScalar) in
u.isFullwidth // true
}
East Asian Width is specified in Unicode® Standard Annex #11.
UnicodeScalar Extensions
For East Asian Width, this library provides the following properties:
/// East Asian Wide (W)
/// See: http://unicode.org/reports/tr11/#ED4
unicodeScalar.isEastAsianWide
/// East Asian Narrow (Na)
/// See: http://unicode.org/reports/tr11/#ED5
unicodeScalar.isEastAsianNarrow
/// Neutral (Not East Asian):
/// See: http://unicode.org/reports/tr11/#ED7
unicodeScalar.isEastAsianNeutral
/// East Asian Halfwidth (H)
/// See: http://unicode.org/reports/tr11/#ED3
unicodeScalar.isEastAsianHalfwidth
/// East Asian Fullwidth (F)
/// See: http://unicode.org/reports/tr11/#ED2
unicodeScalar.isEastAsianFullwidth
/// East Asian Ambiguous (A)
/// See: http://unicode.org/reports/tr11/#ED6
unicodeScalar.isEastAsianAmbiguous
If you only need to know whether a scalar is Fullwidth (全角) or Halfwidth (半角), you can use isFullwidth and the related properties below.
// Fullwidth
unicodeScalar.isFullwidth
// Halfwidth
unicodeScalar.isHalfwidth
// NOTE:
// `isFullwidth` and `isHalfwidth` do not include East Asian Ambiguous.
// To include it, use `isFullwidthOrAmbiguous` / `isHalfwidthOrAmbiguous` instead.
unicodeScalar.isFullwidthOrAmbiguous
unicodeScalar.isHalfwidthOrAmbiguous
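To illustrate how the six East Asian Width categories can collapse into a Fullwidth / Halfwidth answer, here is a minimal, self-contained sketch. It is not the library's implementation: `WidthCategory`, `widthCategory(of:)`, and the free functions `isFullwidth` / `isHalfwidth` are hypothetical names, the table covers only a few ranges from EastAsianWidth.txt, and the grouping (F and W as Fullwidth, H and Na as Halfwidth) is an assumption about how such a mapping might be defined.

```swift
// Hypothetical sketch; not the implementation used by EastAsianWidth.swift.
enum WidthCategory {
    case fullwidth, wide, halfwidth, narrow, ambiguous, neutral
}

/// Classifies a scalar using a deliberately tiny subset of EastAsianWidth.txt.
func widthCategory(of scalar: Unicode.Scalar) -> WidthCategory {
    switch scalar.value {
    case 0x0020...0x007E: return .narrow      // printable ASCII (Na)
    case 0x3041...0x3096: return .wide        // Hiragana letters (W)
    case 0x30A1...0x30FA: return .wide        // Katakana letters (W)
    case 0xFF01...0xFF60: return .fullwidth   // Fullwidth forms (F)
    case 0xFF61...0xFF9F: return .halfwidth   // Halfwidth Katakana (H)
    case 0x00A1, 0x00A7, 0x03B1...0x03C1:
        return .ambiguous                     // e.g. ¡, §, Greek α to ρ (A)
    default:
        return .neutral                       // fallback for this sketch only
    }
}

// Assumption: Fullwidth means F or W; Ambiguous counts only on request.
func isFullwidth(_ scalar: Unicode.Scalar, includingAmbiguous: Bool = false) -> Bool {
    switch widthCategory(of: scalar) {
    case .fullwidth, .wide: return true
    case .ambiguous: return includingAmbiguous
    default: return false
    }
}

// Assumption: Halfwidth means H or Na; Ambiguous counts only on request.
func isHalfwidth(_ scalar: Unicode.Scalar, includingAmbiguous: Bool = false) -> Bool {
    switch widthCategory(of: scalar) {
    case .halfwidth, .narrow: return true
    case .ambiguous: return includingAmbiguous
    default: return false
    }
}

print(isFullwidth("\u{3042}"))                           // true  (Hiragana あ, W)
print(isHalfwidth("\u{FF71}"))                           // true  (Halfwidth Katakana, H)
print(isFullwidth("\u{03B1}"))                           // false (Greek α, A)
print(isFullwidth("\u{03B1}", includingAmbiguous: true)) // true
```

Keeping Ambiguous as its own case until the last step mirrors the split between `isFullwidth` and `isFullwidthOrAmbiguous` described in the note above: the caller decides how ambiguous characters should be treated.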
String Extensions
The String extension provides containsXXX properties that check whether a string contains characters of a specific East Asian Width.
// East Asian Width
string.containsEastAsianWideCharacters
string.containsEastAsianNarrowCharacters
string.containsEastAsianNeutralCharacters
string.containsEastAsianHalfwidthCharacters
string.containsEastAsianFullwidthCharacters
string.containsEastAsianAmbiguousCharacters
// Fullwidth or Halfwidth
string.containsFullwidthCharacters
string.containsFullwidthOrAmbiguousCharacters
string.containsHalfwidthCharacters
string.containsHalfwidthOrAmbiguousCharacters
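A check of this kind can be expressed as "does any Unicode scalar in the string satisfy a predicate". The sketch below shows that idea with the standard library's `contains(where:)`. The helper `containsScalar(where:)` and the predicate `isHalfwidthKatakana` are hypothetical stand-ins, not part of EastAsianWidth.swift, and the predicate covers only the range U+FF61...U+FF9F.

```swift
extension String {
    /// Hypothetical helper (not part of EastAsianWidth.swift):
    /// true if at least one Unicode scalar satisfies `predicate`.
    func containsScalar(where predicate: (Unicode.Scalar) -> Bool) -> Bool {
        return unicodeScalars.contains(where: predicate)
    }
}

// Assumed stand-in predicate: Halfwidth Katakana and related marks (H).
let halfwidthKatakana: ClosedRange<UInt32> = 0xFF61...0xFF9F

func isHalfwidthKatakana(_ scalar: Unicode.Scalar) -> Bool {
    return halfwidthKatakana.contains(scalar.value)
}

let mixed = "abc\u{FF71}"   // ASCII followed by Halfwidth Katakana A
let ascii = "abc"

print(mixed.containsScalar(where: isHalfwidthKatakana)) // true
print(ascii.containsScalar(where: isHalfwidthKatakana)) // false
```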
UnicodeScalarView Extensions
The UnicodeScalarView extension provides a countByEastAsianWidth method that counts string length by East Asian Width.
By default, Ambiguous characters are treated as Halfwidth, a Halfwidth character counts as 1, and a Fullwidth character counts as 2.
You can configure these with parameters.
// count with the default settings
"あいうえおｱｲｳｴｵ".unicodeScalars.countByEastAsianWidth() // 15
// you can configure the widths with parameters
string.unicodeScalars.countByEastAsianWidth(halfwidthAs: 2, fullwidthAs: 4, markEastAsianAmbiguousAsFullwidth: false)
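The arithmetic behind such a count can be sketched as a sum over scalars: each Fullwidth scalar contributes one weight and everything else contributes another. The code below is a hypothetical illustration, not the library's code. `isWideOrFullwidth` and `displayWidth(of:halfwidthAs:fullwidthAs:)` are invented names, only a few ranges are covered, and Ambiguous characters are simply counted as Halfwidth.

```swift
// Hypothetical sketch; names and ranges are illustrative only.

/// True for a few Wide (W) and Fullwidth (F) ranges from EastAsianWidth.txt.
func isWideOrFullwidth(_ scalar: Unicode.Scalar) -> Bool {
    switch scalar.value {
    case 0x3041...0x3096,   // Hiragana letters (W)
         0x30A1...0x30FA,   // Katakana letters (W)
         0x4E00...0x9FFF,   // CJK Unified Ideographs (W)
         0xFF01...0xFF60:   // Fullwidth forms (F)
        return true
    default:
        return false        // treated as Halfwidth in this sketch
    }
}

/// Sums a per-scalar weight: `fullwidthAs` for wide scalars, `halfwidthAs` otherwise.
func displayWidth(of text: String, halfwidthAs: Int = 1, fullwidthAs: Int = 2) -> Int {
    return text.unicodeScalars.reduce(0) { total, scalar in
        total + (isWideOrFullwidth(scalar) ? fullwidthAs : halfwidthAs)
    }
}

// Five Hiragana followed by five Halfwidth Katakana (U+FF71...U+FF75).
let sample = "あいうえお\u{FF71}\u{FF72}\u{FF73}\u{FF74}\u{FF75}"

print(displayWidth(of: sample))                                 // 15 = 5 * 2 + 5 * 1
print(displayWidth(of: sample, halfwidthAs: 2, fullwidthAs: 4)) // 30 = 5 * 4 + 5 * 2
```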
Why is CharacterSet not supported?
The main reason is a technical problem with CharacterSet.
We cannot create a union of CharacterSet values that contain characters of different byte lengths.
let c1 = CharacterSet(charactersIn: "\u{AAAA}")
let c2 = CharacterSet(charactersIn: "\u{AAAAA}")
c2.contains("\u{AAAAA}") // true
c1.union(c2).contains("\u{AAAAA}") // false 😫
But some East Asian Width definitions include characters of different byte lengths.
So I cannot support CharacterSet…
EastAsianWidth.swift requires / supports the following environments: