Unicode TR39 Confusables

Skeleton algorithm from Unicode TR39 for testing confusability of strings

GroupId: com.github.mpkorstanje
ArtifactId: tr39-confusables
Last Version: 8.0.10
Type: pom
Description: Unicode TR39 Confusables. Skeleton algorithm from Unicode TR39 for testing confusability of strings
Project URL: https://github.com/mpkorstanje/tr39-confusables
Source Code Management: https://github.com/mpkorstanje/tr39-confusables

How to add to project

Maven:

<!-- https://jarcasting.com/artifacts/com.github.mpkorstanje/tr39-confusables/ -->
<dependency>
    <groupId>com.github.mpkorstanje</groupId>
    <artifactId>tr39-confusables</artifactId>
    <version>8.0.10</version>
    <type>pom</type>
</dependency>

Gradle (Groovy):

// https://jarcasting.com/artifacts/com.github.mpkorstanje/tr39-confusables/
implementation 'com.github.mpkorstanje:tr39-confusables:8.0.10'

Gradle (Kotlin):

// https://jarcasting.com/artifacts/com.github.mpkorstanje/tr39-confusables/
implementation("com.github.mpkorstanje:tr39-confusables:8.0.10")

Buildr:

'com.github.mpkorstanje:tr39-confusables:pom:8.0.10'

Ivy:

<dependency org="com.github.mpkorstanje" name="tr39-confusables" rev="8.0.10">
  <artifact name="tr39-confusables" type="pom" />
</dependency>

Grape:

@Grapes(
@Grab(group='com.github.mpkorstanje', module='tr39-confusables', version='8.0.10')
)

sbt:

libraryDependencies += "com.github.mpkorstanje" % "tr39-confusables" % "8.0.10"

Leiningen:

[com.github.mpkorstanje/tr39-confusables "8.0.10"]

Dependencies

There are no dependencies for this project. It is a standalone project that does not depend on any other jars.

Project Modules

  • tr39-confusables-skeleton
  • tr39-confusables-table-generator-maven-plugin

tr39-confusables

Skeleton algorithm from Unicode TR39 for testing confusability of strings.

Version 8.0.10 matches version 8.0 draft 10 of TR39.

Usage

import static com.github.mpkorstanje.unicode.tr39confusables.Skeleton.skeleton;
...
// Skeleton representations of Unicode strings containing
// confusable characters are equal
skeleton("paypal").equals(skeleton("paypal")); // true
skeleton("paypal").equals(skeleton("𝔭𝒶ỿ𝕡𝕒ℓ")); // true
skeleton("paypal").equals(skeleton("ρ⍺у𝓅𝒂ן")); // true
skeleton("ρ⍺у𝓅𝒂ן").equals(skeleton("𝔭𝒶ỿ𝕡𝕒ℓ")); // true

// The skeleton representation does not transform case
skeleton("payPal").equals(skeleton("paypal")); // false

// The skeleton representation does not remove diacritics
skeleton("paypal").equals(skeleton("pàỳpąl")); // false
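
Because the skeleton neither folds case nor strips diacritics, a caller who wants such variants to match has to fold the input first. The following is a minimal sketch of one way to do that with the JDK's java.text.Normalizer. The class FoldBeforeSkeleton and its method fold are hypothetical names, not part of this library, and whether this kind of folding is appropriate depends on the application.

```java
import java.text.Normalizer;
import java.util.Locale;

class FoldBeforeSkeleton {

    // Hypothetical helper (not part of tr39-confusables): lowercases the
    // input and removes combining marks such as accents.
    static String fold(String input) {
        // NFD splits precomposed characters into base letter + combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches all Unicode mark characters; drop them.
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        // Locale.ROOT avoids locale-specific lowercasing rules.
        return withoutMarks.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(fold("payPal"));                    // paypal
        // "p\u00e0\u1ef3p\u0105l" is "pàỳpąl" written with Unicode escapes.
        System.out.println(fold("p\u00e0\u1ef3p\u0105l"));     // paypal
    }
}
```

The folded string would then be passed to skeleton(...) in place of the raw input, so that both sides of the comparison are folded the same way.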

Note on the use of Skeleton, from TR39:

A skeleton is intended only for internal use for testing confusability of strings; the resulting text is not suitable for display to users, because it will appear to be a hodgepodge of different scripts. In particular, the result of mapping an identifier will not necessarily be an identifier. Thus the confusability mappings can be used to test whether two identifiers are confusable (if their skeletons are the same), but should definitely not be used as a "normalization" of identifiers.
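
One way to follow this advice is to use the skeleton only as an internal lookup key while storing and displaying the original identifier. The sketch below illustrates the idea under stated assumptions: IdentifierRegistry is a hypothetical class, and toySkeleton is a three-character stand-in used only so the example runs without the library. In real use, Skeleton::skeleton from this project would be passed to the constructor instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.UnaryOperator;

class IdentifierRegistry {

    private final UnaryOperator<String> skeleton;
    // Key: skeleton (internal only). Value: the identifier as the user wrote it.
    private final Map<String, String> bySkeleton = new HashMap<>();

    IdentifierRegistry(UnaryOperator<String> skeleton) {
        this.skeleton = skeleton;
    }

    // Returns the registered identifier that shares a skeleton with the candidate, if any.
    Optional<String> findConfusable(String candidate) {
        return Optional.ofNullable(bySkeleton.get(skeleton.apply(candidate)));
    }

    // Registers the identifier unless its skeleton is already taken.
    boolean register(String identifier) {
        return bySkeleton.putIfAbsent(skeleton.apply(identifier), identifier) == null;
    }

    // Toy stand-in for the library's skeleton function (an assumption for this
    // sketch): maps Cyrillic er, a and u to the Latin letters they resemble.
    static String toySkeleton(String s) {
        return s.replace('\u0440', 'p').replace('\u0430', 'a').replace('\u0443', 'y');
    }

    public static void main(String[] args) {
        IdentifierRegistry registry = new IdentifierRegistry(IdentifierRegistry::toySkeleton);
        System.out.println(registry.register("paypal"));                 // true
        // "\u0440\u0430ypal" starts with Cyrillic er and a.
        System.out.println(registry.register("\u0440\u0430ypal"));       // false
        System.out.println(registry.findConfusable("\u0440\u0430ypal").orElse("none")); // paypal
    }
}
```

The skeleton string never leaves the registry; callers only ever see the identifiers that were originally registered.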

Maven

<dependency>
  <groupId>com.github.mpkorstanje</groupId>
  <artifactId>tr39-confusables-skeleton</artifactId>
  <version>8.0.10</version>
</dependency>

Versions

Version
8.0.10
0.5.0