grapheme-breaker

WebJar for grapheme-breaker

License: MIT
GroupId: org.webjars.npm
ArtifactId: grapheme-breaker
Last Version: 0.3.2
Release Date: (not specified)
Type: jar
Description: grapheme-breaker (WebJar for grapheme-breaker)
Project URL: http://webjars.org
Source Code Management: https://github.com/foliojs/grapheme-breaker


How to add to project

Maven:

<!-- https://jarcasting.com/artifacts/org.webjars.npm/grapheme-breaker/ -->
<dependency>
    <groupId>org.webjars.npm</groupId>
    <artifactId>grapheme-breaker</artifactId>
    <version>0.3.2</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/org.webjars.npm/grapheme-breaker/
implementation 'org.webjars.npm:grapheme-breaker:0.3.2'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/org.webjars.npm/grapheme-breaker/
implementation ("org.webjars.npm:grapheme-breaker:0.3.2")

Buildr:

'org.webjars.npm:grapheme-breaker:jar:0.3.2'

Ivy:

<dependency org="org.webjars.npm" name="grapheme-breaker" rev="0.3.2">
  <artifact name="grapheme-breaker" type="jar" />
</dependency>

Grape (Groovy):

@Grapes(
@Grab(group='org.webjars.npm', module='grapheme-breaker', version='0.3.2')
)

SBT:

libraryDependencies += "org.webjars.npm" % "grapheme-breaker" % "0.3.2"

Leiningen:

[org.webjars.npm/grapheme-breaker "0.3.2"]

Dependencies

compile (2)

Group / Artifact                  Type   Version
org.webjars.npm : brfs            jar    [1.2.0,2)
org.webjars.npm : unicode-trie    jar    [0.3.1,0.4)

Project Modules

There are no modules declared in this project.

grapheme-breaker

A JavaScript implementation of the Unicode grapheme cluster breaking algorithm (UAX #29)

It is important to recognize that what the user thinks of as a “character” (a basic unit of a writing system for a language) may not be just a single Unicode code point. Instead, that basic unit may be made up of multiple Unicode code points. To avoid ambiguity with the computer use of the term character, this is called a user-perceived character. For example, “G” + acute-accent is a user-perceived character: users think of it as a single character, yet it is actually represented by two Unicode code points. These user-perceived characters are approximated by what is called a grapheme cluster, which can be determined programmatically.
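As a rough illustration of the distinction above, the following sketch counts the same “G” + acute-accent string three ways. It does not use grapheme-breaker; it relies on the built-in Intl.Segmenter (assumed to be available, as in recent Node.js releases and modern browsers) purely as a stand-in for grapheme cluster segmentation, and the variable names are illustrative.

```javascript
// "G" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived character.
const text = 'G\u0301';

// String length counts UTF-16 code units.
const codeUnits = text.length;

// Spreading a string iterates by Unicode code point.
const codePoints = [...text].length;

// Intl.Segmenter is a JavaScript built-in, not part of grapheme-breaker.
// With granularity 'grapheme' it yields one segment per grapheme cluster.
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
const clusters = [...segmenter.segment(text)].length;

console.log({ codeUnits, codePoints, clusters });
// { codeUnits: 2, codePoints: 2, clusters: 1 }
```

Both the code unit and code point counts report two, while the grapheme count matches what a reader would see on screen.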

Installation

You can install it via npm:

npm install grapheme-breaker

Example

var GraphemeBreaker = require('grapheme-breaker');

// break a string into an array of grapheme clusters
GraphemeBreaker.break('Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ͫ͗͢L̠ͨͧͩ͘G̴̻͈͍͔̹̑͗̎̅͛́Ǫ̵̹̻̝̳͂̌̌͘!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞') // => ['Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍', 'A̴̵̜̰͔ͫ͗͢', 'L̠ͨͧͩ͘', 'G̴̻͈͍͔̹̑͗̎̅͛́', 'Ǫ̵̹̻̝̳͂̌̌͘', '!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞']


// or just count the number of grapheme clusters in a string
GraphemeBreaker.countBreaks('Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ͫ͗͢L̠ͨͧͩ͘G̴̻͈͍͔̹̑͗̎̅͛́Ǫ̵̹̻̝̳͂̌̌͘!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞') // => 6


// use nextBreak and previousBreak to get break points starting
// from anywhere in the string
GraphemeBreaker.nextBreak('😜🇺🇸👍', 3) // => 6
GraphemeBreaker.previousBreak('😜🇺🇸👍', 3) // => 2
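The results 6 and 2 in the last two calls are consistent with the index arguments and return values being UTF-16 code unit offsets, which is an assumption here rather than something the text above states. The sketch below uses plain JavaScript (no grapheme-breaker calls) to list the offsets occupied by each code point of the same string, written with escapes so the code points are explicit.

```javascript
// Same string as in the example above, written with code point escapes:
// U+1F61C (face), U+1F1FA + U+1F1F8 (regional indicators U and S),
// U+1F44D (thumbs up).
const str = '\u{1F61C}\u{1F1FA}\u{1F1F8}\u{1F44D}';

// Record the UTF-16 offsets occupied by each code point.
// for...of iterates by code point; ch.length is its size in code units.
const spans = [];
let offset = 0;
for (const ch of str) {
  spans.push({ start: offset, end: offset + ch.length });
  offset += ch.length;
}

console.log(str.length); // 8
console.log(spans);
// [ { start: 0, end: 2 }, { start: 2, end: 4 },
//   { start: 4, end: 6 }, { start: 6, end: 8 } ]
```

The two regional indicators together form a single flag cluster spanning offsets 2 to 6, so index 3 falls inside it: the next break is at 6 and the previous break is at 2.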

Development Notes

You shouldn't need to know any of this to use the library, but if you're interested in contributing or fixing bugs, the following notes may be useful.

  • The src/classes.trie file is automatically generated from GraphemeBreakProperty.txt in the Unicode database by src/generate_data.js. It should be rare that you need to run this, but you may if, for instance, you want to change the Unicode version.

  • You can run the tests using npm test. They are written using mocha and generated from GraphemeBreakTest.txt in the Unicode database. That file is included in the repository for performance reasons when running the tests.

License

MIT


Versions

Version
0.3.2