parse-entities
Parse HTML character references: fast, spec-compliant, positional information.
Install
npm:
npm install parse-entities
Use
var decode = require('parse-entities')
decode('alpha &amp bravo')
// => alpha & bravo
decode('charlie &copycat; delta')
// => charlie ©cat; delta
decode('echo &copy; foxtrot &#8800; golf &#x1D306; hotel')
// => echo © foxtrot ≠ golf 𝌆 hotel
API
parseEntities(value[, options])
options
options.additional
Additional character to accept (string?, default: ''). This allows other characters, without error, when following an ampersand.
options.attribute
Whether to parse value as an attribute value (boolean?, default: false).
options.nonTerminated
Whether to allow non-terminated entities (boolean, default: true). For example, &copycat for ©cat. This behavior is spec-compliant but can lead to unexpected results.
options.warning
Error handler (Function?).
options.text
Text handler (Function?).
options.reference
Reference handler (Function?).
options.warningContext
Context used when invoking warning ('*', optional).
options.textContext
Context used when invoking text ('*', optional).
options.referenceContext
Context used when invoking reference ('*', optional).
options.position
Starting position of value (Location or Position, optional). Useful when dealing with values nested in some sort of syntax tree. The default is:
{
start: {line: 1, column: 1, offset: 0},
indent: []
}
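When the value comes from a node in a syntax tree, the starting position can be copied from that node. The following is a minimal sketch: the node shape (`value` plus `position.start`) and the helper `optionsFor` are hypothetical and only serve the illustration; just the `position` option and its default shape come from this document.

```js
// Hypothetical node from some syntax tree. Its shape is an assumption
// made for illustration and is not defined by parse-entities.
var node = {
  value: 'echo &copy; foxtrot',
  position: {start: {line: 3, column: 7, offset: 42}}
}

// Hypothetical helper: build an options object whose `position` has the
// same shape as the documented default (`start` and `indent`), but
// begins where the node begins.
function optionsFor(node) {
  var start = node.position.start
  return {
    position: {
      start: {line: start.line, column: start.column, offset: start.offset},
      indent: []
    }
  }
}

var options = optionsFor(node)
console.log(options.position.start.line) // 3

// With the package installed, the options would be passed along like so:
// var decode = require('parse-entities')
// decode(node.value, options)
```

Copying the fields, rather than reusing the node's own object, keeps the tree untouched if the position object is modified later.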
Returns
string: the decoded value.
function warning(reason, position, code)
Error handler.
Context
this refers to warningContext when given to parseEntities.
Parameters
reason
Human-readable reason for the error (string).
position
Place at which the parse error occurred (Position).
code
Machine-readable code for the error (number).
The following codes are used:
| Code | Example | Note |
|---|---|---|
| 1 | foo &amp bar | Missing semicolon (named) |
| 2 | foo &#123 bar | Missing semicolon (numeric) |
| 3 | Foo &bar baz | Ampersand did not start a reference |
| 4 | Foo &# | Empty reference |
| 5 | Foo &bar; baz | Unknown entity |
| 6 | Foo &#128; baz | Disallowed reference |
| 7 | Foo &#xD800; baz | Prohibited: outside permissible unicode range |
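A warning handler can use the code to look up the matching note and collect messages on its context. The following is a minimal sketch, not the library's own implementation: the `notes` map, the `warningContext` object, and the simulated call are assumptions for illustration, and the reason string and position in that call are made up.

```js
// Notes keyed by the codes listed in the table above.
var notes = {
  1: 'Missing semicolon (named)',
  2: 'Missing semicolon (numeric)',
  3: 'Ampersand did not start a reference',
  4: 'Empty reference',
  5: 'Unknown entity',
  6: 'Disallowed reference',
  7: 'Prohibited: outside permissible unicode range'
}

// Hypothetical context object. It would be given as `warningContext`.
var warningContext = {messages: []}

// Handler with the documented signature. `this` is the context above.
function warning(reason, position, code) {
  this.messages.push({
    code: code,
    note: notes[code] || 'Unknown code',
    reason: reason,
    line: position.line,
    column: position.column
  })
}

// Simulated call, standing in for the parser. The arguments are made up.
warning.call(warningContext, 'Example reason', {line: 1, column: 9, offset: 8}, 1)

console.log(warningContext.messages[0].note) // Missing semicolon (named)

// With the package installed, the handler would be wired up like so:
// var decode = require('parse-entities')
// decode('foo &amp bar', {warning: warning, warningContext: warningContext})
```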
function text(value, location)
Text handler.
Context
this refers to textContext when given to parseEntities.
Parameters
value
String of content (string).
location
Location at which value starts and ends (Location).
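A text handler can record each run of plain text together with where it starts and ends. The following is a minimal sketch: the `textContext` object and the simulated call are assumptions for illustration, and the location is assumed to carry `start` and `end` points shaped like the `start` of the default position (`line`, `column`, `offset`).

```js
// Hypothetical context object. It would be given as `textContext`.
var textContext = {chunks: []}

// Handler with the documented signature. `this` is the context above.
function text(value, location) {
  this.chunks.push({
    value: value,
    startOffset: location.start.offset,
    endOffset: location.end.offset
  })
}

// Simulated call, standing in for the parser. The arguments are made up.
text.call(textContext, 'alpha ', {
  start: {line: 1, column: 1, offset: 0},
  end: {line: 1, column: 7, offset: 6}
})

console.log(textContext.chunks.length) // 1

// With the package installed, the handler would be wired up like so:
// var decode = require('parse-entities')
// decode('alpha &amp bravo', {text: text, textContext: textContext})
```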
function reference(value, location, source)
Character reference handler.
Context
this refers to referenceContext when given to parseEntities.
Parameters
value
Decoded character reference (string).
location
Location at which value starts and ends (Location).
source
Raw source of the character reference (string).
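A reference handler can collect every character reference that was found, for example to report them afterwards. The following is a minimal sketch: the `referenceContext` object and the simulated call are assumptions for illustration. In particular, passing the decoded character as `value` and the raw text as `source` is an assumption about what the parser hands over, and the location is made up.

```js
// Hypothetical context object. It would be given as `referenceContext`.
var referenceContext = {references: []}

// Handler with the documented signature. `this` is the context above.
function reference(value, location, source) {
  this.references.push({
    value: value,
    source: source,
    // Length of the span in the input, derived from the offsets.
    length: location.end.offset - location.start.offset
  })
}

// Simulated call, standing in for the parser. The arguments are made up.
reference.call(
  referenceContext,
  '©',
  {start: {line: 1, column: 6, offset: 5}, end: {line: 1, column: 12, offset: 11}},
  '&copy;'
)

console.log(referenceContext.references[0].length) // 6

// With the package installed, the handler would be wired up like so:
// var decode = require('parse-entities')
// decode('echo &copy; foxtrot', {
//   reference: reference,
//   referenceContext: referenceContext
// })
```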
Related
stringify-entities: Encode HTML character references
character-entities: Info on character entities
character-entities-html4: Info on HTML4 character entities
character-entities-legacy: Info on legacy character entities
character-reference-invalid: Info on invalid numeric character references