
emoji-regex

emoji-regex offers a regular expression to match all emoji symbols (including textual representations of emoji) as per the Unicode Standard.

This repository contains a script that generates this regular expression based on the data from Unicode Technical Report #51. Because of this, the regular expression can easily be updated whenever new emoji are added to the Unicode Standard.
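To make the idea of a generated pattern concrete, here is a minimal sketch of building a regular expression from a list of emoji sequences. It is not the repository's actual script: the `sampleSequences` data and the `toPattern` and `buildRegex` helpers are hypothetical, and the sample covers only four hand-picked sequences rather than the full Unicode data.

```javascript
// Hypothetical sample data: a few emoji sequences as lists of code points.
// A real generator would read the complete Unicode emoji data instead.
const sampleSequences = [
  [0x231A],            // watch
  [0x1F469],           // woman
  [0x1F469, 0x1F3FF],  // woman followed by a skin tone modifier
  [0x2194, 0xFE0F],    // left-right arrow followed by a variation selector
];

// Turn a list of code points into an ES2015-style escaped pattern fragment.
const toPattern = (codePoints) =>
  codePoints.map((cp) => `\\u{${cp.toString(16).toUpperCase()}}`).join('');

// Put longer sequences first so the alternation prefers the longest match.
const buildRegex = (sequences) => {
  const alternatives = [...sequences]
    .sort((a, b) => b.length - a.length)
    .map(toPattern);
  return new RegExp(alternatives.join('|'), 'gu');
};

const generated = buildRegex(sampleSequences);
const matches = 'a \u{1F469}\u{1F3FF} b \u{231A}'.match(generated);
console.log(matches.map((m) => [...m].length)); // logs [ 2, 1 ]
```

Because the pattern is derived from data, supporting newly added emoji in this sketch would only require extending the input list and regenerating.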

Installation

Via npm:

npm install emoji-regex

In Node.js:

const emojiRegex = require('emoji-regex');
// Note: because the regular expression has the global flag set, this module
// exports a function that returns the regex rather than exporting the regular
// expression itself, to make it impossible to (accidentally) mutate the
// original regular expression.

const text = `
\u{231A}: ⌚ default emoji presentation character (Emoji_Presentation)
\u{2194}\u{FE0F}: ↔️ default text presentation character rendered as emoji
\u{1F469}: 👩 emoji modifier base (Emoji_Modifier_Base)
\u{1F469}\u{1F3FF}: 👩🏿 emoji modifier base followed by a modifier
`;

const regex = emojiRegex();
let match;
while (match = regex.exec(text)) {
  const emoji = match[0];
  console.log(`Matched sequence ${ emoji } — code points: ${ [...emoji].length }`);
}

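Console output (each sequence appears twice because the \u{…} escapes in the template literal evaluate to the same characters as the emoji written next to them):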

Matched sequence ⌚ — code points: 1
Matched sequence ⌚ — code points: 1
Matched sequence ↔️ — code points: 2
Matched sequence ↔️ — code points: 2
Matched sequence 👩 — code points: 1
Matched sequence 👩 — code points: 1
Matched sequence 👩🏿 — code points: 2
Matched sequence 👩🏿 — code points: 2
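The note in the example about exporting a function rather than the regex itself comes down to how global regexes keep state in lastIndex. The sketch below illustrates this with a tiny stand-in pattern instead of the package's regex; the `makeRegex` factory is a hypothetical name used only to mirror the approach.

```javascript
// Stand-in pattern for illustration only; the package's regex is far larger.
const shared = /\u{1F469}/gu;
const input = 'x \u{1F469}';

// A global regex remembers where the last match ended.
const first = shared.test(input);    // true
const indexAfter = shared.lastIndex; // 4: just past the surrogate pair
const second = shared.test(input);   // false: the search resumed at index 4

// Hypothetical factory: every call returns a fresh, independent regex.
const makeRegex = () => /\u{1F469}/gu;
const a = makeRegex();
const b = makeRegex();
a.test(input);

console.log(first, indexAfter, second); // logs true 4 false
console.log(a.lastIndex, b.lastIndex);  // logs 4 0
```

If a single shared instance were exported, one caller's leftover lastIndex could make another caller's first search silently miss matches; handing out a new instance per call avoids that.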

To match emoji in their textual representation as well (i.e. emoji that are not Emoji_Presentation symbols and that aren’t forced to render as emoji by a variation selector), require the other regex:

const emojiRegex = require('emoji-regex/text.js');
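The difference between emoji presentation and text presentation can be seen without the package by using native Unicode property escapes (ES2018). This is only a rough illustration of the distinction, not how either regex in this package is built, and the variable names are made up for the example.

```javascript
// U+231A (watch) defaults to emoji presentation;
// U+2194 (left-right arrow) defaults to text presentation.
const sample = '\u{231A} \u{2194}';

// Matches only characters that render as emoji by default.
const emojiPresentationOnly = /\p{Emoji_Presentation}/gu;

// Matches any character with the Emoji property, including text-style ones.
// Note that this property also covers digits, '#', and '*'.
const anyEmoji = /\p{Emoji}/gu;

const strict = sample.match(emojiPresentationOnly);
const broad = sample.match(anyEmoji);

console.log(strict.length, broad.length); // logs 1 2
```

The arrow is picked up only by the broader pattern, which is the kind of character the text variant is meant to cover.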

Additionally, in environments which support ES2015 Unicode escapes, you may require ES2015-style versions of the regexes:

const emojiRegex = require('emoji-regex/es2015/index.js');
const emojiRegexText = require('emoji-regex/es2015/text.js');
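For context on what ES2015 Unicode escapes change, the sketch below matches the same character in two ways: with an explicit surrogate pair, which works without the u flag, and with a code point escape, which requires it. The patterns are small stand-ins chosen for illustration, not excerpts from the package.

```javascript
const woman = '\u{1F469}';

// Pre-ES2015 style: an astral code point is written as two UTF-16 code units.
const surrogateStyle = /\uD83D\uDC69/;

// ES2015 style: the u flag enables \u{...} code point escapes.
const codePointStyle = /\u{1F469}/u;

console.log(surrogateStyle.test(woman), codePointStyle.test(woman)); // logs true true

// The string is the same either way: two code units, one code point.
console.log(woman === '\uD83D\uDC69');        // logs true
console.log(woman.length, [...woman].length); // logs 2 1
```

This is also why the earlier example counts code points with [...emoji].length rather than emoji.length, which would count UTF-16 code units.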

Author

License

emoji-regex is available under the MIT license.