regexpu
regexpu is a source code transpiler that enables the use of ES6 Unicode regular expressions in JavaScript-of-today (ES5). It rewrites regular expressions that make use of the ES6 u flag into equivalent ES5-compatible regular expressions.
Traceur v0.0.61+, Babel v1.5.0+, and esnext v0.12.0+ use regexpu for their u regexp transpilation. The REPL demos for Traceur, Babel, and esnext let you try u regexps as well as other ES.next features.
Example
Consider a file named example-es6.js with the following contents:
var string = 'foo💩bar';
var match = string.match(/foo(.)bar/u);
console.log(match[1]);
// → '💩'
// This regex matches any symbol from U+1F4A9 to U+1F4AB, and nothing else.
var regex = /[\u{1F4A9}-\u{1F4AB}]/u;
// The following regex is equivalent.
var alternative = /[💩-💫]/u;
console.log([
regex.test('a'), // false
regex.test('💩'), // true
regex.test('💪'), // true
regex.test('💫'), // true
regex.test('💬') // false
]);
Let’s transpile it:
$ regexpu -f example-es6.js > example-es5.js
example-es5.js can now be used in ES5 environments. Its contents are as follows:
var string = 'foo💩bar';
var match = string.match(/foo((?:[\0-\t\x0B\f\x0E-\u2027\u202A-\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]))bar/);
console.log(match[1]);
// → '💩'
// This regex matches any symbol from U+1F4A9 to U+1F4AB, and nothing else.
var regex = /(?:\uD83D[\uDCA9-\uDCAB])/;
// The following regex is equivalent.
var alternative = /(?:\uD83D[\uDCA9-\uDCAB])/;
console.log([
regex.test('a'), // false
regex.test('💩'), // true
regex.test('💪'), // true
regex.test('💫'), // true
regex.test('💬') // false
]);
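The rewritten character class follows from how UTF-16 encodes astral symbols as surrogate pairs. The sketch below is only an illustration of that arithmetic, not regexpu's actual implementation; toSurrogatePair and hex are hypothetical helper names introduced here.

```javascript
// Hypothetical helper (not part of regexpu): split an astral code point
// (U+10000 to U+10FFFF) into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;
  return {
    high: 0xD800 + (offset >> 10),  // top 10 bits of the offset
    low: 0xDC00 + (offset & 0x3FF)  // bottom 10 bits of the offset
  };
}

// Hypothetical helper: format a number as uppercase hexadecimal.
function hex(number) {
  return number.toString(16).toUpperCase();
}

var start = toSurrogatePair(0x1F4A9);
var end = toSurrogatePair(0x1F4AB);
console.log(hex(start.high), hex(start.low)); // D83D DCA9
console.log(hex(end.high), hex(end.low)); // D83D DCAB

// Both ends of the range share the high surrogate \uD83D, so the whole
// range can be expressed as one high surrogate plus a low surrogate class.
var es5Regex = /(?:\uD83D[\uDCA9-\uDCAB])/;
console.log(es5Regex.test('\uD83D\uDCAA')); // true (U+1F4AA lies in the range)
console.log(es5Regex.test('\uD83D\uDCAC')); // false (U+1F4AC lies outside it)
```

Ranges whose end points have different high surrogates need more than one alternative, which this sketch does not attempt to cover.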
Known limitations
- regexpu only transpiles regular expression literals, so things like RegExp('…', 'u') are not affected.
- regexpu doesn’t polyfill the RegExp.prototype.unicode getter because it’s not possible to do so without side effects.
- regexpu doesn’t support canonicalizing the contents of back-references in regular expressions with both the i and u flags set, since that would require transpiling/wrapping strings.
- regexpu doesn’t match lone low surrogates accurately. Unfortunately that is impossible to implement due to the lack of lookbehind support in JavaScript regular expressions.
Installation
To use regexpu programmatically, install it as a dependency via npm:
npm install regexpu --save-dev
To use the command-line interface, install regexpu globally:
npm install regexpu -g
API
regexpu.version
A string representing the semantic version number.
regexpu.rewritePattern(pattern, flags)
This function takes a string that represents a regular expression pattern as well as a string representing its flags, and returns an ES5-compatible version of the pattern.
regexpu.rewritePattern('foo.bar', 'u');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'
regexpu.rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'u');
// → '(?:[a-z]|\\uD834[\\uDF06-\\uDF08])'
regexpu.rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'ui');
// → '(?:[a-z\\u017F\\u212A]|\\uD834[\\uDF06-\\uDF08])'
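The extra \u017F and \u212A in the ui output reflect Unicode case folding: with both flags set, U+017F (LATIN SMALL LETTER LONG S) and U+212A (KELVIN SIGN) are treated as case variants of s and k, whereas ES5 case-insensitive matching does not relate them to ASCII letters. The sketch below compares the three behaviors. The variable names are made up for this illustration, and the last regular expression assumes an engine with native support for the u flag.

```javascript
var longS = '\u017F'; // LATIN SMALL LETTER LONG S
var kelvin = '\u212A'; // KELVIN SIGN

// Plain ES5 case-insensitive matching: neither symbol matches [a-z].
var es5 = /[a-z]/i;
console.log(es5.test(longS), es5.test(kelvin)); // false false

// The rewritten class from the example above adds both symbols explicitly.
var rewritten = /[a-z\u017F\u212A]/i;
console.log(rewritten.test(longS), rewritten.test(kelvin)); // true true

// Native ES6 behavior, for comparison. The RegExp constructor is used so
// that this file still parses in engines without `u` support.
var nativeUnicode = new RegExp('[a-z]', 'iu');
console.log(nativeUnicode.test(longS), nativeUnicode.test(kelvin)); // true true
```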
regexpu can rewrite non-ES6 regular expressions too, which is useful to demonstrate how their behavior changes once the u and i flags are added:
// In ES5, the dot operator only matches BMP symbols:
regexpu.rewritePattern('foo.bar');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])bar'
// But with the ES6 `u` flag, it matches astral symbols too:
regexpu.rewritePattern('foo.bar', 'u');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'
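To see the difference in practice, the two output strings above can be fed to the RegExp constructor and tested against the same input. This is only a quick check with made-up variable names; it needs nothing beyond an ES5 engine.

```javascript
// Output of rewritePattern('foo.bar'), copied from the example above.
var bmpOnly = new RegExp(
  'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])bar'
);

// Output of rewritePattern('foo.bar', 'u'), copied from the example above.
var withAstral = new RegExp(
  'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|' +
  '[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'
);

var astralInput = 'foo\uD83D\uDCA9bar'; // 'foo💩bar'

// The ES5 dot consumes a single code unit, so a surrogate pair is too long.
console.log(bmpOnly.test(astralInput)); // false

// The `u` rewrite has an alternative that consumes a whole surrogate pair.
console.log(withAstral.test(astralInput)); // true

// BMP symbols match either way, and line terminators match neither.
console.log(bmpOnly.test('foo-bar'), withAstral.test('foo-bar')); // true true
console.log(bmpOnly.test('foo\nbar'), withAstral.test('foo\nbar')); // false false
```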
regexpu.rewritePattern uses regjsgen, regjsparser, and regenerate as internal dependencies. If you only need this function in your program, it’s better to include it directly:
var rewritePattern = require('regexpu/rewrite-pattern');
This prevents the Recast and Esprima dependencies from being loaded into memory.
regexpu.transformTree(ast) or its alias regexpu.transform(ast)
This function accepts an abstract syntax tree representing some JavaScript code, and returns a transformed version of the tree in which any regular expression literals that use the ES6 u flag are rewritten in ES5.
var regexpu = require('regexpu');
var recast = require('recast');
var tree = recast.parse(code); // ES6 code
tree = regexpu.transform(tree);
var result = recast.print(tree);
console.log(result.code); // transpiled ES5 code
console.log(result.map); // source map
regexpu.transformTree uses Recast, regjsgen, regjsparser, and regenerate as internal dependencies. If you only need this function in your program, it’s better to include it directly:
var transformTree = require('regexpu/transform-tree');
This prevents the Esprima dependency from being loaded into memory.
regexpu.transpileCode(code, options)
This function accepts a string representing some JavaScript code, and returns a transpiled version of this code in which any regular expression literals that use the ES6 u flag are rewritten in ES5.
var es6 = 'console.log(/foo.bar/u.test("foo💩bar"));';
var es5 = regexpu.transpileCode(es6);
// → 'console.log(/foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar/.test("foo💩bar"));'
The optional options object recognizes the following properties:
- sourceFileName: a string representing the file name of the original ES6 source file.
- sourceMapName: a string representing the desired file name of the source map.
These properties must be provided if you want to generate source maps.
var result = regexpu.transpileCode(code, {
'sourceFileName': 'es6.js',
'sourceMapName': 'es6.js.map',
});
console.log(result.code); // transpiled source code
console.log(result.map); // source map
regexpu.transpileCode uses Esprima, Recast, regjsgen, regjsparser, and regenerate as internal dependencies. If you only need this function in your program, feel free to include it directly:
var transpileCode = require('regexpu/transpile-code');
Transpilers that use regexpu internally
If you’re looking for a general-purpose ES.next-to-ES5 transpiler with support for Unicode regular expressions, consider using one of these:
- Traceur v0.0.61+
- Babel v1.5.0+
- esnext v0.12.0+
Author
Mathias Bynens
License
regexpu is available under the MIT license.