version 5.0.0
Concept

Key Concepts

Core abstractions and design patterns


Overview
[Figure: ansi-regex architecture overview (ANSI escape sequence detection via compiled regular expressions). An application calls the ansiRegex() factory function; a pattern builder joins the CSI sub-pattern ([\u001B\u009B][[\]()#;?]*, Control Sequence Introducer, ESC [ … m or \u009B … m) and the OSC/BEL sub-pattern (Operating System Command, ESC ] … BEL, terminated by \u0007) into a single pattern; new RegExp(pattern, 'g') compiles it with the global flag; applying the compiled RegExp to an input string containing ANSI codes yields the detected escape spans.]

This page explains the core abstractions and design patterns behind ansi-regex. Understanding how the library constructs and applies its regular expression helps you use it correctly and integrate it confidently into your own tooling. At its heart, ansi-regex generates a single, composable RegExp object that matches ANSI escape sequences — the control codes terminals use for color, cursor movement, and formatting.


Content

What Is an ANSI Escape Sequence?

ANSI escape sequences are special byte patterns embedded in strings to control terminal output. They typically begin with an escape character (\u001B, the ESC character) or a C1 control code (\u009B), followed by a structured payload that encodes a command such as "set foreground color to red" or "move cursor up two lines."

Because these sequences are invisible in rendered output but very much present in raw strings, they can cause problems when you need to measure string length, strip formatting for plain-text output, or search file content programmatically. ansi-regex gives you a reliable pattern to detect them.
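To make the length problem concrete, here is a minimal sketch. The local ansiRegex stand-in is an assumption: it is rebuilt from the pattern printed in the Examples section so the snippet runs without the package installed. In a real project you would use require('ansi-regex') instead.

```javascript
// Stand-in for require('ansi-regex'), rebuilt from the pattern shown in the
// Examples section so this sketch runs without the package installed.
const ansiRegex = () => new RegExp(
  '[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*' +
  '|[a-zA-Z\\d]+(?:;[a-zA-Z\\d]*)*)?\\u0007)' +
  '|(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PRZcf-ntqry=><~]))',
  'g'
);

const raw = '\u001B[31mHello\u001B[0m';       // "Hello" wrapped in red / reset codes
const visible = raw.replace(ansiRegex(), ''); // the text a terminal would render

console.log(raw.length);     // 14: 5 visible characters + 9 escape-code characters
console.log(visible.length); // 5
```

The raw string is almost three times longer than what the user sees, which is why padding or truncating based on raw.length misaligns colored output.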

The Generated Regular Expression

Calling the exported function returns a new RegExp instance constructed from two sub-patterns joined with | (alternation):

  • OSC sequences — sequences terminated by the BEL character (\u0007), which are used for operating-system commands such as setting the window title.
  • CSI sequences — sequences that begin with a Control Sequence Introducer and end with a command byte in a defined range. These cover the majority of color and cursor-control codes you encounter in terminal output.

By default, the resulting expression is compiled with the g (global) flag, so String.prototype.matchAll, String.prototype.replace, or a loop over RegExp.prototype.exec finds every occurrence in a string, not just the first.
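The two alternatives and the effect of the global flag can be sketched as follows. The ansiRegex stand-in is an assumption (copied from the pattern printed in the Examples section so the snippet is self-contained), and the kind label is a hypothetical classification added only for illustration.

```javascript
// Stand-in for require('ansi-regex'), rebuilt from the pattern shown in the
// Examples section so this sketch runs without the package installed.
const ansiRegex = () => new RegExp(
  '[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*' +
  '|[a-zA-Z\\d]+(?:;[a-zA-Z\\d]*)*)?\\u0007)' +
  '|(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PRZcf-ntqry=><~]))',
  'g'
);

// One BEL-terminated sequence (window title) followed by two CSI sequences.
const input = '\u001B]0;title\u0007\u001B[1mBold\u001B[0m';

// A single matchAll call walks the whole string because the regex is global.
const spans = [...input.matchAll(ansiRegex())].map(m => ({
  start: m.index,
  length: m[0].length,
  kind: m[0].endsWith('\u0007') ? 'BEL-terminated' : 'CSI', // illustrative label
}));

console.log(spans);
// [
//   { start: 0, length: 10, kind: 'BEL-terminated' },
//   { start: 10, length: 4, kind: 'CSI' },
//   { start: 18, length: 4, kind: 'CSI' }
// ]
```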

Why a Factory Function?

The library exports a factory function rather than a pre-built RegExp literal. This is an intentional design choice:

  • Stateless by default. Regular expressions with the g flag maintain internal state (lastIndex) between calls to .exec(). By calling the factory each time you need a fresh pattern, you avoid subtle bugs caused by a shared, stateful RegExp object.
  • Predictable. Every call to ansiRegex() returns an independent instance. You can safely use separate instances in concurrent or recursive contexts without one call interfering with another.
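The kind of bug the factory avoids can be reproduced in a few lines. This is a sketch under one assumption: the ansiRegex stand-in below is rebuilt from the pattern printed in the Examples section so the snippet runs without the package.

```javascript
// Stand-in for require('ansi-regex'), rebuilt from the pattern shown in the
// Examples section so this sketch runs without the package installed.
const ansiRegex = () => new RegExp(
  '[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*' +
  '|[a-zA-Z\\d]+(?:;[a-zA-Z\\d]*)*)?\\u0007)' +
  '|(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PRZcf-ntqry=><~]))',
  'g'
);

const line = '\u001B[31mred';

// Reusing one global regex: the second test() resumes from lastIndex (5),
// finds no further escape code, and wrongly reports false.
const shared = ansiRegex();
const sharedResults = [shared.test(line), shared.test(line)];

// Calling the factory each time: every test() starts from index 0.
const freshResults = [ansiRegex().test(line), ansiRegex().test(line)];

console.log(sharedResults); // [ true, false ]
console.log(freshResults);  // [ true, true ]
```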

The Global Flag and Matching Behavior

Because the returned expression carries the g flag by default, you should be aware of how JavaScript handles global regexes:

  • Use string.match(regex) or [...string.matchAll(regex)] to collect all matches at once.
  • If you call .exec() in a loop, each successful call moves lastIndex to just past the match. When no further match is found, .exec() returns null and lastIndex resets to 0.
  • To test whether a string contains any ANSI code without collecting matches, prefer creating a fresh regex instance per test, since .test() also advances lastIndex on a global regex.
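The lastIndex behavior described in the list above can be observed directly with an exec() loop. As before, the ansiRegex stand-in is an assumption, rebuilt from the pattern in the Examples section so the snippet is runnable on its own.

```javascript
// Stand-in for require('ansi-regex'), rebuilt from the pattern shown in the
// Examples section so this sketch runs without the package installed.
const ansiRegex = () => new RegExp(
  '[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*' +
  '|[a-zA-Z\\d]+(?:;[a-zA-Z\\d]*)*)?\\u0007)' +
  '|(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PRZcf-ntqry=><~]))',
  'g'
);

const regex = ansiRegex();
const text = '\u001B[4mUnderline\u001B[0m';
const steps = [];

let match;
// exec() returns null once no further match exists, which ends the loop.
while ((match = regex.exec(text)) !== null) {
  // Record where the match started and where the next search will resume.
  steps.push({ index: match.index, lastIndex: regex.lastIndex });
}

console.log(steps);           // [ { index: 0, lastIndex: 4 }, { index: 13, lastIndex: 17 } ]
console.log(regex.lastIndex); // 0 (reset after the failed final exec)
```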

Scope: What the Pattern Matches

The pattern is designed to match the full escape sequence as a single token — from the opening escape character through the terminating command byte or BEL character. It does not parse or decode the meaning of each sequence. If you need to interpret sequences (for example, to extract a specific color code), you must process the captured matches yourself. For most integration scenarios — stripping, measuring, or detecting ANSI codes — the raw match is all you need.
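As an illustration of processing the matches yourself, the sketch below decodes the numeric parameters of SGR (color and style) sequences. Both names are assumptions: sgrParams is a hypothetical helper that is not part of ansi-regex, and the ansiRegex stand-in is rebuilt from the pattern printed in the Examples section so the snippet runs without the package.

```javascript
// Stand-in for require('ansi-regex'), rebuilt from the pattern shown in the
// Examples section so this sketch runs without the package installed.
const ansiRegex = () => new RegExp(
  '[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*' +
  '|[a-zA-Z\\d]+(?:;[a-zA-Z\\d]*)*)?\\u0007)' +
  '|(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PRZcf-ntqry=><~]))',
  'g'
);

// Hypothetical helper (not part of ansi-regex): returns the numeric parameters
// of an SGR sequence such as ESC[1;31m, or null for any other kind of match.
function sgrParams(sequence) {
  const sgr = /^(?:\u001B\[|\u009B)([\d;]*)m$/.exec(sequence);
  if (sgr === null) return null;
  // An empty parameter list (ESC[m) is treated as a single 0 (reset).
  return sgr[1] === '' ? [0] : sgr[1].split(';').map(p => Number(p));
}

// Bold red "Error", a reset, then an erase-line code that is not SGR.
const styled = '\u001B[1;31mError\u001B[0m \u001B[2Kdone';
const decoded = styled.match(ansiRegex()).map(sgrParams);

console.log(decoded); // [ [ 1, 31 ], [ 0 ], null ]
```

The regex only delimits each sequence; deciding that 1 means bold and 31 means red is left to the caller.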


Examples

Generating the regex and inspecting it

const ansiRegex = require('ansi-regex');

const regex = ansiRegex();
console.log(regex);

Expected output (the compiled pattern with the global flag; the exact pattern source can differ between releases):

/[\u001B\u009B][[\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[a-zA-Z\d]*)*)?\u0007)|(?:(?:\d{1,4}(?:;\d{0,4})*)?[\dA-PRZcf-ntqry=><~]))/g

Testing whether a string contains ANSI sequences

Create a fresh instance for each test to avoid lastIndex side effects:

const ansiRegex = require('ansi-regex');

const raw = '\u001B[31mHello\u001B[0m world';
const plain = 'Hello world';

console.log(ansiRegex().test(raw));   // true
console.log(ansiRegex().test(plain)); // false

Expected output:

true
false

Collecting all matches in a string

const ansiRegex = require('ansi-regex');

const str = '\u001B[4mUnderline\u001B[0m and \u001B[1mBold\u001B[0m';
const matches = str.match(ansiRegex());

console.log(matches);

Expected output (Node.js prints the ESC character as \x1B):

[ '\x1B[4m', '\x1B[0m', '\x1B[1m', '\x1B[0m' ]

Stripping ANSI sequences from a string

Use String.prototype.replace with the regex to remove all escape codes:

const ansiRegex = require('ansi-regex');

const colored = '\u001B[32mSuccess\u001B[0m';
const stripped = colored.replace(ansiRegex(), '');

console.log(stripped); // Success
console.log(stripped.length); // 7

Expected output:

Success
7

Related concepts
  • Installation — How to add ansi-regex to your project as a dependency.
  • API Reference — Full signature of the exported factory function and the RegExp it returns.
  • Integration Patterns — Recipes for common tasks such as stripping ANSI codes, measuring visible string length, and filtering file content.
  • ANSI Escape Code Specification — External reference: the ECMA-48 standard and VT100 documentation describe the full grammar that the regex is designed to match.