Easy Tutorial

Regular Expressions - Meta Characters

The table below contains the complete list of meta characters and their behavior in the context of regular expressions:

Character Description
\ Marks the next character as either a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character "n". '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".
^ Matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches positions after '\n' or '\r'.
$ Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches positions before '\n' or '\r'.
* Matches the preceding subexpression zero or more times. For example, zo* can match "z" and "zoo". * is equivalent to {0,}.
+ Matches the preceding subexpression one or more times. For example, 'zo+' can match "zo" and "zoo", but not "z". + is equivalent to {1,}.
? Matches the preceding subexpression zero or one time. For example, "do(es)?" can match "do" or "does". ? is equivalent to {0,1}.
{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' cannot match "Bob" but can match "food".
{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' cannot match "Bob" but can match "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m} m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" will match the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no spaces between the comma and the numbers.
? When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. Non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much as possible. For example, for the string "oooo", 'o+?' matches a single "o", while 'o+' matches all of the o's.
. Matches any single character except newline characters (\n, \r). To match any character including '\n', use a pattern such as '(.|\n)'.
(pattern) Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection: in VBScript, use the SubMatches collection, and in JScript, use the $0…$9 properties. To match literal parentheses, use '\(' or '\)'.
(?:pattern) Matches pattern but does not capture the match, i.e., the match is not stored for later use. This is useful when combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more concise expression than 'industry|industries'.
(?=pattern) Positive lookahead assertion. Matches at any position where a string matching pattern begins. This is a non-capturing match, meaning the match is not stored for later use. For example, 'Windows(?=95|98|NT|2000)' will match "Windows" in "Windows2000" but not in "Windows3.1". Lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters within the lookahead.
(?!pattern) Negative lookahead assertion. Matches at any position where a string not matching pattern begins. This is a non-capturing match, meaning the match is not stored for later use. For example, 'Windows(?!95|98|NT|2000)' will match "Windows" in "Windows3.1" but not in "Windows2000". Lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters within the lookahead.
(?<=pattern) Positive lookbehind assertion, similar to positive lookahead but in the opposite direction. For example, '(?<=95|98|NT|2000)Windows' can match "Windows" in "2000Windows", but not "Windows" in "3.1Windows".
(?<!pattern) Negative lookbehind assertion, similar to negative lookahead but in the opposite direction. For example, '(?<!95|98|NT|2000)Windows' can match "Windows" in "3.1Windows", but not "Windows" in "2000Windows".
x|y Matches either x or y. For example, 'z|food' can match "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz] Character set. Matches any one of the enclosed characters. For example, '[abc]' can match 'a' in "plain".
[^xyz] Negated character set. Matches any character not enclosed. For example, '[^abc]' can match 'p', 'l', 'i', 'n' in "plain".
[a-z] Character range. Matches any character in the specified range. For example, '[a-z]' can match any lowercase letter from 'a' to 'z'.
[^a-z] Negated character range. Matches any character not in the specified range. For example, '[^a-z]' can match any character not in the range 'a' to 'z'.
\b Matches a word boundary, the position between a word and a space. For example, 'er\b' can match 'er' in "never", but not 'er' in "verb".
\B Matches a non-word boundary. 'er\B' can match 'er' in "verb", but not 'er' in "never".
\cx Matches the control character specified by x. For example, \cM matches a Control-M or carriage return. x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\d Matches a digit character. Equivalent to [0-9].
\D Matches a non-digit character. Equivalent to [^0-9].
\f Matches a form feed character. Equivalent to \x0c and \cL.
\n Matches a newline character. Equivalent to \x0a and \cJ.
\r Matches a carriage return character. Equivalent to \x0d and \cM.
\s Matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t Matches a tab character. Equivalent to \x09 and \cI.
\v Matches a vertical tab character. Equivalent to \x0b and \cK.
\w Matches any word character (alphanumeric and underscore). Equivalent to '[A-Za-z0-9_]'.
\W Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn Matches n, where n is a hexadecimal escape sequence. The hexadecimal escape sequence must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' & "1". ASCII encoding can be used in regular expressions.
\num Matches num, where num is a positive integer. This is a reference to a previously matched group. For example, '(.)\1' matches two consecutive identical characters.
\n Indicates an octal escape sequence or a backreference. If \n is preceded by at least n captured sub-expressions, it is a backreference. Otherwise, if n is an octal digit (0-7), it is an octal escape sequence.
\nm Indicates an octal escape sequence or a backreference. If \nm is preceded by at least nm captured sub-expressions, it is a backreference. If \nm is preceded by at least n captured sub-expressions, it is a backreference followed by the literal m. If none of these conditions are met, and n and m are octal digits (0-7), \nm matches the octal escape sequence nm.
\nml If n is an octal digit (0-3) and m and l are octal digits (0-7), it matches the octal escape sequence nml.
\un Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
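
Several of the rows above can be checked directly in JavaScript. The following is a minimal sketch, not a complete tour of the table; the variable names are chosen here for illustration, and the lookbehind lines assume an engine that supports ES2018 lookbehind assertions (for example, a recent Node.js or browser).

```javascript
// Greedy vs. non-greedy quantifiers on the string "oooo".
const greedy = "oooo".match(/o+/)[0];   // takes as many o's as possible
const lazy = "oooo".match(/o+?/)[0];    // takes as few o's as possible

// {n,m}: between 1 and 3 o's; greedy, so the first three are taken.
const bounded = "fooooood".match(/o{1,3}/)[0];

// Positive and negative lookahead: the version is checked but not consumed.
const aheadHit = /Windows(?=95|98|NT|2000)/.test("Windows2000");
const aheadMiss = /Windows(?=95|98|NT|2000)/.test("Windows3.1");
const notAheadHit = /Windows(?!95|98|NT|2000)/.test("Windows3.1");
const notAheadMiss = /Windows(?!95|98|NT|2000)/.test("Windows2000");

// Positive and negative lookbehind (assumes ES2018 support).
const behindHit = /(?<=95|98|NT|2000)Windows/.test("2000Windows");
const behindMiss = /(?<=95|98|NT|2000)Windows/.test("3.1Windows");
const notBehindHit = /(?<!95|98|NT|2000)Windows/.test("3.1Windows");

// Non-capturing group: the alternation is grouped but nothing is stored,
// so the match array holds only the full match.
const grouped = "industries".match(/industr(?:y|ies)/);

// Backreference: (.)\1 matches two consecutive identical characters.
const doubled = "food".match(/(.)\1/)[0];

// Word boundary: 'er' at the end of a word vs. inside a word.
const boundaryHit = /er\b/.test("never");
const boundaryMiss = /er\b/.test("verb");

console.log({ greedy, lazy, bounded, doubled, groupedLength: grouped.length });
```

Note that in a JavaScript regex literal the backslash is written once (`/(.)\1/`); inside an ordinary string passed to `new RegExp(...)` it would need to be doubled.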

Example

Next, we analyze a regular expression for matching email addresses, as shown below:

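```
var str = "abcd [email protected] 1234";
// \. matches a literal dot before the top-level domain;
// an unescaped . would match any character.
var patt1 = /\b[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}\b/g;
document.write(str.match(patt1));
```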

The text below is the part of the string that the expression matches:

[email protected]
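
To see how the pieces of this pattern fit together, here is a minimal sketch that annotates each part. The address `user@example.com` and the names `sample`, `emailPattern`, and `loosePattern` are assumptions made for illustration; they do not come from the example above. The pattern is a simple filter, not a full validator for every legal email address.

```javascript
// Hypothetical input: "user@example.com" is an assumed sample address.
const sample = "abcd user@example.com 1234";

// \b              word boundary before the local part
// [\w.%+-]+       local part: word characters, '.', '%', '+', '-'
// @               literal at sign
// [\w.-]+         domain: word characters, '.', '-'
// \.              literal dot before the top-level domain
// [a-zA-Z]{2,6}   top-level domain of 2 to 6 letters
// \b              word boundary after the address
const emailPattern = /\b[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}\b/g;

// With the g flag, match() returns an array of every match in the string.
const found = sample.match(emailPattern);
console.log(found);

// If the dot is left unescaped, it matches any character, so a string
// with no dot in the domain is accepted as well.
const loosePattern = /\b[\w.%+-]+@[\w.-]+.[a-zA-Z]{2,6}\b/;
const looseResult = loosePattern.test("user@exampleXcom");
const strictResult = "user@exampleXcom".match(emailPattern);
console.log(looseResult, strictResult);
```

The surrounding words "abcd" and "1234" are skipped because neither is followed by an at sign, so only the address itself is returned.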
