Regular Expressions - Introduction
If you have not used regular expressions before, some of the terminology may be unfamiliar. However, you have almost certainly used some of the underlying concepts already, without writing any scripts.
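For example, you have likely used the ? and * wildcards to find files on your hard drive. The ? wildcard matches a single character in a filename, while the * wildcard matches zero or more characters. Regular expressions use similar symbols: in the pattern data(\w)?\.dat, the ? makes the preceding (\w) optional, so the pattern allows zero or one word character after data. It will find the following files: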
data.dat
data1.dat
data2.dat
datax.dat
dataN.dat
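As a quick check, the pattern above can be tried against these filenames in JavaScript. This is a minimal sketch: the ^ and $ anchors are an assumption added here so that the whole filename has to match, and the extra name data12.dat is included only to show a non-match.

```javascript
// Anchored version of the pattern from the text. The anchors are an assumption
// added so that the entire filename has to match.
const optionalChar = /^data(\w)?\.dat$/;

const names = ["data.dat", "data1.dat", "datax.dat", "dataN.dat", "data12.dat"];

// Keep only the names that the pattern accepts.
const matched = names.filter((name) => optionalChar.test(name));

// data12.dat is left out: (\w)? allows at most one extra character.
console.log(matched);
```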
Using the * character instead of the ? character increases the number of files found. The pattern data.*\.dat, in which .* stands for zero or more arbitrary characters, matches all of the following files:
data.dat
data1.dat
data2.dat
data12.dat
datax.dat
dataXYZ.dat
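The difference between the two quantifiers can be seen by running both patterns over the same list. This is a minimal sketch, with ^ and $ added as an assumption so that whole filenames are compared.

```javascript
const files = ["data.dat", "data1.dat", "data2.dat", "data12.dat", "datax.dat", "dataXYZ.dat"];

const zeroOrOne = /^data(\w)?\.dat$/; // at most one word character after "data"
const zeroOrMore = /^data.*\.dat$/;   // any number of characters after "data"

const withQuestionMark = files.filter((f) => zeroOrOne.test(f));
const withStar = files.filter((f) => zeroOrMore.test(f));

// Names found only by the * version.
const onlyStar = withStar.filter((f) => !withQuestionMark.includes(f));
console.log(onlyStar);
```

Only the names with more than one character between data and .dat end up in onlyStar.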
Although this search method is useful, it is limited. Understanding how the * wildcard works introduces the concepts on which regular expressions rely, but regular expressions are far more powerful and flexible.
Regular expressions provide powerful functionality through simple means. Below is a simple example, the pattern ^[0-9]+abc$:
- ^ matches the start of the input string.
- [0-9]+ matches one or more digits: [0-9] matches a single digit, and + matches one or more occurrences of the preceding item.
- abc$ matches the letters abc at the end of the string, where $ matches the end of the input string.
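To see what the anchors contribute, the sketch below compares the pattern ^[0-9]+abc$ with variants that drop one anchor. The sample strings and variable names are made up for illustration.

```javascript
const full = /^[0-9]+abc$/;   // digits from the start, then "abc" at the end
const noStart = /[0-9]+abc$/; // without ^, other text may precede the digits
const noEnd = /^[0-9]+abc/;   // without $, other text may follow "abc"

const samples = ["123abc", "x123abc", "123abcdef", "abc"];

// Print, for each sample, whether each variant matches.
for (const s of samples) {
  console.log(s, full.test(s), noStart.test(s), noEnd.test(s));
}
```

Only 123abc satisfies all three variants, and abc satisfies none because + requires at least one digit.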
When writing a user registration form, you can use a regular expression to allow usernames that contain only letters, digits, underscores (_), and hyphens (-), and to restrict the username length. Such an expression matches tutorialpro, tutorialpro1, run-oob, and run_oob. It does not match ru, which is shorter than the minimum of 3 characters, or tutorialpro$, which contains the special character $.
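The expression itself is not reproduced in the text, so the following is only a sketch of what such a rule could look like. The minimum length of 3 follows from the text. The maximum of 15 and the name usernamePattern are assumptions made for illustration.

```javascript
// Hypothetical pattern: letters, digits, underscores, and hyphens, 3 to 15 characters.
// The upper bound of 15 is an assumption; only the lower bound of 3 comes from the text.
const usernamePattern = /^[A-Za-z0-9_-]{3,15}$/;

const candidates = ["tutorialpro", "tutorialpro1", "run-oob", "run_oob", "ru", "tutorialpro$"];

// Print each candidate with the result of the check.
for (const name of candidates) {
  console.log(name, usernamePattern.test(name));
}
```

Placing the hyphen last inside the brackets makes it a literal character rather than a range operator.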
Example
Match a string that starts with one or more digits and ends with abc:
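// Pattern: one or more digits at the start, followed by "abc" at the end.
var str = "123abc";
var patt1 = /^[0-9]+abc$/;
// match() returns an array holding the matched text, or null if nothing matches.
document.write(str.match(patt1));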
Here the pattern matches the entire string: 123abc
By the end of this tutorial, you will be able to write such code with ease.
Why Use Regular Expressions?
Typical search and replace operations require you to provide the exact text to find. This technique may be sufficient for simple search and replace tasks on static text, but it lacks flexibility, and searching dynamic text this way is difficult, if not impossible.
By using regular expressions, you can:
- Test patterns within a string.
- Replace text.
- Extract substrings from a string based on pattern matching.
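The three uses listed above can be sketched in JavaScript as follows. The sample sentence and the variable names are made up for illustration.

```javascript
const text = "Order 1024 shipped on 2024-05-17.";

// 1. Test for a pattern: does the text contain a date written as YYYY-MM-DD?
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const hasDate = datePattern.test(text);

// 2. Replace text: mask every run of digits with "#".
const masked = text.replace(/\d+/g, "#");

// 3. Extract substrings: the capture groups pull out the year, month, and day.
const [, year, month, day] = text.match(datePattern);

console.log(hasDate, masked, year, month, day);
```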
For example, you might need to search an entire website, remove outdated material, and replace certain HTML formatting tags. In this case, you can use regular expressions to determine whether the material or HTML formatting tags appear in each file. This process narrows down the list of affected files to those that contain the material that needs to be removed or changed. Then, you can use regular expressions to remove the outdated material. Finally, you can use regular expressions to search and replace tags.
History
The "ancestors" of regular expressions can be traced back to early studies on how the human nervous system works. Neurophysiologists Warren McCulloch and Walter Pitts developed a mathematical approach to describe these neural networks.
In 1956, the mathematician Stephen Kleene built on the early work of McCulloch and Pitts and published a paper titled "Representation of Events in Nerve Nets and Finite Automata," which introduced the concept of regular expressions. Regular expressions were used to describe what he called the "algebra of regular sets," hence the term "regular expression."
Subsequently, this work found its way into early efforts on computational search algorithms by Ken Thompson, a principal inventor of Unix. The first practical application of regular expressions was in the qed editor.
As they say, the rest is history. Since then, regular expressions have been an integral part of text-based editors and search tools.
Application Domains
Today, regular expressions are widely used in software, including operating systems such as *nix (Linux, Unix, etc.) and HP, development environments and languages such as PHP, C#, and Java, and many applications.
C# Regular Expressions
The C# Regular Expressions section of our C# tutorial covers regular expressions in C#.
Java Regular Expressions
The Java Regular Expressions section of our Java tutorial covers regular expressions in Java.
JavaScript Regular Expressions
The JavaScript Regular Expressions and JavaScript RegExp Object sections of our JavaScript tutorial cover regular expressions in JavaScript, and we also provide a comprehensive JavaScript RegExp Object reference manual.
Python Regular Expressions
The Python Regular Expressions section of our Python basic tutorial covers regular expressions in Python.
Ruby Regular Expressions
The Ruby Regular Expressions section of our Ruby tutorial covers regular expressions in Ruby.
| Command or Environment | . | [ ] | ^ | $ | \( \) | \{ \} | ? | + | \| | ( ) |
|---|---|---|---|---|---|---|---|---|---|---|
| vi | √ | √ | √ | √ | √ | | | | | |
| Visual C++ | √ | √ | √ | √ | √ | | | | | |
| awk | √ | √ | √ | √ | | see note | √ | √ | √ | √ |
| sed | √ | √ | √ | √ | √ | √ | | | | |
| delphi | √ | √ | √ | √ | √ | | √ | √ | √ | √ |
| python | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
| java | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
| javascript | √ | √ | √ | √ | √ | | √ | √ | √ | √ |
| php | √ | √ | √ | √ | √ | | | | | |
| perl | √ | √ | √ | √ | √ | | √ | √ | √ | √ |
| C# | √ | √ | √ | √ | | | √ | √ | √ | √ |

Note: awk supports the \{ \} syntax, but you need to add the --posix or --re-interval option on the command line. See the interval expression entry in man awk.