Regular Expressions/Basic Regular Expressions

Basic Regular Expressions: Note that implementations of regular expressions differ in how they interpret a backslash in front of some metacharacters. For example, egrep and Perl treat unbackslashed parentheses and vertical bars as metacharacters, and reserve the backslashed versions for the literal characters themselves. Old versions of grep did not support the alternation operator (the vertical bar, or pipe).
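
The egrep/Perl convention described above can be sketched with Python's re module, which is used here only as a convenient stand-in because it follows the same convention. It is not a POSIX Basic Regular Expressions engine, and the sample strings are illustrative assumptions.

```python
import re

# Python's re module treats unbackslashed parentheses and vertical bars
# as metacharacters (grouping and alternation), like egrep and Perl.
grouped = re.fullmatch(r"(cat|hat)", "hat")

# The backslashed versions stand for the literal characters themselves.
literal = re.fullmatch(r"\(cat\|hat\)", "(cat|hat)")

print(grouped is not None)  # True
print(literal is not None)  # True

# The literal pattern does not match the bare word, because it now
# requires the parentheses and the bar to appear in the input.
print(re.fullmatch(r"\(cat\|hat\)", "hat") is not None)  # False
```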

Operator Effect
. The dot operator matches any single character.
[ ] Brackets enable a single character to be matched against a character list or character range.
[^ ] A complemented bracket expression enables a single character not within a character list or character range to be matched.
* An asterisk matches zero or more occurrences of the preceding element.
^ The caret anchor matches the beginning of the line.
$ The dollar anchor matches the end of the line.
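
The asterisk is the operator most often misread: it repeats the element immediately before it rather than matching characters on its own. A minimal sketch, assuming Python's re module as a stand-in (it is not a POSIX Basic Regular Expressions engine, but it treats these operators the same way); the pattern and test strings are illustrative.

```python
import re

# "ba*t": the asterisk applies to the preceding "a", so the pattern
# accepts "b", then zero or more "a" characters, then "t".
star = re.compile(r"ba*t")
candidates = ["bt", "bat", "baaat", "bet"]
star_results = {s: star.fullmatch(s) is not None for s in candidates}
print(star_results)
# {'bt': True, 'bat': True, 'baaat': True, 'bet': False}

# Combining the dot with the asterisk gives "any run of characters".
anything = re.fullmatch(r"h.*t", "heat") is not None
print(anything)  # True
```
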
Example Match
".at" any three-character string ending in "at", such as hat, cat or bat
"[hc]at" hat and cat
"[^b]at" every string matched by the regex ".at" except bat
"^[hc]at" hat and cat, but only at the beginning of a line
"[hc]at$" hat and cat, but only at the end of a line
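
The examples in the table can be checked with a short script. This is a sketch that assumes Python's re module as a stand-in for a POSIX tool; the helper `matching`, the word list, and the sample line are hypothetical names and data chosen for illustration.

```python
import re

words = ["hat", "cat", "bat"]

def matching(pattern, candidates):
    """Return the candidates in which the pattern is found."""
    return [w for w in candidates if re.search(pattern, w)]

print(matching(r".at", words))     # ['hat', 'cat', 'bat']
print(matching(r"[hc]at", words))  # ['hat', 'cat']
print(matching(r"[^b]at", words))  # ['hat', 'cat']

# The anchors restrict where on the line a match may occur.
line = "cat sat on a hat"
at_start = re.findall(r"^[hc]at", line)
at_end = re.findall(r"[hc]at$", line)
print(at_start)  # ['cat']
print(at_end)    # ['hat']
```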

Many ranges of characters depend on the chosen locale setting: in some locales letters are ordered as abc..yzABC..YZ, while in others they are ordered as aAbBcC..yYzZ. A range such as [a-z] can therefore match different sets of characters under different locales.

The POSIX Basic Regular Expressions syntax provided extensions for consistency between utility programs such as grep, sed and awk. These extensions are not supported by some traditional implementations of Unix tools.

Use in Tools

Tools and languages that utilize this regular expression syntax include: TBD