PHP Programming/Regular expressions

From Wikibooks, open books for an open world
Integration Methods (HTML Forms, etc.) PHP Programming
Regular expressions
Data Structures


Usual regular expressions
Character   Type                     Explanation
.           Dot                      any character (except a newline, by default)
[...]       Brackets                 character class: any one of the enumerated characters
[^...]      Brackets and circumflex  complemented class: any one character except the enumerated ones
^           Circumflex               string or line start
$           Dollar                   string or line end
|           Pipe                     alternative
(...)       Parentheses              capture group: also used to limit the range of an alternative
*           Asterisk                 0 or more occurrences
+           Plus                     1 or more occurrences
?           Question mark            0 or 1 occurrence
POSIX character classes[1]
Class        Meaning
[[:alpha:]]  any letter
[[:digit:]]  any digit
[[:xdigit:]] any hexadecimal digit
[[:alnum:]]  any letter or digit
[[:space:]]  any whitespace character
[[:punct:]]  any punctuation character
[[:lower:]]  any lowercase letter
[[:upper:]]  any uppercase letter
[[:blank:]]  space or tab
[[:graph:]]  visible characters (printable, excluding the space)
[[:cntrl:]]  control characters
[[:print:]]  printable characters, including the space but not the control characters
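
As an illustration of the POSIX classes, here is a minimal sketch using them inside bracket expressions with the preg functions. The sample strings and variable names are made up for the example.

```php
<?php
// [[:digit:]]: the whole string must consist of digits.
$onlyDigits = preg_match('/^[[:digit:]]+$/', '2024');

// [[:xdigit:]]: "Z" is not a hexadecimal digit, so this does not match.
$isHex = preg_match('/^[[:xdigit:]]+$/', 'ff00aZ');

// [[:punct:]]: remove every punctuation character.
$stripped = preg_replace('/[[:punct:]]/', '', 'Hello, world!');

// [[:space:]]: collapse any run of whitespace into a single space.
$collapsed = preg_replace('/[[:space:]]+/', ' ', "a \t\n b");
```

Note that a POSIX class is only valid inside a bracket expression, which is why the brackets are doubled.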
Unicode regex[2]
Expression  Meaning
\A          string start
\b          word boundary (start or end of a word)
\d          digit
\D          non-digit
\s          whitespace character
\S          non-whitespace character
\w          letter, digit or underscore
\W          any character other than a letter, digit or underscore
\X          Unicode extended grapheme cluster (a character with its combining marks)
\z          string end
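
The escape sequences above can be tried out with a short sketch like the following. The sample strings are invented; the last lines show how \z differs from $, which also matches just before a final newline.

```php
<?php
// \b: "cat" must stand alone as a word.
$insideWord = preg_match('/\bcat\b/', 'concatenate');
$wholeWord  = preg_match('/\bcat\b/', 'the cat sat');

// \D: strip everything that is not a digit.
$digits = preg_replace('/\D/', '', '+33 (0)1 23-45');

// \A and \z anchor to the absolute start and end of the string.
$strict  = preg_match('/\A\d+\z/', "123\n"); // the trailing newline prevents a match
$lenient = preg_match('/^\d+$/', "123\n");   // $ accepts a final newline
```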


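  • (?:...): non-capturing group, skipped when the groups are numbered. Ex: ((?:ignored_substring|other).)
  • (?!...): negative lookahead, which succeeds only if the enclosed pattern does not match at this position. Ex: ((?!excluded_substring).)
  • $1: result of the first capture group, usable in a replacement string.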
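
The three constructs in this list can be sketched as follows. The patterns and sample strings are only illustrations, not taken from a real project.

```php
<?php
// (?:...) groups the alternative without creating a numbered capture,
// so the name ends up in group 1.
preg_match('/(?:Mr|Ms)\. (\w+)/', 'Ms. Lovelace', $m);

// (?!...) is a negative lookahead: "foo" matches only if "bar" does not follow.
$followedByBar = preg_match('/foo(?!bar)/', 'foobar');
$followedByBaz = preg_match('/foo(?!bar)/', 'foobaz');

// $1 and $2 refer to the capture groups in a replacement string.
$swapped = preg_replace('/(\w+) (\w+)/', '$2 $1', 'hello world');
```

Here $m holds the whole match at index 0 and the single numbered group at index 1.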

Attention: to search for a literal dollar sign, "\$" in double quotes doesn't work: PHP itself reads \$ as an escaped dollar, so the regex engine only receives a bare $, which means "end of string". Use single quotes instead: '\$'.
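
A small sketch makes the difference between the two quoting styles visible. The price string is an invented example.

```php
<?php
$price = 'Total: $42';

// Single quotes: the regex engine receives \$ and matches a literal dollar sign.
$single = preg_match('/\$\d+/', $price, $m);

// Double quotes: PHP turns \$ into $ before the regex engine sees it,
// so $ acts as the end-of-string anchor and no digit can follow it.
$double = preg_match("/\$\d+/", $price);
```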

In PHP, regex patterns must always be surrounded by a delimiter symbol. This book generally uses the grave accent (`), but / and # are also common.
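
To illustrate why the choice of delimiter matters, here is the same pattern written with three delimiters. The URL is just an example; the point is that a delimiter appearing inside the pattern must be escaped, so picking one that does not occur in the pattern keeps it readable.

```php
<?php
$url = 'https://en.wikibooks.org/wiki/PHP';

// With / as delimiter, every slash of the pattern has to be escaped.
$slash = preg_match('/^https:\/\/[^\/]+\/wiki\//', $url);

// With # or ` as delimiter, the slashes can be written as-is.
$hash = preg_match('#^https://[^/]+/wiki/#', $url);
$tick = preg_match('`^https://[^/]+/wiki/`', $url);
```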

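In addition, some options (modifiers) can be added after the closing delimiter:

i case-insensitive matching
m multiline: ^ and $ also match at the start and end of each line (the option that makes "." match newlines is s)
x ignore whitespace in the pattern
o Perl option (compile the pattern once) that PHP's preg functions do not accept
u treat the pattern and the subject as UTF-8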
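
The effect of the modifiers can be checked with a sketch like this one. It also covers s, the standard PCRE modifier that makes the dot match newlines. The "\u{E9}" escape (the letter é) assumes PHP 7 or later; all sample strings are invented.

```php
<?php
// i: case-insensitive matching.
$i = preg_match('/wikibooks/i', 'WIKIBOOKS');

// m: ^ and $ match at the start and end of every line.
$lines = preg_match_all('/^\w+$/m', "one\ntwo\nthree");

// s: the dot also matches a newline.
$withoutS = preg_match('/a.b/', "a\nb");
$withS    = preg_match('/a.b/s', "a\nb");

// x: whitespace inside the pattern is ignored.
$x = preg_match('/ \d{4} - \d{2} /x', '2024-05');

// u: the subject is read as UTF-8, so the dot matches one character, not one byte.
$asBytes = preg_match('/^.$/', "\u{E9}");
$asUtf8  = preg_match('/^.$/u', "\u{E9}");
```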


The function ereg(), which performed regex searches, was deprecated in PHP 5.3 and removed in PHP 7.0; use preg_match() instead.


The function preg_match[3] is the main regex search function[4]. It returns 1 if the pattern matches, 0 if it does not, and false on error. It takes two mandatory parameters: the regex pattern and the string to scan.

The optional third parameter is a variable that receives the array of results.

Finally, the fourth accepts a PHP flag that modifies the function's default behavior.

  • Minimal example:
$string = 'PHP regex test for the English Wikibooks.';

if (preg_match('`.*Wikibooks.*`', $string)) {
    print('This text talks about Wikibooks');
} else {
    print('This text doesn\'t talk about Wikibooks');
}
  • Advanced example:
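$string = 'PHP regex test for the English Wikibooks.';
$flag = PREG_OFFSET_CAPTURE; // also store the offset of each match

if (preg_match('`.*Wikibooks.*`', $string, $results, $flag)) {
    print_r($results); // matched text together with its offset
} else {
    print('This text doesn\'t talk about Wikibooks');
}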

Flag examples:[5]

  • PREG_OFFSET_CAPTURE: also returns the position (byte offset) of each matched substring in the string.
  • PREG_GREP_INVERT: makes preg_grep() return the elements that do not match the pattern.
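
To show what PREG_OFFSET_CAPTURE changes in practice, here is a minimal sketch reusing the sample sentence from the examples above. With this flag, each entry of the results array becomes a pair of the matched text and its byte offset.

```php
<?php
$string = 'PHP regex test for the English Wikibooks.';

// With the flag, $results[0] is [matched text, byte offset] instead of a plain string.
preg_match('/Wikibooks/', $string, $results, PREG_OFFSET_CAPTURE);

$text   = $results[0][0];
$offset = $results[0][1];
```

The offset counts bytes from the start of the string, which differs from a character count when the text contains multi-byte characters.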


The function preg_grep() searches in arrays[6]: it returns the elements that match the pattern.
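
A possible use of preg_grep(), with and without the PREG_GREP_INVERT flag, is sketched below. The file names are invented. The original array keys are preserved in the result.

```php
<?php
$files = ['index.php', 'style.css', 'app.php', 'logo.png'];

// Keep the entries that end in ".php".
$php = preg_grep('/\.php$/', $files);

// PREG_GREP_INVERT keeps the entries that do NOT match.
$notPhp = preg_grep('/\.php$/', $files, PREG_GREP_INVERT);
```

Wrapping the call in array_values() renumbers the keys from 0 when consecutive indexes are needed.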


To get all the matches in one array, replace preg_match by preg_match_all[7], and print by print_r.

Example extracting every parenthesized substring from a file's content:

$regex = "/\(([^)]*)\)/";
preg_match_all($regex, file_get_contents($filename), $matches);
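
The same pattern can be tried on an inline string rather than a file, which makes the shape of the result easier to see. This is a sketch with an invented sentence; by default, $matches[0] holds the full matches and $matches[1] the content of the first capture group.

```php
<?php
$text = 'PHP (Hypertext Preprocessor) uses PCRE (Perl Compatible Regular Expressions).';

// preg_match_all() returns the number of full matches found.
$count = preg_match_all('/\(([^)]*)\)/', $text, $matches);

$withParentheses    = $matches[0]; // full matches, parentheses included
$withoutParentheses = $matches[1]; // first capture group only
```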



The function preg_replace() accepts three mandatory parameters: the pattern to search for, the replacement, and the string to process.

// Replace spaces by underscores
$string = "PHP regex test for the English Wikibooks.";
$sortedString = preg_replace('`( )`', '_', $string);
echo $sortedString;
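
Beyond the three mandatory parameters, preg_replace() also accepts an optional limit and a variable that receives the number of replacements. The sketch below uses them, and also reorders capture groups in the replacement; the sample strings are invented.

```php
<?php
$string = 'PHP  regex   test';

// Replace each run of whitespace by one underscore.
// -1 means "no limit"; $count receives the number of replacements made.
$result = preg_replace('/\s+/', '_', $string, -1, $count);

// Capture groups can be reordered in the replacement with $1, $2, $3.
$date = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', '2024-05-17');
```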


preg_filter() works like preg_replace(), but its result only includes the subjects in which a replacement took place.
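
The difference between preg_replace() and preg_filter() shows up when the subject is an array. Here is a minimal sketch with invented data: the entry without a digit is kept unchanged by the first function and dropped by the second.

```php
<?php
$subjects = ['apple 1', 'banana', 'cherry 3'];

// preg_replace() returns every subject, modified or not.
$replaced = preg_replace('/\d/', '#', $subjects);

// preg_filter() returns only the subjects where the pattern matched.
$filtered = preg_filter('/\d/', '#', $subjects);
```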


preg_split() splits a string into an array, using a regex as the separator.
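
As a sketch of preg_split(), the following splits on a regex separator and then uses the PREG_SPLIT_NO_EMPTY flag to drop empty pieces. The sample strings are invented.

```php
<?php
// Split on any run of commas and whitespace.
$colors = preg_split('/[\s,]+/', 'red, green  blue,yellow');

// Two adjacent separators produce an empty piece...
$withEmpty = preg_split('/,/', 'a,,b');

// ...unless PREG_SPLIT_NO_EMPTY is passed (-1 means "no limit").
$noEmpty = preg_split('/,/', 'a,,b', -1, PREG_SPLIT_NO_EMPTY);
```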

