|
Meta-Character
|
Description |
|
^
|
This meta-character, the caret, matches the beginning of a
string or, if the /m option is used, the beginning of a
line. It is one of two pattern anchors; the other anchor is the $.
|
|
.
|
This meta-character will match any single character except
for the newline character unless the /s option is specified.
If the /s option is specified, then the newline will also be
matched. |
|
$
|
This meta-character will match the end of a string or, if the
/m option is used, the end of a line. It is one of two pattern
anchors; the other anchor is the ^. |
|
|
|
This meta-character, called alternation, lets you specify
two values, either of which can cause the match to succeed. For instance, m/a|b/
means that the $_ variable must contain the "a" or "b"
character for the match to succeed. |
|
*
|
This meta-character indicates that the "thing" immediately
to the left should be matched zero or more times in order to be evaluated
as true (thus .* matches any number of characters). |
|
+
|
This meta-character indicates that the "thing" immediately
to the left should be matched one or more times in order to be evaluated
as true. |
|
?
|
This meta-character indicates that the "thing" immediately
to the left should be matched zero or one times to be evaluated as true.
When used in conjunction with the *, +, ?, or {n, m}
meta-characters and brackets, it means that the regular expression should
be non-greedy and match the smallest possible string. |
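The behavior of the meta-characters above can be checked with a short script. This is only a minimal sketch: the sample strings and variable names ($text, $tags, and so on) are made up for illustration and are not part of the table.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Without /m, ^ anchors only at the start of the string.
my @plain = $text =~ /^(\w+)/g;          # ("first")
# With /m, ^ also anchors just after each embedded newline.
my @multi = $text =~ /^(\w+)/mg;         # ("first", "second")

# The dot stops at a newline unless /s is given.
my ($no_s)   = $text =~ /(.*)/;          # "first line"
my ($with_s) = $text =~ /(.*)/s;         # the whole string

# Alternation succeeds if either "a" or "b" is found.
my $alt = ("cab" =~ /a|b/) ? 1 : 0;      # 1

# * allows zero occurrences; + requires at least one.
my $star = ("ac" =~ /^ab*c$/) ? 1 : 0;   # 1
my $plus = ("ac" =~ /^ab+c$/) ? 1 : 0;   # 0

# A trailing ? makes a quantifier non-greedy.
my $tags = "<b>bold</b>";
my ($greedy) = $tags =~ /<(.+)>/;        # "b>bold</b"
my ($lazy)   = $tags =~ /<(.+?)>/;       # "b"

print "plain:  @plain\n";
print "multi:  @multi\n";
print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The greedy pattern runs to the last ">" in the string, while the non-greedy form stops at the first one.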
|
Meta-Brackets
|
Description |
|
()
|
The parentheses let you affect the order of pattern evaluation
and act as a form of pattern memory. See the "Special Variables"
chapter for more details. |
|
(?...)
|
If a question mark immediately follows the left parenthesis,
it indicates that an extended mode component is being specified; this
is new to Perl 5. |
|
(?#comment)
|
Extension: comment is any text. |
|
(?:regx)
|
Extension: regx is any regular expression; the parentheses
group it, but the match is not saved as a backreference. |
|
(?=regx)
|
Extension: Allows matching of zero-width positive lookahead
characters (that is, the regular expression must match at this point,
but the text it matches is not returned as part of the match). |
|
(?!regx)
|
Extension: Allows matching of zero-width negative lookahead
characters (that is, the negated form of (?=regx)). |
|
(?options)
|
Extension: Applies the specified options to the pattern,
bypassing the need for the options to be specified in the normal way. Valid
options are: i (case insensitive), m (treat as multiple
lines), s (treat as single line), and x (allow whitespace
and comments). |
|
{n, m}
|
Braces let you specify how many times the "thing"
immediately to the left should be matched. {n} means that it
should be matched exactly n times. {n,} means it must
be matched at least n times. {n, m} means that it must
be matched at least n times but not more than m times.
|
|
[]
|
Square brackets let you create a character class. For instance,
m/[abc]/ evaluates to True if any of "a", "b",
or "c" is contained in $_. The square brackets are
a more readable alternative to the alternation meta-character. |
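The meta-brackets are easier to follow with concrete input. The script below is a hedged sketch; the sample strings and variable names ($date, $prices, and so on) are assumptions chosen only for illustration.

```perl
use strict;
use warnings;

# Capturing parentheses act as pattern memory; {n} sets exact counts.
my $date = "2024-03-15";
my ($y, $m, $d) = $date =~ /^(\d{4})-(\d{2})-(\d{2})$/;

# (?:...) groups for the {2} quantifier but saves nothing,
# so the only capture is the number.
my @caps = "ab-ab-42" =~ /^(?:ab-){2}(\d+)$/;       # ("42")

my $prices = "100USD 200EUR";
# (?=...) requires "EUR" to follow but leaves it out of the match.
my ($eur) = $prices =~ /(\d+)(?=EUR)/;              # "200"
# (?!...) requires that neither a digit nor "EUR" follows.
my ($not_eur) = $prices =~ /(\d+)(?!\d|EUR)/;       # "100"

# (?i) turns on case-insensitive matching inside the pattern.
my $inline = ("HELLO" =~ /(?i)hello/) ? 1 : 0;      # 1

# {n,m} bounds the repeat count.
my $in_range  = ("123"   =~ /^\d{2,4}$/) ? 1 : 0;   # 1
my $too_long  = ("12345" =~ /^\d{2,4}$/) ? 1 : 0;   # 0

# A character class and the equivalent alternation.
my $class = ("xbz" =~ /[abc]/) ? 1 : 0;             # 1
my $alt   = ("xbz" =~ /a|b|c/) ? 1 : 0;             # 1

print "date parts: $y $m $d\n";
print "captures:   @caps\n";
print "lookahead:  $eur, negative lookahead: $not_eur\n";
```

Note that both lookahead forms leave the currency code out of the captured text, because lookahead consumes no characters.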
|
Meta-Sequences
|
Description |
|
\
|
This meta-character "escapes" the character which
follows. This means that any special meaning normally attached to that
character is ignored. For instance, if you need to include a dollar sign
in a pattern, you must use \$ to avoid Perl's variable interpolation.
Use \\ to specify the backslash character in your pattern. |
|
\nnn
|
Any octal byte where nnn represents the octal number;
this allows any character to be specified by its octal number. |
|
\a
|
The alarm character; this is a special character which, when
printed, produces a warning bell sound. |
|
\A
|
This meta-sequence represents the beginning of the string.
Its meaning is not affected by the /m option. |
|
\b
|
This meta-sequence represents the backspace character inside
a character class; otherwise, it represents a word boundary. A word boundary
is the spot between word (\w) and non-word (\W) characters.
For this purpose, Perl treats the imaginary characters just off the
beginning and end of the string as matching \W. |
|
\B
|
Match a non-word boundary. |
|
\cn
|
Any control character where n is the character (for
example, \cY for Ctrl+Y). |
|
\d
|
Match a single digit character. |
|
\D
|
Match a single non-digit character. |
|
\e
|
The escape character. |
|
\E
|
Terminate the \L or \U sequence. |
|
\f
|
The form feed character. |
|
\G
|
Match only where the previous m//g left off. |
|
\l
|
Change the next character to lowercase. |
|
\L
|
Change the following characters to lowercase until a \E
sequence is encountered. |
|
\n
|
The newline character. |
|
\Q
|
Quote regular expression meta-characters literally until the
\E sequence is encountered. |
|
\r
|
The carriage return character. |
|
\s
|
Match a single whitespace character. |
|
\S
|
Match a single non-whitespace character. |
|
\t
|
The tab character. |
|
\u
|
Change the next character to uppercase. |
|
\U
|
Change the following characters to uppercase until a \E
sequence is encountered. |
|
\v
|
Match a single vertical whitespace character, such as a newline
or vertical tab (Perl 5.10 and later). |
|
\w
|
Match a single word character. Word characters are the alphanumeric
and underscore characters. |
|
\W
|
Match a single non-word character. |
|
\xnn
|
Any hexadecimal byte. |
|
\Z
|
This meta-sequence represents the end of the string (or the
position just before a newline at the end of the string). Its meaning
is not affected by the /m option. |
|
\$
|
The dollar character. |
|
\@
|
The at sign character. |
|
\%
|
The percent character. |
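A few of the meta-sequences above are shown in action in the following script. It is a minimal sketch under assumed sample data; none of the strings or variable names come from the table itself.

```perl
use strict;
use warnings;

# \$ matches a literal dollar sign instead of starting a variable.
my ($amount) = 'cost: $25' =~ /\$(\d+)/;            # "25"

# \b matches only at word boundaries, so "concat" is skipped.
my @cats = "cat concat cat." =~ /\bcat\b/g;         # two matches

# \A ignores /m, while ^ does not.
my @caret = "x\ny" =~ /^(\w)/mg;                    # ("x", "y")
my @big_a = "x\ny" =~ /\A(\w)/mg;                   # ("x")

# \Q...\E quotes meta-characters in an interpolated string.
my $literal  = "a.b";
my $quoted   = ("axb" =~ /^\Q$literal\E$/) ? 1 : 0; # 0: the dot is literal
my $unquoted = ("axb" =~ /^$literal$/)     ? 1 : 0; # 1: the dot matches "x"

# \G continues exactly where the previous m//g match left off.
my $str = "12ab";
my @digits;
push @digits, $1 while $str =~ /\G(\d)/g;           # ("1", "2")

# \u, \L, and \E change case in double-quoted strings.
my $cased = "\uhello \LWORLD\E!";                   # "Hello world!"

# \s matches whitespace, including tabs.
my @fields = split /\s+/, "a  b\tc";                # ("a", "b", "c")

print "amount: $amount\n";
print "cats:   ", scalar(@cats), "\n";
print "digits: @digits\n";
print "cased:  $cased\n";
print "fields: @fields\n";
```

The \G loop stops at "a" because \G keeps the match from skipping ahead to a later position in the string.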