regular expression (thing)
(thing) by Kung, Mon Jul 10 2000 at 10:03:59
A regular expression is a string of characters that defines a set of one or more other strings. Any string in that set is said to match the expression.
Regular expressions are implemented by a number of different languages and tools, and unfortunately each implementation tends to be slightly different. This writeup attempts to be a general overview of REs.
Delimiters
A delimiter is a special character that is used to mark the beginning and end of a regular expression. A common delimiter is /.
The most basic regular expression contains no special characters other than the delimiter and matches only itself.
For example, /ring/ matches the ring in spring, ringing and stringing.
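As a rough illustration, here is a minimal sketch of the literal match using Python's standard `re` module. Python is an assumption here, since the writeup is not tied to any one tool. Python takes the pattern as an ordinary string, so the / delimiters are dropped. The word `rang` is added only to show a non-match.

```python
import re

# Python's re module takes the pattern as a string, so no / delimiters are used.
pattern = re.compile("ring")

# "rang" is not from the writeup; it is here only as a counterexample.
words = ["spring", "ringing", "stringing", "rang"]

# Keep every word in which the literal text "ring" occurs somewhere.
matched = [w for w in words if pattern.search(w)]
print(matched)
```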
To get a regular expression to match more than one string, you use special characters that have special meaning when part of a regular expression.
The following lists special characters and their meanings, together with examples. Each example states which part of a string is matched; the rest of the string is not. The examples build on each other.
. (period)
Matches any single character.
For example, /.alk/ matches any string made of a single character followed by 'alk', as in balk or the talk in talking.
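The period can be tried out with a short sketch, again assuming Python's `re` module rather than any tool named in the writeup. Note that in Python the period does not match a newline unless a flag says otherwise.

```python
import re

# "." stands for any single character (except a newline, by default in Python).
pattern = re.compile(".alk")

# Collect the text that actually matched inside each word.
found = [pattern.search(s).group() for s in ["balk", "talking"]]
print(found)

# A bare "alk" has no character in front of it, so nothing matches.
no_match = pattern.search("alk")
print(no_match)
```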
* (asterisk)
An asterisk will match zero or more occurrences of the character directly before it. Note that the character directly before it can be defined by a regular expression.
For example, /ab.*c/ matches ab followed by zero or more occurrences of any character followed by c, as in abc or abjhgt gfafdg 43543 fgd c.
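A small sketch of the asterisk, assuming Python's `re` module. The third input, `ab`, is not from the writeup and is included only to show that the trailing c is still required.

```python
import re

pattern = re.compile("ab.*c")

# Map each input to the text that matched, or None when nothing matched.
results = {}
for text in ["abc", "abjhgt gfafdg 43543 fgd c", "ab"]:
    m = pattern.search(text)
    results[text] = m.group() if m else None

for text, matched in results.items():
    print(repr(text), "->", repr(matched))
```

In the first input `.*` matches zero characters; in the second it matches everything between `ab` and the final `c`.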
^ (caret)
Causes the regular expression to match only strings at the beginning of a line.
For example, /^T/ matches a T at the start of a line, as in the T that begins 'This line', but not a T anywhere else in a line.
$ (dollar sign)
Causes the regular expression to match only strings at the end of a line.
For example, /:$/ matches any colon that ends a line, as in the colon of 'this line:', but not the colon in ': this one'.
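The caret and dollar sign anchors can be sketched together. This assumes Python's `re` module and tests each line separately, which is roughly how line-oriented tools apply a pattern. The sample lines are adapted from the examples above.

```python
import re

lines = [
    "This line",
    "not This line",
    "this line:",
    "but not : this one",
]

# ^T only matches a T that is the first character of the line.
starts_with_t = [ln for ln in lines if re.search("^T", ln)]

# :$ only matches a colon that is the last character of the line.
ends_with_colon = [ln for ln in lines if re.search(":$", ln)]

print(starts_with_t)
print(ends_with_colon)
```

When searching one multi-line string instead of separate lines, Python needs the `re.MULTILINE` flag for `^` and `$` to apply at every line boundary.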
[] (square brackets)
Define a character class that matches any single character within the brackets.
For example, /t[aeiou].k/ matches t followed by a lowercase vowel, any character and a k, as in talk, the tink in stink, or teak.
Within square brackets, *, / and $ lose their special meanings.
If the first character following [ is a ^, it has a new meaning: the character class now matches any single character not within the brackets.
Also, you can use a - to denote a range of characters.
For example, /[^a-zA-Z]/ matches any single character that is not a letter.
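The character class rules above can be checked with a short sketch, assuming Python's `re` module. The strings `abc 123, xyz!` and `a*b$c` are made up for illustration.

```python
import re

# A class matches exactly one character from the listed set.
vowel_class = re.compile("t[aeiou].k")
found = [vowel_class.search(w).group() for w in ["talk", "stink", "teak"]]
print(found)

# A leading ^ negates the class; a-z and A-Z are ranges.
non_letters = re.findall("[^a-zA-Z]", "abc 123, xyz!")
print(non_letters)

# Inside brackets, * and $ are ordinary characters.
literals = re.findall("[*$]", "a*b$c")
print(literals)
```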
Turning Special Characters Off
You can turn a special character off by preceding it with a \ (backslash). This is known as quoting.
For example, /\*/ matches a single asterisk, /\\/ matches a single backslash, and /and\/or/ matches and/or.
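Quoting looks slightly different in Python, which is assumed here purely for illustration. There is no / delimiter, so a slash needs no backslash, and raw strings (`r"..."`) keep Python itself from interpreting the backslashes. The input strings are invented for the example.

```python
import re

# A quoted asterisk matches a literal "*".
star = re.search(r"\*", "2 * 3").group()

# A quoted backslash matches a single literal backslash.
backslash = re.search(r"\\", r"C:\temp").group()

# "/" is not a delimiter in Python, so it is written without a backslash.
and_or = re.search("and/or", "this and/or that").group()

# re.escape quotes the special characters in a literal string for you.
escaped = re.escape("a.b*c")

print(star, backslash, and_or, escaped)
```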
Longest Match Possible
A regular expression will always match the longest match possible.
For example, given the following string:
This (Dman) is a quite ( opinionated young fellow ), isn't he?
/Th.*is/ matches
This (Dman) is a quite ( opinionated young fellow ), is
and /(.*)/ matches
(Dman) is a quite ( opinionated young fellow )
while /([^)]*)/ matches
(Dman)
NOTE: YMMV. For example, in Perl the longest-match rule doesn't hold in general. Repetition with * is still greedy, but when a pattern offers alternatives Perl takes the first one that matches rather than the longest. Also, ( is a special character in Perl's implementation, so it would need to be escaped using \.
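The example string can be run through Python's `re` module, assumed here because its syntax follows Perl's: literal parentheses must be escaped, and `*` is greedy. The last line uses an invented pattern, `a|ab`, to show a case where this style of engine stops at the first alternative that works instead of the longest one.

```python
import re

text = "This (Dman) is a quite ( opinionated young fellow ), isn't he?"

# .* is greedy: it runs as far right as it can while "is" can still follow.
greedy_this = re.search(r"Th.*is", text).group()

# Literal parentheses are escaped; .* stretches to the LAST ")".
greedy_parens = re.search(r"\(.*\)", text).group()

# [^)]* cannot cross a ")", so the match stops at the first one.
bounded_parens = re.search(r"\([^)]*\)", text).group()

# With alternation, the first alternative that matches wins, not the longest.
first_alternative = re.search(r"a|ab", "ab").group()

print(greedy_this)
print(greedy_parens)
print(bounded_parens)
print(first_alternative)
```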