PATTERN MATCHING in Perl

7	Perl Pattern Matching

Basic Matching

One of the most powerful resources in Perl is its ability to search through text for patterns and to perform operations on the matched substrings. We were introduced to pattern matching earlier, within the vi editor and in our use of split. At its most basic, we can search a string for a specific sequence of characters and get back true or false, depending on whether the pattern we were searching for was found.

Pattern matching functions largely as a subset of Perl's quoting functionality. If we want to match a pattern, we can use the m(Find This) quoting syntax, or we can use the default quotes of /Find This/. The only difference between the two syntaxes is that in the first we choose our own quotes, and in the second the default slash is our quote. The string within the quotes undergoes both standard interpolation, as in a double-quoted string, and non-standard interpolation of the pattern metacharacters described later in this section.

For example, assume we have a file created with the vi editor which looks like this:

With great resolve Pickard looked at the ambassador and said,
"A lie of omission is still a lie." He stormed out of the room
and beamed up to the Enterprise. When he arrive on the ship,
the crew was awaiting him. One look and they all realized that not
everything went well on the planet below

Now we want to search through this file for the word 'Enterprise'. We can accomplish this readily using the following snippet of code:

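# Open the sample file; stop with the system error message if that fails.
open ST, "/home/ruben/startrek" or die "Cannot open file: $!\n";
while(<ST>){
	$line = $_;
	# Report each line that contains the word Enterprise.
	print "Found it!\n" if ($line =~ m!Enterprise!);
}
close ST;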

We can now see the two elements of pattern matching in their natural environment. First we see the binding operator =~, which binds a string to a pattern. Second we see the matching quote m!Enterprise!.

When we use the operator =~, the string on the left of the operator is searched for the pattern described in the matching quote on the right. The return value of the binding operator is a boolean value. Perl has two binding operators.

=~ Does it match? Is the pattern present?

!~ Does it not match? Is the pattern absent?
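
As a minimal sketch of the two binding operators side by side, the snippet below tests one line of the sample text. The variable names and the word Klingon are illustrative only and do not come from the course files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One line borrowed from the sample file above.
my $line = "and beamed up to the Enterprise. When he arrive on the ship,";

# =~ is true when the pattern is found somewhere in the string.
my $has_ship = ($line =~ m/Enterprise/) ? 1 : 0;

# !~ is true when the pattern is NOT found anywhere in the string.
my $no_klingon = ($line !~ m/Klingon/) ? 1 : 0;

print "Found Enterprise\n" if $has_ship;
print "No Klingon here\n"  if $no_klingon;
```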

The quote mechanism can look like any of the following. These all have the same meaning except for the quote character (which is, by definition, a metacharacter and must be escaped with a backslash if it appears inside the pattern).

m!Enterprise! ! is the quote character.

m|Enterprise| | is the quote character.

/Enterprise/ The standard matching quote character is /. The m is not needed.
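
Choosing your own quote character pays off when the pattern itself contains slashes. The following is a small sketch, assuming the file path used earlier as the string to search; the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $path = "/home/ruben/startrek";

# The same pattern written with three different quote characters.
my $with_slash = ($path =~ /\/home\/ruben/) ? 1 : 0;  # each / must be escaped
my $with_bang  = ($path =~ m!/home/ruben!)  ? 1 : 0;  # no escaping needed
my $with_paren = ($path =~ m(/home/ruben))  ? 1 : 0;  # paired quotes also work

print "slash: $with_slash, bang: $with_bang, paren: $with_paren\n";
```

All three tests look for the same text, so all three succeed; only the amount of escaping differs.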

The pattern string is interpolated in the quotes as with any double-quoted string. In this program, for example, we see that in both cases we get a successful match.

#!/usr/bin/perl

 $pat = "Bicycle";
 $string = "Mary likes a bicycle";

 if( $string =~ m!$pat!i ){
 	print "Found Match\n";
 }else{
 	print "Uh Oh!\n";
 }

 if( $string =~ m!Bicycle!i ){
 	print "Found Match\n";
 }else{
 	print "Uh Oh!\n";
 }

Note that the i after the pattern tells Perl to match regardless of the case of the letters. This is a switch to turn on case-insensitive matching.

In addition to the regular interpolation, Perl provides a series of powerful metacharacters to extend its pattern matching capabilities. These metacharacters together form a whole division of programming called regular expressions. Regular expressions are not unique to Perl. They came into wide use on Unix, first in the ed editor, and were extended later to sed, grep, vi and awk, among other places.

Completely mastering regular expressions can take years of practice. Even so, the novice programmer can quickly master a subset of regex to produce powerful and useful code. Let's look first at eleven metacharacters in Perl regex and see how they are used with simple m// syntax.

. Match any character. m/b./ matches ba bb bc ...
"Any character" means that it does not care what character it matches
and retains no memory of what it matched. It does not care what
character is in front of it. It does not match a linefeed.

* Match the previous character zero or more times. m/br*/ will match ba, bb, br and even b. Note that this is not the behaviour of * on the command line. It is used mostly with a dot:
".*" behaves similarly to * in the Bourne shell.

? Match the previous character zero or one time. m/br?/ matches br, brrr, bragley or b (in each case only the leading b or br is the matched text).

+ Match one or more of the previous character. m/br+/ matches brrrrr and brrring but not b.

\ Escape the next character from its default meaning, as usual. m/\./ matches a literal period, as in b. or bbb. or a period alone.

^ Match the beginning of the line. m/^Br/ matches Br but not rBr.

$ Match the end of the line, much as $ stands for the end in vi, where 1,$s/ra/ar/ goes from line one to
the end. m/ben$/ matches Ruben, not Rubin.

\d Match digits. m/\d+/ matches 123, not abc.

\s Match whitespace. m/b\s+b/ matches "b b" but not "bb".

\w Match word characters, that is, alphanumerics and underscores. m/\w+/ matches May_var.

\b Match the word boundary. m/\bring/ matches ring, not bring.

Here is a little program that you can use to help you understand how the metacharacters work. Experimentation and use are the quickest route to mastering these metacharacters.

#!/usr/bin/perl

 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "baaaa";
 $four = "bagly";
 $five = "agly";

 @array = ($zero, $one, $two, $three, $four, $five);

 for $tmp (@array){
 	if($tmp =~ m/(bb+)/){
 		print "$tmp matches\n";
 		print "$1 is the actual matched pattern\n\n";
 	}
 }
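
For a broader experiment, the sketch below runs several of the patterns from the list above against the strings quoted there and compares each result with what the list says should happen. The table layout and variable names are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each entry: [ pattern, string, 1 if the list above says it should match ].
my @cases = (
    [ 'br?',    'b',       1 ],
    [ 'br+',    'b',       0 ],
    [ 'br+',    'brrring', 1 ],
    [ '^Br',    'rBr',     0 ],
    [ 'ben$',   'Ruben',   1 ],
    [ 'ben$',   'Rubin',   0 ],
    [ '\d+',    'abc',     0 ],
    [ 'b\s+b',  'b b',     1 ],
    [ '\bring', 'bring',   0 ],
);

my $failures = 0;
for my $case (@cases) {
    my ($pattern, $string, $expected) = @$case;
    # The pattern is interpolated into the match, as described earlier.
    my $got = ($string =~ m/$pattern/) ? 1 : 0;
    $failures++ if $got != $expected;
    printf "%-8s against %-10s : %s\n",
        $pattern, "'$string'", $got ? "match" : "no match";
}
print "$failures unexpected results\n";
```

Adding your own rows to @cases is a quick way to test a guess about how a metacharacter behaves.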

Perl's regular expression engine also takes switches after the ending quote character. We saw that m/enterprise/i will match without regard to case; 'i' is case-insensitive matching. You can remember the switches with the acronym misx:

m Multiline searching. By default, ^ and $ mark the beginning and end of the entire string.
This switch changes their behaviour so that they mark individual lines instead, matching next to each \n.
If we use the string $starwars = "Once upon a time\n in a galaxy far far away";
and run the match $starwars =~ m/time$/m we match:
'time'. Without the switch, m/time$/ does not match.

i Match case insensitive. Enterprise, enterPrise and enterprise are all viewed as
the same pattern.

s Single-line matching. Changes the default behaviour of the . metacharacter.
The dot does not normally match a \n, to facilitate matching standard input.
This switch changes this behaviour so that the . does match \n.
If we use the string $starwars = "Once upon a time\n in a galaxy far far away";
and run the match $starwars =~ m/time.+galaxy/s we match:
'time\n in a galaxy'. Without the switch, m/time.+galaxy/ does not match.

x Permits white space in the regular expression string
to ease legibility of the code. Permits things like m!\s.+away \.!x
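
The effect of the m, s and x switches can be checked with a short sketch like the one below. It uses the $starwars string from the list; the other variable names and the patterns m/time.+galaxy/ and m/ galaxy \s+ far /x are chosen here purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $starwars = "Once upon a time\n in a galaxy far far away";

# Without /m, $ only matches at the end of the whole string.
my $plain_dollar = ($starwars =~ m/time$/)  ? 1 : 0;
# With /m, $ also matches just before the embedded newline.
my $multi_dollar = ($starwars =~ m/time$/m) ? 1 : 0;

# Without /s the dot will not cross the newline; with /s it will.
my $plain_dot  = ($starwars =~ m/time.+galaxy/)  ? 1 : 0;
my $single_dot = ($starwars =~ m/time.+galaxy/s) ? 1 : 0;

# With /x, the spaces inside the pattern are ignored by the engine.
my $spaced = ($starwars =~ m/ galaxy \s+ far /x) ? 1 : 0;

print "time\$ plain: $plain_dollar, with /m: $multi_dollar\n";
print "dot plain: $plain_dot, with /s: $single_dot\n";
print "spaced pattern with /x: $spaced\n";
```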

Explicit Quantifiers

Curly braces have a meaning within the quoted matching string that is different from their normal Perl meaning. The {#,#} term within the pattern is a general quantifier on the match. The metacharacters *, ?, and + are really special cases of the general quantifier.

For example, if we have a matching quote string of m!Galaxy.*far! we can use this quote instead:

m!Galaxy.{0,}far!

The quantifier applies to the item just before it, normally the last character, and matches it repeatedly according to the following rules:

The first number is the minimum number of times the last character must match.
If there is no second number and no comma, the last character must match exactly that number of times, no more, no less.
If there is a comma following the first number, but no second number, the last character is matched at least the minimum number of times, with no upper limit.
If there are two numbers in the curly braces, separated by a comma, then the last character is matched a minimum of the first number and a maximum of the second number of times.
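
The claim that *, + and ? are special cases of the general quantifier can be checked directly. The following is a sketch, assuming a small helper named first_capture (not part of the course files) that returns whatever the first pair of parentheses captured.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $cow = "moooooooo";   # the same test string the sample programs use

# Hypothetical helper: return the text captured by the first group,
# or undef when the pattern does not match.
sub first_capture {
    my ($string, $pattern) = @_;
    return $string =~ m/$pattern/ ? $1 : undef;
}

# Each pair: the shorthand metacharacter and its curly-brace equivalent.
my @pairs = (
    [ '(mo*)', '(mo{0,})'  ],
    [ '(mo+)', '(mo{1,})'  ],
    [ '(mo?)', '(mo{0,1})' ],
);

my $all_equal = 1;
for my $pair (@pairs) {
    my ($short, $long) = @$pair;
    my $short_match = first_capture($cow, $short);
    my $long_match  = first_capture($cow, $long);
    $all_equal = 0 if $short_match ne $long_match;
    print "$short and $long both capture '$short_match'\n"
        if $short_match eq $long_match;
}

# An exact count has no shorthand: {3} means three o's, no more, no less.
my $exactly_three = first_capture($cow, '(mo{3})');
print "(mo{3}) captures '$exactly_three'\n";
```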

Let's look at the sample program again, slightly modified:

#!/usr/bin/perl 
 
 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 
 @array = ($zero, $one, $two, $three, $four, $five, $six);
 
 for $tmp (@array){
 	if($tmp =~ m/(mo{0})/s){
 		print "$tmp matches\n";
 		print "\'$1\' is the actual matched pattern\n\n";
 	}
 }

ruben@ruben:/home/ruben/perl_course > ./file63.pl
moooooooo matches
'm' is the actual matched pattern

Once upon a time
in a galaxy far far away matches
'm' is the actual matched pattern

#!/usr/bin/perl 
 
 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 
 @array = ($zero, $one, $two, $three, $four, $five, $six);
 
 for $tmp (@array){
 	if($tmp =~ m/(mo{0,})/s){
 		print "$tmp matches\n";
 		print "\'$1\' is the actual matched pattern\n\n";
 	}
 }

ruben@ruben:/home/ruben/perl_course > ./file64.pl
moooooooo matches
'moooooooo' is the actual matched pattern

Once upon a time
in a galaxy far far away matches
'm' is the actual matched pattern

#!/usr/bin/perl 
 
 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 
 @array = ($zero, $one, $two, $three, $four, $five, $six);
 
 for $tmp (@array){
 	if($tmp =~ m/(mo{0,2})/s){
 		print "$tmp matches\n";
 		print "\'$1\' is the actual matched pattern\n\n";
 	}
 }

ruben@ruben:/home/ruben/perl_course > ./file65.pl
moooooooo matches
'moo' is the actual matched pattern

Once upon a time
in a galaxy far far away matches
'm' is the actual matched pattern

#!/usr/bin/perl 
 
 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 
 @array = ($zero, $one, $two, $three, $four, $five, $six);
 
 for $tmp (@array){
 	if($tmp =~ m/(mo{1,3})/s){
 		print "$tmp matches\n";
 		print "\'$1\' is the actual matched pattern\n\n";
 	}
 }

ruben@ruben:/home/ruben/perl_course > ./file66.pl
moooooooo matches
'mooo' is the actual matched pattern

#!/usr/bin/perl 
 
 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 
 @array = ($zero, $one, $two, $three, $four, $five, $six);
 
 for $tmp (@array){
 	if($tmp =~ m/(mo{1,})/s){
 		print "$tmp matches\n";
 		print "\'$1\' is the actual matched pattern\n\n";
 	}
 }

ruben@ruben:/home/ruben/perl_course > ./file67.pl
moooooooo matches
'moooooooo' is the actual matched pattern

Our example program has been using a little trick all along, which I'm sure has been bothering you: we have been returning the actual string matched by our pattern. We can do this by putting parentheses around the part of the pattern we want to capture. The text matched by each set of parentheses is automatically assigned to the magic variables $1, $2, $3, etc. The parentheses also group that part of the pattern together for other purposes, as we will explore in a minute.

This ability to capture a pattern out of a string is very powerful. It is also often abused by lazy programmers who use the pattern matching engine when they should be using one of the string functions like substr. If you need to get the first 3 characters of a string, please use substr. If you need to determine whether a pattern exists which might not, and which can be anywhere in the string, that is a better use of regex operations.
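
To make the contrast concrete, here is a brief sketch, with an illustrative input string and variable names, that captures two groups with a pattern and then takes a fixed-position slice with substr.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "in a galaxy far far away";

# Each pair of parentheses fills the next numbered variable.
my ($noun, $distance);
if ($line =~ m/(galaxy)\s+(far)/) {
    ($noun, $distance) = ($1, $2);
}

# For a fixed position, substr is simpler than a pattern:
# start at offset 0 and take 3 characters.
my $first_three = substr($line, 0, 3);

print "noun: '$noun', distance: '$distance'\n";
print "first three characters: '$first_three'\n";
```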

If parentheses are used with the explicit quantifier, the quantifier applies to the whole grouping in the parentheses. Otherwise, it applies only to the last character.

#!/usr/bin/perl

 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 $seven = "Once upon a time\n in a galaxy far far far far far far far away";

 @array = ($zero, $one, $two, $three, $four, $five, $six, $seven);

 for $tmp (@array){
 	if($tmp =~ m/((far ){1,})/s){
 		print "$tmp matches\n";
 		print "\'$1\' and \'$2\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." fars in our string\n\n";
 	}
 }

ruben@JBSapphire:~/perl_course>./file68.pl
Once upon a time
 in a galaxy far far away matches
'far far ' and 'far ' are the actual matched pattern
There are 2 fars in our string

Once upon a time
 in a galaxy far far far far far far far away matches
'far far far far far far far ' and 'far ' are the actual matched pattern
There are 7 fars in our string

In this example, we have nested parentheses. I wanted to match 'far ' as many times as it appears, and needed the internal parentheses to accomplish this. The external parentheses are assigned to $1 since their opening parenthesis is seen first, and the internal ones to $2. Notice that if we change the {1,} to {4,}, string 6 is not a match and string 7 is! Try it on your own.

Alternate Matching

In addition to default matching, Perl allows you to choose between several patterns using the '|' symbol in your pattern. At each position in the string it tries the first pattern and, if that cannot match there, it then tries the second pattern. Without parentheses, the bar separates everything to its left from everything to its right, so m/far|close/ matches either far or close. If you use parentheses, the alternatives are confined to the parentheses, and the group is treated as a single unit, generically called an atom in regex speak.

Hence:
#!/usr/bin/perl

 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 $seven = "Once upon a time\n in a galaxy far far far far far far far away";

 @array = ($zero, $one, $two, $three, $four, $five, $six, $seven);

 for $tmp (@array){
 	if($tmp =~ m/((far )|(far far))/s){
 		print "$tmp matches\n";
 		print "\'$1\' and \'$2\' and \'$3\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." fars in our string\n\n";
 	}
 }

ruben@JBSapphire:~/perl_course>file69.pl
Once upon a time
 in a galaxy far far away matches
'far ' and 'far ' and '' are the actual matched pattern
There are 1 fars in our string

Once upon a time
 in a galaxy far far far far far far far away matches
'far ' and 'far ' and '' are the actual matched pattern
There are 1 fars in our string

But with a little change:

#!/usr/bin/perl

 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy far far away";
 $seven = "Once upon a time\n in a galaxy far far far far far far far away";

 @array = ($zero, $one, $two, $three, $four, $five, $six, $seven);

 for $tmp (@array){
 	if($tmp =~ m/((Once )|(far far))/s){
 		print "$tmp matches\n";
 		print "\'$1\' and \'$2\' and \'$3\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our string\n\n";
 	}
 }

ruben@JBSapphire:~/perl_course>file69a.pl
Once upon a time
 in a galaxy far far away matches
'Once ' and 'Once ' and '' are the actual matched pattern
There are 1 words in our string

Once upon a time
 in a galaxy far far far far far far far away matches
'Once ' and 'Once ' and '' are the actual matched pattern
There are 1 words in our string

#!/usr/bin/perl

 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time\n in a galaxy close close away";
 $seven = "Once upon a time\n in a galaxy far far far far far far far away";

 @array = ($zero, $one, $two, $three, $four, $five, $six, $seven);

 for $tmp (@array){
 	if($tmp =~ m/((close close )|(far far))/s){
 		print "$tmp matches\n";
 		print "\'$1\' and \'$2\' and \'$3\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our string\n\n";
 	}
 }

ruben@JBSapphire:~/perl_course>file69b.pl
Once upon a time
 in a galaxy close close away matches
'close close ' and 'close close ' and '' are the actual matched pattern
There are 2 words in our string

Once upon a time
 in a galaxy far far far far far far far away matches
'far far' and '' and 'far far' are the actual matched pattern
There are 2 words in our string

Notice and account for the different order that the patterns are assigned when printed.
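
Two points about alternation are easy to test in isolation: the order of the alternatives matters, and without parentheses the bar divides the whole pattern. The sketch below reuses the string from the programs above; the variable names and the "far away" test string are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $seven = "Once upon a time\n in a galaxy far far far far far far far away";

# Alternatives are tried left to right at each position in the string,
# so whichever alternative is listed first wins when both could match.
my ($short_first) = $seven =~ m/(far |far far)/;
my ($long_first)  = $seven =~ m/(far far|far )/;

# Without parentheses the bar splits the WHOLE pattern:
# this reads "^Once" OR "away$".
my $either_end = ("far away" =~ m/^Once|away$/) ? 1 : 0;

print "short alternative first: '$short_first'\n";
print "long alternative first:  '$long_first'\n";
print "either end matched: $either_end\n";
```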

Substitution

Just as we can in the vi editor, we can do substitution within Perl using regular expressions. Instead of the m/pattern/ syntax, we need the s/pattern/replacement/ syntax. With straight matching, the 'm' is optional. With substitution, the 's' is mandatory. The string in the second half of s/regex/string/ is a simple double-quoted string. It does not interpret regex metacharacters.

#!/usr/bin/perl 
 
 $zero = "bbbb";
 $one = "b";
 $two = "brrrr";
 $three = "moooooooo";
 $four = "bagly";
 $five = "agly";
 $six = "Once upon a time ....\n in a galaxy close close away";
 
 @array = ($zero, $one, $two, $three, $four, $five, $six);
 
 for $tmp (@array){
 	if($tmp =~ s/(close close )/far far /){
 		print "$tmp is the matched variable\n";
 		print "\'$1\' is  the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our string\n\n"; 
 	}
 }

is an example of a simple substitution. In the second half of the substitution, one of the things we can do wrong is to include regex syntax, such as parentheses around far:

if($tmp =~ s/(close close )/(far far) /){
if($tmp =~ s/(close close )/far far.* /){
if($tmp =~ s/(close close )/far far\s/){

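are all WRONG. The replacement is not a pattern, so the parentheses, the .* and the \s are not treated as regex syntax there, and the result is not what you intended.

Some things you can put there are regular double-quote items like: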
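
A quick way to see what the replacement half does with such characters is to run the substitution on a copy of the string and print the result. This is only a sketch; the input string and variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $original = "a galaxy close close away";

# Copy the string, then substitute in the copy.
# The replacement is plain text, so the parentheses come out literally.
(my $with_parens = $original) =~ s/(close close )/(far far) /;

# Likewise the .* is copied into the result as two ordinary characters.
(my $with_star = $original) =~ s/(close close )/far far.* /;

print "$with_parens\n";
print "$with_star\n";
```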

if($tmp =~ s/(close close )/far far $tmp/){
if($tmp =~ s/(close close )/far far $1/){
if($tmp =~ s/(close close )/far far $array[0]/){

These are OK and can work. By the way, you can see I used $1 in one example on the OK list. Instead of $1 we can also use what is called a back reference:

if($tmp =~ s/(close close )/far far \1/){

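Personally, I prefer $1. Perl accepts \1 in the replacement for compatibility with sed, but warns about it when warnings are enabled; inside the pattern itself, \1 is the correct form. You might also note that $+ holds what the last bracket matched. $`, $&, and $' are assigned everything before the match, the matched string, and everything after the match, respectively. These are seen less often but are documented in man perlre.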
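
The special variables mentioned here can be inspected right after a successful match. The following sketch uses an illustrative string and variable names, and also shows $1 and $2 being reused in a replacement.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "a galaxy close close away";

my ($before, $matched, $after, $last_group);
if ($text =~ m/(galaxy) (close)/) {
    # $` is the text before the match, $& the match itself,
    # $' the text after it, and $+ the last group that matched.
    ($before, $matched, $after) = ($`, $&, $');
    $last_group = $+;
}

# $1 and $2 can be used in the replacement to reorder the captures.
(my $swapped = $text) =~ s/(galaxy) (close)/$2 $1/;

print "before: '$before', matched: '$matched', after: '$after'\n";
print "last group: '$last_group'\n";
print "swapped: '$swapped'\n";
```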

The general rule for substitution is:

s/PATTERN/REPLACEMENT/egimosx

Substitution has different switches than plain matches. This table explains the use of the switches.

s///i Case-insensitive matching. Just like with plain matching.

s///g Global replacement. Replaces every match of the pattern in the bound variable. Similar to vi.

s///e Evaluate the second half of the substitution as Perl code. This can be most useful, but can also be a security risk. It works similarly to the function eval (perldoc -f eval). Any correct Perl syntax can be put on the right side, and it is evaluated on the fly by Perl.

s///m Multiple lines. As before, changes the default behaviour of '^' and '$' so that they match at each \n.

s///s Single line. As before, it changes the behaviour of '.' so that it matches linefeeds.

s///x Allows white space in your pattern. Works as it does with matches.

s///o Compile regex once. Normally, the regular expression engine will evaluate a pattern on the left. Before doing so, it interpolates any scalars that might be included within it. If you are running it in a loop or under other conditions, the pattern will keep being re-evaluated and the scalars re-accessed. If you don't want this to happen, with its overhead to your program, you can use the s///o switch to evaluate the pattern only once. If the scalars change afterwards, the pattern will not change.
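
Before reading the longer program, it may help to see the g, i and e switches in a few short lines. This is a sketch with made-up strings and variable names; each substitution works on a copy so that the original string stays intact.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "close Close close";

# No switch: only the first match is replaced.
(my $first_only = $text) =~ s/close/far/;

# /g: every match is replaced, but case still matters.
(my $global = $text) =~ s/close/far/g;

# /gi: every match is replaced regardless of case.
(my $global_nocase = $text) =~ s/close/far/gi;

# The substitution returns the number of replacements it made.
my $copy  = $text;
my $count = ($copy =~ s/close/far/gi);

# /e: the replacement is Perl code; here the captured number is doubled.
(my $doubled = "3 fars") =~ s/(\d+)/$1 * 2/e;

print "$first_only\n$global\n$global_nocase\n";
print "$count replacements\n$doubled\n";
```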

This is an example program designed to help you understand what these switches do. Note, when you examine this source code, that $1 and $2 retain the values from the last successful match when a later match fails within a scope like our for loops.

#!/usr/bin/perl

 $zero = "far";
 $one = "close";
 $two = "far far ";
 $three = "Solar System";
 $four = "TIME";
 $five = "print $tmp";
 $six = "Once upon a time ....\n in a galaxy close close close close close away";

 @array = ($zero, $one, $two, $three, $four, $five, $six);

 for $tmp (@array){
 	if($tmp =~ s/((close ){1,})/far far /){
 		print "This is the FIRST altered string:\n$tmp\n";
 		print "\'$1\' and \'$2\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our matched string\n\n";
 	}
 }

 @array = &setright;
 for $tmp (@array){
 	if($tmp =~ s/((close ){1,})/$two/){
 		print "This is the SECOND altered string:\n$tmp\n";
 		print "\'$1\' and \'$2\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our matched string\n\n";
 	}
 }

 @array = &setright;
 for $tmp (@array){
 	if($tmp =~ s/((close ){1,})/"$zero " x5/e){
 		print "This is the THIRD altered string:\n$tmp\n";
 		print "\'$1\' and \'$2\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our matched string\n\n";
 	}
 }

 @array = &setright;
 for $tmp (@array){
 	if($tmp =~ s/((close ){1,})/"$zero " x5/){
 		print "This is the FOURTH altered string:\n$tmp\n";
 		print "\'$1\' and \'$2\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our matched string\n\n";
 	}
 }

 @array = &setright;
 for $tmp (@array){#Notice it is not easy to predict WHEN this will print
 	if($tmp =~ s/((close ){1,})/"$zero " x5; print "$tmp\n";/e){
 		print "This is the FIFTH altered string:\n$tmp\n";
 		print "\'$1\' and \'$2\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our matched string\n\n";
 	}
 }

 @array = &setright;
 $six = $array[6];
 for $tmp (@array){
 	if($tmp =~ s/(($six){1,})/"$zero " x5/e){
 		print "This is the SIXTH altered string:\n$tmp\n";
 		print "\'$1\' and \'$2\' are the actual matched pattern\n";
 		@fars = split /\s+/, $1;
 		print "There are " . @fars ." words in our matched string\n\n";
 	}
 }

 @array = &setright;
 for $tmp (@array){
 	$i++;
 	$place = $tmp;
 	print "This is the SEVENTH sub $i string before attempting to alter it:\n$place\n";
 	$tmp =~ s/(($place){1,})/"$zero " x5/oe;
 	print "This is the SEVENTH sub $i the string after we attempted to alter:\n$tmp\n";
 	print "\'$1\' and \'$2\' are the actual matched pattern\n";
 	@fars = split /\s+/, $1;
 	print "There are " . @fars ." words in our matched string\n\n";
 }# NOTICE THAT $1 and $2 seem to remain 'far' after the initial match
 #and $place within the regex does not alter

 $i = 0;
 @array = &setright;
 for $tmp (@array){
 	$i++;
 	$place = $tmp;
 	$tmp =~ s/(($place){1,})/"$zero " x5/e;
 	print "This is the EIGHTH sub $i altered string:\n$tmp\n";
 	print "\'$1\' and \'$2\' are the actual matched pattern\n";
 	@fars = split /\s+/, $1;
 	print "There are " . @fars ." words in our matched string\n\n";
 }# NOTICE THAT $1 and $2 change

 $i = 0;
 @array = &setright;
 for $tmp (@array){
 	$i++;
 	$place = $tmp;
 	$tmp =~ s/($one)/$zero/g;
 	print "This is the NINTH sub $i altered string:\n$tmp\n";
 	print "\'$1\' is the actual matched pattern\n";
 	@fars = split /\s+/, $1;
 	print "There are " . @fars ." words in our matched string\n\n";
 }

 sub setright{
 	my $zero = "far";
 	my $one = "close";
 	my $two = "far far ";
 	my $three = "Solar System";
 	my $four = "TIME";
 	my $five = "print $tmp";
 	my $six = "Once upon a time ....\n in a galaxy close close close close close away";
 	my @array = ($zero, $one, $two, $three, $four, $five, $six);
 	return @array;
 }

We have seen the use of the bar '|' in regex to give us alternate possible patterns to match. We can take this idea further by matching one of a set of characters with square brackets. Square brackets permit you to describe a class of characters, any one of which may match, or a set of characters that are not allowed to match.

s{[abcd]}{efg} will replace the first occurrence of any one of the letters a, b, c or d with the string efg. You can also use a hyphen in your class to specify a range of characters, such as m/[a-z]/, which will match any character between a and z, but not numbers, capital letters, extended ASCII characters, etc.

You can invert the logic of the match and permit a match of anything but the characters in your class by beginning the class with a caret '^'. m([^.\-&\\]) will match anything but the period, hyphen, ampersand or backslash. You might see such code in security functions. Since '-' has special meaning within the class brackets, escape it with a backslash if you wish to match it.
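
The class examples above can be tried out with a few one-line tests. This is a sketch using short made-up strings; the variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Without /g only the first character from the class is replaced.
(my $first = "cab") =~ s{[abcd]}{efg};
# With /g every character from the class is replaced.
(my $every = "cab") =~ s{[abcd]}{efg}g;

# A range: [a-z] finds no lower-case letter in this string.
my $has_lower = ("ABC123" =~ m/[a-z]/) ? 1 : 0;

# A negated class matches any character NOT listed in it.
# This string holds only the four excluded characters, so nothing matches.
my $only_excluded = (".-&\\" =~ m([^.\-&\\])) ? 1 : 0;
# Here the letter a is outside the excluded set, so the match succeeds.
my $has_other = ("a.b" =~ m([^.\-&\\])) ? 1 : 0;

print "$first $every\n";
print "lower: $has_lower, only excluded: $only_excluded, other: $has_other\n";
```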

You now have a fairly good introduction to pattern matching in Perl for your general needs. Perl regular expressions are much more detailed than what is covered in this section. The two most important documents in your Perl distribution are man perlre and man perlop. Very complex things can be done with regex in Perl, including lookaheads, returned lists, and a host of special variables which alter how your pattern behaves. In truth, Perl regular expressions are so extensive that a full course could be given on the subject.

Be aware that many Perl functions interplay with regular expressions. Two of the most important are split and grep.

split is defined as @array = split PATTERN, $scalar. split is used extensively for data manipulation. It is often the case that you receive data as some delimited string. The Unix /etc/passwd file, which defines users, is an example. It is a colon-delimited text file. If you want to open it and assign each user's record to a database in memory, split is the way to go. Try writing a program that does this; it will use many of the programming techniques that we have learned so far. Can you alter your program to send an email to each user on the list?
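
As a starting point, here is a minimal sketch of the split step. It is not the course's own program: it works on three made-up records in /etc/passwd format rather than the real file, and the hash layout and variable names are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample records in /etc/passwd format, used instead of
# the real file so that the sketch runs anywhere.
my @lines = (
    "root:x:0:0:root:/root:/bin/bash\n",
    "ruben:x:1000:1000:Ruben:/home/ruben:/bin/bash\n",
    "nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin\n",
);

my %users;   # in-memory "database", keyed by login name
for my $line (@lines) {
    chomp $line;
    # A passwd record has seven colon-delimited fields.
    my ($login, $passwd, $uid, $gid, $gecos, $home, $shell) = split /:/, $line;
    $users{$login} = { uid => $uid, home => $home, shell => $shell };
}

# grep with a pattern: keep only the logins whose shell ends in "bash".
my @bash_users = grep { $users{$_}{shell} =~ m/bash$/ } keys %users;
@bash_users = sort @bash_users;

print "$_ lives in $users{$_}{home}\n" for @bash_users;
```

To read the real file, the loop over @lines would be replaced by an open and a while(<FILEHANDLE>) loop like the one at the start of this section.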
