6

Structured Programming Technique and Subroutines


Until now we have been looking at individual aspects of Perl Programming with examining techniques to use them together to create functional code. Now, with the detailed discussion of subroutines upon us, we are going to examine the foundation for sound program design.

In the early years of programming, people often created programs in a haphazard fashion, using a lot of goto statements to navigate around the code in an chaotic fashion. As programs became bigger, new techniques were introduced to allow the breaking up of the program into logical chuncks. They also provided for the seperation of different aspects of a program into seperate files. Perl inherits both of these inovations with the use of packages and subroutines. In this section we are going to focus on how subroutines enable us to create rational code. We will also look closer on how scoping varriables help reducing confussion in the maintemence of your programs.

As we saw previously, we can create subroutines by using the keyword sub in front of curly braces. As with previous uses of curly braces, within the braces is a scopable memory space. Unlike while and for loops, scoping varriables within subroutines is of great importance. By using scoped varriables within the subroutine, we can focus on our computing problems within them relatively carefree.

Anything we want to send to our subroutines from the outside program is best accomplished by the use of parameters. Let's look at an example:

 
#!/usr/bin/perl -w
 use strict;
 
 #This program creates a database on the filesystem from user input
 
 print "Welcome to your address book.\nPlease enter the prompted information
 for each record you want entered.\n";
 
 while(1){
 	my $record = &gather();
 	&place($record);
 	&cont() or exit;
 }
 
 sub gather{
 	my($last,$first,$phone,$email, $string);
 	print "Please Enter The Last Name==>";
 	$last = <>;
 	chomp $last;
 	print "Please Enter The First Name==>";
 	$first = <>;
 	chomp $first;
 	print "Please enter the Phone Number==>";
 	$phone = <>;
 	chomp $phone;
 	print "Please enter the email address==>";
 	$email = <>;
 	chomp $email;
 	$string = join "\t", $last, $first, $phone, $email;
 	return $string;
 }
 
 sub place{
 	my($info) = shift @_;
 	my($home) = $ENV{HOME}; #retrieve home from the external env
 	if( -e "$home/rolodex.db"){
 		open FH, ">>$home/rolodex.db" or die $!;
 	}else{
 	open FH, ">$home/rolodex.db" or die $!;
 	}
 	print FH "$info\n" or die $!;
 	close FH;
 }
 	
 sub cont{
 	print "Should we continue? (y or n )\n";
 	my($go);
 	$go = <>;
 	chomp $go;
 	print "$go:\n";
 	return 0 if $go eq 'n';
         return 1;
 }      
 	

This program has only 3 lines in the main loop. It might take some time, but it is worth understanding every line in the program. First notice that we have 3 functions in this program that do the bulk of the work for us. Each one is a logical block of code.

The first subroutine is called gather. gather collects information from the user and presents it as a tab dilimited string. The second subroutine is called place. It puts the string into a database safetly. The last subroutine is called cont, and it determines if the user wants to continue or end the program. On line 1 of our program we are giving the Perl interpreter a directive with the flag -w. This tells Perl to give us extra debugging information in event we make an error in our program. On the second line we do something completely new. The line -

use strict;

imports into our program an external file called a module. This module is commonly used in advanced perl programming. It restricts some of Perl's default behavior, forcing us to write tighter code which is less likely to contain bugs. One thing strict prevents is the use of unscoped varriables. If you want to use a global varriable with strict turned on, you need to specify it's package. For example $main::record = 3 is accecpted. $record = 3 causes an error with strict turned on. my($record) = 3 is accepted.

Line 10 calls the subroutine gather and assigns it's return value in the lexically scoped scalar $record. Lexically scoped means that it we use my, and that it is not in the packages symbol table. On line 11, the string within record is sent as a parameter to the subroutine place. All parameters that are sent to a subroutine are placed in the array @_. Line 12 calls the subroutine cont() and it's return value is evaluated just like it is when we call the built in function open. If it returns 0, we exit the program.

Subroutines are also stored in the namespace symbol table. They can be stored as references and can be deferenced.

$ref_sub = sub gather;
&$ref_sub($argument);

This can be handy for passing subroutine to other subroutines, or other constructions. It also demonstrates that subroutines are actualy stored in memnory just like varriables, but instead of raw data, their value is tibits of executable code. In the subroutine place, on line 34, the scalar arguement stored inside @_ is shifted off the array and stored in the lexically scoped scalar $info. The @_ array is a magic system varriable. The parameters it contain are referenced by value to the exterior just like the magic user defined varriable in the for loop. If you change any of the elements of the array, by using notation like $_[0], you change the parameter that exists outside of the function. The parameters are not copied and assigned into @_, but directly access the data. To aviod changing the values of the parameters, normaly we either copy the data through assignment, or shift the values off the array using shift. If we assign the data to a my scoped varriable, the memory is returned after the function is completed and the variable goes out of scope.

Line 35 uses a system hash called %ENV. It contains the enviormental varriables of the perl process. One of these keys store the users home directory. We capture that information and check to see if the file rolodex.db exists. The operator which tests for the existence of the file is -e. -e is a unary operator which returns true if the specified file exsts, or false if it doesn't exist. In addition to -e, we also have -r, and -w to test if a file can be read or written to. The man perlfunc pages lists a fw more such varriables that test or return other possible information about files.

To quote the manual:

  		   -r  File is readable by effective uid/gid.
                   -w  File is writable by effective uid/gid.
                   -x  File is executable by effective uid/gid.
                   -o  File is owned by effective uid.

                   -R  File is readable by real uid/gid.
                   -W  File is writable by real uid/gid.
                   -X  File is executable by real uid/gid.
                   -O  File is owned by real uid.

                   -e  File exists.
                   -z  File has zero size.
                   -s  File has nonzero size (returns size).

                   -f  File is a plain file.
                   -d  File is a directory.
                   -l  File is a symbolic link.
                   -p  File is a named pipe (FIFO).
                   -S  File is a socket.
                   -b  File is a block special file.
                   -c  File is a character special file.
                   -t  Filehandle is opened to a tty.

                   -u  File has setuid bit set.
                   -u  File has setuid bit set.
                   -g  File has setgid bit set.
                   -k  File has sticky bit set.

                   -T  File is a text file.
                   -B  File is a binary file (opposite of -T).

                   -M  Age of file in days when script started.
                   -A  Same for access time.
                   -C  Same for inode change time.

As you learn more about the Unix operating system and become more sufisticated programers, more of these unary file operators become useful. In this case, we just wanted to know if the file existed or not before blowing the file away by accident rather than appending to an existing file.

On line 41, we checked the return value of print to make sure our write to the hard drive took effect. When printing to STDOUT, it doesn't matter much. But when writing to a hard drive, this is prudent.

Recall that a subroutine is a process. It takes in parameter, and returns a value. All subroutines have a declaration and a definition. In a classic Perl function, the function is called with a & and the parameters are entered in a list as follows:

&sub myroutine($par1, $par2 ...)

The parameters are a list. And it behaves like a list. If an array is passed, it is reduced to scalar values before being passed.

&sub myroutine($par1, $par2, $par3, $par4 );

is the same as

@myarray = ($par3, $par4);
&sub myroutine($par1, @myarray, $par4);

Return values are also flattened to a single list. Subroutines return the last value evaluated in the definition block. You don't need to use a return keyword at the end of the block, although some consider this good programming practice. Often the question comes, how do I pass an array as an arguement, or return 2 hashes. In truth, you can't. Both the input of a function and the output is a flat list. If you try to return a hash, it converts to an array which is simple enough to reform into a hash. Returning two hashes requires some trickery. The easiest way to do this is to simply to pass the arrays or hashes as references and then dereference them. This also removes some of the overhead of copying over the value of the array to a new varriable.

 
 #!/usr/bin/perl
 
 @words = qw(Life is too Short);
 
 $ret = &responder(\@words);
 print "@{$ret}\n";
 
 sub responder{
 	my $reader = shift(@_);
 	@reply =qw"Be Happy!";
 	print "@{$reader}\n";
 	return \@reply;
 }
 
Today, the use of the & in the calling function is not needed. This is one of the previous examples without the & symbol.

 
 #!/usr/bin/perl -w
 use strict;
 
 #This program creates a database on the filesystem from user input
 
 print "Welcome to your address book.\nPlease enter the prompted information
 for each record you want entered.\n";
 
 while(1){
 	my $record = &gather();
 	&place($record);
 	&cont() or exit;
 }
 
 sub gather{
 	my($last,$first,$phone,$email, $string);
 	print "Please Enter The Last Name==>";
 	$last = <>;
 	chomp $last;
 	print "Please Enter The First Name==>";
 	$first = <>;
 	chomp $first;
 	print "Please enter the Phone Number==>";
 	$phone = <>;
 	chomp $phone;
 	print "Please enter the email address==>";
 	$email = <>;
 	chomp $email;
 	$string = join "\t", $last, $first, $phone, $email;
 	return $string;
 }
 
 sub place{
 	my($info) = shift @_;
 	my($home) = $ENV{HOME}; #retrieve home from the external env
 	if( -e "$home/rolodex.db"){
 		open FH, ">>$home/rolodex.db" or die $!;
 	}else{
 	open FH, ">$home/rolodex.db" or die $!;
 	}
 	print FH "$info\n" or die $!;
 	close FH;
 }
 	
 sub cont{
 	print "Should we continue? (y or n )\n";
 	my($go);
 	$go = <>;
 	chomp $go;
 	print "$go:\n";
 	return 0 if $go eq 'n';
         return 1;
 }      


If you predeclare a sub routine you are allowed to use it without parentheses. Predeclaring it is simply using the keyword sub with the symbol you are using it access the code.

In order to enforce that parameters send to your function are of a specific type, you can prototype the parameters in the function declaration. The simplist prototype is one that takes a scalar or two. We can create a prototyped subroutine that must take to scalars as follows:

sub lowers($);

It is called as follows:

$string = "Man does not live by bread alone\n";
lowers $string;

Do not call the prototyped subroutine with an Amperstand (&) as it causes the function call to ignor the parameter contraints.

If we include a semicolon in the prototype, we are adding an optional parameter to the declaration. Hence:

sub lowers($;$);
can be called with:

lowers($string1, $string2); or lowers($string1);

The prototype:

sub lowers($$); must be called with two strings:

lowers($string1, $string2);

If you try to call it with one scalar value, it fails.

In addition to protyping scalars, we also can prototype for an array, a hash, typeglobs or a function using their dereferencing symbols( @, %, &, *). But, beaware that like all list assignments, once you prototype for an array, the array absorbs the entire rest of the list.

sub lowers($@); is alright.
sub lowers(@$) is completely sensless because the scalar will never be filled.
Lastly, we can prototype with the slash like:

sub lower(\$\@\$);

The slash tells the function call that the next character must absolutely be the first character of the arguement and then passes a reference of the arguement to your routine. For example:


 #!/usr/bin/perl -w
 
 use diagnostics;
 use strict;
 
 #array to sort, datatype, direction
 sub smartsort(\@\$\$);
 my($data, $direct);
 
 my(@words) = qw(Alpha Beta Charlie Delta Echo Foxtrot);
 my(@numbers) = qw(10 9 8 7 6 5 4 3 2 1 0);
 #############################################
 $data = "strings";
 $direct = "reverse";
 
 my(@sorted) = smartsort @words,$data,$direct; 
 print "@sorted \n";
 #############################################
 $data = "numbers";
 $direct = "foward";
 
 @sorted = smartsort @numbers,$data,$direct; 
 print "@sorted \n";
 
 
 sub smartsort(\@\$\$){
     my $array_ref = shift @_;
     my $data_ref = shift @_;
     my $direct_ref = shift @_;
 
 CHOOSE:{
                if($$data_ref eq "strings" and $$direct_ref eq "foward"){
                        return sort @$array_ref;
                }
                
                if($$data_ref eq "strings" and $$direct_ref eq "reverse"){
                        return reverse sort @$array_ref;
                }
                
                if($$data_ref eq "numbers" and $$direct_ref eq "foward"){
                        return sort {$a <=> $b} @$array_ref;
                }
                
                if($$data_ref eq "numbers" and $$direct_ref eq "reverse"){
                        return sort {$b <=> $a} @$array_ref;
                }
                print STDERR "I don't know what happened\n";
                print STDERR "\$\$data_ref $$data_ref \$\$direct_ref $$direct_ref\n";
        }
 }


External Functions

Large programs and professional programmers make extenssive use of external packages to create their programs. Using external packages with a set of functions permits us to recycle code and organize our programs further in a structured programming motif. Let's create a program that interfaces with our external environment. The idea is that we are creating a package of functions which permit us to to do a variety of things to control our enviorment. We need to first create our main program as usual. Inside our main program we use the key word 'use'We can also use the key word 'require', but use is generally better for reasons we'll pass on discussing until we become more familiar with the syntax and use of modules.
  1. #!/usr/bin/perl -w
  2. use lib '/home/ruben/perl_course' #CHANGE TO YOUR HOME DIRECTORY
  3. use MYWORLD;
  4. use strict;
  5. use diagnostics;
  6. &report;
  1. #This Module contains routines which manipulate and report on my
  2. #environment. Version 1.0 11/30/1999
  3. sub report{
  4. print "Parameter\tSetting";
  5. for $tmp (sort(keys %ENV)){
  6. print"$tmp\t$ENV{$tmp}\n";
  7. }
  8. }
  9. return 1;

When we write our package, the last line should always return a 'true' value. This package has only one function in it so far. We tell Perl that we that we need to locate our module in all it's default paths, (such as where the package strict and diagnostics are stored), and to also look in /home/ruben. We do this with the 'use lib' directive. Then we use the module with 'use MYWORLD' MYWORLD.pm is the name of the file under /home/ruben. The functions inside of the file are automatically loaded into our name space. This is only true because we failed to give a package command within our module. If use use package in our module, the name spaces are completely seperated unless we do something to export the functions into the namespace of the calling package (main in these simple cases).


 #!/usr/bin/perl -w
 
 use strict;
 use diagnostics;
 use lib '/home/ruben/';
 use MYWORLD2;
 
 #my($home,$host,$path);
 MYWORLD2->report;
 


 #This Module contains routines which manipulate and report on my
 #environment.  Version 1.0 11/30/1999
 package MYWORLD2;
 sub report{
 	print "Parameter\tSetting\n";
 	for $tmp (sort(keys %ENV)){
 		print"$tmp\t$ENV{$tmp}\n";
 	}	
 }
 
 
 
 
 
 
 return 1;

In this case, if we try to use &record without refering to the package first, our program bombs because strict prevents us from spontaneously creating the the symbol &main::record on the fly. There are ways of moving varriables from package MYWORLD2 to package main on command. But this is beyound the scope of this section. We can, however, change our default package with the package command. See this program for example:


 #!/usr/bin/perl -w
 
 use strict;
 use diagnostics;
 use lib '/home/ruben/';
 use MYWORLD2;
 
 #my($home,$host,$path);
 #MYWORLD->report;
 package MYWORLD2;
 
 report();
 


NEXT: Regular Expressiosn