On Perl



Wednesday, February 01, 2006

Taint checking

Running 'perl --help' lists a lot of options for the Perl interpreter.
Two of them are:
-t enable tainting warnings
-T enable tainting checks

What is Taint Checking?
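Data can be tainted by a malicious user. In CGI scripts especially, the arguments can be altered and client-side checks bypassed. With -T, Perl treats all data that comes from outside the program as tainted and refuses to use it in operations that affect the outside world, such as running a command, until it has been validated. A good example of SQL manipulation is illustrated here @ Unixwiz.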
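As a rough sketch of what validation looks like under taint mode: the only way Perl lets you untaint a value is to capture the part you trust with a regular expression. The helper name untaint_word and its 32-character limit are made up for this example; run the script with perl -T to see the checks enforced.

```perl
use strict;
use warnings;

# Hypothetical helper: return a validated copy of $input, or undef if the
# input contains anything other than 1 to 32 word characters.
sub untaint_word {
    my ($input) = @_;
    return undef unless defined $input;
    # Under -T, only text captured by a regex ($1 here) counts as untainted.
    return $input =~ /\A(\w{1,32})\z/ ? $1 : undef;
}

for my $candidate ('reports', 'reports; rm -rf /') {
    my $clean = untaint_word($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

The pattern is deliberately strict. A permissive capture such as (.*) would untaint anything and defeat the purpose.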

I have seen many scripts that do what they are supposed to but leave plenty of loopholes to be exploited and misused. That is where smart programmers who can think like crackers save companies and web sites from falling into the abyss without ever knowing that they are sinking!

It pays to glance through your web logs once in a while and see if anyone has been acting funny. My first web-site project was developing LANScan, a web-based search engine for LANs (local networks). I would often read the web logs to see what users were entering as query strings. What was supposed to be a simple English-word search engine was also getting regular-expression searches like '*', '*.*', 'english|English' etc. Clearly, the users were software engineers who were smart and innovative, the kind who try to get the maximum juice out of their efforts. That prompted me to add regular-expression search to LANScan. Initially I had separate input boxes, one for regular English-word search and one for regular-expression search. Then some smarties tried regular expressions in the English-words-only box. "Tinker-tinker-tinker till it breaks" was their golden rule, perhaps. Ultimately I had one box on the site that handled English words, non-English words, regular expressions, badly formed regular expressions and everything else I could think of. Building robust systems is different from just getting the darn thing to work.

Thus, it is important to validate every user input to ensure the script does not perform unexpected operations, and to keep a regular check on your web logs. They tell you what your users want, and that is important!
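A single search box that survives badly formed regular expressions could be sketched like this. It is not LANScan's actual code; build_pattern is a hypothetical helper that tries the query as a regular expression and falls back to a literal match when Perl refuses to compile it.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a raw query string into a compiled pattern.
sub build_pattern {
    my ($query) = @_;
    # qr// dies on a malformed pattern such as '*', so trap the error.
    my $re = eval { qr/$query/i };
    return $re if defined $re;
    # Fall back to matching the query text literally.
    return qr/\Q$query\E/i;
}

for my $query ('english|English', '*', 'file(') {
    my $pattern = build_pattern($query);
    print "query '$query' compiled to $pattern\n";
}
```

This only deals with patterns that fail to compile. A valid but very expensive pattern can still tie up the server, so a real site would probably want a length or time limit as well.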

Friday, July 22, 2005

CGI - Keeping Perl and HTML separate

Whenever you do CGI programming, you should keep your HTML and your scripting (Perl) logic separate.

HTML is the design aspect of your work, while Perl is the logic behind the scenes that gets the data, manipulates it and finally displays it on the HTML page for the end users.

Most of the CGI scripts found freely on the internet don't follow this approach. A cursory glance shows that they look like zebra code: the entire file alternates between HTML and Perl. A typical CGI file has some HTML tags at the beginning, like HEAD, TITLE etc. Then comes some Perl logic that gets the data and modifies it. Some scripts even add a BOLD tag right there while generating the data. Then they concatenate this with other HTML strings.

In order to keep HTML and Perl separate, we'll have a minimum of two files. One is the Perl script, the one with the business logic in it. The other is the HTML content file. This is a perfectly ordinary HTML file, except that it has placeholders where the Perl script will substitute its data.

Eg. Zebra-code: CGI script with both HTML and Perl.

use strict;
use warnings;
use CGI;
my $query = CGI->new;

print <<"EndOfText";
Content-type: text/html

<HTML><HEAD><TITLE>My sample page</TITLE></HEAD>
<BODY>
<H1>Sample Page</H1>
EndOfText

my $name = $query->param('name');
print "Hello <B>$name</B>, how are you doing?<BR>\n";

my $address = get_address_of($name);   # get_address_of() is defined elsewhere
print "We found this as your address: <BR>$address<BR>\n" if ($address);
print "</BODY></HTML>";

Eg. CGI script with HTML and Perl separated
The HTML file (content.htm) is:

<TITLE>My sample page</TITLE>

<H1>Sample Page</H1>
Hello <B> %!name!% </B>, how are you doing?<BR>
We found this as your address: <BR> %!address!% <BR>

Perl script is:

use strict;
use warnings;
use CGI;
my $query = CGI->new;

my $name = $query->param('name');
my $address = get_address_of($name);   # get_address_of() is defined elsewhere

# Read the whole template into one string
open(my $fh, '<', './content.htm') or die "open failed: $!\n";
my $html_str = join('', <$fh>);
close($fh);

# Fill in the placeholders
$html_str =~ s/%!name!%/$name/g;
$html_str =~ s/%!address!%/$address/g;

print "Content-type: text/html\r\n\r\n$html_str";

Which one do you think is easier to read and hence, maintain?
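With more than a couple of placeholders, the substitutions can be driven by a hash instead of one s/// per field. The sketch below assumes the same %!name!% placeholder syntax; fill_template and escape_html are hypothetical helpers, and the escaping step is an addition so that user input cannot inject markup into the page.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: replace every %!key!% placeholder with its escaped
# value; placeholders without a value become an empty string.
sub fill_template {
    my ($template, %values) = @_;
    $template =~ s/%!(\w+)!%/exists $values{$1} ? escape_html($values{$1}) : ''/ge;
    return $template;
}

print fill_template("Hello <B> %!name!% </B>, how are you doing?<BR>\n",
                    name => '<Al & Bo>');
```

For anything larger, a templating module from CPAN is probably a better fit than hand-rolled substitution.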

Monday, July 11, 2005

Password generator

A simple routine to make passwords.

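sub makePassword {
    my $length = shift;
    my @chr = (0..9, 'A'..'Z', 'a'..'z');
    my $passwd = '';                            # lexical, so each call starts empty
    $passwd .= $chr[rand @chr] for 1..$length;  # note: rand() is not cryptographically secure
    return $passwd;
}

print makePassword(10), "\n";
print makePassword(5), "\n";
print makePassword(1), "\n";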

Friday, June 17, 2005

Stages of a Perl Programmer

There is an excellent article by Nathan Torkington on Perl, or rather on the stages in the life of a Perl programmer, from novice to wizard. Like it says, 'You won't come away a wizard, but you'll know what you need to do in order to become one'. A must-read for all Perl mongers.

Thursday, June 16, 2005

sprintf hack

Often, you need to prefix a 0 to maintain a uniform column width.
We generally do it like this using sprintf:
for (0..99) {
    my $output = sprintf("%02d", $_);
    print "$output\n";
}

A better alternative would be
for (0..99) {
    $_ = "0$_" if $_ < 10;
    print "$_\n";
}

Here, $_ is changed only 10 times and we avoid calling sprintf 100 times as in the earlier case.
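Whether the saving matters is easy to measure with the core Benchmark module. The sketch below wraps both approaches in subs (pad_sprintf and pad_concat are names made up for this example) and lets cmpthese print a comparison table. The numbers will vary by machine and Perl version, so none are quoted here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Pad with sprintf on every value.
sub pad_sprintf { return map { sprintf("%02d", $_) } 0..99 }

# Prefix "0" by hand, only for single-digit values.
sub pad_concat  { return map { $_ < 10 ? "0$_" : "$_" } 0..99 }

# Run each version for at least one CPU second and print a comparison table.
cmpthese(-1, {
    sprintf => sub { my @out = pad_sprintf() },
    concat  => sub { my @out = pad_concat() },
});
```

Note that the hand-rolled version only covers two-digit widths, while sprintf extends to any width by changing the format string.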

Trouble with system() in Perl

You have written a script that generates some files, and these files have to be processed by another (Perl or shell) script, called from within your first Perl script using
system("./secondScript.sh argProcessALLFiles /some/dir");

However, something strange is happening. The files are created properly, but when they have to be uploaded, only one is uploaded at a time, and the script waits for a Ctrl-C before moving on to the next one. Is there anything you could do in the Perl script so that it does not wait for the Ctrl-C?

Apparently there is nothing you can do about it in the Perl script. The problem lies in the underlying code of the secondScript.sh shell script. Also note that, as perldoc -f system explains, Perl ignores SIGINT and SIGQUIT during the execution of system(), so a Control-C typed while the second script is running does not terminate your Perl program. If you want it to stop as well, you have to arrange that yourself based on the return value of system().

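A piped open is an alternative to system(). It returns the process ID of the command and lets you write to its standard input, for example the Control-C character:

my $pid = open(my $handle, "| @commands") or die "Can't fork: $!";
print $handle "\x3";   # Hex 03 is Control-C; note there is no comma after the filehandle
close $handle or warn "Couldn't close the pipe: $!";

Keep in mind that the byte reaches the command as ordinary input. Only a terminal turns Control-C into SIGINT, so to actually interrupt the command, send the signal to the returned process ID with kill.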
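Here is a minimal sketch of interrupting a piped command with a real signal instead of a character. The command 'sleep 5' is only a placeholder for the real second script, and the example assumes a Unix-like system.

```perl
use strict;
use warnings;

# Placeholder for the real command.
my @command = ('sleep', '5');

# Make sure SIGINT has its default action; the child inherits this setting.
local $SIG{INT} = 'DEFAULT';

# List-form piped open runs the command without a shell and returns its PID.
my $pid = open(my $handle, '|-', @command) or die "Can't fork: $!";

sleep 1;                 # give the child a moment to start
kill 'INT', $pid;        # deliver SIGINT, as Control-C at a terminal would

close $handle;           # waits for the child and sets $?
my $signal = $? & 127;   # the low 7 bits hold the signal that ended the child
print "child ended by signal $signal\n";
```

Because kill targets only the child's process ID, the Perl script itself keeps running and can decide what to do next.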

Wednesday, June 15, 2005

Installing Perl modules on your system

In order to avoid re-inventing the wheel, Perl has modules, and CPAN is the repository for them. As you program in Perl, you will eventually find yourself needing a module that did not come with your Perl installation. CPAN also has a page on how to install modules.

Apart from what's given there, there are some more methods that should work on both Unix and Windows:

1. Use PPM
Perl Package Manager (PPM) is a Perl script that installs modules on your system. PPM automatically downloads any modules that the one you asked for depends on (or at least prompts you). Here is how to use it.
C:\Perl\bin>perl ppm.pl
PPM interactive shell (2.1.5) - type 'help' for available commands.
PPM> install module-name

You can also use PPM to search for modules like this:
PPM> search MP3
Packages available from http://ppm.ActiveState.com/cgibin/PPM/ppmserver.pl?urn:/
Bundle-MP3 [1.00] A bundle to install all MP3-related modules
MP3-Daemon [0.63] a daemon that possesses mpg123
MP3-ID3v1Tag [1.11] Edit ID3v1 Tags from an Audio MPEG Layer 3.
MP3-Info [1.02] Manipulate / fetch info from MP3 audio files

2. Use CPAN Shell
To invoke the CPAN shell, use this:
perl -MCPAN -e shell

To install and search for modules:
cpan>i /mp3/ /* search for all modules with /mp3/ in the name*/
cpan>i MP3::Info /* show information about this module*/
cpan>install "MP3::Info" /*install the module*/

To see the cpan shell's configurations do:
cpan>o conf

3. Universal Method
Decompress and unpack your module (as CPAN says), using WinZip, gzip, tar etc.

Go to the downloaded module's folder and run the following steps in order, looking out for errors and warnings:
perl Makefile.PL
make
make test
make install

On Windows, you may use nmake or dmake instead of make.

Finally, a Test
After you have installed your module, you may want to test whether it has been installed properly or not. Try these options:
c:\> perldoc MP3::Info /*Should show you the pod page for MP3::Info module*/

Or try compiling a Perl script that has only one line:
use MP3::Info;   # stored in test.pl

perl -c test.pl
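The same check can be done from inside a script, which is handy when you want to test several modules at once. This is a small sketch; module_installed is a hypothetical helper that simply tries to load the module and reports whether that worked.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded.
sub module_installed {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # MP3::Info becomes MP3/Info.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# File::Spec ships with Perl; MP3::Info is the module installed above.
for my $module ('MP3::Info', 'File::Spec') {
    printf "%-12s %s\n", $module,
        module_installed($module) ? 'installed' : 'missing';
}
```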

Tuesday, June 14, 2005

print Quiz

We always take printf, System.out.println, print etc. to be the simplest parts of any programming language. Here is a Perl teaser on print:

What will the following piece of code print?
print ( 2 + 2 ) * 5 ;

Since I have told you it is a teaser, you already know the answer is not going to be 20, i.e. 4*5. So what will it be? Will this piece of code give an error? No.

At this point, you probably want to read the perldoc on print or simply run this code.

It prints 4. The part corresponding to (2+2) only. What happens to the rest? It's all there in perldoc which says:
Also be careful not to follow the print keyword with a left parenthesis unless you want the corresponding right parenthesis to terminate the arguments to the print--interpose a `+' or put parentheses around all the arguments.

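So the argument to print is (2+2) only. The rest, i.e. * 5, multiplies the return value of print, and that result is simply thrown away. Running the code with warnings enabled (perl -w) points out the problem.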
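The two fixes that perldoc suggests, plus a look at where the "* 5" actually goes, can be tried with a short script like this one:

```perl
use strict;
use warnings;

# Fix 1: put parentheses around all the arguments.
print((2 + 2) * 5, "\n");     # prints 20

# Fix 2: interpose a unary plus so the parenthesis no longer ends the arguments.
print +(2 + 2) * 5, "\n";     # prints 20

# What the teaser really computes: print(4) returns 1 on success,
# and that return value is what gets multiplied by 5.
my $result = print(2 + 2) * 5;
print "\nreturn value of print times 5: $result\n";
```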