Optimize Perl
By Martin C. Brown, 2005-04-11
Approaching Optimization
First of all, it's worth remembering that Perl is a compiled language. The source code you write is compiled on the fly into bytecode, which is then executed. The bytecode is built from a set of instructions (opcodes), all of which are implemented in highly optimized C. Even so, some operations that achieve similar results are more heavily optimized than others. This means that performance ultimately depends on the combination of the logic sequence you use and the bytecode generated from it.
The differences between certain similar operations can be drastic. Consider the code in Listings 1 and 2. Both create a concatenated string: one through ordinary concatenation, the other by building an array and concatenating it with join.
Listing 1. Concatenating a string, version 1
my $string = 'abcdefghijklmnopqrstuvwxyz';
my $concat = '';
foreach my $count (1..999999)
{
    $concat .= $string;
}
Listing 2. Concatenating a string, version 2
my $string = 'abcdefghijklmnopqrstuvwxyz';
my @concat;
foreach my $count (1..999999)
{
    push @concat, $string;
}
my $concat = join('', @concat);
Running Listing 1, I get a time of 1.765 seconds, whereas Listing 2 requires 5.244 seconds. Both generate the same string, so what's taking up the time? Conventional wisdom (including that of the Perl team) says that concatenating a string is a time-expensive process, because we have to extend the memory allocation for the variable and then copy the existing string and its addition into the new allocation. By contrast, adding a string to an array should be relatively easy. Listing 2 also has the added cost of performing the concatenation anyway through join(), which adds an extra second.
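To check timings like these on your own machine, one option is the core Benchmark module. The following is a minimal sketch, not the harness used for the figures above: the sub names concat_append and concat_join and the reduced $repeat count are assumptions made for illustration, and absolute numbers will differ by Perl version and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $string = 'abcdefghijklmnopqrstuvwxyz';
my $repeat = 100_000;    # assumption: smaller than 999999 to keep each run short

# Listing 1: append directly to the scalar
sub concat_append {
    my $concat = '';
    foreach my $count (1 .. $repeat) {
        $concat .= $string;
    }
    return $concat;
}

# Listing 2: collect the pieces in an array, then join once
sub concat_join {
    my @concat;
    foreach my $count (1 .. $repeat) {
        push @concat, $string;
    }
    return join('', @concat);
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds, then print a comparison table
cmpthese(-1, {
    append => \&concat_append,
    join   => \&concat_join,
});
```

The table that cmpthese prints shows the rate of each version and the percentage difference between them, which is usually easier to compare than two wall-clock times.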
The problem, in this instance, is that push()-ing strings onto an array is time-intensive. First, there is a function call (which means pushing items onto a stack and then taking them off), and then there is the additional array management overhead. In contrast, concatenating a string is pretty much a matter of running a single opcode that appends one string variable to an existing one. Even if we set the array size in advance to alleviate the overhead (using $#concat = 999999), we save only about another second.
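If you want to see the opcodes involved, the core B::Concise module can list them for a given sub. The sketch below is an illustration only: append_op and push_op are hypothetical one-statement wrappers around the operations from Listings 1 and 2, and the exact opcode names in the output vary between Perl versions.

```perl
use strict;
use warnings;
use B::Concise ();

# Hypothetical one-statement subs wrapping the two operations
my ($string, $concat, @concat) = ('abcdefghijklmnopqrstuvwxyz', '');
sub append_op { $concat .= $string }
sub push_op   { push @concat, $string }

# Build a walker for both subs; -exec lists ops in execution order
my $walker = B::Concise::compile('-exec', \&append_op, \&push_op);

# Capture the listing in a scalar instead of writing it to STDOUT
B::Concise::walk_output(\my $listing);
$walker->();

print $listing;
```

Comparing the two listings gives a rough feel for how much stack and array handling surrounds the push compared with the append, although the listing alone does not tell you how long each opcode takes to run.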
The above is an extreme example, and there are times when using an array will be much quicker than using strings; a good example is when you need to reuse a particular sequence in a different order or with a different interstitial character. Arrays are also useful, of course, if you want to rearrange or reorder the contents. By the way, in this example, an even quicker way of producing a string that repeats the alphabet 999,999 times would be to use:
$concat = 'abcdefghijklmnopqrstuvwxyz' x 999999;
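The repetition operator takes the string on its left and the count on its right. As a quick sanity check, the sketch below compares it with the loop from Listing 1 and also shows the array case mentioned above, where the same pieces are joined with a different separator or in a different order. The small $repeat value and the variable names are assumptions chosen to keep the example short.

```perl
use strict;
use warnings;

my $string = 'abcdefghijklmnopqrstuvwxyz';
my $repeat = 1000;    # assumption: small count for a quick check

# Repetition operator: string on the left, count on the right
my $by_repeat = $string x $repeat;

# Same result built with the append loop from Listing 1
my $by_append = '';
$by_append .= $string for 1 .. $repeat;

print "identical\n" if $by_repeat eq $by_append;    # prints "identical"
print length($by_repeat), "\n";                     # prints 26000

# An array pays off when the same pieces are reused in other forms
my @parts    = ('abc', 'def', 'ghi');
my $plain    = join('',  @parts);            # "abcdefghi"
my $dashed   = join('-', @parts);            # "abc-def-ghi"
my $reversed = join('-', reverse @parts);    # "ghi-def-abc"
print "$plain $dashed $reversed\n";
```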
Individually, many of the techniques covered here won't make a huge difference, but combined in one application, they could shave a few hundred milliseconds, or even seconds, off its run time.
First published by IBM DeveloperWorks
