Helping ordinary people create extraordinary websites!
HOME TUTORIALS SCRIPTS WEB HOSTING BLOG FORUM
Get Our Newsletter
Email:

Optimize Perl

By Martin C. Brown
2005-04-11


Sorts

Another common operation related to loops is sorting information, particularly keys in a hash. It's tempting in this instance to embed some processing of list elements into the sort operation, such as the one shown here in Listing 5.

Listing 5. Bad sorting
my @marksorted = sort {sprintf('%s%s%s',

$marked_items->{$b}->{'upddate'},
$marked_items->{$b}->{'updtime'},
$marked_items->{$a}->{itemid}) <=>
sprintf('%s%s%s',
$marked_items->{$a}->{'upddate'},
$marked_items->{$a}->{'updtime'},
$marked_items->{$a}->{itemid}) } keys %{$marked_items};
This is a fairly typical sort of complex data, in this case ordering something by date, time, and ID number by concatenating the numbers into a single number that we can then sort numerically. The problem is that the sort works through the list of items and moves them up or down through the list based on the comparison operation. In effect, this is a type of loop, but unlike the loop examples we've already seen, a sprintf call has to be made for each comparison. That's at least twice for each iteration, and the exact number of iterations through the list will depend how ordered it was to begin with. For example, with a 10,000-item list you could expect to call sprintf over 240,000 times.

The solution is to create a list that contains the sort information, and generate the sort field information just once. Taking the sample in Listing 5 as a guide, I'd rewrite that fragment into something like the code in Listing 6.

Listing 6. Better sorting
map { $marked_items->{$_}->{sort} = sprintf('%s%s%s',

$marked_items->{$_}->{'upddate'},
$marked_items->{$_}->{'updtime'},
$marked_items->{$_}->{itemid}) } keys %{$marked_items};
my @marksorted = sort { $marked_items->{$b}->{sort} <=>
$marked_items->{$a}->{sort} } keys %{$marked_items};
Instead of calling sprintf all those times, we call it just once for each item in the hash in order to generate a sort field in the hash, and then use that sort field directly during the sort. The sorting process only has to access the sort field's value. You have cut down the calls on that 10,000-item hash from 240,000 to just 10,000. It depends on what you are doing in that sort section originally, but it's possible to save as much as half the time it would take using the method shown in Listing 6.

If you produce these hashes through results from a database query -- through MySQL or similar -- using sorting within the query and then recording the order as you build the hash, you won't need to iterate over the information again.

Tutorial Pages:
» Squeeze the Most From Your Code
» Sloppy Programming, Sloppy Performance
» Approaching Optimization
» Use References
» String Handling
» Loops
» Sorts
» Using Short Circuit Logic
» Use AutoLoader
» Using Bytecode and the Compiler Back Ends
» Other Tools
» Putting it All Together
» Resources


First published by IBM DeveloperWorks


 | Bookmark
Related Tutorials:
» Random subroutines in Perl
» Log Script Use
» Creating Perl Modules for Web Sites
» Bit Vector, Using Perl Vec
» Build a Perl/CGI Voting System
» Perl Range Operator