Benchmarking Your Code
Perl’s motto is "There’s More Than One Way To Do It," but sometimes you may have trouble determining which is the best way to do it. Here’s one way to pick the best way: the Benchmark module.
The Benchmark module works by running your code many, many times and measuring how long the whole run takes. It then reports the total amount of time taken, along with the resulting rate in iterations per second.
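To make the idea concrete, here is a minimal hand-rolled sketch of timing a repeated piece of code with Benchmark objects. It is only an illustration of the principle, not how the module's own helpers are implemented (those also subtract the cost of an empty loop). The doubling loop and the variable names are made up for this example.

```perl
use strict;
use warnings;
use Benchmark qw(timediff timestr);

my $iterations = 100_000;   # assumed count, chosen only for illustration

my $t0 = Benchmark->new;    # snapshot of the clocks before the work

# The code under test: a trivial stand-in that doubles and sums numbers.
my $total = 0;
$total += $_ * 2 for 1 .. $iterations;

my $t1 = Benchmark->new;    # snapshot after the work

# timediff() gives the elapsed time; timestr() formats it for printing.
my $elapsed = timediff($t1, $t0);
print "$iterations iterations took: ", timestr($elapsed), "\n";
```

In practice, timethese() does this bookkeeping for you and handles several candidates at once.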
Since this module is included in the standard Perl distribution, there’s no excuse not to use it. Of course, speed isn’t the only way to measure the difference between different ways of Doing It, but it is one good indication. The format for using the Benchmark module looks something like this:
use Benchmark;
timethese(10_000, {
    Version1 => \&version1,
    Version2 => \&version2
});
sub version1 { .... }
sub version2 { .... }
The timethese() subroutine accepts two parameters: a number (the _ in 10_000 is like the comma in 10,000; it has no meaning to Perl but makes it easier to count the 0’s), and a reference to a hash. The hash should contain elements whose key is a string and whose value is a reference to a subroutine which is to be tested.
Then, just populate the subroutines to do your thing. In the above example, change "version1" and "version2" to more appropriate names. You can add additional ones as well if you like, just by adding them to the list.
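As a small sketch of the calling convention described above, the hash of named code references can also be built first and then passed by reference. The two subs here (interpolate and concatenate) are hypothetical candidates invented for this example. The optional third argument, the style 'none', asks timethese() not to print its usual report lines so that the returned results can be printed by hand.

```perl
use strict;
use warnings;
use Benchmark qw(timethese timestr);

# Hypothetical candidates: two ways to build the same string.
sub interpolate { my $name = "world"; my $s = "hello, $name";    return $s }
sub concatenate { my $name = "world"; my $s = "hello, " . $name; return $s }

# Keys are labels; values are references to the subs (note the backslash).
my %tests = (
    interpolate => \&interpolate,
    concatenate => \&concatenate,
);

# 10_000 and 10000 are the same number to Perl.
# timethese() returns a hash reference of Benchmark objects,
# keyed by the same labels used in %tests.
my $results = timethese(10_000, \%tests, 'none');

for my $name (sort keys %$results) {
    print "$name: ", timestr($results->{$name}), "\n";
}
```

With subs this cheap, 10,000 iterations is probably too few for a trustworthy timing; the point here is only the shape of the call.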
Here’s an example:
use Benchmark;
# Populate the hash before timethese() runs; the subs below read it.
my %hash = (dog => "perro", cat => "gato",
            horse => "caballo", cow => "vaca");
timethese(1_000_000, { foreach_keys => \&foreach_keys,
                       foreach_sort => \&foreach_sort,
                       while_each   => \&while_each });
sub foreach_keys {
    foreach my $key (keys %hash) {
        print "$key: $hash{$key}\n";
    }
}
sub foreach_sort {
    foreach my $key (sort keys %hash) {
        print "$key: $hash{$key}\n";
    }
}
sub while_each {
    while (my ($key, $value) = each %hash) {
        print "$key: $value\n";
    }
}
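The results look like this (the output of the "print" statements is omitted here; Benchmark does not hide it for you, so you may want to redirect standard output, for example to /dev/null, when you run the script):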
Benchmark: timing 1000000 iterations of foreach_keys, foreach_sort, while_each...
foreach_keys:  7 wallclock secs ( 6.00 usr +  0.01 sys =  6.01 CPU) @ 166389.35/s (n=1000000)
foreach_sort:  7 wallclock secs ( 7.05 usr +  0.00 sys =  7.05 CPU) @ 141843.97/s (n=1000000)
  while_each:  4 wallclock secs ( 3.89 usr +  0.00 sys =  3.89 CPU) @ 257069.41/s (n=1000000)
As a result, you can see that while_each is the fastest. Of course, neither foreach_keys nor while_each will sort the results, so if you need sorting, you would have to use the foreach_sort technique. With a larger data set, you would see a more dramatic difference. Sorting 4 keys is not a difficult task!
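To try the larger-data-set claim yourself, something like the following sketch could be used. It is an assumption-laden variation on the example above, not the original test: the hash holds 10,000 made-up keys, each sub adds up string lengths instead of printing (so that output does not dominate the timings), and cmpthese(), another function from the Benchmark module, prints a comparison table rather than one line per test. No output is shown because the numbers will differ from system to system.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed data set: 10,000 generated keys instead of 4.
my %hash = map { ("key$_" => "value$_") } 1 .. 10_000;

# Each sub visits every key/value pair and returns a character count,
# which gives the loop real work to do without printing anything.
sub foreach_keys {
    my $chars = 0;
    foreach my $key (keys %hash) {
        $chars += length($key) + length($hash{$key});
    }
    return $chars;
}

sub foreach_sort {
    my $chars = 0;
    foreach my $key (sort keys %hash) {
        $chars += length($key) + length($hash{$key});
    }
    return $chars;
}

sub while_each {
    my $chars = 0;
    while (my ($key, $value) = each %hash) {
        $chars += length($key) + length($value);
    }
    return $chars;
}

# Fewer iterations than before, because each one now walks 10,000 keys.
cmpthese(200, {
    foreach_keys => \&foreach_keys,
    foreach_sort => \&foreach_sort,
    while_each   => \&while_each,
});
```

The iteration count of 200 is a guess; raise it if Benchmark warns that there were too few iterations.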
Note: If you get a message that says "(warning: too few iterations for a reliable count)", then you should increase the number of iterations. For example, if you change the above from 1,000,000 to only 100,000 iterations, it will produce that warning (at least on my system).
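Instead of guessing a large enough iteration count, you can also pass a negative number, which Benchmark treats as a minimum number of CPU seconds to run each test. The sketch below assumes two made-up subs that sum a small list; only the -1 argument is the point.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my @numbers = (1 .. 100);   # assumed sample data

# A count of -1 means "run each sub for at least 1 CPU second";
# Benchmark works out how many iterations that takes.
my $results = timethese(-1, {
    sum_foreach => sub {
        my $total = 0;
        $total += $_ foreach @numbers;
        return $total;
    },
    sum_index => sub {
        my $total = 0;
        for (my $i = 0; $i < @numbers; $i++) {
            $total += $numbers[$i];
        }
        return $total;
    },
});
```

The n= figure in each result line then shows how many iterations were actually run.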
