When in a tight loop of many records from a database, using the quick & dirty solution of calling $sth->fetchrow_hashref can be expensive. I was working on a project to walk through 6,000,000 records and it was slower than I wanted. Some benchmarking showed me that I was paying dearly for the convenience of being able to say my $title = $row->{title};.

When I converted my code to bind variables to the columns in the statement handle, I cut my run time about 80%. It was as simple as adding this line:

$sth->bind_columns( my $interestlevel,
my $av_flag, my $isbn, my $title );

before calling the main loop through the database. Now DBI knows to put the data directly in there, without creating an expensive temporary hash. I also can't make a typo such as $row->{ISBN} inside the loop, so there's a measure of safety as well.

The benchmarks below show the relative speeds of each of four techniques:

hashref     took 31.5048 wallclock secs
array       took 8.83724 wallclock secs
arrayref    took 5.5308 wallclock secs
direct_bind took 4.46956 wallclock secs

Here's the key parts of the benchmark program I used:

use Benchmark ':hireswallclock';
sub hashref {
while ( my $row = $sth->fetchrow_hashref ) {
my $interestlevel = $row->{interestlevel};
my $av_flag = $row->{av_flag};
my $isbn = $row->{isbn};
my $title = $row->{title};
}
$sth->finish;
}
sub array {
while ( my @row = $sth->fetchrow_array ) {
my ($interestlevel, $av_flag, $isbn, $title) = @row;
}
$sth->finish;
}
sub arrayref {
while ( my $row = $sth->fetchrow_arrayref ) {
my $interestlevel = $row->[0];
my $av_flag = $row->[1];
my $isbn = $row->[2];
my $title = $row->[3];
}
$sth->finish;
}
sub direct_bind {
$sth->bind_columns( my $interestlevel,
my $av_flag, my $isbn, my $title );
while ( my $row = $sth->fetch ) {
# no need to copy
}
$sth->finish;
}
for my $func ( qw( hashref array arrayref direct_bind ) ) {
my $sql = <<"EOF";
select interestlevel, av_flag, isbn, title
from testbook
limit 1000000
EOF
# This sub calls the SQL and returns a statement handle
$sth = sqldo_handle( $sql );
my $t = timeit( 1, "$func()" );
print "$func took ", timestr($t), "n";
}

Did you find this article useful? Or does it not belong on Perlbuzz? Let us know what you think.