Bind output variables in DBI for speed and safety
When in a tight loop of many records from a database, using the quick & dirty solution of calling $sth->fetchrow_hashref can be expensive. I was working on a project to walk through 6,000,000 records and it was slower than I wanted. Some benchmarking showed me that I was paying dearly for the convenience of being able to say my $title = $row->{title};.
When I converted my code to bind variables to the columns in the statement handle, I cut my run time about 80%. It was as simple as adding this line:
$sth->bind_columns( my $interestlevel, my $av_flag, my $isbn, my $title );
before calling the main loop through the database. Now DBI knows to put the data directly in there, without creating an expensive temporary hash. I also can't make a typo such as $row->{ISBN} inside the loop, so there's a measure of safety as well.
The benchmarks below show the relative speeds of each of four techniques:
hashref took 31.5048 wallclock secs array took 8.83724 wallclock secs arrayref took 5.5308 wallclock secs direct_bind took 4.46956 wallclock secs
Here's the key parts of the benchmark program I used:
use Benchmark ':hireswallclock'; sub hashref { while ( my $row = $sth->fetchrow_hashref ) { my $interestlevel = $row->{interestlevel}; my $av_flag = $row->{av_flag}; my $isbn = $row->{isbn}; my $title = $row->{title}; } $sth->finish; } sub array { while ( my @row = $sth->fetchrow_array ) { my ($interestlevel, $av_flag, $isbn, $title) = @row; } $sth->finish; } sub arrayref { while ( my $row = $sth->fetchrow_arrayref ) { my $interestlevel = $row->[0]; my $av_flag = $row->[1]; my $isbn = $row->[2]; my $title = $row->[3]; } $sth->finish; } sub direct_bind { $sth->bind_columns( my $interestlevel, my $av_flag, my $isbn, my $title ); while ( my $row = $sth->fetch ) { # no need to copy } $sth->finish; } for my $func ( qw( hashref array arrayref direct_bind ) ) { my $sql = <<"EOF"; select interestlevel, av_flag, isbn, title from testbook limit 1000000 EOF # This sub calls the SQL and returns a statement handle $sth = sqldo_handle( $sql ); my $t = timeit( 1, "$func()" ); print "$func took ", timestr($t), "n"; }
Did you find this article useful? Or does it not belong on Perlbuzz? Let us know what you think.