Regexes

How to: Find all modules used in a tree

February 7, 2008 Regexes, Tools 1 comment

On the perl-qa list tonight, we were discussing how best to find all the modules used in a source tree. To do the job right, you’d have to run the code and then look at the %INC hash, which maps each loaded module file to the path it was loaded from. The low-tech and usually-good-enough solution we came up with uses ack:

$ ack -h '^use\s+(\w+(?:::\w+)*).*' --output='$1' | sort -u

Thanks to Andy Armstrong for coming up with a better regex than mine, which assumed that the use statement would necessarily end with a semicolon.
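For comparison, here is a minimal sketch of the "do the job right" approach: load the code, then read %INC. The two modules loaded here (File::Spec and List::Util) and the helper name loaded_modules are stand-ins I picked for illustration, not anything from the list discussion. In practice you would load your own code instead.

```perl
use strict;
use warnings;

# Stand-in modules so there is something to report (assumption: any
# modules would do; these two ship with core Perl).
use File::Spec;
use List::Util qw(sum);

# %INC maps each loaded file (e.g. "File/Spec.pm") to the path it was
# loaded from. Turn the keys back into module names for readability.
# loaded_modules() is a hypothetical helper name.
sub loaded_modules {
    my %modules;
    for my $file ( keys %INC ) {
        next unless $file =~ /\.pm\z/;       # skip non-module entries
        ( my $name = $file ) =~ s{/}{::}g;   # File/Spec.pm -> File::Spec.pm
        $name =~ s/\.pm\z//;                 # File::Spec.pm -> File::Spec
        $modules{$name} = $INC{$file};
    }
    return \%modules;
}

my $modules = loaded_modules();
for my $name ( sort keys %$modules ) {
    my $path = defined $modules->{$name} ? $modules->{$name} : '(unknown)';
    printf "%-30s %s\n", $name, $path;
}
```

Unlike the ack one-liner, this also picks up modules that are pulled in indirectly, but it only sees what actually got loaded on that particular run.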

Code broken by regex fixes in 5.10.0, or, why it’s good to help test release candidates

December 20, 2007 Regexes 5 comments

My baby, ack, broke under Perl 5.10.0, because of a fix to buggy regex behavior that I had been relying on unknowingly. See, I had always used my regex objects like this:

my $re = qr/^blah blah/;
if ( $string =~ /$re/sm )...

when I should have been using it like this:

my $re = qr/^blah blah/sm;
if ( $string =~ /$re/ )...

The bug in 5.8.x is that /$re/sm would incorrectly apply the /sm modifiers to $re. This made the code happen to work, but for the wrong reason. What was especially tricky about finding my bug was that in 5.10.0, the match against /$re/sm silently ignores the /sm instead of telling you about it.
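Here is a small sketch of the difference, using a made-up two-line string rather than anything from ack itself. The variable names are mine. It assumes Perl 5.10 or later, where the modifiers that count are the ones compiled into the qr// object.

```perl
use strict;
use warnings;

# Hypothetical sample text: the word we want starts the second line.
my $text = "first line\nsecond line";

# Modifiers compiled into the regex object: ^ may match after a newline.
my $flags_on_qr = qr/^second/sm;

# No modifiers on the regex object: ^ matches only at the start of the string.
my $no_flags_on_qr = qr/^second/;

# The right way: the qr// object carries its own /sm.
my $inner = ( $text =~ /$flags_on_qr/ ) ? 1 : 0;

# The wrong way: /sm on the outer match does not reach inside $no_flags_on_qr.
my $outer = ( $text =~ /$no_flags_on_qr/sm ) ? 1 : 0;

print "modifiers on qr//:        ", ( $inner ? "match" : "no match" ), "\n";
print "modifiers on outer match: ", ( $outer ? "match" : "no match" ), "\n";
```

On 5.10 or later the first line should report a match and the second should not, which is exactly the silent failure described above.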

After some back and forth on p5p, a patch was submitted that gave a warning about the ignored /sm flags, but alas, Perl 5.10 was already out. It wouldn’t have been so bad if all this hadn’t come up the day AFTER 5.10.0 was released.

So, lesson learned: Test your code against new release candidates of Perl, both for your code’s sake, AND for Perl’s sake.

And y’know, now that I think of it, this is probably a great policy for Perl::Critic just waiting to happen. I wonder how many other people are doing their regexes the wrong way, too.