Linux Journal has an article on creating Excel files using Spreadsheet::WriteExcel. It has its quirks, like creating corrupted spreadsheets if you try to populate a cell more than once, but when you need it, there's nothing else to do what it does.
I'm in the middle of a game of Scrabulous with Christopher Humphries on Facebook, and I get "tolkien" handed to me in my tray. Good letters, and I ought to be able to make a bingo out of them. Alas, the best I could get to play on the board was "knot", but what else could I have made? Perl to the rescue!
All I need to do is match across the contents of /usr/share/dict/words in a Perl one-liner. The -n flag means "loop over the input file, but don't print $_". My little program goes in -e, and it looks like this:
$ perl -lne'print if /t/ && /o/ && /l/ && /k/ && /i/ &&
/e/ && /n/' /usr/share/dict/words
allokinetic
ankylopoietic
anticlockwise
automatonlike
bibliokleptomania
....
Lots of good words, but they're awfully long. Let's limit it to seven-letter bingos. We have to use the -l flag to drop the linefeed from the input lines, so the length call is accurate.
$ perl -lne'print if /t/ && /o/ && /l/ && /k/ && /i/ &&
/e/ && /n/ && length($_)==7' /usr/share/dict/words
$
Shoot, nothing there. Let's try eight.
$ perl -lne'print if /t/ && /o/ && /l/ && /k/ && /i/ &&
/e/ && /n/ && length($_)==8' /usr/share/dict/words
knotlike
townlike
"knotlike"! That would have been beautiful. Oh well. :-(
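One caveat with chaining matches like this: each regex only asks whether a letter appears at all, so a tray with a repeated tile (two o's, say) would also match words that contain just one. Here is a minimal sketch of a count-aware check. The sub name uses_all_tiles and the tiny candidate list are made up for illustration; a real run would read /usr/share/dict/words instead.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $word contains every tile in $tray,
# counting repeated tiles separately.
sub uses_all_tiles {
    my ( $word, $tray ) = @_;

    # Count how many of each letter the word has to offer.
    my %have;
    $have{$_}++ for split //, lc $word;

    # Spend one letter from the word for each tile in the tray.
    for my $tile ( split //, lc $tray ) {
        return 0 unless $have{$tile};
        $have{$tile}--;
    }
    return 1;
}

# Illustrative candidates standing in for the dictionary file.
my @candidates = qw( knot knotlike townlike anticlockwise );
print "$_\n"
    for grep { length($_) == 8 && uses_all_tiles( $_, 'tolkien' ) } @candidates;
```

With this sample list it prints knotlike and townlike, the same two words the one-liner found.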
A thread over on PerlMonks talks about Tom Christiansen's assertion that you should use Getopt::Long by default, even when you only have one command-line argument to parse:
What seems to happen is that at first we just want to add--oh say for example JUST ONE, SINGLE LITTLE -v flag. Well, that's so easy enough to hand-hack, that of course we do so... But just like any other piece of software, these things all seem to have a way of overgrowing their original expectations... Getopt::Long is just *wonderful*, up--I believe--to any job you can come up with for it. Too often its absence means that I've in the long run made more work for myself--or others--by not having used it originally. [Emphasis mine -- Andy]
I can't agree more. I don't care if you use Getopt::Long or Getopt::Declare or Getopt::Lucid or any of the other variants out there. You know know know that you're going to add more arguments down the road, so why not start out right?
Yes, it can be tricky to get through all of its magic if you're unfamiliar with it, but it's pretty obvious when you see enough examples. Take a look at prove or ack for examples. mech-dump is pretty decent as an example as well:
GetOptions(
'user=s' => \$user,
'password=s' => \$pass,
forms => sub { push( @actions, \&dump_forms ); },
links => sub { push( @actions, \&dump_links ); },
images => sub { push( @actions, \&dump_images ); },
all => sub { push( @actions, \&dump_forms, \&dump_links, \&dump_images ); },
absolute => \$absolute,
'agent=s' => \$agent,
'agent-alias=s' => \$agent_alias,
help => sub { pod2usage(1); },
) or pod2usage(2);
GetOptions takes a list of option specs paired with destinations. Where the destination is a variable reference, the option's value gets stored there. Where it's a sub, that sub gets called with the option name and value. Those are the basics, and you don't have to worry about anything else. Your user can pass --abs instead of --absolute if it's unambiguous. You can require an argument, as in agent=s, where --agent must be followed by a string. On and on, it's probably got the functionality you need.
One crucial reminder: You must check the return code of GetOptions. Otherwise, your program will carry on. If someone gives your program an invalid argument on the command-line, then you know that the program cannot possibly be running in the way the user intended. Your program must stop immediately.
Not checking the return of GetOptions is as bad as not checking the return of open. In fact, I think I smell a new Perl Critic policy....
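To make the return-code point concrete, here is a minimal sketch of a script that starts with one flag and refuses to run on bad arguments. The sub name parse_args and the --verbose and --name options are invented for the example. It uses GetOptionsFromArray, which Getopt::Long exports on request, so the parsing can be exercised without touching @ARGV directly.

```perl
use strict;
use warnings;
use Getopt::Long qw( GetOptionsFromArray );

# Hypothetical helper: parse a list of arguments and return a hashref
# of options, or nothing at all if the arguments were invalid.
sub parse_args {
    my @args = @_;
    my %opt  = ( verbose => 0 );

    GetOptionsFromArray(
        \@args,
        'verbose' => \$opt{verbose},    # plain on/off flag
        'name=s'  => \$opt{name},       # --name must be followed by a string
    ) or return;                        # false means a bad option was seen

    return \%opt;
}

# Stop immediately if the command line was not understood.
my $opt = parse_args(@ARGV)
    or die "usage: $0 [--verbose] [--name=STRING]\n";

print "verbose is ", ( $opt->{verbose} ? 'on' : 'off' ), "\n";
```

Adding a second or third option later is one more line in the spec list, which is the whole argument for starting this way.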
From The Pragmatic Programmer:
What's the value of pi? If you're wondering how much edging to put around a circular flower bed, then "3" is probably good enough. If you're in school, then maybe "22/7" is a good approximation. If you're in NASA, then maybe 12 decimal places will do.
Alfie John over at rental-property.co.nz wrote to tell me that the source code for the entire site, written using Mason and Class::DBI, is available for download.
For someone wanting to see an overview of how either Mason or Class::DBI works with real-world examples, not just samples from documentation, this is a great place to start.
Adam Kennedy posted an excellent article about huge performance hits he found with File::Find::Rule. From the docs, there's this sample to find all the *.pm files in @INC:
# Find all the .pm files in @INC
my @files = File::Find::Rule->file
    ->name( '*.pm' )
    ->in( @INC );
What this search REALLY says is "Find every single file in all these trees, then do an slow IO stat call to the operating system on every single one to work out which ones are files, and only then do a quick regex match on the file names to keep the 5% that have the ending we want and throw away the 95% that don't".
Now I'm worried about whether I'm doing my checks in the right order in File::Next, a lightweight file finder that ack relies on.
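The ordering idea is easy to see with the core File::Find module: test the name first, which is only a string match, and ask the filesystem about the entry only if the name survives. This is a rough sketch, not how File::Find::Rule or File::Next is implemented. The sub name find_pm_files is made up, and File::Find may still stat entries internally to decide which directories to descend into.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every plain file ending in .pm under
# the given directories.
sub find_pm_files {
    # @INC can hold code refs and directories that don't exist.
    my @dirs = grep { !ref($_) && -d $_ } @_;
    return unless @dirs;

    my @found;
    find(
        {
            no_chdir => 1,    # $_ holds the full path of each entry
            wanted   => sub {
                return unless /\.pm\z/;    # cheap string match first
                return unless -f $_;       # file test only on the survivors
                push @found, $_;
            },
        },
        @dirs
    );
    return @found;
}

my @files = find_pm_files(@INC);
print scalar(@files), " .pm files found\n";
```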
I'd been working on a new functional index for the work website. I created a PL/pgSQL function to normalize the title of a book:
CREATE OR REPLACE FUNCTION exacttitle_key( TEXT )
RETURNS text AS $$
DECLARE
key TEXT := upper( $1 );
BEGIN
key = regexp_replace( key,
E'^ *(?:A|AN|EL|LA|LO|THE|LOS|LAS)\\M *', '' );
key = regexp_replace( key, '[^0-9A-Z ]+', '', 'g' );
key = regexp_replace( key, ' {2,}', ' ', 'g' );
RETURN trim( key );
END
$$ LANGUAGE 'plpgsql'
IMMUTABLE STRICT;
and tested it out, and all looked well. It was marked as IMMUTABLE, so Pg can use it as an index function. I created the index in psql:
create index testbook_exacttitle on testbook
using btree (exacttitle_key(title));
And all was well. Now I wanted to see how long it took to create that index, so from the shell I did:
$ time psql -c'drop index testbook_exacttitle;
create index testbook_exacttitle on testbook
using btree (exacttitle_key(title));'
I knew it would take about 5 minutes to add this index on 6.7 million records in testbook, so I didn't expect it to come back right away. Then I realized that site response fell off the table. ptop showed a couple dozen SELECT queries waiting to run. I killed the process that was running the CREATE INDEX. All the pending queries went on their merry way. Everything was back to normal.
I tried that command line again, and the results were identical. Dozens of queries backed up until I killed the CREATE INDEX process. But why were those queries backing up? That index was not used by any code yet. I asked in #postgresql, but nobody knew the answer. Then, someone said a word that clicked in my head. I made a little change to how I was running the commands, and everything worked just fine.
What was the word that helped Encyclopedia Lester figure out the problem? Turn to page 47 for the answer.
The word was "transaction". If there are multiple commands as part of the -c option to psql, they are executed in one transaction. DROP INDEX takes an exclusive lock on the entire table, and that lock is held until the transaction ends, so every query against testbook had to wait for the CREATE INDEX to finish. When I ran the DROP INDEX separately, and then reran the CREATE INDEX by itself, the only long wait was on the new index, which nothing was using yet.
(With apologies to Donald J. Sobol and Encyclopedia Brown)
People have been posting in their blogs about what command they run, based on their shell histories. The command that I've seen looks like this:
history|awk '{a[$2]++} END{for(i in a){ \
printf "%5d\t%s \n",a[i],i}}'|sort -rn|head
That works, of course, but who wants to use awk and the shell? I pulled out the old Data::Hash::Totals module I wrote a while back, along with Perl's built-in awk simulation:
$ history | perl -MData::Hash::Totals -ane'$x{$F[1]}++;' \
-e'END{print as_table(\%x, comma => 1)}' | head
207 vim
143 svn
125 make
90 ack
77 cd
45 sdvx
34 ssq
31 ls
25 ./login-fixup
19 tail
alester:~ : cat `which sdvx`
#!/bin/sh
svn diff -x -w $* | view -
and ssq is just an alias for svn status -q.
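If Data::Hash::Totals isn't installed, the same tally takes only a few lines of core Perl. This is a sketch under a couple of assumptions: the sub name count_commands is made up, and the sample lines stand in for real history output, where the first field is the entry number and the second is the command.

```perl
use strict;
use warnings;

# Hypothetical helper: tally the second whitespace-separated field of
# each line, which in "history" output is the command name.
sub count_commands {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        my @fields = split ' ', $line;    # same splitting rule as perl -a
        next unless @fields >= 2;         # skip blank or malformed lines
        $count{ $fields[1] }++;
    }
    return %count;
}

# Made-up sample input in place of real shell history.
my @sample = ( '  1  vim foo.pl', '  2  make test', '  3  vim bar.pl' );
my %count  = count_commands(@sample);

# Highest counts first, ties broken alphabetically.
for my $cmd ( sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count ) {
    printf "%5d %s\n", $count{$cmd}, $cmd;
}
```

To run it against real history, feed it the lines from standard input, as in count_commands(<STDIN>), and pipe history into the script.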
Google's main search screen now returns code snippets in its list of results. This is not just in code.google.com any more.
I needed to find the docs for the PHP function ftp_connect, so I searched Google for it. (I could have gone to php.net and searched there, but why?) The list of results has three hits to PHP manual pages, and the fourth and fifth are bits of code that use ftp_connect. Anyone know if they're getting Perl stuff in there as well? I tried it with WWW::Mechanize, but couldn't turn up hits.
Here's a little article about the "file header tax": lines of boilerplate at the top of files that serve no purpose. Copyright notices, disclaimers, maybe even some revision history: it's all just clutter, and clutter is technical debt.
Take a look at the next file you edit. Is there anything at the top of it that is not functional code? Ask yourself if it really needs to be there. If in doubt, throw it out.