It’s not every day you get to save a dog with Perl, but Perlbuzz
reader Adam Gotch did just that the other day.

Adam tells me “I’m a telecommute Perl/Python contract programmer
at O’Reilly Media. I live in Springboro, OH. I’ve been coding in
Perl for about 10 years and love it.”

On Saturday, Adam found a dog wandering the highway about a mile
from his home. The local shelters didn’t open until Monday,
so he took it upon himself to try to find the owner himself.

Adam explains:

I located the Warren County dog registration
website

and discovered a simple web form that allowed you to look up an
owner if you had the dog license # and registration year. Not having
a clue what a license # looked like, I entered ‘1’ with year ‘2011’
and got a result. Dog license #’s were simple integers. Using binary
search, I quickly discovered that there were 24996 registration
records for 2011. The web form’s search result provided a dog’s
owner’s name, address and phone as well as the dog’s breed, color
and sex. With this knowledge I decided it was feasible to write a
script to pull back all the records and filter for a female brown
lab.

The dog registration website was ASP.NET with __VIEWSTATE and
__EVENTVALIDATION post variables so a simple LWP script was going
to be a pain. I had worked with
WWW::Mechanize before
so I checked the CPAN docs to see if it was going to work. It seemed
to have everything I needed so I began coding. I wrote a quick test
to see if I could pull back one record, but no luck. I ran
wireshark
captures of both a manual post in Chrome and my test script.
Comparison of the captures revealed that the submit button name/value
was not being sent by my script. Looking at the WWW::Mechanize docs,
I found the button parameter to the submit_form() method for
simulating a submit button click. It worked. I finished the script,
looping over all 24996 records and soon I was pulling down all the
Warren County dog registration records for 2011.

Here’s the program Adam wrote:

use WWW::Mechanize;
use HTML::TreeBuilder::XPath;
use strict;
my $m = WWW::Mechanize->new();
$m->get('http://www.co.warren.oh.us/auditor/licensing/dog_search/');
my @info = ();
$| = 1;
for (my $i = 1; $i < 24997; $i++) {
my $response;
eval {
$response = $m->submit_form( form_number => 1,
fields => {
'ctl00$ContentPlaceHolder1$txtlicense' => "$i",
'ctl00$ContentPlaceHolder1$txtyear' => '2011'
},
button => 'ctl00$ContentPlaceHolder1$btnSubmit');
};
if (!$@ && $response->is_success) {
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($response->decoded_content);
# Use XPath selectors to find fields in the table
my $owner_info = $tree->findvalue('//div/fieldset[1]/p');
my $dog_info = $tree->findvalue('//div/fieldset[2]/p');
push @info, [$owner_info, $dog_info, $i];
print "$owner_info|$dog_info|$in";
}
else {
warn "WARNING: POST FAILED";
}
$m->back();
}

After that, it was some simple calls to grep to filter the results:

cat warren_county_dogs.txt | 
grep -i springboro | 
grep -i lab | 
grep -i brown | 
grep -i female 
> brown_labs.txt

This narrowed down the 25,000 records to 39. That made it easily
to visually scan the list and find the addresses that were closest
to where the dog was found. That narrowed it down to three. Adam
Googled the phone numbers, found that one was a cell, and texted
it.

I texted the first number, explaining I had found this dog on the
highway and sure enough, it was the owner! He promptly drove to my
house to pick up “Izzy”. When he arrived he was very glad to have
his dog back but also confused as to how I found his phone number.
I told him I “scraped” the dog registration site and left it at
that (yeah it’s a bit unnerving how easy it is to find information
on people).

Note that if Adam was using a system that didn’t have grep or
ack, he could have done the string
matching in the Perl program before writing out to the file:

next unless $owner_info =~ /springboro/i;
next unless $dog_info =~ /lab/ && $dog_info =~ /brown/
&& $dog_info =~ /female/ && $dog_info =~ /lab/;

He could probably have done the matching with XPath as well, but I
am very green on XPath. Such a modification is left as an exercise
to the reader.

Thanks for the story, Adam!