Who’s making bogus web requests?

Yesterday I noticed in my Apache access log a lot of 404s that looked like this:

aaa.xx.65.186 - - [25/Jul/2007:05:55:05 -0500] "GET http://www.some-advertising-site.com/banner/digits HTTP/1.1" 404 305 "http://some-different-website.com/" "legitimate-looking agent"

Not only am I not hosting banner ads, the GET request is not one a browser would send to my server. It should be GET /banner/digits..., without the scheme and hostname part of it; the full-URL form is what a client sends to a proxy. I wondered how many hosts were sending these, and how many hits I was getting from each. A Perl one-liner to the rescue!

perl -MData::Dumper -nae '++$n{$F[0]} if /GET http/; END { print Dumper \%n }' access.log
$VAR1 = {
'aaa.xx.65.186' => 132, # Real IPs obscured
'bb.yyy.7.60' => 48,
'ccc.zzz.46.147' => 111,
'dd.qq.71.82' => 33
};

So it looked like I was getting hit by a handful of 0wnz0red boxes with some sort of virus on them. I added them to my iptables DROP list and was done with it.
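The same count can be sketched as a small script for anyone who prefers that to a one-liner. This is only an illustration: the sub names are made up, and the INPUT chain in the generated rules is an assumption about how the firewall is laid out. It prints the iptables commands rather than running them.

```perl
use strict;
use warnings;

# Count proxy-style requests (a full URL after GET) per client address.
# Takes a list of access-log lines, returns a hash reference of IP => hits.
sub count_proxy_hits {
    my @lines = @_;
    my %hits;
    for my $line (@lines) {
        next unless $line =~ /GET http/;    # same test as the one-liner
        my ($ip) = split ' ', $line;        # first field, like $F[0] under -a
        next unless defined $ip;
        $hits{$ip}++;
    }
    return \%hits;
}

# Turn the counts into iptables commands, busiest host first.
# Assumption: the rules belong on the INPUT chain.
sub drop_rules {
    my ($hits) = @_;
    my @ips = sort { $hits->{$b} <=> $hits->{$a} || $a cmp $b } keys %$hits;
    return map { "iptables -A INPUT -s $_ -j DROP" } @ips;
}

# Only read a log when one is named on the command line.
if (@ARGV) {
    my $hits = count_proxy_hits(<>);
    printf "%-15s %d\n", $_, $hits->{$_} for sort keys %$hits;
    print "$_\n" for drop_rules($hits);
}
```

Saved under a hypothetical name such as proxyhits.pl, it would be run as perl proxyhits.pl access.log, and the printed rules can be reviewed before anything is fed to the firewall.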

A roadmap for Perl 6 and Parrot

Patrick Michaud, bless him, has produced what Perl 6 and Parrot have needed for a long time: a roadmap for future development. (Note that the link points to the Subversion repository for Parrot, and may be moved over time.)
It’s well worth reading, especially if you’re wondering how Parrot and Perl 6 are being implemented and developed.

Key tech points for those not into reading so much:

  • Pugs is the Haskell implementation of Perl 6, and is being used to work out the language itself.
  • The Pugs repository will be the home of the “official Perl 6 test suite” for now, and will be reorganized to make sure it reflects the current Perl 6 spec.
  • The Perl 6 compiler has four components:
    1. the parsing grammar
    2. some parsing support subroutines
    3. the AST (Abstract Syntax Tree) transformation
    4. runtime support
  • Parrot 0.5.0 has a new object model that eliminates many obstacles in compiler work.
  • Larry has written a standard grammar for Perl 6, and the compiler will be reworked to align with it.
  • NQP is “Not Quite Perl 6”, a lightweight version of Perl 6 for bootstrapping the language. Most of the Perl 6 compiler will be written in NQP.

Best of all, there’s a big section called “How others can start hacking and contributing,” with clear instructions for interested bystanders who want to jump in. Too often I see a project’s current participants fail to realize that outsiders may perceive a barrier to entry, and Patrick has done what he can to eliminate it.


I love love love it. Thank you, Patrick. I hope this article helps pick up some interest from those who have not yet joined.

Mechanix, the new Perlbuzz section

When I asked if you wanted more technical articles in Perlbuzz, there were two answers: a resounding “Yes!” and a less resounding but no less emphatic “No!” The path was clear: More technical articles for the people that want them, and sequestered in their own section for those that don’t.

We’ve now started the Mechanix column at http://perlbuzz.com/mechanix. The main Perlbuzz feed will continue to be mostly news & opinion about Perl in general, while articles relating to coding and programming specifically will wind up in Mechanix. I suspect that most of what winds up in Mechanix will be pointers to other technical articles, as with the current Mechanix articles, “80% Programmers” and Mark Dominus on undefined behavior. I also suspect that at some point someone will complain that “This should have been in that section,” but such is the nature of arrangements like these.

Please also note that Mechanix is specifically vague.* I picked the name because it evokes a sort of code-jockey feel, without tying it down to anything in particular. For example, an article I have in my “to be written” folder is about my foray into the PHP 5.3 source code and the horrors that I’ve found.

I hope you’ll give Mechanix a try in your newsreader, and as always tell us what you think, either in comments on this entry, or by emailing us at the “editors” mailbox for perlbuzz.com.

* I bet that the Ruby guys pick up on my idea but call it “Four Horsemen”.

80% programmers

Ben Collins-Sussman writes about the 80/20 split of programmers and how the 20% of programmers who are “alpha programmers” have to account for the 80% who are not, and how they use their tools.

Although the post talks about Subversion and distributed VCSes, the lessons hold for those who use Perl, too. How many programmers have we worked with who don’t know about CPAN, or are afraid to use code from CPAN? How about programmers who don’t understand the internal workings of “standard” Perl objects (i.e. blessed hashes), who don’t realize that a {} is an anonymous hash constructor, not a “class” or “object” constructor? Or who are afraid to use the map and grep constructs?
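For readers who have not looked under the hood, here is a minimal sketch of those three ideas in one place. The Counter class and the word list are invented for illustration; the point is only that {} builds an ordinary hash, that bless is the step that turns it into an object, and that grep and map are just filter and transform.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new {
    my ( $class, %args ) = @_;
    my $self = { count => $args{start} || 0 };    # {} builds a plain anonymous hash
    return bless $self, $class;                   # bless is what makes it an object
}

sub increment { my $self = shift; return ++$self->{count} }
sub count     { my $self = shift; return $self->{count} }

package main;

my $c = Counter->new( start => 5 );
$c->increment;
print ref($c), " holds ", $c->count, "\n";        # the object is still a hash inside

my @words   = qw( apple Banana cherry date );
my @long    = grep { length($_) > 5 } @words;     # keep only the longer words
my @lengths = map  { length $_ } @words;          # one length per word
print "long: @long\n";
print "lengths: @lengths\n";
```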

On the flip side, you don’t want to dumb down your code to the lowest common denominator. Although both Mark Dominus and chromatic have written about it recently, I like Randal Schwartz’s phrasing best: “Sooner or later you’re going to have to write in Perl.” I’m dealing with PHP code at work where the original programmer did not use keyed lookups (PHP arrays are effectively ordered versions of Perl hashes) to check whether a given string was in a list of special strings. I’m assuming that he was unaware of the ability to look up array elements by key, but I think it would be even worse if he specifically avoided the feature out of fear, or out of worry that future programmers would not know what the code did.
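The difference is easy to show in Perl terms. This is a sketch only, not the code from work: the list of special strings and the sub names are hypothetical. The first version walks the list on every check, while the second builds a hash once and then asks for the key directly.

```perl
use strict;
use warnings;

my @special = qw( admin root guest );    # hypothetical list of special strings

# Linear scan: compares against every element on every call.
sub is_special_scan {
    my ($name) = @_;
    my $found = grep { $_ eq $name } @special;
    return $found ? 1 : 0;
}

# Keyed lookup: build the hash once, then each check is a single lookup.
my %is_special = map { $_ => 1 } @special;

sub is_special_keyed {
    my ($name) = @_;
    return exists $is_special{$name} ? 1 : 0;
}

for my $name (qw( root nobody )) {
    printf "%-6s scan=%d keyed=%d\n", $name,
        is_special_scan($name), is_special_keyed($name);
}
```

Both subs give the same answers; the keyed version simply stops caring how long the list gets.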

Assuming that you’re a 20% programmer (and that you’re reading a programming blog suggests that you are), how do you deal with 80% programmers? Any tricks for the rest of us?

Addendum: Not five minutes after I posted this, I found this article “What if powerful languages and idioms only work for small teams?”, with most of the value in the comments from readers.

Perl gratitude, 2007

Here in the US, it’s Thanksgiving, a day of eating lots of food,
watching football, and sometimes, just sometimes, expressing gratitude
and giving thanks for those things that make life wonderful.

Here are the things I’m grateful for in late 2007, in no
particular order after the first.

Google Code

Google’s project hosting service has been a godsend. It’s changed
the way I do open source projects. It has leapfrogged SourceForge
for ease of maintenance, and the bug tracker trumps RT for CPAN
that we’ve been using for so long. Add that to the integration with
Google Groups, which makes it trivial to create mailing lists, and
it’s at the top of my list for 2007. I can’t say enough good about
it.

The readers of Perlbuzz

Eleven weeks ago, Skud and I started this little website called
Perlbuzz as an alternative to
the “more traditional outlets” for news in the Perl world. The
response has been tremendous. We get 600 RSS readers every day,
and have had over 10,000 unique visitors in that time. It makes
me happy that our little venture is used and appreciated by the
community.

Test::Harness 3.0

It’s been over a year in the making, but the new 3.0 version of
the crucial Test::Harness means more flexibility for module
authors, and lots of UI improvements for people who just want to
run prove and make test.

Mark Dominus

MJD is so much a fixture in Perl it’s easy to forget that he’s
there. For 2007, though, never mind all the things he’s done for
Perl in the past, or the hours I’ve spent being enthralled in talks
of his. His Universe Of Discourse blog is the single most
intelligent blog out there, and sometimes it just happens to be
about Perl.

Andy Armstrong

Was Andy Armstrong always around, or did I just not notice? The
time and dedication he has put in since climbing on board with Ovid
and Schwern and the rest of the Test::Harness 3.0 crew have been
invaluable in getting it out. Plus, he’s a really swell guy anyway.

Dave Hoover

When I finally despaired of the amount of time and frustration
it took to organize content for Chicago.pm’s Wheaton meetings,
Dave Hoover stepped up and volunteered to take it over. I’m thankful,
but not as much as I hope the other Chicago.pm folks are.

Perl::Critic

I’m all about having the machine keep an eye out for the stupid things
we do, and the goodness of Perl::Critic is always impressive. You
won’t like everything Perl::Critic says about your code,
but that’s OK. It’s an entire framework for enforcing good Perl
coding practices.

The Perl Community in general

The Perl community is populated by some tremendous folks. Some
names are more known than others, but these people help make daily
Perl life better for me. In no particular order, I want to single
out Pete Krawczyk, Kent Cowgill, Elliot Shank, Liz Cortell, Jason
Crome, Yaakov Sloman, Michael Schwern, Andy Armstrong, Ricardo
Signes, Julian Cash, Jim Thomason, chromatic, Chris Dolan, Adam
Kennedy, Josh McAdams and of course Kirrily Robert. If you think
you should be on this list, you’re probably right, and I just forgot.

My wife, Amy Lester

Because even if she doesn’t understand this part of my life, she
at least understands its importance to me.


I’d love to hear back from any readers about what they’re thankful for. I’m thinking about having a regular “Love Letters to Perl” column where people write about what they love in Perl.

Perl trumps Ruby and Erlang in the Wide Finder Project

Tim Bunce points me to this post about Perl being faster than Ruby in Tim Bray’s Wide Finder code competition.

The Wide Finder is at heart an Apache log analysis tool to show commonly hit pages, but for purposes of this comparison, it’s analyzing 971MB of log data. Bray explains:

It’s a classic example of the culture, born in Awk, perfected in Perl, of getting useful work done by combining regular expressions and hash tables. I want to figure out how to write an equivalent program that runs fast on modern CPUs with low clock rates but many cores; this is the Wide Finder project.
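The “regular expressions and hash tables” style he describes looks roughly like the sketch below. It is a single-core illustration of the idiom, not Bray’s program or any of the contest entries: the regex, the sub name, and the top-ten cutoff are all assumptions, and nothing here attempts the parallel part that the project is really about.

```perl
use strict;
use warnings;

# Count GET requests per path and return the top $n as [path, hits] pairs.
sub top_pages {
    my ( $lines, $n ) = @_;
    my %count;
    for my $line (@$lines) {
        # Hypothetical pattern: any GET of a local path.
        $count{$1}++ if $line =~ m{"GET (/\S*) HTTP};
    }
    # Most hits first; ties broken alphabetically so the order is stable.
    my @sorted = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    $n = @sorted if $n > @sorted;
    return map { [ $_, $count{$_} ] } @sorted[ 0 .. $n - 1 ];
}

# Only read a log when one is named on the command line.
if (@ARGV) {
    my @lines = <>;
    printf "%6d %s\n", $_->[1], $_->[0] for top_pages( \@lines, 10 );
}
```

One regex to pick out the interesting field, one hash to count it, one sort at the end: that is the whole serial version, which is what makes it such a convenient benchmark for comparing languages.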

All the talk about Erlang and parallelism makes me want to get back to working through my copy of Programming Erlang. Oh tuits, come to me!