Data munging: November 2007 Archives

Yesterday I noticed in my Apache access log a lot of 404s that looked like this:

aaa.xx.65.186 - - [25/Jul/2007:05:55:05 -0500] "GET http://www.some-advertising-site.com/banner/digits HTTP/1.1" 404 305 "http://some-different-website.com/" "legitimate-looking agent"

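Not only am I not hosting banner ads, the GET request isn't one my server should be seeing at all. A request sent straight to a web server normally reads GET /banner/digits..., without the scheme and hostname; the full-URL form is what a client sends to a proxy. I wondered how many clients were sending these, and how many hits each one accounted for. A Perl one-liner to the rescue!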

perl -MData::Dumper -nae '++$n{$F[0]} if /GET http/;
    END { print Dumper \%n }' access.log

$VAR1 = {
          'aaa.xx.65.186' => 132, # Real IPs obscured
          'bb.yyy.7.60' => 48,
          'ccc.zzz.46.147' => 111,
          'dd.qq.71.82' => 33
        };
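For a tally that is easier to scan than the Dumper output, the same count can be printed one client per line, busiest first. This is only a sketch, not the command I actually ran: the function name count_proxy_gets is made up for illustration, and the pattern is tightened (my assumption) to match only a quoted request line that starts with GET http:// or GET https://.

```shell
# count_proxy_gets LOGFILE
# Hypothetical helper: counts full-URL GET requests per client IP
# (the first whitespace-separated field of each log line) and prints
# "count ip" lines, highest count first, ties broken by IP string.
count_proxy_gets() {
    perl -nae '
        ++$n{$F[0]} if m{"GET https?://};
        END {
            printf "%d %s\n", $n{$_}, $_
                for sort { $n{$b} <=> $n{$a} || $a cmp $b } keys %n;
        }
    ' "$1"
}
```

Run it as count_proxy_gets access.log. Requests for ordinary paths such as GET /index.html are not counted.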

So it looked like I was getting hit by a few 0wnz0red boxes with some sort of virus on them. I added them to my iptables DROP list and was done with it.
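For anyone who wants to do the same, here is a rough sketch of what the DROP rules could look like. I'm assuming the INPUT chain and a plain append with -A, which may not match how your firewall is laid out. The function name print_drop_rules is made up, and it only prints the commands so they can be checked before being run as root.

```shell
# print_drop_rules IP...
# Hypothetical helper: prints (does not execute) one iptables command
# per address given on the command line. Assumes rules are appended
# to the INPUT chain; adjust the chain to suit your own setup.
print_drop_rules() {
    for ip in "$@"; do
        printf 'iptables -A INPUT -s %s -j DROP\n' "$ip"
    done
}
```

Call it with the offending addresses, for example print_drop_rules 192.0.2.1 198.51.100.7 (documentation addresses, not the real ones from my log).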

About this Archive

This page is an archive of entries in the Data munging category from November 2007.
