Should Perl 6 use the CPAN?

I just gave my keynote at Frozen Perl, and one of the big points I made was that we don't know what Perl 6 is going to look like. It's totally a green field. There's no toolchain, no LWP, no DBI, etc.

My big question: Should Perl 6 use the CPAN?

Does an 11-year-old distribution system make sense in 2009? In 1998, we didn't have everything living in a cloud, hosting websites took a lot of money, and if you wanted massive bandwidth, you had to be at a big company or a university. In 2009, none of that is true any longer.

Of course, I'm not suggesting that we stop distributing thousands of excellently awesome modules to the world. If we didn't, we wouldn't be Perl. But does it need to be through a centralized distribution channel like PAUSE + CPAN?

I don't have an answer.

Discuss.

28 Comments

Yes we should have a centralised repository. The fragmentation of ruby gems causes me pain on a day to day basis.

Yes, we should have a centralised repository (it can of course be distributed). The fragmentation of Ruby gems causes me pain on a day-to-day basis: I have to add new repos as and when my developers find a new shiny gem to use. Admittedly this is made worse by only a few gems being packaged for Debian. Also, turning gems into debs supposedly isn't as easy as it is for Perl modules.

Yes, we should have a centralised repository (it can of course be distributed). The fragmentation of Ruby gems causes me pain on a day-to-day basis: I have to add new repos as and when my developers find a new shiny gem to use. Admittedly this is made worse by only a few gems being packaged for Debian. Also, turning gems into debs supposedly isn't as easy as it is for Perl modules.

I agree with all the bobs on all points. Having as many distribution systems as there are websites and not being able to batch the install of something like WWW::Mechanize (which pulls in about 40 modules automatically while I watch movies) would be a huge mistake.

At least, if it isn't centralized, there should be a place to report your repo and there should be some standard format? But given ExtUtils::MakeMaker vs Module::Build vs Module::Install (etc) how on earth could you ever ensure there'd be a way to install the hundred or so modules you use on a regular basis? If one or two of the sites WWW::Mechanize needed were missing ... well, you just live without it for a few days?

Yeah, I think the centralized strategy (if it isn't on CPAN it doesn't exist) is a huge strength. Since I'm used to it, I can't see a reason to do anything else. I understand python people hate it, but I really don't understand why. Going to 40 different websites to get "eggs" blows. I mean, it really sucks.

I'll "fourth" bob's analysis. Call it CPAN 6 to separate it in order to encourage new modules to fit in with Perl 6's style. If a more distributed model is desired, make it only semi-distributed so that the CPAN software can still go to one place to figure out the Right Thing to do.

How do we know how "the CPAN software" is going to work? We don't.

Everything is green field. We have no toolchain. No distribution model. Are we going to convert ExtUtils::MakeMaker? I sure hope not!

I would say that CPAN, and the centralized distribution model, is the saving grace of Perl. Kill that, and we lose one of our major advantages. Sure, you'll modernize it, you'll have a chance to add more checks and more tests to standardize distribution, but don't destroy the core. It works, and it works well.

I agree with everyone above basically. I think CPAN is totally great. It shows its age sometimes and it would be pretty cool if we modernized it some, but for the most part I think that CPAN is awesome.

Also: your slides are awesome; we will be excellent to each other :-)

> How do we know how "the CPAN software" is going to work?
> We don't.

It's a helluva lot more likely CPAN will be up than *all* the disparate websites involved in a typical Perl install for me.

> ExtUtils::MakeMaker? I sure hope not!

I happen to really like MakeMaker. Makefiles actually work, unlike Module::Build. But that's a religious argument. See, I'm not saying things shouldn't be re-done from the ground up -- particularly given the chance to redo everything from scratch -- but don't throw away the centralized repository!!

The lack of CPAN really ruins the other languages for me. Let's keep it that way.

Bob's right on, but his comment about debs leads me to this observation:

If each CPAN6 module cannot be automatically converted to .deb -- if, in fact, CPAN6 does not offer a standard Debian apt repository, as well as a yum repository - then CPAN6 will have failed to live up to its potential.

Package management is simply not optional any more.

Mmmm, I kind of like the way the cheeseshop does it, where it redirects you to another site, but does it all automatically so you have no idea it's happening. Perhaps both that and the ability to keep modules in the CPAN would be good? But actually, I think it's simpler for module authors to just upload to a centralized resource. We just need a better UI like the CPANHQ interface that bricas was working on.

-Max

CPAN is a big success. Of course it needs to be modernized, but why abandon something that has been working so well through the years?

I'm not a fan of decentralizing. For example, the recent change that lets you add your own bugtracker URL means you now have to *register* just to report one bug in a module (if it's not using rt.cpan.org). This might discourage people who already have an account at rt.cpan.org from reporting a bug.

Max: That'd be very bad. When the keepers of N servers holding N*M modules lose interest or go out of business ... the code is gone!

CPAN is holographic. Any CPAN server contains all of CPAN. This is a Good Thing.

I agree with everyone here - a version of CPAN should exist...my big hopes/suggestions would be that it has digg-like features and maybe gives rankings or something based on the number of times the module has been installed (or at the very least, the number of times a module's documentation has been viewed)...better tagging and grouping by the crowd would also be a nice bonus (so take digg and delicious and apply them to the current CPAN and I think it will be just the improvement we're all looking for without compromising any of the current benefits we are all used to getting/using).

I agree with everyone about the WHAT - a centralized directory of modules is essential - otherwise you get into repository hell. Today I know that if it's worth using then it's in CPAN. No need to try to dig unknown code from untrusted sources (as much as you trust CPAN not to host malicious code, of course)

BUT - I'd like to suggest a HOW which might take advantage of the technology and tools developed since CPAN began.

I mean things like P2P, bittorrent, HTTP-based data etc.

An HTTP, web based central directory might still exist to ease the initial access, and maybe access when no better tool is available/installed (e.g. when all you've got is a browser or an old-style CPAN client).

But the repositories behind this HTTP/web interface might actually be linked through bittorrent or other p2p means which will download pieces of the packages from other users, automatically finding the closest peers and sharing the load of hosting the data among all the CPAN users.

Perl 6 will be unusable without the CPAN or something like it, because the standard Perl 6 library will contain only enough modules to connect to a CPAN and install other modules. A 6PAN is not optional.

> automatically converted to .deb

I definitely don't think perl6/6pan should understand .debs, but there should be some standard way of talking about what comes with a module so the module distribution can easily be converted to a .deb, .rpm, or a ... windows package...

It's always a shock to perl5 newcomers that there's no way to uninstall a package. It should be a lot easier to integrate with your OS-du-jour's package system.

I was surprised to see this entry today, as I uploaded my first Perl 6 package to CPAN just last night.

CPAN is a great asset to perl - it is not only the centralised aspect, but also the aspect of a high minimum standard - documentation, test suite etc. Even barely written pre-pre-pre alpha modules have a sketch of documentation and tests, and this does a lot for the credibility of perl.
People have mentioned the mess around Ruby gems - even though there are some forges containing thousands of gems, most of them have barely a description, almost no documentation, and the quality standards are all over the place. So downloading one is a lot like playing Russian roulette with code. There is a similar issue in PHP and to some extent in Python.
On the other hand should CPAN evolve with the times and have a better web layer, with more ways to tag, comment, interact with and recommend/share modules, even externally? Hell yes! I think it is important also to figure out how CPAN (and other community solutions) can evolve to encourage, sustain and promote the development of (existing and new) frameworks - because many people do not look at the quality and quantity of the modules when deciding for a project, but at whole frameworks, and perl is just off the radar from that perspective.

I agree with many of the comments here. CPAN is one of the biggest selling points of Perl as a platform.

However, Perl 6 gives us the opportunity to build something more modern, more social, more inclusive for commenting and bug reporting, more agile for developers, with choice about the manner in which you interact with the system (centralized, distributed, etc.).

One should be able to use a CPAN-hosted version control system, or specify an external location if you choose to host your version control elsewhere (e.g., an SVN deployment, GitHub, Gitorious, Assembla, etc.). Same with bug tracking (though this may prove less than ideal).

It does seem imperative that we address package management in some way, not just making tar.gz files available, but also debs, rpms, tgz, msi, etc. And we need a standard and easy way to build these packages (OWTDI, period, ever). I think we'll see higher adoption and market penetration for Perl 6 if it's super easy to create a package for every platform, and super easy to install the same package on every server that you manage using your existing methodology (pssh, Capistrano, Windows group policy, whatever exists for RedHat, SuSE, etc.).

I think there may also be value in making it easy to centrally create and manage your own fork of a module, and perhaps eventually make that fork be *the* module, so that when someone goes to grad school, changes careers, gets married, dies, or whatever, a valuable module can still carry on (or local modifications can be made, and kept in a personal repo).

Maybe we could take Launchpad and extend it with such features. I think it embodies the spirit of what I'm getting at here, with its flexible project definition features and PPAs.

As visitors to YAPC::EU and various other Perl Workshops may know, I am already working on a possible follow-up to CPAN, under the original name "CPAN6". It has already been under development for three years.

My CPAN6 is not only capable of storing Perl6 modules, but also other Parrot related stuff (and more). It can be used to create any number of CPAN like archives, where you can determine your own level of regulation; from very loose to extremely strict.

It should be ready when Perl6 really gets users. Maybe, the current economy helps me (and other people) to complete the implementation. Read more at http://cpan6.org

Really, the web interface (with Digg, ratings, etc.) that people dream about could be independent of the actual repository. It's just additional metadata (think mashup).

I've been impressed lately with the way Maven 2 handles dependencies. For example, you can declare that jUnit 3.1 is required for testing, but not for deployment (see pom.xml documentation -- the Project Object Model). Maven repositories can be proxied and cached, leading to a decentralized distribution that can be aggregated to a single point with a bit of configuration. It's working very well for me at the moment.

I can imagine a state, sometime in the future, where Perl 6, Java, Python, etc. resources are available through similar means and language interoperability makes it such that a Perl 6 library could depend on a well-written Java library. Maven seems to be structured in a way that would allow that language-independent artifact repository.

Of course Maven is a lot more than an artifact repository, but that's potentially a different discussion.
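To make the scoped-dependency idea above concrete, here is a minimal sketch in Python. It is not how Maven resolves dependencies, only an illustration of the principle that a test-only dependency is pulled in when testing but not when deploying. The module names, the `Dependency` class and the `install_order` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Dependency:
    name: str
    scope: str  # "runtime" or "test" (a simplified, assumed set of scopes)


# Hypothetical dependency metadata, as a repository index might expose it.
DEPENDENCIES: Dict[str, List[Dependency]] = {
    "My::App": [
        Dependency("Web::Client", "runtime"),
        Dependency("Test::Kit", "test"),
    ],
    "Web::Client": [Dependency("URI::Tools", "runtime")],
    "Test::Kit": [],
    "URI::Tools": [],
}


def install_order(root: str, include_test: bool = False) -> List[str]:
    """Return modules in an order where dependencies come before dependents.

    Test-scoped dependencies are followed only for the root module, and
    only when include_test is True.
    """
    order: List[str] = []
    seen: Set[str] = set()

    def visit(name: str, top_level: bool) -> None:
        if name in seen:
            return
        seen.add(name)
        for dep in DEPENDENCIES.get(name, []):
            if dep.scope == "test" and not (include_test and top_level):
                continue
            visit(dep.name, False)
        order.append(name)

    visit(root, True)
    return order


deploy_plan = install_order("My::App")
test_plan = install_order("My::App", include_test=True)
```

Here `deploy_plan` leaves out `Test::Kit`, while `test_plan` includes it. The client decides what to fetch from metadata alone, which is what would let a proxy or cache sit between it and the repository.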

My opinion is that something like the CPAN system is a must for Perl 6. Not just CPAN itself, but the CPAN clients, PAUSE, RT, search.cpan.org, testers, the whole ball of wax. No question. But some time needs to be taken to figure out how to make things better (and, maybe, to thrash out what "better" is). For example:

We need some sort of module life cycle management. All the current CPAN system has that I am aware of is the module registration system, which is a start but has deficiencies. For example, it has an 'abandoned' status, but modules do not get there without the author putting them in that state before he or she gets up and walks away. It is not hard to think of modules whose authors are (for whatever reasons) ignoring the RT queue but still claiming their modules.

We need a better way to manage bug reports, one that is more in line with TIMTOWTDI. RT is okay, and I use it. Other module authors use other methods. This means the user of the module has to figure out the module author's personal preferences and register in the correct place before a bug report can be submitted. What I think I'm asking for is not a central bug-management system, but a bug dispatcher, so that the module user can submit a bug report without knowing what the author wants done with it, and the author can configure the dispatcher for his or her own preferences, rather than rooting around in all the places users might decide to put them.

The client must be self-configuring. This implies a fair amount, like awareness of where the client is, mirror status, and a whole bunch of things. The CPAN client tries to do this, but misses on several (to me) important points: you don't get a mirror (vital!), and you don't by default get several best practices like some variant of 'sudo make install' for your makemaker install command.

The client should have a GUI interface. I would say 'must', but that's hard when Perl doesn't have one (or rather, has too many, all of which are hard to install). Increasingly, you have to be a geek to mess with a command-line interface. If we want to restrict Perl to geeks, that's fine. If we want Perl applications to be ubiquitous, we have to recognize that not everyone is a geek, and not require people to become geeks to use them. Sorry guys -- I know this is potential flame bait, and it is not meant as such. But I just tried to (long-distance) get a Perl application onto my sister-in-law's machine, so I feel this is a point of view that needs to be considered.

I'm sure I could think of more if I sat longer, but this is enough to be going on with.
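The bug dispatcher suggested above could be sketched roughly as follows, in Python for brevity. This is only an illustration of the routing idea, not an existing CPAN or RT interface: the `BugDispatcher` class, the handler functions, the module name and the email address are all made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BugReport:
    module: str
    summary: str


# A handler takes a report and returns a note saying where it was sent.
Handler = Callable[[BugReport], str]


@dataclass
class BugDispatcher:
    default: Handler
    routes: Dict[str, Handler] = field(default_factory=dict)

    def register(self, module: str, handler: Handler) -> None:
        """Called by a module author to state where reports should go."""
        self.routes[module] = handler

    def submit(self, report: BugReport) -> str:
        """Called by a module user, who need not know the author's choice."""
        handler = self.routes.get(report.module, self.default)
        return handler(report)


def central_queue(report: BugReport) -> str:
    # Stand-in for filing the report in a shared tracker.
    return f"central queue <- {report.module}: {report.summary}"


def forward_by_email(address: str) -> Handler:
    # Stand-in for an author who prefers to receive reports by mail.
    def handler(report: BugReport) -> str:
        return f"mail to {address} <- {report.module}: {report.summary}"

    return handler


dispatcher = BugDispatcher(default=central_queue)
dispatcher.register("Foo::Bar", forward_by_email("author@example.org"))

routed = dispatcher.submit(BugReport("Foo::Bar", "crashes on empty input"))
fallback = dispatcher.submit(BugReport("Baz::Qux", "typo in the docs"))
```

The user always calls `submit` in one place. The author's preference only changes which handler runs, and modules with no registered preference fall back to the shared queue.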

If CPAN6 can at all learn from github and do what they're doing right (it's so beautiful) along with providing the ability to do chained installs, etc, a la CPAN5, then CPAN6 could be the sweetest thing on the planet. If it's a strong enough tool, the private repository model that Github uses could bring money to TPF.

This brings up another question: is CPAN6 just going to be about Perl6/Rakudo, or is it going to be modular enough that other languages can employ it as well? Parrot provides multilingual capacity and will soon allow, for example, a Perl module to use a Ruby module or a Python module. Could it be an effort that transcends languages, and instead empowers and employs the entire open source community? If Parrot transcends the P in CPAN, should the focus be CPAN or should it instead focus on CAN?
~Seth Viebrock

One more thought, since chromatic alluded to it: Support the DarkPAN specifically. We need to be friendly to everyone behind a firewall, or not on the net.

Another gotcha: People want to be able to uninstall modules.

I really want a tool for uninstalling modules, because some modules are unfinished and wrong.
