Lately, I've been thinking about the issue of package testing on a Linux distribution that does rolling upgrades.
I've arrived at the idea of mainly using three different repositories: stable, testing and unstable. You may notice that I borrowed the same nomenclature that the Debian project uses, and that's because I intend these repositories to mean more or less the same as they do for Debian.
So, what is different about my idea? I shall explain that during the course of the article, but mainly it is different because I am referring to rolling upgrades, which Debian does not use on its stable repositories. Also, for my approach to work, tight integration with a new GUI package manager is required, to support the workflow that I am about to describe.
The Unstable Branch
Packages are first uploaded by packagers to the unstable branch. I shall ignore the issue of who can upload to this branch -- most distributions already have a packaging team or some kind of organizational structure to support this.
Only stable upstream releases should reach this branch. I will come back to the issue of unstable development packages later on.
Now, the interesting part is: how do packages retire from the unstable branch into the testing branch, or get "popped off" the unstable branch because they do not work?
I propose the usage of a GUI package manager that allows users to easily select from which branch they want to install an application, and to mix and match versions from the three branches whenever possible. Of course, it makes little sense for a user running a stable KDE SC 4.3.5 release to install Konqueror from KDE SC 4.4.1 -- the usual package management rules must be applied here, so the user either installs an older stable Konqueror, or KDE SC 4.4.1 as a whole.
But imagine now that the stable branch contains Firefox 3.5, and Firefox 3.6 has just been released and uploaded to the unstable branch. In the package manager, users can see that there is a new version of Firefox on the unstable branch.
Let's imagine that a user -- John -- chooses to install this package. The package manager does its work, and notifies John that Firefox 3.6 is done installing. It also adds the package to the "Verification queue" for John.
John fires up Firefox 3.6, goes to Slashdot, and all seems ok. He can then go back to the package manager and tick Firefox 3.6 in his verification queue as "Ok".
The idea is that packages retire from unstable to testing not on a time-based schedule, but when enough users give them an "Ok" after a quick test. If there was an issue and the Firefox package had something wrong, for example if it crashed on startup, the user would report a "Failure" in the package manager's verification queue. A small comment explaining the problem could be attached to this Failure and, optionally, links to more detailed bug reports could be added (either to the distro's own bug reporting system, or to an upstream bug).
To decide when a package with a number of "Ok" and "Failure" reports gets promoted to the testing branch or is popped off the unstable branch, I think a scoring system could be adopted. To simplify matters for now, let's say that if a package has an Ok/Failure ratio greater than 20:1 and more than 100 Oks, it is selected for the testing branch, and if it has more than 10 Failure reports, it gets popped off unstable.
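To make the simplified rule concrete, here is a minimal Python sketch of it. The function name and the "promote" / "pop" / "wait" labels are my own placeholders, not part of any existing package manager. The rule above does not say what happens when a package meets both conditions at once, so the sketch assumes that the Failure limit is checked first.

```python
def decide_unstable(oks: int, failures: int) -> str:
    """Decide the fate of a package in unstable from its raw report counts.

    Returns "promote" (move to testing), "pop" (remove from unstable)
    or "wait" (keep collecting reports). Hypothetical helper.
    """
    # More than 10 Failure reports: the package is popped off unstable.
    # Assumption: this check wins if both conditions hold.
    if failures > 10:
        return "pop"
    # More than 100 Oks and an Ok/Failure ratio above 20:1.
    # Written as a multiplication so zero Failures needs no special case.
    if oks > 100 and oks > 20 * failures:
        return "promote"
    return "wait"


if __name__ == "__main__":
    for oks, failures in [(150, 5), (150, 8), (300, 11), (100, 0)]:
        print(oks, failures, decide_unstable(oks, failures))
```

A package with 150 Oks and 8 Failures stays in unstable under this rule, because 150:8 is below the 20:1 ratio even though it has more than 100 Oks.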
The main idea of this unstable branch is really a quick "does it seem to work?" test. Open Firefox, browse to a site. No problems? Check "Ok". With a kernel release, boot it. It booted and you're back staring at your desktop? Check "Ok".
The Testing Branch
The testing branch works on a time-based schedule. Users can also choose to install packages from here, and these will also be added to the Verification Queue, but with one difference: users will need to have the package installed for at least a week before they can mark it as "Ok".
On the other hand, if they experience any problem, they can flag a "Failure" right away, and follow the same procedures that they did on the unstable branch to comment on why they marked their test of the package as a "Failure" and to link it to bug reports.
The idea is that a week is usually enough for users to notice problems in usage of the package for their normal work.
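As a rough illustration of this waiting period, the check the package manager would need is small. The sketch below is only an assumption about how it might look; `allowed_verdicts` and `MIN_INSTALL_TIME` are hypothetical names.

```python
from datetime import datetime, timedelta

# Minimum time a testing package must be installed before an "Ok" is accepted.
MIN_INSTALL_TIME = timedelta(weeks=1)


def allowed_verdicts(installed_at: datetime, now: datetime) -> set:
    """Return the verdicts a user may give for a package from testing."""
    # A "Failure" can be flagged right away.
    verdicts = {"Failure"}
    # An "Ok" only becomes available after a full week of use.
    if now - installed_at >= MIN_INSTALL_TIME:
        verdicts.add("Ok")
    return verdicts


if __name__ == "__main__":
    installed = datetime(2010, 3, 1, 9, 0)
    print(allowed_verdicts(installed, datetime(2010, 3, 4, 9, 0)))
    print(allowed_verdicts(installed, datetime(2010, 3, 8, 9, 0)))
```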
After enough users have verified the package as "Ok", and few enough users have had problems (again, I shall discuss my ideas for scoring later on), the package gets transferred to the stable branch.
If there are too many problems with a package, it might go back to the unstable branch or be popped off completely. I'm not entirely sure yet which would work better in this case.
The Stable Branch
The stable branch behaves as expected from other distros.
Development Packages
Many distributions also want to make it easy for users and other developers to test development versions of upstream packages. For these, I propose a parallel repository system with just two different repositories: unstable-development and testing-development. Both of these should work as the ones above, but the difference is that a package never retires from testing-development to any other repository; it either stays there, or is deleted by the packagers.
Scoring System and "Social" Stuff
This scheme might work even with a simple scoring system, but I think that a more complex system might yield better results. I've held back on explaining this part so far because I think that although this should be part of the integral solution, it is not the main part, and Social-whatever is used too much as a buzzword nowadays.
The first part of this idea is that users should have an account on the package testing system. I don't know if anonymous voting would work without trying it out, so it might be an option, but anonymous votes should have the least weight on verification queue voting.
This account could be used across multiple computers by the same user (laptop, desktop), allowing the user to, for example, install Firefox 3.6 from the testing repository on their work computer on Monday, install the same package on their laptop on the following Friday, and report it as "Ok" on the following Monday from their laptop -- that is, the testing week is not tied to the system on which the package is installed, but to the time and date it was first installed by that user on one of their systems.
Also, the weight of a user's "Ok" for a package would be tied to multiple things: account creation time (long-time users are more likely to give trustworthy reviews), number of correct verifications (the user could be penalized for giving an "Ok" to a package that was eventually popped off unstable, and rewarded if a package they marked "Ok" on unstable and testing made it through to stable), and maybe a community rating (being part of the distro team, helping people on the forums regularly, etc).
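The Monday / Friday / Monday example can be sketched in a few lines of Python. This assumes the testing system keeps, per user and per package, the install time on each of the user's machines; the `can_mark_ok` function and the machine names are made up for illustration.

```python
from datetime import datetime, timedelta

TESTING_PERIOD = timedelta(weeks=1)


def can_mark_ok(installs: dict, now: datetime) -> bool:
    """installs maps a machine name to the time the package was installed there.

    The testing week starts at the earliest install on any of the user's
    machines, not at the install time on the machine they report from.
    """
    if not installs:
        return False
    first_install = min(installs.values())
    return now - first_install >= TESTING_PERIOD


if __name__ == "__main__":
    installs = {
        "work-desktop": datetime(2010, 3, 1, 9, 0),  # Monday
        "laptop": datetime(2010, 3, 5, 9, 0),        # the following Friday
    }
    # The following Monday, reporting from the laptop.
    print(can_mark_ok(installs, datetime(2010, 3, 8, 9, 0)))
```

Note that the laptop install is only three days old at report time; the "Ok" is allowed because the clock started with the work computer.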
These combined values would determine the weight of a user's vote, and this weighted vote would be the one used when determining if a package is ready or is too buggy. For example, if 15 very trusted users gave an "Ok" for a package, maybe it can retire from unstable into testing right away, instead of having to wait for 200 anonymous users to test it.
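Here is one way such a weighting could look, purely as a sketch. Every number in it (the 0.5 anonymous weight, the three-year cap on account age, the factor of 4 for accuracy) is an arbitrary assumption of mine, chosen so that 200 anonymous Oks add up to 100 points; real values would have to be tuned by trying the system out.

```python
from dataclasses import dataclass

ANONYMOUS_WEIGHT = 0.5  # anonymous votes carry the least weight


@dataclass
class Voter:
    anonymous: bool = False
    account_age_days: int = 0
    correct: int = 0              # past verdicts that turned out right
    incorrect: int = 0            # past verdicts that turned out wrong
    community_bonus: float = 0.0  # e.g. distro team member, forum helper


def vote_weight(v: Voter) -> float:
    """Combine account age, track record and community rating into a weight."""
    if v.anonymous:
        return ANONYMOUS_WEIGHT
    # Account age in years, capped so very old accounts do not dominate.
    age = min(v.account_age_days / 365, 3.0)
    total = v.correct + v.incorrect
    # Users without any history get a neutral accuracy of 0.5.
    accuracy = v.correct / total if total else 0.5
    return 1.0 + age + 4.0 * accuracy + v.community_bonus


def weighted_score(voters) -> float:
    """Sum of the weights of everyone who gave the same verdict."""
    return sum(vote_weight(v) for v in voters)


if __name__ == "__main__":
    trusted = Voter(account_age_days=2000, correct=40, community_bonus=1.0)
    anonymous = Voter(anonymous=True)
    print(weighted_score([trusted] * 15))
    print(weighted_score([anonymous] * 200))
```

With these made-up numbers, 15 trusted users outweigh 200 anonymous ones, which is the behaviour described above. The weighted Ok and Failure scores would then replace the raw counts in whatever promotion rule is used.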
Epilogue
I hope some (or even all) of these ideas can be applied in the creation of a simple Linux distribution (or the transformation of an existing one) that does rolling upgrades. Most binary, easy-to-use distros currently do not possess such a feature, and users are normally left to mix and match from experimental "repositories" that have different degrees of testing and readiness, usually with no standard way to flag problems or open bug reports on them.
If you have read this far, I thank you, and I hope I was able to spark some ideas about this issue. Please do leave your opinions on my proposal, or anything else you think is important, in the comments below.