Large Software Systems Management and EDOS Workshop, Nancy, 6/7 July 2006

Members of various Linux distributions, researchers, hackers and a few more interested people met in Nancy, France, in the context of the 7th RMLL event, to compare their experiences on current issues concerning the management of large software systems, of which Linux distributions are a prominent and peculiar example.

The meeting was split into two parts. On July 6th, the LSSM'06 workshop was held: talks from key Debian actors, Caixa Magica representatives and Mandriva contributors addressed a series of problems related to quality assurance, release management, maintenance and customization in their respective distributions.

On July 7th, we had the first EDOS workshop, where members of the project gave an overview, and quite a lot of demos, of the work done by EDOS over its first year and a half of existence. EDOS is a specific targeted research project funded by the European Commission under the IST activities of the 6th Framework Programme.

You can see a summary of the talks by following the links for each event, but you will find more information here on what actually happened.

The problem(s)

So, what are the problems faced by a Linux distribution editor today? Well, some people think that their main issues are related to stabilizing a business model, getting some cool logo and a lot of good press, but we were more interested in looking at real, hard technical and scientific problems that are specific to the pretty revolutionary role of a Linux distribution editor.

Distribution editors try to offer some kind of reference viewpoint over the breathtaking variety of free software available today: they take care of packaging, integrating and distributing tens of thousands of software sources, very few being developed in-house and almost all coming from independent developers. We believe that the role of distribution editors is deeply novel: no comparable task can be found in the traditional software development and distribution model, where the time required for just signing - let alone negotiating! - tens of thousands of distribution contracts would be prohibitive. Now, one reason why you go to a distribution editor to get your free software, instead of just collecting it on your own directly from the author's pages, is simply the quality and ease of the deployment and maintenance of the software that the editors guarantee you: you want a system that is easy to install, easy to maintain and upgrade, easy to fix when a security problem comes up, up-to-date with the latest developments.

Well, guaranteeing this level of quality is not easy, and we believe that building a sane interaction between the wonderful people that actually make the system work today, and computer science researchers that find these issues challenging, is a good step forward.

In Nancy, our goal was to bring together experts from different distributions to compare experiences, ideas, tools, and solutions concerning the difficult task of maintaining a Linux distribution, and also to ask for their informed opinions on some tools and ideas that are currently being developed in the EDOS project, which is open to collaboration with everybody interested in these issues (by the way, we had also invited RedHat, Suse and Ubuntu people to come).

Here are the subjects we talked about:

  • quality assurance in Debian
  • release management in Debian
  • building custom distributions in Debian
  • package managers: limitations and possible improvements
  • dependency analysis
  • security issues
  • testing frameworks
  • distribution frameworks
  • indicators for the distribution building process

Dependency analysis tools: they may help manage sets of packages

For a package A to be installed, it may need some other packages B to be installed too (this is a dependency), and it may also require that some package C is not installed (this is a conflict). Such information is present in the package metadata. No matter which concrete package format one uses (DEB, RPM, or something else), one finds basically the same expressive power, which amounts to using the Boolean operators AND and OR in both dependencies and conflicts.
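To make the AND/OR structure concrete, here is a minimal sketch of how such metadata could be modelled. The Package class, its field names and the toy repository are our own illustration, not the DEB or RPM format, and versions are ignored entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical, version-free model of package metadata."""
    name: str
    # AND of OR-groups: every inner tuple must be satisfied by at least
    # one of the package names it lists.
    depends: tuple = ()
    # Names of packages that must not be installed alongside this one.
    conflicts: tuple = ()


def is_consistent(installed, repo):
    """True if the set of installed names satisfies every dependency
    and violates no conflict."""
    for name in installed:
        pkg = repo[name]
        for alternatives in pkg.depends:
            if not any(alt in installed for alt in alternatives):
                return False
        if any(other in installed for other in pkg.conflicts):
            return False
    return True


# Toy repository: a needs b AND (c OR d), and conflicts with e.
repo = {
    "a": Package("a", depends=(("b",), ("c", "d")), conflicts=("e",)),
    "b": Package("b"),
    "c": Package("c"),
    "d": Package("d"),
    "e": Package("e"),
}
```

With this repository, the set {a, b, c} is consistent, while {a, b} lacks one of c or d, and {a, b, d, e} violates the conflict with e.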

This metadata is exploited by the zillion depsolvers that are used daily by millions of people to maintain their own configuration on their own machine. We know that these depsolvers can still be improved, but this was not the focus of our discussion: we are aiming at improving the quality of a distribution, not of a user installation, which is a different problem.

Now, one aspect of maintaining a distribution, and improving its quality, is related to package interdependencies:

  • all packages present in a stable release should be at least installable individually, otherwise they are broken, and they should be removed or fixed
  • when moving from a release to a new one, we should ensure that the user experience is not degraded: if she was able to install a set of packages S in the old release, she should have a way to install some set S' of packages with at least the same functionality in the new release as well
  • some decent tools should be in the maintainer's hands to help him track problems and their evolution easily

The Debian people described the (complex) process of stabilizing a testing distribution, and the nice tools and methodology they use to achieve this goal.

The EDOS team working on dependencies (code named WP2) reported on the theoretical investigation of this problem. They presented a set of tools developed from scratch to provide new answers.

The most important theoretical result was the NP-completeness of installability: it has been formally proven that checking the installability of a single given package w.r.t. a given distribution is an NP-complete problem equivalent to SAT (see the proof in Chapter 3 of the D2.1 deliverable).
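To see what the installability question looks like, here is a deliberately naive sketch that tries every subset of a toy repository. The repository, the package names and the functions are hypothetical. The search is exponential and is only meant to make the problem statement concrete; it is not how debcheck or rpmcheck work.

```python
from itertools import combinations

# Hypothetical toy repository: name -> (depends, conflicts).
# depends is a list of OR-groups that must ALL be satisfied.
REPO = {
    "editor": ([["libgui1", "libgui2"], ["spell"]], []),
    "libgui1": ([], ["libgui2"]),
    "libgui2": ([], ["libgui1"]),
    "spell": ([["libgui2"]], []),
    "broken": ([["missing"]], []),  # depends on a package nobody provides
}


def consistent(installed):
    """True if every dependency is met and no conflict is violated."""
    for name in installed:
        depends, conflicts = REPO[name]
        if any(not any(alt in installed for alt in group) for group in depends):
            return False
        if any(other in installed for other in conflicts):
            return False
    return True


def find_installation(target):
    """Return a smallest consistent set containing target, or None if
    the target is not installable in REPO."""
    others = sorted(name for name in REPO if name != target)
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = {target, *extra}
            if consistent(candidate):
                return candidate
    return None
```

Here find_installation("editor") has to pick libgui2 rather than libgui1, because spell insists on libgui2 and the two libraries conflict; find_installation("broken") returns None.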

This raises the question of the algorithmic efficiency of testing for installability. (Recall that most theoretical computer scientists believe that problems that are NP-hard are, well, hard.) Fortunately, complexity classes like NP are defined in terms of worst-case complexity, that is, by the time it takes to solve the hardest possible instances, so this theoretical result does not prevent the instances we find in practice from being quite easy. Indeed, there is another theoretical concept, that of the temperature of SAT problems, which can be used to measure their difficulty. Good news: the temperatures of the installability problems for Debian and Mandriva packages are far away from the critical 4.2 value, so they are easy to solve. This explains why the EDOS tools devoted to testing the installability of single packages perform well: debcheck/rpmcheck can analyze a whole Debian pool or the full Mandriva Cooker in a couple of minutes, and the toolchain takes a couple of dozen minutes.

We discussed these issues at the workshop and we thought that the SAT temperature could be useful for finding "hot" packages, that is, packages whose temperature approaches (from below) the 4.2 zone where the hard problems lie. (If you are curious, a sorted list for etch at the end of June 2006 is available here). You can measure the temperature of your favorite packages in your favorite distributions by yourself with the history tool (for Debian) or the toolchain (for RPM based distributions).
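This report does not spell out how the temperature is computed. The sketch below assumes it is the ratio of clauses to variables in a SAT encoding of the installability problem, which is the quantity usually studied for random 3-SAT, where the hard instances cluster around a ratio a little above 4.2. The encoding, the toy repository and the function names are our own assumptions, not the ones used by history or the toolchain.

```python
# Hypothetical toy repository; "depends" is a list of OR-groups.
REPO = {
    "a": {"depends": [["b", "c"], ["d"]], "conflicts": []},
    "b": {"depends": [], "conflicts": ["c"]},
    "c": {"depends": [], "conflicts": ["b"]},
    "d": {"depends": [["b"]], "conflicts": []},
}


def closure(repo, root):
    """All packages reachable from root through dependency alternatives."""
    seen, stack = set(), [root]
    while stack:
        name = stack.pop()
        if name in seen or name not in repo:
            continue
        seen.add(name)
        for group in repo[name]["depends"]:
            stack.extend(group)
    return seen


def temperature(repo, root):
    """Clause/variable ratio of an assumed encoding: one variable per
    package in the closure, one clause per dependency OR-group, one per
    conflicting pair, plus one unit clause asking for root."""
    pkgs = closure(repo, root)
    clauses = 1  # "root must be installed"
    conflict_pairs = set()
    for name in pkgs:
        clauses += len(repo[name]["depends"])
        for other in repo[name]["conflicts"]:
            if other in pkgs:
                conflict_pairs.add(frozenset((name, other)))
    clauses += len(conflict_pairs)
    return clauses / len(pkgs)


def hottest_first(repo):
    """Package names sorted by decreasing temperature."""
    return sorted(repo, key=lambda name: temperature(repo, name), reverse=True)
```

For the toy repository, package a has 4 variables and 5 clauses, so its temperature under this assumed definition is 1.25, comfortably below the critical zone.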

Computing installability efficiently is all well and good, but wouldn't it be nicer for the maintainers and the common folk if these data were shown in a form more pleasant to the senses and easier to work on than an ASCII dump? The stakhanovists of WP2 have some answers already.

To visualize historical dependency data, one may track the evolution of installability information over time, as this may help in spotting problems and in following the history of a package. For standard Debian repositories, the "lifetable" pages of the web server anla provide, for every package, a chart with the day-to-day evolution of the installability for stable, testing and unstable, which also nicely shows the way versions migrate. This data has been updated daily since January 2006.

The anla tool also provides nicely interlinked metadata for all Debian packages over that period and, unlike most package search engines, it does not forget old packages, even if this means taking care of the significant growth of the distributions.

Debian has good days and bad days. On bad days, over 1 package in 10 in unstable can be broken, against only 1% on good days. Although installability is a function of all the packages in the distribution on a given day, it is sound advice to avoid upgrading your system on a bad day. To know if you are on a bad day, just check the Debian Weather Service also provided by anla. Today is sunny, but I would have avoided upgrading my unstable system on the 11th or 12th of April 2006, as there was heavy rain and thunder (12.5% of packages broken).
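The arithmetic behind such a forecast is simple. The sketch below turns a count of broken packages into a percentage and then into a weather label; the thresholds and labels are hypothetical, chosen only to agree with the two data points quoted above (about 1% on a good day, 12.5% on a stormy one), and are not the ones used by the Debian Weather Service.

```python
def broken_percentage(total, broken):
    """Percentage of packages that are not installable."""
    return 100.0 * broken / total


def weather(pct):
    """Map a percentage of broken packages to a weather label.
    Thresholds are illustrative assumptions, not anla's."""
    if pct <= 1.0:
        return "sunny"
    if pct <= 5.0:
        return "cloudy"
    if pct <= 10.0:
        return "rain"
    return "thunderstorm"
```

For instance, 2000 broken packages out of 16000 give 12.5%, which this sketch reports as a thunderstorm.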

Mere mortals may be satisfied by looking at colourful weather icons, but maintainers need to manipulate historical dependency information in complex ways to spot and dissect problems and see exactly when they started. EDOS WP2 developed a query language, dubbed "Debian Query Language", that allows one to perform complex queries and tests on repository data, including installability checks.

Interested people can point their firefoxes at the EDOS console to try it out without installing anything. For serious use, one should install history, which is a command-line tool without the resource and I/O limitations of an AJAX interface.

We also discussed the necessity of having some kind of transaction mechanism in the client-side tools used to manage distributions: Caixa Magica people presented an experimental version of apt-rpm that supports rollbacks, and some comparisons were made with features existing in the new versions of rpm.

Testing installation of a whole distribution

Another key issue addressed by EDOS relates to the quality assurance of a Linux distribution. During the workshop, the current status of the EDOS TULIP framework was presented. TULIP stands for "Testing Upgrades of Linux Images Program". TULIP's purpose is to drive upgrade tests of various Linux distributions, providing both fine-grained QA at the package level and tests of the reliability of the upgrade result.

The EDOS testing framework will build upon TULIP and will ultimately consist of:

  • a generic XML format describing a test or a test suite pertaining to the installation, update or functional testing of a Linux package
  • a meta-test runner that takes XML test descriptions as input and delegates the test run to an underlying test runner such as Dogtail for GUI testing, the OpenOffice.org QA system for OpenOffice.org related tests, TULIP, or any other test runner
  • TULIP for running upgrade tests
  • a web portal taking over the workflow associated with the creation of tests, the submission of test reports and the display of associated metrics.
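The meta-test runner idea from the list above can be sketched as follows. The XML vocabulary (testsuite, test, the runner attribute), the package names and the stub runner functions are all hypothetical; the actual EDOS format is not described in this report, and the stubs only return a message instead of invoking TULIP or Dogtail.

```python
import xml.etree.ElementTree as ET

# Hypothetical test-suite description; not the real EDOS XML format.
SUITE = """
<testsuite name="demo">
  <test id="upgrade-etch" runner="tulip" package="apache2"/>
  <test id="gui-smoke" runner="dogtail" package="gedit"/>
  <test id="unknown" runner="nope" package="foo"/>
</testsuite>
"""


def run_tulip(test):
    # Stub: a real implementation would hand the test over to TULIP.
    return f"tulip: would test the upgrade of {test['package']}"


def run_dogtail(test):
    # Stub: a real implementation would hand the test over to Dogtail.
    return f"dogtail: would drive the GUI of {test['package']}"


# Registry mapping the runner attribute to the underlying test runner.
RUNNERS = {"tulip": run_tulip, "dogtail": run_dogtail}


def run_suite(xml_text):
    """Parse a suite description and delegate each test to its runner."""
    results = {}
    for elem in ET.fromstring(xml_text.strip()).iter("test"):
        test = dict(elem.attrib)
        runner = RUNNERS.get(test["runner"])
        if runner is None:
            results[test["id"]] = f"skipped: no runner named {test['runner']}"
        else:
            results[test["id"]] = runner(test)
    return results
```

Keeping the runners in a registry means a new underlying test runner can be plugged in without touching the XML parsing, which is the point of having a generic description format.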

Ideas and experiments for distributing over P2P networks

While many distributions address the problem of disseminating packages via a network of mirror sites, from which the user chooses at download time, and seem happy with this solution, this is not the case for every distribution, and it is interesting to experiment with alternative solutions designed from scratch to scale up in the future.

One such alternative was discussed at the workshop: a P2P architecture for content dissemination that improves on the classical hierarchy of mirrors or centralized index architectures. The main idea is to optimize the use of resources in the dissemination network by distributing the effort among all the participants. Instead of addressing targeted peers (the publisher site, a specific mirror) for file download, for querying content, etc., users address the P2P system as a whole. The system itself manages load balancing through a fair distribution of metadata files, index entries, download bandwidth and query processing among the peers.

The general dissemination architecture proposed in EDOS is a P2P network composed of three kinds of peers: a Publisher, a set of Mirrors and a large set of Clients. Each peer may play several roles such as publishing, querying, downloading, subscribing, notifying, etc. Dissemination is considered in two different situations: (i) flash-crowd, when many peers are waiting for the publication of popular, large content (e.g. a new Linux release), so that many clients download almost the same content at the same time, and (ii) off-peak, when users are querying the system and downloading individual packages. Information management is based on a distributed index, in which metadata is indexed at publishing time and which provides advanced querying capabilities.
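One common way to spread index entries fairly among peers is hash-based placement. The sketch below uses a consistent-hashing ring as an illustration; this is an assumption on our part, since the report does not say how the EDOS distributed index assigns entries, and the Ring class and peer names are hypothetical.

```python
import bisect
import hashlib


def ring_position(key):
    """Map a string to a point on the ring using SHA-1."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Minimal consistent-hashing ring: each index entry is owned by the
    first peer found clockwise from the entry's position."""

    def __init__(self, peers):
        self._points = sorted((ring_position(p), p) for p in peers)
        self._positions = [pos for pos, _ in self._points]

    def owner(self, key):
        i = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[i % len(self._points)][1]


peers = ["publisher", "mirror-1", "mirror-2", "client-1"]
ring = Ring(peers)
placement = {f"pkg-{i}": ring.owner(f"pkg-{i}") for i in range(50)}
```

A useful property of this scheme is that when a peer leaves, only the entries it owned move to another peer; everything else stays where it was, which matters in a network where clients come and go.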

The system is based on a Java API that provides several levels of access to EDOS dissemination functionalities. The EDOS dissemination API uses the KadoP distributed XML repository, the ActiveXML P2P document manager and the Azureus BitTorrent software for flash-crowd dissemination. The prototype under construction uses Tomcat/Axis and provides JSP applications on top of the EDOS dissemination API for each actor type. For more details, see an overview of the system architecture and the workshop presentation slides.

Indicators for distributions

The EDOS Consortium is also conducting a transverse measurement effort to define indicators for the production, testing and dissemination processes of a Linux distribution. The objective is to propose a framework to produce and consume indicators measuring the achievements of the EDOS tool chain.

During the RMLL workshop, the state of the art of existing initiatives related to OSS measurement was presented. Such initiatives include The Open Business Readiness Rating, the OSSMole project, the Open Source Maturity Model and the recent ohloh.net project.

The EDOS measurement approach focuses on the process of crafting and disseminating a Linux distribution. The measurement framework relies on the EDOS Project Management Interface (PMI), a generic model representing the main artefacts and workflows of OSS engineering. A series of indicators has been defined, each of which is measured by a dedicated tool provided either by the EDOS toolchain or by existing software in the OSS ecosystem. The objective is to have a customizable dashboard that will let all participants have their own aggregated viewpoints on the processes and results they're interested in, depending on their activity profile: developers, maintainers, testers and documentation writers need dashboards that display data related to their core activities.

The main categories of EDOS metrics relate to the management of dependencies, the production process, the quality assurance process and package dissemination over a P2P network.

Existing tools provided by the Mandriva Linux contributor community were presented and discussed, including Sophie, which allows complex queries and statistics against the Mandriva RPM database, and Youri, which features a quality assurance dashboard.

The next steps will consist in continuing the implementation of the indicators defined, and in providing the community with a standard representation of the metrics of an open-source project, and more specifically a standard representation of a Linux Distribution Project. Along the lines of the DOAP initiative, these standards could consist of two RDF schemas entitled "MOAP" and "MOALD", respectively standing for "Metrics Of A Project" and "Metrics Of A Linux Distribution".

Version 1.51 last modified by RobertoDiCosmo on 04/10/2006 at 17:39

Attachments 1

etch.dat.bz2 1.1
PostedBy: RobertoDiCosmo on 17/07/2006 (144kb )

Creator: RobertoDiCosmo on 2006/07/14 16:58
Copyright EDOS Consortium