Large Software Systems Management and EDOS Workshop, Nancy, 6/7 July 2006
Members of various Linux distributions, researchers, hackers and a few other interested people met in Nancy, France, in the context of the 7th RMLL event, to compare their experiences on current issues concerning the management of large software systems, of which Linux distributions are a prominent and peculiar example. The meeting was split into two parts. On July 6th, the LSSM'06 workshop was held: talks from Debian key actors, Caixa Magica representatives and Mandriva contributors addressed problems related to quality assurance, release management, maintenance and customization in their respective distributions. On July 7th, we had the first EDOS workshop, where members of the project gave an overview, and quite a lot of demos, of the work done by EDOS over its first year and a half of existence. EDOS is a specific targeted research project funded by the European Commission under the IST activities of the 6th Framework Programme. You can see a summary of the talks by following the links for each event, but of course you will find more information here on what actually happened.
The problem(s)
So, what are the problems faced by a Linux distribution editor today? Well, some people think that their main issues are related to stabilizing a business model, getting some cool logo and a lot of good press, but we were more interested in looking at the real, hard technical and scientific problems that are specific to the pretty revolutionary role of a Linux distribution editor. Distribution editors try to offer some kind of reference viewpoint over the breathtaking variety of free software available today: they take care of packaging, integrating and distributing tens of thousands of software sources, very few of them developed in-house and almost all coming from independent developers. We believe that the role of distribution editors is deeply novel: no comparable task can be found in the traditional software development and distribution model, where the time required for just signing (let alone negotiating!) tens of thousands of distribution contracts would be prohibitive. Now, one reason why you go to a distribution editor to get your free software, instead of just collecting it on your own directly from the authors' pages, is simply the quality and ease of deployment and maintenance that the editors guarantee you: you want a system that is easy to install, easy to maintain and upgrade, easy to fix when a security problem comes up, and up-to-date with the latest developments. Well, guaranteeing this level of quality is not easy, and we believe that building a sane interaction between the wonderful people who actually make the system work today and the computer science researchers who find these issues challenging is a good step forward.
In Nancy, our goal was to bring together experts from different distributions to compare experiences, ideas, tools, and solutions for the difficult task of maintaining a Linux distribution, and also to ask for their informed opinions on some tools and ideas that are currently being developed in the EDOS project, which is open to collaboration with everybody interested in these issues (by the way, we had also invited RedHat, Suse and Ubuntu people to come). Here are the subjects we talked about:
- quality assurance in Debian
- release management in Debian
- building custom distributions in Debian
- package managers: limitations and possible improvements
- dependency analysis
- security issues
- testing frameworks
- distribution frameworks
- indicators for the distribution building process
Dependency analysis tools: they may help manage sets of packages
To be installed, a package A may need some other packages B to be installed too (this is a dependency), and may also require that some package C is not installed (this is a conflict). Such information is present in the package metadata. No matter which concrete package format one uses (DEB, RPM, or something else), one finds basically the same expressive power, which amounts to using the Boolean operators AND and OR in both dependencies and conflicts. This metadata is exploited by the zillion depsolvers that are used daily by millions of people to maintain their own configuration on their own machine. We know that these depsolvers can still be improved, but this was not the focus of our discussion: we are aiming at improving the quality of a distribution, not of a user installation, which is a different problem. Now, one aspect of maintaining a distribution, and improving its quality, is related to package interdependencies:
- all packages present in a stable release should be at least installable individually, otherwise they are broken, and they should be removed or fixed
- when moving from a release to a new one, we should ensure that the user experience is not degraded: if she was able to install a set of packages S in the old release, she should have a way to install some set S' of packages with at least the same functionality in the new release as well
- some decent tools should be in the maintainer's hands to help him track problems and their evolution easily
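The first requirement above, individual installability, can be illustrated with a small sketch. This is not the EDOS tooling: it is a naive backtracking search in Python, assuming a hypothetical in-memory repository format in which each package lists its dependencies as a conjunction (AND) of OR-alternatives and its conflicts as a plain list of package names. All names (`find_install_set`, `broken_packages`, `example_repo`) are invented for illustration.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical repository format (an assumption, not a real DEB/RPM parser):
#   "depends":   list of clauses, ALL of which must hold (AND);
#                each clause is a list of alternatives, ONE of which suffices (OR)
#   "conflicts": list of package names that must not be co-installed
Repo = Dict[str, dict]

example_repo: Repo = {
    "a": {"depends": [["b"], ["c", "d"]], "conflicts": []},
    "b": {"depends": [], "conflicts": ["c"]},
    "c": {"depends": [], "conflicts": []},
    "d": {"depends": [], "conflicts": []},
    "e": {"depends": [["b"], ["c"]], "conflicts": []},  # needs b AND c, which conflict
}


def find_install_set(repo: Repo, target: str) -> Optional[FrozenSet[str]]:
    """Return a conflict-free, dependency-closed set containing target, or None."""

    def compatible(chosen: FrozenSet[str], pkg: str) -> bool:
        # Conflicts are treated as symmetric: either side may declare them.
        return all(
            other not in repo[pkg]["conflicts"] and pkg not in repo[other]["conflicts"]
            for other in chosen
        )

    def solve(chosen: FrozenSet[str], pending: List[List[str]]) -> Optional[FrozenSet[str]]:
        if not pending:
            return chosen  # every dependency clause is satisfied
        clause, rest = pending[0], pending[1:]
        if any(alt in chosen for alt in clause):
            return solve(chosen, rest)  # clause already satisfied
        for alt in clause:  # try each alternative, backtracking on failure
            if alt in repo and compatible(chosen, alt):
                result = solve(chosen | {alt}, rest + repo[alt]["depends"])
                if result is not None:
                    return result
        return None

    if target not in repo:
        return None
    return solve(frozenset([target]), list(repo[target]["depends"]))


def broken_packages(repo: Repo) -> List[str]:
    """List the packages that cannot be installed on their own."""
    return sorted(name for name in repo if find_install_set(repo, name) is None)
```

On the toy repository, `broken_packages(example_repo)` reports only `e`, while `a` is installable by picking `d` instead of `c`. The search is exponential in the worst case, which is fine for a toy example but not for a repository with tens of thousands of packages.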
Testing installation of a whole distribution
Another key issue addressed by EDOS relates to the quality assurance of a Linux distribution. During the workshop, the current status of the EDOS TULIP framework was presented. TULIP stands for "Testing Upgrades of Linux Images Program". TULIP's purpose is to drive upgrade tests of various Linux distributions, ensuring both fine-grained QA at the package level and the reliability of the upgrade result. The EDOS testing framework will build upon TULIP and will ultimately consist of:
- a generic XML format describing a test or a test suite pertaining to the installation, update or functional testing of a Linux package
- a meta-test runner that takes XML test descriptions as input and delegates the test run to an underlying test runner, such as Dogtail for GUI testing, the OpenOffice.org QA system for OpenOffice.org-related tests, TULIP, or any other test runner
- TULIP for running upgrade tests
- a web portal handling the workflow associated with the creation of tests, the submission of test reports and the display of associated metrics
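The delegation idea behind the meta-test runner can be sketched as follows. The source does not describe the actual EDOS XML format, so the element and attribute names below (`testsuite`, `test`, `runner`, `package`, `kind`) are purely hypothetical, and the two adapters are stubs that do not call TULIP or Dogtail at all. The sketch only shows a dispatch table mapping a runner name to an adapter function.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical test-suite description: the schema is invented for this sketch.
SUITE_XML = """
<testsuite name="upgrade-smoke">
  <test id="t1" runner="tulip" package="example-pkg" kind="upgrade"/>
  <test id="t2" runner="dogtail" package="example-gui" kind="functional"/>
  <test id="t3" runner="unknown-runner" package="example-pkg" kind="install"/>
</testsuite>
"""


def run_with_tulip(test: ET.Element) -> str:
    # Stub adapter: a real one would hand the test over to TULIP.
    return "delegated to TULIP stub"


def run_with_dogtail(test: ET.Element) -> str:
    # Stub adapter: a real one would drive a Dogtail GUI test.
    return "delegated to Dogtail stub"


# Dispatch table: runner name found in the XML -> adapter function.
RUNNERS: Dict[str, Callable[[ET.Element], str]] = {
    "tulip": run_with_tulip,
    "dogtail": run_with_dogtail,
}


def run_suite(xml_text: str) -> List[dict]:
    """Parse a suite description and delegate each test to its runner."""
    root = ET.fromstring(xml_text)
    reports = []
    for test in root.findall("test"):
        runner_name = test.get("runner", "")
        runner = RUNNERS.get(runner_name)
        if runner is None:
            status = "error: no adapter for runner %r" % runner_name
        else:
            status = runner(test)
        reports.append(
            {"id": test.get("id"), "package": test.get("package"), "status": status}
        )
    return reports
```

Calling `run_suite(SUITE_XML)` yields one report per `test` element. Supporting another test runner then amounts to registering one more adapter in `RUNNERS`, leaving the test descriptions unchanged.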
Ideas and experiments for distributing over P2P networks
While many distributions address the problem of disseminating packages via a network of mirror sites that the user chooses at download time, and seem happy with this solution, this is not the case for every distribution, and it is interesting to experiment with alternative solutions designed from scratch to scale up in the future. One such alternative was discussed at the workshop: a P2P architecture for content dissemination that improves on the classical hierarchy of mirrors or centralized index architectures. The main idea is to optimize the use of resources in the dissemination network by distributing the effort among all the participants. Instead of addressing targeted peers (the publisher site, a specific mirror) for file download, content queries, etc., users address the P2P system as a whole. The system manages load balancing itself, through a fair distribution of metadata files, index entries, download bandwidth and query processing among the peers. The general dissemination architecture proposed in EDOS is a P2P network composed of three kinds of peers: a Publisher, a set of Mirrors and a large set of Clients. Each peer may play several roles, such as publishing, querying, downloading, subscribing, notifying, etc. Dissemination is considered in two different situations: (i) flash-crowd, when many peers are waiting for the publication of popular, large content (e.g. a new Linux release), so that many clients download almost the same content at the same time, and (ii) off-peak, when users query the system and download individual packages. Information management is based on a distributed index, in which metadata is indexed at publishing time and which provides advanced querying capabilities. The system is based on a Java API that provides several levels of access to the EDOS dissemination functionalities.
The EDOS dissemination API uses the KadoP distributed XML repository, the ActiveXML P2P document manager and the Azureus BitTorrent software for flash-crowd dissemination. The prototype under construction uses Tomcat/Axis and provides JSP applications on top of the EDOS dissemination API for each actor type. For more details, see an overview of the system architecture and the workshop presentation slides.
Indicators for distributions
The EDOS Consortium is also conducting a transverse measurement effort to define indicators of the production, testing and dissemination processes of a Linux distribution. The objective is to propose a framework to produce and consume indicators measuring the achievements of the EDOS tool chain. During the RMLL workshop, a state of the art of existing initiatives related to OSS measurement was presented. Such initiatives include the Open Business Readiness Rating, the OSSMole project, the Open Source Maturity Model and the recent ohloh.net project. The EDOS measurement approach focuses on the process of crafting and disseminating a Linux distribution. The measurement framework relies on the EDOS Project Management Interface (PMI), a generic model representing the main artefacts and workflows of OSS engineering. A series of indicators has been defined, each of which is measured by a dedicated tool provided either by the EDOS toolchain or by existing software in the OSS ecosystem. The objective is to have a customizable dashboard that lets all participants have their own aggregated viewpoints on the processes and results they are interested in, depending on their activity profile: developers, maintainers, testers and documentation writers need dashboards that display data related to their core activities. The main EDOS metric categories relate to the management of dependencies, the production process, the quality assurance process and package dissemination over a P2P network. Existing tools provided by the Mandriva Linux contributor community were presented and discussed, including Sophie, which allows complex queries and statistics against the Mandriva RPM database, and Youri, which features a quality assurance dashboard.
The next steps will consist in continuing the implementation of the defined indicators, and in providing the community with a standard representation of the metrics of an open-source project, and more specifically of a Linux distribution project. Along the lines of the DOAP initiative, these standards could consist of two RDF schemas entitled "MOAP" and "MOALD", standing respectively for "Metrics Of A Project" and "Metrics Of A Linux Distribution".
Version 1.51 last modified by RobertoDiCosmo on 04/10/2006 at 17:39