Kernel sets of packages

By analysing the structure of a package base we can discover some properties that are useful for the construction of "smart" algorithms that take into account these properties and uses this knowledge in order to handle problem that would have otherwise been too hard to be considered.

In this page we show a simple analysis which tries to indentify a "good" subset of packages which occurs in "many" package dependency graphs.

The metrics which defines what is good and how much is many is not defined. However these two parameters can be used to choose, among the results of this analysis, a way for clustering the whole set of packages by using these criteria.

Goal

The goal is that of building incrementally a set R of package names. For each package P, let Pd the set of package names that are present in the dependency graph of P. The dependency graph is the one that has been extracted from the global graph (see WP2Current Analysis). For each generated R we check, for every P in the package base if R is contained in Pd. And we count how many packages have this property.

How is R built

In order to build R we use a greedy algorithm.

PB = {Package base P1...Pn}

N = {All the package names in the package base N1...Nn}

R0 = {}

while(N != empty) {

For every name in N, let PN be the package name which occurs the most in the package dependency graph of every package P in PB.

R = R union {PN}

Count the number of package dependency graph which contain R as a subset.

N = N - {PN}

PB = PB - {Pmax} (Pmax is the package corresponding to PN}

}

Result

The result of this analysis can be found at results. They show the size of R and the number of packages in which that R occurs.

The results have been generated by using the Debian Package Pool.

The kernel set of size 1 is, of course, the one with the only libc6 element. As it was foreseeable, libc6 is the package which occurs the most in all the package base (17089 packages out of 19177)

What is the best size for R can be choosen at runtime. In this way we can vary the clustering structure of the graph and make the algorithms work in more efficient way.

Version 1.5 last modified by StephaneLauriere on 02/01/2006 at 09:34

Comments 0

No comments for this document

Attachments 0

No attachments for this document

Creator: FabioMancinelli on 2005/07/03 22:44
Copyright EDOS Consortium
1.1.1