EDOS distribution system specifications
1. Introduction
The EDOS distribution system implements a P2P network for distributing open-source software to its participants. We distinguish three categories of objects distributed by the system: packages, utilities and collections. Participants in the P2P system are computers connected to the web, each playing one or several of the following roles in the distribution process:
- publisher
- replicator
- client
The main functionalities provided by the system are:
- publishing an object (package, utility or collection)
- replicating an object
- downloading an object
- creating and managing distribution channels for subscribers
- subscribing to channels
- pushing objects to subscribers
- querying the objects in the network
- etc
2. Architecture
Peer-to-peer system
There are three types of participants in the P2P distribution system:
- a single publisher peer (Mandriva), which publishes new objects in the system.
- a set of trusted peers (mirrors), that keep replicas of the published objects, downloaded from the publisher peer or from another mirror.
- a large set of client peers, downloading objects from the publisher peer or from some mirror.
Normally, the number of mirrors should be as large as possible, in order to optimize distribution by parallel downloading from several sources. Parallel downloading, inspired by the BitTorrent method, applies here to sets of packages, where each package is taken from a different source. For very large packages, cutting the package into smaller pieces is also possible.
In practice, however, only a limited number of mirrors can provide continuous access to the data and enough bandwidth to support heavy downloading. In our system, we therefore extend the mirror role to trusted clients. Even if such peers are neither continuously connected nor equipped with large bandwidth, their number at any moment is large enough to significantly improve performance. Moreover, in such large open-source distribution communities, the natural tendency is to continuously add new peers to the set of trusted peers.
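As an illustration of taking each package of a set from a different source, here is a minimal sketch. The class `SourceAssignment`, the round-robin policy and the use of plain strings for package and source names are assumptions made for this example; the specification does not prescribe how sources are chosen.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the EDOS API: spreads a set of packages
// over the available sources so that each source serves a different subset.
class SourceAssignment {

    // Assigns packages to sources in round-robin order.
    // Returns a map from source name to the packages to fetch from it.
    static Map<String, List<String>> assign(List<String> packages, List<String> sources) {
        if (sources.isEmpty()) {
            throw new IllegalArgumentException("at least one source is required");
        }
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String source : sources) {
            plan.put(source, new ArrayList<>());
        }
        for (int i = 0; i < packages.size(); i++) {
            String source = sources.get(i % sources.size());
            plan.get(source).add(packages.get(i));
        }
        return plan;
    }
}
```

A real strategy would probably weigh sources by availability and bandwidth; round-robin is only the simplest policy that keeps all sources busy.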
Software components
The EDOS distribution system is implemented in Java, on top of the ActiveXML and KadoP systems. Each peer may use the following software components:
- the ActiveXML peer layer, providing the basic P2P framework
- the KadoP peer layer, for indexing and querying information about EDOS objects - using ActiveXML
- the EDOS distribution system layer - using ActiveXML and KadoP
- the distribution application - using the EDOS layer
The main difference between clients and mirrors is that, for security reasons, clients do not participate in the distributed index. There is no KadoP layer in the client's distribution software. Instead, clients use web services provided by mirrors to access index-based functionality. However, client objects (packages, utilities, collections) are still considered replicas and are indexed by the system, because checking the MD5 object signature is enough to ensure security.
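The MD5 check mentioned above could look like the following sketch, which uses the standard `java.security.MessageDigest` class. The class name `Md5Check` and the hexadecimal encoding of the expected signature are assumptions; the specification does not say how signatures are stored or transported.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: verifies a downloaded object against its MD5 signature.
class Md5Check {

    // Computes the MD5 digest of the content as a lowercase hex string.
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm that Java platforms are required to provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns true if the content matches the expected signature (hex, any case).
    static boolean matches(byte[] content, String expectedHex) {
        return md5Hex(content).equalsIgnoreCase(expectedHex);
    }
}
```

A client would call `Md5Check.matches(downloadedBytes, publishedSignature)` before accepting an object obtained from another client.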
API levels
There are two levels of API implementation:
- the logical level, providing high-level functionality for EDOS distribution: publishing, replicating, downloading, querying and subscribing to software objects. Normally, distribution applications need only this logical level API.
- the physical level, providing lower-level functions used to implement the logical level functions. Physical level functions give finer-grained access to EDOS distribution functionality, so that strategies other than those provided by the logical level can be implemented. The set of methods provided by this API level is much more implementation dependent.
3. Modules
Data model package
This package implements the distribution API data model. Classes (to be detailed):
- Package, Utility, Collection
- PackageMetadata, UtilityMetadata, CollectionMetadata
- PackageID, UtilityID, CollectionID
- Channel, Subscription, SubscriptionID, SubscriberInfo
- etc.
<!DOCTYPE edos_distribution [
<!ELEMENT edos_distribution (object*)>
<!ELEMENT object (id, type, size, domain*)>
<!ELEMENT id (name, version?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT version (#PCDATA)>
<!-- type is one of: package, utility, collection -->
<!ELEMENT type (#PCDATA)>
<!ELEMENT size (#PCDATA)>
<!ELEMENT domain (#PCDATA)>
]>
<edos_distribution>
<object>
<id>
<name>/cooker/media/main/ldconfig-2.3.5-5mdk.i586</name>
<version>2.3.5-5mdk</version>
</id>
<type>package</type>
<size>564137</size>
<domain>System/Base</domain>
</object>
</edos_distribution>
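To show how a peer might read such an object description, here is a minimal sketch using the JDK's built-in DOM parser. The class `ObjectDescriptionReader` and the choice to return the fields as a map are assumptions for illustration; the actual data model classes (PackageMetadata, etc.) are still to be detailed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical reader for the object description format shown above.
class ObjectDescriptionReader {

    // Extracts name, version, type and size of the first <object> element.
    // Optional elements that are absent (e.g. version) are left out of the map.
    static Map<String, String> readFirstObject(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element object = (Element) doc.getElementsByTagName("object").item(0);
            if (object == null) {
                throw new IllegalArgumentException("no <object> element found");
            }
            Map<String, String> fields = new LinkedHashMap<>();
            for (String tag : new String[] {"name", "version", "type", "size"}) {
                NodeList nodes = object.getElementsByTagName(tag);
                if (nodes.getLength() > 0) {
                    fields.put(tag, nodes.item(0).getTextContent().trim());
                }
            }
            return fields;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("cannot parse object description", e);
        }
    }
}
```

Applied to the example document, the returned map would hold the ldconfig package name, its version, the type `package` and the size as a string.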
Physical level package
Classes PhysicalPublisher, PhysicalReplicator and PhysicalClient (find better names?) provide the lower-level methods used by the logical level methods. See the logical level package description below: the actions in the algorithm descriptions of logical level methods correspond to methods in physical level classes. Class Peer gathers the features common to all peers, whatever their role. Classes PhysicalPublisher, PhysicalReplicator and PhysicalClient are subclasses of Peer.
class Peer{
- peer information (location in the network)
- sets of packages, utilities, collections: in-memory structures + persistence
- KadoP, ActiveXML controllers
}
class PhysicalPublisher extends Peer{
- //Actions used in the logical level class, to be completed
- void publishPackageLocation(Package p)
- void publishPackageMetadata(Package p)
- void publishPackagePush(Package p, ClientList clist)
- void publishUtilityLocation(Utility u)
- void publishUtilityMetadata(Utility u)
- void publishUtilityPush(Utility u, ClientList clist)
- void publishCollectionInfo(Collection c)
- void publishCollectionPush(Collection c, ClientList clist)
- void insertIntoCollection(Collection c, ObjectList olist)
- void replacePackage(Package pold, Package pnew)
- void replaceUtility(Utility u)
}
class PhysicalReplicator extends Peer{
- //Actions used in the logical level class, to be completed
- void publishReplicatedPackagePush(Package p, ClientList clist)
- void publishReplicatedUtilityPush(Utility u, ClientList clist)
- void publishReplicatedCollectionPush(Collection c, ClientList clist)
}
class PhysicalClient extends Peer{
- //Actions used in the logical level class, to be completed
- LocationList locatePackage(PackageID pid)
- LocationList locateUtility(UtilityID uid)
- LocationList getBestPackageLocations(PackageID pid)
- Location getBestUtilityLocation(UtilityID uid)
- ObjectIDList getCollectionValue(CollectionID cid)
- PackageIDList getCollectionPackages(CollectionID cid)
- UtilityIDList getCollectionUtilities(CollectionID cid)
- PackageIDList computeMissingPackages(PackageIDList pidlist)
- UtilityIDList computeMissingUtilities(UtilityIDList uidlist)
- LocationMap getBestLocations(PackageIDList pidlist, UtilityIDList uidlist)
}
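The computeMissingPackages and computeMissingUtilities actions amount to a difference between the requested identifiers and those already held by the peer. A minimal sketch follows, assuming identifiers can be modeled as strings (PackageID and UtilityID are not yet detailed) and that the local store is exposed as a set; `MissingObjects` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper illustrating computeMissingPackages / computeMissingUtilities.
class MissingObjects {

    // Returns the wanted identifiers that are not held locally,
    // in their original order and without duplicates.
    static List<String> computeMissing(List<String> wanted, Set<String> held) {
        List<String> missing = new ArrayList<>();
        for (String id : wanted) {
            if (!held.contains(id) && !missing.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }
}
```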
interface PhysicalSubscriptionServer{ //to be implemented by PhysicalPublisher
- void addSubscription(String chname, SubscriberInfo user, Subscription s): called by the publisher web service that registers a subscription
- void removeSubscription(String chname, SubscriberInfo user, SubscriptionID sid)
- void multicast(ObjectList olist, ClientList clist): multicast distribution of the objects to the subscribers
- //To be completed with actions from the logical level class
}
interface PhysicalSubscriptionClient{ //to be implemented by PhysicalClient
- //To be completed with actions from the logical level class
}
Logical level package
class Publisher{ //Uses PhysicalPublisher
- void publishPackage(Package p){
- publish in KadoP package location and metadata
- update the composition of the package's collection (a new package is always added to the most recent version of the collection)
- publish the composition of the package's collection
- }
- void publishUtility(Utility u){
- if the utility already exists, unpublish the existing replicas in KadoP
- if the utility is new, publish the utility location in KadoP
- publish utility metadata in KadoP
- update and publish the composition of the utility's collection (if the composition changes)
- }
- void publishCollection(Collection c){
- publish the composition (list of object Ids) and the metadata of the collection in KadoP
- recursively publish sub-collections, packages and utilities
- update and publish the composition of the parent collection
- }
- void deleteInCollection(Collection c, ObjectList olist){
- remove the objects in the object list from the collection (from the composition list)
- republish the metadata file for the collection
- }
}
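The three steps of publishPackage can be sketched as follows. `SketchIndex` is an in-memory stand-in for the KadoP index, and string identifiers replace the Package and Collection classes; both are assumptions for illustration, since the real index interface is one of the open questions of this document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the KadoP index (an assumption for illustration;
// the real index API is not specified in this document).
class SketchIndex {
    final Map<String, String> locations = new HashMap<>();
    final Map<String, String> metadata = new HashMap<>();
    final Map<String, List<String>> compositions = new HashMap<>();
}

// Hypothetical publisher following the steps listed for publishPackage.
class PublisherSketch {
    private final SketchIndex index;
    // Local view of each collection's composition, kept by the publisher.
    private final Map<String, List<String>> localCollections = new HashMap<>();

    PublisherSketch(SketchIndex index) {
        this.index = index;
    }

    void publishPackage(String packageId, String location, String meta, String collectionId) {
        // Step 1: publish the package location and metadata in the index.
        index.locations.put(packageId, location);
        index.metadata.put(packageId, meta);
        // Step 2: update the composition of the package's collection.
        List<String> composition =
                localCollections.computeIfAbsent(collectionId, k -> new ArrayList<>());
        if (!composition.contains(packageId)) {
            composition.add(packageId);
        }
        // Step 3: publish a copy of the updated composition.
        index.compositions.put(collectionId, new ArrayList<>(composition));
    }
}
```

Publishing a copy of the composition keeps later local updates from silently changing what the index already exposes.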
class Replicator{ //Uses PhysicalReplicator
- void publishReplicatedPackage(Package p){
- publish in KadoP the location of the replicated package
- }
- void unpublishReplicatedPackage(Package p){
- unpublish in KadoP the location of the replicated package
- }
- void (un)publishReplicatedUtility(Utility u): similar to package
- void (un)publishReplicatedCollection(Collection c){
- (un)publish in KadoP the location of all the replicated packages and utilities found at any depth level of the collection
- }
}
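Publishing and unpublishing replica locations can be pictured as maintaining, for each object, the set of peers that hold a copy. The sketch below keeps this mapping in memory; `ReplicaLocations`, its method names and the string identifiers are assumptions, as the way locations are stored in KadoP is still an open question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory registry of replica locations.
class ReplicaLocations {
    private final Map<String, Set<String>> peersByObject = new HashMap<>();

    // Records that the given peer holds a replica of the object.
    void publish(String objectId, String peer) {
        peersByObject.computeIfAbsent(objectId, k -> new LinkedHashSet<>()).add(peer);
    }

    // Removes the peer from the object's locations; drops empty entries.
    void unpublish(String objectId, String peer) {
        Set<String> peers = peersByObject.get(objectId);
        if (peers == null) {
            return;
        }
        peers.remove(peer);
        if (peers.isEmpty()) {
            peersByObject.remove(objectId);
        }
    }

    // Returns a read-only view of the peers currently holding the object.
    Set<String> locate(String objectId) {
        return Collections.unmodifiableSet(
                peersByObject.getOrDefault(objectId, Collections.emptySet()));
    }
}
```

For a collection, the same calls would be repeated for every package and utility found at any depth of the collection.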
class Client{ //Uses PhysicalClient
- Package getPackage(PackageID pid){
- decide if the package should be cut into pieces
- choose the best location for the package (or for the pieces)
- download the package (or the pieces in parallel)
- update local structures on the peer
- }
- Utility getUtility(UtilityID uid): similar to package
- Collection getCollection(CollectionID cid){
- get the list of all the packages and the list of all the utilities in the collection
- compute lists of missing packages and utilities
- given the lists of missing packages and utilities, decide for each package/utility whether it has to be cut into pieces and, for each piece, which downloading source is best
- download pieces in parallel
- construct the collection hierarchy
- update local structures on the peer
- }
}
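The "download pieces in parallel" step of getCollection could be sketched with a thread pool from `java.util.concurrent`. The `fetcher` function stands for whatever actually retrieves an object or piece from its chosen source; it, the class `ParallelFetch` and the string identifiers are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: downloads all missing objects concurrently.
class ParallelFetch {

    // Submits one download task per identifier and waits for all of them.
    // The result preserves the order of the input list.
    static Map<String, byte[]> fetchAll(List<String> missingIds,
                                        Function<String, byte[]> fetcher,
                                        int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            Map<String, Future<byte[]>> pending = new LinkedHashMap<>();
            for (String id : missingIds) {
                Callable<byte[]> task = () -> fetcher.apply(id);
                pending.put(id, pool.submit(task));
            }
            Map<String, byte[]> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<byte[]>> entry : pending.entrySet()) {
                result.put(entry.getKey(), entry.getValue().get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("download interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("download failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch a single failed download aborts the whole call; a real client would more likely retry the object from another location.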
interface SubscriptionServer{ //must be implemented by Publisher
- Channel createChannel(String chname, ChannelDescription chdescr, ObjectList objlist, AccessRights rights){
- create a new channel and initialize it with the list of objects
- initialize the channel publication date of the initial objects to the current date
- publish to KadoP the channel name and description
- }
- void publishPackageToChannel(Package p, String chname, Date date)
- void unpublishPackageToChannel(Package p, String chname)
- void publishUtilityToChannel(Utility u, String chname, Date date)
- void unpublishUtilityToChannel(Utility u, String chname)
- void publishCollectionToChannel(Collection c, String chname, Date date)
- void unpublishCollectionToChannel(Collection c, String chname)
}
interface SubscriptionClient{ //must be implemented by Client
- StringList getChannelList(){
- get the list of channel names from KadoP
- }
- ChannelDescription getChannelDescription(String chname){
- get the channel description from KadoP
- }
- SubscriptionID subscribeToChannel(String chname, SubscriberInfo user, Subscription s){
- subscribe to a channel by calling the subscription service on the Publisher
- subscription indicates: the query filter on the channel objects, the action (notification, push data), new/all objects, the moment (on publish, after a given duration)
- }
}
interface Query{ //must be implemented by Publisher, Replicator, Client
- PackageIDList queryPackages(String query): get the list of packages matching the query
- UtilityIDList queryUtilities(String query): get the list of utilities matching the query
- CollectionIDList queryCollections(String query): get the list of collections matching the query
- String getLastVersionNb(String name, int type): get the last version number for the object of the given type (package, utility, collection)
- //to be completed
}
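getLastVersionNb needs some ordering on version strings, which this document does not define. The sketch below assumes a deliberately simple rule: compare the numeric runs of each version from left to right and ignore everything else. This is not the RPM version comparison algorithm; `VersionSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical version ordering based only on the numeric parts of a version.
class VersionSketch {

    // Splits "2.3.5-5mdk" into [2, 3, 5, 5]; non-digit characters act as separators.
    static List<Long> numericParts(String version) {
        List<Long> parts = new ArrayList<>();
        for (String token : version.split("[^0-9]+")) {
            if (!token.isEmpty()) {
                parts.add(Long.parseLong(token));
            }
        }
        return parts;
    }

    // Compares two versions part by part; missing parts count as 0.
    static int compare(String a, String b) {
        List<Long> pa = numericParts(a);
        List<Long> pb = numericParts(b);
        int n = Math.max(pa.size(), pb.size());
        for (int i = 0; i < n; i++) {
            long x = i < pa.size() ? pa.get(i) : 0L;
            long y = i < pb.size() ? pb.get(i) : 0L;
            if (x != y) {
                return Long.compare(x, y);
            }
        }
        return 0;
    }

    // Returns the highest version of the list, or null if the list is empty.
    static String last(List<String> versions) {
        String best = null;
        for (String v : versions) {
            if (best == null || compare(v, best) > 0) {
                best = v;
            }
        }
        return best;
    }
}
```

Under this rule `2.10.0` ranks above `2.9`, which a plain string comparison would get wrong.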
4. Open questions
- How are location, metadata, collection composition, channels, etc. published in the KadoP index: as XML documents or as DHT entries? What format should the XML documents use?
- AXML Web services provided by peers
- Interface with the multicast distribution realized by Tel Aviv
- Query language definition
- getPackage with dependencies
Version 1.14 last modified by MarcLijour on 24/11/2005 at 05:59