User's Guide for the EDOS distribution prototype
1. Overview
The prototype of the EDOS Distribution System consists of two distinct web applications that communicate through web services. The first one instantiates a Publisher peer, which plays the Publisher role in the network. The second one can serve either as a Replicator peer or as a simple Client, depending on its configuration parameters. Both applications are written in Java and need a running Tomcat web server for deployment. At the lower level of the architecture, the applications use the Azureus module for file transfer. This module ships with the web applications and is preconfigured for the prototype. Running the applications requires Java 2 Platform Standard Edition 5.0 and the Apache Tomcat web server, version 5.5.2.
2. Install the applications
We created a tarball archive for both the Publisher and Client applications, containing:
- a pre-configured Tomcat web server for each web application along with all the needed libraries
- the Publisher and the Client web applications along with their user interfaces (jsp)
- install scripts that extract the applications from the archive and change the configuration parameters for the installation machine
- TOMCAT_PORT (default=9090) - if you want to use a different port for the web server or if you run more Tomcat instances on the same machine
- PUBLISHER - the address of the Publisher peer in the network
3. Publisher webapp configuration
The configuration parameters for the Publisher application are grouped together in the "edos.properties" file, located in the web application's path:
- /data_store/programs/server/tomcat/webapps/edos
- edos.peer.log4j.configfile=log4j.properties
- edos.index.bootstrapnode=classos.futurs.inria.fr
- edos.index.bootstrapport=12345
- edos.index.repository=kadop.txt
- edos.axml.axis.peerurl=http://classos.futurs.inria.fr:8080/edos/servlet/AxisServlet
- edos.axml.axis.servicename=ReceiveMessageService
- edos.axml.axis.methodname=receiveMessage
- edos.release.repository.path=/local/data_store/programs/server/distributions/
- edos.idip.BittorentTrackerAddress=http://classos.futurs.inria.fr:30485/announce
- edos.home=/local/edos-publisher
- edos.peer.peertype=PUBLISHER
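As a minimal sketch of how such settings could be read, the following Java snippet loads a properties source with the standard `java.util.Properties` class and checks that mandatory keys are present. The class name `EdosConfig` and its `require` and `requireInt` helpers are hypothetical; the prototype's own configuration loader is not shown in this guide.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper (an assumption for illustration, not the prototype's loader).
final class EdosConfig {
    private final Properties props = new Properties();

    EdosConfig(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the value of a mandatory key, or fails with a clear message.
    String require(String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing configuration key: " + key);
        }
        return value.trim();
    }

    // Same as require(), but parses the value as an integer (e.g. a port number).
    int requireInt(String key) {
        return Integer.parseInt(require(key));
    }

    public static void main(String[] args) {
        // Sample values copied from the parameter list above.
        String sample =
              "edos.index.bootstrapnode=classos.futurs.inria.fr\n"
            + "edos.index.bootstrapport=12345\n"
            + "edos.peer.peertype=PUBLISHER\n";
        EdosConfig config = new EdosConfig(new StringReader(sample));
        System.out.println(config.require("edos.index.bootstrapnode")
            + ":" + config.requireInt("edos.index.bootstrapport"));
    }
}
```

In a deployed webapp the reader would wrap the actual "edos.properties" file instead of an in-memory string. Failing early on a missing key makes configuration mistakes visible at startup rather than at the first network call.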
4. Replicator/Client webapp configuration
The configuration parameters for the Replicator/Client application are grouped together in the "edos.properties" file, located in the web application's path:
- /data_store/programs/client/tomcat/webapps/edos-client
- edos.peer.log4j.configfile=log4j.properties
- edos.index.bootstrapnode=classos.futurs.inria.fr
- edos.index.bootstrapport=12345
- edos.index.repository=kadop.txt
- edos.axml.axis.peerurl=http://atos.futurs.inria.fr:9090/edos-client/servlet/AxisServlet
- edos.axml.axis.servicename=ReceiveMessageService
- edos.axml.axis.methodname=receiveMessage
- edos.release.repository.path=/local/data_store/programs/client/distributions/
- idip.path=/local
- edos.peer.peertype=CLIENT
5. Publisher Peer Application
The Graphical User Interface (GUI) for the publisher peer is a stand-alone web application. It is a standard webapp implemented with Java Server Pages (JSP) and deployed in a Tomcat web server.
5.1. Run the Publisher
After installing the publisher application (see the "Install" chapter of this guide), the Tomcat web server is already configured and prepared to load the web application.
- Note: We configured the Tomcat web server to use the following ports by default for the publisher application:
- Port 8080 - for HTTP connections
- Port 8005 - for shutdown command
- /tomcat/conf/server.xml
- /tomcat/bin/startup.sh (or /tomcat/bin/startup.bat for Windows)
- http://localhost:8080/edos (or replace "localhost" with your machine's name; see also the "Publisher webapp configuration" chapter)
- Distributions
- Channels
- Clusters
- PeerID - the unique ID assigned to this peer in the network
- Address - the IP endpoint of the peer
- ...
5.2. Publish a new distribution
As previously presented in the system's architecture, each component of the system offers a specific set of functionalities according to its type. We grouped the functionalities into "roles", and each component (peer) can play distinct roles in the system. The main role played by the publisher peer is the Publisher role. It is the only peer in the network that has the right to publish a new distribution to the distributed index or to delete one from it. In other words, the publisher is the only one that inserts new distributions into the system. The content of the distributions is made available for download, and the metadata of each published distribution is stored in the index.
- Note that when a distribution is deleted, its content does not completely disappear from the system. Only the associated metadata is deleted from the index. Each client peer storing parts of the distribution keeps its content, but this content is no longer indexed in the system. Therefore, queries on the deleted distribution are no longer possible.
- Step 1°: Call "publishRelease(releaseName)" method of the Publisher role;
- Step 2°: Get the distribution's content from the Content Manager;
- Step 3°: For each DataUnit in the distribution, the MetadataBuilder module will extract the metadata and store it in an XML document;
- Step 4°: The Index Manager will publish the metadata into the distributed index (at the KadoP level).
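The four publishing steps above can be sketched in Java as follows. Only the `publishRelease(releaseName)` method name and the module names come from this guide; the interfaces, their method signatures (`getContent`, `buildXml`, `publish`) and the `DataUnit` class shape are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in for a DataUnit; the real data model is richer.
final class DataUnit {
    final String name;
    DataUnit(String name) { this.name = name; }
}

// Hypothetical single-method views of the modules named in the steps.
interface ContentManager { List<DataUnit> getContent(String releaseName); }
interface MetadataBuilder { String buildXml(DataUnit unit); }
interface IndexManager { void publish(String xmlMetadata); }

final class PublisherRole {
    private final ContentManager content;
    private final MetadataBuilder builder;
    private final IndexManager index;

    PublisherRole(ContentManager content, MetadataBuilder builder, IndexManager index) {
        this.content = content;
        this.builder = builder;
        this.index = index;
    }

    // Step 1: entry point. Returns the number of DataUnits whose metadata was published.
    int publishRelease(String releaseName) {
        List<DataUnit> units = content.getContent(releaseName); // Step 2
        for (DataUnit unit : units) {
            String xml = builder.buildXml(unit);                 // Step 3
            index.publish(xml);                                  // Step 4
        }
        return units.size();
    }

    public static void main(String[] args) {
        List<String> fakeIndex = new ArrayList<>();
        PublisherRole role = new PublisherRole(
            name -> Arrays.asList(new DataUnit("a.rpm"), new DataUnit("b.rpm")),
            unit -> "<dataunit name=\"" + unit.name + "\"/>",
            fakeIndex::add);
        System.out.println(role.publishRelease("demo") + " units indexed: " + fakeIndex);
    }
}
```

Keeping each module behind a small interface lets the pipeline be exercised with in-memory stand-ins, as the `main` method does, without a running index.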
- Start time
- End time
- Distribution
- Channel
- Step 1°: First of all, the publisher has to make the new distribution available in the index: Publish distribution.
- Step 2°: The clients see the new distribution and can subscribe to download it: Subscribe to channel.
- Step 3°: The publisher makes the list of all the clients subscribed to this distribution.
- Step 4°: When the publisher decides that it is ready to start the dissemination process, it sends a notification message to all the clients on the list: Publish event.
- Step 5°: The clients receive the notification and send back their WishLists, in compliance with the time-window defined by the publisher. Now the clients are ready and waiting to download the content.
- Step 6°: The publisher receives the WishLists from the clients during the time-window interval. It can also close the time-window earlier than the scheduled "end time" (this is especially useful for testing).
- Step 7°: The publisher prepares the content for dissemination: Cluster data.
- Step 8°: The publisher creates the torrents for each cluster of data and publishes them in Azureus: Publish clusters.
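The time-window used in steps 5 and 6 can be modelled with a small class. This is a sketch under assumptions: the class name `TimeWindow`, its `State` values and the `closeEarly` method are hypothetical and only mirror the behaviour described above (a scheduled start and end time, with the option to close earlier).

```java
import java.time.Instant;

// Hypothetical model of the WishList reception time-window.
final class TimeWindow {
    enum State { NOT_STARTED, OPEN, CLOSED }

    private final Instant start;
    private Instant end;

    TimeWindow(Instant start, Instant end) {
        if (!start.isBefore(end)) {
            throw new IllegalArgumentException("start must precede end");
        }
        this.start = start;
        this.end = end;
    }

    // Classifies a point in time relative to the window.
    // The end is checked first so that a window closed early stays closed.
    State stateAt(Instant now) {
        if (!now.isBefore(end)) {
            return State.CLOSED;
        }
        if (now.isBefore(start)) {
            return State.NOT_STARTED;
        }
        return State.OPEN;
    }

    // Moves the end time forward to 'now'; has no effect once the window has ended.
    void closeEarly(Instant now) {
        if (now.isBefore(end)) {
            end = now;
        }
    }
}
```

A client would accept the announce message only while `stateAt` returns `OPEN`, and wait while it returns `NOT_STARTED`.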
5.3. Channel manager
The "Channels" page of the GUI represents the front-end of the Channel Manager on the publisher peer. In the upper part of the page the table displays the list of channels connected on the publisher. For each channel available on the publisher peer you have the information concerning:
- the name of the channel
- a short description of the channel
- and the "Delete" action associated to the "removeChannel(channelName)" method from the Channel Manager
5.4. Clustering data
The "Clusters" page of the GUI is used as an input form for the clustering algorithm. The data clustering is implemented at the physical level of the publisher peer architecture, in the IDiP module. The algorithm computes clusters of files in order to achieve an efficient dissemination mechanism. The clustering is based on the subscriptions registered by the publisher and on the WishLists sent by each subscribed client. A WishList is a set of DataUnits that a client wants to download. Basically, the WishList is computed as the difference between what was newly published in the system and what the client already has (in its local repository). See also the "Client Peer Application" chapter of this guide. In a first phase, after sending a "publish event" to its subscribers, the publisher waits to receive the WishLists. For each "publish event" associated with a distribution, the publisher defines a starting time and an ending time for the reception of the WishLists. This is what we call a "time-window". In the upper part of the "Clusters" page there is a table showing the state of each distribution:
- Distribution name
- Distribution state (the state of the scheduled time-window: opened/closed)
- and a "Close window" action that can be used to shorten the waiting delay
- Number of clusters
- Distance metric
- euclidean
- squared euclidean
- manhattan
- pearson correlation
- square pearson correlation
- chebychev
- Iterations number
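For reference, the distance metrics listed above can be written in Java as shown below. This sketch uses the standard mathematical definitions; how the IDiP module actually encodes files or WishLists as numeric vectors is not described in this guide. Turning the Pearson correlation r into a distance as 1 - r (and 1 - r² for the squared variant) is one common convention and is an assumption here. The class name `ClusterDistances` is hypothetical.

```java
// Hypothetical utility class; standard textbook definitions of each metric.
final class ClusterDistances {
    private static void check(double[] a, double[] b) {
        if (a.length == 0 || a.length != b.length) {
            throw new IllegalArgumentException("vectors must be non-empty and of equal length");
        }
    }

    static double squaredEuclidean(double[] a, double[] b) {
        check(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    static double euclidean(double[] a, double[] b) {
        return Math.sqrt(squaredEuclidean(a, b));
    }

    static double manhattan(double[] a, double[] b) {
        check(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    static double chebyshev(double[] a, double[] b) {
        check(a, b);
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }

    // Pearson correlation coefficient r; NaN when either vector is constant.
    static double pearson(double[] a, double[] b) {
        check(a, b);
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < a.length; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= a.length;
        meanB /= b.length;
        double cov = 0.0, varA = 0.0, varB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return cov / Math.sqrt(varA * varB);
    }

    // Assumed convention: distance = 1 - r.
    static double pearsonDistance(double[] a, double[] b) {
        return 1.0 - pearson(a, b);
    }

    // Assumed convention: distance = 1 - r^2.
    static double squaredPearsonDistance(double[] a, double[] b) {
        double r = pearson(a, b);
        return 1.0 - r * r;
    }
}
```

For example, for the vectors (0, 0) and (3, 4) the Euclidean distance is 5, the Manhattan distance is 7 and the Chebyshev distance is 4.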
6. Client Peer Application
The Graphical User Interface (GUI) for the client peer is a stand-alone web application. It is a standard webapp implemented with Java Server Pages (JSP) and deployed in a Tomcat web server.
6.1. Run the Client
After installing the client application (see the "Install" chapter of this guide), the Tomcat web server is already configured and prepared to load the web application.
- Note: We configured the Tomcat web server to use the following ports by default for the client application:
- Port 9090 - for HTTP connections
- Port 9005 - for shutdown command
- /tomcat/conf/server.xml
- /tomcat/bin/startup.sh (or /tomcat/bin/startup.bat for Windows)
- http://localhost:9090/edos-client (or replace "localhost" with your machine's name; see also the "Replicator/Client webapp configuration" chapter)
- Home
- Repository
- Channels
- Distributions
- Query
- PeerID - the unique ID assigned to this peer in the network
- Address - the IP endpoint of the peer
- ...
6.2. Client's roles
As previously presented in the system's architecture, each component of the system offers a specific set of functionalities according to its type. We grouped the functionalities into "roles", and each component (peer) can play distinct roles in the system. The first version of the prototype implements two roles for a client peer:
- Client role - for the basic content management and download operations
- Query role - for the advanced query operations
6.3. Local repository manager
The client peer stores its data content in a local repository. This content consists of the DataUnits (Packages, Utilities and whole Distributions) that were previously downloaded from the system. Distributions can also be added manually to the local repository, but this is not the regular method; we use this feature only for testing.
Basically, the local repository is a file system on the client machine. Its location is given by the "edos.release.repository.path" parameter in the properties file (see also the "Replicator/Client webapp configuration" chapter). This is the root directory of the repository. Each Distribution is stored in a separate directory named after the distribution. The Packages and the Utilities are stored hierarchically in the file structure, according to our data model.
At initialization time, the Content Manager on the client peer loads the repository structure into memory, using the root directory as its entry point. It searches for all the available Distributions and for each package or utility file in the repository. Each DataUnit has an associated object stored in memory. After a file download, the Dissemination Manager on the client peer creates a symbolic link in the local repository with the name of the file. This link points to the file content, which is stored in the Azureus download directory: "/local/data_store/programs/edos-client/distributions"
Note that a regular Client peer stores only the data content in the repository. This is the difference from a Replicator peer, which stores the metadata content in addition to the DataUnits. The Replicator peers are part of the index network, where the metadata is distributed over different replicators (see also the "Replicator Peer" chapter). On the "Repository" page of the GUI you will see the content of the local repository.
In the upper part of the page you will find a summary of the local repository:
- Current Path - the current location in the file system
- Local Distributions - the list of the distributions available locally
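The symbolic-link step described in this section can be sketched with the standard `java.nio.file` API. The class `RepositoryLinker` and its method are hypothetical and are not the Dissemination Manager's actual code; the sketch also assumes a platform and file system where the process is allowed to create symbolic links.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating the link creation after a download.
final class RepositoryLinker {
    // Creates <distributionDir>/<file name> as a symbolic link to the downloaded file
    // and returns the path of the link.
    static Path linkDownloadedFile(Path distributionDir, Path downloadedFile) {
        try {
            Files.createDirectories(distributionDir);
            Path link = distributionDir.resolve(downloadedFile.getFileName());
            // Replace only a stale link from an earlier download, never a regular file.
            if (Files.isSymbolicLink(link)) {
                Files.delete(link);
            }
            return Files.createSymbolicLink(link, downloadedFile.toAbsolutePath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Linking instead of copying keeps a single copy of each file on disk, in the Azureus download directory, while the repository tree still reflects the distribution's structure.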
6.4. Channel subscription
The "Channels" page of the GUI represents the front-end of the Channel Manager on the client peer. In the table displayed you will find the summary of the peer's connections to the network. For each channel available on the client peer you have all the information concerning:
- the name of the channel
- the end-point address of the channel
- the status of each channel - connected or not connected
6.5. Download distributions
The "Distributions" page represents the download section of the client peer. This base functionality is implemented by the Dissemination Manager. The table on this page displays the distributions currently available in the peer-to-peer system. This information is obtained by calling the "getDistributionList()" method of the Index Manager. Each distribution found in the system's index is then looked up locally by the Content Manager. The table summarizes which distributions are available in the local repository and which are not but can be downloaded from the system. Pressing the "Download Distribution" button calls the "getNewRelease()" method. As defined in the system's architecture, at the logical level this method belongs to the Client role. At the physical level, it is implemented by the Dissemination Manager. The radio buttons on the left side allow you to select the distribution to download. The download of a distribution involves several steps and a set of actions performed by the client application. The download algorithm is detailed as follows:
- Step 1° - Wait for the announce message
- the Client is already subscribed to the main broadcast channel of the Publisher;
- the Client starts listening on this channel and it waits for an announce message from the Publisher;
- when a publish event occurs on the Publisher (e.g.: when a new release is available for download), each Client subscribed on the Publisher's list will be notified with an announce message;
- this announce message contains also the information concerning the time-window proposed by the Publisher for the broadcast;
- Step 2° - Wait for the right time-window
- the Client must check that its current time is inside the time-window for the broadcast of this distribution;
- it has to verify that the time-window is not already closed (i.e., that its current time is not past the closing time of the window); this case should not happen unless some communication problems occurred;
- it also has to verify that the starting time of the window has already passed; otherwise, it has to wait for a while;
- Step 3° - Compute the WishList
- the download of a new distribution basically consists in a parallel download of independent files;
- in the case when the selected distribution is not present in the local repository (e.g.: it has never been downloaded before by this Client), the set of files to download is actually the whole collection of files contained by the distribution (this is the case of a new release);
- in the case when the selected distribution is already present in the local repository (this is the case of an update), the Client will compute the delta between its local content of files and the new content announced by the Publisher;
- this delta is basically assessed between the Content Manager and the Index Manager;
- in both cases (new release or update), we call the set of files to download the WishList;
- Step 4° - Send the WishList to the Publisher
- the WishList is packed in a message and sent to the Publisher
- Step 5° - Get the torrent files and download the files
- during the announced time-window, the Publisher will receive the WishLists from all its subscribed Clients;
- after closing the window, it computes the file clusters and creates the torrent files;
- the Publisher releases the torrents in Azureus and sends another message to all the Clients to inform them about the availability of the torrents;
- the Client receives the torrents message and starts its Azureus downloader;
- Step 6° - Announce completion
- Azureus finishes the download of all files in the WishList;
- the downloaded files are registered by the Content Manager;
- the Dissemination Manager creates the symbolic links from the local repository to the new files downloaded by Azureus
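The WishList computation of Step 3° amounts to a set difference, which can be sketched in Java as follows. Identifying each file by a plain string and the class name `WishLists` are assumptions made for illustration; in the prototype the delta is assessed between the Content Manager and the Index Manager.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: WishList = published files minus files already held locally.
final class WishLists {
    static Set<String> compute(Set<String> publishedFiles, Set<String> localFiles) {
        Set<String> wishList = new LinkedHashSet<>(publishedFiles); // keeps publication order
        wishList.removeAll(localFiles);
        return Collections.unmodifiableSet(wishList);
    }

    public static void main(String[] args) {
        Set<String> published = new LinkedHashSet<>(Arrays.asList("a-1.0", "b-2.0", "c-1.1"));
        Set<String> local = new LinkedHashSet<>(Arrays.asList("a-1.0", "c-1.0"));
        // Update case: only the files missing locally are requested.
        System.out.println(compute(published, local));
        // New release case: nothing is held locally, so the whole distribution is requested.
        System.out.println(compute(published, Collections.<String>emptySet()));
    }
}
```

With an empty local set the result is the whole distribution, so the new-release and update cases are handled by the same operation.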
6.6. Querying the system
The "Query" page of the GUI represents the interface of the Query role played by the client peer. The queries are passed to the Index Manager and then sent down to the physical level, to the KadoP Manager. We defined two categories of queries:
- Simple queries - based on the pre-defined tags building the metadata of DataUnits
- NAME
- VERSION
- RELEASE
- SUMMARY
- DESCRIPTION
- SIZE
- LICENSE
- ...
- Compound queries - logical combinations of simple queries (joins)
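To illustrate the relationship between the two query categories, the sketch below models a simple query as a predicate on one metadata tag and combines predicates with logical operators. It is an in-memory analogy only: it does not use the KadoP query language or the Index Manager API, and the class `MetadataQueries` and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory model: a DataUnit's metadata is a map from tag to value.
final class MetadataQueries {
    // Simple query: the given tag has exactly the given value.
    static Predicate<Map<String, String>> tagEquals(String tag, String value) {
        return metadata -> value.equals(metadata.get(tag));
    }

    // Simple query: the given tag's value contains the given fragment.
    static Predicate<Map<String, String>> tagContains(String tag, String fragment) {
        return metadata -> {
            String value = metadata.get(tag);
            return value != null && value.contains(fragment);
        };
    }

    public static void main(String[] args) {
        Map<String, String> pkg = new HashMap<>();
        pkg.put("NAME", "bash");
        pkg.put("VERSION", "3.1");
        pkg.put("LICENSE", "GPL");

        // Compound query: NAME = bash AND (LICENSE = GPL OR SUMMARY contains "shell").
        Predicate<Map<String, String>> query =
            tagEquals("NAME", "bash")
                .and(tagEquals("LICENSE", "GPL").or(tagContains("SUMMARY", "shell")));
        System.out.println(query.test(pkg));
    }
}
```

The sample package values are made up for the example. The `and` and `or` methods of `java.util.function.Predicate` play the role of the logical combinations mentioned above.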
Version 1.12 last modified by StephaneLauriere on 18/10/2006 at 14:00