Large scale simulator

1 EDOS experiment on Grid5000 network

1.1 Experiment's preparation:

1° Register the experiment on Grid5000 network.

2° Connect to a public frontend (e.g. the Orsay site)
  • Note: the Orsay frontend sits behind a pass-through firewall, so the 'ssh' connection is made in two steps:
    • 'ssh rpop@acces.grenoble.grid5000.fr' =>
    • 'ssh rpop@oar.orsay.grid5000.fr'
3° Retrieve the sources
  • copy the application tarball in the personal account, using 'scp':
    • 'scp edos.prototype.tar.gz rpop@acces.grenoble.grid5000.fr:~/transfer/edos.prototype.tar.gz'
  • unpack the tarball:
    • 'tar xvzf ~/transfer/edos.prototype.tar.gz'
  • the repository's structure is detailed below
  • Notes:
    • home directories are mounted by NFS on all the machines of a site (Orsay, Lille, etc.)
    • there is no synchronisation between the sites; users have to synchronise the data in their home directories across sites themselves
    • for the first experiments, we use simple shell scripts to test launching the processes.
    • the communication with the application is done by web service calls
4° Submit the experiment
  • there are two submission modes: interactive and passive
    • interactive - gives an interactive shell and is often used for debugging
    • passive - for batch mode execution
  • nodes to be used in the experiment have to be reserved using the 'OAR' tool
  • first, check the list of available nodes:
  • select a set of nodes and submit the task with the 'oarsub' command:
    • 'oarsub -I -l nodes=7'
    • the $OAR_NODEFILE variable then contains the list of the reserved nodes
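
Once the job is running, the reserved hosts can be read from $OAR_NODEFILE. A minimal shell sketch, assuming the file holds one hostname per line and may repeat a host (for example once per reserved core); the function name 'reserved_nodes' is illustrative and not part of OAR:

```shell
#!/bin/sh
# Print each reserved host exactly once.
# Assumption: the node file has one hostname per line, possibly repeated.
reserved_nodes() {
    # An optional argument overrides $OAR_NODEFILE (handy outside a job).
    nodefile="${1:-$OAR_NODEFILE}"
    if [ -z "$nodefile" ] || [ ! -r "$nodefile" ]; then
        echo "reserved_nodes: no readable node file" >&2
        return 1
    fi
    # sort -u removes the duplicate host names.
    sort -u "$nodefile"
}
```

Inside an interactive job, 'reserved_nodes | wc -l' gives the number of distinct hosts, which can be compared with the number requested through '-l nodes=7'.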

1.2 Launching an experiment (algorithm):

1° Update the configuration file:

  • 'config'
2° Launch the test scenario (Publishing and Querying - S1, Downloading and Subscription/Notification - S2), e.g. S1:
  • 'edos_s1.sh [publisher | replicator | client]'
    • 2.1° Retrieve the sources and make a local install in the '/tmp/EDOS' directory:
      • 'edos_deployment.sh [publisher | replicator | client]'
    • 2.2° Start the Tomcat server
    • 2.3° Sleep (init time)
    • 2.4° Call "publishRelease" web service (java PublishReleaseCall "releaseName")
    • 2.5° Sleep (publishing time) TODO: web service call-backs
  • web applications run from the local disk of each machine: '/tmp/EDOS'
  • data sets are accessed directly in the home directory: '~/EDOS/distributions'
4° Retrieve the log files and save them in the home directory:
  • 'edos_collect.sh'
    • ~/EDOS/metrics/"hostname"-"date".log
5° Stop the Tomcat server
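
Steps 2.1 to 2.5 can be sketched as a small shell function. This is a hypothetical outline, not the actual 'edos_s1.sh': the Tomcat path, the two sleep durations and the helper names 'run' and 'run_s1' are assumptions, and the text does not say whether every role calls "publishRelease". With DRY_RUN=1 (the default here) the commands are only printed:

```shell
#!/bin/sh
# Hypothetical outline of scenario S1; the real edos_s1.sh may differ.
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print commands only
TOMCAT_HOME="${TOMCAT_HOME:-/tmp/EDOS/tomcat}"  # assumed install path
INIT_TIME="${INIT_TIME:-30}"                    # assumed value, in seconds
PUBLISH_TIME="${PUBLISH_TIME:-60}"              # assumed value, in seconds

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run_s1() {
    role="$1"
    release="$2"
    case "$role" in
        publisher|replicator|client) ;;
        *) role="" ;;
    esac
    if [ -z "$role" ] || [ -z "$release" ]; then
        echo "usage: run_s1 publisher|replicator|client RELEASE" >&2
        return 2
    fi
    run ./edos_deployment.sh "$role"        # 2.1 local install in /tmp/EDOS
    run "$TOMCAT_HOME/bin/startup.sh"       # 2.2 start Tomcat
    run sleep "$INIT_TIME"                  # 2.3 init time
    run java PublishReleaseCall "$release"  # 2.4 call publishRelease
    run sleep "$PUBLISH_TIME"               # 2.5 publishing time
}
```

Calling 'run_s1 publisher Cooker2007' in dry-run mode lists the five commands in order. The fixed sleeps stand in for the web service call-backs mentioned in the TODO.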

2 Repository's structure

~/EDOS
      /Prototype/Sources
      |                `-lib.tar.gz
      |                `-tomcat.tar.gz
      |                `-azureus-client.tar.gz
      |                `-azureus-publisher.tar.gz
      |                `-azureus-replicator.tar.gz
      |                `-webapp-client.tar.gz
      |                `-webapp-publisher.tar.gz
      |                `-webapp-replicator.tar.gz
      /axis
      |   `-axis-ant.jar
      |   `-axis.jar
      |   `-commons-discovery.jar
      |   `-commons-logging.jar
      |   `-jaxrpc.jar
      |   `-log4j.jar
      |   `-saaj.jar
      |   `-PublishReleaseCall.class
      /distributions
      |            `-Cooker2007
      |            `-Cooker70
      /java
      /metrics
      |      `-"hostname"-"date".log
      /scripts
             `-config
             `-edos_collect.sh
             `-edos_deployment.sh
             `-edos_s1.sh
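
The metrics entry above follows the "hostname"-"date".log pattern. A minimal sketch of how a collection script could build such a path; the function name 'metrics_log_path' and the date format are assumptions, since the page only says "date":

```shell
#!/bin/sh
# Build a per-host, per-run log path under ~/EDOS/metrics.
# Assumption: the timestamp format below is illustrative only.
metrics_log_path() {
    # uname -n prints the host name of the current machine.
    printf '%s/EDOS/metrics/%s-%s.log\n' \
        "$HOME" "$(uname -n)" "$(date +%Y%m%d-%H%M%S)"
}
```

Because home directories are NFS-mounted on all machines of a site, putting the host name in the file name keeps nodes from overwriting each other's logs.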

3 Experiment's results

3.1 Publishing Scenario S1

  • Parameters:
    • INS (Indexing Network Size) = 1 (one Publisher; no Mirrors)
    • DNS (Distribution Network Size) = 1 (the Publisher)
    • MDS (MetaData Size) = 200 packages
Publishing-s1.gif

  • Parameters:
    • INS (Indexing Network Size) = 3 (the Publisher + 2 Mirrors)
    • DNS (Distribution Network Size) = 3
    • MDS (MetaData Size) = 300 packages
Publishing-s1+2.gif Publishing-s1+2-500.gif

  • Parameters:
    • INS (Indexing Network Size) = 11 (the Publisher + 10 Mirrors)
    • DNS (Distribution Network Size) = 11
    • MDS (MetaData Size) = 5120 packages (Cooker2007 Release)
Publishing-s1+10-5120.gif

4 Set up a real testing environment on the LRI cluster (100 machines)

We test the EDOS distribution system on a remote cluster located in the LRI (Laboratoire de Recherche en Informatique) network. We access the cluster via ssh from our local network. The deployment schema is illustrated in the following figure:

lri-cluster.gif

The access point to the cluster is a dedicated machine named pl-ssh2.lri.fr. Once you are securely connected to this machine, you can connect to any other machine in the LRI network.

Another machine in the network, pc4-83.lri.fr, stores the applications to be tested on the cluster. It can be accessed by ssh from pl-ssh2.lri.fr, and the sources can be transferred to it with simple scp commands. The application is deployed to the cluster from this machine, using scripts that automatically detect the available machines and export the application to them.
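
The two-hop access described above can be expressed as single commands on a recent client. A sketch, assuming OpenSSH 7.3 or later (for the ProxyJump option) and a placeholder login LRI_USER; the helper names are illustrative, and the functions only print the commands rather than opening a connection:

```shell
#!/bin/sh
# Print the commands for reaching pc4-83.lri.fr through pl-ssh2.lri.fr.
# Assumptions: OpenSSH 7.3+ (ProxyJump); LRI_USER is a placeholder login.
LRI_USER="${LRI_USER:-username}"
GATEWAY="pl-ssh2.lri.fr"
STAGING="pc4-83.lri.fr"

# Interactive login on the staging machine, via the gateway.
lri_ssh_cmd() {
    echo "ssh -o ProxyJump=$LRI_USER@$GATEWAY $LRI_USER@$STAGING"
}

# Copy a local file ($1) to the remote home directory, via the gateway.
lri_scp_cmd() {
    echo "scp -o ProxyJump=$LRI_USER@$GATEWAY $1 $LRI_USER@$STAGING:"
}
```

For example, 'lri_scp_cmd edos.prototype.tar.gz' prints the scp command that would transfer the tarball to pc4-83.lri.fr in one step.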

Acknowledgements: This is joint work with the KadoP development team. The common goal is to test and improve the performance of the distributed index on a large scale.

A full version of this document can be found on the KadoP documentation page.

5 Steps

  • see what adaptations are needed for the application to run on the cluster, and do them
  • check that all the important services have their own logger declared in the code
  • configure log4j so that the loggers write their messages to individual files (example: PublishRole will log to edos.publishrole.log, etc.) and use a standard output format that can be easily parsed
    • added a file logger for PublishRole with a high threshold (FATAL) so that only the testing messages are kept
  • experiment with Anteater (or look at other possible tools) and write a few XML descriptions of the scenarios described in WP4_Unit Tests
  • write pseudo code corresponding to the scenarios so that we can understand their complexity and see what may be missing in Anteater
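
The per-service log file from the log4j step could look like the fragment generated below. This is a sketch for log4j 1.x properties syntax; the logger name "PublishRole" (the real one may be a fully qualified class name), the appender name and the output pattern are assumptions:

```shell
#!/bin/sh
# Write a log4j 1.x properties fragment that gives one logger its own file.
# Assumptions: logger name "PublishRole", appender name "publishrole",
# and the conversion pattern are illustrative choices.
write_publishrole_log4j() {
    # $1: path of the properties file to create
    cat > "$1" <<'EOF'
log4j.logger.PublishRole=INHERITED, publishrole
log4j.appender.publishrole=org.apache.log4j.FileAppender
log4j.appender.publishrole.File=edos.publishrole.log
log4j.appender.publishrole.Threshold=FATAL
log4j.appender.publishrole.layout=org.apache.log4j.PatternLayout
log4j.appender.publishrole.layout.ConversionPattern=%d{ISO8601} %-5p %c - %m%n
EOF
}
```

Putting the FATAL threshold on the appender rather than on the logger filters only what reaches edos.publishrole.log, so other appenders still see the lower-level messages. One line per event with a fixed field order keeps the file easy to parse.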
Version 1.36 last modified by RaduPop on 27/04/2007 at 17:36

Creator: StephaneLauriere on 2007/01/23 12:19
Copyright EDOS Consortium