Mandriva Linux Cooker Mirrors Monitor

The Mirror Monitor framework will implement 7 (out of 13) diffusion indicators.

ID   Indicator                                               Details
I1   Topology of mirrors involved in the dissemination       1136194587596
I2   Typology of the disseminated contents                   1136195059100
I3   Data consistency                                        1136129864991
I4   Metadata consistency                                    1136135363631
I5   Tree edit distance from the latest version available    1136135208297
I6   Synchronization time and system performance             1136136205882
I7   System availability                                     1136133332138

Architecture

The MirrorMonitor's architecture is depicted in the diagram below:

MirrorMonitorArchitecture.gif

The MirrorMonitor has three main components:

  • CONNECTOR module
  • ANALYZER module
  • MIRROR Database
Around those primary components there are some secondary components providing inputs to the monitor:
  • Mirrors
  • Scheduler
  • Reference
  • Indicators
The MirrorMonitor's output is provided by the Analyzer module in different reporting forms. Additionally, a graphical interface can be developed and integrated in the dashboard.

The Connector and the Analyzer are two distinct modules working independently on a common database.

The Connector's task is to retrieve state information from all the mirrors known to the distribution publisher. It connects to each mirror and retrieves the mirror's state as well as the list of files stored there at that time. Currently, three protocols are used to synchronize the mirrors:

  • ftp
  • http
  • rsync
The list of known mirrors is provided by a Reference, namely a text file containing the mirrors' URLs (e.g.: http://www1.mandrivalinux.com/mirrorsfull.list).

The output of the Connector consists essentially of the list of files found on each mirror, together with additional information such as file size and timestamp. If the connection fails, a "mirror down" message is recorded instead. All the data is stored in the Mirror Database.
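The Connector behaviour described above can be illustrated with a minimal sketch. It is not the project's implementation: the names MirrorSnapshot, take_snapshot and list_files are hypothetical, and the protocol-specific listing (ftp, http or rsync) is assumed to be supplied by the caller as a function that raises OSError when the mirror cannot be reached.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Iterable, List, Tuple

# (file name, size in bytes, modification timestamp) as reported by a mirror.
FileEntry = Tuple[str, int, datetime]


@dataclass
class MirrorSnapshot:
    """Result of one connection attempt to one mirror (hypothetical record)."""
    mirror_url: str
    connection_time: datetime
    is_up: bool
    files: List[FileEntry] = field(default_factory=list)


def take_snapshot(mirror_url: str,
                  list_files: Callable[[str], Iterable[FileEntry]],
                  now: datetime) -> MirrorSnapshot:
    """Fetch the file list of a mirror, or record the mirror as down."""
    try:
        files = list(list_files(mirror_url))
    except OSError:
        # Refused connection, timeout, DNS failure, ...: "mirror down".
        return MirrorSnapshot(mirror_url, now, is_up=False)
    return MirrorSnapshot(mirror_url, now, is_up=True, files=files)
```

Passing the listing function as a parameter keeps the sketch independent of any particular protocol library; a real Connector would plug in one function per supported protocol.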

The Analyzer's task is to compute the indicators' values from the information stored in the database, according to the definitions of the indicators presented in the first deliverable. As input it uses a Reference describing the current state of the distribution (e.g.: the Cooker fileset on the main server). The Analyzer evaluates the measurement indicators by comparing this Reference with the information selected from the database for a given moment (or time interval).

The output of the Analyzer consists of a set of reports presenting the values of each indicator.
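As an illustration of the kind of comparison the Analyzer performs, the sketch below compares the package list of one mirror with the Reference. The indicator definitions themselves are in the first deliverable and are not reproduced here, so the three result sets (missing, extra, size_mismatch) and the names ConsistencyReport and compare_with_reference are assumptions made for this example only.

```python
from typing import Dict, NamedTuple, Set


class ConsistencyReport(NamedTuple):
    missing: Set[str]        # in the reference, absent from the mirror
    extra: Set[str]          # on the mirror, not in the reference
    size_mismatch: Set[str]  # present on both, but with different sizes


def compare_with_reference(reference: Dict[str, int],
                           mirror: Dict[str, int]) -> ConsistencyReport:
    """Compare {package name: size} maps of the reference and one mirror."""
    ref_names, mirror_names = set(reference), set(mirror)
    common = ref_names & mirror_names
    return ConsistencyReport(
        missing=ref_names - mirror_names,
        extra=mirror_names - ref_names,
        size_mismatch={name for name in common
                       if reference[name] != mirror[name]},
    )
```

The size of the missing set corresponds to the kind of count a per-mirror summary could store, while the full sets are what a detailed report would list.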

Both primary modules are synchronized by a Scheduler. This component is a configuration element that specifies how often the mirrors are contacted, the time interval for the data analysis and the execution time of each module.
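One possible shape for such a configuration element is sketched below. The Schedule class and its field names (connect_every, analysis_window) are hypothetical; the page does not define the Scheduler's actual format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Schedule:
    """Hypothetical scheduler settings; field names are illustrative."""
    connect_every: timedelta    # connection frequency to the mirrors
    analysis_window: timedelta  # time interval covered by each analysis


def connection_times(schedule: Schedule,
                     start: datetime,
                     end: datetime) -> Iterator[datetime]:
    """Yield every Connector run time from start to end (inclusive)."""
    if schedule.connect_every <= timedelta(0):
        raise ValueError("connect_every must be positive")
    current = start
    while current <= end:
        yield current
        current += schedule.connect_every


def analysis_interval(schedule: Schedule,
                      run_time: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) interval the Analyzer covers at run_time."""
    return (run_time - schedule.analysis_window, run_time)
```

With a six-hour connection frequency, for example, connection_times produces the timestamps that would later be shared by all records of one connection round.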

The detailed specification of each component is given further on.

Primary components specification

CONNECTOR

Inputs:

  • mirror_list(URL)

Outputs:

  • mirror_status
  • file_list
  • connection_time

API

  • Boolean ftpConnect(Mirror)
  • Boolean httpConnect(Mirror)
  • Boolean rsyncConnect(Mirror)
  • ? Boolean mainServerConnect(MainServer)
  • Date connectionTime(Now)
  • File[] getFiles(Mirror)
  • ? File…

ANALYZER

Inputs:

  • time
  • reference ?

Outputs:

  • report

API

  • ? Boolean mainServerConnect(MainServer, Now)
  • ? File[] getReferenceFiles(MainServer)
  • storeMirrorStatus(Mirror, Date)
  • storeMirrorContent(Mirror, Date, Collection ?)
  • ? storeReferenceContent(MainServer, Date, Collection ?)
  • ? getReferenceFiles(MainServer, Collection ?)
  • Report evaluateIndicator(Indicator, Date)
  • Report evaluateIndicatorStatistics(Indicator, StartDate, EndDate)
MIRROR Database

Reflections on different DB solutions

  • XML (DOM trees) - too slow to serialize (for thousands of nodes)
  • Apache Derby
  • Hypersonic SQL (XWiki + Hibernate)
  • MySQL + PostgreSQL

Conclusions:

  • we use MySQL for the first version
  • future plan: implement an interface module to ensure portability

Database structure

The Mirror Database consists of three main tables (Mirror, LocalState and HdlistState) and an auxiliary table (Availability). The auxiliary table contains redundant information (it can also be computed from the main tables), but it expresses some of the indicators directly while using little space.

Each table is described below:

  • Mirror

    Field      Type      Description                                Example
    Mirror_ID  SMALLINT  Primary key                                -
    Host       VARCHAR   Server's hostname                          ftp.free.fr
    Protocol   CHAR(5)   Connection protocol                        ftp / http / rsync
    Content    VARCHAR   The set of packages stored on this mirror  cookeri586 / communityi586 / … / cookeri586_reference
    Path       VARCHAR   The path to the content                    mirrors/ftp.mandriva.com/MandrivaLinux/devel/cooker/i586/media/main

This is the reference table for the list of mirrors known by Mandriva. It is synchronized with the content of the mirrorsfull.list file, according to this file's internal schema:

    Content : Protocol : // Host / Path

Note that a record in this table does not correspond to a physical mirror server. Each mirror server can have several entries in the table, one for each protocol it supports and each content it stores.

In other words, each record in the dataset represents a certain content (set of packages) stored by a mirror server, uniquely identified by a URL, a path and the connection protocol type.

One special record in this table holds the address of the main server (e.g.: cookeri586_reference), which is used as the reference when evaluating the consistency of the mirror servers.

This table is used to implement the I1 indicator.
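The schema above suggests that each line of the mirrors list can be split into the four non-key fields of the Mirror table. The following sketch assumes lines of the exact form content:protocol://host/path, which is an interpretation of that schema and not a verified description of mirrorsfull.list; MirrorEntry and parse_mirror_line are hypothetical names, and the sample line is composed from the Example column of the table.

```python
from typing import NamedTuple


class MirrorEntry(NamedTuple):
    content: str   # e.g. cookeri586
    protocol: str  # ftp, http or rsync
    host: str      # e.g. ftp.free.fr
    path: str      # path to the content, without a leading slash


def parse_mirror_line(line: str) -> MirrorEntry:
    """Split one 'content:protocol://host/path' line into its fields."""
    content, colon, url = line.strip().partition(":")
    protocol, scheme_sep, rest = url.partition("://")
    host, _, path = rest.partition("/")
    if not (colon and scheme_sep and content and protocol and host):
        raise ValueError(f"unrecognized mirror line: {line!r}")
    return MirrorEntry(content, protocol, host, path)


# Sample line assembled from the table's example values (an assumption).
sample = ("cookeri586:ftp://ftp.free.fr/"
          "mirrors/ftp.mandriva.com/MandrivaLinux/devel/cooker/i586/media/main")
entry = parse_mirror_line(sample)
```

Because the content and the protocol are part of each parsed entry, the same host listed under two protocols naturally yields two distinct records, as described above.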

  • LocalState

    Field          Type      Description               Example
    Mirror_ID      SMALLINT  Foreign key               -
    Date_Stamp     DATE      The date of the snapshot  YYYY-MM-DD
    Time_Stamp     TIME      The time of the snapshot  HH:MM:SS
    Local_Package  VARCHAR   Package name              perl-5.8.8-1mdk.i586.rpm
    Package_Size   BIGINT    Size of the package       -

Note: The timestamp is fixed at connection time, so that all the records corresponding to a given mirror connection have the same timestamp. This is used in the selections made for time-interval statistics.

This dataset stores a large number of records. At each connection time, for each mirror server, we add a record for every package file currently stored by the mirror. All this information is needed in order to draw statistics on the data consistency of the mirrors over different time intervals.

It works together with the HdlistState table, which keeps track of the content of the hdlist files and is used in the same manner to evaluate the metadata consistency of the mirrors.

This table is used to implement the I5 indicator.
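The shared-timestamp convention can be illustrated with a small sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The table definition is transcribed from the field list above, not taken from the attached MIRRORDB_script.sql; the VARCHAR length and the helper names (open_mirror_db, store_mirror_content, packages_per_snapshot) are assumptions, the second one loosely modeled on the storeMirrorContent entry of the API list.

```python
import sqlite3
from typing import Iterable, List, Tuple


def open_mirror_db() -> sqlite3.Connection:
    """Create an in-memory database holding only the LocalState table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE LocalState (
               Mirror_ID     SMALLINT     NOT NULL,
               Date_Stamp    DATE         NOT NULL,  -- YYYY-MM-DD
               Time_Stamp    TIME         NOT NULL,  -- HH:MM:SS
               Local_Package VARCHAR(255) NOT NULL,
               Package_Size  BIGINT
           )"""
    )
    return conn


def store_mirror_content(conn: sqlite3.Connection, mirror_id: int,
                         date_stamp: str, time_stamp: str,
                         packages: Iterable[Tuple[str, int]]) -> None:
    """Insert one row per package, all sharing the connection timestamp."""
    conn.executemany(
        "INSERT INTO LocalState VALUES (?, ?, ?, ?, ?)",
        [(mirror_id, date_stamp, time_stamp, name, size)
         for name, size in packages],
    )
    conn.commit()


def packages_per_snapshot(conn: sqlite3.Connection, mirror_id: int,
                          start_date: str,
                          end_date: str) -> List[Tuple[str, str, int]]:
    """Count the packages of one mirror in each snapshot of a date interval."""
    return conn.execute(
        """SELECT Date_Stamp, Time_Stamp, COUNT(*)
           FROM LocalState
           WHERE Mirror_ID = ? AND Date_Stamp BETWEEN ? AND ?
           GROUP BY Date_Stamp, Time_Stamp
           ORDER BY Date_Stamp, Time_Stamp""",
        (mirror_id, start_date, end_date),
    ).fetchall()
```

Grouping by the date and time stamps works only because every row written during one connection carries the same values, which is the purpose of the note above.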

  • HdlistState

    Field           Type      Description               Example
    Mirror_ID       SMALLINT  Foreign key               -
    Date_Stamp      DATE      The date of the snapshot  YYYY-MM-DD
    Time_Stamp      TIME      The time of the snapshot  HH:MM:SS
    Hdlist_Package  VARCHAR   Package name              perl-5.8.8-1mdk.i586.rpm
    Package_Size    BIGINT    Size of the package       -

This table is used to implement the I1, I4 and I6 indicators.

  • Availability

    Field                Type      Description
    Mirror_ID            SMALLINT  Foreign key
    Date_Stamp           DATE      The date of the snapshot
    Time_Stamp           TIME      The time of the snapshot
    State                BOOLEAN   Mirror up / down
    No_Packages          SMALLINT  Number of stored packages
    Total_Size           BIGINT    Total size of the content
    No_Missing_Packages  SMALLINT  Number of missing packages compared with the reference server
    ? Hdlist_Time        DATETIME  The timestamp of the hdlist file

This auxiliary table contains a summary of the current state of each mirror. For each connection, it stores only the state of the mirror (up / down), the number of stored packages and the number of missing packages evaluated at the current timestamp. It does not store the names of the packages, only their total number and size.

This table is used to implement the I2, I3, I7 and I10 indicators.
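To show how a summary table of this kind can feed an availability figure, the sketch below computes the fraction of "up" observations for one mirror over a time interval. This is only one plausible reading of "System availability"; the actual definition of I7 is given in the first deliverable, and the names Observation and availability_ratio are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple

# (snapshot timestamp, state) pairs for one mirror; state is True when up.
Observation = Tuple[datetime, bool]


def availability_ratio(observations: Iterable[Observation],
                       start: datetime,
                       end: datetime) -> Optional[float]:
    """Fraction of snapshots in [start, end] where the mirror was up.

    Returns None when the interval contains no snapshot at all.
    """
    states = [is_up for stamp, is_up in observations
              if start <= stamp <= end]
    if not states:
        return None
    return sum(states) / len(states)
```

Returning None for an empty interval keeps "no data" distinguishable from "always down", which would otherwise both appear as 0.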

MirrorMonitorDB.gif

The SQL script used to create the MIRRORDB database.

Secondary components:

1° Inputs

  • Reference
  • Scheduler
  • Indicators

2° Outputs

  • Reporting
  • Dashboard

3° Mirrors

  • mirrors list

Cooker mirrors list

Scripts and related pages

Mirroring Metrics

Attached to this page: Perl scripts written by Radu Pop for checking mirror consistency.


Version 1.43 last modified by RaduPop on 07/03/2006 at 13:12


Attachments 9

  • compare.pl 1.1 (BIN, 1kb), posted by StephaneLauriere on 07/06/2005
  • syn.pl 1.1 (BIN, 4kb), posted by StephaneLauriere on 07/06/2005
  • FTPMirrorConnection.java 1.1 (Text, 3kb), posted by StephaneLauriere on 04/08/2005
  • Mirror.java 1.1 (Text, 2kb), posted by StephaneLauriere on 04/08/2005
  • Statistics.java 1.1 (Text, 2kb), posted by StephaneLauriere on 04/08/2005
  • Task.java 1.1 (Text, 5kb), posted by StephaneLauriere on 04/08/2005
  • MIRRORDB_script.sql 1.1 (BIN, 919 bytes), posted by RaduPop on 01/03/2006
  • MirrorMonitorDB.gif 1.1 (Image, 8kb), posted by RaduPop on 01/03/2006
  • MirrorMonitorArchitecture.gif 1.1 (Image, 6kb), posted by RaduPop on 14/02/2006

Creator: RaduPop on 2005/06/07 15:20
Copyright EDOS Consortium