Mandriva Linux Cooker Mirrors Monitor
The Mirror Monitor framework will implement 7 (out of 13) diffusion indicators.
Architecture
The MirrorMonitor's architecture is depicted in the schema below:

The MirrorMonitor has three main components:
- CONNECTOR module
- ANALYZER module
- MIRROR Database
Around those primary components there are some secondary components providing inputs to the monitor:
- Mirrors
- Scheduler
- Reference
- Indicators
The MirrorMonitor's output is provided by the Analyzer module in different reporting forms. Additionally, a graphical interface can be developed and integrated in the dashboard.
The Connector and the Analyzer are two distinct modules working independently on a common database.
The Connector's task is to retrieve state information from all the mirrors known by the distribution editor. It connects to each mirror and gets the mirror's state as well as the list of the files stored at the given time. Currently, three protocols are used to synchronize the mirrors: ftp, http and rsync.
The list of the known mirrors is provided by a Reference, namely a text file containing the mirrors' URLs (e.g.: http://www1.mandrivalinux.com/mirrorsfull.list).
The output of the Connector essentially consists of the list of the files found on each mirror, including additional information such as the size of the file, its timestamp, etc. A "mirror down" message is registered in case of connection failure. All the data is stored in the Mirror Database.
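The Connector output described above can be illustrated with a minimal sketch. It assumes a hypothetical `MirrorSnapshot` record and a `take_snapshot` helper that receives the listing function as a parameter; the stand-in listing functions, the date and the file size are made up for the example and replace real ftp/http/rsync access.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List, Tuple

FileEntry = Tuple[str, int]  # (file name, size in bytes)


@dataclass
class MirrorSnapshot:
    """Hypothetical record of one Connector visit to one mirror."""
    mirror: str                # mirror URL taken from the reference list
    connection_time: datetime  # one timestamp shared by the whole snapshot
    up: bool                   # False records a "mirror down" event
    files: List[FileEntry] = field(default_factory=list)


def take_snapshot(mirror: str,
                  list_files: Callable[[str], List[FileEntry]],
                  now: datetime) -> MirrorSnapshot:
    """Query one mirror; a connection failure yields a 'mirror down' snapshot."""
    try:
        files = list_files(mirror)
    except OSError:  # connection errors are subclasses of OSError
        return MirrorSnapshot(mirror, now, up=False)
    return MirrorSnapshot(mirror, now, up=True, files=files)


# Stand-ins for real protocol listings (assumptions for illustration only).
def fake_listing(url: str) -> List[FileEntry]:
    return [("perl-5.8.8-1mdk.i586.rpm", 1024)]


def unreachable(url: str) -> List[FileEntry]:
    raise ConnectionRefusedError("mirror down")


now = datetime(2007, 1, 1, 12, 0, 0)
ok = take_snapshot("ftp://ftp.free.fr/", fake_listing, now)
down = take_snapshot("ftp://ftp.free.fr/", unreachable, now)
```

Passing the listing function as an argument keeps the sketch independent of the protocol, so the same record shape can be filled from any of the supported protocols.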
The Analyzer's task is to compute the indicators' values based on the information stored in the database and according to the definition of the indicators presented in the first deliverable. It uses as input a Reference on the current state of the distribution (e.g.: Cooker fileset on the main server). Based on the Reference, the Analyzer evaluates the measurement indicators by comparing it with the information selected from the database for a given moment (or a time interval).
The output of the Analyzer consists of a set of reports presenting the values of each indicator.
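The comparison against the Reference can be sketched as a set difference between two file lists. The function and type names below are hypothetical, the package names other than the perl one are invented, and treating a size difference as an inconsistency is an assumption rather than something the indicator definitions state here.

```python
from typing import Dict, List, NamedTuple


class ConsistencyReport(NamedTuple):
    missing: List[str]        # in the reference, absent from the mirror
    extra: List[str]          # on the mirror, not in the reference
    size_mismatch: List[str]  # present on both, but with different sizes


def compare_with_reference(reference: Dict[str, int],
                           mirror: Dict[str, int]) -> ConsistencyReport:
    """Compare one mirror file set (name -> size) with the reference file set."""
    missing = sorted(set(reference) - set(mirror))
    extra = sorted(set(mirror) - set(reference))
    size_mismatch = sorted(
        name for name in set(reference) & set(mirror)
        if reference[name] != mirror[name]
    )
    return ConsistencyReport(missing, extra, size_mismatch)


# Made-up data for illustration.
reference = {
    "perl-5.8.8-1mdk.i586.rpm": 1024,
    "pkg-a-1.0-1mdk.i586.rpm": 2048,
    "pkg-b-1.0-1mdk.i586.rpm": 512,
}
mirror = {
    "perl-5.8.8-1mdk.i586.rpm": 1024,
    "pkg-a-1.0-1mdk.i586.rpm": 2000,
    "pkg-old-0.9-1mdk.i586.rpm": 100,
}
report = compare_with_reference(reference, mirror)
```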
Both primary modules are synchronized by a Scheduler. This component is a configuration element which specifies the connection frequency to the mirrors, the time interval for the data analysis and the execution time for each module.
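As an illustration of the three Scheduler settings named above, here is a minimal sketch. The `Schedule` class, its field names and all the values are assumptions made for the example; the actual configuration format is not specified on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Schedule:
    """Hypothetical Scheduler configuration."""
    connection_frequency: timedelta  # how often the Connector polls the mirrors
    analysis_interval: timedelta     # time window covered by one Analyzer run
    connector_start: datetime        # execution time of the Connector
    analyzer_start: datetime         # execution time of the Analyzer


def connector_runs(schedule: Schedule, until: datetime) -> List[datetime]:
    """List the Connector execution times up to and including `until`."""
    if schedule.connection_frequency <= timedelta(0):
        raise ValueError("connection_frequency must be positive")
    runs = []
    t = schedule.connector_start
    while t <= until:
        runs.append(t)
        t += schedule.connection_frequency
    return runs


schedule = Schedule(
    connection_frequency=timedelta(hours=6),
    analysis_interval=timedelta(days=1),
    connector_start=datetime(2007, 1, 1, 0, 0),
    analyzer_start=datetime(2007, 1, 1, 23, 0),
)
runs = connector_runs(schedule, until=datetime(2007, 1, 1, 23, 59))
```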
The detailed specification of each component is given further on.
Primary components specification
CONNECTOR
Outputs:
- mirror_status
- file_list
- connection_time
API
* Boolean ftpConnect(Mirror)
* Boolean httpConnect(Mirror)
* Boolean rsyncConnect(Mirror)
* ? *Boolean mainServerConnect(MainServer)*
* Date connectionTime(Now)
* File[] getFiles(Mirror)
Inputs:
Outputs:
API
- ? Boolean mainServerConnect(MainServer, Now)
- ? *File[] getReferenceFiles(MainServer)*
* storeMirrorStatus(Mirror, Date)
* storeMirrorContent(Mirror, Date, Collection ?)
* ? *storeReferenceContent(MainServer, Date, Collection ?)*
* ? *getReferenceFiles(MainServer, Collection ?)*
- Report evaluateIndicator(Indicator, Date)
- Report evaluateIndicatorStatistics(Indicator, StartDate, EndDate)
MIRROR Database
Reflections on different DB solutions
- XML (DOM trees) - too slow to serialize (for thousands of nodes)
- Apache Derby
- Hypersonic SQL (XWiki + Hibernate)
- MySQL + PostgreSQL
Conclusions:
- we use MySQL for the first version
- as future plan: implement an interface module to ensure portability
Database structure
The Mirror Database consists of three main tables (Mirror, LocalState and HdlistState) and an auxiliary table (Availability). The auxiliary table contains redundant information (which can also be computed from the main tables), but it is useful for expressing some of the indicators directly, using a reduced amount of space.
Here follows the description of each table:
Mirror table:
| Field | Type | Description | Example |
|---|---|---|---|
| Mirror_ID | SMALLINT | Primary key | - |
| Host | VARCHAR | Server's hostname | ftp.free.fr |
| Protocol | CHAR(5) | Connection protocol | ftp / http / rsync |
| Content | VARCHAR | The set of packages stored on this mirror | cookeri586 / communityi586 / … / cookeri586_reference |
| Path | VARCHAR | The path to the content | mirrors/ftp.mandriva.com/MandrivaLinux/devel/cooker/i586/media/main |
This is the reference table for the list of mirrors known by Mandriva. It is synchronized with the content of the mirrorsfull.list file, according to this file's internal schema:
Content : Protocol : // Host / Path
Note that a record in this table does not correspond to a physical mirror server. Each mirror server can have several entries in the table, one for each supported protocol or stored content.
In other words, each record in the dataset represents a certain content (set of packages) stored by a mirror server, uniquely identified by a URL, a path and the connection protocol type.
One particular record in this table holds the address of the main server (e.g.: cookeri586_reference), which is used as the reference to evaluate the consistency of the mirror servers.
This table is used to implement the I1 indicator.
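The `Content : Protocol : // Host / Path` schema could be parsed into one record of this table along the following lines. This is a sketch: the `MirrorRecord` type and `parse_mirror_line` function are hypothetical, and the sample line is assembled from the example values in the table above rather than copied from the real mirrorsfull.list file.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

KNOWN_PROTOCOLS = ("ftp", "http", "rsync")


class MirrorRecord(NamedTuple):
    content: str   # set of packages, e.g. cookeri586
    protocol: str  # ftp / http / rsync
    host: str      # server's hostname
    path: str      # path to the content, without the leading slash


def parse_mirror_line(line: str) -> MirrorRecord:
    """Split one 'Content:Protocol://Host/Path' line into its four fields."""
    content, sep, url = line.strip().partition(":")
    if not sep or not content:
        raise ValueError(f"no content prefix in line: {line!r}")
    parts = urlsplit(url)
    if parts.scheme not in KNOWN_PROTOCOLS or not parts.hostname:
        raise ValueError(f"unsupported or malformed URL: {url!r}")
    return MirrorRecord(content, parts.scheme, parts.hostname,
                        parts.path.lstrip("/"))


# Sample line built from the table's example values (an assumption).
sample = ("cookeri586:ftp://ftp.free.fr/"
          "mirrors/ftp.mandriva.com/MandrivaLinux/devel/cooker/i586/media/main")
record = parse_mirror_line(sample)
```

Only the first colon separates the content from the URL, so the colon inside `ftp://` is left untouched.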
LocalState table:
| Field | Type | Description | Example |
|---|---|---|---|
| Mirror_ID | SMALLINT | Foreign key | - |
| Date_Stamp | DATE | The date of the snapshot | YYYY-MM-DD |
| Time_Stamp | TIME | The time of the snapshot | HH:MM:SS |
| Local_Package | VARCHAR | Package name | perl-5.8.8-1mdk.i586.rpm |
| Package_Size | BIGINT | Size of the package | - |
Note: The timestamp is fixed at connection time, so that all the records corresponding to a given mirror connection have the same timestamp. This is used in the selections made for time-interval statistics.
This dataset stores a large number of records. At each connection time, for each mirror server, we add a record for every package file currently stored by the mirror. All this information is needed in order to draw statistics on the data consistency of the mirrors over different intervals of time.
It works together with the HdlistState table, which is used in the same manner to evaluate the metadata consistency of the mirrors and keeps track of the content of the hdlist files.
This table is used to implement the I5 indicator.
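The snapshot storage and the time-interval selection described above can be sketched as follows. The sketch uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for MySQL; the helper names, the NOT NULL constraints, the dates, the sizes and the second package name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.execute("""
    CREATE TABLE LocalState (
        Mirror_ID     SMALLINT NOT NULL,
        Date_Stamp    DATE     NOT NULL,
        Time_Stamp    TIME     NOT NULL,
        Local_Package VARCHAR  NOT NULL,
        Package_Size  BIGINT
    )
""")


def store_snapshot(conn, mirror_id, date_stamp, time_stamp, files):
    """Add one record per package, all sharing the connection timestamp."""
    conn.executemany(
        "INSERT INTO LocalState VALUES (?, ?, ?, ?, ?)",
        [(mirror_id, date_stamp, time_stamp, name, size)
         for name, size in files],
    )


def snapshots_in_interval(conn, mirror_id, start_date, end_date):
    """Return (date, time, package count) for each snapshot in the interval."""
    rows = conn.execute(
        "SELECT Date_Stamp, Time_Stamp, COUNT(*) FROM LocalState "
        "WHERE Mirror_ID = ? AND Date_Stamp BETWEEN ? AND ? "
        "GROUP BY Date_Stamp, Time_Stamp "
        "ORDER BY Date_Stamp, Time_Stamp",
        (mirror_id, start_date, end_date),
    )
    return rows.fetchall()


# Made-up snapshots of mirror 1 on two consecutive days.
store_snapshot(conn, 1, "2007-01-01", "12:00:00",
               [("perl-5.8.8-1mdk.i586.rpm", 1024),
                ("pkg-a-1.0-1mdk.i586.rpm", 2048)])
store_snapshot(conn, 1, "2007-01-02", "12:00:00",
               [("perl-5.8.8-1mdk.i586.rpm", 1024)])

first_day = snapshots_in_interval(conn, 1, "2007-01-01", "2007-01-01")
both_days = snapshots_in_interval(conn, 1, "2007-01-01", "2007-01-02")
```

Because every record of a snapshot carries the same date and time, grouping on those two columns recovers the individual snapshots.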
HdlistState table:
| Field | Type | Description | Example |
|---|---|---|---|
| Mirror_ID | SMALLINT | Foreign key | - |
| Date_Stamp | DATE | The date of the snapshot | YYYY-MM-DD |
| Time_Stamp | TIME | The time of the snapshot | HH:MM:SS |
| Hdlist_Package | VARCHAR | Package name | perl-5.8.8-1mdk.i586.rpm |
| Package_Size | BIGINT | Size of the package | - |
This table is used to implement the I1, I4 and I6 indicators.
Availability table:
| Field | Type | Description |
|---|---|---|
| Mirror_ID | SMALLINT | Foreign key |
| Date_Stamp | DATE | The date of the snapshot |
| Time_Stamp | TIME | The time of the snapshot |
| State | BOOLEAN | Mirror up / down |
| No_Packages | SMALLINT | Number of stored packages |
| Total_Size | BIGINT | Total size of the content |
| No_Missing_Packages | SMALLINT | Number of missing packages compared with the reference server |
| ? Hdlist_Time | DATETIME | The timestamp of the hdlist file |
This auxiliary table contains a summary of the current state of each mirror. For each connection, it stores only the state of the mirror (up / down), the number of stored packages and the number of missing packages evaluated at the current timestamp. It does not store the names of the packages, but only their total number and size.
This table is used to implement the I2, I3, I7 and I10 indicators.
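One summary record of this auxiliary table could be derived from a mirror snapshot and the reference file set roughly as shown below. The `availability_row` function is hypothetical, the data is invented, and leaving the counters empty (None) for a mirror that is down is an assumption, since the page does not say what is stored in that case.

```python
from typing import Dict, Optional


def availability_row(mirror_files: Optional[Dict[str, int]],
                     reference_files: Dict[str, int]) -> dict:
    """Summarize one connection; mirror_files is None when the mirror is down."""
    if mirror_files is None:
        # Assumption: nothing can be counted for an unreachable mirror.
        return {"State": False, "No_Packages": None,
                "Total_Size": None, "No_Missing_Packages": None}
    return {
        "State": True,
        "No_Packages": len(mirror_files),
        "Total_Size": sum(mirror_files.values()),
        "No_Missing_Packages": len(set(reference_files) - set(mirror_files)),
    }


# Made-up file sets (name -> size in bytes).
reference_files = {
    "perl-5.8.8-1mdk.i586.rpm": 1024,
    "pkg-a-1.0-1mdk.i586.rpm": 2048,
}
mirror_files = {"perl-5.8.8-1mdk.i586.rpm": 1024}

up_row = availability_row(mirror_files, reference_files)
down_row = availability_row(None, reference_files)
```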
The SQL script used to create the MIRRORDB database.
Secondary components:
1° Inputs
- Reference
- Scheduler
- Indicators
2° Outputs
3° Mirrors
Cooker mirrors list
Scripts and related pages
Mirroring Metrics
Attached to this page: Perl scripts written by Radu Pop for checking the mirror consistencies.