Debian Data Retrieval and Database Filling Procedure

The current versions of history and anla use the MySQL backend, which is quite slow. However, while we wait for the new versions to be developed, we will put the current ones into production. This requires automating the database filling procedure.

Location of source files

The sources for anla, debsqlfill and the various libraries are under edos/software/dependencies. The scripts are under edos/software/anla-scripts.

Organization

The current system comprises the following parts:

  • A MySQL database, currently running on cadet.inria.fr. The database is "edos-debian".
  • A directory where the raw metadata files (Contents.gz for various distributions) downloaded from snapshot.debian.net are stored. This is currently the /edos/data/snapshot directory on cadet.
  • A crontab entry that calls a script, download.sh, every day at 23:45 Paris time. The script downloads the "metadata du jour" from snapshot.debian.net by calling debsqlfill with the -download option; debsqlfill then uses the Netclient module to download the data.
  • The anla server, which is actually a single process that runs on cadet, port 9998, under the user "anla", in /home/anla.
  • The /edos/data/download.sh script
  • The /edos/data/fill.sh script
  • The /home/anla directory contains:

    File               Purpose
    anla.opt           The natively compiled executable.
    run.sh*            Script for properly killing and relaunching anla with the correct port number.
    history-cache      The marshalled file anla uses to speed up loading of the database. This is an OCamlized version of the state of the SQL database.
    httpd.config       The configuration file for anla.
    httpd.log          The log file for anla. This can get quite big quite quickly.
    installable.msh    The marshalled file in which computed installability data is written. This file takes ages to compute and thus is very precious.
    recompute.config   An alternate configuration file for launching a second copy of anla for the purposes of updating.
    recompute.log      The log file for the second copy.
    default.css        The CSS stylesheet.
    edos.png           The EDOS logo.
    red-1x1.png        1x1 pixel PNG file for drawing lifetables.
    green-1x1.png      1x1 pixel PNG file for drawing lifetables.
    yellow-1x1.png     1x1 pixel PNG file for drawing lifetables.
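
The nightly download entry mentioned in the list above could look roughly like the line printed by the sketch below. This is an illustration, not the entry actually installed on cadet: the cron_entry helper is hypothetical, and the schedule assumes that cadet's system clock is set to Paris time, since cron evaluates its fields in the system's local time zone.

```shell
#!/bin/sh
# Hypothetical helper: prints a crontab line matching the schedule described
# in this document (every day at 23:45). Assumes the machine's local time zone
# is Paris time.
DOWNLOAD_SCRIPT="/edos/data/download.sh"   # path given in this document

cron_entry() {
    # crontab fields: minute hour day-of-month month day-of-week command
    printf '45 23 * * * %s\n' "$DOWNLOAD_SCRIPT"
}
```

One possible way to install it is ( crontab -l 2>/dev/null; cron_entry ) | crontab - which appends the line to the crontab of the current user. Which user owns the real entry is not stated here.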

Filling the database

This part has been automated. The procedure for adding new data to the database and getting anla to display lifetables for the extended part follows.

  • First, we need to fill the SQL database.
    • Enter the SQL database, for example by launching edosdeb.sh
    • Type
select max(day) from lifetimes;
to get the latest date.
    • Enter the directory /edos/data
    • Type
./debsqlfill.opt -start START_DATE -fill
    • This will start filling the SQL database. It takes about 4 to 5 minutes per day.
This procedure is handled by the "fill.sh" script.
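
The manual steps above can be chained in a few lines of shell. The following is a minimal sketch of that idea and not the contents of the real fill.sh. It assumes that the mysql client finds its credentials in ~/.my.cnf, that the day column prints as a YYYY-MM-DD string, and that START_DATE is the date returned by the query (whether it should be that date or the following day is not stated here). The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of the steps fill.sh automates; the real script may differ.

DB_NAME="edos-debian"   # database name given in this document
DATA_DIR="/edos/data"   # directory containing debsqlfill.opt

# Print the most recent day already present in the lifetimes table.
# --batch and --skip-column-names make mysql print the bare value only.
latest_day() {
    mysql --batch --skip-column-names \
        -e 'select max(day) from lifetimes;' "$DB_NAME"
}

# Succeed only if the argument looks like YYYY-MM-DD (assumed date format).
# This guards against an empty result or the string NULL from an empty table.
is_iso_date() {
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the fill command for a given start date without running it.
fill_command() {
    printf './debsqlfill.opt -start %s -fill\n' "$1"
}

# Run the fill from DATA_DIR, starting at the given date.
run_fill() {
    is_iso_date "$1" || { echo "bad start date: $1" >&2; return 1; }
    ( cd "$DATA_DIR" && ./debsqlfill.opt -start "$1" -fill )
}
```

With these definitions, run_fill "$(latest_day)" performs the query and the fill in one step, and fill_command can be used to inspect the command line beforehand.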

Updating the cache and solving the installability

We now need to launch a copy of the anla process, which we call the recomputer, to:

  1. reload the SQL data,
  2. recompute the installability,
  3. save the SQL data cache into history-cache,
  4. save the installability information into installability.msh.
Note that whereas reloading the database from SQL takes only a few minutes, recomputing installability for a few months can take days. The installability.msh file is therefore extremely precious and must be backed up.

The procedure is therefore as follows:

  • Become anla
  • Let DATE be today's date string (like 2006-05-17).
  • Do
mkdir -p /home/anla/backups/DATE/
cp installability.msh /home/anla/backups/DATE/
cp history-cache /home/anla/backups/DATE/
  • Launch ./recompute.sh
  • This script should recreate fresh installability.msh and history-cache files.
  • Relaunch anla with ./run.sh
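
The backup step of the procedure above can be wrapped in a small function so that the recomputation never starts without a fresh copy of the two files. This is a sketch under assumptions, not an existing script: the function name backup_anla_state is hypothetical, and the anla home directory is passed as a parameter so the function can be tried on a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: copy installability.msh and history-cache from the
# given anla home directory into backups/DATE/ beneath it.
# Usage: backup_anla_state HOME_DIR DATE
backup_anla_state() {
    anla_home=$1
    backup_dir="$anla_home/backups/$2"

    # -p creates missing parent directories and does not fail if the
    # directory already exists.
    mkdir -p "$backup_dir" || return 1

    for f in installability.msh history-cache; do
        # -p preserves timestamps, so the age of each backup stays visible.
        cp -p "$anla_home/$f" "$backup_dir/" || return 1
    done

    # Print where the backup went, for logging.
    printf '%s\n' "$backup_dir"
}
```

Run as the anla user from /home/anla, the whole procedure could then be chained as backup_anla_state /home/anla "$(date +%Y-%m-%d)" && ./recompute.sh && ./run.sh so that a failed backup stops the recomputation before it can overwrite anything.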
Version 1.7 last modified by RobertoDiCosmo on 17/05/2006 at 23:02

Creator: Berke on 2006/05/17 10:07
Copyright EDOS Consortium