Data Format

Synchronising two sets of data

When two (or more) people are updating the data files at the same time, there needs to be some mechanism for keeping the data synchronised. Unfortunately, for a variety of reasons, it is impractical to share the files themselves (via some sort of online Check In/Check Out mechanism), so the only viable option is for the data sets to be updated separately, with "change notices" exchanged between the relevant parties. Both Control Files and Data Files need to be synchronised, and a different approach is adopted for each, as discussed below.

Control Files

Of the five control files only four currently need to be synchronised - all cover scans are held on Phil's Galactic Central website so all updates to Covers.cvt are done by Phil with new copies of the file disseminated to other users as needed.

For each of the other four files, the program supports the use of a pair of files:

These are combined at run-time by all relevant programs. Between synchronisations, all changes should be recorded in the relevant xxx.new file, each of which contains three mandatory sections:

**** xxx.CVT Additions ****

This should contain any new entries to be added to the xxx.CVT file in the usual format for that file.

**** xxx.CVT Corrections ****

This should contain any corrected/updated entries for the xxx.CVT file with the original entry (with a leading ! character) followed by the corrected version (in the usual format for that file), as in:

!Alice [Lewis Carroll]~0~Carroll, Lewis/*~en.wikipedia.org/wiki/Works_based_on_Alice_in_Wonderland~
Alice [Lewis Carroll]~0~Carroll, Lewis$1/*~en.wikipedia.org/wiki/Works_based_on_Alice_in_Wonderland~

Note that the entry being changed should also be deleted from the associated xxx.CVT file to avoid duplication.

**** xxx.CVT Deletions ****

This should contain just the entries being deleted from the xxx.CVT file (each with a leading ! character, as in the previous example). The entry should also be deleted from the associated xxx.CVT file.

Additional sections may be included in the .NEW files for local purposes (e.g. to record which entries have been deleted to ensure they don't accidentally reappear) but these are typically not exchanged during synchronisation.

Data Files

The approach used for Control Files would quickly become unmanageable if applied to data files, so a different approach is used.

Two parallel file structures should be maintained for the magazine data files:

After a full synchronisation, the contents of these two folder trees will be identical. Thereafter, when one person wishes to send a formal set of data updates to the other they should proceed as follows:

There will be times when a full list of changes is inappropriate (e.g. when creating a new file, or significantly overhauling/tidying an existing file). Hopefully these will be uncommon, and each changed file should at least have a covering note explaining what has been changed.

On receipt of a set of changes, a similar process to the above can then be followed:

There is still scope for exchanging "ad hoc" updates (e.g. comments on the other's set of updates or author disambiguations), but these should not affect the above process: the next set of synchronised updates will still include any ad hoc updates made since last time. This promotes stability, because if either side misapplies (or forgets to apply) an ad hoc update, it will be picked up the next time the file is synchronised.

Support Files, Validations & Full Synchronisations

Historically, no process has been established for synchronising the support files (other than 00000.xxx and 00xxx.mag which are treated as data files) - I used to send updated versions to Bill but he never sent any to me. However, it is always desirable to do a full validation after a synchronisation and this will identify any issues with valnames.xxx and validate.xxx. Moving forward, this may need revisiting.

Despite the best intentions, mistakes still creep in so, in addition to doing a full validation after each synchronisation, a full synchronisation should also be done periodically (i.e. a complete set of files exchanged and compared to ensure they are the same).
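The comparison step of a full synchronisation could be scripted along the following lines. This is a minimal sketch using Python's standard filecmp module, not an existing tool from this project; the function name compare_trees is hypothetical, and it assumes each person's complete set of files has been unpacked into its own folder tree.

```python
import filecmp
import os


def compare_trees(left, right, prefix=""):
    """Return a sorted list of (relative path, problem) for two folder trees."""
    problems = []
    cmp = filecmp.dircmp(left, right)
    for name in cmp.left_only:
        problems.append((os.path.join(prefix, name), "only in first tree"))
    for name in cmp.right_only:
        problems.append((os.path.join(prefix, name), "only in second tree"))
    for name in cmp.common_funny:
        problems.append((os.path.join(prefix, name), "type differs"))
    # shallow=False forces a byte-by-byte comparison rather than relying on
    # file size and modification time alone.
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, cmp.common_files, shallow=False)
    for name in mismatch + errors:
        problems.append((os.path.join(prefix, name), "contents differ"))
    # Recurse into folders that exist on both sides.
    for name in cmp.common_dirs:
        problems.extend(compare_trees(os.path.join(left, name),
                                      os.path.join(right, name),
                                      os.path.join(prefix, name)))
    return sorted(problems)
```

An empty result means the two trees hold the same files with the same contents. Note that dircmp skips a few names by default (such as CVS and .git folders), which should not matter for the data files described here.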