EMiL container

Tuesday, June 07th, 2016

One of the projects we are currently working on is the DFG-funded EMiL project (Emulation of Multimedia Objects in Libraries). Its main goal is to provide our EaaS framework for libraries to use in their reading rooms. While this project has also spawned our USB live systems, its latest outcome is a Docker container that allows (comparatively) easy access to emulation for everyone who wants to run born-digital objects.

The container relies on three auxiliary data sources:

  1. An image archive (it contains disk images of operating systems)
  2. An object archive (e.g. CD-ROMs, floppy disks, etc.; these are the objects you actually want to access)
  3. A directory of environment descriptions (they are merely meta-metadata to track environments from the image archive)

Image Archive

The image archive is one of the core components of our EaaS software and has been for several years. It contains the disk images that the emulators require in order to boot a virtual machine, along with all the metadata the emulator needs to do so. There has already been another blog post about the image archive and how to set it up for use in a container, so I’ll refer you to that.

For the impatient, there’s an image archive with free images available for download. Just run these commands:

# Download and unpack the archive of free images
curl -O http://bw-fla.uni-freiburg.de/image-archive.tgz
tar xf image-archive.tgz
# Link the base images into nbd-export/
cd image-archive/nbd-export/
ln -s ../images/base/doom.raw
ln -s ../images/base/hatari_tOS206us.img
ln -s ../images/base/qemu-i386-DOS_6.20_CDROM.raw
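
Before moving on, it can be worth checking that the links created in nbd-export/ actually resolve. The following Python sketch is not part of EaaS; the function name broken_links is made up for illustration, and it assumes nothing beyond the directory layout produced by the commands above.

```python
from pathlib import Path


def broken_links(export_dir):
    """Return the names of symlinks in export_dir whose targets are missing."""
    return sorted(
        entry.name
        for entry in Path(export_dir).iterdir()
        # exists() follows the link, so it is False for a dangling symlink
        if entry.is_symlink() and not entry.exists()
    )
```

Calling broken_links("image-archive/nbd-export") should give an empty list if all three images were unpacked into images/base/.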

Object Archive

The object archive recently had its own article on this blog and contains the objects you want to access. The prepared Docker image (wait for it!) has a preconfigured file backend for the actual data. The file structure is pretty simple:

object-archive/
 |- object1/
 |   \ iso/
 |      |- disk1.iso
 |      \  disk2.iso
 |- object2/
 |   \ floppy/
 |      |- disk1.raw
 |     ...
...

As you can see, each object is represented by its own subdirectory. The directory’s name doubles as the ID that is used later to access the object. Inside this directory, there is either an iso/ subdirectory, a floppy/ subdirectory, or both. The actual images (CD-ROM images or floppy images) are located in the subdirectory matching their media type. And that’s basically it: just copy your objects into a directory structure like the one above and your object archive is ready.
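
To double-check a freshly copied archive against this layout, a small script can help. What follows is only a sketch in Python, not part of the EMiL tooling: the function names list_objects and objects_without_media are invented here, and the only thing taken from the description above is the convention of iso/ and floppy/ subdirectories.

```python
from pathlib import Path

# Media-type subdirectories named in the layout description
MEDIA_DIRS = ("iso", "floppy")


def list_objects(archive_root):
    """Map each object ID (its directory name) to the image files per media type."""
    objects = {}
    for obj_dir in sorted(Path(archive_root).iterdir()):
        if not obj_dir.is_dir():
            continue  # stray files at the top level are not objects
        media = {}
        for kind in MEDIA_DIRS:
            kind_dir = obj_dir / kind
            if kind_dir.is_dir():
                media[kind] = sorted(p.name for p in kind_dir.iterdir() if p.is_file())
        objects[obj_dir.name] = media
    return objects


def objects_without_media(objects):
    """Return the IDs of objects that have neither an iso/ nor a floppy/ subdirectory."""
    return [obj_id for obj_id, media in objects.items() if not media]
```

Running objects_without_media(list_objects("object-archive")) lists the directories that do not yet follow the structure.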

For those using Rosetta, Marcus Bitzl (Bayerische Staatsbibliothek) has implemented a Rosetta-Object-Archive adapter (https://github.com/emil-emulation/emil-rosetta), which allows objects to be retrieved directly from the Rosetta repository.

Environment Descriptions

Within our EMiL project, it became clear that the partnering institutions have little interest in the technical metadata of our EaaS framework itself. This is especially true for our emulation environments, which contain all the metadata necessary for re-enacting a virtual computer system. Information such as memory or which hardware bus a drive is connected to is of marginal interest at best if all you want is to run a single multimedia object. Consequently, the institutions want a more abstract environment description that specifies the software environment, i.e. the operating system and installed software such as a PDF viewer, word processors and so on. This metadata is stored separately from our technical emulation environments, in the environment descriptions.

An environment description may look like this:

{ 
        "envId":"4404",
        "title":"Windows 98 (SE)",
        "description":"Windows 98 (SE)",
        "os": "Microsoft Windows 98 Second Edition",
        "version": "01052016",
        "emulator": "VirtualBox"
}

The envId field refers to our classical emulation environments, which hold information like the hard disk image, available drives for removable media, CPU type and so on. The other fields in the environment descriptions are purely logical metadata. They are extensible and can identify, for example, the operating system, installed software and any other information that the institution deems necessary or useful for managing its software environments. Curators or users can then select an appropriate environment according to their requirements, for instance a Windows platform with MS Office installed because the documentation for the multimedia object is provided as a .doc file.
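
To make the selection step a bit more concrete, here is a rough Python sketch of how descriptions could be filtered by their logical metadata. It is an illustration only, not how EMiL does it: the function name matching_environments and the simple substring matching are assumptions, and the second entry in the example data is invented.

```python
def matching_environments(descriptions, **required):
    """Return the descriptions whose fields contain every required substring.

    Matching is case-insensitive; a missing field is treated as an empty string.
    """
    matches = []
    for desc in descriptions:
        if all(
            needle.lower() in str(desc.get(field, "")).lower()
            for field, needle in required.items()
        ):
            matches.append(desc)
    return matches


# Example data: the first entry is the description shown above, the second
# is a hypothetical one added purely for illustration.
descriptions = [
    {"envId": "4404", "title": "Windows 98 (SE)", "description": "Windows 98 (SE)",
     "os": "Microsoft Windows 98 Second Edition", "version": "01052016",
     "emulator": "VirtualBox"},
    {"envId": "0000", "title": "Hypothetical DOS environment",
     "os": "MS-DOS 6.20", "emulator": "QEMU"},
]

windows_envs = matching_environments(descriptions, os="windows")
print([env["envId"] for env in windows_envs])  # prints ['4404']
```

An institution that adds its own fields, say for installed software, could filter on them in the same way.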

In the current implementation, the environment descriptions are simple JSON files with one description per file, all stored in a single directory.
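
Reading such a directory takes only a few lines. The Python sketch below assumes that the files carry a .json extension and that every description has an envId field; neither is guaranteed by the container itself, and the function name load_environments is made up for this example.

```python
import json
from pathlib import Path


def load_environments(env_dir):
    """Read every *.json file in env_dir and index the descriptions by envId."""
    environments = {}
    for path in sorted(Path(env_dir).glob("*.json")):  # assumes a .json suffix
        with path.open(encoding="utf-8") as handle:
            description = json.load(handle)
        if "envId" not in description:
            raise ValueError(f"{path.name}: missing envId")
        environments[description["envId"]] = description
    return environments
```

Calling load_environments("environments") then gives a dictionary keyed by envId, which is handy for spotting malformed files before the container ever sees them.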

The container

Finally, we come to the actual container. You should have three directories by now: image-archive/, object-archive/ and environments/.
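
If you want to be sure nothing is missing before starting the container, a quick check along these lines may help. This is just a Python sketch; the function name missing_directories is invented, and it assumes all three directories sit next to each other under one base directory.

```python
from pathlib import Path

# The three directories the container expects, as named in this post
REQUIRED_DIRS = ("image-archive", "object-archive", "environments")


def missing_directories(base="."):
    """Return the required directory names that do not exist under base."""
    return [name for name in REQUIRED_DIRS if not (Path(base) / name).is_dir()]
```

An empty result from missing_directories() means all three directories are in place.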

The EMiL container is available from Docker Hub (docker.io) and can be pulled via

docker pull eaas/bwfla:emil

The accompanying run script can be downloaded from the EMiL GitHub repository and is invoked like this:

./run-emil.sh --public-ip-port 132.230.8.226:8080 --archive-dir ./image-archive/ --environments-dir ./environments/ --objects ./object-archive/

The IP address should be that of the machine on which you start the container. Besides port 8080, which is used to communicate with the JBoss application server, the container also opens port 1080 to provide access to our new UI. We have developed two shiny new JavaScript-based UIs. The admin UI, where you can test all the available operating system images, can be reached from your browser at http://ip:1080/emil-admin-ui. To actually access a digital object, use http://ip:1080/emil-ui/. This URL will provide you with a list of available objects from the object archive and auto-detect a suitable environment (note: this feature is not available with our publicly available, limited image archive!).
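
For scripts or documentation that need to point at the two UIs, the addresses can be derived from the host IP. The helper below is a tiny Python sketch; ui_urls is a made-up name, and the port and paths are simply the ones mentioned in this post.

```python
def ui_urls(ip, port=1080):
    """Build the admin and access UI addresses for a given host IP."""
    base = f"http://{ip}:{port}"
    return {
        "admin": f"{base}/emil-admin-ui",  # test the available OS images
        "access": f"{base}/emil-ui/",      # list and open digital objects
    }


urls = ui_urls("132.230.8.226")  # the IP used in the run command above
print(urls["admin"])
print(urls["access"])
```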

Category: Uncategorized | Leave a Comment

OPF Hackathon – Day 3

Thursday, November 15th, 2012

The final day of the hackathon once again structured the questions and problems raised during the event so far, and the individual object groups were tackled hands-on once more. In addition, sub-working groups dealt with long-term database preservation, with how future reading-room systems for libraries, archives or museums should be designed, with the challenges facing private archives, and with media preservation.

Access to and Migration of PPT 4.0 in Its Original Environment

Wednesday, November 07th, 2012

For several years, the British Library has been carrying out regular web harvests of the UK domains, storing not only traditional web pages but also a large amount of other linked content. This includes, for example, PowerPoint files in early versions and Real Media audio streams. These file formats can no longer be played with today's programs. This, together with the question of how to make the content visible again and provide it to users, triggered a lively discussion, for example on Ch. Rusbridge's blog.

British Library and Long-Term Digital Access

Tuesday, September 18th, 2012

The British Library is one of the central institutions in Europe dealing intensively with the problem of long-term digital access. The library participates in several EU projects on the topic, such as SCAPE, because it holds a very broad range of digital objects, including computer games of British origin.

Rosetta Advisory Group Meeting in Hannover

Tuesday, July 17th, 2012

The meeting brings together the users of the Rosetta software: libraries and archives from almost all parts of the world that use this product to fulfil their long-term archiving tasks. Ex Libris is one of the vendors of long-term archiving solutions; others include IBM and Tessella. So far, the software and its associated workflows have concentrated on migration scenarios. Most participants at the meeting, however, focus on access, which means that further strategies such as emulation will play a role in the medium term.