Illinois Digital Newspaper Collections convert from Olive ActivePaper to Veridian

Author: Stefan Boddie, Managing Director, DL Consulting Ltd.
Date: 2014-12-17

In September 2012 the University of Illinois at Urbana-Champaign formed a Newspaper Delivery and Preservation Working Group to discuss the sustainability of the Library’s repository architecture for preserving and providing access to its digital newspaper collections. The working group evaluated the Olive ActivePaper Archive (APA) software then used for the Illinois Digital Newspaper Collections and decided it was not meeting the needs of the Library and its users. After some consideration the group recommended migrating to Veridian.

The digitized newspaper collections at the University of Illinois consisted of approximately 900,000 pages in the PR XML format produced by or for the Olive APA platform, as well as approximately 200,000 pages in the METS/ALTO format produced for the National Digital Newspaper Program (NDNP). The PR XML data had article segmentation (i.e. individual articles had been identified), while the METS/ALTO data did not.

Veridian staff developed a process to transform all the PR XML data to METS/ALTO while retaining the article segmentation and all available text and metadata. Following that work, the University of Illinois now has all of its digitized newspapers in the same standards-compliant METS/ALTO format. For the first time, both the older (previously Olive-based) newspaper collections and the NDNP newspaper collections can be made available from the same platform.
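
For readers unfamiliar with the target format, the output side of such a conversion might look roughly like the sketch below. It is an illustration only, not Veridian's conversion code: the PR XML input layout is not described here, so the sketch starts from a hypothetical list of word boxes, and the function name words_to_alto and the block ID are assumptions. It builds a minimal ALTO fragment in which each word becomes a String element carrying its text and coordinates. In METS/ALTO, article segmentation would typically be recorded by having the METS file group such block IDs into articles.

```python
import xml.etree.ElementTree as ET

# ALTO version 2 namespace (assumed here; other ALTO versions use other URIs).
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v2#"


def words_to_alto(block_id, words):
    """Build a minimal ALTO tree holding one text block.

    `words` is a hypothetical input: a list of
    (text, x, y, width, height) tuples for the words of one article block.
    """
    ET.register_namespace("", ALTO_NS)
    alto = ET.Element(f"{{{ALTO_NS}}}alto")
    layout = ET.SubElement(alto, f"{{{ALTO_NS}}}Layout")
    page = ET.SubElement(layout, f"{{{ALTO_NS}}}Page", ID="P1")
    space = ET.SubElement(page, f"{{{ALTO_NS}}}PrintSpace")
    # The block ID is what a METS file could reference to group blocks into articles.
    block = ET.SubElement(space, f"{{{ALTO_NS}}}TextBlock", ID=block_id)
    # For simplicity, all words are placed on a single text line.
    line = ET.SubElement(block, f"{{{ALTO_NS}}}TextLine")
    for text, x, y, w, h in words:
        ET.SubElement(
            line,
            f"{{{ALTO_NS}}}String",
            CONTENT=text,
            HPOS=str(x),
            VPOS=str(y),
            WIDTH=str(w),
            HEIGHT=str(h),
        )
    return alto


if __name__ == "__main__":
    tree = words_to_alto("TB1", [("Hello", 10, 20, 50, 12), ("world", 65, 20, 48, 12)])
    print(ET.tostring(tree, encoding="unicode"))
```

A real conversion would also need to carry over page dimensions, line breaks, and issue-level metadata, none of which this sketch attempts.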

The upgrade to Veridian proved very popular right from the start. After only a few months the new site was receiving more than ten times as many visitors per month as the old Olive site had received in an entire year! The new site had also already attracted more than 100 registered users who were actively correcting OCR errors with Veridian’s User Text Correction (UTC) module.