So you have a digital newspaper archive…now what?

Author: Stefan Boddie, Managing Director, DL Consulting Ltd.
Date: 2014-06-11

Way back in the old days (like 2005!), when many libraries were beginning to digitize their newspaper collections and other cultural heritage material, it was all about preservation and access. That is, the key goal was to get the material digitized and store the digital files in a safe place. A second goal was to make the digitized material available and searchable online, making it much more accessible to library patrons than ever before.

Those two goals are still absolutely central to the digitization efforts of most libraries, as they should be and as they will remain.

We don’t think that “just digitizing stuff and putting it online” is good enough anymore, though, and we think most modern libraries agree with us.

The next steps — patron engagement

So what comes next, once your newspaper collection has been digitized and made available online?

With Veridian our vision is to help libraries create online digital newspaper collections that are more than just static sets of content. We work to attract an audience to the collection; we create opportunities for that audience to engage with the content, with the library, and with each other; and we work with the library to develop and maintain the collection over time.

That part about developing and maintaining the collection over time is something we think is really important. In the past many cultural heritage collections have been built, at significant cost, but once “finished” have been left to languish in the outer reaches of the internet. The content of these collections is often fantastic, but they attract relatively few visitors, and over time they begin to look outdated and neglected.

With Veridian we never consider a digitization project to be “finished”. Getting it online is just the first step. After that the collection needs to be nurtured and maintained, it needs to attract visitors, and it needs to encourage those visitors to really engage with the content.

We don’t just want a trickle of users passing by consuming content — we want a community of repeat visitors who care about and contribute to the collection.

How to encourage patron engagement?

We’re working hard to develop features to encourage online patrons to engage at a much deeper level with Veridian-based digital heritage collections.

The first step is to maximize visitor numbers by making it easy for users to find the collection. The best way to do that is to ensure it’s indexed properly by the big search engines like Google. Search Engine Optimization (SEO) is discussed in some depth in other articles on this site — see the related reading section at the bottom of this page for links.

Once you’ve maximized your audience you want to engage those visitors and provide something to keep bringing them back for more. Ideas to help do that include the following.

Tags, comments, private lists, and social media

Many of these features are designed to encourage users to contribute and give something back to an online collection, in addition to just acting as content consumers. They also allow users to keep track of and organize the items that interest them right in the Veridian user interface, and to share interesting items with their social media networks.

Crowdsourcing activities

Veridian’s User Text Correction (UTC) module allows users to correct OCR errors as they come across them in the text. For digitized newspaper collections, which usually contain many OCR errors, this can work surprisingly well.

This page primarily discusses UTC as it’s used for digitized newspapers, but the same tools can be used in different ways for many types of collections. Applications for this type of technology are limited only by our imaginations. Some ideas include the following:

  • Crowdsourced transcription of handwritten letters and manuscripts.
  • Innovative crowdsourcing projects to identify people or places in photograph collections, or categorize large collections of mixed archival material.

Do you have a cool idea for a crowdsourcing project? Contact us to discuss.

A good example of a newspaper digitization project for which UTC has been very successful is the California Digital Newspaper Collection (CDNC). In three years the CDNC has built a community of more than 2,000 registered users, and collectively that community has corrected more than 2.5 million errors.

At first glance the benefits of crowdsourced UTC might seem obvious: the collection is gradually improved as errors are corrected. In reality, though, the corrections are just a fortunate by-product. The real benefit for the library is the creation of an engaged online community around its collection. The effects of that include:

  • A more engaged user base — more repeat visitors, increased average duration of visits, increased total visits, etc.
  • Opportunities to communicate and engage with your online patrons. Interested in learning more about who uses the collection, what they use it for, what their priorities are for future digitization, or what they do/don’t like about the collection? Go ahead and ask your registered users!