What is METS/ALTO?

Author: Stefan Boddie, Managing Director, DL Consulting Ltd.
Date: 2014-07-19

METS (Metadata Encoding and Transmission Standard) and ALTO (Analyzed Layout and Text Object) are XML standards maintained by the Library of Congress.

The METS standard is a flexible schema for describing a complex digital object (such as a digitized newspaper issue). METS describes the structure of the object but does not encode its textual content. The ALTO standard fills this gap by encoding the textual content of a digitized page in great detail, including styles and layout. As well as encoding the digitized text itself, ALTO encodes the spatial coordinates of every column, line, and word as it appears on the page.
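
To make the idea of text plus coordinates concrete, here is a minimal sketch in Python that reads the words and their bounding boxes out of an ALTO file. The XML fragment is hand-written and heavily simplified for illustration (it is not taken from a real project and has not been validated against the schema), it assumes the ALTO version 2 namespace, and the function name words_with_boxes is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the ALTO v2 namespace; other ALTO versions
# use a different namespace URI.
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v2#"}

# Hand-written, simplified ALTO fragment for illustration only.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1" WIDTH="2400" HEIGHT="3600">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="100" VPOS="200" WIDTH="900" HEIGHT="60">
          <TextLine HPOS="100" VPOS="200" WIDTH="900" HEIGHT="60">
            <String CONTENT="Daily" HPOS="100" VPOS="200" WIDTH="180" HEIGHT="60"/>
            <SP WIDTH="20"/>
            <String CONTENT="News" HPOS="300" VPOS="200" WIDTH="160" HEIGHT="60"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def words_with_boxes(alto_xml):
    """Return (text, hpos, vpos, width, height) for every String element."""
    root = ET.fromstring(alto_xml)
    words = []
    # Each String element holds one word plus its position on the page.
    for string in root.iterfind(".//alto:String", ALTO_NS):
        words.append((
            string.get("CONTENT"),
            float(string.get("HPOS")),
            float(string.get("VPOS")),
            float(string.get("WIDTH")),
            float(string.get("HEIGHT")),
        ))
    return words


words = words_with_boxes(SAMPLE_ALTO)
for text, hpos, vpos, width, height in words:
    print(text, hpos, vpos, width, height)
```

Because every word carries its own position and size, software built on ALTO can do things like highlight a search hit directly on the page image.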

The combination of METS and ALTO (often written METS/ALTO) is the current industry standard for newspaper digitization, used by hundreds of modern, large-scale newspaper digitization projects (and lots of smaller projects too!). A very small sample of projects using METS/ALTO is listed below.