January 21, 2021 | by Hannah Storch
For cultural heritage professionals, metadata provides invaluable descriptive information about an object or resource, but it is also time-consuming to accumulate. Creating and maintaining metadata for a collection is an integral part of taking cataloging one step further and creating a digital collection. Metadata provides context for an item within a collection and can either be embedded in the digital file at or after the time of creation or maintained in a centralized location such as a database, DAM, CMS, or spreadsheet. Using RAW rapid capture imaging with metadata embedding capabilities and DT PixelFlow, Pixel Acuity has been able to automate much of the metadata creation process and create workable metadata formats for institutions. This both reduces the cost of providing metadata services and increases our accuracy and flexibility in delivering them.
For institutions such as libraries and archives, metadata can be categorized into four basic types: administrative, descriptive, preservation, and technical. Administrative metadata provides the provenance information necessary to understand a resource, such as past ownership and where the resource came from. Descriptive metadata describes a resource, its context, and its identifying characteristics so that people can search for and locate the asset using subjects and keywords. Preservation metadata is the conservation information that can be used to protect the original resource from deterioration or degradation. Finally, technical metadata is the information about the digital file that allows the resource to be identified. When collecting metadata, institutions face limited staff, time, and funding, and often have to choose between feasible or “good enough” metadata and comprehensive metadata. At Pixel Acuity, we are able to use our DT PixelFlow scripting to alleviate some of the burden of that choice, offering options for embedding and generating metadata to create and enhance institutional records.
Since technical metadata is information about the digital asset that is created during the collection imaging process, we are able to capture that information and embed it in the file derivatives themselves. Embedding this information ensures that it will not be lost, automatically links the data and the metadata, and ensures that the image and the metadata will be updated together. Typically this information is linked with the TIFF derivative file after being captured in the RAW, but with DT PixelFlow we are able to embed it in almost all derivative formats, including TIFFs, JPEGs, PDFs, and PDF/As. Technical metadata can also be extracted and put into a spreadsheet, which can simplify data management and facilitate search and retrieval. Using DT PixelFlow, we are able to generate spreadsheets containing this information in multiple formats, including Dublin Core, based on client preference. This lets the data be stored and retrieved in a way that matches the institution’s own Digital Asset Management System (DAMS) or other method of organizing collection information.
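To make the spreadsheet step more concrete, here is a minimal Python sketch of turning technical metadata into a Dublin Core style CSV. It is not DT PixelFlow code and does not reflect its scripting interface; the input keys (filename, mime_type, capture_date, file_size_bytes) and the column mapping are assumptions chosen for illustration, and a real mapping would follow the client's own schema.

```python
import csv
import io

# Hypothetical mapping from technical metadata keys to Dublin Core
# element names. The input keys are assumptions for this sketch.
DC_MAPPING = {
    "filename": "dc:identifier",
    "mime_type": "dc:format",
    "capture_date": "dc:date",
    "file_size_bytes": "dcterms:extent",
}


def to_dublin_core_csv(records):
    """Return CSV text with one row per asset and Dublin Core column headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(DC_MAPPING.values()),
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Missing values are written as empty cells rather than raising.
        writer.writerow({dc: record.get(key, "") for key, dc in DC_MAPPING.items()})
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {
            "filename": "box01_folder02_0001.tif",
            "mime_type": "image/tiff",
            "capture_date": "2021-01-21",
            "file_size_bytes": 1024,
        }
    ]
    print(to_dublin_core_csv(sample))
```

Keeping the mapping in a single dictionary means a different target format only requires swapping the column names, not rewriting the export logic.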
Along with creating spreadsheets and ingesting metadata, DT PixelFlow can be used to enrich existing records or create a more holistic record, combining information from original inventories and documents with new information obtained at the time of capture and digital content creation. This includes the generation of basic descriptive metadata such as an object type or category, transcription of annotations on an item or its container, folder, or box, and non-subjective evaluations such as page count. More advanced descriptive metadata can require both organization-specific and subject-specific expertise. While it is not possible for us to offer such subject-specific expertise for every collection that we digitize, we are able to use and enhance records created by specialists. If a client is able to provide us with existing information, such as an inventory spreadsheet or an XML version of a finding aid, we can extract information from that source and create a new record that combines it with the technical information about the digital asset obtained during the imaging process. This allows us to combine records to give a more comprehensive understanding of a resource, its place within a collection, and how it relates to the digital asset.
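The record-combining idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how DT PixelFlow does it: the function name merge_records and the field names (box, folder, title, filename, width_px) are hypothetical, and the sketch assumes that inventory rows and technical rows share box and folder values that can be used as a join key.

```python
def merge_records(inventory_rows, technical_rows):
    """Combine folder-level inventory data with per-asset technical data.

    Both inputs are lists of dicts. Rows are matched on the (box, folder)
    pair, which is an assumed join key for this sketch.
    """
    # Index the specialist-created inventory by physical location.
    by_location = {(row["box"], row["folder"]): row for row in inventory_rows}

    merged = []
    for tech in technical_rows:
        descriptive = by_location.get((tech["box"], tech["folder"]), {})
        # Technical values win on key collisions; assets with no inventory
        # match keep only their technical fields.
        merged.append({**descriptive, **tech})
    return merged


if __name__ == "__main__":
    inventory = [
        {"box": "1", "folder": "2", "title": "Correspondence", "dates": "1950-1955"}
    ]
    technical = [
        {"box": "1", "folder": "2", "filename": "b01_f02_0001.tif", "width_px": 6000}
    ]
    for row in merge_records(inventory, technical):
        print(row)
```

Each output row carries the descriptive context written by the institution's specialists alongside the technical details captured at imaging time.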
Using our DT PixelFlow scripting, we are able to automate the often arduous processes of metadata embedding and creation, minimizing cost and labor for the institution and allowing it to focus on other aspects of digital collection creation. In just one example, prior to digitization, one of our clients had a cataloging inventory with folder-level information that listed the location of the assets, such as box and folder number, as well as descriptive information including title, dates, location, collection, and series. Not only were we able to use our tools to embed that information into the files, effectively linking the original object to the digital asset, but we were also able to generate the filename of each digital asset, the number of assets in each folder, and checksum lists and titles for each folder. This additional information was added to the original inventory, giving a more holistic view of the original item and the digital asset within the collection and linking the two within our client’s DAMS.
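As a rough illustration of the generated values mentioned in this example, the Python sketch below walks a directory of derivatives and produces a checksum list and a per-folder asset count. It is a generic standard-library sketch, not DT PixelFlow scripting; the directory layout (one subdirectory per box and folder), the choice of SHA-256, and the function names are assumptions.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large image files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def folder_manifest(root):
    """Return (checksums, counts) for every file under root.

    checksums maps each file's path relative to root to its SHA-256 digest.
    counts maps each containing folder (relative to root) to its file count.
    """
    root = Path(root)
    checksums = {}
    counts = Counter()
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root)
            checksums[relative.as_posix()] = sha256_of(path)
            counts[relative.parent.as_posix()] += 1
    return checksums, dict(counts)


if __name__ == "__main__":
    checksums, counts = folder_manifest(".")
    for name, digest in checksums.items():
        print(digest, name)
    for folder, total in counts.items():
        print(folder, total)
```

The two dictionaries can then be written back into the inventory as additional columns, which is the kind of enrichment described in the example above.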
Pixel Acuity is also at the forefront of leveraging artificial intelligence to assist metadata workflows. We are working with the Smithsonian Center for Folklife and Cultural Heritage (recipient of our DT ArCHER Grant) to evaluate the effectiveness and accuracy of these methods. This effort deserves its own article, which we will publish later this year. For now, we can say that we don’t expect AI to be a magic wand that replaces expertise, experience, and careful execution, but we do expect it will be an enormously useful tool.
While metadata generation can be costly in terms of time, labor, and resources, it is crucial to capturing the context of items within a digital collection. Metadata is what allows scholars and researchers to search a collection for specific information and allows registrars, cataloguers, and collections managers to organize their data and collection information. With DT PixelFlow automation, we are able to help our clients build integrated metadata records effectively and efficiently, so that they do not have to sacrifice quantity or quality.
We can help your next project be a breeze! Learn more about DT PixelFlow, project planning, additional services, and pricing by contacting us here.