Author: Hannah Storch
History in the Making: A Case Study of AI Application
This piece was written by Hannah Storch.
“She gave Black people an opportunity to look at themselves on a big screen as something beautiful when all that was there before spoke to our degradation. In her we found another dimension to being black in our time.” – Harry Belafonte of Etta Moten Barnett
Etta Moten Barnett: A HistoryMaker
Etta Moten Barnett was an African American singer and actress known for her legendary roles in Sugar Hill (1931) and Porgy and Bess (1942) on Broadway as well as in the film Flying Down to Rio (1933) and in many other notable theatrical performances. Along with her husband Claude Barnett, founder of the Associated Negro Press, Etta Moten was involved in many philanthropic efforts in the United States and Africa and served as a pillar of her community.
The HistoryMakers is an organization that documents and promotes stories like Etta Moten’s to create a more inclusive record of American history, and it has been recording African American oral histories since 1999. This one-of-a-kind collection is housed permanently at the Library of Congress and contains almost 3,400 video oral history interviews of African American leaders from 413 cities and towns, Mexico, the Caribbean, and Norway across a variety of disciplines – including the arts, business, civic engagement, education, entertainment, law, the media, and medicine. With education as its mission, The HistoryMakers website and digital archive provide an unprecedented and irreplaceable record of African American lives, accomplishments, and contributions through unique first-person testimony.
With the help of Digital Transitions’ service division Pixel Acuity, The HistoryMakers are now leveraging digital images of items from the personal collections of their HistoryMakers, including from the collection of Etta Moten and Claude Barnett, to construct a more holistic digital record of their lives. At Pixel Acuity we have been able to use our integrated scanning processing pipeline and Artificial Intelligence (AI) application to assist in the telling of these influential stories by providing enhanced digital assets that promote accessibility, prompt research, and provide valuable visual context.
Photograph of Etta Moten Barnett courtesy of The HistoryMakers
Initial Pilot Project
The pilot project for this digitization program drew on Etta Moten and Claude Barnett’s personal collection and consisted of several scrapbooks as well as photographic prints, correspondence, newspapers, and newspaper clippings relating to their lives. Personal collections such as this one pose a unique challenge for full-scale digitization and cataloguing efforts, as there is often little to no existing metadata describing the contents. What little information there is relies heavily on the personal knowledge of the individuals cataloguing the items. Given our extensive experience dealing with personal collections and the advice of the Digitization Committee for this mass-digitization project, Pixel Acuity took a “Digitize First, Process Second” approach: we used rapid capture photography to image the materials first and then used AI to derive item-level descriptive metadata from the images themselves.
By using Digital Transitions’ scanning hardware and our RAW rapid capture workflow, we were able to preserve all of the information recorded by the system at the time of capture, avoid losing any information to compression, and create high-quality digital images to integrate into our processing pipeline. This created an ideal platform for AI implementation and image analysis. We also used our proprietary Optical Character Recognition (OCR) software to create searchable transcription records of the correspondence and newspapers so that researchers and the general public could learn more about Etta Moten and her life through keyword and subject searches.
President Dwight D. Eisenhower and Claude Barnett as identified by PixelFlow image analysis.
Artificial Intelligence Application
Along with creating digital preservation-grade images of the collection, our DT PixelFlow software uses computer vision artificial intelligence to describe an item’s material, text, and image content through material analysis, text analysis, and image analysis respectively. For the Etta Moten Barnett Collection, image analysis was the most crucial of the three, as there was little descriptive information or contextualization for the photographs that made up the majority of the collection. Using our AI capabilities, we provided object recognition, keyword extraction, and customized notable person detection. For this project, we applied our software to the collection in order to extract keywords and context from the photographic prints and to recognize Etta Moten, Claude Barnett, and other notable individuals from their lives throughout the collection. Unlike generic facial detection options on the market, our notable person detection cross-references the approximate age of the individual, their dates of birth and death, and the approximate date range of the physical collection to confirm that a detected face is a plausible identification.
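The plausibility check described above can be illustrated with a small sketch. This is not DT PixelFlow’s actual logic, which the article does not detail; the `Candidate` fields, the `is_plausible` function, the ten-year tolerance, and the 1930 to 1960 collection range are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """A hypothetical face-match candidate; all field names are illustrative."""
    name: str
    birth_year: int
    death_year: Optional[int]  # None if the person is still living
    apparent_age: int          # age estimated from the detected face


def is_plausible(c: Candidate, earliest: int, latest: int, tolerance: int = 10) -> bool:
    """Return True if the person could appear at `apparent_age` in a photo
    taken between `earliest` and `latest` (inclusive years)."""
    # The person must have been alive at some point in the collection's date range.
    last_alive = c.death_year if c.death_year is not None else latest
    if c.birth_year > latest or last_alive < earliest:
        return False
    # Ages the person could have had within the overlap of lifespan and date range.
    youngest = max(earliest, c.birth_year) - c.birth_year
    oldest = min(latest, last_alive) - c.birth_year
    return youngest - tolerance <= c.apparent_age <= oldest + tolerance


# Eisenhower lived from 1890 to 1969; the collection range here is an assumption.
eisenhower = Candidate("Dwight D. Eisenhower", 1890, 1969, apparent_age=65)
print(is_plausible(eisenhower, earliest=1930, latest=1960))
```

A candidate born after the collection’s latest date, or one whose estimated age is far outside what their birth year allows, would be rejected before a human ever reviews the match.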
Since we understood the richness of Etta Moten’s history and how impactful she was within her community, we wanted to ensure that we were capturing as much information as possible from the digital images, so we implemented two different types of person detection/face recognition. The initial process was Notable Individual Detection, which compared the faces identified within the collection against an existing database of faces of notable people. This proved an excellent source of information for discovering individuals that we didn’t know were in the collection, such as President Dwight D. Eisenhower, American pianist and composer Hadda Brooks, and actor Laurence Olivier.
Knowing that the collection belonged to Etta Moten and Claude Barnett and that they were likely to be in many of the photographic prints, we also utilized our custom-trained facial recognition model. By using several previously identified images of Etta and Claude at the beginning of the project, we were able to train our AI software to recognize their particular faces, and the results were extraordinary. Even within class photos with many difficult-to-distinguish faces, our software was able to recognize Claude Barnett with astounding accuracy. To establish a criterion of confidence in the software’s ability to recognize faces, we also provided an accuracy percentage along with the descriptive metadata. This accuracy percentage served as a confidence rating for The HistoryMakers, who could then manually review any images that fell below a certain threshold to confirm the software’s findings.
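The review step described above, where low-confidence identifications are routed to a person, might look something like the following sketch. The record layout, the filenames, and the 0.90 threshold are assumptions for illustration; the article does not state the threshold that was used.

```python
def split_for_review(detections, threshold=0.90):
    """Partition detections into (accepted, needs_review) by confidence score.

    Each detection is a dict with at least a "confidence" key in the range 0..1.
    """
    accepted, needs_review = [], []
    for d in detections:
        (accepted if d["confidence"] >= threshold else needs_review).append(d)
    return accepted, needs_review


# Made-up sample records; filenames and scores are illustrative only.
detections = [
    {"file": "print_0001.tif", "person": "Claude Barnett", "confidence": 0.98},
    {"file": "print_0002.tif", "person": "Etta Moten Barnett", "confidence": 0.95},
    {"file": "print_0003.tif", "person": "Claude Barnett", "confidence": 0.71},
]

accepted, needs_review = split_for_review(detections)
for d in needs_review:
    print(f"Review manually: {d['file']} ({d['person']}, {d['confidence']:.0%})")
```

Keeping the score alongside each record, rather than discarding low-scoring matches, lets the institution choose its own threshold later.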
Etta Moten and Claude Barnett identified by custom-trained AI notable person detection.
Claude Barnett identified among a crowd of people in a group photo by custom-trained notable person detection.
Identifications of Etta Moten and Claude Barnett through custom-trained AI facial recognition and notable person detection.
Along with discovering keywords and notable people within the collection and providing valuable context to Etta Moten Barnett’s story, our DT PixelFlow software was able to input this information into a workable spreadsheet. By working closely with The HistoryMakers and their website team at ThirdWave LLC., we were able to incorporate any descriptive metadata our software was able to find as well as technical metadata about the digital assets themselves into a Dublin Core spreadsheet that they could easily use with their online platform.
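As a rough illustration of what a Dublin Core spreadsheet export can look like, the sketch below writes a few records to CSV using element names from the Dublin Core Metadata Element Set. The choice of columns, the input field names (`filename`, `keywords`, `people`, `mime_type`), and the sample record are assumptions; the actual layout agreed with The HistoryMakers is not described in this article.

```python
import csv
import io

# A subset of the fifteen Dublin Core elements, chosen here for illustration.
DC_FIELDS = ["dc:identifier", "dc:title", "dc:subject",
             "dc:description", "dc:format", "dc:type"]


def to_dublin_core_csv(records):
    """Serialize a list of record dicts to CSV text with Dublin Core columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=DC_FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        # Keywords and recognized people both land in dc:subject, separated by "; ".
        subjects = rec.get("keywords", []) + rec.get("people", [])
        writer.writerow({
            "dc:identifier": rec["filename"],
            "dc:title": rec.get("title", ""),
            "dc:subject": "; ".join(subjects),
            "dc:description": rec.get("description", ""),
            "dc:format": rec.get("mime_type", ""),
            "dc:type": "StillImage",  # term from the DCMI Type Vocabulary
        })
    return buf.getvalue()


# Made-up sample record.
sample = [{
    "filename": "print_0001.tif",
    "title": "Group portrait",
    "keywords": ["portrait", "formal wear"],
    "people": ["Claude Barnett"],
    "mime_type": "image/tiff",
}]
print(to_dublin_core_csv(sample))
```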
Since 1999, The HistoryMakers has amassed a growing collection of first-person oral history interviews with 3,000 influential African American ArtMakers, BusinessMakers, CivicMakers, EducationMakers, EntertainmentMakers, LawMakers, MediaMakers, MedicalMakers, MilitaryMakers, MusicMakers, PoliticalMakers, ReligionMakers, ScienceMakers, SportsMakers, and StyleMakers. Through Pixel Acuity’s partnership with the organization, we look forward to adding valuable visual context to the voices of these powerful HistoryMakers and creating descriptive records during the digitization process for collections that previously had none. By using artificial intelligence to recognize people and contextualize moments within these HistoryMakers’ lives through object recognition and keyword extraction, we are able to enhance their narratives and work with The HistoryMakers to create an unprecedented repository of stories and digital assets and an invaluable resource of American history.
Do you have a need for Artificial Intelligence applications within your own collection?
Contact us for more information on how Pixel Acuity can help preserve your heritage.
The Nuances of Nitrate Film: A Case Study With The Getty
The Nuances of Nitrate Film Digitization: A Closer Look at the Getty Research Institute’s Nitrate Film Collection
This piece was co-authored by Hannah Storch and Karl Seifert.
Above, differently sized pieces of nitrate film from The French and Company Photographic Archive of Fine and Decorative Arts show design interiors and pieces showing advanced nitrate degradation (© J. Paul Getty Trust. Getty Research Institute, Los Angeles (71.P.1)).
What is Nitrate Film?
Digital preservation allows for the conservation and continuity of the knowledge and data contained within cultural and corporate heritage collections. For nitrate-based film collections, this process is even more valuable because of the unstable nature of the media. Nitrate film is a type of film stock made of cellulose nitrate that was sold commercially from the 1890s to the early 1950s. It was notable for its transparency and flexibility, but it also had serious drawbacks. While most collections eventually degrade over time, nitrate film collections are particularly susceptible due to their inherent instability. As it deteriorates, nitrate film releases toxic fumes. Deteriorated nitrate becomes extremely flammable and, in rare cases, may spontaneously combust at high temperatures. Because of this aggressive decay process, and because the data contained within nitrate film collections is rapidly being lost, digital preservation plays an even more crucial role in safeguarding these collections and the information they contain.
The Getty Research Institute’s Nitrate Negative Collections
This past year, Pixel Acuity partnered with the Getty Research Institute (GRI) to digitize their nitrate negative collections. Due to the fragile nature and unique storage concerns of the GRI’s nitrate negative holdings, these materials were unavailable for many years. After digitization, the 6,000+ images will be added to the Library Catalog and made freely available for the public to view. The negatives come from several collections, including the Harald Szeemann papers, The French and Company photographic archive of fine and decorative arts, and the Malvina Hoffman papers, and as such, their content varies widely.
The images depict an artists’ community in Switzerland, stock photos from an NYC-based art dealership, and work by a prominent female sculptor documenting her life and surroundings. Through the Nitrate Digitization Project, the Getty Research Institute had an opportunity to increase access to collections that were previously unavailable to researchers.
Above, film samples show nitrate film degradation.
Above, a member of the Pixel Acuity team digitizes a piece of nitrate film.
Although great care and consideration is always taken while handling a collection for digitization, the nature of nitrate film requires specialized environment, storage, and handling workflows throughout the digitization process. This particular project also consisted of several different collections with various sizes of film in various states of physical processing and organization.
The collections consisted of over 6,000 individually cut pieces of nitrate film, as well as several hundred pieces of reflective ephemera, some of which dated back over 100 years. Given the diverse nature and scope of these collections, we had to customize our approach, working out how best to adapt to the different film formats and specifications while maintaining the care and physical preservation of a fragile collection.
Unique Solutions for Nitrate Film
In order to create the optimal environment for nitrate storage and for digitization, we worked with a professional conservator provided by the GRI to predetermine the ideal temperature and humidity control as well as air purity for the nitrate materials. The conservator also conducted an initial review of the condition of the nitrate materials with our team prior to beginning production.
In order to facilitate rapid capture production while maintaining best handling practices for nitrate film digitization, we organized the collection according to size prior to production, which allowed us to tailor our approach to each type of film. Once we had organized the film into five previously agreed-upon resolution tiers based on size, we were able to optimize efficiency within the production workflow and ensure a high-quality digital image for all of the different sizes using a 150-megapixel sensor. During the digitization process itself, we also used the DT Film Scanning Kit and its customized film carriers for contact-free scanning, resulting in the rapid digitization of many different sizes of film without touching the image area or putting the fragile material at greater risk.
Above, Pixel Acuity digitizes nitrate film.
To provide the highest-quality images for all film format sizes, we were asked to digitize 8×10 film at a high resolution of 1700 pixels per inch (ppi). While such a high ppi provides an undeniably high-resolution digital capture with all information included, it is difficult to attain in a single capture within a given camera’s field of view. To achieve this, we used a two-shot method and our DT Batch Script to seamlessly integrate the two images into one complete, high-quality digital image. This workflow was useful in handling the larger film formats that we came across while meeting the highest-level preservation-grade FADGI standards.
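A quick calculation shows why a single capture is not enough for this format. The sketch below assumes the nominal 8×10 inch film dimensions and the 150-megapixel sensor mentioned earlier; it ignores the sensor’s aspect ratio and the overlap needed to join the frames, so the result is a lower bound rather than a production plan. The function names are illustrative.

```python
import math


def pixels_required(width_in, height_in, ppi):
    """Pixel dimensions needed to cover a sheet at the requested resolution."""
    return round(width_in * ppi), round(height_in * ppi)


def min_shots(width_in, height_in, ppi, sensor_megapixels):
    """Lower bound on the number of captures, by total pixel count alone."""
    w, h = pixels_required(width_in, height_in, ppi)
    return math.ceil(w * h / (sensor_megapixels * 1_000_000))


w, h = pixels_required(8, 10, 1700)
print(w, h)                          # 13600 17000
print(w * h / 1_000_000)             # 231.2 (megapixels)
print(min_shots(8, 10, 1700, 150))   # 2
```

At 1700 ppi an 8×10 sheet needs roughly 231 megapixels, more than one 150-megapixel frame can supply, which is consistent with the two-shot approach.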
After production, we moved into the processing phase of the project. Our next-generation proprietary DT PixelFlow software enabled us to process all of the requested file derivatives for this project, including 16-bit TIFFs, 8-bit TIFFs, and full-resolution JPEGs. Because so many collections of this kind do not have existing item-level records, we also generated a metadata spreadsheet for this project. Preliminary catalogue records, like the ones Pixel Acuity generated for GRI, can be created during the digitization process to serve as a useful resource for information, cataloguing, and integration into unique Digital Asset Management Systems (DAMS).
Do You Have Nitrate Film?
Collection digitization results in the creation of preservation-grade images that can serve as digital surrogates for the original materials. In the case of inherently unstable nitrate film collections, this need becomes even more pressing.
While digitization can’t stop deterioration, it can capture the object and its information before it is lost. Projects such as the GRI Nitrate Digitization project are interesting because of their scope, and they are indispensable because of the nature of the collections and their propensity for deterioration.
To learn more about how Pixel Acuity can help you make the most of your nitrate film or transmissive material collection, contact us at [email protected].
Image right is an example of nitrate film from The French and Company Photographic Archive of Fine and Decorative Arts (© J. Paul Getty Trust. Getty Research Institute, Los Angeles (71.P.1))
A Digitization Case Study for Oak Spring Garden Foundation: A Specialized Approach for a Special Collection
Two-page spread from the French Hortus Collection
Over the past year, Pixel Acuity has conducted several digitization projects for Oak Spring Garden Foundation, using our imaging expertise to digitize their trickier special collections of rare manuscripts. As their name might suggest, special collections typically represent the rarest and most extraordinary works in an archival collection. However, these unique and often fragile collections present their own challenges when it comes to digital preservation.
The material in these collections is often frail and brittle and susceptible to damage caused by handling. While the rapid capture digitization approach lends itself well to imaging large collections of similar materials, it can also be adapted to image special collections with materials of varying sizes and conditions. The benefits of digitizing these types of collections are undeniable. Special care and consideration of the material is crucial to the success of any special collection digitization project– something our team at Pixel Acuity prides itself on.
A digitization technician with a hand-drawn landscape from the Oak Spring Garden Foundation collection
OSGF is committed to serving the public interest by facilitating scholarship and public dialogue on horticulture and the history and future of plants through the gifts of Rachel “Bunny” Lambert Mellon’s estate and gardens. The Oak Spring Garden Library holds over 19,000 unique objects, including rare books, manuscripts, and paintings, most of which relate to horticulture, landscape design, botany, architecture, and the decorative arts.
Rare and specialized collections like these provide essential historical and cultural context for scholars within a broad range of disciplines. OSGF is currently undergoing a massive project to digitize large categories of its collection.
As a result, for the first time in its history, many of Bunny Mellon’s rare books and manuscripts are in the process of being made digitally available on the library’s catalog and, more broadly, on WorldCat, allowing worldwide access to OSGF’s rare and unique collections. Additional information about the ongoing digitization efforts, and access to the online catalog, is available on the website.
While working to build their digitization program and collection of digital assets, Oak Spring Garden Foundation reached out to Pixel Acuity to digitize some of their more fragile written works and volumes. The scope of these projects has consisted of several rare horticultural books in French and English, large architectural and landscape design books, maps, and atlases in varying conditions due to age and use. Along with capturing digital images of these works at the highest quality to ensure that every etching and detail is conveyed, the Pixel Acuity team adapted their workflows to deal with each condition, type, and size of material.
A digitization technician with a two-page spread from Les Plaisirs
While the rapid capture digitization method emphasizes streamlining and efficiency, it can be difficult to apply to special collections due to their varying conditions and scope. Many of these delicate volumes can be difficult to digitize because their bindings are too fragile or tight for the book to lie flat at 180 degrees, the angle most often used for flatbed or traditional scanning methods.
Within some of these landscape and architectural books, there were also drawings and etchings that spread over two pages but were affixed to a single supporting piece, which needed to be manipulated to avoid gutter shadow and distortions of the drawings (see figures A and B).
Solutions & Successes
For each collection that we received from the Oak Spring Garden Foundation, our experienced team undertook a discovery phase to determine the best equipment and workflows for each type of material and to anticipate the challenges that could arise.
In order to digitize some of the books with tighter bindings or frailer spines, we used the DT V-Cradle, which holds the book nestled at an 80-degree angle so that neither page is fighting gravity and there is no undue pressure on the book’s pages and spine.
The DT Versa enabled our team to digitize two-page spreads with ease. By using the two-platen system and conservation-friendly glass top to keep the affixed page level, and by keeping handling to a minimum, we protected the pages throughout the digitization process.
While special collections do require a certain amount of adaptability, the rapid capture approach is still faster and more efficient than traditional scanning methods for the material. By using this approach, we were able to capture high-resolution, preservation-grade digital images encompassing every detail in a fraction of a second. By applying our rapid capture digitization expertise and DT Heritage’s state-of-the-art equipment, we were able to safely and efficiently digitize several special collections for the Oak Spring Garden Foundation.
Ready to work together?
Contact our expert staff below
A wildflower illustration from the Wildflowers of Georgetown DC
DT PixelFlow: Artificial Intelligence and Application
March 23, 2021 | by Hannah Storch
Artificial Intelligence (AI) has great implications for cultural heritage preservation. Gathering metadata for a collection can be time-consuming and labor-intensive, often involving the individual knowledge and experience of one person. Without this identifying descriptive metadata, valuable information can be lost and collections can remain incomplete. AI can be used as part of a metadata workflow to reduce the cost and tediousness of enriching a collection with enhanced metadata records.
Along with creating digital preservation-grade derivatives and deliverables, DT PixelFlow can use artificial intelligence to describe an item’s material, text, and image content. This type of descriptive data extraction allows not only for the leveraging of existing assets but also for the salvaging of descriptive metadata before it, or the context required to create it, is lost to time and memory.
While the material type is a common metadata field, it is often automatically generated with more generic information, such as text or film. By implementing AI analysis, DT PixelFlow has the ability to automatically suggest the material type and item categories with greater depth of detail. This is a capability we are currently developing and exploring as part of the PA ArCHER Grant with Smithsonian Center for Folklife and our partners RIVERai. The goal is to have DT PixelFlow automatically determine the types of documents, such as pieces of correspondence or legal briefs, and then further categorize them into groups such as memos, contracts, or letters. Learn more about the PA ArCHER Grant and its progress here.
We are very excited to explore this capability with other institutions and collections. If you have a collection you think would benefit from large-scale automatic material-type description using AI, with human QC, please contact us.
Along with providing Optical Character Recognition (OCR), DT PixelFlow is able to provide a deeper analysis of the text and written content of an image to provide valuable contextual information. By using entity extraction, DT PixelFlow provides context to information that might otherwise seem like unconnected data, such as recognizing an address, formulaic greeting, or date from a string of numbers and text. Similarly, this type of entity analysis can find known entities such as proper names, which could enable an institution to successfully search for and gather together all of the images relating to a particular person or place.
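To make the idea of entity extraction concrete, here is a deliberately simple sketch that pulls dates and formulaic greetings out of OCR text using regular expressions. This is a rule-based stand-in, not the AI-based analysis DT PixelFlow performs; the patterns, the `extract_entities` name, and the sample letter are all invented for illustration.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Matches "March 3, 1952" style dates or "2/14/1952" style dates.
DATE_RE = re.compile(
    rf"\b(?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}}\b|\b\d{{1,2}}/\d{{1,2}}/\d{{4}}\b"
)
# Matches a salutation at the start of a line, up to its comma or colon.
GREETING_RE = re.compile(r"^(?:Dear|My dear)\s+[^,:\n]+[,:]", re.MULTILINE)


def extract_entities(text):
    """Return the dates and greetings found in a block of OCR text."""
    return {
        "dates": DATE_RE.findall(text),
        "greetings": GREETING_RE.findall(text),
    }


# Made-up sample text standing in for an OCR transcription.
ocr_text = (
    "Chicago, Illinois\n"
    "March 3, 1952\n\n"
    "Dear Mr. Barnett,\n"
    "Thank you for your letter of 2/14/1952.\n"
)
print(extract_entities(ocr_text))
```

Fixed patterns like these miss anything phrased unexpectedly, which is one reason model-based entity analysis is attractive for varied historical correspondence.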
DT PixelFlow’s AI analysis is not only able to recognize and transcribe the text within an image but also understand the conveyed sentiment and style of the text, interpreting the emotions, such as positive and negative or happy and sad, behind them.
These kinds of deeper analyses are set up on a project-by-project basis to ensure the analysis is relevant to the collection, the institution, and the stakeholders of the results. If you think your collection might benefit from AI analysis of the structure, content, or sentiment of the OCR’d text, please contact us for a consultation and we’ll help you understand what is possible and, just as importantly, what is practical.
Artificial intelligence can identify objects and individuals inside of photographic or pictorial collections. With digital records, if this information is not extracted, cataloged, and linked to the image, this descriptive information can be lost to volume – obscured by the sheer scale of images one might have to look through manually to find given content. We can provide object detection and/or face detection in DT PixelFlow. This enables us to isolate and identify objects that are both in the foreground and less prominent in images.
DT PixelFlow also has the ability to identify and categorize general objects, locations, and information using keywords. This information can be general or tied to a specific institution or collection. For example, keywords for a collection of slides belonging to a natural history museum could have more refined and accurate metadata with keywords pertaining to that particular type of collection, location, or scientific study. Within the image, DT PixelFlow is able to recognize and extract both big depictions and small details, from the recognition of landmark features to facial features and human emotion. If there are specific individuals of interest, we can even train DT PixelFlow to automatically identify the faces.
Once the descriptive metadata has been derived through AI analysis, it can be packaged in many different formats to make it more accessible to the user and institution. After it has been interpreted, the information can be embedded into the final image file, ensuring that the data is always linked to the image and that they can be updated together in the future, or output in other formats. For usability, this descriptive data could also be generated in a txt file, document format, or included in a new or existing spreadsheet. To learn more about how to make the most of your metadata, check out our recent metadata article.
Traditionally, the accumulation of descriptive metadata has been a specialized, tedious, and labor-intensive process. With AI analysis and application, Pixel Acuity is able to maintain a high level of accuracy while increasing efficiency and accessibility, and as needed we can leverage our highly skilled staff to provide human QC on top of the automatic detection provided by our AI. Combining our in-house software, our experience in cultural heritage collections, and our talented team, we are now able to assist in the preservation of collections through descriptive categorization and contextualization derived from AI research and analysis as well as digital surrogacy.
Contact us to learn more about our digital imaging services and how we can bring artificial intelligence to your workflow.
Managing and Mastering Metadata with DT PixelFlow
January 21, 2021 | by Hannah Storch
For cultural heritage professionals, metadata provides invaluable descriptive information about an object or resource, but it is also time-consuming to accumulate. Creating and maintaining metadata for a collection is an integral part of taking cataloging one step further and creating a digital collection. Metadata provides context for an item within a collection and can either be embedded in the digital file at or after the time of creation or maintained in a centralized location such as a database, DAM, CMS, or spreadsheet. Using RAW rapid capture imaging with metadata embedding capabilities and DT PixelFlow, Pixel Acuity has been able to automate much of the metadata creation process and create workable metadata formats for institutions. This both reduces the cost of providing metadata services and increases our accuracy and flexibility in doing so.
For the purpose of institutions such as libraries and archives, metadata can be categorized into four basic types: administrative, descriptive, preservation, and technical. Administrative metadata provides the provenance context information necessary to understand the information resources, such as past ownership and from where the resource came. Descriptive metadata describes a resource, its context, and identifying characteristics so that people can locate and search for the asset using subjects and keywords. Preservation metadata is the conservation information that can be used to protect the original resource from deterioration or degradation. Finally, technical metadata is the information about the digital file that can allow the resource to be identified. When collecting metadata, institutions are faced with limited resources for staff, time, and funding, often having to choose between feasible or “good enough” metadata and comprehensive metadata. We, at Pixel Acuity, are able to use our DT PixelFlow scripting to alleviate some of the burdens of that choice, offering options for embedding and generating metadata to create and enhance institutional records.
DT PixelFlow Metadata Capabilities
Since technical metadata is information about the digital asset that is created during the collection imaging process, we are able to capture that information and embed it in the file derivatives themselves. Embedding this information ensures that it will not be lost, automatically links the data and the metadata, and ensures that the image and the metadata will be updated together. Typically this information is linked with the TIFF derivative file after being captured in the RAW but with DT PixelFlow we are able to embed it in almost all derivative formats including TIFFs, JPEGs, PDFs, and PDF/As. Technical metadata information can also be extracted and put into a spreadsheet, which can simplify data management and facilitate search and retrieval. Using DT PixelFlow we are able to generate spreadsheets containing this information in multiple formats, including Dublin Core, based on client preference in order to facilitate data retrieval and storage in accordance with their own institution’s Digital Asset Management System (DAMS) or collection information storage/organization type.
Along with creating spreadsheets and ingesting metadata, DT PixelFlow can be used to enrich existing records or create a more holistic record, combining information from original inventories and documents with new information obtained at the time of capture and digital content creation. This includes the generation of basic descriptive metadata such as an object type or category, transcription of annotations on an item or its container/folder/box, and non-subjective evaluations such as page count. More advanced descriptive metadata information can require both organization-specific and subject-specific expertise. While it is not possible for us to offer such subject-specific expertise for every collection that we digitize, we are able to utilize and enhance records created by specialists. If a client is able to provide us with an inventory with existing information, such as an inventory spreadsheet or an XML format of a finding aid, we can extract information from that format and create a new record, with that information as well as the technical information about the digital asset obtained during the imaging process. This allows us to combine records to give a more comprehensive understanding of a resource, its place within a collection, and how it relates to the digital asset.
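Combining an existing inventory with technical metadata captured during imaging amounts to a join on a shared key. The sketch below assumes both sources are available as lists of dictionaries (for example, rows read with Python’s `csv.DictReader`) and that box and folder numbers identify the match; the column names and sample rows are hypothetical.

```python
def merge_records(inventory_rows, technical_rows):
    """Join folder-level inventory rows to per-file technical rows on (box, folder).

    Files whose location has no inventory entry are kept with technical fields only.
    """
    by_location = {(r["box"], r["folder"]): r for r in inventory_rows}
    merged = []
    for tech in technical_rows:
        inv = by_location.get((tech["box"], tech["folder"]), {})
        # Technical fields are applied last so they win if a column name collides.
        merged.append({**inv, **tech})
    return merged


# Made-up sample data.
inventory = [
    {"box": "1", "folder": "3", "title": "Correspondence", "dates": "1948-1952"},
]
technical = [
    {"box": "1", "folder": "3", "filename": "b001_f003_001.tif", "bit_depth": "16"},
    {"box": "1", "folder": "4", "filename": "b001_f004_001.tif", "bit_depth": "16"},
]

for row in merge_records(inventory, technical):
    print(row)
```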
Case Study Featuring Metadata Mastery
Using our DT PixelFlow scripting, we are able to automate the often arduous processes of metadata embedding and creation, minimizing cost and labor for the institution and allowing it to focus on other aspects of digital collection creation. In just one example, prior to digitization, one of our clients had a cataloging inventory with folder-level descriptive information that listed the location information for the assets, such as box and folder number, as well as descriptive information including title, dates, location, collection, and series. Not only were we able to use our tools to embed that information into the files, effectively linking the original object to the digital asset, but we were also able to record the filename of each digital asset, count the number of assets in each folder, and generate checksum lists and titles for each folder. This additional information was added to the original inventory, giving a more holistic view of the original item and the digital asset within the collection and linking the two within our client’s DAMS.
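The per-folder counts and checksum lists described in this case study can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the actual DT PixelFlow script: the checksum algorithm is not named above, so SHA-256 is assumed, and the function name and input layout are hypothetical.

```python
import hashlib
from collections import defaultdict


def summarize_folders(assets):
    """Group assets by folder and record asset counts and checksum lines.

    `assets` is an iterable of (folder_id, filename, file_bytes) tuples.
    In a real workflow the bytes would be read from the files on disk.
    """
    summary = defaultdict(lambda: {"asset_count": 0, "checksums": []})
    for folder_id, filename, file_bytes in assets:
        # SHA-256 is an assumption; the source does not say which algorithm was used.
        digest = hashlib.sha256(file_bytes).hexdigest()
        entry = summary[folder_id]
        entry["asset_count"] += 1
        # Same "<digest>  <filename>" layout that the sha256sum tool writes.
        entry["checksums"].append(f"{digest}  {filename}")
    return dict(summary)
```

Each folder's entry could then be appended to the matching inventory row, so the original record and its digital assets stay linked.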
Artificial Intelligence & Metadata
Pixel Acuity is also at the forefront of leveraging artificial intelligence to assist metadata workflows. We are working with the Smithsonian Center for Folklife and Cultural Heritage (recipient of our DT ArCHER Grant) to evaluate the effectiveness and accuracy of these methods. This effort deserves its own article, which we will publish later this year. For now, we can say that we don’t expect AI to be a magic wand that replaces expertise, experience, and careful execution, but we do expect it will be an enormously useful tool.
While metadata generation can be costly in terms of time, labor, and resources, it is crucial to capturing the context of items within a digital collection. Metadata is what allows scholars and researchers to search a collection for specific information and allows registrars, cataloguers, and collections managers to organize their data and collection information. With DT PixelFlow automation, we are able to help our clients build integrated metadata records effectively and efficiently, so that they do not have to sacrifice quantity or quality.
Ready to Learn More About Mastering Metadata?
We can help your next project be a breeze! Learn more about DT PixelFlow, project planning, additional services, and pricing by contacting us here.
Shining Light On Film Scanning
December 10, 2020 | by Hannah Storch
Preservation grade film scanning is no simple task, and it becomes considerably more complex in mass digitization projects with large collections. Transmissive materials (a catch-all term used to collectively refer to film, glass plates, and any other media designed to be viewed in front of a light source) present many obstacles in handling and imaging not found with reflective media, and there are other considerations in terms of digitization method and final image rendering. Pixel Acuity has spent the better part of a decade perfecting film scanning workflows that optimize efficiency, fulfill each client’s unique goals, and conform to the highest image quality standards.
Film collections are often in a delicate physical state and are susceptible to many types of physical deterioration. Film can degrade in many ways: delaminating, becoming brittle, distorting, and fading to name just a few. All of these factors result in the need for conservation-grade handling and extra attentive care during imaging, especially during rapid capture in mass digitization efforts.
Because film must be handled with the utmost care, digitization workflows frequently require additional staff beyond an imaging technician/photographer, and once the film has been imaged, there are still many decisions to make regarding the presentation of the film, all of which require in-depth knowledge of software settings, workflows, and processing steps. Clients may want film presented as it appears to the eye, or want negative items converted into positive images and positive images color corrected.
At Pixel Acuity, our team of experts uses their extensive knowledge and experience to resolve these issues and create the highest-quality preservation-grade digital surrogates.
In order to provide the best care possible for the film during the digitization process, Pixel Acuity follows the same conservation principles that are used for in-person viewing. All working surfaces are cleaned on a regular basis and the trained object handlers handle the material with care and wear conservator-approved gloves.
In order to minimize potential damage or scratching of the emulsion of the film, Pixel Acuity uses film carriers, such as the Digital Transitions (DT) magnetic or glass carriers, that make minimal or no contact with the emulsion (pictured above). These carriers also help deal with physically distorted material, hand-cut film, and materials of differing thickness, such as glass plates and lantern slides.
Over years of working in the cultural heritage imaging space, Pixel Acuity has perfected imaging workflows for film, moving quickly, efficiently, and safely through the digitization process. By implementing these workflows, we are able to digitize transmissive material at an unparalleled rate, imaging approximately 2,500 35mm slides or 3,200 strips of film a day.
Using our extensive knowledge base, Pixel Acuity’s skilled imaging technicians are able to render film according to the client’s specifications and needs: either object reproduction, content reproduction, or speculative artist’s rendering.
Object reproduction imaging is a faithful reproduction of the entire physical object, as it would appear to the eye on a light table.
Content reproduction involves producing a human-readable version of the image contained within the object, for example, a negative converted to a positive image, or a contrast adjusted version of a faded positive image. Color negative conversion is a particularly challenging task, with no one-size-fits-all solution. However, Pixel Acuity has developed several proprietary conversion methods born from extensive research and experience in the darkroom that provide excellent positive “print” image files from color negatives of all types.
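For readers unfamiliar with what a negative-to-positive conversion involves, the following Python sketch shows only the naive starting point: scale each channel by the colour of the unexposed film base, then invert. It is not one of Pixel Acuity's proprietary methods, and the function name, the 8-bit pixel tuples, and the film_base parameter are assumptions for illustration.

```python
def invert_negative(pixels, film_base=(255, 255, 255)):
    """Naively convert 8-bit negative pixels to positive values.

    `pixels` is a list of (r, g, b) tuples. `film_base` is the (hypothetical)
    colour of unexposed film, such as the orange mask of a colour negative;
    each of its channels must be greater than zero.
    """
    positive = []
    for pixel in pixels:
        converted = []
        for value, base in zip(pixel, film_base):
            # Fraction of light passed relative to clear film, capped at 1.
            transmitted = min(value / base, 1.0)
            # Areas that pass the most light on the negative become darkest.
            converted.append(round(255 * (1.0 - transmitted)))
        positive.append(tuple(converted))
    return positive
```

A linear inversion like this ignores film response curves and per-stock colour behaviour, which is a large part of why no one-size-fits-all solution exists.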
A speculative artist’s rendering involves more creative license and agency on the part of the imaging technician as they attempt to recreate the image as they imagine the artist would have wanted the final product to look. This rendering method can produce final images that counteract the effect of years of age on the film itself and produce an image that is reminiscent of how the original film was most likely intended to look. For this type of bespoke imaging work, Pixel Acuity works with clients to research how the artist might have wanted the image represented to ensure accuracy in the alterations.
We Can Help With Your Collection
Pixel Acuity’s extensive experience in digitizing transmissive materials, our knowledgeable object handlers and photographers on staff, and our use of the latest imaging equipment and technological tools in the industry make us one of the leading authorities on film scanning.
Working with collections around the world, for institutions such as the Smithsonian Institution, The Getty, and so many others, Pixel Acuity has created digitization workflows that combat the challenges of such a potentially tricky material while optimizing efficiency, quality, and preservation.
To learn more about how Pixel Acuity and Digital Transitions can help you with digitization services, software, and consultations, please contact us.
Looking for more film scanning resources? Check out this new Film Scanning Knowledge Center by DT Cultural Heritage here.
Glass plate negative (right) picturing Abraham Lincoln was taken by Mathew Brady and was digitized by Pixel Acuity for the National Portrait Gallery.
The Word On Optical Character Recognition With DT PixelFlow
At The Phillips Collection Archive
November 19, 2020 | by Hannah Storch
Pixel Acuity has offered the cultural heritage community unparalleled imaging and digitization services for the better part of a decade. Recently, we have added new automations and related offerings to our repertoire. One of the most impactful innovations in Cultural Heritage imaging technology has been the ability to use the next-generation Optical Character Recognition (OCR) in our DT PixelFlow software to turn typed and handwritten documents into searchable text. Pixel Acuity is now not only able to generate the highest quality digital images for cultural heritage collections but also to create searchable texts for the researchers and scholars who access these collections, revolutionizing the way that they conduct research.
The Phillips Collection Archive
One of our ongoing projects that leverages DT PixelFlow’s OCR capabilities is our project with The Phillips Collection Archive in Washington, DC. The Phillips Collection houses modern and contemporary art, while The Phillips Collection Archive contains materials pertaining to the museum’s founding director, Duncan Phillips, and his wife Marjorie. The Archive holds materials documenting the purchase of important pieces of modern and contemporary art from the 1920s to present acquisitions. The current project consists of digitizing approximately 100,000 personal photographs and correspondence, pamphlets, and documents relating to the family and their work with various directors, artists, and galleries. By using DT PixelFlow’s OCR capabilities, The Phillips Collection Archive is able to transform its collection of typed and handwritten material into fully searchable documents.
Optical Character Recognition (OCR) Application and Process
For our project with The Phillips Collection Archive, we are able to implement our OCR technology to create two different types of readable and searchable text files from our digital images – PDF/As and .txt files. We start by capturing the highest-quality and most consistent images of the material – the better the input the better the output – so we surpass preservation-grade digitization standards such as Metamorfoze-strict, FADGI 4-star, and ISO 19264 using RAW rapid capture photography to capture digital images. This enables us to preserve all of the information recorded by the camera sensor at the time of capture without applying compression or losing any information.
Once all of the images have been captured in the RAW format, they are ready to be run through DT PixelFlow in order to create the OCR’d derivatives. Due to our modern machine-learning approach, we are able to generate highly-accurate OCR’d text in multiple languages and output formats. We also have the flexibility to create a controlled, topic-specific vocabulary, depending on the needs of the collection, which can be used to further increase the specificity and accuracy of the resulting text.
The resulting data learned during the machine OCR process is then encoded into an hOCR file, which can then be converted into the deliverables requested by the client. Our unique approach enables us to offer a wide range of deliverables, including but not limited to PDF, PDF/A, a METS/ALTO sidecar XML, and txt files.
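As a rough illustration of how an hOCR file can be turned into a plain txt deliverable, the Python sketch below collects the words of each recognized line. It relies only on the standard hOCR class names ocr_line and ocrx_word; the class and function names are hypothetical, and it is a simplified sketch rather than DT PixelFlow's converter.

```python
from html.parser import HTMLParser


class HocrTextExtractor(HTMLParser):
    """Collect the text of ocrx_word elements, grouped by ocr_line."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._in_word = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "ocr_line" in classes:
            self.lines.append([])  # start a new text line
        elif "ocrx_word" in classes:
            self._in_word = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_word = False

    def handle_data(self, data):
        if self._in_word and data.strip():
            if not self.lines:  # tolerate words outside any ocr_line
                self.lines.append([])
            self.lines[-1].append(data.strip())


def hocr_to_text(hocr):
    """Return plain text with one output line per recognized ocr_line."""
    parser = HocrTextExtractor()
    parser.feed(hocr)
    parser.close()
    return "\n".join(" ".join(words) for words in parser.lines if words)
```

The bounding boxes that hOCR stores in each title attribute are ignored here; they are what allows a PDF/A to place searchable text over the matching part of the image.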
Derivatives and Deliverables
Since The Phillips Collection Archive aims to make the documents and correspondence of Duncan and Marjorie Phillips more accessible to researchers and scholars, they have opted for both PDF/As and txt files. The PDF/A format layers the OCR’d text over the image of the object and produces a document that researchers can use to search on their own devices and see matches in their original visual context, in the document itself (examples of typed and handwritten applications are pictured above). The txt file (one example is pictured right) extracts the text from the image and creates a separate file format, which can be utilized by other institutional systems such as text-analysis tools or word-cloud generators. The choice of these OCR’d deliverables, along with highest-quality preservation-grade digital images, will allow researchers to delve deeper into The Phillips Collection Archive and learn more about the history of the Museum and the relationships that formed its foundations. While it may once have taken hours of painstaking research to explore the relationship between The Phillips Collection and The American Federation of the Arts, a researcher can now find all of the documents, both typed and handwritten, pertaining to the Federation or The Phillips with a simple keyword search.
It is opportunities and projects like these that allow Pixel Acuity, as a company, to innovate new workflows and adapt new technologies to give our clients the best possible digitization services and imaging experience. We continue to promote advancements, such as machine-learning-powered OCR, within the cultural heritage community because the bottom line is that the best deserves the best.
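The kind of keyword search that txt deliverables make possible can be sketched very simply in Python. The function name and the in-memory mapping of filenames to text are assumptions for the example; an institution's own search or text-analysis tools would normally do this work.

```python
def find_documents(texts, keyword):
    """Return the names of documents whose OCR text contains the keyword.

    `texts` maps a filename to the text extracted from that document.
    Matching is case-insensitive and looks for the keyword as a substring.
    """
    needle = keyword.casefold()
    return sorted(name for name, text in texts.items() if needle in text.casefold())
```

Because typed and handwritten items both end up as plain text, a single query covers them equally.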
To learn more about how Pixel Acuity and Digital Transitions can help you with digitization services, software, and consultations, please contact us.
How Can Pixel Acuity Help Your Digitization Program In the Age of Social Distancing?
There are many unknowns in the world today and Cultural Heritage institutions everywhere are facing the same challenge – how to best serve the public and their mission while keeping their employees safe and healthy. We heard this first hand from many institutions during and following our recent webinar titled “Digitization in the Age of Social Distancing.” When coronavirus (COVID-19) struck, these institutions that are responsible for promoting the arts, history, and culture had to close their doors to the very public they serve. In order to function in the “new normal,” people became increasingly reliant on technology, using it not only as their primary means of gathering information but also interacting with the world. With this growing dependence on online and remote access, collection digitization and digital preservation have proven even more vital than ever before. At this moment in time, we as a Cultural Heritage community can come together, reaching the world and the public in new ways through collection digitization and online publication.
With decades of combined hands-on experience imaging cultural heritage in all its forms, Pixel Acuity offers highly specialized knowledge of the inner workings of large-scale digitization efforts as well as customizable workflows and production solutions to fit your institution’s individual needs. By working with you in assessing your individual digital imaging needs, Pixel Acuity can provide a tailored digitization workflow plan for either on-site or off-site digitization that will maximize efficiency while implementing OSHA and CDC guidelines to ensure a safe working environment for everyone involved. We are committed to providing the same high standard of care and production quality whether on-site or off-site. In both scenarios, we are dedicated to applying archival-quality methodologies and Federal Agencies Digital Guidelines Initiative (FADGI) standards to our imaging of collections to create preservation-grade images of the highest possible resolution and quality control.
Although some parts of the world are beginning to reopen to varying degrees, Cultural Heritage institutions are experiencing a “new normal” with reduced on-site staff and hours spent in-person with physical collections. We at Pixel Acuity understand that when every in-person hour counts, priorities have to be re-evaluated in order to optimize time spent with the collection. Within the digitization workflow, physical preparation and digitization of materials require hands-on work with the physical collection, while tasks such as metadata entry, post-processing, and quality control (QC) can be completed remotely. Pixel Acuity provides both on-site and off-site digital imaging services to the Cultural Heritage community. We are able to install state-of-the-art photographic equipment, developed by Digital Transitions, on-site at Cultural Heritage institutions, as well as provide highly qualified imaging technicians to implement digitization workflows and create high-resolution digital images. We are also able to arrange for off-site digitization production at any of our production facilities in Chantilly, Virginia; New York City, New York; or Los Angeles, California. With this service, Cultural Heritage institutions can send their collections to one of our facilities, where our certified art/object handlers and imaging technicians can digitize the collections and then return the digital files and physical collections to the original institution. This allows Cultural Heritage institution employees to minimize the amount of time they have to spend on-site with the collection during the digitization process, and enables us to provide them with digital files that can be worked on remotely post-digitization.
Along with providing imaging technicians and object/art handlers for digitization production, Pixel Acuity is also able to supply qualified staff to assist with collection processing duties such as rehousing collections, barcode application, and metadata creation.
Like everyone in the Cultural Heritage community, we at Pixel Acuity understand how difficult it is to serve the community and our clients in these uncertain times, and we are working harder than ever to provide customizable imaging solutions to help institutions reach their digitization goals. To learn more about our services or to obtain more information, please contact us.