Selection & Cataloging

Audience

Traditionally, analog collections have been built for the convenience or the interests of the collection's owner.  In planning the digitization of a collection, however, the perspective of the end user is critical.  “Who will want to use this collection?” and “How will users search this collection?” are essential questions to address early in the planning process.  Your actual audience may be wider than the one you anticipate, and the better you provide for the users of your project, the more effective it will be.

Scope

In defining the scope of the project, consider a realistic, manageable number of images. Each image collection should be culled to remove duplicates, images of poor quality, and images with inappropriate content.

In some instances, even though you have a number of pictures that you wish to scan, acquiring digital versions through license or purchase may be a more efficient approach to developing digital resources than digitizing a collection yourself. The College Library acquires substantial visual resources, for example, and other licensing and vendor options provide alternatives to “doing it yourself.”  IT can provide advice and refer you to library staff who are knowledgeable about Library resources.

Intended use of images

How your digitized images will be used is crucial to the planning process.  Consider uses over the long term, since the digitization specifications that you choose will ultimately limit the output quality of your images.

Commercial printing, printing with a laser or ink jet printer, Web display, projection, zooming and cropping all require choices about color, resolution, bit depth, and file type.  By anticipating the intended end-use before the image is digitized, it is possible to choose specifications that will result in satisfactory image quality without squandering resources such as time and network storage.
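
As a rough illustration of how these choices affect storage, the sketch below estimates the uncompressed size of a scan from its physical dimensions, resolution, and bit depth. The function name and the sample print size are illustrative assumptions, not College specifications, and compressed file types will produce smaller files than this estimate.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of a scanned image in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical example: a 4 x 6 inch print scanned in 24-bit color.
for dpi in (150, 300, 600):
    size_mb = uncompressed_size_bytes(4, 6, dpi, 24) / 1_000_000
    print(f"{dpi} DPI: about {size_mb:.1f} MB")
```

Doubling the resolution quadruples the pixel count, so storage needs grow much faster than the DPI figure alone suggests.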

Quality Control

Quality control is one of the most critical aspects of any digitization process and in fact applies throughout the life of an image. Workflows need to be identified and established to ensure consistency in image capture (scanning, digital photography, purchase of images), metadata and controlled vocabularies, and image editing. 

According to the Getty’s Introduction to Imaging, “Consistent image-capture guidelines and parameters should be established, and scans must be periodically reviewed and checked for accuracy, ideally against the source material, whether they are produced in-house or supplied by a vendor. Although automatic scanning is generally consistent, problems with exposure, alignment, and color balance occur often enough to require a quality-control component in any scanning program. Without quality control, it will not be possible to guarantee the integrity and consistency of the resulting digital image files… Records need to be proofread and mechanisms such as controlled vocabularies utilized to ensure consistent data entry. Additionally, relationships between cataloguing records and image files need to be verified and/or developed.  Quality control must also be applied to all access files derived from master images and to all preservation copies made, on whatever media, as mistakes and technical errors can often be introduced during the process of duplication or migration. Files should be checked to ensure that all are correctly named, not corrupted, and so on.”
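
One way to check that preservation copies are present and not corrupted is to compare checksums of each master file and its copy. The sketch below is a minimal example under stated assumptions: the function names, the folder layout (masters and copies in two folders with matching file names), and the `*.tif` pattern are all hypothetical. It applies only to byte-for-byte copies, not to access files that are deliberately resized or converted.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(master_dir, copy_dir, pattern="*.tif"):
    """Compare each master file with its same-named copy.

    Returns a list of (file name, problem) tuples; an empty list
    means every master has an identical copy.
    """
    problems = []
    for master in sorted(Path(master_dir).glob(pattern)):
        copy = Path(copy_dir) / master.name
        if not copy.is_file():
            problems.append((master.name, "missing"))
        elif sha256_of(master) != sha256_of(copy):
            problems.append((master.name, "checksum mismatch"))
    return problems
```

Running such a check after every duplication or migration gives a simple record of which files need to be recopied.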

Cataloging

Cataloging digital images promotes efficient retrieval and increases the likelihood that the digital collection will continue to be accessible and useful over time. Cataloging is best accomplished by creating a database of records that describe each image (metadata).  If the collection contains or will eventually contain more than 500 images, cataloging is necessary.

“Metadata” are information about your images that can be recorded in a database.  These data can include very simple descriptive information like the creator, title, or topic of a particular image; technical information about a digital image like file name, resolution, file format, etc.; and management information like rights of use, ownership, and copyright.

These “metadata” descriptors make it possible to search and retrieve specific images, and they help in the management and organization of groups of digital images.  Assigning and entering these data into a database consistently will allow you and other users to organize and search your image collections efficiently.
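
To show why consistent entry matters for retrieval, the sketch below runs a simple text search over a few image records. The records, field names, and `search` function are hypothetical and stand in for whatever database is actually used.

```python
def search(records, term, fields=("title", "creator", "keywords")):
    """Return records whose chosen fields contain the term (case-insensitive)."""
    needle = term.casefold()
    matches = []
    for record in records:
        for field in fields:
            value = record.get(field, "")
            text = " ".join(value) if isinstance(value, list) else value
            if needle in text.casefold():
                matches.append(record)
                break
    return matches


# Hypothetical records for illustration only.
records = [
    {"file": "img001.tif", "title": "Boy riding a pony", "keywords": ["ponies", "children"]},
    {"file": "img002.tif", "title": "Campus in winter", "keywords": ["campus", "snow"]},
    {"file": "img003.tif", "title": "Untitled", "keywords": ["Ponies"]},
]

print([r["file"] for r in search(records, "ponies")])  # ['img001.tif', 'img003.tif']
print([r["file"] for r in search(records, "pony")])    # ['img001.tif']
```

The second search misses the third image because its only descriptor uses a different form of the word. Agreeing on terms in advance, for example through a controlled vocabulary, avoids this kind of gap.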

Metadata Standards

To promote reliable searching and basic file management practices, Bowdoin College recommends the use of the Dublin Core descriptive metadata standard for digital image databases.  The Dublin Core allows for a wide range of detail, from very simple to extremely specific, and it accommodates considerable flexibility in customizing descriptions (such as adding non-standard fields) for a particular set of objects.  Because this standard is well established and applied internationally, adopting the Dublin Core assures consistency and reliability for indexing, retrieving, and managing data for immediate applications and over time.

Field | *Required? | Definition | Examples | DC element
File name (system supplied) | M | Assigned file name, including extension, of the digital object | filename.tif; filename.pdf | identifier
Date digital (system supplied) | M | Creation date of digital file | Scan date; “born digital” date | date
File type; size; resolution (system supplied) | M | Type of image file; byte size; DPI | TIF; 5.8 MB; 600 DPI | format
Title | M | Formal name given to the work (or work from which image derives); brief caption for untitled work | Mona Lisa; Boy riding a pony | title
Creator or Author | MA | An entity responsible for the content of the work | Artist of a painting; photographer | creator
Administrative Control | M | An entity responsible for the physical or administrative control of the work | A dept. or office that has custody of the work | rights
IP Rights | MA | Information about copyright and related intellectual property rights | The name of a copyright holder; terms defining use and access of the work or of the digital image | rights
Date original | R | Date when original work was published or created | Date of painting; date of photo; construction date of bldg. | date
Description | R | Brief summary of the content of the work | Free-text narrative of the contents and/or the context of the image | description
Keyword | O | Topic of the content | Topical keywords or controlled vocabulary; LCSH; thesauri | subject
Publisher | O | An entity responsible for making the work available | A person; a dept. or office; the College | publisher
Use | O | Intended audience for viewing the image or digital project | Class no.; project name | audience
Local identifier | O | Numbering or naming reference to the resource from which the digital image derives | Call no.; local file no. | source

*M = Mandatory; R = Recommended; MA = Mandatory, if applicable; O = Optional
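
The requirement levels and Dublin Core mappings in the table above can be applied mechanically when records are entered. The following is a minimal sketch, not a College-supplied tool: the `FIELDS` dictionary transcribes the table, while the function names, the sample record, and its date value are hypothetical.

```python
# Requirement level and Dublin Core element for each field in the table above.
# M = mandatory, MA = mandatory if applicable, R = recommended, O = optional.
FIELDS = {
    "File name": ("M", "identifier"),
    "Date digital": ("M", "date"),
    "File type; size; resolution": ("M", "format"),
    "Title": ("M", "title"),
    "Creator or Author": ("MA", "creator"),
    "Administrative Control": ("M", "rights"),
    "IP Rights": ("MA", "rights"),
    "Date original": ("R", "date"),
    "Description": ("R", "description"),
    "Keyword": ("O", "subject"),
    "Publisher": ("O", "publisher"),
    "Use": ("O", "audience"),
    "Local identifier": ("O", "source"),
}


def missing_fields(record, level="M"):
    """List fields of the given requirement level that are absent or blank."""
    return [
        name
        for name, (required, _element) in FIELDS.items()
        if required == level and not str(record.get(name, "")).strip()
    ]


def to_dublin_core(record):
    """Group a record's values under their Dublin Core element names.

    Raises KeyError if the record uses a field name not listed in FIELDS.
    """
    grouped = {}
    for name, value in record.items():
        _required, element = FIELDS[name]
        grouped.setdefault(element, []).append(value)
    return grouped


# Hypothetical record for illustration only.
record = {
    "File name": "filename.tif",
    "Date digital": "2009-03-15",
    "File type; size; resolution": "TIF; 5.8 MB; 600 DPI",
    "Title": "Boy riding a pony",
    "Keyword": "ponies",
}

print(missing_fields(record))       # mandatory fields still to fill in
print(missing_fields(record, "R"))  # recommended fields still to fill in
print(to_dublin_core(record))
```

Fields marked MA are left out of the default check because whether they apply depends on the individual work and has to be judged by the cataloger.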