Selection & Cataloging


Traditionally, analog collections have been built for the convenience or interests of their owners. In many cases, what starts as a personal collection develops a broader purpose beyond its original intended use. In either case, when planning the digitization of a collection, the perspective of the end user is critical. "Who will want to use this collection?" and "How will users search this collection?" are essential questions to address early in the planning process. Your eventual audience may be wider than the one you anticipate, and the better you provide for the users of your project (both current and future), the more effective it will be.


When defining the scope of the project, choose a realistic, manageable number of audio media. Cull each audio collection for duplicates, audio of poor quality, inappropriate content, and content protected by copyright.

In some instances, even though you have a number of audio media that you wish to convert, acquiring the content through license or purchase may be a more efficient way to develop digital resources than digitizing a collection yourself. In many cases, digitizing commercially produced audio media is not permitted, even if they are out of production or no longer available. The College Library acquires substantial audio resources, for example, and other licensing and vendor options may provide alternatives to "doing it yourself." IT can provide advice and refer you to Library staff who are knowledgeable about Library resources and copyright.

Intended use of audio

How your digital audio will be used is crucial to the planning process. Consider uses over the long term, since the digitization and/or capture specifications that you choose will set the upper limit on the output quality of your audio.

Commercial production, classroom listening, Web download, and streaming or hosting all require choices about file format, encoding, and compression. By anticipating the end use before the audio is created or digitized, you can choose specifications that yield satisfactory audio quality without squandering resources such as time and network storage.
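
The storage side of this trade-off can be estimated with simple arithmetic. The sketch below is illustrative only: the 44.1 kHz, 16-bit stereo settings and the 128 kbps bitrate are assumed example values, not specifications recommended by this guide, and the function names are hypothetical.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Size of uncompressed PCM audio data, ignoring container headers."""
    return seconds * sample_rate_hz * (bit_depth // 8) * channels


def compressed_size_bytes(seconds, bitrate_kbps=128):
    """Approximate size of a constant-bitrate compressed file."""
    return seconds * bitrate_kbps * 1000 // 8


one_hour = 60 * 60
# One hour of uncompressed 44.1 kHz / 16-bit stereo audio, in megabytes
print(pcm_size_bytes(one_hour) / 1e6)         # 635.04
# The same hour encoded at a constant 128 kbps, in megabytes
print(compressed_size_bytes(one_hour) / 1e6)  # 57.6
```

Multiplying either figure by the number of hours in a collection gives a rough storage budget for masters and for access copies.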

Quality Control

Quality control is one of the most critical aspects of any digitization process and in fact applies throughout the life of recorded audio. Workflows need to be identified and established to ensure consistency in audio capture (conversion, digital recording, purchase of audio media), metadata and controlled vocabularies, and audio editing. Although originally written to address digital image quality control, the following selection from the Getty Research Institute's Introduction to Digital Imaging applies equally well to quality control as it relates to digital audio:

"Quality control must also be applied to all access files derived from master images and to all preservation copies made, on whatever media, as mistakes and technical errors can often be introduced during the process of duplication or migration. Files should be checked to ensure that all are correctly named, not corrupted, and so on."

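The file checks described in the passage above can be partly automated. The following is a minimal sketch, assuming a hypothetical naming convention (letters, digits, underscores or hyphens, with a .wav or .mp3 extension) and assuming that a preservation copy should be byte-for-byte identical to its master. Derived access files in a different format will not match the master's checksum and need a different check.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical naming convention; adjust to your project's own rules.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9_-]+\.(wav|mp3)$")


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_copy(master, copy):
    """Return a list of problems found for one master/copy pair."""
    master, copy = Path(master), Path(copy)
    problems = []
    if not NAME_PATTERN.match(copy.name):
        problems.append(f"unexpected file name: {copy.name}")
    if not copy.exists():
        problems.append(f"missing copy: {copy}")
    elif sha256_of(master) != sha256_of(copy):
        problems.append(f"checksum mismatch: {copy}")
    return problems
```

An empty list means the copy passed all three checks. Running the same function after each duplication or migration gives a simple, repeatable record of file integrity.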

Cataloging digital audio promotes efficient retrieval and increases the likelihood that the digital collection will continue to be accessible and useful over time. Cataloging is best accomplished by creating a database of records that describe each audio file (metadata). If the collection contains or will eventually contain more than 50 audio files, cataloging is necessary.

Metadata reflect information about your audio files that can be represented in a database. These data can include very simple descriptive information like the creator, title, or topic of a particular audio recording; technical information about a digital audio file like file name, file format, etc.; and management information like rights of use, ownership, and copyright.

These metadata descriptors make it possible to search and retrieve specific audio files, and they help in the management and organization of groups of digital audio files. Assigning and entering these data into a database consistently will allow you and other users to organize and search your audio collections efficiently.
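
As an illustration of how consistent metadata supports search, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, file names, and creator and rights values are all hypothetical; only the two titles come from the examples in this guide.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE audio_records (
           media_identifier TEXT PRIMARY KEY,  -- technical metadata
           title            TEXT NOT NULL,     -- descriptive metadata
           creator          TEXT,              -- descriptive metadata
           digital_format   TEXT,              -- technical metadata
           ip_rights        TEXT               -- management metadata
       )"""
)

records = [
    ("concert_2003.mp3", "Bowdoin Concert Band performance Spring 2003",
     "Example performer", "mpeg", "Example rights statement"),
    ("greenland_1947.mp3", "Inuit in South Greenland, 1947",
     "Example producer", "mpeg", "Example rights statement"),
]
conn.executemany("INSERT INTO audio_records VALUES (?, ?, ?, ?, ?)", records)


def find_by_title(connection, text):
    """Return identifiers of records whose title contains the given text."""
    rows = connection.execute(
        "SELECT media_identifier FROM audio_records "
        "WHERE title LIKE ? ORDER BY media_identifier",
        (f"%{text}%",),
    )
    return [row[0] for row in rows]


print(find_by_title(conn, "Greenland"))  # ['greenland_1947.mp3']
```

The search only works as well as the data entry: a title or creator entered inconsistently across records will not be found by a single query.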

Metadata Standards

To promote reliable searching and basic media and file management practices, Bowdoin College recommends the Public Broadcasting Core (PBCore) descriptive metadata standard for sound and moving images. This standard is based on Dublin Core (see Bowdoin recommended metadata standards for digital still images) but is more complex: it incorporates more technical fields and can describe multiple physical and digital formats of the same media item. The PBCore standard allows for a wide range of detail, from very simple to extremely specific. Using it helps ensure consistency and reliability in indexing, retrieving, and managing data, both for immediate applications and over time.

See for more detailed descriptions and examples.

Descriptive and administrative metadata

| Field | Required?* | Definition | Examples | PBCore element | DC element |
|---|---|---|---|---|---|
| Metadata record ID | M (PBC M) | Unique ID of record of metadata descriptions for a media item | Record ID in digital asset management system; may be automatically generated | identifier (01.01) | No equivalent |
| Metadata record creator | M (PBC M) | Agency, institution or individual that assigns Metadata record ID | Bowdoin College Library | identifierSource (01.02) | No equivalent |
| Date created | M | Date of capture/creation | recording date, digitization date | dateCreated (25.01) | date |
| Creator or Author | MA | An entity responsible for the creation of the work | performer; producer | creator (15.01) | creator |
| Title | M | Formal name given to the work or brief caption for untitled work | Bowdoin Concert Band performance Spring 2003; Inuit in South Greenland, 1947 | title (02.01) | title |
| Administrative Control | M | An entity responsible for the content of the work | A dept. or office that has custody of the work | rightsSummary (18.01) | rights |
| IP Rights | MA | Information about the copyright and related intellectual property rights | Name of copyright holder; terms defining use and governing access to the work | rightsSummary (18.01) | rights |
| Description | M (PBC M) | Brief summary of the content of the work | Free text narrative of the contents and/or context of the media | description (4.01) | description |
| Keyword | O | Topic of the content | Topical keywords or controlled vocabulary | subject (3.01) | subject |
| Publisher | O | An entity responsible for making the work available | A person; a dept. or office; the College | publisher (17.01) | publisher |
| Use | O | Intended audience | Class no.; project name | audienceLevel (18.01) | audience |
| Location | M (PBC M) | Location of media | URL; shelf location | formatLocation (25.05) | No equivalent |

Technical metadata

| Field | Required?* | Definition | Examples | PBCore element | DC element |
|---|---|---|---|---|---|
| Media Identifier | M (PBC M) | Unique identifier of media item such as file name or call number | BA583; filename.mp3 | formatIdentifier (25.25.1) | identifier |
| Media Identifier Source | M (PBC M) | Agency, institution or individual that assigns Media ID | Bowdoin College. IT | formatIdentifierSource (25.25.2) | No equivalent |
| Physical Format | M | Physical medium | CD, DVD-RW, Digital Audio Tape | formatPhysical (25.03) | format - medium |
| Digital format | M | Identify format of media item as it exists in digital form | mpeg; RealAudio | formatDigital (25.04) | format |
| Media type | M | General descriptor of the kind of media | Sound | formatMediaType (25.06) | type |
| Generation of Media | MA | original, master, copy | Preservation master, original recording | formatGenerations (25.07) | format |
| Format Standard | M | Identify larger system/technical standard within which media exists | MPEG, Quicktime | formatStandard (25.08) | relation – conformsTo |
| Format Encoding | M | How information in media item is compressed, interpreted or formulated | MPEG-1; Real Media | formatEncoding (25.09) | format |
| Tracks | O | Number of tracks present | 3 audio tracks | formatTracks (25.20) | format – extent |
| Channel Configuration | O | Arrangement or configuration of specific channels or layers of information | Stereo, monaural | formatChannelConfiguration (25.21) | format |
| Recording equipment | MA | Equipment or software used to create media | Make and model of recording equipment; name and version of software | Local use – no equivalent | Local use – no equivalent |


* MA = Mandatory if applicable
PBC M = PBCore mandatory
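
A record can be checked against the field list before it is saved. This is a minimal sketch that reads "M" in the Required? column as mandatory; the function name and the dictionary layout are assumptions for illustration, not part of PBCore.

```python
# Fields marked "M" in the table above, read here as always mandatory.
MANDATORY_FIELDS = [
    "Metadata record ID", "Metadata record creator", "Date created", "Title",
    "Administrative Control", "Description", "Location", "Media Identifier",
    "Media Identifier Source", "Physical Format", "Digital format",
    "Media type", "Format Standard", "Format Encoding",
]


def missing_mandatory(record):
    """Return the mandatory field names that are absent or blank in record."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


record = {
    "Title": "Inuit in South Greenland, 1947",
    "Media type": "Sound",
    "Digital format": "mpeg",
}
# Lists every mandatory field that still needs a value.
print(missing_mandatory(record))
```

Fields marked "MA" are left out of the check because whether they apply depends on the item, which a cataloger has to judge.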

Cataloging Systems

Currently, faculty and staff use the following database-driven systems to manage metadata for collections of digital assets: Insight Luna, Extensis Portfolio, BPress Digital Commons, and Microsoft Expression Media (formerly iView MediaPro). However, some projects may not allow time to develop the skills and cataloging workflow for these specific programs. Even so, you should record the metadata associated with each digital asset in some manner. Databases such as Microsoft Access (PC) or FileMaker Pro (Mac) can store this data. Even Microsoft Excel can be used to record metadata, with each column header representing a metadata element and each row representing one descriptive record. Whatever system you use, it is important to back up the database regularly.
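
The spreadsheet layout described above (one column per metadata element, one row per record) can also be kept as a plain CSV file, which Excel opens directly. The sketch below is an assumption-laden illustration: the column selection, file names, and helper functions are hypothetical, and the backup step is a simple file copy rather than a full backup strategy.

```python
import csv
import shutil
import tempfile
from pathlib import Path

# Each column header is one metadata element; each row is one record.
COLUMNS = ["Media Identifier", "Title", "Creator or Author",
           "Date created", "Digital format"]


def write_catalog(path, records):
    """Write records (a list of dicts keyed by COLUMNS) to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(records)


def read_catalog(path):
    """Read the CSV file back as a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def back_up(path, backup_dir):
    """Copy the catalog file into backup_dir and return the new path."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    return shutil.copy2(path, backup_dir / Path(path).name)


with tempfile.TemporaryDirectory() as workdir:
    catalog = Path(workdir) / "catalog.csv"
    write_catalog(catalog, [{
        "Media Identifier": "filename.mp3",
        "Title": "Inuit in South Greenland, 1947",
        "Creator or Author": "Example producer",
        "Date created": "1947",
        "Digital format": "mpeg",
    }])
    copy_path = back_up(catalog, Path(workdir) / "backup")
    print(read_catalog(copy_path)[0]["Title"])  # Inuit in South Greenland, 1947
```

In practice the backup directory should be on a different drive or network location from the working copy.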