Definition of Metadata
Metadata is structured information about an object. It describes several facets of a digital object, such as its general subject matter, specific content, creator(s), copyright status, technical specifications, and so on. A complete set of metadata about a digital object may require several individuals to collaborate, each supplying one of the different kinds of metadata that together make up a comprehensive, high-quality characterization of the object.
Importance of Metadata
Since many of the objects to be submitted to the Hamilton College Library Digital Collections will be images rather than texts, we cannot rely on full text as the basis for searching. For a search engine to find an image, the image has to be accompanied by metadata that describes the object, preferably with consistent terminology. Even objects that are strictly text (such as Word documents and PDF files) benefit from accurate, structured metadata that specifies the precise subject matter covered. Metadata also makes it possible to deliver digital objects sorted by author, date, subject, geographic and temporal coverage, and even file format.
The Metadata Scheme
The Hamilton College Library Digital Collections runs on software called CONTENTdm, which stores its metadata in a scheme known as Dublin Core. The Dublin Core metadata scheme was designed as a simplified cataloging framework for describing digital objects on the Web. It consists of 15 metadata elements and prescribes as few formal data entry rules as possible. However, to facilitate accurate indexing, and thus useful delivery to the end user, these basic data entry rules need to be followed carefully. They can be found in the Data Dictionary.
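For orientation, the 15 elements of the Dublin Core Metadata Element Set can be written out as a simple list. The sketch below is illustrative only: the sample record and the helper `unknown_elements` are hypothetical, and the field labels actually configured in CONTENTdm for these collections may differ from the bare element names.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical record: every value is invented for illustration.
record = {
    "title": "Example title",
    "creator": "Doe, Jane",
    "date": "1905",
    "type": "StillImage",
    "rights": "Copyright Jane Doe",
}


def unknown_elements(rec):
    """Return the keys of rec that are not Dublin Core element names."""
    return sorted(k for k in rec if k.lower() not in DUBLIN_CORE_ELEMENTS)
```

A record does not need to use all 15 elements; the check only flags names that fall outside the set.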
The Digital Collections Metadata Requirements
We expect that most members of the Hamilton community who submit digital objects to the Hamilton College Library Digital Collections will be subject experts, and thus the people best qualified to describe the content of their objects (and the characteristics of works of art), but not necessarily experts in cataloging or technology. We therefore expect submitters only to describe the content of the objects they submit and to follow a few general formatting guidelines for that information. Other kinds of metadata will be added later, as necessary, by experts in cataloging and technology.
In order to simplify the process of submitting digital objects to the Hamilton College Library Digital Collections we require that Data Providers submit only a few basic metadata elements. However, we encourage all Data Providers to submit as much metadata as they can and to try to fill in values for more elements as they become comfortable with the Hamilton College Library Digital Collections and as they realize just how useful rich metadata is to the user.
Every object submitted to the system must be accompanied by the following metadata elements at the time of submission.
- Title - Formal title
- Creator - Main person or entity who created the content of the original object
- Tags - Keywords or phrases associated with the content of the item
- Object Class - General type of work
- Type (DCMI Type) - How the content is delivered (or presented), e.g. text, sound, movingImage
- Digital Collection - Name(s) of any group(s) of Digital Objects of which this is a member
- Rights - Copyright statement that identifies the copyright holder by name
- Access Rights - Rights and terms of access for resources in this collection
- Submitter - Name of the department or unit for which this object is being submitted
All of these fields are required. The Hamilton College Library Digital Collections Administrator will review the metadata and contact the Data Provider if any changes are needed.
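A Data Provider preparing records in bulk might check the required elements before submitting. The following is a minimal sketch, assuming each record is held as a Python dictionary keyed by the field labels listed above; the function name `missing_required` and that record layout are assumptions for illustration, not part of the submission system.

```python
# Required field labels, taken from the list above.
REQUIRED_FIELDS = (
    "Title", "Creator", "Tags", "Object Class", "Type",
    "Digital Collection", "Rights", "Access Rights", "Submitter",
)


def missing_required(record):
    """Return the required field names that are absent or blank in record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        # Treat a missing key, None, or whitespace-only text as blank.
        if value is None or not str(value).strip():
            missing.append(name)
    return missing
```

An empty result means every required element has a value; it says nothing about whether the values are well formed, which the Administrator still reviews.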
Controlled Metadata Elements
Certain fields are indexed and the index is used to generate result sets in response to user searches. To ensure accuracy and completeness of the result sets, the values in these fields need to be consistently formatted. It may be necessary for a cataloger to revise the Data Provider's submissions for the fields that specify a controlled vocabulary in order to meet the specific searching and sorting needs of the project.
Data Provider's Checklist
Terminology
Terms Used
VRA (Visual Resources Association): Alternative Terminology
RDA (Resource Description and Access): Alternative Terminology
General Metadata Cleanup
(Updated: November 2, 2010, pjm)
- Be sure you fill in all required fields.
- Do not put anything in a field that is not appropriate for that field. Put any private comments in the Internal Notes field.
- Keep in mind that some metadata fields apply to the original object and others apply to only the digital representation of it. Assume you are describing the original object unless it is obvious that the element is designed for the digital version.
- For example, the "Date" field should hold the date of the original object, not the date it was digitized or the date the metadata was created.
(The original vs. digital dichotomy does not, of course, apply to born-digital objects.)
- Do not use HTML elements in the metadata. The only exception is the <br/> element which may be used in longer free-form fields to mark the break between paragraphs.
- Encode the following characters:
Character Entities
Character | Encoding
& | &amp;
< | &lt;
> | &gt;
" | &quot;
- Abbreviations are allowed in the Title field if they appear in the title on the original object. In all other fields, avoid abbreviations unless they are very common ones such as "Dept.," "Mr.," "St." (for "Saint" or "Street"), and so on. States may be abbreviated if the controlled vocabulary uses the abbreviations or in mailing addresses, but elsewhere they should be expanded to their full names.
- Be consistent in all fields, especially in the formatting of names and dates.
- In the spelling of tags (keywords)
- Use the plural forms of all tags
- Use both the singular and the plural if their formation is quite different (e.g. "buggy" and "buggies").
- Do not fill in fields with "unknown" or "N/A." Just leave them blank if you have nothing to enter.
- To enter letters with diacritics, you must use the Latin-1 character set (see the ISO 8859-1 (Latin-1) Characters List).
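The character encodings called for in the checklist can be applied mechanically. The sketch below is one possible approach, assuming the input is raw text that has not already been entity-encoded (already-encoded text would be encoded twice); the function names are hypothetical. It leaves `<br/>` breaks in place, since that is the one HTML element the checklist allows.

```python
import re

# Matches the one permitted HTML element, in any of its common spellings.
_BR = re.compile(r"<br\s*/?>", re.IGNORECASE)


def encode_entities(text):
    """Encode &, <, > and " as character entities."""
    # The ampersand must go first so the other entities are not re-encoded.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text.replace('"', "&quot;")


def encode_field(text):
    """Encode a free-form field, preserving <br/> paragraph breaks."""
    return "<br/>".join(encode_entities(part) for part in _BR.split(text))
```

For example, `encode_field('Smith & Sons "Annual"')` yields `Smith &amp; Sons &quot;Annual&quot;`.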
Terminology
Terms Used
- The original "Content" consists of the conceptual or physical entities referred to or depicted by a Work (e.g., the subject matter of a photo or a book, or the places shown on a map).
- The "Work" is the physical medium used to carry or convey the Content (e.g., a photograph, a slide, a map, or a video).
- The "Digital Object" may be an electronic reproduction of an original non-digital work or it may have been "born digital."
VRA (Visual Resources Association): Alternative Terminology
- Work = the physical representation of intellectual content (e.g., a sculpture)
- Image = the visual representation of the work (e.g., an analog photo of the sculpture)
- d-image = the digital representation of the image (e.g., a digital camera image taken directly of the sculpture, or a digital image created from the analog photo of the sculpture)
RDA (Resource Description and Access): Alternative Terminology
- Work: a distinct intellectual or artistic creation
- Expression: the intellectual or artistic realization of a work
- Manifestation: the physical embodiment of an expression of a work
- Item: a single exemplar of a manifestation
General Metadata Cleanup
- Remove metadata fields with no information (e.g., "none," "unknown").
- Check for data in wrong fields.
- Check spelling.
- Remove duplicate information.
- Resolve UTF-8 encoding problems (ampersands, etc.).
- Remove disallowed HTML tags.
- Remove extraneous white-space.
- Supply a semicolon plus space between separate entries.
- Put a semicolon after the last word of every field.
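Several of the cleanup steps above lend themselves to a small helper. The following is a rough sketch, assuming the separate entries of one field are available as a list of strings; the name `clean_field` and the placeholder set are assumptions. It reads "a semicolon after the last word of every field" as a single terminating semicolon, which may not match how every field is handled in practice.

```python
# Values that mean "no information" and should be dropped (assumed set).
PLACEHOLDERS = {"none", "unknown", "n/a"}


def clean_field(entries):
    """Normalize the entries of one field to the form 'entry; entry;'."""
    cleaned = []
    for raw in entries:
        # Remove extraneous white-space inside and around the entry.
        entry = " ".join(raw.split())
        # Skip empty entries and placeholder values.
        if not entry or entry.lower() in PLACEHOLDERS:
            continue
        # Remove duplicate information, keeping the first occurrence.
        if entry not in cleaned:
            cleaned.append(entry)
    if not cleaned:
        return ""  # nothing to record: leave the field blank
    # Semicolon plus space between entries, semicolon after the last one.
    return "; ".join(cleaned) + ";"
```

For example, `clean_field(["  buggies ", "unknown", "buggies", "horse   carts"])` returns `buggies; horse carts;`. Spelling and misplaced data still need a human check.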
(Reviewed: September 27, 2010)