Basic Digitization Terminology (D1.b)

General Digitization Terminology
Resolution Terminology (Size, Mega Pixels)
Color Terminology
File Format Terminology
Compression Terminology
Image Metadata Terminology
Media Terminology
Other Glossaries

General Digitization Terminology

Checksum

The checksum is a number based on the digital content of a digital object. A unique checksum value is generated for every electronic file and any change in the file would result in an alteration of the checksum value. Therefore, the checksum value is often used by serious preservation systems to detect if the data has been altered. If it has, then a previous copy of the file with the correct checksum should be found to replace the changed version.
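
As an illustration of the check described above, the following minimal sketch computes a checksum with Python's standard hashlib module. The choice of SHA-256 and the function name sha256_checksum are assumptions made for the example; a preservation system may use a different algorithm.

```python
import hashlib

def sha256_checksum(data: bytes) -> str:
    """Return the SHA-256 checksum of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"digital object contents"
altered = b"digital object contentz"  # a single byte has changed

# The checksum recorded when the object was deposited (hypothetical workflow).
stored = sha256_checksum(original)

print(sha256_checksum(original) == stored)  # an unchanged copy matches
print(sha256_checksum(altered) == stored)   # an altered copy does not
```

If the second comparison fails for a stored file, that copy should be replaced from a copy whose checksum still matches.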

Pixel

The term pixel is a computer abbreviation for picture element, the smallest element of a digital image. Pixels are the tiny dots that make up an image on a computer monitor. Bit depth is measured per pixel: it is the number of bits used to record the color or tone of each pixel.

Megapixel

A Megapixel is 1 million pixels. It is used to express the number of sensor elements of digital cameras, the number of display elements of digital displays, as well as the number of pixels in an image.

Reflective Scanning

A scanning method whereby light is bounced off the scanned object to capture the image – typical of scanning photographs and documents.

Transmissive Scanning

A scanning method whereby light is passed through the original to capture the image – typical of scanning slides and other transparencies.

Resolution Terminology (Size, Mega Pixels)

DPI

Dots per inch is the unit of measurement for the resolution of printers. It is the number of dots a printer can print per inch, and thus it can also be used to refer to the pixel density of the printed version of an image. It is also used, less accurately, to measure the resolution of scanners (see SPI) and monitors (see PPI). The more dots per inch, the higher the resolution: 600 dpi means 600 x 600 = 360,000 dots per square inch. Resolution is a key factor in the conversion of an analog object to a digital one. The more samples per inch taken by the scanner, the larger the image will appear on the monitor (ppi) and the more dots per inch will be printed on the printer (dpi).

Dynamic Range

Dynamic Range is a measurement of the number of bits (1, 8, 16, 24, or 48) used to represent each pixel in a digital image. It is the range or levels from lightest to darkest (shadows and highlights). 8-bit color or 8-bit grayscale means that each pixel can be one of 256 shades of color or one of 256 shades of gray. 24-bit color means that each pixel can be one of 16.8 million colors. (In digital audio production it is the range from quietest to loudest in a sound file.) In order to increase the dynamic range captured during digitization of an object, some scanners offer a bit width higher than the usual 24 bits. For example, a 30-bit scanner increases the number of steps per color from 256 to 1024.

Histogram

A histogram is a diagram that illustrates how light and dark pixels are distributed across an image. Shadows (dark pixels) show up on the left side of the histogram, midtones in the middle, and highlights on the right side. In Photoshop, changes can be made in each of the areas to adjust the tonal range across the image to improve its appearance. An image with full tonal range has a high number of pixels in all areas, without gaps.
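
The counting behind a histogram can be sketched in a few lines. This is a simplified illustration, not how Photoshop computes its display: the function name histogram, the sample values, and the choice of four bins are all assumptions for the example.

```python
def histogram(pixels, bins=4):
    """Count 8-bit gray values (0-255) into equal-width bins, dark to light."""
    counts = [0] * bins
    width = 256 / bins
    for value in pixels:
        # Clamp to the last bin so that 255 never falls outside the list.
        counts[min(int(value / width), bins - 1)] += 1
    return counts

# Hypothetical pixel values from a small grayscale image.
sample = [0, 10, 70, 130, 140, 200, 250, 255]
print(histogram(sample))  # [2, 1, 2, 3]: shadows first, highlights last
```

A bin with a count of zero would correspond to a gap in the tonal range.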

Image Resolution

Image Resolution can be measured in several ways. Spatial resolution is one common measure, and it depends on the properties of the device through which the image is being delivered. A computer monitor has a spatial resolution of 72 to 100 lines per inch, which corresponds to a pixel resolution of 72 to 100 ppi. A printer, being a completely different physical device, has a different spatial resolution from a monitor.

Interpolation

This is a technique used by scanners or digital editing software to increase the size of a digital image by increasing the number of pixels it contains. It accomplishes this by averaging the color values of adjacent pixels and then inserting new pixels of this value between the existing ones. This technique smooths out the transitions between adjacent pixels, thus avoiding the "jaggies," but sometimes requires the user to apply some minor "sharpening" to restore the crispness of the image. However, interpolation should be avoided. It is preferable to capture more pixels during scanning by increasing the resolution, but be sure not to scan at a resolution that exceeds the scanner's maximum optical (non-interpolated) resolution.
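
The averaging step can be sketched for a single row of grayscale pixels. This is a deliberately simple linear interpolation with a hypothetical function name (interpolate_row); scanners and editing software generally use more elaborate methods such as bicubic interpolation.

```python
def interpolate_row(row):
    """Insert the average of each adjacent pair between the original pixels."""
    if not row:
        return []
    out = [row[0]]
    for left, right in zip(row, row[1:]):
        out.append((left + right) // 2)  # new, invented pixel
        out.append(right)                # original pixel
    return out

print(interpolate_row([0, 100, 200]))  # [0, 50, 100, 150, 200]
```

The values 50 and 150 were never captured from the original object, which is why interpolated pixels add size but no real detail.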

Resampling (resizing)

Resampling refers to changing the number and values of pixels in an image. It creates a new image with new pixel dimensions by using the original image pixels as the basis from which to work out the values for each new pixel. Resampling that creates a larger image than the original involves interpolation and should be avoided. It is always preferable to scan an image at the size needed than to try to increase the size or resolution later. Resampling an image to create a smaller one also involves interpolation, but it usually does not degrade image quality.

Spatial Resolution

Resolution is the number of pixels (in both height and width) making up an image. It represents the level of detail in a digital device or object. For example, the higher the resolution, the more pixels there are in an image, and the more pixels there are, the greater its clarity and definition (and the larger the file size). For computers and digital cameras, resolution is measured in pixels; for scanners, resolution is measured in samples per inch (spi) or dots per inch (dpi); for printers, resolution is measured in dots per inch (dpi).

If you are going to print an image for publication, the image will need a higher resolution (more pixels) than if you were just going to display it on a computer screen because printers compact the pixels of an image into a high number of dots per inch. Your image will appear too small if it doesn't have enough pixels.
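
The relationship between pixel dimensions and printed size is a simple division. In the sketch below, the function name print_size_inches and the 3000 x 2400 pixel image are assumptions chosen for the example.

```python
def print_size_inches(width_px, height_px, dpi):
    """Return the (width, height) in inches of an image printed at the given dpi."""
    return width_px / dpi, height_px / dpi

# The same 3000 x 2400 pixel image at a print resolution and a screen resolution.
print(print_size_inches(3000, 2400, 300))  # (10.0, 8.0)
print(print_size_inches(3000, 2400, 72))   # much larger, but far coarser
```

At 300 dpi the image fills only 10 x 8 inches, which is why an image with too few pixels appears small in print.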

Resolution is defined somewhat differently for a digital image, a monitor, scanner, digital camera or a printer. (see http://en.wikipedia.org/wiki/Resolution .)

PPI

Pixels per inch (PPI) is the unit of measurement used to describe the physical resolution of a computer display screen. It can also sometimes be used to describe the resolution at which an operator wants a scanner to capture data, although using SPI would be more accurate. The PPI of a monitor is a fixed figure, so a monitor that can show 72 ppi will show a 600 ppi image at only 72 pixels per inch. This means that each inch of the image will need about eight screen inches (600 / 72) to display fully. PPI is sometimes used synonymously with DPI and SPI, but only loosely so.
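
The ratio between image resolution and monitor resolution can be worked out directly. A minimal sketch, in which the function name screen_inches is hypothetical:

```python
def screen_inches(image_ppi, monitor_ppi, image_inches=1.0):
    """Screen inches needed to show image_inches of an image at full size."""
    return image_inches * image_ppi / monitor_ppi

# One inch of a 600 ppi scan on a 72 ppi monitor.
print(round(screen_inches(600, 72), 2))  # 8.33
```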

Printer Resolution

The number of individual dots of ink or toner a printer can produce within a unit of distance (e.g., dots per inch).

SPI

Samples per inch (SPI) is a measurement of the resolution of an image scanner, in particular the number of individual samples that are taken in the space of one inch. It is sometimes used synonymously with DPI and PPI.

Color Terminology

Bi-tonal Images

The bitonal specification for images is also known as "line art" or "black and white." It was once used for scanning printed text or high-contrast graphics, and although it has lost favor as a scanning technology, it is still sometimes useful as a display format because the files are so small that they are quick to transfer over the network to the user. In a bi-tonal image, the bit depth is 1; that is, each pixel (bit) is either black or white, and no shades of gray are represented.

Bit-Depth

Bit Depth is the range (or tonal levels) of color or shades of gray found in an image. The greater the bit depth is, the greater the depth of color, and the larger the color (or grayscale) palette (number of colors). In other words, the higher the bit-depth is, the greater the subtlety of color or gray.

a 1-bit image is black and white (bitonal)
a 4-bit grayscale image has 16 tonal levels of gray
an 8-bit grayscale image has 256 tonal levels of gray
an 8-bit color image has 256 tonal levels of color
a 16-bit color image has 65,536 tonal levels of color
a 24-bit image has about 16.8 million (16,777,216) possible colors (that is, three 8-bit channels: 8,8,8, which yields 256 shades of red, 256 shades of green, and 256 shades of blue).

(See also the RGB color model). 48-bit images are possible and provide the most color information about an object, but most software and hardware are not capable of displaying data in 48-bit depth. It is ideal for archival purposes, but 24-bit depth is still preferred (as of May, 2007).
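
The number of tonal levels at each bit depth is 2 raised to the number of bits. A minimal sketch of that arithmetic (the function name tonal_levels is hypothetical):

```python
def tonal_levels(bits):
    """Number of distinct values a pixel can take at the given bit depth."""
    return 2 ** bits

for bits in (1, 4, 8, 16, 24):
    print(f"{bits}-bit: {tonal_levels(bits):,} levels")
```

For a 24-bit image this gives 16,777,216, the product of 256 x 256 x 256 for the three 8-bit channels.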

CMYK (cyan, magenta, yellow, and key/black) is a color model used in color printing. The CMYK color model is divided into four channels of color and the colors are manipulated (using a "subtractive" method) as needed to form the full range of colors. Since CMYK and RGB have different gamuts, the color of an image as it appears on a computer monitor (RGB) must be converted to the CMYK equivalent in ink colors before it can be printed accurately.
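
The idea of converting between the two models can be sketched with the textbook arithmetic below. This is a naive, device-independent formula and an assumption for illustration only: accurate conversion for print relies on ICC color profiles, precisely because the gamuts differ. The function name rgb_to_cmyk is hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion of 8-bit RGB values to CMYK fractions (0.0 to 1.0)."""
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0  # pure black uses only the key channel
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    k = min(c, m, y)  # the gray component shared by all three inks
    return tuple((v - k) / (1 - k) for v in (c, m, y)) + (k,)

print(rgb_to_cmyk(255, 0, 0))      # pure red: magenta and yellow ink, no cyan
print(rgb_to_cmyk(255, 255, 255))  # white: no ink at all
```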

Color Calibration

Color calibration is the act of adjusting the color of one device relative to another. You can calibrate a monitor to a printer or to a scanner. It may also be the process of adjusting the color of one device to some established standard. For example, you can calibrate your monitor by using software or a hardware device that compares the color spectrum of the monitor with a standard color spectrum.

Color Calibration Card or Target

A calibration target is a card that shows a matrix or spectrum of colors that are set to a known standard. Color Cards can provide reference points to ensure accuracy of color capture and to calibrate output devices. We recommend the use of Kodak Color Separation Guide and Gray Scale (Small) cat 152 7654.

Color Channels

Color channels are separate color components used by the various color spaces/models. For example, RGB images have three channels: red, green, and blue; CMYK images have four: cyan, magenta, yellow, and key/black.

Color Model (Color Space)

A color space is a three-dimensional representation of the colors that can be discerned and/or created by a particular color model. RGB is a color model that has a color space of red/green/blue, and CMYK has cyan/magenta/yellow/key (black). The term may also refer to the range of possible colors that can be produced by a particular output device, such as a monitor or color printer.

Gamma

Gamma is commonly used to describe the way contrast and brightness are distributed across the intensity spectrum of a monitor, printer, or scanner. A perfect gamma of 1.0 would plot on a graph called a "tone curve" as a straight line. Although a scanner is fairly linear, the tone curve of a monitor or printer is bent, yielding a gamma in the range of 1.8 to 2.6, which affects midrange tones. Depending on the device, the gamma may have a significant effect on the way colors are perceived. This is why it is important to set the gamma for your monitor appropriately for your operating system. The recommended gamma for a PC monitor is 2.2 and for a Mac monitor is 1.8.
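
A power-law tone curve of the kind described above can be sketched as follows. The function name apply_gamma and the direction of the correction (raising to 1/gamma) are assumptions for illustration; real calibration software works from measured device profiles.

```python
def apply_gamma(value, gamma):
    """Map an 8-bit value through the power-law curve out = in ** (1 / gamma)."""
    normalized = value / 255
    return round(255 * normalized ** (1 / gamma))

for gamma in (1.0, 1.8, 2.2):
    # A gamma of 1.0 leaves the midtone unchanged; higher values lighten it.
    print(gamma, apply_gamma(128, gamma))
```

Black (0) and white (255) are unaffected at any gamma; only the tones in between shift, which is why gamma mainly affects midrange tones.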

Gamma is also used in other ways in digital photography. For explanations of these other uses see Gamma Terminology and Calculations (Accurate Image Manipulation)
http://www.aim-dtp.net/aim/calibration/gamma_terminology/gamma_termninolony.htm (sic.) (accessed Feb. 19, 2007)

Gamut

The subset of colors represented in an image. The color gamut is dependent on the color space it is based on (such as RGB, sRGB, or CMYK). Colors that cannot be displayed within a particular color space are said to be out of gamut. For example, some reds in the RGB color space gamut are out of gamut in the CMYK color space. Note that different output devices will vary in their capability of reproducing the true color gamut of an image.

Grayscale

Grayscale is a color space that displays decolorized images. Grayscale images use black, white, and a range of shades of gray. Displaying up to 256 shades of gray, grayscale is useful as a web display of black and white photographs. The larger the number of shades of gray, the better the image will look, and the larger the file will be. Grayscale images are created with a bit depth of 8.

Hue

The Hue of a color is its basic color and is described with names such as "red," "yellow," "green" or any other name.

Kodak Color Cards

See Color Calibration Card.

LAB (Lightness/A channel/B channel) is a color model designed to produce a color space that is more perceptually linear than other color spaces. This means that a change of the same amount in a color value should produce a change of about the same visual importance. The L, or lightness, channel controls how bright or dark a pixel is. The A channel controls the color by shifting between green and red colors. The B channel controls the color by shifting between blue and yellow colors.

RGB (short for Red-Green-Blue) is a color model that contains a full range of colors. RGB images are usually created using 24-bit depth, which produces up to 16.8 million colors. Photoshop refers to RGB values as channels. Thus, an 8-bit channel (8,8,8) is the same as 24-bit depth and a 16-bit channel (16,16,16) is 48-bit depth. The variety of RGB known as Adobe RGB (1998) is widely used for this purpose. RGB images can be converted to the sRGB color space, which uses a reduced color space designed specifically for viewing on computer equipment.

sRGB is a color space derived from the standard RGB color space. It was created especially for computer monitors (CRTs and LCDs), digital cameras, printers, and scanners. Most monitors and graphics cards support it by default. Because of its limited gamut, it is not appropriate for images that are to serve as archival versions or for professional printing. Use Adobe RGB (1998) for the TIFF archival versions, but when you save the file for Web delivery, the color space is often automatically reduced to sRGB.

Tone

The Tone of a color is its hue plus either gray or the opposite color to mute or tone down the color.

File Format Terminology

These are only a few of the many image file formats available.

BMP

Bitmap (BMP) images are made up of a two-dimensional "map" of pixel values that describe each bit based on color brightness. A BMP file can actually increase in size when it is compressed. This format is common to the Windows platforms, but not particularly transferable to others.

DNG

The Digital Negative (DNG) file format is a RAW image format developed by Adobe in an effort to standardize the many varieties of "raw" files produced by digital cameras from different manufacturers. All versions of Adobe's Photoshop and Lightroom released since 2004 support DNG. For compatibility reasons, convert DNG files to TIFF files before archiving them.

GIF

Graphics Interchange Format. This is a very common format for images on the web, usually for logos or illustrations, especially if black and white. GIF works best for images with only a few distinct colors, such as line drawings and simple cartoons, or for images with large areas of one or more colors. It is not suitable for grayscale or full-color images. The primary limitation of a GIF is that it only works on images with 8 bits per pixel or less, which means no more than 256 colors. This makes it unsuitable for most color photographs, which are best rendered at 24 bits per pixel. To store these in GIF format you must first convert the image from 24 bits to 8 bits. The conversion will result in a loss of data and a considerable degradation in quality. (See http://www.libpng.org/pub/png/png-sitemap.html#info .)

JPEG

This format is controlled by the Joint Photographic Experts Group. JPEG works well for displaying photographs, naturalistic artwork, and similar material on computer monitors. It may not be the best format to use for lettering, simple cartoons, or line drawings, or for use in a print publication. JPEG is a standard lossy image compression algorithm. It provides exceptional compression on both 8-bit grayscale and 24-bit color images. When saving a JPEG file, the user can balance the amount of compression against the desired file size. Lossy compression means that when the file is uncompressed only a part of the original information is still there, although the user may not notice any change. Every time you edit and re-save a JPEG file, it loses some information. Eventually, users will begin to see some change. Technically, the image files are actually called JFIF (JPEG File Interchange Format). (See http://www.jpeg.org/ .)

JPEG2000

JPEG2000 format is emerging as a standard for image compression. It provides much better image quality at smaller file sizes than JPEG does. Based on wavelet compression, JPEG2000 offers both lossless and lossy compression. JPEG2000 formats provide good image quality, even at very high compression ratios such as 80:1. JPEG2000 creates scalable image files, which means that no decompression is needed for reformatting. Other new functionalities include region of interest coding, improved error resilience, resolution scalability, random access or spatial scalability, and quality scalability. Web browsers require a plug-in to handle JPEG2000 files unless the Web server processes it before transmitting it to the user. (See http://www.jpeg.org/jpeg2000/ .)

ODF

The Open Document Format is an open, XML-based, ISO-approved standard for textual documents. It is meant to be a vendor-neutral standard for saving common office documents. ODF files can be viewed by applications including OpenOffice.org 2.0, StarOffice 8 and IBM Workplace. It is a competitor to Microsoft's OpenXML format.
(On ODF see http://opendocument.xml.org/ . For acceptance of ODF, see the ODF Alliance - http://www.odfalliance.org/ )

OOXML

Office OpenXML is a file format developed by Microsoft for representing Office documents of all kinds in XML. It was standardized as Ecma 376 in 2006 and ISO standardization is under consideration. It is a competitor of ODF. (For more, see the Open XML Community
http://www.openxmlcommunity.org/ .)

PDF

The Portable Document Format is a data format for representing documents with text, graphics and images in a manner that is independent of the original application software, hardware, and operating system used to create and view those documents. Viewing a PDF file requires that the user's computer have a PDF viewer such as Adobe Acrobat Reader installed on it. PDF files can be searched for text and can also contain structural information like a table of contents. (See http://www.adobe.com/products/acrobat/adobepdf.html .)

PDF/A

The Portable Document Format/Archival (ISO 19005-1:2005) is an electronic document file format for long term preservation. It is based on the PDF Reference Version 1.4 from Adobe Systems Inc. It is a subset of PDF, leaving out PDF features not suited to long-term archiving. The PDF/A format is fine for preserving the visual layout of an original file, but it is not necessarily an optimal solution for reuse and reformatting of the text content itself. For a more complete definition see

White Paper: PDF/A - The Basics - http://www.pdf-tools.com/public/downloads/whitepapers/whitepaper-pdfa.pdf
PDF/Archive (AIIM/ACM) - http://www.aiim.org/standards.asp?ID=25013 .
Document management — Portable document format — Part 1: PDF 1.7, First Edition 2008-7-1 (Adobe Systems)
http://www.adobe.com/devnet/acrobat/pdfs/PDF32000_2008.pdf (accessed January 8, 2010)
PDF/A-1, PDF for Long-term Preservation, Use of PDF 1.4 (Library of Congress)
http://www.digitalpreservation.gov/formats/fdd/fdd000125.shtml (accessed Sept. 3, 2007)
Guidelines for Creating Archival Quality PDF files, version 1.1. June 2006, by Carol Chou (Florida Digital Archives) -
http://fclaweb.fcla.edu/uploads/Lydia%20Motyka/FDA_documentation/PDFGuideline.pdf (accessed September 30, 2013)
Preserving the Data Explosion: Using PDF, April 2008, by Betsy A. Fanning (AIIM) -
http://www.dpconline.org/docs/reports/dpctw08-02.pdf (accessed June 26, 2008)

PNG

The Portable Network Graphics format is a file format for compressed images that, unlike the JPEG format, does not discard image data. It was developed as a patent-free replacement for GIF (Unisys held the patent on the LZW compression that GIF uses). It provides a number of improvements over the GIF format.

Like a GIF, a PNG file uses lossless compression. Because the compression is lossless, the compression level you choose trades file size against encoding time, not against image quality. Typically, an image in a PNG file can be 10 to 30% more compressed than in a GIF format. As with GIFs, you can make one color transparent, but with PNG you can also control the degree of transparency (this is also called "opacity"). Interlacing is supported and is faster in developing than in the GIF format. Images can be saved using true color as well as in the palette and grayscale formats. Since PNG's compression is fully lossless, and since it supports up to 48-bit truecolor or 16-bit grayscale, saving, restoring and re-saving an image will not degrade its quality, unlike standard JPEG. For more information see

PSD

This is Photoshop's native (proprietary) file format. This is great if you are working on an image, but terrible for distributing an image, since not everyone can afford Photoshop. It can be converted to most other file formats for reuse in non-Adobe applications.

RAW

The RAW file format is the untouched "raw" pixel information captured by the sensor of a digital camera. Many digital cameras automatically convert this RAW pixel information into a full color JPEG or TIFF image. Others preserve the RAW file and let the user import it into a photo editing program for conversion and editing. If you ever see a RAW file on a file system, it will have a filename extension based on the camera that created it. To ensure long-term accessibility, RAW files should be converted to 24-bit TIFF images. (For more about RAW files, see http://www.cambridgeincolour.com/tutorials/RAW-file-format.htm .)

TIFF (TIFF ITU T.6)

The Tagged Image File Format is the most widely supported lossless image format and has emerged as the standard archiving image file format. TIFF is the format to use for the highest quality image you intend to capture. TIFF image files optionally use compression, but it is the LZW lossless compression, which does not lose any pixels of data. All full-featured image capture and manipulation programs, such as Photoshop, can read or write TIFF files.

The TIFF format provides a large number of pre-defined, standardized tagged fields that are available for a range of metadata – primarily technical in nature. Most scanner control and image post processing software can be configured to automatically provide much of this metadata. Commercial applications and vendor-developed routines are available to support the input of metadata that cannot be supplied automatically.

In CONTENTdm we recommend that the ICC Profile (color space) be stored in the TIFF metadata. This is done by checking the "ICC Profile" box when saving the file in Photoshop. Most Web browsers do not have built-in TIFF support, so TIFF images need to be converted to some other format before they can be viewed in a Web browser. TIFF is a very good format if you need to print in publication quality. (See http://www.scantips.com/basics9t.html.)

TIFF 6.0 ITU G4

(TIFF Version 6.0 with ITU/CCITT Group 4 Compression)

This ITU compression takes TIFF as the input format and provides a compression factor of around 1:40. It is generally suitable only for digitizing bitonal text documents in which black ink is printed on a white page. It can be used to store 1-bit images as an archival format, because ITU Group 4 compression is acceptable for long-term storage, but 8-bit grayscale captures more texture from the document.

Compression Terminology

Compression

Compression is the reduction of image file size for processing, storage, and transmission. The quality of the image may be affected by the compression techniques used and the level of compression applied. There are two types of compression: Lossless Compression and Lossy Compression. In selecting a compression technique, it is necessary to consider the attributes of the original object. Some compression techniques are designed to compress text; others are designed to compress pictures.

Lossless Compression

Lossless compression is a method for reducing the size of a file without loss of information, achieved by storing data more efficiently. It works by removing repeated information in the binary code of the file. This kind of compression works better on some kinds of images than others. If an image has undergone lossless compression, it will be identical to the image before it was compressed. The TIFF, GIF, PNG and JPEG2000 image formats allow lossless compression. Lossless compression is appropriate to use for delivery images when the object is bitonal; other types of delivery images should probably use a lossy compression.
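
The round trip can be demonstrated with Python's standard zlib module, which implements the DEFLATE algorithm (the one PNG uses, not the LZW algorithm typical of TIFF). The sample data is a hypothetical row of mostly white pixels, chosen because repeated values compress well.

```python
import zlib

# A hypothetical row of 1,000 pixels: 900 white followed by 100 black.
pixels = bytes([255] * 900 + [0] * 100)

compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)

print(len(pixels), len(compressed))  # the compressed form is far smaller
print(restored == pixels)            # every byte is recovered exactly
```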

Lossy Compression

Lossy compression works by permanently throwing away or losing carefully selected data during the compression process. If an image that has undergone lossy compression is decompressed, it will differ from the image before it was compressed, even though the difference may be difficult for the human eye to detect. Depending on the amount of compression, a considerable visual loss can occur. The most common lossy compression format is the JPEG format which allows a range of compression options from high to medium to low. Repeatedly saving a JPEG file will result in continuing loss of information during each save process.
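
The principle of discarding data can be illustrated with simple quantization, which rounds each pixel value down to a coarser step so that fewer distinct values need to be stored. This is only an analogy and an assumption for illustration: JPEG itself applies quantization to frequency coefficients rather than directly to pixel values. The function name quantize is hypothetical.

```python
def quantize(pixels, step=16):
    """Round each 8-bit value down to a multiple of step, discarding fine detail."""
    return [(p // step) * step for p in pixels]

original = [3, 18, 77, 130, 254]
lossy = quantize(original)

print(lossy)              # [0, 16, 64, 128, 240]
print(lossy == original)  # False: the discarded detail cannot be recovered
```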

Lempel-Ziv-Welch (LZW)

LZW is a common lossless data-compression algorithm often associated with the TIFF file format.
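
The core of the algorithm can be sketched in a few lines: it builds a dictionary of byte sequences it has already seen and replaces each repeated sequence with a single code. This is a bare teaching sketch with hypothetical function names; the LZW variant used inside TIFF files adds variable-width codes and special clear and end codes that are omitted here.

```python
def lzw_compress(data: bytes) -> list:
    """Encode bytes as a list of LZW dictionary codes."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate            # keep extending a known sequence
        else:
            codes.append(table[current])   # emit the longest known sequence
            table[candidate] = len(table)  # remember the new sequence
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: list) -> bytes:
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]  # code defined by this very step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

codes = lzw_compress(b"ABABABA")
print(codes)                  # [65, 66, 256, 258]
print(lzw_decompress(codes))  # b'ABABABA'
```

Seven bytes become four codes, and decompression restores the input exactly, which is what makes the method lossless.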

Wavelet

A wavelet is a mathematical function useful in image compression. Wavelets can compress images to a greater extent than is generally possible with other methods. In some cases, a wavelet-compressed image can be as small as about 25 percent the size of a similar-quality image using JPEG. MrSID and JPEG 2000 are both wavelet compressed formats.

Image Metadata Terminology

Dublin Core

DC provides a simple and standardized set of metadata elements for describing online resources in ways that make them easier to find. Dublin Core is widely used to describe digital materials such as video, sound, image, text, and composite media like web pages. Dublin Core is defined by NISO Standard Z39.85-2001. DC can be stored inside an image (see XMP below), but it is typically stored in XML format, either in a database or in XML files. The metadata in CONTENTdm uses the Dublin Core schema stored in XML files. (See Dublin Core: http://dublincore.org.)
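
As a rough illustration of Dublin Core expressed in XML, the sketch below builds a small record with Python's standard xml.etree.ElementTree module. The namespace URI is the published Dublin Core element set namespace; the wrapping "record" element, the function name dublin_core_record, and the sample values are assumptions for the example and do not represent the layout CONTENTdm actually uses.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize elements with the dc: prefix

def dublin_core_record(fields):
    """Return an XML string with one Dublin Core element per field."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(record, encoding="unicode")

xml_text = dublin_core_record({
    "title": "Sample photograph",
    "creator": "Unknown",
    "format": "image/tiff",
})
print(xml_text)
```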

Exif

Exchangeable image file format is information (metadata) about an image that is added to the image automatically by most digital cameras. This includes metadata relating to the camera's make and model, functions and image properties such as the date the photo was shot and edited, and settings such as aperture, shutter speed, and whether the flash was used. It can be added to the JPEG, TIFF Rev. 6.0, and RIFF WAVE file formats. It is not supported in JPEG2000 or PNG. This data can be extracted by various utilities and used in other applications. (Wikipedia lists several.) CONTENTdm does not currently use this data. It will be used if the Archival Master TIFF images are submitted for long-term storage, when and if this service is ever offered by the college.

HDR Core

These are the eight metadata fields that are required for objects that are submitted to HDR. As of February 19, 2007, these are the required fields:

Title
Creator*
Keywords
Work Type
Resource Type
Groupings
Submitter
Access Rights

* = "Creator" is required if available.
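
A submission could be checked against this list with a short routine like the one below. This is a sketch only: the function name missing_fields and the dictionary layout of a record are assumptions, not part of HDR itself. "Creator" is left out of the strict check because it is required only if available.

```python
REQUIRED = [
    "Title", "Keywords", "Work Type", "Resource Type",
    "Groupings", "Submitter", "Access Rights",
]

def missing_fields(record):
    """Return the required field names that are absent or blank in record."""
    return [name for name in REQUIRED if not record.get(name, "").strip()]

# A hypothetical, incomplete submission.
record = {"Title": "Campus view", "Keywords": "buildings", "Submitter": " "}
print(missing_fields(record))
```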

IPTC

The IPTC metadata format was created by the International Press Telecommunications Council. Unlike Exif metadata, which is added automatically to an image by the camera (or scanner), IPTC metadata must be added manually. It includes fields such as title (called "headline"), caption, the photographer's name, location, and so on. To edit IPTC data, check your image-editing software for a "File Information" or "File Properties" menu item. (Not all photo editing software supports the IPTC standard. Some programs only display it and don't let you edit it.)

Any IPTC data you add to your photos can be extracted and used by HDR.
(For more on IPTC, see http://www.iptc.org , see also IPTC4XMP http://www.iptc.org/IPTC4XMP/ .)

RDF

Resource Description Framework (RDF) is a general method of modeling information. It is used along with a wide variety of other metadata schemas to describe digital resources in a way that allows disparate metadata schemas to interoperate. (See "RDF Primer," http://www.w3.org/TR/rdf-primer/ .)

XML

Extensible Markup Language (XML) is a general-purpose markup language that is supported in a wide variety of applications. One of its primary purposes is to facilitate the sharing of data across different information systems, particularly systems connected via the Internet.

XMP

Extensible Metadata Platform is Adobe's mixture of RDF and XML that provides a common metadata framework that standardizes how metadata is embedded in images across various applications. XMP has many fields based on the Dublin Core Metadata Initiative, but as its name implies it can be extended to include custom types of metadata as well. XMP information is included alongside Exif and IPTC data. XMP can be used in PDF and other graphics formats, such as JPEG, JPEG2000, GIF, PNG, HTML, TIFF, Adobe Illustrator, PSD, PostScript, and Encapsulated PostScript. XMP was first introduced by Adobe Systems in April 2001 as part of version 5.0 of the Adobe Acrobat software product. The "IPTC Core Schema for XMP" version 1.0 specification was released publicly on March 21, 2005. Adobe Creative Suite (CS2) includes "custom panels" for XMP metadata as part of its default set. In Photoshop CS2 the file information (metadata) is saved using the file extension ".xmp." (See "User's Guide to the IPTC Core" at http://www.iptc.org/std/Iptc4xmpCore/1.0/documentation/Iptc4xmpCore_1.0-doc-CpanelsUserGuide_13.pdf .)

Media Terminology

The removable optical media that are recommended for off-line storage of digital objects are the following. Whatever medium you choose, do not trust it to be readable for more than three years. If you do, you run the risk of data corruption due to media degradation or incompatibility of the disc format with your disc reading hardware or software.

BD

Blu-ray Disc (BD) is a high-density optical disc format for the storage of digital media, especially high-definition video, because of its high capacity. Single-layer BD discs can store 25 GB of data and dual-layer discs 50 GB.

CD-R

Compact Disc-Recordable (CD-R) is a recordable optical disc format that is Write Once, Read Many (WORM). Actually, there is a way to write data to a CD-R disc after it has been burned, so long as you are writing to an area on the disc that has not yet been written to. But you cannot alter areas that have already been written to. CD-Rs have a high level of compatibility with standard CD readers. They differ from CD-ROMs in that they have an additional coating of a dye, such as cyanine or phthalocyanine, that facilitates reading the data. Once a CD has been "burned", its content cannot be deleted. Typically CD-Rs can hold up to 700 MB. (Recommended for offline storage of data in small digitization projects.)

CD-ROM

Compact Disc-Read Only Memory (CD-ROM) discs are produced mechanically by stamping the discs from a master data disc. Thus, the production of CD-ROMs is of most use when you are making multiple copies of a master disc. They typically hold up to 650 MB of data. (Recommended only for large-scale, multi-copy reproduction of digital data on CD format.)

DVD

Digital Versatile Disc (DVD) discs can hold significantly more data than compact discs: between 4.7 and 17 GB, depending on the number of sides and layers. There are several DVD formats in the marketplace and none has become dominant yet (see DVD-R and DVD+R).

DVD-R

DVD-R is a recordable optical disc format that is Write Once, Read Many (WORM). It is a format older than DVD+R, but it is widely supported by DVD players. (Use DVD+R discs if possible for offline storage of data in medium-size digitization projects.)

DVD+R

DVD+R is a recordable optical disc format that is Write Once, Read Many (WORM). It is a competitor to the older DVD-R format, but because it is newer it does not have as large a user base as does DVD-R. However, the way data is laid out on the disc appears to be more error resistant than DVD-R. (Recommended for offline storage of data in medium-size digitization projects.)

Dual-layer DVDs

Some DVD-R and DVD+R discs are being manufactured with two different layers of data, which can be read separately using a laser focused at different depths. This nearly doubles their storage capacity, to 8.5 GB per disc. (Recommended for offline storage of data in medium to large-size digitization projects.)

For compatibility, stability, and security reasons, the following optical disc formats are not recommended for off-line storage of digital objects: CD-RW, DVD+RW, DVD-RAM.

(See "Understanding Recordable & Rewritable Media," First Edition, April 2004, 53 p. (OSTA.org) (PDF) http://www.osta.org/technology/dvdqa/pdf/dvdqa.pdf .)

Other Glossaries

Glossary (California Digital Library)
http://www.cdlib.org/inside/diglib/glossary/?field=institution&query=CDL&action=search (accessed September 30, 2013)
Glossary, in Introduction to Imaging, Revised Ed., by Howard Besser (Getty)
http://www.getty.edu/research/conducting_research/standards/introimages/glossary.html (accessed September 30, 2013)
Glossary (Princeton University. Princeton Imaging)
http://www.princetonimaging.com/scanning/glossary/ (accessed September 30, 2013)
Glossary of Scanning and Digital Imaging Terms, by Alyce Scott, 2012 (WebJunction)
www.webjunction.org/documents/webjunction/Glossary_of_Scanning_and_Digital_Imaging_Terms.html (accessed September 30, 2013)


Appendix: Definitions

Data Provider - Anyone who is depositing a digital object into the Hamilton College Library repository of digital objects.
Data Curator - The person who maintains the repository of archival digital objects for the Hamilton College Library.
TIFF 6.0 (ITU T.6) - considered a de facto archival image format; it was selected for the Capture file format because of its stable properties and its ability to be read by most imaging software.
Bit-depth - the amount of color information recorded for each pixel of a raster image. Also called "color resolution."
Adobe RGB 1998 - ??
sRGB - the recommended color space for computer display versions of image files.
JPEG2000 (JP2) was selected as the Master file format for its royalty-free patent status; it was formulated by the Joint Photographic Experts Group (JPEG), which also made the JPEG format. We apply a slightly lossy compression to our JPEG2000 files, but only to the degree that the human eye cannot detect any deterioration in quality. Though browsers cannot natively display JPEG2000 files, Web servers can deliver the image as a JPEG file. We used Photoshop CSx for the creation of JPEG2000 files from the Master TIFFs.
SHA = Secure Hash Algorithms (SHA), the designated standard of the National Institute of Standards and Technology (NIST).
Bit-level integrity - When the analog original is retained, data integrity checking may be less important than for born-digital objects. The latter really do need a CRC or SHA integrity fingerprint.

(Reviewed: May 6, 2008, pjm)
