Digital Imaging Concepts

To set up the digitization device to digitize the material properly, you need to learn something about the basics of digital images. Three of the most important characteristics of digital images are resolution, bit depth (dynamic range), and color space. This section describes each of them and then discusses the effect that these three settings have on file size.

Resolution

  • The (spatial) resolution at which an object should be scanned depends on the size of the object and the fineness of its details. The higher the resolution, the easier it is to see small details in the image, but the larger the resulting file. For computers and digital cameras, resolution is expressed in pixels; for scanners, in pixels per inch (ppi) and sometimes in dots per inch (dpi); for printers, in dots per inch (dpi).
  • There is no single perfect spatial resolution to recommend for scanning every kind of material in order to produce a high-quality digital image, but as a rule of thumb try to scan items as close to 600 ppi (pixels per inch) as possible but no lower than 300 ppi.
  • Under no circumstances should you scan an object at a resolution higher than the maximum optical resolution of the scanner, even if the interface allows you to do so. For example, slides and negatives need to be scanned at a minimum resolution of 2700 ppi. Some perfectly good document scanners cannot capture that resolution, so you will probably need to use a dedicated slide scanner.
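
The relationship between physical size, scanning resolution, and the resulting pixel dimensions can be sketched in a few lines of Python. This is only an illustration: the function names, the 8 x 10 inch print, and the 1200 ppi optical limit are hypothetical values chosen for the example, not figures for any particular scanner.

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Return (width_px, height_px) for an object scanned at the given ppi."""
    return round(width_in * ppi), round(height_in * ppi)


def clamp_to_optical(requested_ppi, optical_ppi):
    """Hypothetical helper: never request more than the scanner's optical resolution."""
    return min(requested_ppi, optical_ppi)


# An 8 x 10 inch print scanned at the rule-of-thumb 600 ppi.
print(pixel_dimensions(8, 10, 600))   # (4800, 6000)

# A scanner assumed to have a 1200 ppi optical limit cannot truly capture 2700 ppi.
print(clamp_to_optical(2700, 1200))   # 1200
```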

Bit Depth (Dynamic Range)

Bit depth is the number of bits used to record the color or shade of gray of each pixel, and it determines the range of tonal levels an image can contain. The greater the bit depth, the greater the depth of color and the larger the color (or grayscale) palette (number of colors). In other words, the higher the bit depth, the greater the subtlety of color or gray.
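
As a rough illustration of how palette size grows with bit depth, the number of distinct values is 2 raised to the number of bits. The sketch below assumes the usual arrangement of one channel for grayscale and three channels (red, green, blue) for color; the function name is made up for the example.

```python
def tonal_levels(bits_per_channel, channels=1):
    """Number of distinct values a pixel can take at the given bit depth."""
    return 2 ** (bits_per_channel * channels)


print(tonal_levels(1))       # 2 (bitonal: black or white)
print(tonal_levels(8))       # 256 shades of gray
print(tonal_levels(8, 3))    # 16777216 colors (24-bit color)
print(tonal_levels(16, 3))   # 281474976710656 colors (48-bit color)
```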

Usually, you should scan all objects at 24-bit color (even black and white materials) if you intend to save the image according to preservation standards. For non-preservation purposes, black and white materials may be scanned as 8-bit grayscale or even 1-bit bitonal images, but only if you are sure there is no trace of color or tonal variation in either the image or the medium on which the image is imprinted.

"24-bit color depth" refers to RGB color recorded as 8 bits of Red + 8 bits of Green + 8 bits of Blue. Photoshop describes this as "8-bit" color (8 bits per channel), whereas some scanners call it 24-bit color depth. They are the same thing. Likewise, 48-bit color depth is what Photoshop calls 16-bit color (16 bits per channel).

Color Space

A Color Space (also called "color mode," "color model" or "work space") describes the way colors can be represented as numbers, typically as three or four values or color components. RGB, sRGB and CMYK are three commonly used color spaces. When you scan an object, use the Adobe RGB (1998) color space because it captures a very wide range of colors (it is called a "wide-gamut space"). Though the original scan should be saved in this RGB color space, subsequent adjustments to the tone and color of the image to produce Working Master Files should be done in sRGB mode, which is a color space with a reduced range of colors that has been optimized for computer displays. When an image is saved for Web delivery, RGB is often automatically converted to sRGB.

File Size and Image Size

To achieve a high-quality scan, one is often obliged to select a resolution, bit depth and color space that produce a digital image with millions of pixels and many bits per pixel. This can make the file very large. Such images will also appear on a computer display far larger than is practical for most of the intended users. The pixel dimensions of the image have to be reduced enough for it to fit properly on a display device. This is called "down-sampling." With this loss of pixels comes a loss in the quality of the details, but file size and visible quality can be balanced by using an efficient compression scheme such as JPEG, PNG, or JPEG2000.
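
To give a sense of scale, the uncompressed size of a scan can be estimated from its physical dimensions, resolution, and bit depth. The sketch below is a back-of-the-envelope calculation only: it ignores file headers, metadata, and row padding, and the function name and the 8 x 10 inch example are assumptions made for illustration.

```python
def uncompressed_size_bytes(width_in, height_in, ppi, bit_depth):
    """Estimate raw image size: pixel count times bits per pixel, in bytes."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bit_depth // 8


# An 8 x 10 inch original at different settings (1 MB = 1,000,000 bytes here).
for ppi, bits in [(600, 24), (300, 24), (600, 8), (600, 1)]:
    size = uncompressed_size_bytes(8, 10, ppi, bits)
    print(f"{ppi} ppi, {bits}-bit: {size / 1e6:.1f} MB")
# 600 ppi, 24-bit: 86.4 MB
# 300 ppi, 24-bit: 21.6 MB
# 600 ppi, 8-bit: 28.8 MB
# 600 ppi, 1-bit: 3.6 MB
```

Halving the resolution cuts the pixel count, and therefore the file size, to one quarter, which is why resolution has the largest effect of the three settings.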

Different computer displays have different numbers of pixels per inch, and users set the density (resolution) of their displays to their own preferences. It is therefore not possible to ensure that a digital image will appear on a user's screen at the same size as the original object. If the image you display is significantly smaller than the original, however, you should give the user some way to view it larger without degrading the quality of the image. The size on the screen should be large enough (or enlargeable enough) to ensure that all text, whatever its original size, is legible and all significant details in a pictorial image are viewable.
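
The dependence of on-screen size on display density can be illustrated with a short calculation. This sketch assumes the simplest case, in which each image pixel is shown on exactly one physical display pixel; real browsers and operating systems often apply additional scaling. The function name and the display densities are hypothetical.

```python
def on_screen_inches(width_px, height_px, display_ppi):
    """Physical size of an image on a display, assuming one image pixel per display pixel."""
    return width_px / display_ppi, height_px / display_ppi


# The same 1200 x 1500 pixel image on two displays of different density.
print(on_screen_inches(1200, 1500, 100))   # (12.0, 15.0)
print(on_screen_inches(1200, 1500, 200))   # (6.0, 7.5)
```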

Image size can be manipulated by adjusting spatial resolution, pixel dimensions, or document size (print size), but for screen viewing, adjusting pixel dimensions is the most reliable method, because displays generally ignore the resolution and print size recorded in the file.

Enlarging the size of an image should only be attempted at the point the object is initially digitized. This is usually accomplished by raising the resolution setting on the scanner or digital camera. After scanning, the image should only be reduced in size if the size needs to be changed for output reasons.

File Formats

Certain file formats work better than others with the CONTENTdm software and delivery system. If you have formats other than those listed below that you would like to submit, contact a member of the CONTENTdm Team.

The following file formats are preferred because they balance quality against file size by using file compression algorithms. You would probably choose a different file format for storing a high-quality preservation version of the digital object, but for purposes of CONTENTdm, the following file formats are fine.

Recommended Formats for Files to Be Uploaded into CONTENTdm

Object Type         Preferred File Format
Printed Text        PDF/A
Still Images/mss    JPEG2000 (with 8:1 compression), JPEG (with mild compression)
Large format        JPEG2000
Audio               MP3, MPEG-4, RealAudio
Video               MPEG-4, QuickTime, RealMedia

File Compression

Compression refers to a mathematical process that reduces the file size of images, either by encoding the image data more compactly or by discarding inessential data. There are two kinds of compression that interest us here.

  • Lossless compression uses an algorithm that can completely restore the data of the original image when decompressed. TIFF, PNG and JPEG2000 are file formats that provide lossless compression options.
     
  • Lossy compression uses an algorithm that cannot restore data that was lost during compression. This is usually not a problem and may even be desirable when the compression is used to create a file that is small and thus easy to transmit over the Internet for display in a user's browser. Lossy compression has some definite disadvantages, though:
    • If the compression rate is too high, the user may actually see a degradation of the image quality.
    • Any information about the image that was discarded during compression cannot be recovered when the file is decompressed.
    • Every time you save a file with lossy compression, the compression algorithm is applied again and more data is lost, which results in an image of progressively poorer quality. To limit this, choose the highest quality setting before saving the file.
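
The difference between the two kinds of compression can be demonstrated with a small sketch. The lossless half uses Python's standard zlib module (DEFLATE, the same family of compression that PNG uses). The lossy half is a deliberately crude stand-in, not JPEG: it simply throws away the low four bits of each value. The pixel values are made up for the example.

```python
import zlib

# A tiny synthetic row of 8-bit grayscale values (invented for illustration).
pixels = bytes([12, 13, 13, 14, 200, 201, 203, 203] * 32)

# Lossless: the decompressed data is byte-for-byte identical to the original.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)
print(restored == pixels)           # True
print(len(packed) < len(pixels))    # True

# Lossy (toy example): keep only the top 4 bits of each value.
quantized = bytes(p & 0xF0 for p in pixels)
print(quantized == pixels)          # False: the fine tonal differences are gone
print(len(set(pixels)), len(set(quantized)))   # 6 2
```

After the lossy step only two distinct gray values remain out of the original six, and nothing in the quantized data allows the original values to be reconstructed.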

The best practice is to keep your files in an uncompressed format during the capture and production phases of your project. When you have finished correcting, enhancing, and reducing the dimensions and resolution of your images, you can safely save them in a format that applies lossy compression.

Recommended Formats (uncompressed) for Use as Working Master Files

Object Type      Uncompressed Format
Text             N/A
Still Images     TIFF (or lossless PNG) or lossless JPEG2000
Large format     JPEG2000
Audio            WAV, AIFF, FLAC, MA9
Video            AVI, Video Tape

  • JPEG2000 is a file format that some experts consider safe for archival storage even though it uses compression (approaching 200%), because its compression is so efficient that the loss of data is less noticeable to the human eye than it is with other compressed file formats at the same file sizes.

For a brief profile of each file type mentioned above see either "Basic Digitization Terminology" or "Digital Audio and Video Terminology."

File Sizes

[Insert a statement about what kinds of file sizes to expect with the different file formats.]

File Versions

In preparing images for uploading to CONTENTdm, one needs to decide how big and at what quality an image needs to be to be most useful to the user. This applies not only to large format materials, but to photographs, slides, audio and video as well.

In order to ensure that you will be able to deliver the digital object at the quality and size the user needs, you need to be sure to capture a quality master image from which you can derive the exact version you want to submit to CONTENTdm. To make this happen, we recommend a strategy that involves managing three versions of each digital object: an Archival Master File, a Working Master and the final Delivery file.

  • The Archival Master File is a very high-quality file that represents the original object as accurately as possible. Producing an Archival Master File starts with a Capture File, which is a high-resolution scan of the original object. It is important to set the scanner's options, by adjusting the histogram, so that a wide range of light and dark areas is captured for this original version. Only minor adjustments should be made to the Capture File after scanning, and only in order to ensure that it faithfully represents the original. Do not attempt any color adjustments on this file yet. The recommended format for the Capture File is TIFF ITU T.6, which can be created by Photoshop and any scanning utility. Few digital cameras, however, can save files as TIFFs, so their files will have to be converted to TIFF manually. The Archival Master File is simply the name of the copy of the Capture File that is deposited in a managed digital asset management repository. If you are not going to deposit a copy of the Capture File in a managed repository immediately, a copy should be stored offline on tape, optical discs, or a hard drive for backup purposes. Even if you are going to submit a copy to CONTENTdm, you must still deposit a copy in libArchive. Do not depend on CONTENTdm to protect your images from loss. CONTENTdm is a delivery system, not an archival system.
    [For more detailed information about Capture Files and Archival Master Files, see "Creating Archival Master Files."]
     

  • The Working Master is a copy of the Capture File that has been corrected for color, sharpness, skewing, graininess, contrast, gamma balance, dust and scratches. Stains, blotches or defects that are part of the original material are usually left untreated. A Working Master file may even be reduced in size to facilitate efficient delivery and then uploaded directly into CONTENTdm, which converts it into a JPEG2000 file. JPEG2000 delivery provides high-end zooming and panning. If you are not going to deposit a copy of the Capture File in a digital asset management repository immediately, a copy should be stored offline on tape, optical discs, or a hard drive for backup purposes. Even if you are going to submit a copy to CONTENTdm, you must still deposit a copy in the repository. Do not depend on CONTENTdm to protect your images from loss. CONTENTdm is a delivery system, not an archival system. As useful as the Working Master is for deriving good-looking Delivery files, if disk storage space is at a premium, it is more important to retain the Capture File than to keep the Working Master after the derivatives have been made.
     

  • The Delivery file is a version or manifestation of the Working Master file at a lower resolution and/or in a format that displays efficiently given the limited screen real estate in CONTENTdm. Delivery files are usually also in a format and size that is convenient for users to access over a network. Producing Delivery files requires down-sampling to lower the pixel dimensions and/or applying higher levels of compression. If you uploaded the Working Master, then CONTENTdm created the Delivery file for you. If you did not, then you can use JPEG files, which usually require some "clean up" to improve the sharpness and color in order to compensate for the down-sampling or for imperfections in the original. JPEG files should be no less than 600 pixels on the long side. When you want the Web user to be able to zoom and pan on an image, you may decide to submit a version of the image that is considerably larger than the minimum of 600 pixels on the long side, or to upload the full-resolution (i.e., non-down-sampled) Working Master for CONTENTdm to convert to a dynamic JPEG2000 image. A JPEG file is highly compressed and easy to edit and view in a browser, but once you have created the JPEG file, do not repeatedly edit and save it. Each time you save a JPEG file you compress the file again and thus degrade the quality of the image. A JPEG file should never be considered your Working Master version of the image. Convert it to TIFF if you need to protect its quality from further degradation.
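
The arithmetic behind down-sampling a Working Master to a Delivery file can be sketched as follows. This computes target pixel dimensions only; the actual resampling would be done in an image editor or imaging library. The function name is hypothetical, and the default of 600 pixels reflects the minimum long side stated above.

```python
def delivery_size(width_px, height_px, long_side=600):
    """Scale dimensions so the longer side equals long_side, keeping the aspect ratio."""
    longest = max(width_px, height_px)
    if longest <= long_side:
        # Already small enough: do not enlarge an image after scanning.
        return width_px, height_px
    scale = long_side / longest
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


print(delivery_size(4800, 6000))         # (480, 600)
print(delivery_size(6000, 4800, 1200))   # (1200, 960)
print(delivery_size(500, 400))           # (500, 400)
```

Passing a larger long_side, as in the second call, corresponds to submitting a bigger derivative when users need to zoom and pan.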

(Reviewed: September 27, 2013)