Controlled Vocabulary

Using Image Databases to Organize Image Collections

Developing a digital image archive can become a huge undertaking if you don't break it down into small, discrete steps. These steps can proceed at the same time and can even be handled by different people in your organization, since some tasks require different skill sets. While there are a number of technical challenges in capturing, processing, and storing images, how you organize your images for retrieval can be even more important.

Organizing Your Photographs
Many research projects are examining the indexing of images by automatic content analysis, but they cannot achieve the level of detail and accuracy needed to replace a truly professional manual indexing system.

There are many considerations to weigh before you even begin the process (unless, of course, you like to redo work for the thrill of it). The biggest task is deciding how to physically organize and file the scans (in folders on a hard drive, on a local area network, or even on a series of CD-Rs) so you can find the materials you need later. You also need to consider what the images will be used for: thumbnails to locate the "real" photos, web and multimedia, or final art for publication. You will probably need to employ one or more "off-the-shelf" software applications to help catalog the images or create a searchable database of them.

Organize, Organize, Organize
You may already have an organizational scheme for your existing "physical" images. See if it's possible to modify or transfer this system for your "virtual" image storage. My own filenaming system differs for slides, negatives, and digital files, and aids me in locating the physical film. If you don't have some organizational "hierarchy" in place, my first suggestion would be to create your own "Dewey Decimal System" for images. See if you can locate a copy of Ernst Robl's "Organizing your Photos," described on the books page. For a summary of some of the more important parts of that book, see his article "Image Numbering, Filing, and Retrieval," which was originally prepared for the American Society of Picture Professionals site.
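A filenaming scheme of the kind described above can be sketched in a few lines. This is only an illustration: the media codes (S, N, D), the year-roll-frame layout, and the names ImageId, format_id, and parse_id are all hypothetical, not the scheme used by the author or by Robl's book.

```python
import re
from typing import NamedTuple

# Hypothetical media codes; the article does not specify an actual scheme.
MEDIA_CODES = {"S": "slide", "N": "negative", "D": "digital"}


class ImageId(NamedTuple):
    media: str  # one of the keys in MEDIA_CODES
    year: int
    roll: int   # roll, sheet, or card number within the year
    frame: int


# Assumed layout: media letter, 4-digit year, 4-digit roll, 3-digit frame.
_PATTERN = re.compile(r"^([SND])(\d{4})-(\d{4})-(\d{3})$")


def format_id(image_id: ImageId) -> str:
    """Build a sortable base filename from the parts of an ImageId."""
    return (
        f"{image_id.media}{image_id.year:04d}"
        f"-{image_id.roll:04d}-{image_id.frame:03d}"
    )


def parse_id(name: str) -> ImageId:
    """Recover the parts of a base filename, or raise ValueError."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized image id: {name!r}")
    media, year, roll, frame = match.groups()
    return ImageId(media, int(year), int(roll), int(frame))
```

Because every part is zero-padded, the names sort in shooting order in any file browser, and the same identifier can be written on the slide mount or negative sleeve so that the scan leads you back to the physical film.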

The next big step is to begin embedding "metadata" in your images (literally, data about data, or information about your image). There are many kinds of metadata, so if you are unfamiliar with the term and want a brief overview, take a look at the Metalogging section of this site. The kind we will exploit for our purposes goes by various names, but it was first conceived by the International Press Telecommunications Council, so it is usually just referred to as IPTC. If you use Adobe Photoshop, you'll know this as the File Info feature found under the File menu. To increase productivity you might want to consider something like the Image Info Toolkit or PhotoMechanic.

Caption and Keyword to Aid in Retrieval
The two most frequently searched fields in the IPTC schema are the Caption field (now called "Description" in the latest versions of Photoshop) and the Keyword field. Writing good captions and determining good keywords to help you and others find your images is part of what I call "metalogging," and it is covered in detail on this site. See the separate pages on writing good captions and determining good keywords, as well as a comprehensive list of caption and keyword guidelines.

If you don't have some organizational "hierarchy" in place for describing your subjects, my first suggestion would be to create your own "keyword thesaurus" using your own "controlled vocabulary" that works as part of your own "Dewey Decimal System" for images. The Library of Congress Classification Outline is a good start. See some of the other examples listed on this site. If you don't want to go to the trouble of creating your own controlled vocabulary, and you don't find any that match your specialties on the example page, you might want to consider the Image Info Toolkit and its integrated Keyword Catalog.
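To make the idea of a keyword thesaurus concrete, here is a minimal sketch of a controlled vocabulary as two lookup tables: one that maps variant terms to a single preferred term, and one that links each preferred term to its broader term in the hierarchy. The sample terms and the function names (preferred, expand, keywords_for) are invented for illustration and do not come from any published vocabulary.

```python
from typing import Iterable, List

# Hypothetical sample data; a real thesaurus would be far larger.
BROADER = {  # preferred term -> its broader (parent) term
    "labrador retriever": "dog",
    "dog": "mammal",
    "mammal": "animal",
}
SYNONYMS = {  # non-preferred term -> preferred term
    "lab": "labrador retriever",
    "canine": "dog",
    "puppy": "dog",
}


def preferred(term: str) -> str:
    """Normalize case and spacing, then map a variant to its preferred term."""
    term = term.strip().lower()
    return SYNONYMS.get(term, term)


def expand(term: str) -> List[str]:
    """Return the preferred term followed by all of its broader terms."""
    current = preferred(term)
    chain = [current]
    while current in BROADER:
        current = BROADER[current]
        if current in chain:  # guard against an accidental cycle
            break
        chain.append(current)
    return chain


def keywords_for(terms: Iterable[str]) -> List[str]:
    """Expand every term, dropping duplicates but keeping first-seen order."""
    result: List[str] = []
    for term in terms:
        for keyword in expand(term):
            if keyword not in result:
                result.append(keyword)
    return result
```

With a structure like this, typing "Lab" and "puppy" for one image yields the same consistent keyword set every time (the preferred terms plus their broader terms), which is the whole point of controlling the vocabulary.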

I started with the Library of Congress's Thesaurus for Graphic Materials combined with the hierarchy from the International Press Telecommunications Council (IPTC) and several other sources found on this site, picking and choosing from each. I also "picked" some ideas from Jim Pickerell's site, whose "Selling Stock" newsletter has lots of useful information on image archiving (plus lots of other good material if you are interested in selling your images for commercial purposes).

Why are you doing this?
What do you plan to do with the images? If you plan on using the digital images only for thumbnails to locate the original slide or negative, then your requirements for scanning or "acquiring" are much more reasonable. If you intend to use the images for final art you will need much larger files, or a way of finding the physical film quickly, so that you can have the darkroom work handled.

Most desktop scanners today can easily give you a 27 MB to 55 MB file (when stored as an uncompressed RGB TIFF), and cost only $2,000 to $4,000. I've used the Polaroid Sprintscan 35+, the Sprintscan 4000, and the Microtek 4000t, and with any of them it takes only a minute or two to do a scan of this size from 35mm film.
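If you want to estimate file sizes like these for your own scanner, the arithmetic for an uncompressed file is simply pixels wide times pixels high times bytes per pixel. The sketch below assumes a nominal 36 x 24 mm frame, 8 bits per channel, and scan resolutions of 2700 and 4000 pixels per inch; those resolution values are assumptions chosen for illustration, not figures taken from the scanners named above.

```python
def scan_size_bytes(width_in, height_in, ppi, channels=3, bits_per_channel=8):
    """Estimate the uncompressed size of a scan, ignoring file headers."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * channels * bits_per_channel // 8


# Nominal 35mm frame (36 x 24 mm) converted to inches.
FRAME_35MM_IN = (36 / 25.4, 24 / 25.4)

# Assumed resolutions, for illustration only.
for ppi in (2700, 4000):
    size = scan_size_bytes(*FRAME_35MM_IN, ppi)
    print(f"{ppi} ppi: about {size / 1_000_000:.1f} MB uncompressed")
```

With these assumed values the full-frame estimates land in the same general range as the sizes quoted above. Real files may differ, since a scanner rarely captures exactly the nominal frame, and scanning at 16 bits per channel doubles the result.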

Many photographers are now using pro and prosumer digital cameras for covering assignments and creating stock images for sale. Having an organized system is even more important if you are shooting digital, because it's possible to easily lose track of where that file is located if you don't have a good system in place.

The Research Libraries Group of the OCLC gave an overview of the workflow being used at Corbis in this article on their site. If you are scanning your images with the idea of being able to license them as stock photographs, then this is a good reference.

Where is that scan?
After you've determined what you are going to do with your images, and have scanned them or downloaded them from the digital camera, what do you do next? One thing you might want to consider is creating a version that's easy to access but large enough to show relevant details. I do this by downsampling the high-resolution file using a technique I developed over a period of time. Because of its small file size, this image is a good one to annotate and catalog in your image database. In addition to taking less time to generate thumbnails from, such files are small enough to be opened easily when you need to append data in the "File Info/IPTC" dialog in Photoshop, or re-insert this info into the "header" of the image file.
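The first step in making such a reference copy is deciding its pixel dimensions. The sketch below is not the downsampling technique mentioned above; it only works out a target size that keeps the aspect ratio and never enlarges a small original. The function name fit_long_edge and the idea of sizing by the longer edge are assumptions made for this example, and the actual resampling is left to your image editor or library of choice.

```python
def fit_long_edge(width, height, long_edge):
    """Scale (width, height) so the longer side is at most long_edge pixels.

    The aspect ratio is preserved, and images already small enough are
    returned unchanged rather than enlarged.
    """
    longest = max(width, height)
    if longest <= long_edge:
        return width, height
    scale = long_edge / longest
    # Never let a very thin image collapse to zero pixels on its short side.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4000 x 2000 pixel scan limited to a 1000 pixel long edge comes out at 1000 x 500, while an 800 x 600 image is left alone.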

Classify/categorize, Identify, and Catalog.
Figure out where the image belongs in your classification system. Identify the WHO, WHAT, WHY, WHEN, WHERE, and HOW of the image, and either place that information in the "File Info/IPTC" part of your image file (in Photoshop, look for File Info under the File menu) or create a way to link a text file with the image (or just keep reading).
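One simple way to link a text file with an image is a "sidecar" file that sits next to the image and shares its name. The sketch below stores the who, what, why, when, where, and how in a JSON file; the ImageRecord class, the ".json" naming convention, and the function names are all assumptions for illustration, not a format that Photoshop or any cataloging program reads.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class ImageRecord:
    """Hypothetical record holding the basic descriptive facts of one image."""
    who: str = ""
    what: str = ""
    why: str = ""
    when: str = ""
    where: str = ""
    how: str = ""
    keywords: List[str] = field(default_factory=list)


def sidecar_path(image_path: Path) -> Path:
    """Name the sidecar after the image, e.g. scan.tif -> scan.tif.json."""
    return image_path.with_suffix(image_path.suffix + ".json")


def save_record(image_path, record: ImageRecord) -> Path:
    """Write the record beside the image without touching the image file."""
    path = sidecar_path(Path(image_path))
    path.write_text(json.dumps(asdict(record), indent=2), encoding="utf-8")
    return path


def load_record(image_path) -> ImageRecord:
    """Read back the record stored beside the image."""
    text = sidecar_path(Path(image_path)).read_text(encoding="utf-8")
    return ImageRecord(**json.loads(text))
```

The drawback of any sidecar approach is that the text and the image can become separated when files are moved, which is one reason for embedding the information in the image file itself whenever you can.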

If these are your own images, or those of your employer, you may want to provide a copyright notice within the IPTC header. If you are familiar with Adobe Photoshop, you can apply this information as a batch process using "actions"; a copyright notice can be inserted into the File Info/IPTC section as a Photoshop action. If you are storing your images as JPEG files, however, you don't want to use Photoshop for this, as you will be "recompressing" the image file each time you save. For JPEG files (whether saved from a scan or shot with your digital camera), you may want to use one of a handful of utility programs that let you change the info in the file "header" without affecting the actual image portion of the file.
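The reason such utilities can work without recompressing is that a JPEG file is a series of marked segments, and descriptive data sits in segments that come before the compressed picture data. The read-only sketch below lists those leading segments. It is a simplified illustration rather than a metadata editor: the function name is invented, and it assumes a well-formed file with no padding bytes between segments.

```python
import struct

SOI = b"\xff\xd8"   # Start of Image marker that opens every JPEG stream
SOS = 0xDA          # Start of Scan: compressed image data follows this segment


def list_header_segments(data: bytes):
    """Return (marker, payload_length) pairs for segments up to the scan.

    Application segments use markers 0xE0-0xEF; Photoshop-style IPTC data
    is conventionally carried in APP13 (0xED).
    """
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream (missing SOI marker)")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker, length - 2))
        if marker == SOS:
            break
        pos += 2 + length
    return segments
```

A utility that edits the header replaces or inserts one of these leading segments and copies everything from the scan onward byte for byte, so the picture itself is never decoded or re-encoded.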

If you are creating an image database with the intention of putting it on the web, you may want to include visible watermarks on the face of the image. Visible watermarks are only one of several image security options to consider.

Finally, you are ready to create "catalogs" of your images. There are numerous applications that can handle the task of creating a thumbnail and automatically gathering information about the file (size, resolution, filename, color space, etc.). One easy way is to use a program that can automatically grab the IPTC/File Info data from the Photoshop file when creating the thumbnail. It's best to test the programs you are considering, as you may find differences in how this information is transferred to your program of choice. To make the catalog really useful, consider image databases that can take any information you have added within the database and push it back into the IPTC File Info of the actual images in the catalog.
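To show what a cataloging program does at its simplest, the sketch below walks a folder, records basic file information for each image in a small SQLite database, and searches the caption and keyword columns. It is a toy illustration, not a substitute for a real image database: the table layout, the list of file extensions, and the function names are assumptions, and it does not read IPTC data from the files.

```python
import os
import sqlite3

# Assumed set of extensions to treat as images.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".psd"}


def build_catalog(root, db_path=":memory:"):
    """Scan root recursively and record each image file in an SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "path TEXT PRIMARY KEY, filename TEXT, bytes INTEGER, "
        "caption TEXT DEFAULT '', keywords TEXT DEFAULT '')"
    )
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                full = os.path.join(folder, name)
                # INSERT OR IGNORE keeps captions already entered on a rescan.
                conn.execute(
                    "INSERT OR IGNORE INTO images (path, filename, bytes) "
                    "VALUES (?, ?, ?)",
                    (full, name, os.path.getsize(full)),
                )
    conn.commit()
    return conn


def search(conn, term):
    """Return paths whose caption or keywords contain the search term."""
    like = f"%{term}%"
    rows = conn.execute(
        "SELECT path FROM images "
        "WHERE caption LIKE ? OR keywords LIKE ? ORDER BY path",
        (like, like),
    )
    return [row[0] for row in rows]
```

Commercial cataloging programs add thumbnails, IPTC import and export, and a browsing interface on top of this basic idea of one searchable record per image file.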

Benefits of an Image Database
In addition to allowing you to search for your images by description or keyword, some of these programs support drag-and-drop or drag-and-place for linking the original file to your page layout or word processing applications. Most have "slide show viewers" or integrated browsers (some tie into QuickTime). You can often use them to create "static" HTML pages that can be viewed on Windows or Mac machines. Several can import from and export to proprietary databases or standard database formats, which lets you edit the database info or create custom applications. Some of the more progressive ones even offer a way to move the resulting catalog onto your website while keeping it searchable.
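A "static" HTML page of the kind mentioned above is just a text file that lists thumbnails and captions, so it can be generated from catalog data with very little code. The sketch below is a bare-bones illustration; the function name render_gallery and the page layout are assumptions and do not reflect the output of any particular program.

```python
from html import escape


def render_gallery(title, entries):
    """Build a static HTML page from (thumbnail_url, caption) pairs.

    All text is escaped so captions containing quotes, ampersands, or
    angle brackets cannot break the page markup.
    """
    items = "\n".join(
        f'  <figure><img src="{escape(src, quote=True)}" '
        f'alt="{escape(caption, quote=True)}">'
        f"<figcaption>{escape(caption)}</figcaption></figure>"
        for src, caption in entries
    )
    return (
        "<!DOCTYPE html>\n<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n<h1>{escape(title)}</h1>\n{items}\n</body>\n</html>\n"
    )
```

Writing the returned string to a file such as index.html, next to a folder of thumbnails, gives you a page that any browser can open without a database or web server behind it.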
