Controlled Vocabulary


Keyword = Key ideas/concepts expressed as words
Keywording = The practice of selecting the most appropriate keywords to describe an object, image, or work.

The Function of Keywords
When researchers submit research papers for publication in a journal, the paper's abstract is typically preceded by a set of keywords. Generating these keywords is fairly simple as the work is usually focused and has a point of view (generally to prove or disprove a hypothesis).

Unfortunately, images aren't nearly as easy to describe or categorize. First of all, images aren't made of words. In addition, you carry your own accumulated knowledge into the viewing of the subject, and that may be quite different from what another viewer brings to the same image. So, where to start?

Start with the Caption
A well-written caption gives you the basic building blocks for the core keywords you might want to add to the image's metadata. The journalist's "5 W's and How" (Who, What, Why, When, Where and How) is a good way to tickle your mind for appropriate caption material and keywords. Not every image will require all of these, but it's worth ticking them off to be sure you're not leaving anything out. Your job is to provide the appropriate terms that can guide someone searching for that image, and not much more. Balance is the key in keywording!

For instance, even a simple caption like "young boy holding a frog" suggests a number of immediate possibilities. Keywords such as "frog" (along with the broader terms higher up its hierarchical branch) and "boy" (along with related terms such as adolescent, age range, etc.) are pretty obvious. If the time of day is important, you might wish to include that information (twilight, dusk, sunset?). Is there a dominant color in the image? If so, include that as well.

Once you have exhausted the obvious, it's time to go beyond the surface and consider whether the image expresses any "concepts." Depending on the expression of the young boy, these might be words like "curiosity" or "curious" and the like. The vast majority of images may not be concept images; when an image is not, don't confuse your client or photo researcher by adding "concept" keywords.

In generating keywords for an image database, we are looking for ways to describe an image in as broad a sense as possible. Since we will most likely be dealing with a large group of clients from varied backgrounds, it's important to realize that the terms you use and the terms your clients use may not be the same. A viewer from the general public may search for "pig," but a researcher might search for "swine" or "boar." In addition, your clients may not know specifically what they are looking for until they find it, so in some cases no amount of keywording will make a difference.

Keywording the image in different ways expands the opportunities for a client to find your image among the thousands or millions in any given collection. Including keywords that move outward from the most specific to the more general improves the odds that a search will turn up your image. Rather than using only the keyword "pig," include others such as "livestock" and "animal," or include the specific breed of pig and whether it is an adult male (boar) or female (sow). For optimal results, it's best to always use a specific set of terms, which we typically refer to as a "controlled vocabulary." In the long run, using a controlled vocabulary (or creating your own) is the only way to ensure that you consistently apply the same cluster of core terms to similar images.
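The "specific to general" idea can be sketched in a few lines of Python. This is only an illustration: the tiny BROADER table and the expand_keyword function are hypothetical names made up for this example and are not drawn from any real keyword catalog. Each term points to its next broader term, and the function walks up the branch.

```python
# Hypothetical mini-vocabulary: each term maps to its broader (parent) term.
# The terms and structure are illustrative, not taken from any real catalog.
BROADER = {
    "sow": "pig",
    "boar": "pig",
    "pig": "livestock",
    "livestock": "animal",
}


def expand_keyword(term, broader=BROADER):
    """Return the term followed by each broader term, most specific first."""
    chain = [term]
    seen = {term}
    while chain[-1] in broader:
        parent = broader[chain[-1]]
        if parent in seen:  # guard against accidental cycles in the table
            break
        chain.append(parent)
        seen.add(parent)
    return chain


print(expand_keyword("sow"))  # ['sow', 'pig', 'livestock', 'animal']
```

Because every image of a sow passes through the same table, it always picks up the same cluster of broader terms, which is the consistency a controlled vocabulary is meant to provide.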

The IPTC Keyword is a multi-value field
A note regarding the "keyword" field as defined by the International Press Telecommunications Council, Information Interchange Module version 4 (often referred to as IPTC IIMv4 or IPTC headers): There is often some confusion regarding the 64-character limit when readers see the listing in the IPTC chart, because they miss the footnote at the bottom of the page that says:

"The keyword field is a "multi-value" field. You can have unlimited numbers of keywords and phrases but no single keyword or key phrase can exceed this 64 character maximum limit."

Both the Keyword and Supplemental Categories fields are referred to as "bag" fields. This means that you can store a list of separate terms within this single field. For those of you with some database software background, it might help to think of this field as a "portal" to another database... or sort of a "database within a database."
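To make the distinction concrete, here is a minimal Python sketch of a check that applies the limit to each keyword on its own rather than to the field as a whole. The names MAX_KEYWORD_LENGTH and find_overlong_keywords are made up for this illustration, and counting characters with len() is an assumption; software that writes the field may measure the limit differently.

```python
MAX_KEYWORD_LENGTH = 64  # per-keyword limit described above


def find_overlong_keywords(keywords, limit=MAX_KEYWORD_LENGTH):
    """Return the keywords whose length exceeds the per-entry limit."""
    # Assumption: length is counted in characters, as the text above states.
    return [kw for kw in keywords if len(kw) > limit]


keywords = ["frog", "young boy holding a frog", "x" * 65]
too_long = find_overlong_keywords(keywords)
print(len(keywords), "keywords,", len(too_long), "over the limit")
```

Note that the list itself can hold any number of entries; only an individual entry can be too long.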

In the past, you had to enter each of these terms one by one into the Photoshop File Info Keyword field. However, if you are using Photoshop CS, you can now cut and paste in any "string" of comma-separated terms and they will be "parsed" into the proper format. Prior versions of Adobe Photoshop may appear to allow this option, but they do not handle this task properly.
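As a rough illustration of what "parsing" a comma-separated string involves, the Python sketch below splits a string into individual keywords. It is not a description of how Photoshop does it; the function name parse_keyword_string is hypothetical, and the choices to trim whitespace, drop empty entries, and ignore case when spotting repeats are assumptions made for this example.

```python
def parse_keyword_string(text):
    """Split a comma-separated string into a clean list of keywords."""
    keywords = []
    seen = set()
    for part in text.split(","):
        keyword = part.strip()
        key = keyword.lower()
        if keyword and key not in seen:  # skip blanks and case-insensitive repeats
            keywords.append(keyword)
            seen.add(key)
    return keywords


print(parse_keyword_string("frog, boy,  curiosity, Frog, , twilight"))
# ['frog', 'boy', 'curiosity', 'twilight']
```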

I've illustrated one alternate workflow exploiting this behavior in a sample QuickTime movie, which shows how to use Image Info Toolkit (IIT) as a keyword generator. Once you have your keywords assembled in the Caption field of the Image Info Toolkit data entry screen, you can cut and paste that information into the Keyword field in Adobe Photoshop CS while the File Info dialog is open.

The caption/keyword guidelines will walk you through a series of questions that should stimulate your thoughts and help you generate a broad set of keywords that will lead viewers to the most appropriate images in your collection.

Strategies for keywording
Before moving on to the guidelines, you may wish to start by determining what will work best for your particular situation.

First off, if you are adding captions and keywords to images that are destined to be part of another collection, be sure to ask the appropriate staff member if they have any specific guidelines, style sheets, or the like. If you are doing this work for your own collection then it may be worthwhile to develop your own specifications (after a bit of research). Take a look at what other collections have done by viewing their online holdings. Specifically take a good look at the average length of the captions and the quantity of keywords they are using for each image.

Many agencies may put limits on the number of keywords or total characters per record. It's best to observe these, as it's quite likely that any work exceeding these specs will simply be automatically "truncated" or removed from the record. If that is the case, you'd be better served by putting your efforts into picking the best 25 or 50 keyword terms (or 150 or 500 characters, whatever the limit is) than by spending time selecting keywords or writing captions that will be of marginal value or, worse yet, cut off and never seen.
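If you keep your keywords in order of importance, you can preview what would survive a given limit before you submit. The Python sketch below is one way to do that under stated assumptions: the function name fit_to_limits, the default limits, and the idea that the separator between terms counts toward the character total are all hypothetical, so check the actual rules with the agency.

```python
def fit_to_limits(keywords, max_count=25, max_chars=150, separator=", "):
    """Keep keywords in priority order until a count or character limit is hit."""
    kept = []
    used = 0
    for keyword in keywords:
        # Assumption: the separator between terms counts toward the total.
        extra = len(keyword) + (len(separator) if kept else 0)
        if len(kept) >= max_count or used + extra > max_chars:
            break
        kept.append(keyword)
        used += extra
    return kept


ranked = ["pig", "livestock", "animal", "farm"]
print(fit_to_limits(ranked, max_count=3))  # ['pig', 'livestock', 'animal']
```

The sketch stops at the first term that does not fit instead of hunting for shorter terms further down the list. That keeps the result faithful to your own ranking, at the cost of occasionally leaving a few characters unused.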

Even when providing keywords to other agencies or distributors, realize that keywording is rather involved and takes time as well as experience. That is all the more reason for artists to caption their work in a way that is keyworder-friendly. Do not assume that the keyworder knows what you know.
