Should metadata be permanent?

Imagine if there were a permanent, unchangeable barcode applied to every single piece of digital data you made. A consortium has issued a manifesto for making metadata persist forever.

Photographers always want to ensure that their photos, and the information about them, will last as long as possible. EXIF data is a very useful way of ensuring that the date of creation, exposure settings and even copyright information accompanies every digital photo. But should this data be permanently applied?

(I/0 - Lost Bits 5 image by Carsten Mueller, royalty free)

The International Press Telecommunications Council (IPTC) in London, the American Association of Advertising Agencies (4As) and the Association of National Advertisers (ANA) have banded together to take the issue of data persistence one step further by proposing that permanent metadata be applied to images, text, audio and video files. The Embedded Metadata Manifesto is based on five principles:

  1. Metadata is essential to describe, identify and track digital media, and should be applied to all media items that are exchanged as files or by other means such as data streams
  2. Media file formats should provide the means to embed metadata in ways that can be read and handled by different software systems
  3. Metadata fields, their semantics (including labels on the user interface) and values should not be changed across metadata formats
  4. Copyright-management information metadata must never be removed from the files
  5. Other metadata should only be removed from files by agreement with their copyright holders.
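To make principles 4 and 5 concrete, here is a minimal Python sketch of how an application might enforce them when asked to strip fields from a file's metadata. It is only an illustration under stated assumptions: the manifesto does not define any API, and the function name, the `holder_agrees` flag and the field names below are all hypothetical.

```python
# Hypothetical field names treated as copyright-management information.
# The manifesto itself does not enumerate these; they are assumptions.
COPYRIGHT_FIELDS = frozenset({"creator", "copyright_notice", "usage_terms"})


def strip_metadata(metadata, fields_to_remove, holder_agrees=False):
    """Return a copy of `metadata` without `fields_to_remove`.

    Principle 4: copyright-management fields are never removed.
    Principle 5: other fields are removed only if the copyright
    holder has agreed (modelled here as a simple boolean flag).
    """
    result = dict(metadata)  # work on a copy; the input is left untouched
    for field in fields_to_remove:
        if field in COPYRIGHT_FIELDS:
            raise PermissionError(f"'{field}' is copyright information and must stay")
        if not holder_agrees:
            raise PermissionError(f"removing '{field}' needs the copyright holder's agreement")
        result.pop(field, None)  # silently ignore fields that are absent
    return result


if __name__ == "__main__":
    photo = {"creator": "Jane Doe", "copyright_notice": "(c) Jane Doe", "gps": "51.5,-0.1"}
    # Allowed: a non-copyright field, removed with the holder's agreement.
    print(strip_metadata(photo, ["gps"], holder_agrees=True))
```

Under this sketch, a request to remove `creator` would be refused no matter who asks, which is exactly the behaviour that raises the whistle-blower concern discussed below.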

As cameras, mobile phones and other devices evolve to include more and more specific information embedded into files that they create, what are the ramifications of having identifying information included in every single digital file?

In some cases, it's incredibly important, such as identifying a photographer who created a particular piece of work, or inserting the GPS coordinates of a holiday snap so it can be plotted on a map. There are, however, plenty of grey areas that illustrate the potential problems of never being able to strip metadata from a file. Imagine a whistle-blowing case involving photographic evidence, where the metadata clearly reveals who took the photo.

The manifesto also doesn't seem to address issues of data tampering or manipulation. We've seen numerous cases where photo-encryption systems have been cracked, allowing an obviously manipulated image to be passed off as an original file created by the camera in question. There is nothing to stop similar methods being developed to change the metadata so that it implies another person created an image. On the other hand, the use cases for the Embedded Metadata Manifesto are plentiful among museums and other cultural institutions that need metadata to remain in a persistent state.

The consortium of organisations proposing these guidelines suggests that metadata be stored in the IPTC format, rather than the more common EXIF, which most digital cameras write automatically each time a photo is taken. At the time of writing, the majority of organisations supporting the manifesto are photo agencies and groups representing professional photographers.

Nevertheless, should such a manifesto be adopted by manufacturers and content creators, there is always the possibility that persistent metadata could be used to track those who make copies of this material, authorised or not. In some respects, this is already happening: the iTunes Store, for example, appends its own metadata to purchases, including the name of the person who bought the file. At the moment, this can be removed, but, should the consortium get its way, removing it wouldn't be possible without requesting permission from the copyright owner.

At this stage, the manifesto is just that: a document. But it raises an important discussion about the sort of information being stored within metadata, and it is a reminder that your content-creation devices are probably capturing more data than you think.