
The test collection of the Photo Annotation Task @ICPR 2010 is now freely available here.


The visual concept detection and annotation task is a multi-label classification challenge: a large number of consumer photos must be annotated automatically with multiple concepts. Most concepts are holistic and are annotated at the image level.

The visual concepts include, for example, abstract categories such as Family&Friends or Partylife, the time of day (Day, Night, Sunny, …), persons (no, single, small or big group), quality (blurred, underexposed, …) and aesthetics. Altogether we provide annotations for 53 concepts. The visual concepts are organized in a small ontology; participants may use the hierarchical order of the concepts and the relations between concepts for solving the annotation task.
While most of the holistic concepts can be determined objectively (e.g. the presence or absence of objects), some concepts are influenced by the subjective impression of the annotators. The agreement among annotators is reflected in one of the evaluation measures used.

This task poses two main challenges:
1) Can image classifiers scale to the large number of concepts and the large amount of data?
2) Can an ontology (hierarchy and relations) help in large-scale annotations?

Data Sets

The training set consists of 5,000 images from the MIR Flickr 25,000 image dataset, annotated with the 53 visual concepts. The annotations are provided in two formats: as RDF files and as plain text files. A validation set with 3,000 images and annotations may be freely used for tuning parameters or as additional training data.
The algorithms are tested on a set of 10,000 images from the same collection. For most of the photos, EXIF data is additionally included and may be used.

Evaluation Measures

Two measures are applied for evaluation. Classification performance per concept is assessed with the Equal Error Rate (EER) and the Area under Curve (AUC). The second measure is hierarchical: it takes into account the relations between the concepts in the ontology and provides a score for the annotation performance of each image by comparing the set of proposed labels with the correct set of labels.
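To make the per-concept measures concrete, the following is a minimal sketch of how EER and AUC can be computed for a single concept from ground-truth labels and confidence scores. The function names and the simple staircase ROC construction are illustrative, not part of the official evaluation kit; it assumes both positive and negative examples exist for the concept.

```python
# Sketch: per-concept EER and AUC from 0/1 labels and confidence scores.
# Illustrative only -- not the official evaluation software of the task.

def roc_points(labels, scores):
    """Return (FPR, TPR) points of the ROC staircase, assuming both
    classes are present in `labels`."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in pairs:          # sweep the threshold from high to low
        if y:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def eer(points):
    """Approximate the equal error rate: the point where FPR == FNR."""
    best = min(points, key=lambda p: abs(p[0] - (1.0 - p[1])))
    return (best[0] + (1.0 - best[1])) / 2.0
```

A classifier that ranks all positive images above all negative ones reaches AUC 1.0 and EER 0.0; random scoring gives AUC around 0.5 and EER around 0.5.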

Training and Validation Data Release

The training and validation sets are now online and available to all registered participants via the FTP account. A ReadMe file and descriptions of the concepts are also available on the FTP server.

Test Data Release

The test data, consisting of 10,000 images, is now available.

How to register for the task

Due to database restrictions, it is necessary to sign a user agreement to get access to the data. Please print the document, sign it, and send it via fax to Fraunhofer IDMT (for detailed instructions, see the explanation in the document).
You will immediately receive the username and password for an FTP account where all data for the task is stored.

ImageCLEF also has its own registration interface, where you can choose a user name and a password. This registration interface is used, for example, for the submission of runs. If you already have a login from the ImageCLEF 2009 competition, you can migrate it to the ICPR benchmarking activity here.

Please note that registration is only possible until 18.12.2009!

How to cheat (but please don't)

Please do not use the annotation information that is delivered with the MIR Flickr 25,000 image dataset. We renamed all files and trust that you will not try to recover the original filenames.

Submission Format

The submission format is identical to the annotation format of the training data (see file: trainingSetAnnotations_revised.txt), except that you are expected to give a confidence score for each concept being present or absent.
That means you have to submit a file containing the same number of columns, but each value may be an arbitrary floating-point number between 0 and 1, where higher numbers denote higher confidence in the presence of a particular concept.
For the hierarchical evaluation measure, we do not take the confidence values into account directly, but map them to 0 for all confidence scores <= 0.5 and to 1 for all confidence scores > 0.5.
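As a rough sketch, a submission row and the 0.5 thresholding used for the hierarchical measure could look as follows. The column layout (an image identifier followed by 53 confidence values) is an assumption based on the description above; please check trainingSetAnnotations_revised.txt and the ReadMe on the FTP server for the exact format.

```python
# Sketch: formatting one submission row and applying the 0.5 threshold.
# The exact column layout is an assumption; consult the task's ReadMe.

NUM_CONCEPTS = 53

def format_row(image_id, confidences):
    """One whitespace-separated line: image id, then one score per concept."""
    assert len(confidences) == NUM_CONCEPTS
    assert all(0.0 <= c <= 1.0 for c in confidences)
    return image_id + " " + " ".join(f"{c:.4f}" for c in confidences)

def binarize(confidences, threshold=0.5):
    """Mapping used for the hierarchical measure: <= 0.5 -> 0, > 0.5 -> 1."""
    return [1 if c > threshold else 0 for c in confidences]
```

Note that a score of exactly 0.5 is mapped to 0 (absent), so scores should be pushed clearly above 0.5 for concepts you believe are present.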

Please submit all results in a single txt file.

Please note that the number of runs per group is restricted to a maximum of 5 submissions.


The results for the Visual Concept Detection and Annotation Task are now available. We received submissions from 12 groups with 44 runs in total.
We used two measures to determine the quality of the annotations: one for the evaluation per concept and one for the evaluation per photo. The Equal Error Rate (EER) and the Area under Curve (AUC) show the annotation quality per concept. The proposed ontology score, which considers the relations between concepts and the agreement of the annotators on the concepts, evaluates the annotation quality per photo.

On the following pages you can find the results:
* Results of the Equal Error Rate and Area under Curve
* Results of the Ontology Score

Important dates

  • 03.11.2009 - Training data, validation data and task release
  • 01.12.2009 - Test data release
  • 03.01.2010 - Submission of runs - EXTENDED to 05.01.10
  • 08.01.2010 - Release of results
  • 25.01.2010 - Deadline for the main ICPR 2010 conference paper submission - extended
  • 25.05.2010 - Deadline for the LNCS proceedings (10 pages, LNCS format). Please send us the papers by email.
  • 23.8. - 26.8.2010 - ICPR 2010 Conference
  • The schedule is strict so that participants of the benchmark can submit papers describing their approaches to the main ICPR 2010 conference.

Contact person

  • Dr. Uwe Kühhirt, Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany, uwe.kuehhirt[at]