Visual concept detection, annotation, and retrieval using Flickr photos

**The ground truth for the test set has been released!**


The visual concept detection, annotation, and retrieval task is a multi-label classification challenge. The aim is to analyze a collection of Flickr photos in terms of their visual and/or textual features in order to detect the presence of one or more concepts. The detected concepts can then be used to annotate the images automatically or to retrieve the images that best match a given concept-oriented query.

The concepts to detect are very diverse and range across categories such as people (e.g. male, female), nature (e.g. lake, beach), weather (e.g. rainbow, fog) and even sentiments (e.g. unpleasant, euphoric). We supply a training set of images from the MIRFLICKR collection that are fully annotated. You are expected to detect the concepts in a different set of images coming from the same collection and then address either or both subtasks. You can solve these tasks by analyzing the photos in terms of their visual features, their textual features, or by using a combination of both.



For this year's photo annotation task we continue along the same lines as previous years in terms of concepts. In total we now have 94 concepts, where a few old concepts have been removed and a few new ones have been added. We categorize the concepts as follows:

Natural elements
  • time of day: day, night, sunrise/sunset
  • celestial bodies: sun, moon, stars
  • weather: clear sky, overcast sky, cloudy sky, rainbow, lightning, fog/mist, snow/ice
  • combustion: fire, smoke, fireworks
  • lighting effects: shadow, reflection, silhouette, lens effects
  • scenery: mountain/hill, desert, coast, landscape, cityscape, forest/park, graffiti
  • water: underwater, sea/ocean, lake, river/stream, other
  • flora: tree, plant, flower, grass
  • fauna: cat, dog, horse, fish, bird, insect, spider, amphibian/reptile, rodent
  • age: baby, child, teenager, adult, elderly
  • gender: male, female
  • quantity: none, one, two, three, small group, large group
  • relationship: family/friends, co-workers, strangers

Image elements
  • quality: in focus, selective focus, out of focus, motion blur, noisy/blocky
  • style: picture-in-picture, circular warp, gray-color, overlay
  • view: portrait, close-up/macro, indoor, outdoor
  • type: city life, party life, home life, sports/recreation, food/drink
  • impression: active, euphoric, happy, funny, unpleasant, inactive, melancholic, scary, calm

Human elements
  • transportation: bicycle/motorcycle, car/van/pick-up, truck/bus, rail vehicle, water vehicle, air vehicle

As you can see, we have focused a bit more on natural elements and image characteristics this year, although we have refined the whole range of concepts, partly based on last year's feedback. Please click here for more detailed descriptions of each of the concepts.


Subtask 1: concept annotation

In the concept annotation task, your goal is to detect the presence of the various concepts in the images and provide us with the annotations on a per-image basis; see Figure 1 for an example. Please click here for more details on the data format, submission format, evaluation procedure and the results.

Figure 1. Images annotated with the concept 'reflection'.
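
As a rough illustration of per-image, multi-label annotation, the sketch below turns concept confidence scores into a set of concept labels for each image. The image identifiers, the scores, and the 0.5 threshold are all hypothetical, and the output is not the official submission format, which is described on the linked page.

```python
from typing import Dict, List

# Hypothetical confidence scores in [0, 1], one per concept and image.
# In practice these would come from whatever classifier you train.
scores: Dict[str, Dict[str, float]] = {
    "img_001": {"reflection": 0.91, "lake": 0.74, "fog/mist": 0.12},
    "img_002": {"reflection": 0.08, "lake": 0.03, "fog/mist": 0.66},
}


def annotate(image_scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return every concept whose score reaches the (assumed) threshold.

    An image may receive zero, one, or many concepts, which is what makes
    the problem multi-label rather than multi-class.
    """
    return sorted(c for c, s in image_scores.items() if s >= threshold)


# One list of concepts per image.
annotations = {image_id: annotate(s) for image_id, s in scores.items()}

for image_id, concepts in annotations.items():
    print(image_id, concepts)
```

A single global threshold is the simplest choice; per-concept thresholds tuned on the training set are a common alternative.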


Subtask 2: concept retrieval

The queries for the concept-based retrieval task are inspired by queries issued by real people on a popular image search engine. We analyzed what people look for on the internet related to the concepts we defined for this task in order to form a realistic yet challenging set of queries for this subtask; see Figure 2 for an example. Please click here for more details on the data format, submission format and evaluation procedure.

Figure 2. Images retrieved for the query 'traffic light trails'.
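
One simple way to approach concept-based retrieval is to map a query onto a few of the task's concepts and rank images by how strongly they exhibit all of them. The sketch below does this by multiplying per-concept scores. The mapping of 'traffic light trails' to the concepts night, car/van/pick-up and motion blur, the image identifiers, and the scores are assumptions made for illustration only; they are not part of the task definition.

```python
from typing import Dict, List

# Hypothetical per-image concept scores in [0, 1].
scores: Dict[str, Dict[str, float]] = {
    "img_a": {"night": 0.9, "car/van/pick-up": 0.8, "motion blur": 0.7},
    "img_b": {"night": 0.2, "car/van/pick-up": 0.9, "motion blur": 0.1},
    "img_c": {"night": 0.8, "car/van/pick-up": 0.1, "motion blur": 0.9},
}


def retrieve(
    scores: Dict[str, Dict[str, float]],
    query_concepts: List[str],
    top_k: int = 2,
) -> List[str]:
    """Rank images by the product of their scores for the query concepts."""

    def relevance(image_scores: Dict[str, float]) -> float:
        product = 1.0
        for concept in query_concepts:
            # A concept with no score counts as absent.
            product *= image_scores.get(concept, 0.0)
        return product

    ranked = sorted(scores, key=lambda i: relevance(scores[i]), reverse=True)
    return ranked[:top_k]


# Assumed concept mapping for the example query 'traffic light trails'.
query = ["night", "car/van/pick-up", "motion blur"]
results = retrieve(scores, query)
print(results)
```

Multiplying scores rewards images in which every query concept is present; summing them instead would be more forgiving when one concept is missed.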



For this task, we use a subset of the MIRFLICKR collection. The entire collection contains 1 million images from the social photo sharing website Flickr and was formed by downloading up to a thousand photos per day that were deemed to be the most interesting according to Flickr. All photos in this collection were released by their users under a Creative Commons license, allowing them to be freely used for research purposes. Of the entire collection, 25 thousand images were manually annotated with a limited number of concepts, and many of these annotations have been further refined and expanded over the lifetime of the ImageCLEF photo annotation task. This year we used crowdsourcing to annotate all of these 25 thousand images with the concepts listed above.

Please click here for more details on the textual features, visual features and concept features we supply with each image in the collection we use for this year's task.



You must sign a user agreement to get access to the data; you can find the license agreement here. Please print it, sign it, and send a scanned copy or a fax to Alba García; see also the instructions page in the document for more information. Once you have signed the license agreement and it has been verified, you can look up the username and password for accessing the data by logging into the ImageCLEF dashboard. The dashboard is also used for the submission of runs. If you already have a login from former ImageCLEF benchmarks, you can migrate it to ImageCLEF 2012 here, or you can create a new user account here.

To download the data and get detailed information about the structure of the data, please click here.



  • Bart Thomee, Yahoo! Research, Barcelona, Spain, bthomee[at]
  • Adrian Popescu, CEA LIST, Fontenay-aux-Roses, France, adrian.popescu[at]



We would like to express our deepest gratitude to the European Science Foundation for their financial support, which made the collection of the ground truth concept annotations possible. Thanks!
