
Revision of ImageCLEF 2007 Photo Retrieval task from Thu, 12/18/2008 - 11:44




Ad-hoc photographic retrieval task 2007

Introduction

ImageCLEFphoto 2007 provides a system-centred evaluation for multilingual visual information retrieval from a generic photographic collection (i.e. one containing everyday real-world photographs akin to those frequently found in private photographic collections).

The evaluation scenario is thereby similar to the classic TREC ad-hoc retrieval task: a simulation of the situation in which a system knows the set of documents to be searched, but cannot anticipate the particular topic that will be investigated (i.e. topics are not known to the system in advance). The goal of the simulation is: given an alphanumeric statement (and/or sample images) describing a user information need, find as many relevant images as possible from the IAPR TC-12 photographic collection (with the query language being either identical to or different from that used to describe the images).

Any method can be used to retrieve relevant documents and we encourage the use of both concept-based and content-based retrieval methods. This is an ImageCLEF task.

Data Collection

The image collection of the IAPR TC-12 Benchmark consists of 20,000 still natural images (plus 20,000 corresponding thumbnails) taken from locations around the world. It comprises an assorted cross-section of subjects, including pictures of different sports and actions, photographs of people, animals, cities, landscapes and many other aspects of contemporary life.

Each image is associated with an alphanumeric caption stored in a semi-structured format. These captions include the title of the image, its creation date, the location at which the photograph was taken, the name of the photographer, a semantic description of the contents of the image (as determined by the photographer) and additional notes.

<DOC>
<DOCNO>annotations/00/60.eng</DOCNO>
<TITLE>Palma </TITLE>
<DESCRIPTION>two lane street with large shops on the right and smaller shops on the left; people are walking on the sidewalk, some are crossing the street; cars are parked along the left side of the street as well; </DESCRIPTION>
<NOTES>The main shopping street in Paraguay; </NOTES>
<LOCATION>Asunción, Paraguay </LOCATION>
<DATE>March 2002 </DATE>
<IMAGE>images/00/60.jpg </IMAGE>
<THUMBNAIL>thumbnails/00/60.jpg </THUMBNAIL>
</DOC>
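
As an illustration only, a caption record of this shape could be read with a few lines of Python. This is a minimal sketch, not an official parser: the function name parse_caption is hypothetical, the record below is an abridged copy of the sample above (the long description field is omitted for brevity), and it assumes every field sits between a matching pair of upper-case tags.

```python
import re

# Abridged copy of the sample record shown above.
SAMPLE = """<DOC>
<DOCNO>annotations/00/60.eng</DOCNO>
<TITLE>Palma </TITLE>
<NOTES>The main shopping street in Paraguay; </NOTES>
<LOCATION>Asunción, Paraguay </LOCATION>
<DATE>March 2002 </DATE>
<IMAGE>images/00/60.jpg </IMAGE>
<THUMBNAIL>thumbnails/00/60.jpg </THUMBNAIL>
</DOC>"""

# Matches <TAG>value</TAG> pairs; assumes tags are upper-case and not nested.
FIELD_RE = re.compile(r"<([A-Z]+)>(.*?)</\1>", re.DOTALL)


def parse_caption(text):
    """Return a dict mapping lower-cased tag names to stripped field values."""
    body = re.search(r"<DOC>(.*?)</DOC>", text, re.DOTALL)
    if body is None:
        raise ValueError("no <DOC> element found")
    return {tag.lower(): value.strip()
            for tag, value in FIELD_RE.findall(body.group(1))}


caption = parse_caption(SAMPLE)
print(caption["title"], "|", caption["location"], "|", caption["date"])
```

A regular expression is used rather than an XML parser because the records are only semi-structured and are not guaranteed to be well-formed XML.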

The following publication elaborates on the history, design and implementation of this image collection:

Grubinger, M., Clough, P., Müller, H. and Deselaers, T. (2006), The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems, In Proceedings of International Workshop OntoImage’2006 Language Resources for Content-Based Image Retrieval, held in conjunction with LREC'06, pages 13-23, Genoa, Italy, 22 May 2006 (pdf).

Further information about the image collection and links to related publications can be found here.

Based on the feedback from participants of previous evaluation tasks, the following will be provided for ImageCLEFphoto 2007:
  • Annotation Language: four sets of annotations in (1) English, (2) German, (3) Spanish and (4) one set whereby the annotation language was randomly selected for each of the images.
  • Caption Fields: only the fields for the title, location, date and additional notes are provided.
  • Annotation Completeness: each image contains the same level of annotation completeness - there are no images without annotations.
Evaluation Objective

Providing only a subset of the annotations creates a new challenge for 2007: the evaluation of multilingual visual information retrieval from a generic collection of lightly annotated photographs. This allows for the investigation of the following research questions:

  • are traditional text retrieval methods still applicable for such short captions?
  • how significant is the choice of the retrieval language?
  • how does retrieval performance compare to retrieval from fully annotated images? (compared to 2006)
  • has retrieval performance improved in comparison with retrieval from lightly annotated images? (compared to 2006)
Since the involvement of visual retrieval techniques becomes more important in this task, we hope to attract more visually oriented approaches (compared to the mainly concept-based retrieval approaches in previous years).

Query Topics

- Download standard ad-hoc topics for 2007 here.
For this task, we provide a list of topic statements together with sample images expressing realistic user information needs for visual information retrieval from the IAPR TC-12 photographic collection. The creation of these topics has been based on several factors including:
  • the analysis of a log file from online-access to the image collection
  • knowledge of the contents of the image collection
  • various types of linguistic and pictorial attributes such as visual vs. semantic, specific vs. general objects or the use of proper names.
  • the estimated difficulty of the topic.
Similar to TREC, we also provide the query topics as structured statements of user needs which consist of a title (a short sentence or phrase describing the search request in a few words), and three sample images (which have been removed from the image collection) that are relevant to that search request. An example for English is the following:
<top>
<num> Number: 1 </num>
<title> accommodation with swimming pool </title>
<narr> </narr>
<image> 3793.jpg </image>
<image> 6321.jpg </image>
<image> 6395.jpg </image>
</top>
Note:
  • we will re-use a subset of the topics from 2006 (for comparison with retrieval results from 2006)
  • we only offer languages that were also offered in 2006: English, German, Spanish, Italian, French, Portuguese, Chinese, Japanese, Russian, Polish, Swedish, Finnish, Norwegian, Danish, and Dutch. Should a participant want to investigate any other language that is not mentioned here, please contact the task organisers by 30 April 2007 to arrange for a translation.
  • participants only receive topic titles and no narrative descriptions, to avoid confusion (the narratives only serve to unambiguously define what constitutes a relevant image).
  • participants will also receive three sample images for each topic. These images have been removed from the collection and do not form a part of the ground-truth.
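
For illustration, the English example topic shown earlier in this section could be read as follows. This is a sketch under stated assumptions: parse_topic is a hypothetical helper, and it assumes each topic carries one "Number: N" entry, one title and any number of sample image entries, as in the example.

```python
import re

# The English example topic from this section.
TOPIC = """<top>
<num> Number: 1 </num>
<title> accommodation with swimming pool </title>
<narr> </narr>
<image> 3793.jpg </image>
<image> 6321.jpg </image>
<image> 6395.jpg </image>
</top>"""


def parse_topic(text):
    """Extract the topic number, title and sample image names."""
    num = re.search(r"<num>\s*Number:\s*(\d+)\s*</num>", text)
    title = re.search(r"<title>(.*?)</title>", text, re.DOTALL)
    if num is None or title is None:
        raise ValueError("topic is missing <num> or <title>")
    images = [name.strip()
              for name in re.findall(r"<image>(.*?)</image>", text, re.DOTALL)]
    return {"num": int(num.group(1)),
            "title": title.group(1).strip(),
            "images": images}


topic = parse_topic(TOPIC)
print(topic)
```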
Retrieval Experiments

Experiments are performed as follows: the participants are given topics, which are used to create queries, which in turn are used to perform retrieval on the image collection. This process iterates (e.g. possibly involving relevance feedback) until they are satisfied with their runs. Participants might try different methods to increase the number of relevant images in the top N rank positions (e.g. query expansion), and they can repeat these different methods for each query (or source) and collection (or target) language.

Participants are free to experiment with whatever methods they wish for CLIR and image retrieval, e.g. query expansion based on thesaurus lookup or relevance feedback, indexing and retrieval on only part of the image caption, different models of retrieval, different translation resources (e.g. dictionary-based vs. machine translation), and combining text and content-based methods for retrieval. Given the many different possible approaches which could be used to perform the ad-hoc retrieval, rather than list all of these we will ask participants to indicate which of the following applies to each of their runs (we consider these the "main" dimensions which define the query for this ad-hoc task):

Dimension              Available Codes
Topic language         DA, DE, EN, ES, FI, FR, IT, JA, NL, NO, PL, PT, RU, SV, ZHS, ZHT
Annotation language    DE, EN, ES, RND, ALL
Query/run type         AUTO, MAN
Feedback/expansion     FB, QE, FBQE, NOFB
Modality               IMG, TXT, TXTIMG

Query language

Used to specify the query language used in the run. The following language codes should be used to indicate the query language: English (EN), German (DE), French (FR), Portuguese (PT), Spanish (ES), Italian (IT), Finnish (FI), Japanese (JA), Chinese-simplified (ZHS), Chinese-traditional (ZHT), Polish (PL), Norwegian (NO), Swedish (SV), Russian (RU), Danish (DA) and Dutch (NL).

Annotation language

Used to specify the target language (i.e. the annotation set) used for the run: German (DE), English (EN), Spanish (ES), random (RND). You can also use all languages in one run (ALL).

Query/run type

We distinguish between manual (MAN) and automatic (AUTO) submissions. Automatic runs involve no user interaction, whereas manual runs are those in which a human has been involved in query construction and the iterative retrieval process, e.g. manual relevance feedback is performed. We encourage groups who want to investigate manual intervention further to participate in the interactive evaluation (iCLEF) task.

Feedback or Query Expansion

Used to specify whether the run involves query expansion (QE) or feedback (FB) techniques, both of them (FBQE, the code listed in the table of dimensions) or none of them (NOFB).

Modality

This describes the use of visual (image) or text features in your submission. A text-only run will have modality text (TXT); a purely visual run will have modality image (IMG) and a combined submission (e.g. initial text search followed by a possibly combined visual search) will have modality text+image (TXTIMG).

Submission format and guidelines

What to submit

Participants are required to submit a baseline run against which their other submissions can be compared. There should be one baseline run for each annotation language (please include monolingual runs in your submission: English-English, Spanish-Spanish and German-German); according to the previous table, these would be classified/identified as:

  • EN-EN-AUTO-NOFB-TXT for the English-English monolingual run
  • ES-ES-AUTO-NOFB-TXT for the Spanish-Spanish monolingual run
  • DE-DE-AUTO-NOFB-TXT for the German-German monolingual run
It is extremely important that we can get a description of the techniques used for each submitted run. This should be as detailed as possible to ease the comparison or classification of techniques and results.
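
One way to keep run identifiers consistent with the dimensions above is to build them programmatically. The sketch below is an illustration only: run_label is a hypothetical helper, the code sets are copied from the table of dimensions, and the field order (topic language, annotation language, run type, feedback, modality) is inferred from the three baseline identifiers listed above.

```python
# Allowed codes, copied from the table of dimensions.
TOPIC_LANGS = {"DA", "DE", "EN", "ES", "FI", "FR", "IT", "JA",
               "NL", "NO", "PL", "PT", "RU", "SV", "ZHS", "ZHT"}
ANNOTATION_LANGS = {"DE", "EN", "ES", "RND", "ALL"}
RUN_TYPES = {"AUTO", "MAN"}
FEEDBACK = {"FB", "QE", "FBQE", "NOFB"}
MODALITIES = {"IMG", "TXT", "TXTIMG"}


def run_label(topic_lang, annotation_lang, run_type, feedback, modality):
    """Join the five dimension codes with hyphens after validating each one."""
    checks = [
        ("topic language", topic_lang, TOPIC_LANGS),
        ("annotation language", annotation_lang, ANNOTATION_LANGS),
        ("query/run type", run_type, RUN_TYPES),
        ("feedback/expansion", feedback, FEEDBACK),
        ("modality", modality, MODALITIES),
    ]
    for name, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"unknown {name} code: {value!r}")
    return "-".join(value for _, value, _ in checks)


baseline = run_label("EN", "EN", "AUTO", "NOFB", "TXT")
print(baseline)
```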

Submission format

Participants are required to submit ranked lists of (up to) the top 1000 images ranked in descending order of similarity (i.e. the highest nearer the top of the list). The format of submissions for this ad-hoc task can be found here and the filenames should distinguish different types of submission according to the table above. Participants can submit (via email) as many system runs as they require.

Please note that there should be at least 1 document entry in your results for each topic (i.e. if your system returns no results for a query then insert a dummy entry, e.g. 25 1 16/16019 0 4238 xyzT10af5 ). The reason for this is to make sure that all systems are compared with the same number of topics and relevant documents. Submissions not following the required format will not be evaluated.
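
The following sketch shows one way to produce result lines shaped like the dummy entry above, including the fallback for topics without results. It rests on assumptions that should be checked against the linked format description: the six columns are read in the usual trec_eval order (topic number, iteration, document identifier, rank, score, run identifier), ranks start at 0 as in the dummy entry, and format_run, the sample document identifiers and the run name demoRun are all hypothetical.

```python
def format_run(results, topics, run_id, max_rank=1000, dummy_doc="16/16019"):
    """Return result lines for every topic, ranked by descending score.

    results maps a topic number to a list of (doc_id, score) pairs.
    Topics with no results receive a single dummy entry so that every
    topic appears in the output.
    """
    lines = []
    for topic in topics:
        ranked = sorted(results.get(topic, []),
                        key=lambda pair: pair[1], reverse=True)[:max_rank]
        if not ranked:
            ranked = [(dummy_doc, 0.0)]  # placeholder entry for an empty topic
        for rank, (doc_id, score) in enumerate(ranked):
            lines.append(f"{topic} 1 {doc_id} {rank} {score:g} {run_id}")
    return lines


# Hypothetical scores: topic 1 has two hits, topic 2 has none.
demo_results = {1: [("12/12345", 0.5), ("03/3793", 0.9)]}
demo_lines = format_run(demo_results, topics=[1, 2], run_id="demoRun")
print("\n".join(demo_lines))
```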

Result Generation

Relevance Assessments

- Download relevance assessments (qrels) for ImageCLEFphoto 2007 here.
In the past, relevance assessments have been performed by students and staff at the University of Sheffield and Victoria University. Submissions are used to create image pools which are judged for relevance by assessors. The pools are assessed and completed using interactive search and judge (ISJ) to find further relevant images, and the end result is a set of relevance assessments called qrels. These are then used to evaluate system performance and compare submissions.
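
The pooling step described above can be sketched as the union, per topic, of the top-ranked images from every submitted run. This is an illustration only: build_pool is a hypothetical helper, and the pool depth is a free parameter here because this page does not state the depth used.

```python
def build_pool(runs, depth):
    """Merge the top `depth` images of every run into one pool per topic.

    runs is a list of dicts, each mapping a topic number to a ranked
    list of image identifiers (best first).
    """
    pool = {}
    for run in runs:
        for topic, ranking in run.items():
            pool.setdefault(topic, set()).update(ranking[:depth])
    return pool


# Two hypothetical runs with made-up image identifiers.
demo_runs = [
    {1: ["a", "b", "c"]},
    {1: ["c", "d"], 2: ["e"]},
]
demo_pool = build_pool(demo_runs, depth=2)
print(demo_pool)
```

Each pooled image is then judged once per topic, no matter how many runs retrieved it.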

For more information about this procedure and the qrels sets see the following publications: "The CLEF 2003 Cross Language Image Retrieval Track" and "The CLEF 2004 Cross Language Image Retrieval Track"

Relevance assessment for the more general topics is based entirely on the visual content of images (e.g. aircraft on the ground). However, certain topics also require the use of the caption to make a confident decision (e.g. "pictures of beaches in northern Peru"). What constitutes a relevant image is a subjective decision, but typically a relevant image will have the subject of the topic in the foreground, the image will not be too dark in contrast, and maybe the caption confirms the judge's decision.

The assessment of images in ImageCLEFphoto is based on using a ternary classification scheme: (1) relevant, (2) partially relevant and (3) not relevant. The aim of the ternary scheme is to help assessors in making their relevance judgements more accurate (e.g. an image is definitely relevant in some way, but maybe the query object is not directly in the foreground: it is therefore considered partially relevant). Various combinations of assessor judgements are used to create the qrels sets and more information can be found from the links given above.


Performance Measures and Results

- Download the results for ImageCLEFphoto 2007 here.
The ranked lists (runs) submitted by the participants will be evaluated using trec_eval. We are planning to use the following performance measures to compare the retrieval results:
  • Mean Average Precision (MAP) - the leading measure, as in previous evaluations (for comparison)
  • Precision at 20 documents retrieved (P20) - most internet search engines show 20 images on their first page
  • Geometric Mean Average Precision (GMAP) - to prevent easy topics from masking poor performance on hard topics
  • Binary Preference (BPREF) - to verify the completeness of the relevance assessments
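
The official figures come from trec_eval; the sketch below only illustrates how MAP, GMAP and P20 relate to one another. It assumes binary relevance, rankings without duplicate images, and a small floor value inside the geometric mean so that a topic with zero average precision does not force GMAP to zero. The function names, the floor value and the toy data are all assumptions made for this example.

```python
import math


def average_precision(ranking, relevant):
    """Average of the precision values at the rank of each relevant image."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def precision_at(ranking, relevant, k=20):
    """Fraction of the first k retrieved images that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def mean_ap(aps):
    """Arithmetic mean of per-topic average precision (MAP)."""
    return sum(aps) / len(aps)


def geometric_map(aps, floor=1e-5):
    """Geometric mean of per-topic average precision (GMAP)."""
    return math.exp(sum(math.log(max(ap, floor)) for ap in aps) / len(aps))


# Two toy topics with made-up rankings and relevance sets.
ap_topic1 = average_precision(["a", "b", "c", "d"], {"a", "c"})
ap_topic2 = average_precision(["x", "y"], {"y"})
aps = [ap_topic1, ap_topic2]
print(mean_ap(aps), geometric_map(aps), precision_at(["a", "b", "c", "d"], {"a", "c"}))
```

Because the geometric mean multiplies the per-topic values, a single very poor topic lowers GMAP far more than it lowers MAP.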
Provided Data and Systems

Training data

Unfortunately, we do not currently have any training data available. We suggest you create your own topics and generate relevance assessments similar to the topics provided.

CBIR Systems

To enable participation in the ad-hoc task for those without access to their own CBIR system, we suggest using one of the following systems:

The GIFT/Viper image retrieval system (please contact Henning Müller for more information).
The FIRE image retrieval system (please contact Thomas Deselaers for more information).


Paper Submission and Format


Paper Submission

Participating groups that have submitted at least one run are invited to describe their approaches and corresponding results as well as their experience with ImageCLEFphoto 2007 in the CLEF Working Notes. Working papers have to be emailed to Allan Hanbury (hanbury@prip.tuwien.ac.at) not later than August 17.

The full Working Notes will be prepared in digital form only and will be posted on the CLEF website and published in the DELOS Digital Library one week before the workshop together with the run statistics. At the workshop we also intend to distribute a printed set of abstracts together with CDs containing the entire Working Notes. In order to make life as easy as possible for participants, we will extract the abstract from the text of the submitted papers. This means that participants should try to make the abstract as complete as possible: it should provide the main details of the retrieval experiments including tasks performed, approach used, resources employed, and results obtained.


Submission Format

The submitted papers should not exceed 10 pages and should follow the guidelines described here.

It is recommended that you use LaTeX to prepare the text (a LaTeX template can be found here). Further, the name of your submitted file should be: [last-name-of-first-authorCLEF2007], e.g. grubingerCLEF2007.pdf.


Important Dates

16 April 2007: Data Release
06 May 2007: Topic Release
11 June 2007: Submission of retrieval runs due (EXTENDED!!!)
16 July 2007: Release of retrieval results
17 August 2007: Workshop papers due
19-21 September 2007: CLEF 2007 in Budapest, Hungary

Organisers of ImageCLEFphoto

Michael Grubinger, School of Computer Science and Mathematics, Victoria University, Australia
(michael.grubinger@research.vu.edu.au)


Allan Hanbury, Pattern Recognition and Image Processing Group, Vienna University of Technology, Austria
(hanbury@prip.tuwien.ac.at)


Paul Clough, Department of Information Studies, University of Sheffield, UK
(p.d.clough@sheffield.ac.uk)


Mailing List

We have set up a mailing list: imageclef@sheffield.ac.uk for participants. Please contact Paul Clough to be added to the list.

Last Modified: 30 July 2007
By: Michael Grubinger