| Introduction |
| ImageCLEF's Wikipedia Retrieval task provides a testbed for the system-oriented evaluation of visual information retrieval from a collection of Wikipedia images. The aim is to investigate retrieval approaches in the context of a large and heterogeneous collection of images (similar to those encountered on the Web) that are searched for by users with diverse information needs. In 2010, ImageCLEF's Wikipedia Retrieval will use a new collection of approximately 250,000 Wikipedia images that cover diverse topics of interest. These images are associated with unstructured and noisy textual annotations in English, French, and German. This is an ad-hoc image retrieval task; the evaluation scenario is therefore similar to the classic TREC ad-hoc retrieval task and the ImageCLEF photo retrieval task. It simulates the situation in which a system knows the set of documents to be searched, but cannot anticipate the particular topic that will be investigated (i.e. topics are not known to the system in advance). The goal of the simulation is: given a textual query (and/or sample images) describing a user's (multimedia) information need, find as many relevant images as possible from the Wikipedia image collection. Any method can be used to retrieve relevant documents. We encourage the use of both concept-based and content-based retrieval methods and, in particular, multimodal and (new this year) multilingual approaches that investigate the combination of evidence from different modalities and language resources. More information will be provided soon. |
| Wiki MM Collection |
|
The Wiki MM collection consists of around 250,000 images and associated user-supplied annotations. The collection was built to cover similar topics in English, German and French. Topical similarity was obtained by selecting only Wikipedia articles which have versions in all three languages and are illustrated with at least one image in each version. 44,664 such articles were extracted from the September 2009 Wikipedia dumps, containing a total of 265,987 images. Because the collection is intended to be freely distributed, we removed all images with unclear copyright status. After this operation, the collection contains 251,866 images, with the following language distribution:

- English only: 77,365
- German only: 68,831
- French only: 42,751
- English and German: 14,729
- English and French: 16,887
- German and French: 12,128
- English, German and French: 19,175

The main difference between the Wiki MM collection and the INEX MM collection (Westerveld and van Zwol, 2007) used in previous WikipediaMM tasks is that the multilingual aspect was reinforced, so that both mono- and cross-lingual evaluations can be carried out. Another difference is that this year, participants will receive both the image annotation, as provided in the previous year, and links to the article(s) that contain the image.

(Westerveld and van Zwol, 2007) T. Westerveld and R. van Zwol. The INEX 2006 Multimedia Track. In N. Fuhr, M. Lalmas, and A. Trotman, editors, Advances in XML Information Retrieval: Fifth International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006, Lecture Notes in Computer Science/Lecture Notes in Artificial Intelligence (LNCS/LNAI). Springer-Verlag, 2007. |
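The language distribution above can be checked with a little arithmetic. The following is a minimal sketch, not part of the task's official tooling: the counts are copied from the list above, while the language codes (`en`, `de`, `fr`) and the function names are assumptions introduced only for illustration. It sums the seven disjoint groups and derives how many images carry an annotation in each language.

```python
# Counts copied from the language distribution above.
# The keys ("en", "de", "fr") are hypothetical labels chosen for this sketch.
DISTRIBUTION = {
    frozenset({"en"}): 77365,
    frozenset({"de"}): 68831,
    frozenset({"fr"}): 42751,
    frozenset({"en", "de"}): 14729,
    frozenset({"en", "fr"}): 16887,
    frozenset({"de", "fr"}): 12128,
    frozenset({"en", "de", "fr"}): 19175,
}

IMAGES_BEFORE_COPYRIGHT_FILTER = 265987


def total_images(distribution):
    """Sum the disjoint language groups to get the collection size."""
    return sum(distribution.values())


def images_per_language(distribution):
    """Count images that have an annotation in each language.

    An image annotated in several languages is counted once per language,
    so these totals add up to more than the collection size.
    """
    totals = {}
    for languages, count in distribution.items():
        for language in languages:
            totals[language] = totals.get(language, 0) + count
    return totals


collection_size = total_images(DISTRIBUTION)
removed = IMAGES_BEFORE_COPYRIGHT_FILTER - collection_size
per_language = images_per_language(DISTRIBUTION)

print("collection size:", collection_size)
print("removed for unclear copyright:", removed)
for language in sorted(per_language):
    print(language, per_language[language])
```

Under these counts the groups sum to 251,866, which matches the stated collection size, and 14,121 images were removed by the copyright filter. The derived per-language totals are 128,156 images with English annotations, 114,863 with German and 90,941 with French.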
| Evaluation Objectives |
The characteristics of the new Wiki MM collection allow for the investigation of the following objectives:
|
| Schedule |
The schedule can be found here:
|
| Organisers |
|