ImageCLEF 2008: Medical Retrieval Task

The medical retrieval task of ImageCLEF 2008 will be run in a manner similar to ImageCLEF 2005-2007.

However, we will be using a different database this year.
We have finally obtained the rights to use a subset of the Goldminer collection for 2008. More information on Goldminer can be found at:

The subset contains all images from articles published in Radiology and Radiographics, including the text of the captions and a link to the HTML of the full-text articles.

Our database distribution will include an XML file with the image ID, the image caption, the title of the journal article in which the image appeared, and the PubMed ID of that article. In addition, a compressed file containing the approximately 66,000 images will be provided.

Participants will need to sign an end-user license agreement (EULA) before obtaining the database.

We will also make available a consolidated collection containing the 2005-2007 image collections, together with their topics and qrels files.

Our foreseen timing is:
20.2.2008: registration form for CLEF
26.3.2008 (originally 15.3.2008): release of the dataset
2.5.2008 (originally 15.4.2008): release of the topics
2.6.2008: submission deadline for runs
15.7.2008: end of the relevance judgement process
20.7.2008: distribution of results to the participants
15.8.2008: submission deadline for working notes papers
16.9.2008: pre-CLEF workshop on visual information retrieval evaluation
17.-19.9.2008: CLEF workshop

Data Download


Data Submission

The submission site is available from here:

Please ensure that your runs comply with the TREC format before submitting them.
We will reject any runs that do not meet the required format.
Also, please note that each group is allowed a maximum of 10 runs. The qrels will be distributed to the participants, so further runs can be evaluated by the participants themselves for their working notes papers.
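
As a rough aid for checking a run before upload, the following Python sketch tests each line against the conventional six-column TREC run layout (topic id, the literal Q0 or another iteration field, document id, rank, score, run tag). The checks actually applied by the organizers are not stated here, so the rules below are assumptions; the function name validate_trec_run and the image ids in the usage example are hypothetical.

```python
from collections import defaultdict


def validate_trec_run(lines):
    """Return a list of (line_number, message) problems found in a TREC-style run.

    Assumed layout per line (whitespace separated):
        topic_id  iteration  doc_id  rank  score  run_tag
    Line number 0 is used for problems that concern the run as a whole.
    """
    problems = []
    seen = defaultdict(set)  # topic id -> document ids already listed for it
    run_tags = set()

    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are tolerated in this sketch

        fields = line.split()
        if len(fields) != 6:
            problems.append((number, f"expected 6 fields, found {len(fields)}"))
            continue

        topic, _iteration, doc_id, rank, score, tag = fields

        if not rank.isdigit():
            problems.append((number, f"rank is not a non-negative integer: {rank!r}"))

        try:
            float(score)
        except ValueError:
            problems.append((number, f"score is not a number: {score!r}"))

        if doc_id in seen[topic]:
            problems.append((number, f"duplicate document {doc_id!r} for topic {topic}"))
        seen[topic].add(doc_id)
        run_tags.add(tag)

    if len(run_tags) > 1:
        problems.append((0, f"more than one run tag in file: {sorted(run_tags)}"))

    return problems


if __name__ == "__main__":
    # Hypothetical example lines; real image ids come from the distributed XML file.
    example = [
        "1 Q0 img_0001 1 0.95 demo_run",
        "1 Q0 img_0002 2 0.90 demo_run",
    ]
    for line_number, message in validate_trec_run(example):
        print(f"line {line_number}: {message}")
```

An empty result only means that none of these assumed checks failed; it does not guarantee that the submission site will accept the run.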

At the time of submission, the following information about each run will be requested. Please let us know if you would like clarifications on how to classify your runs.

1. What was used for the retrieval: Image, text or mixed
2. Was other training data used?
3. Run type: Automatic, Manual, Interactive
4. Query Language

The results and qrels files can be found at:
Please let us know if there are any issues (missing files, duplicate runs, missing or incorrect information about the run type).