ImageCLEFmed VQA


The MEDVQA-GI challenge is being held for the second time this year, with a new goal. One of the new frontiers in AI-driven medical diagnosis is the application of text-to-image generative models. This area integrates language processing and image synthesis to enhance diagnostic capability in the medical field. In this task, we aim to use artificial intelligence to generate medical images from text input, along with optimal prompts for off-the-shelf generative models, building on the dataset collected in the first edition of MEDVQA-GI. The objective is to improve the diagnosis and classification of real medical images using AI-generated imagery. The task is divided into two main subtasks.

Task Description

The first subtask, Image Synthesis (IS), requires participants to leverage text-to-image generative models to create a rich dataset of medical images derived from textual prompts. Examples include creating images of different pathologies based on text descriptions. For instance, given a textual description like "An early-stage colorectal polyp", participants should generate an image that closely represents that description. Participants will be given a development dataset containing prompt and image pairs to develop their solutions. For testing, we will provide a list of prompts; participants will generate one image per prompt and send the images to the organizers.

The second subtask, Optimal Prompt Generation (OPG), asks participants to generate images using their own prompts that fall within specific categories. Participants will submit a model and prompts that will allow the organizers to generate synthetic images. To evaluate the submissions, these synthetic images will be used to train predictive machine learning models. In addition to evaluating the quality of the synthetic images, model complexity, prompt complexity, and hardware requirements will be taken into account. The prompt categories are listed below.

  • A prompt that generates an image containing n polyps.
  • A prompt that generates a polyp in a specific region of the image.
  • A prompt that generates a polyp of a specific type and size.
  • A prompt that generates an image containing no findings from either the esophagus or large bowel.
  • A prompt that generates an image containing one of the following instruments: biopsy forceps, metal clip, or tube.
  • A prompt that generates an image containing one of the following anatomical landmarks: Z-line, pylorus, or cecum.
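Since prompts for this subtask are grouped into the six categories above, one convenient way to keep them organized is a small category-to-prompts manifest. The following is a minimal sketch only: the category keys, the `build_manifest` helper, the JSON layout, and the example prompt wording are all hypothetical and are not prescribed by the organizers.

```python
import json

# Hypothetical identifiers for the six prompt categories listed above.
# The task description names the categories but does not define keys for them.
CATEGORIES = (
    "polyp_count",       # image containing n polyps
    "polyp_region",      # polyp in a specific region of the image
    "polyp_type_size",   # polyp of a specific type and size
    "no_findings",       # no findings, esophagus or large bowel
    "instrument",        # biopsy forceps, metal clip, or tube
    "landmark",          # Z-line, pylorus, or cecum
)


def build_manifest(prompts_by_category):
    """Return a JSON string mapping every category to its list of prompts.

    Categories without prompts are kept as empty lists so that gaps are
    visible; unknown category names are rejected.
    """
    unknown = set(prompts_by_category) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    manifest = {
        category: list(prompts_by_category.get(category, []))
        for category in CATEGORIES
    }
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    # Illustrative prompts only; the wording is an assumption.
    print(build_manifest({
        "polyp_count": ["An endoscopic image containing two polyps"],
        "instrument": ["An endoscopic image showing biopsy forceps"],
    }))
```

Keeping empty categories in the output makes it easy to see at a glance which categories still lack prompts.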


Evaluation methodology

The subtasks will be evaluated in two ways: a subjective assessment by a committee, and a measurement of how accurately a model trained on the AI-generated images can classify real medical images. The metrics we will use are Fréchet Inception Distance (FID) and standard classification metrics such as accuracy, precision, recall, and F1 score, computed on both single-center and multi-center datasets. In addition, the robustness of the model across different centers, which reflects its generalisation capabilities, will also be evaluated.
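To make the classification metrics named above concrete, here is a minimal pure-Python sketch that computes accuracy together with macro-averaged precision, recall, and F1 from true and predicted labels. The function name, the label strings, and the choice of macro averaging are assumptions made for illustration; the task description does not state which averaging the organizers use, and this is not the official evaluation script.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging (an unweighted mean over classes) is an assumption;
    the task description does not specify the averaging scheme.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        # Define the ratio as 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(safe_div(2 * precision * recall, precision + recall))

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


if __name__ == "__main__":
    # Hypothetical labels for four real images.
    truth = ["polyp", "polyp", "normal", "normal"]
    predicted = ["polyp", "normal", "normal", "normal"]
    print(classification_report(truth, predicted))
```

Running the same function separately on single-center and multi-center test sets and comparing the results gives a rough view of how well a model trained on synthetic images generalises across centers.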

Participant registration

Please refer to the general ImageCLEF registration instructions.

Preliminary Schedule

  • 30.11.2023: Registration opens for all ImageCLEF tasks
  • 22.04.2024: Registration closes for all ImageCLEF tasks
  • 05.02.2024: Release of the training and validation sets
  • 03.04.2024: Release of the test sets
  • 06.05.2024: Deadline for submitting the participants' runs
  • 13.05.2024: Release of the processed results by the task organizers
  • 31.05.2024: Deadline for submission of working notes papers by the participants
  • 21.06.2024: Notification of acceptance of the working notes papers
  • 08.07.2024: Camera-ready working notes papers
  • 09-12.09.2024: CLEF 2024, Grenoble, France

Submission Instructions

Task 1: Image Synthesis. Please submit the images you have generated based on the prompts we provided.

Task 2: Optimal Prompt Generation. For this task, you are required to submit both the prompts and the models you have used to generate images. Please organize your submissions by category for ease of evaluation.

Submission Instructions: Please email your submissions directly to


Will be announced at the workshop.


When referring to ImageCLEF 2024, please cite the following publication: