Visual Question Answering in the Medical Domain

[This page is being updated...]

Welcome to the inaugural edition of the Medical Domain Visual Question Answering Task!


With increasing interest in artificial intelligence (AI) to support clinical decision making and improve patient engagement, opportunities to develop and apply algorithms for automated medical image interpretation are being explored. Patients can now access structured and unstructured data related to their health via patient portals, which motivates the need to help them better understand their conditions from the data available to them, including medical images.

Clinicians' confidence in interpreting complex medical images can be significantly enhanced by a “second opinion” provided by an automated system. In addition, patients may be interested in the morphology, physiology, and disease status of anatomical structures around a lesion that has been well characterized by their healthcare providers, and they may not be willing to pay significant amounts for a separate office or hospital visit just to address such questions. Although patients often turn to search engines (e.g., Google) to disambiguate complex terms or to obtain answers about confusing aspects of a medical image, search results may be nonspecific, erroneous, misleading, or overwhelming in volume.


  • 26.10.2017: Website goes live.

Task Description

Visual Question Answering is a new and exciting problem that combines natural language processing and computer vision techniques. Inspired by the recent success of visual question answering in the general domain, we propose a pilot task this year that focuses on visual question answering in the medical domain. Given a medical image accompanied by a clinically relevant question, participating systems are tasked with answering the question based on the visual content of the image.


The data will tentatively include a training set (5K) and a validation set (0.5K) of medical images accompanied by question-answer pairs, and a test set (0.5K) of medical images with questions only. To create the datasets for the proposed task, we consider medical-domain images extracted from PubMed Central articles (essentially a subset of the data used in the ImageCLEF 2017 caption prediction task).
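As a rough illustration of the setup described above, the sketch below models one sample as an image reference, a question, and an optional answer (absent for test items), and fills in test answers with a trivial most-frequent-answer baseline. The actual data format has not been announced, so the class name `VQAExample`, its fields, the helper functions, and the toy records are all hypothetical and serve only to make the input/output shape of the task concrete.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VQAExample:
    """One sample: an image, a question, and (outside the test set) an answer."""
    image_id: str                  # hypothetical identifier for the image
    question: str
    answer: Optional[str] = None   # None for test examples (questions only)


def majority_answer_baseline(train: List[VQAExample]) -> str:
    """Return the most frequent training answer; ignores image and question."""
    counts = Counter(ex.answer for ex in train if ex.answer is not None)
    return counts.most_common(1)[0][0]


def predict(test: List[VQAExample], default_answer: str) -> List[VQAExample]:
    """Attach the same answer to every test question."""
    return [VQAExample(ex.image_id, ex.question, default_answer) for ex in test]


# Toy, made-up records purely for illustration (not taken from the task data).
train = [
    VQAExample("img_001", "What imaging modality is shown?", "ct"),
    VQAExample("img_002", "What imaging modality is shown?", "mri"),
    VQAExample("img_003", "Which modality was used?", "ct"),
]
test = [VQAExample("img_101", "What imaging modality is shown?")]

runs = predict(test, majority_answer_baseline(train))
```

A real system would replace the baseline with a model that actually reads the image and the question; the baseline is shown only because it exercises the same interface with no external dependencies.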

Evaluation Methodology

Information will be posted soon.

Preliminary Schedule

  • 08.11.2017: registration opens for all ImageCLEF tasks (open until 27.04.2018)
  • 28.02.2018: development (training, validation) data release starts
  • 20.03.2018: test data release starts
  • 01.05.2018: deadline for submitting participant runs
  • 15.05.2018: release of the processed results by the task organizers
  • 31.05.2018: deadline for submission of working notes papers by the participants
  • 15.06.2018: notification of acceptance of the working notes papers
  • 29.06.2018: camera-ready working notes papers
  • 10-14.09.2018: CLEF 2018, Avignon, France

Participant Registration

Please refer to the general ImageCLEF registration instructions.

Submission Instructions

Information will be posted closer to the submission deadline.


  • Sadid Hasan <sadid.hasan(at)>, Philips Research Cambridge, USA
  • Yuan Ling <yuan.ling(at)>, Philips Research Cambridge, USA
  • Oladimeji Farri <dimeji.farri(at)>, Philips Research Cambridge, USA
  • Henning Müller <henning.mueller(at)>, University of Applied Sciences Western Switzerland, Sierre, Switzerland
  • Matthew Lungren <mlungren(at)>, Stanford University Medical Center, USA
