ToPicto

Welcome to the 1st edition of the ToPicto task!

Motivation

Several genetic diseases, such as Rett syndrome, can result in language impairment, thereby interfering with the development of language skills such as speaking, listening, reading, and writing. Both language production and comprehension are impaired. Language impairment may also arise from incidents such as a car accident or a stroke, leading to aphasia — a partial or complete loss of the ability to express oneself or understand written and spoken language. In these particular cases, Augmentative and Alternative Communication (AAC) can be implemented. AAC involves the use of pictograms to help individuals accurately convey their messages [1].

Example pictograms from ARASAAC. From left to right: “music”, “brush the teeth”, “what is your name?”.

In AAC, a pictogram is an image representing a more or less concrete concept. It can stand for a single word, a named entity, or a polylexical expression, among others (see the example pictograms taken from ARASAAC, a collection of over 25,000 pictograms freely available under a Creative Commons CC-BY-NC-SA license).

Using pictograms as a communication aid has proven effective in visualizing syntax, manipulating words, and facilitating language access [2]. Moreover, the use of AAC has a positive social impact for individuals with language impairment. The “Croix-Rouge” (French Red Cross) has identified a reduction in stress, an improvement in autonomy and health, and greater serenity and enjoyment in daily life [3]. However, not everyone has prior knowledge of AAC and pictograms. When a “verbal” person wants to communicate with an AAC user, a tool that converts either modality (speech or text) into a sequence of pictograms is therefore essential. By providing a relevant and comprehensible sequence of pictograms to the person with a language impairment, such a tool allows communication between the two parties to begin.

The goal of ToPicto is to bring together linguists, computer scientists, and translators to develop new methods for translating either speech or text into a corresponding sequence of pictograms.

News

  • 21.12.2023: Website goes live and registration is open.

Tasks Description

Participants are asked to develop solutions for translating text or speech into a sequence of pictogram terms, each of which is linked to a unique pictogram image from ARASAAC.

ImageCLEFToPicto 2024 consists of two subtasks:

  • Text-to-Picto
  • Speech-to-Picto

Text-to-Picto

Description

The Text-to-Picto task focuses on the automatic generation of a corresponding sequence of pictogram terms from a French text. This challenge can be seen as a translation problem, where the source language is French and the target language is French pictogram terms.

The translation produced must follow the specifications for a translation into pictograms, so that it is understandable by AAC users.

Speech-to-Picto

Description

Speech-to-Picto focuses on two modalities: speech and pictograms. The challenge is to translate speech directly into pictogram terms without an intermediate transcription step. This direct approach is a current focus of the speech community in spoken language translation systems.



Data

The data for the two tasks is built from the TCOF corpus (https://tcof.atilf.fr/) [4]. TCOF contains interactions among adults, between adults and children, and among children, covering a wide range of topics including debates, everyday situations, and medical consultations. This type of data is representative of the interactions we observe between caregivers (families, medical staff) and individuals who rely on pictograms due to language impairments.

For ToPicto, each speech utterance or oral transcription is paired with a corresponding sequence of terms, each linked to a pictogram.
The input and expected output formats for each task are detailed below.

Text-to-Picto

Input: a JSON file with the following fields (tgt and pictos are provided for training and validation data only; for test data, you will be given only the id and src):

| Tag | Definition | Example |
| --- | --- | --- |
| id | unique identifier of each utterance | cefc-tcof-Acc_del_07-1 |
| src | source of the utterance: text from oral transcription | tu peux pas savoir |
| tgt | target of the utterance: sequence of pictogram terms (tokens) | toi pouvoir savoir non |
| pictos | a list of pictogram identifiers linked to each pictogram term (the size is the same as the target output).* | [6625, 35949, 16885, 5526] |

* This information is provided for reference to give an idea of the input with the sequence of pictogram images. Each image can be obtained from the ARASAAC website as follows: https://api.arasaac.org/v1/pictograms/6625
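
As an illustration of how the fields above fit together, here is a minimal Python sketch that pairs each pictogram term with its ARASAAC image URL. It assumes that the JSON file is a list of utterance objects and that the terms in tgt are separated by whitespace; neither is confirmed by the task description. The function name picto_urls is hypothetical.

```python
import json

# Base URL taken from the example link in the footnote above.
ARASAAC_API = "https://api.arasaac.org/v1/pictograms"


def picto_urls(record):
    """Return (term, image URL) pairs for one utterance record.

    Assumption: pictogram terms in "tgt" are whitespace-separated.
    """
    terms = record["tgt"].split()
    ids = record["pictos"]
    if len(terms) != len(ids):
        raise ValueError("tgt and pictos must have the same length")
    return [(term, f"{ARASAAC_API}/{pid}") for term, pid in zip(terms, ids)]


# Assumed layout: a JSON list of utterance objects (the real file may differ).
sample = json.loads("""
[
  {
    "id": "cefc-tcof-Acc_del_07-1",
    "src": "tu peux pas savoir",
    "tgt": "toi pouvoir savoir non",
    "pictos": [6625, 35949, 16885, 5526]
  }
]
""")

for term, url in picto_urls(sample[0]):
    print(term, url)
```

The sketch only builds the URLs and does not send any request to ARASAAC.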

Output: a JSON file with the following information:

| Tag | Definition | Example |
| --- | --- | --- |
| id | unique identifier of each utterance | cefc-tcof-Acc_del_07-1 |
| hyp | hypothesis given by your system / model corresponding to the sequence of pictogram terms | toi savoir non |
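
A possible way to serialize system predictions into this output format is sketched below in Python. The exact file layout is not specified here, so the sketch assumes a JSON list with one object per utterance; build_submission and the predictions dictionary are hypothetical names used for illustration only.

```python
import json


def build_submission(predictions):
    """Turn {utterance id: list of pictogram terms} into output records.

    Assumption: the expected file is a JSON list of {"id", "hyp"} objects.
    """
    return [
        {"id": utt_id, "hyp": " ".join(terms)}
        for utt_id, terms in predictions.items()
    ]


# Hypothetical system output for the example utterance.
predictions = {"cefc-tcof-Acc_del_07-1": ["toi", "savoir", "non"]}

submission = build_submission(predictions)

# ensure_ascii=False keeps accented French characters readable in the file.
payload = json.dumps(submission, ensure_ascii=False, indent=2)
print(payload)
```

The resulting string can then be written to a file opened with UTF-8 encoding.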

Speech-to-Picto

The input and output are the same as for Text-to-Picto, except that the source is the audio file in .wav format linked to the ID, for example cefc-tcof-Acc_del_07-1.wav.

Evaluation methodology

The evaluation is conducted using BLEU [5], METEOR [6], and the Picto-term Error Rate (PictoER) [7]. For all three metrics, the evaluation involves comparing the hypothesis (hyp) with the target (tgt), i.e., the sequence of pictogram terms.
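
To give an intuition for PictoER, the Python sketch below computes a word-error-rate style score over pictogram terms: the edit distance between hyp and tgt divided by the number of terms in tgt. This is an assumption based on the cited reference [7]; the official scoring script may tokenize or aggregate differently. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference term
                curr[j - 1] + 1,     # insertion of a hypothesis term
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def picto_er(tgt, hyp):
    """Assumed PictoER: term-level edit distance over reference length."""
    ref_terms = tgt.split()
    hyp_terms = hyp.split()
    if not ref_terms:
        raise ValueError("tgt must contain at least one pictogram term")
    return edit_distance(ref_terms, hyp_terms) / len(ref_terms)


# Example from the tables above: one term ("pouvoir") is missing from hyp.
print(picto_er("toi pouvoir savoir non", "toi savoir non"))  # 0.25
```

As with word error rate, lower is better under this definition, and a hypothesis identical to the target scores 0.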

Scripts provided: We provide a script that maps the pictogram terms to the pictogram images to visualize the output sequence.

Participant registration

The registration is open here:

For general information, please refer to ImageCLEF registration instructions.

Important Dates

  • 09.02.2024 (new date; originally 17.01.2024): Training and validation data release.
  • 14.03.2024: Test data release.

More information will be added soon!

Submission instructions

More information will be added soon!

Results

More information will be added soon!

Contact

Organizers:

  • Cécile Macaire — <cecile.macaire(at)univ-grenoble-alpes.fr>, Université Grenoble Alpes, LIG, France
  • Benjamin Lecouteux — <benjamin.lecouteux(at)univ-grenoble-alpes.fr>, Université Grenoble Alpes, LIG, France
  • Didier Schwab — <didier.schwab(at)univ-grenoble-alpes.fr>, Université Grenoble Alpes, LIG, France
  • Emmanuelle Esperança-Rodier — <emmanuelle.esperanca-rodier(at)univ-grenoble-alpes.fr>, Université Grenoble Alpes, LIG, France

Acknowledgments

ToPicto is organized as part of the PROPICTO project, and supported by the following partners:

References

[1] Romski, M., & Sevcik, R. A. (2005). Augmentative communication and early intervention: Myths and realities. Infants & Young Children, 18(3), 174-185.
[2] Cataix-Nègre, É. (2017). Communiquer autrement: Accompagner les personnes avec des troubles de la parole ou du langage : les communications alternatives. De Boeck Supérieur.
[3] Communication alternative améliorée (CAA) : la Croix-Rouge française dévoile sa première étude d’impact social ! (2021, April 12). Croix-Rouge. Retrieved June 28, 2023, from https://www.croix-rouge.fr/actualite/communication-alternative-amelioree...
[4] André, V., & Canut, E. (2010). Mise à disposition de corpus oraux interactifs : le projet TCOF (Traitement de Corpus Oraux en Français). Pratiques. Linguistique, littérature, didactique, (147-148), 35-51.
[5] Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002, July). Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics (pp. 311-318).
[6] Banerjee, S., & Lavie, A. (2005, June). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization (pp. 65-72).
[7] Woodard, J. P., & Nelson, J. T. (1982, March). An information theoretic measure of speech recognition performance. In Workshop on standardisation for speech I/O technology, Naval Air Development Center, Warminster, PA.