ImageCLEFlifelog

Welcome to the 1st edition of the Lifelog Task!

Motivation

The availability of a large variety of personal devices, such as smartphones, video cameras and wearable devices that can capture pictures, videos, and audio clips at every moment of our lives, is creating vast archives of personal data, in which the totality of an individual's experiences, captured multi-modally through digital sensors, is stored permanently as a personal multimedia archive. These unified digital records, commonly referred to as lifelogs, have been gathering increasing attention in recent years within the research community, due to the need for systems that can automatically analyse these huge amounts of data in order to categorize and summarize them, and to query them to retrieve the information that the user may need.

Despite the increasing number of successful related workshops and panels (JCDL 2015, iConf 2016, ACM MM 2016), lifelogging has seldom been the subject of a rigorous comparative benchmarking exercise such as, for example, the new lifelog evaluation task at NTCIR-12. This task aims to bring lifelogging to the attention of as wide an audience as possible and to promote research into some of the key challenges of the coming years.

News

  • 25.10.2016: Website is up!

Schedule

  • 14.11.2016: Registration opens (Register here).
  • 14.11.2016: Development data release (after having registered, get the data from here).
  • 20.03.2017: Test data release.
  • 01.05.2017: Deadline for submission of runs by the participants 11:59:59 PM GMT.
  • 15.05.2017: Release of processed results by the task organizers.
  • 26.05.2017: Deadline for submission of working notes papers by the participants 11:59:59 PM GMT.
  • 17.06.2017: Notification of acceptance of the working notes papers.
  • 01.07.2017: Camera ready working notes papers.
  • 11.-14.09.2017: CLEF 2017, Dublin, Ireland

Subtasks Overview

Dataset

The Lifelog dataset consists of data from three lifeloggers for a period of about one month each. The data consists of a large collection of wearable camera images (at about 2 per minute) and an XML description of the semantic locations (e.g. Starbucks cafe, McDonalds restaurant, home, work) and the physical activities (e.g. walking, transport, cycling) of the lifelogger at a granularity of one minute. A summary of the data collection is shown in Table 1.

Given that lifelog data is typically visual in nature, and in order to reduce the barriers to participation, the output of the CAFFE CNN-based visual concept detector is included in the dataset as additional metadata. This classifier provides labels and probabilities of occurrence for 1,000 objects in every image. The accuracy of the CAFFE visual concept detector varies considerably and is representative of the current generation of off-the-shelf visual analytics tools.

Number of Lifeloggers: 3
Size of the Collection: 18.18 GB
Number of Images: 88,124
Number of Locations: 130
Number of Visual Concepts: 1,000
Number of Lifelog Retrieval Task (LRT) Topics: 16
Number of Lifelog Summarization Task (LST) Topics: 5

Table 1: Statistics of Lifelog Dataset

Directory Structure

The root directory contains two files and one sub-directory:

  • Concept Detector Output (ImageCLEF-Lifelog_Concepts.txt);
  • XML dataset file (ImageCLEF-Lifelog_dataset.xml);
  • Images sub-directory (all of the images in the dataset are organised into user_id/date/images).

Concept Detector Output file format

A text file containing the output of the CAFFE concept detectors. The file is formatted as follows:

  • One title line containing labels (image_path, concept 1, concept 2, ..., concept 1000)
  • One line per image in the collection (path, value, value, ..., value)

Not every image in the dataset will have a line in this concepts file. For some images, the quality is considered too low (e.g. movement, darkness, etc.), so CAFFE does not output any value for them, i.e., these images are skipped in the concepts file. Refer to the CAFFE Concept list file to map the concept numbers to the concept titles (see the file in the Attachment section).
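The file layout described above can be read with a few lines of Python. This is only a minimal sketch: it assumes the fields are comma-separated (the delimiter is not stated here), and the helper names read_concepts and top_concepts, as well as the three-concept sample, are made up for illustration. Images that were skipped by the detector simply have no entry.

```python
import csv
import io

def read_concepts(handle, delimiter=","):
    """Map each image path to a {concept_label: probability} dict.

    Assumes a delimited text file whose first line holds the labels.
    """
    reader = csv.reader(handle, delimiter=delimiter, skipinitialspace=True)
    header = next(reader)  # title line: image_path, concept 1, concept 2, ...
    labels = [label.strip() for label in header[1:]]
    scores = {}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        path, values = row[0].strip(), row[1:]
        scores[path] = {label: float(v) for label, v in zip(labels, values)}
    return scores

def top_concepts(scores, image_path, k=3):
    """Return the k highest-scoring concepts, or [] if the image was skipped."""
    per_image = scores.get(image_path)
    if per_image is None:
        return []
    return sorted(per_image.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Made-up sample with three concepts instead of 1,000.
sample = io.StringIO(
    "image_path, concept 1, concept 2, concept 3\n"
    "u1/2015-02-23/a.jpg, 0.10, 0.70, 0.20\n"
)
scores = read_concepts(sample)
best = top_concepts(scores, "u1/2015-02-23/a.jpg", k=2)
missing = top_concepts(scores, "u1/2015-02-23/missing.jpg")
```

For the real file, the same function could be called on open("ImageCLEF-Lifelog_Concepts.txt", newline=""), adjusting the delimiter if the file turns out to use a different one.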

XML Dataset file format

The .xml file is a simple aggregation of all users' data. It is structured as follows:

The root node of the data is the <users> tag, which contains the data of all users. Each user has a <user> tag that carries the user ID as an attribute, for example: <user id="u1">. Inside the <user> node is that user's data:

  • Firstly, there are a number of tags constituting the user's profile information (gender, weight (in kg), height (in cm) and age), for example:
    • <gender>Male</gender>
    • <weight>75</weight>
    • <age>39</age>
  • Following that there is a <days> tag, which contains the lifelogging information of that user organised per day. Each day is enclosed in a <day> tag that holds the date (the <date> tag), the relative path to the directory containing the images captured on that day (the <images-directory> tag), and then the minutes of that day under a root tag called <minutes>.
  • The <minutes> tag contains exactly 1440 child tags (called <minute>). Each child has an ID (for example <minute id="0">, <minute id="1">, <minute id="2">, etc.) and represents one minute of the day, ordered from 0 = 00:00 (12:00 AM) to 1439 = 23:59 (11:59 PM).
  • Each minute contains 0 or 1 location (<location> tag), 0 or 1 activity (<activity> tag), and 0 or more captured images (<images> tag with <image> child nodes); each <image> node has a relative path to the image and a unique image ID.
  • The location information is captured by the Moves app (https://www.moves-app.com/) and refers either to semantic locations (i.e., locations related to the logger, such as Home, Work, DCU Computing building, GYM, Name of a Store, etc.) or to landmark locations registered by Moves. This tag can contain information in several languages.
  • The activity information is also captured by the Moves app and represents physical activities or modes of transport.
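The structure above can be walked with Python's standard xml.etree.ElementTree module. The following is a sketch under stated assumptions: it relies only on the tag names given in the list (<user>, <day>, <date>, <minute>, <location>, <activity>, <images>, <image>), it does not assume anything about the inner layout of <location>, <activity> or <image>, and the helper names and the small inline sample are hypothetical.

```python
import xml.etree.ElementTree as ET

def minute_to_clock(minute_id):
    """Convert a minute ID (0..1439) to an HH:MM string in 24-hour time."""
    hours, minutes = divmod(int(minute_id), 60)
    return f"{hours:02d}:{minutes:02d}"

def text_of(node):
    """Concatenate all text under a node; None if the node is absent or empty."""
    if node is None:
        return None
    text = "".join(node.itertext()).strip()
    return text or None

def iter_minutes(root):
    """Yield one record per <minute>, following the structure described above."""
    for user in root.iter("user"):
        for day in user.iter("day"):
            date = text_of(day.find("date"))
            for minute in day.iter("minute"):
                images = minute.find("images")
                yield {
                    "user": user.get("id"),
                    "date": date,
                    "time": minute_to_clock(minute.get("id")),
                    "location": text_of(minute.find("location")),
                    "activity": text_of(minute.find("activity")),
                    "n_images": 0 if images is None else len(images.findall("image")),
                }

# Made-up fragment: two of the 1440 minutes of one day.
sample = """
<users>
  <user id="u1">
    <days>
      <day>
        <date>2015-02-23</date>
        <images-directory>u1/2015-02-23</images-directory>
        <minutes>
          <minute id="0"/>
          <minute id="428">
            <location>Home</location>
            <activity>walking</activity>
            <images><image/><image/></images>
          </minute>
        </minutes>
      </day>
    </days>
  </user>
</users>
""".strip()
records = list(iter_minutes(ET.fromstring(sample)))
```

For the real dataset, ET.parse("ImageCLEF-Lifelog_dataset.xml").getroot() would replace ET.fromstring(sample). Minutes without a location, an activity or images yield None or 0 for those fields, which matches the "0 or 1" and "0 or more" cardinalities above.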

Images sub-directory

The images directory contains three subdirectories (u1, u2 and u3), which hold the images of each user who donated data to the dataset. Within each of these subdirectories are additional subdirectories labeled with the date of capture. Within each date folder one can find the images from the wearable cameras on that day. An example is: /u1/2015-02-23/b00000003_21i6bq_20150223_070810e.jpg, which is an image captured at 07:08 am on 23rd February 2015.
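The capture time can be recovered from a file name like the one in the example. The sketch below assumes that every file name contains a _YYYYMMDD_HHMMSS block, which is inferred from that single example and may not hold for all images; the function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern: ..._YYYYMMDD_HHMMSS<suffix>.jpg (inferred from one example only).
_STAMP = re.compile(r"_(\d{8})_(\d{6})")

def capture_time(image_path):
    """Return the capture datetime encoded in the file name, or None if absent."""
    match = _STAMP.search(image_path)
    if match is None:
        return None
    return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")

def minute_id(moment):
    """Map a datetime to the 0..1439 minute ID used in the XML dataset file."""
    return moment.hour * 60 + moment.minute

path = "/u1/2015-02-23/b00000003_21i6bq_20150223_070810e.jpg"
taken = capture_time(path)
```

Here taken is 07:08:10 on 23 February 2015, and minute_id(taken) gives 428, the <minute> that this image would fall under.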

Topics

Aside from the data, the dataset includes a set of topics (queries) that are representative of the real-world information needs of lifeloggers. There are 16 and 5 ad-hoc search topics representing the challenge of retrieval for the LRT task and the challenge of summarization for the LST task, respectively.

SubTask 1: Lifelog retrieval (LRT)

The participants should analyse the lifelog data and return the correct answers to several specific queries. For example:

  • Shopping for a Bottle of Wine: Find the moment(s) when I was shopping for wine in a supermarket.
  • Shopping For Fish: Find the moment(s) when I was shopping for fish in the supermarket.
  • The Metro: Find the moment(s) when I was riding a metro.

SubTask 2: Lifelog summarization (LST)

The participants should analyse all the images and summarize them according to specific requirements. The summary should consist of 50 images, and it is required to be both relevant and diverse. All of the topics in this subtask will have more than 50 relevant images, so a submission that does not contain 50 images for a topic will be considered incorrectly formatted. The selected images are considered diverse if they depict different moments of the lifelogger in terms of activity, location, time of day, viewpoint, etc. for the queried topic. For example:

  • Public Transport: Summarize the use of public transport by a user.
    The participant should recognize each different means of transport depicted in the images of the dataset and, if a particular means of transport is depicted at different times of day, the participant should recognize this as well.

Registering for the task and accessing the data

Please register by following the instructions found on the main ImageCLEF 2017 webpage.

Following the approval of registration for the task, the participants will be given access rights to download the data files.

Submission instructions

The submissions will be received through the ImageCLEF 2017 system. Go to "Runs", then "Submit run", and then select the track.

Participants will be permitted to submit up to 10 runs.

Each system run will consist of a single ASCII plain text file. Before submitting the run files, you are expected to pre-validate their format using the automated format check tool provided by the organizers. You can access it here. Once the checks have passed, please add the tag "OK" to your run file name. This will allow us to know that everything is OK with the format. The results of each run should be given on separate lines in the text file. The format of the text file is as follows:

SubTask 1: Lifelog retrieval

A submitted run for the LRT sub-task must be in the form of a CSV file in the following format:

[topic id, image id, confidence score]

Where:

  • topic id: Number of the queried topic, e.g., from 1 to 16 for the development set.
  • image id: The image ID that answers the topic. Each image ID is mapped to a moment. If more than one sequential image answers the topic (i.e. the moment is more than one image in duration), then any image from within that moment is acceptable.
  • confidence score: from 0 to 1.

Sample:

1, u1_2015-02-26_095916_1, 1.00
1, u1_2015-02-26_095950_2, 1.00
1, u1_2015-02-26_100028_1, 1.00
...
16, u3_2015-08-01_144854_1, 1.00
16, u3_2015-08-01_145314_1, 1.00
16, u3_2015-08-01_145345_2, 1.00
16, u3_2015-08-01_145531_1, 0.80
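A run in this format can be produced with a short helper. This is a sketch, not the official tooling: the function name write_run is hypothetical, and grouping rows by topic and ordering them by decreasing confidence is an assumption based on the sample above rather than a stated requirement.

```python
import io

def write_run(handle, results):
    """Write (topic_id, image_id, confidence) tuples, one result per line.

    Rows are grouped by topic and sorted by decreasing confidence
    (an assumption; the required line order is not specified).
    """
    for topic_id, image_id, confidence in sorted(results, key=lambda r: (r[0], -r[2])):
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range for {image_id}: {confidence}")
        handle.write(f"{topic_id}, {image_id}, {confidence:.2f}\n")

results = [
    (16, "u3_2015-08-01_145531_1", 0.8),
    (1, "u1_2015-02-26_095916_1", 1.0),
]
buffer = io.StringIO()
write_run(buffer, results)
run_text = buffer.getvalue()
```

To write an actual submission, open the output file with encoding="ascii" so that any non-ASCII character is caught before the official format check.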

SubTask 2: Lifelog summarization

A submitted run for the LST sub-task must be in the form of a CSV file in the following format:

[topic id, image id, confidence score]

Where:

  • topic id: Number of the queried topic, e.g., from 1 to 5 for the development set.
  • image id: ID of a relevant image.
  • confidence score: from 0 to 1.

The CSV file should contain a diversified summarization in 50 images for each query.

Sample:

1, u1_2015-02-26_095916_1, 1.00
1, u1_2015-02-26_095950_2, 0.95
1, u1_2015-02-26_100028_1, 0.92
...
5, u3_2015-08-01_145314_1, 1.00
5, u3_2015-08-01_145345_2, 0.89
5, u3_2015-08-01_145531_1, 0.86
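Before running the official format check, a quick local sanity check of the 50-images-per-topic requirement may save a round trip. The sketch below is not the organizers' validation script; the function name check_summary_run is hypothetical, and treating a repeated image within a topic as a problem is an assumption.

```python
import csv
import io
from collections import Counter

def check_summary_run(handle, images_per_topic=50):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    counts = Counter()
    seen = set()
    for line_no, row in enumerate(csv.reader(handle, skipinitialspace=True), start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            problems.append(f"line {line_no}: expected 3 fields, got {len(row)}")
            continue
        topic, image_id, score = (field.strip() for field in row)
        try:
            in_range = 0.0 <= float(score) <= 1.0
        except ValueError:
            in_range = False
        if not in_range:
            problems.append(f"line {line_no}: confidence {score!r} is not in [0, 1]")
        if (topic, image_id) in seen:
            # Assumption: the same image should not be listed twice for a topic.
            problems.append(f"line {line_no}: duplicate image {image_id} for topic {topic}")
        seen.add((topic, image_id))
        counts[topic] += 1
    for topic, n in sorted(counts.items()):
        if n != images_per_topic:
            problems.append(f"topic {topic}: {n} images, expected {images_per_topic}")
    return problems

good_run = "".join(f"1, img_{i}, 0.50\n" for i in range(50))
bad_run = "1, img_0, 1.00\n1, img_1, 1.50\n"
good_problems = check_summary_run(io.StringIO(good_run))
bad_problems = check_summary_run(io.StringIO(bad_run))
```

In this made-up example good_problems is empty, while bad_problems reports the out-of-range confidence on line 2 and the fact that topic 1 has only 2 images.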

Verification

A script will be available for verifying the correct format of the files. The verification script can be downloaded from the validation folder in the dataset download area.

Evaluation Methodology

For each subtask, the final score is computed as the arithmetic mean of the scores over all queries. For each query, the evaluation method is applied as follows:

SubTask 1: Lifelog retrieval

Evaluation metrics based on NDCG (Normalized Discounted Cumulative Gain) at different depths are used, i.e., NDCG@N, where N varies with the type of topic: for recall-oriented topics N will be larger (>20), and for precision-oriented topics N will be smaller (5, 10 or 20).
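For orientation, NDCG@N can be computed as sketched below. This uses one common formulation, with a gain of rel/log2(rank + 1) at each rank; the exact variant used by the organizers, and how several returned images from the same moment are scored, are not specified here, so the code should be read as an illustration only. The function names are hypothetical.

```python
import math

def dcg(gains):
    """Discounted cumulative gain of a list of gains given in rank order."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg_at_n(ranked_gains, all_gains, n):
    """NDCG@n.

    ranked_gains: relevance of each returned item, in the order returned.
    all_gains: relevance of every relevant item in the ground truth,
               used to build the ideal ranking.
    """
    ideal = dcg(sorted(all_gains, reverse=True)[:n])
    if ideal == 0:
        return 0.0
    return dcg(ranked_gains[:n]) / ideal

# Made-up example: two relevant items exist; the run returns them at ranks 1 and 3.
score = ndcg_at_n([1, 0, 1], all_gains=[1, 1], n=3)
perfect = ndcg_at_n([1, 1, 0], all_gains=[1, 1], n=3)
```

In the example, the run's DCG is 1 + 1/2 = 1.5 and the ideal DCG is 1 + 1/log2(3), so score is about 0.92, while a run that places both relevant items first reaches 1.0.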

SubTask 2: Lifelog summarization

For assessing performance, classic metrics will be deployed. These metrics are:

  • Cluster Recall at X (CR@X) - a metric that assesses how many different clusters from the ground truth are represented among the top X results;
  • Precision at X (P@X) - measures the proportion of relevant photos among the top X results;
  • F1-measure at X (F1@X) - the harmonic mean of the previous two.

Various cut-off points are to be considered, e.g., X=5, 10, 20, 30, 40, 50. The official ranking metric this year will be the F1-measure@10, which gives equal importance to diversity (via CR@10) and relevance (via P@10).
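The three metrics can be sketched in a few lines. The code below assumes that the ground truth maps each relevant image to one cluster, that P@X divides by X, and that CR@X divides by the total number of ground-truth clusters for the topic; these normalizations are common choices but are assumptions here, and the function name and sample data are made up.

```python
def summary_metrics(ranked_images, image_to_cluster, total_clusters, x=10):
    """Return (P@X, CR@X, F1@X) for one topic.

    image_to_cluster maps each relevant image ID to its ground-truth cluster;
    images missing from the mapping are treated as not relevant.
    """
    top = ranked_images[:x]
    relevant = [img for img in top if img in image_to_cluster]
    precision = len(relevant) / x
    cluster_recall = len({image_to_cluster[img] for img in relevant}) / total_clusters
    if precision + cluster_recall == 0:
        f1 = 0.0
    else:
        # Harmonic mean of precision and cluster recall.
        f1 = 2 * precision * cluster_recall / (precision + cluster_recall)
    return precision, cluster_recall, f1

# Made-up example: 4 returned images, 3 relevant, covering 2 of 4 clusters.
clusters = {"a": "bus", "b": "bus", "d": "train"}
p, cr, f1 = summary_metrics(["a", "b", "c", "d"], clusters, total_clusters=4, x=4)
```

In this example P@4 is 0.75, CR@4 is 0.5 and F1@4 is 0.6. Returning two images of the same cluster ("a" and "b") raises precision but not cluster recall, which is why the F1-measure rewards summaries that are both relevant and diverse.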

Participants are allowed to undertake the sub-tasks in an interactive or automatic manner. For interactive submissions, a maximum of five minutes of search time is allowed per topic. In particular, the organizers would like to emphasize methods that allow interaction with real users (via Relevance Feedback (RF), for example): besides the best performance, the way of interacting (such as the number of RF iterations) and the level of innovation of the method (for example, a new way to interact with real users) are also encouraged.

Submitting a working notes paper to CLEF

Upon the completion of the task, participating teams are expected to present their systems in a working notes paper, regardless of their results. You should keep in mind that the main goal of the lab is not to win the benchmark but to compare techniques based on the same data, so that everyone can learn from the results. Authors are invited to submit using the LNCS proceedings format.

The CLEF 2017 working notes will be published in the CEUR-WS.org proceedings, facilitating the indexing by DBLP. According to the CEUR-WS policies, a light review of the working notes will be conducted by the task organizers to ensure quality.

Working notes will have to be submitted before 26th May 2017 11:59 pm - midnight - Central European Summer Time, through the EasyChair submission system. The working notes papers are technical reports written in English and describing the participating systems and the conducted experiments. To avoid redundancy, the papers should *not* include a detailed description of the actual task, data set and experimentation protocol. Instead of this, the papers are required to cite both the general ImageCLEF overview paper and the corresponding lifelog task overview paper, and to present the official results returned by the organizers. Bibtex references will be available soon. A general structure for the paper should provide at a minimum the following information:

  1. Title
  2. Authors
  3. Affiliations
  4. Email addresses of all authors
  5. The body of the text. This should contain information on:
    • tasks performed
    • main objectives of experiments
    • approach(es) used and progress beyond state-of-the-art
    • resources employed
    • results obtained
    • analysis of the results
    • perspectives for future work

The paper should not exceed 12 pages. Further instructions on how to write and submit your working notes will be available soon on this page.

Recommended Reading

[1] Cathal Gurrin, Xavier Giro-i-Nieto, Petia Radeva, Mariella Dimiccoli, Håvard Johansen, Hideo Joho, Vivek K Singh, "LTA 2016: The First Workshop on Lifelogging Tools and Applications", ACM Multimedia, Amsterdam, The Netherlands, 2016.

[2] Cathal Gurrin, Hideo Joho, Frank Hopfgartner, Liting Zhou, Rami Albatal, "Overview of NTCIR-12 Lifelog Task", Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, Tokyo, Japan, 2016.

[3] Duc-Tien Dang-Nguyen, Luca Piras, Giorgio Giacinto, Giulia Boato, Francesco GB De Natale, "A hybrid approach for retrieving diverse social images of landmarks", IEEE International Conference on Multimedia and Expo (ICME), Turin, Italy, 2015.

[4] Working notes of the 2015 MediaEval Retrieving Diverse Social Images task, CEUR-WS.org, Vol. 1436, ISSN: 1613-0073.

[5] B. Ionescu, A.L. Gînscă, B. Boteanu, M. Lupu, A. Popescu, H. Müller, "Div150Multi: A Social Image Retrieval Result Diversification Dataset with Multi-topic Queries", ACM MMSys, Klagenfurt, Austria, 2016.

Helpful tools and resources

Eyeaware lifelogging framework;

OpenCV – Open Source Computer Vision;

LIRE: Lucene Image Retrieval;

trec_eval scoring software;

ImageCLEF - Image Retrieval in CLEF;

Weka Data Mining Software;

Nvidia DIGITS;

Caffe deep learning framework;

Creative Commons.

Organizers

  • Duc-Tien Dang-Nguyen <duc-tien.dang-nguyen(at)dcu.ie>, Dublin City University, Ireland
  • Luca Piras <luca.piras(at)diee.unica.it>, University of Cagliari, Cagliari, Italy
  • Michael Riegler <michaari(at)student.matnat.uio.no>, University of Oslo, Norway
  • Cathal Gurrin <cgurrin(at)computing.dcu.ie>, Dublin City University, Ireland
  • Giulia Boato <giulia.boato(at)unitn.it>, University of Trento, Italy

Attachment

  • Concept list file (26.08 KB)