The RobotVision task was proposed to the ImageCLEF participants for the first time in 2009. The task attracted considerable attention, with 19 registered research groups, 7 groups eventually participating and a total of 27 submitted runs. The second edition of the RobotVision task will be part of the ImageCLEF@ICPR contest, introducing new challenges to the pattern recognition community.

The second edition of the RobotVision task addresses the problem of visual place classification. Specifically, participants will be asked to classify rooms and areas on the basis of image sequences, captured by a stereo camera mounted on a mobile robot within an office environment, under varying illumination conditions. The system built by the participants should be able to answer the question "where are you?" when presented with a test sequence imaging rooms seen during training (from different viewpoints and under different conditions) or additional rooms that were not imaged in the training sequence.


  • 10.05.2010 - Important dates (see below) updated with information regarding publication of participants' papers in the LNCS volume and the ICPR contest session.
  • 20.04.2010 - Results and ground truth released.
  • 08.01.2010 - Results and ground truth released (private communication with each of the participants).
  • 19.12.2009 - Camera calibration data released.
  • 14.12.2009 - Test data released.
  • 13.12.2009 - Performance evaluation scripts released.
  • 13.11.2009 - Training and validation data released!


  • Andrzej Pronobis, Royal Institute of Technology, Stockholm, Sweden
  • Barbara Caputo, IDIAP Research Institute, Martigny, Switzerland
  • Henrik I. Christensen, Georgia Institute of Technology, Atlanta, GA, USA

Contact person

Should you have any questions regarding the contest, please contact Andrzej Pronobis.


Task 1 (Obligatory)
Groups
#   Group        Overall Score  Score (Easy)  Score (Hard)
1   CVG          3824.0         2047.0        1777.0
2   TRS2008      3674.0         2102.5        1571.5
3   SIMD         3372.5         2000.0        1372.5
4   CAS-IDIAP    3344.0         1757.5        1586.5
5   PicSOM TKK   3293.0         2176.0        1117.0
6   Magrit       3272.0         2026.0        1246.0
7   RIM at GT    2922.5         1726.0        1196.5
8   UAIC         2283.5         1609.0        674.5

Runs
#   Group        Overall Score  Score (Easy)  Score (Hard)
1   CVG          3824.0         2047.0        1777.0
2   CVG          3762.5         2069.5        1693.0
3   CVG          3713.0         1946.5        1766.5
4   TRS2008      3674.0         2102.5        1571.5
5   CVG          3643.5         1977.5        1666.0
6   CVG          3567.5         2051.0        1516.5
7   SIMD         3372.5         2000.0        1372.5
8   CAS-IDIAP    3344.0         1757.5        1586.5
9   CAS-IDIAP    3315.5         1753.0        1562.5
10  PicSOM TKK   3293.0         2176.0        1117.0
11  Magrit       3272.0         2026.0        1246.0
12  PicSOM TKK   3225.0         2176.0        1049.0
13  PicSOM TKK   2946.0         1897.0        1049.0
14  RIM at GT    2922.5         1726.0        1196.5
15  PicSOM TKK   2844.5         1966.0        878.5
16  PicSOM TKK   2774.0         1897.0        877.0
17  PicSOM TKK   2730.5         2065.0        665.5
18  PicSOM TKK   2692.0         1966.0        726.0
19  UAIC         2283.5         1609.0        674.5
20  UAIC         2240.0         1603.0        637.0
21  UAIC         2122.5         1614.5        508.0
22  TRS2008      2000.5         1203.5        797.0
23  UAIC         1946.5         1474.5        472.0
24  PicSOM TKK   1518.5         964.0         554.5
25  TRS2008      1162.0         893.0         269.0

Task 2 (Optional)
Groups
#   Group        Overall Score  Score (Easy)  Score (Hard)
1   SIMD         3881.0         2230.5        1650.5
2   TRS2008      3783.5         2135.5        1648.0
3   CAS-IDIAP    3453.5         1768.0        1685.5
4   RIM at GT    2822.0         1589.5        1232.5

Runs
#   Group        Overall Score  Score (Easy)  Score (Hard)
1   SIMD         3881.0         2230.5        1650.5
2   TRS2008      3783.5         2135.5        1648.0
3   CAS-IDIAP    3453.5         1768.0        1685.5
4   CAS-IDIAP    3372.5         1750.0        1622.5
5   RIM at GT    2822.0         1589.5        1232.5

Important dates

  • 13.11.2009 - Training and validation data and task release
  • 14.12.2009 - Test data release
  • 07.01.2010 - Submission of runs (gate closes 06.01.2010 at 00:00 CET (GMT +1))
  • 08.01.2010 - Release of results and ground truth (private communication with each of the participants)
  • 15.01.2010 - Deadline for the main ICPR 2010 conference paper submission
  • 30.03.2010 - Notification of acceptance of the ICPR 2010 papers
  • 30.03.2010 - Official release of results and ground truth
  • 25.05.2010 - Deadline for the papers about approaches for LNCS volume.
  • 31.05.2010 - Acceptance notification for the LNCS papers.
  • 15.06.2010 - Submission of final version of the LNCS papers.
  • 22.08.2010 - Oral and poster sessions for the participants prior to the ICPR 2010 conference.
  • 23.08.2010 - 26.08.2010 - ICPR 2010 conference
  • The schedule should allow the contest participants to submit papers describing their approaches (optionally including the results obtained in the contest) to the main ICPR 2010 conference.

    How to register for the task

    To register, please use the registration system available here. Registration is free of charge.

    The Task

    Participants are given training data consisting of sequences of stereo images (please note that it is not required to use stereo information in the contest and monocular vision systems relying on either the left or the right camera can be used). The training sequences were recorded using a mobile robot that was manually driven through several rooms of a typical indoor office environment. The acquisition was performed under fixed illumination conditions and at a given time. Each image in the training sequences is labeled and assigned to the room in which it was acquired.

    The challenge is to build a system able to answer the question 'where are you?' (I'm in the kitchen, in the corridor, etc.) when presented with test sequences containing images acquired in the previously observed part of the environment or in additional rooms that were not imaged in the training sequences. The test images were acquired under different illumination settings than the training data. The system should assign each test image to one of the rooms that were present in the training sequences or indicate that the image comes from a room that was not included during training. Moreover, the system can refrain from making a decision (e.g. in the case of lack of confidence).

    We consider two separate tasks, task 1 (obligatory) and task 2 (optional). In task 1, the algorithm must be able to provide information about the location of the robot separately for each test image, without relying on information contained in any other image (e.g. when only some of the images from the test sequences are available or the sequences are scrambled). This corresponds to the problem of global topological localization. In task 2, the algorithm is allowed to exploit continuity of the sequences and rely on the test images acquired before the classified image (images acquired after the classified image cannot be used). The same training, validation and testing sequences are used for both tasks. The reported results will be compared separately and winners will be announced for both tasks.
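The constraint in task 2 (only images acquired before the classified image may be used) can be illustrated with a small sketch. The sliding-window majority vote below is a hypothetical example chosen for illustration, not a method prescribed by the contest; the function name `smooth_causally` and the `window` parameter are assumptions.

```python
from collections import Counter, deque


def smooth_causally(per_image_labels, window=5):
    """Smooth per-image labels using only the current and earlier frames.

    per_image_labels -- labels in acquisition order; "" means the
                        per-image classifier refrained from a decision
    window           -- number of most recent decided frames to vote over
    """
    history = deque(maxlen=window)   # holds past decisions only, never future ones
    smoothed = []
    for label in per_image_labels:
        if label:
            history.append(label)
        if history:
            # Majority vote over the recent past; ties go to whichever
            # label Counter returns first.
            smoothed.append(Counter(history).most_common(1)[0][0])
        else:
            smoothed.append("")      # nothing observed yet: keep refraining
    return smoothed


labels = ["Kitchen", "Kitchen", "Lab", "Kitchen", "Kitchen"]
smoothed = smooth_causally(labels, window=3)
```

In this example the isolated "Lab" decision is outvoted by the surrounding "Kitchen" frames. A task 1 system could not use such a filter, because there each image must be classified without reference to any other image.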

    The tasks employ two sets of training, validation and testing sequences. The first, easier set contains sequences with constrained viewpoint variability. In this set, training, validation and testing sequences were acquired following a similar path through the environment. The second, more challenging set contains sequences acquired following different paths (e.g. the robot was driven in the opposite direction). The final score for each task will be calculated based on the results obtained for both sets. Note: it is allowed to use different techniques for each set.

    The competition starts with the release of annotated training and validation data. Moreover, the participants will be given a tool for evaluating the performance of their algorithms. The test image sequences will be released later (see the schedule below). The test sequences were acquired in the same environment, under different conditions, and contain additional rooms that were not imaged previously. The algorithms trained on the training sequences will be used to annotate each of the test images. The same tools and procedure as for the validation will be used to evaluate and compare the performance of each method during testing.

    Detailed information about the data used for the competition, the experimental procedure as well as tools and criteria used to evaluate the performance of the algorithms can be found below.

    Data Set

    Mobile robot platform used for data acquisition.

    Characteristics of the Data

    The image sequences used for the contest are taken from the previously unreleased COLD-Stockholm database. The sequences were acquired using the MobileRobots PowerBot robot platform equipped with a stereo camera system consisting of two Prosilica GC1380C cameras. Please note that either a monocular or a stereo vision system can be used in the contest. Download links for the sequences are available below. The acquisition was performed in a subsection of a larger office environment, consisting of 13 areas (usually corresponding to separate rooms) representing several different types of functionality.

    The appearance of the areas was captured under two different illumination conditions: in cloudy weather and at night. The robot was manually driven through the environment while continuously acquiring images at a rate of 5 fps. Each data sample was then labeled as belonging to one of the areas according to the position of the robot during acquisition (rather than the contents of the images). The video below presents the acquisition procedure as well as parts of image sequences showing the interiors of the rooms, variations caused by activity in the environment (presence/absence of people, relocated objects, etc.) and variations introduced by changing illumination (cloudy/night).

    Acquisition of the COLD-Stockholm Database

    Image Sequences

    Each image sequence is stored as a set of JPEG files in a separate TAR archive. Windows users can use one of the free archive managers such as PeaZip to decompress the archives. Complete information about each image is encoded in its filename. The naming convention used to generate the image filenames is explained below:


    • {frame_number} - Number of the frame in the sequence
    • {camera} - Indicates the camera used to acquire the image ('Left' or 'Right')
    • {area} - Label indicating the area in which the image was acquired.

    Four sequences were selected for the contest. There are two training sequences having different properties, one sequence that should be used for validation and one sequence for testing. The training and validation sequences are available for download; the test sequence will be released according to the schedule. Information about the sequences as well as download links are available below:

    • training_easy - Sequence acquired in 9 areas, during the day, under cloudy weather. The robot was driven through the environment following a similar path as for the test and validation sequences and the environment was observed from many different viewpoints (the robot was positioned at multiple points and performed 360 degree turns).
    • training_hard - Sequence acquired in 9 areas, during the day, under cloudy weather. The robot was driven through the environment in a direction opposite to the one used for the training_easy sequence, without making additional turns.
    • validation - Sequence acquired in 9 areas, at night. A similar path was followed as for the training_easy sequence, but without making additional turns.
    • testing (available through the submission system) - Sequence acquired under similar conditions and following a similar path as the validation sequence. This sequence contains additional areas (13 in total) that were not imaged in the training or validation sequences.

    Additional Resources

    The camera calibration data are available for the stereo image sequences. The camera calibration has been performed for both cameras independently and within the stereo setup using the Camera Calibration Toolbox for Matlab. The calibration data are stored within Matlab .mat files as produced by the toolbox. The data are available for download compressed using ZIP or Tar/Gz.

    Experimental Procedure

    This section describes the experimental procedure that is suggested for the validation experiments. This procedure allows the algorithms to be tested in a scenario very similar to the one considered for the final test run.

    As mentioned in the description of the task, for each set (easy and hard), the algorithms must be trained on a single training data sequence and tested on another sequence. Each image in the sequence should be assigned to one of the rooms available during training or marked as an unknown room (it is also possible to refrain from a decision). During validation, both training and validation sequences contain the same set of rooms. Therefore, in order to simulate the test run, in which new rooms will be added to the sequence, the algorithms should be trained on a subset of rooms in the training sequence. The performance evaluation script makes it possible to simulate such a scenario.

    The following can be an example of a single experiment:

    • Easy set:
      • Training on training_easy, rooms Elevator, Corridor, Kitchen, LargeOffice1, LargeOffice2, StudentOffice, Lab, PrinterArea
      • Testing on validation, all rooms
    • Hard set:
      • Training on training_hard, rooms Elevator, Corridor, Kitchen, LargeOffice1, LargeOffice2, StudentOffice, Lab, PrinterArea
      • Testing on validation, all rooms
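The experiment above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names (`held_out_rooms`, `select_training_frames`) and the toy frame list are hypothetical, and only the room IDs and the '-' separator for the -u option come from this page.

```python
# Room IDs present in the training sequences, as listed on this page.
ALL_ROOMS = ["Elevator", "Corridor", "Kitchen", "LargeOffice1", "LargeOffice2",
             "SmallOffice2", "StudentOffice", "Lab", "PrinterArea"]

# Rooms used for training in the example experiment above.
TRAINING_ROOMS = ["Elevator", "Corridor", "Kitchen", "LargeOffice1",
                  "LargeOffice2", "StudentOffice", "Lab", "PrinterArea"]


def held_out_rooms(training_rooms):
    """Rooms that the system never sees during training."""
    return [room for room in ALL_ROOMS if room not in training_rooms]


def select_training_frames(labeled_frames, training_rooms):
    """Keep only (frame_number, area) pairs whose area is used for training."""
    return [(frame, area) for frame, area in labeled_frames
            if area in training_rooms]


# Toy stand-in for the labels of a training sequence (hypothetical values).
frames = [(1, "Elevator"), (2, "SmallOffice2"), (3, "Kitchen")]
train_subset = select_training_frames(frames, TRAINING_ROOMS)

unknown = held_out_rooms(TRAINING_ROOMS)
unknown_option = "-".join(unknown)   # value for the script's -u option
```

With this choice of training rooms, SmallOffice2 is the only held-out room, so a correct system should label validation images from that room as Unknown.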

    Performance Evaluation

    Performance Measure

    The following rules are used when calculating the final score for a run:

    • +1.0 points for each correctly classified image.
    • Correct detection of an unknown room is treated the same way as correct classification.
    • -0.5 points for each misclassified image.
    • 0.0 points for each image that was not classified (the algorithm refrained from the decision).
    • The final score will be a sum of points obtained for both sets (easy and hard).
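The rules above amount to a simple sum over images. The sketch below restates them in Python for illustration only; it is not the official evaluation script, and the function name `score_run` and the toy labels are assumptions.

```python
def score_run(predicted, ground_truth):
    """Score one set (easy or hard) using the rules listed above.

    predicted    -- dict: frame number -> room ID, "Unknown", or "" (no decision)
    ground_truth -- dict: frame number -> true room ID, or "Unknown" for a
                    room that was not available during training
    """
    score = 0.0
    for frame, truth in ground_truth.items():
        label = predicted.get(frame, "")
        if label == "":
            continue          # refrained from a decision: 0.0 points
        if label == truth:
            score += 1.0      # correct room, or correctly detected unknown room
        else:
            score -= 0.5      # misclassified image
    return score


truth = {1: "Kitchen", 2: "Kitchen", 3: "Unknown", 4: "Corridor"}
easy = {1: "Kitchen", 2: "Lab", 3: "Unknown", 4: ""}
hard = {1: "Kitchen", 2: "Kitchen", 3: "Corridor", 4: "Corridor"}

# The final score is the sum over both sets.
final_score = score_run(easy, truth) + score_run(hard, truth)
```

Because a wrong answer costs 0.5 points while refraining costs nothing, refraining is preferable to answering whenever the system estimates its chance of being correct to be below one third.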

    Performance Evaluation Script

    A Python module/script is provided for evaluating the performance of the algorithms on the test/validation sequence, together with some examples.

    Python is required in order to use the module or execute the script. Python is available for Unix/Linux, Windows, and Mac OS X. Knowledge of Python is not required to simply run the script; however, basic knowledge might be useful, since the script can also be integrated with other scripts as a module.

    The archive contains three files:

    • the main Python script/module
    • a small example illustrating how to use the script as a module
    • example.results - example of a file containing fake, fully correct results for the validation sequence

    When using the script/module, the following codes should be used to represent a room ID:

    • Elevator
    • Corridor
    • Kitchen
    • LargeOffice1
    • LargeOffice2
    • SmallOffice2
    • StudentOffice
    • Lab
    • PrinterArea
    • Unknown - room not available during training
    • empty string - no result provided

    The script calculates the final score by comparing the results to the ground truth encoded as part of its contents. The score is calculated for one set of training/validation/testing sequences. To obtain the final result for both easy and hard sets, sum the results produced for each set separately.

    Using the module as a script

    The module can simply be executed as a script. Given that Python is already installed, running the script without any parameters will produce the following usage note:

|   1.0 2009-12-13            |
| RobotVision@ICPR'10 Performance Evaluation Script |
| Author: Andrzej Pronobis                          |

Error: Incorrect command line arguments.

Usage: ./ [Options] <results_file> <test_sequence>

  <results_file>  - Path to the results file. Each line in the file
                    represents a classification result for a single
                    image and should be formatted as follows:
                    <frame_number> <room_id>
  <test_sequence> - ID of the test sequence: 'validation' or 'testing'

  -u, --unknown <room_ids> - Treat rooms <room_ids> as unknown.
                             <room_ids> should contain a list of IDs
                             of rooms that should be treated as unknown,
                             separated by '-' e.g. Elevator-Kitchen

In Linux, it is sufficient to make the script executable (chmod +x) and then run it from the console. In Windows, the .py extension is usually assigned to the Python interpreter, so typing the script name in the console (cmd) is sufficient to produce the note presented above.

In order to obtain the final score for a given test sequence, run the script with the parameters described above, e.g. with the following arguments: -u Corridor example.results validation

The command will produce the score for the results taken from the example.results file obtained for the validation sequence. The option -u specifies that the corridor (room ID 'Corridor') should be treated as unknown during training. The outcome should be as follows:

  Calculating the score...

  Final score: 821.5

Each line in the results file should represent a classification result for a single image. Since each image can be uniquely identified by its frame number, each line should be formatted as follows:
<frame_number> <room_id>
As indicated above, <room_id> can be left empty, in which case the image will not contribute to the final score (+0.0 points). The file example.results contains fake, fully correct results for the validation sequence. The score of 821.5 is the result of most of the images being correctly classified and all the images acquired in the corridor being misclassified as 'Corridor', while they should have been marked as 'Unknown'.
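A results file in the format described above can be produced with a few lines of Python. The sketch below is an illustration under assumptions: the helper names `write_results` and `read_results` are hypothetical, and whether the evaluation script tolerates the trailing space left by an empty <room_id> is not stated on this page.

```python
import os
import tempfile


def write_results(path, results):
    """Write one '<frame_number> <room_id>' line per image."""
    with open(path, "w") as handle:
        for frame_number, room_id in results:
            # An empty room_id means that no decision was made for the image.
            handle.write("{} {}\n".format(frame_number, room_id))


def read_results(path):
    """Read the file back into a list of (frame_number, room_id) pairs."""
    results = []
    with open(path) as handle:
        for line in handle:
            parts = line.rstrip("\n").split(" ", 1)
            room_id = parts[1].strip() if len(parts) > 1 else ""
            results.append((int(parts[0]), room_id))
    return results


results = [(1, "Elevator"), (2, ""), (3, "Unknown")]
path = os.path.join(tempfile.mkdtemp(), "my.results")
write_results(path, results)
loaded = read_results(path)
```

Reading the file back before running the evaluation is a cheap way to catch formatting slips such as a missing frame number.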

Using the script as a module

The script can also be used as a module within other Python scripts. This might be useful when the results are calculated using Python and stored as a list. In order to use the module, import it as shown in the example script and execute the evaluate function.

The function evaluate is defined as follows:
def evaluate(results, testSequence, unknownRooms = [])
The function returns the final score for the given results and test sequence ID. As with the script, it is possible to specify that some of the rooms in the test sequence should be treated as unknown, i.e. unavailable during training. Additionally, the function returns the number of images for which results were not provided.

The function should be executed as follows:
score, missing = robotclef.evaluate(results, testSequence, unknownRooms)
with the following parameters:

  • results - results table of the following format:
    results = [ ("<frame_number1>", "<room_id1>"), ..., ("<frame_numberN>", "<room_idN>") ]
  • testSequence - ID of the test sequence, use either "validation" or "testing"
  • unknownRooms - a list of IDs of rooms that should be treated as unknown e.g. unknownRooms = ["Elevator", "Kitchen"]
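The call described above might be wired up as in the following sketch. The helper `build_results_table` and the toy predictions are hypothetical; the module name `robotclef` is taken from the call shown above. The import is guarded so that the example still runs when the module is not on the Python path, and whether the 2009 module runs under a current Python 3 interpreter is not confirmed by this page.

```python
def build_results_table(predictions):
    """Convert {frame_number: room_id} into the list of string pairs
    expected by evaluate(), ordered by frame number."""
    return [(str(frame), predictions[frame]) for frame in sorted(predictions)]


# Toy predictions (hypothetical); "" means that no result is provided.
predictions = {2: "Kitchen", 1: "Elevator", 3: ""}
results = build_results_table(predictions)

try:
    import robotclef  # evaluation module from the contest archive
except ImportError:
    robotclef = None  # module not available: skip the evaluation step

if robotclef is not None:
    # Treat the corridor as a room that was unavailable during training.
    score, missing = robotclef.evaluate(results, "validation", ["Corridor"])
    print(score, missing)
```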

Submission of Results

Each participant can submit multiple sets of results, e.g. for different algorithms. As mentioned above, it is obligatory to submit results for task 1 and optional for task 2. When submitting the results, it is important to indicate the correct task to which the results correspond. Results submitted for an incorrect task or submitted without indicating to which task they correspond will be disqualified. Up to 20 different runs may be submitted in total.

As mentioned above, each submission should consist of results for both sets: easy and hard. There are therefore 2 result files to submit. However, the submission system does not allow two separate files to be submitted per run, so participants are asked to concatenate the files into 1 file that contains first the results for the easy set and then the results for the hard set. Please note that submissions consisting of only one set or providing results for the sets in the wrong order will be disqualified.

The format accepted by the performance evaluation script should be used for the submitted result files. Therefore, the format of the files should be as follows:

  • Each line in the file should report a result for a single frame (note that a pair of stereo images corresponds to one frame).
  • There are 2551 images in the test sequence, therefore there should be 2x2551 lines in the file.
  • Each line should have the following format: <frame_no> <label>
  • <frame_no> should be an integer between 1 and 2551.
  • <label> can either be empty or one of the following: Unknown Elevator Corridor Kitchen LargeOffice1 LargeOffice2 SmallOffice2 StudentOffice Lab PrinterArea
  • The results should be reported first for the easy set and then for the hard set, i.e.
    1 <label_for_easy_set>
    ...
    2551 <label_for_easy_set>
    1 <label_for_hard_set>
    ...
    2551 <label_for_hard_set>
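The format rules above can be checked mechanically before submitting. The following sketch builds the concatenated file contents from two label dictionaries; the function name `build_submission` and the choice to leave frames without a prediction empty are assumptions made for illustration.

```python
FRAMES_IN_TEST_SEQUENCE = 2551
VALID_LABELS = {"", "Unknown", "Elevator", "Corridor", "Kitchen",
                "LargeOffice1", "LargeOffice2", "SmallOffice2",
                "StudentOffice", "Lab", "PrinterArea"}


def build_submission(easy_labels, hard_labels):
    """Return the lines of one submission file: easy set first, then hard set.

    Each argument maps a frame number (1..2551) to a label. Frames that are
    missing from a dictionary are written with an empty label.
    """
    lines = []
    for set_name, labels in (("easy", easy_labels), ("hard", hard_labels)):
        for frame in range(1, FRAMES_IN_TEST_SEQUENCE + 1):
            label = labels.get(frame, "")
            if label not in VALID_LABELS:
                raise ValueError("invalid label %r for frame %d (%s set)"
                                 % (label, frame, set_name))
            lines.append("{} {}".format(frame, label))
    return lines


lines = build_submission({1: "Kitchen"}, {1: "Lab", 2551: "Unknown"})
```

Writing `"\n".join(lines)` to a file then gives the 2x2551 lines expected by the submission system.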

Please use the submission system to submit the results. Select Runs->Submit a run and fill in the form as follows:

  • Select the RobotVision track.
  • In the "method description" box, describe the approaches and algorithms used to generate the results. If you submit multiple runs, describe the differences. The description should be clear to the organizers, and notes such as "run 3, with gamma=1" will not be accepted.
  • Retrieval type: Not applicable
  • Language: Not applicable
  • Run type: Not applicable
  • In the "other information" box, clearly specify the task to which the submitted run corresponds (task 1 - obligatory or task 2 - optional).

The gate will close on 06.01.2010 at 00:00 CET (GMT +1).