
The third edition of the Robot Vision challenge continues two previous successful events. The Robot Vision challenge was first presented in 2009 and attracted considerable attention, with 7 participating groups and a total of 27 submitted runs. The second edition was held in conjunction with ICPR 2010 and saw an increase in participation, with 9 participating groups and 34 submitted runs. As in the previous events, the challenge will address the problem of visual place classification, this time with a special focus on generalization.
The ability to represent knowledge about space and about its own position within that space is crucial for a mobile robot. To this end, topological and semantic descriptions are gaining popularity for augmenting purely metric space representations. Making the space representation more meaningful from the point of view of spatial reasoning and human-robot interaction has been at the forefront of the issues being addressed. Indeed, in the concrete case of indoor environments, the ability to understand the existing topological relations and to associate semantic terms such as "corridor" or "office" with places gives a much more intuitive idea of the position of the robot than global metric coordinates. In recent years, vision has become the preferred sensor for capturing this type of information, as a large share of the semantic description of a place is encoded in its visual appearance.
The third edition of the challenge will address the problem of visual place classification, with a special focus on generalization. Participants will be asked to classify rooms and functional areas on the basis of image sequences captured by a stereo camera mounted on a mobile robot within an office environment. The test sequence will be acquired within the same building as the training sequence, but on a different floor. It will contain rooms of the same categorical types as the training sequence ("corridor", "office", "bathroom"), and it will also contain room categories not seen in the training sequence ("meeting room", "library"). The system built by participants should be able to answer the question "where are you?" when presented with a test sequence showing a room category seen during training, and it should be able to answer "I do not know this category" when presented with a new room category.
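The requirement to either name a known room category or answer "I do not know this category" can be illustrated with a simple rejection rule. The sketch below is only one possible approach and is not prescribed by the challenge: it assumes that some classifier (not shown) already produces a confidence score between 0 and 1 for each known category and each frame. The function names, the "unknown" label, and the threshold value of 0.5 are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Hypothetical label standing in for "I do not know this category".
UNKNOWN = "unknown"


def classify_frame(scores: Dict[str, float], threshold: float = 0.5) -> str:
    """Return the best-scoring known category, or UNKNOWN if confidence is too low.

    `scores` maps each category seen during training to a confidence value
    produced by some classifier (assumed here, not part of the challenge).
    """
    if not scores:
        return UNKNOWN
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else UNKNOWN


def classify_sequence(frame_scores: List[Dict[str, float]],
                      threshold: float = 0.5) -> str:
    """Average per-frame scores over a sequence, then apply the rejection rule."""
    if not frame_scores:
        return UNKNOWN
    totals: Dict[str, float] = {}
    for scores in frame_scores:
        for category, value in scores.items():
            totals[category] = totals.get(category, 0.0) + value
    means = {c: t / len(frame_scores) for c, t in totals.items()}
    return classify_frame(means, threshold)


if __name__ == "__main__":
    # A frame that clearly looks like a corridor.
    print(classify_frame({"corridor": 0.8, "office": 0.15, "bathroom": 0.05}))
    # A frame with no confident match, e.g. from an unseen room category.
    print(classify_frame({"corridor": 0.4, "office": 0.35, "bathroom": 0.25}))
```

Averaging the scores over a sequence before thresholding is a design choice that smooths out noisy single frames. A real system would need to tune the threshold on held-out data, since a value that is too low labels unseen rooms as known categories and a value that is too high rejects rooms that were seen during training.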
[Figure: Acquisition of the database used for the challenge.]