LifeCLEF 2015 Bird task

Context

The general public as well as professionals such as park rangers, ecology consultants, fishers and, of course, ornithologists themselves are potential users of an automated bird identification system, typically in the context of wider initiatives related to ecological surveillance or biodiversity conservation. Using audio records rather than bird pictures is justified by current practice: birds are not easy to photograph, as they are most of the time hidden, perched high in a tree or frightened by human presence, whereas audio calls and songs have proved easier to collect and much more discriminant. The organization of this task is supported by the Xeno-canto foundation for nature sounds and the French projects Pl@ntNet (INRIA, CIRAD, Tela Botanica) and SABIOD Mastodons.


Task overview

The task will focus on bird identification based on different types of audio records, covering 999 species from South America centered on Brazil. Additional information includes contextual metadata (author, date, locality name, comments, quality ratings). The main originality of this data is that it was built through a citizen science initiative conducted by Xeno-canto, an international social network of amateur and expert ornithologists. This makes the task closer to the conditions of a real-world application: (i) audio records of the same species come from distinct birds living in distinct areas, (ii) records were made by different users who might not have used the same combination of microphones and portable recorders, and (iii) records were taken at different periods of the year and different hours of the day, involving different background noise (other bird species, insect chirping, etc.).

Data

The data will be built from the outstanding Xeno-canto collaborative database (http://www.xeno-canto.org/), which at the time of writing contains more than 192k audio records covering 9,120 bird species observed all around the world, thanks to the active work of more than 2,020 contributors.

The subset of Xeno-canto data used for the first year of the task will contain 33,203 audio recordings belonging to the 999 bird species with the most recordings in the union of the Brazil, Colombia, Venezuela, Guyana, Suriname and French Guiana Xeno-canto recordings. With 999 classes, the task clearly goes one step beyond previous benchmarks (80 species at most) and should foster brave new techniques. On the other hand, the task remains feasible with current approaches in terms of the number of records per species and the hardware required to process the data. Detailed statistics are as follows:
- at least 14 recordings per species (maximum: more than 200)
- between 10 and more than 40 different recordists per species

Before the release of this BirdCLEF2015 dataset, sample records can easily be found on the Xeno-canto website:
http://www.xeno-canto.org/

Please DO NOT download the whole Xeno-canto dataset yourself using their API. Their system is not calibrated to support such massive access, and they will provide us with specific links to download the training data. Please also note that, to allow a fair evaluation, it will be strictly forbidden to use the online resources of Xeno-canto as training data, because some of them might be used as queries in the official test set of the task. More generally, it will be forbidden to use any external training data to enrich the provided one. Many Xeno-canto contents circulate freely on the web, and we could not guarantee that data crawled by some participants does not include some of the records we will use in the test set.

Audio records are associated with various metadata such as the type of sound (call, song, alarm, flight, etc.), the date and location of the observation (from which rich statistics on species distribution can be derived), textual comments of the authors, multilingual common names and collaborative quality ratings. The available metadata for each recording includes the fields below (a parsing sketch follows the list):

- MediaId: the audio record ID
- FileName: the (normalized) audio record filename
- ClassId: the class ID that must be used as ground-truth
- Species: the species name
- Genus: the name of the Genus, one level above the Species in the taxonomical hierarchy used by Xeno-canto
- Family: the name of the Family, two levels above the Species in the taxonomical hierarchy used by Xeno-canto
- Sub-species: (if available) the name of the sub-species, one level under the Species in the taxonomical hierarchy used by Xeno-canto
- VernacularNames: (if available) English common name(s)
- BackgroundSpecies: Latin (Genus species) names of other audible species, possibly mentioned by the recordist
- Date: (if available) the date when the bird was observed
- Time: (if available) the time when the bird was observed
- Quality: the (rounded-up) average of the user ratings on audio record quality
- Locality: (if available) the locality name, most of the time a town
- Latitude & Longitude
- Elevation (altitude in meters)
- Author: name of the author of the record
- AuthorID: ID of the author of the record
- Audio Content: comma-separated list of sound types such as 'call' or 'song', free-form
- Comments: free comments from the recordists
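
For illustration, here is a minimal sketch of how such a metadata record could be parsed in Python, assuming the XML tags mirror the field names above; the sample values and the exact tag layout are invented for the example, the released files themselves being the authoritative schema:

import xml.etree.ElementTree as ET

# Invented example record; the real BirdCLEF files define the actual schema.
SAMPLE_XML = """<Audio>
  <MediaId>159</MediaId>
  <FileName>example_recording_159.wav</FileName>
  <ClassId>42</ClassId>
  <Species>species_example</Species>
  <Genus>genus_example</Genus>
  <Family>family_example</Family>
  <Quality>3</Quality>
</Audio>"""

FIELDS = ["MediaId", "FileName", "ClassId", "Species", "Genus", "Family",
          "Sub-species", "VernacularNames", "BackgroundSpecies", "Date",
          "Time", "Quality", "Locality", "Latitude", "Longitude",
          "Elevation", "Author", "AuthorID", "Comments"]

def parse_record(xml_text):
    """Map each known field to its text content; None when the tag is absent."""
    root = ET.fromstring(xml_text)
    return {field: root.findtext(field) for field in FIELDS}

record = parse_record(SAMPLE_XML)
print(record["MediaId"], record["Species"], record["Quality"])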

Audio recording pre-processing and feature extraction

In order to avoid any bias in the evaluation related to the recording devices used, the whole audio dataset has been normalized by the Univ. Toulon DYNI team: normalization of the bandwidth / sampling frequency to 44.1 kHz, .wav format (16 bits).
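
The provided data is thus already normalized. Purely as an illustration of what such a normalization involves (this is not the DYNI team's actual tool, and it assumes 16-bit-range PCM source files), a minimal sketch using scipy could look like this:

from math import gcd

import numpy as np
from scipy.io import wavfile
from scipy.signal import resample_poly

TARGET_SR = 44100  # 44.1 kHz, the BirdCLEF target sampling frequency

def to_44k_16bit_wav(in_path, out_path):
    sr, x = wavfile.read(in_path)
    x = x.astype(np.float64)
    if x.ndim > 1:
        x = x.mean(axis=1)          # mix multi-channel recordings down to mono
    if sr != TARGET_SR:
        g = gcd(TARGET_SR, sr)
        x = resample_poly(x, TARGET_SR // g, sr // g)  # polyphase resampling
    x = np.clip(x, -32768, 32767)   # assumes 16-bit-range samples in the source
    wavfile.write(out_path, TARGET_SR, x.astype(np.int16))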

Task description

The task will be evaluated as a bird species retrieval task. Part of the collection will be delivered as a training set, available a couple of months before the remaining data is delivered. The goal will be to retrieve the singing species among the top-k returned species for each undetermined observation of the test set. Participants will be allowed to use any of the provided metadata in addition to the audio content.

Training and test data

As mentioned above, the "BackgroundSpecies" field in the metadata may indicate whether other species were identified in the background of an observation; this field is not always filled in, even though Xeno-canto encourages recordists to identify background species. Some audio records may also not contain a dominant bird species at all.
The training dataset will contain only audio records with a dominant bird species, with or without other identified bird species in the background. Participants are free to use this background information or not.
The test dataset will contain the same type of audio records, but with the background information purged, as well as the comments, which can potentially also include species names. More precisely, the purged test XML files will only include:
- MediaId: the audio record ID
- FileName: the (normalized) audio record filename
- Date: (if available) the date when the bird was observed
- Time: (if available) the time when the bird was observed
- Locality: (if available) the locality name, most of the time a town
- Latitude & Longitude
- Elevation (altitude in meters)
- Author: name of the author of the record
- AuthorID: ID of the author of the record

Run format

The run file must be named "teamname_runX.run", where X is the identifier of the run (i.e. 1, 2, 3 or 4). The run file has to contain as many lines as the total number of predictions, with at least one prediction and at most 999 predictions per test audio record (999 being the total number of species). Each prediction item (i.e. each line of the file) has to respect the following format:
<MediaId;ClassId;rank;probability>
where probability is a real value in [0;1] decreasing with the rank of the prediction, i.e. higher values indicate higher confidence.

Here is a short fake run example respecting this format on only 8 test MediaId:
myTeam_run2.txt
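
The attached file is not reproduced here; for illustration, a few lines respecting this format might look as follows (all MediaId and ClassId values are made up):

159;42;1;0.87
159;7;2;0.61
159;903;3;0.05
2041;42;1;0.95
2041;615;2;0.33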

For each submitted run, please give in the submission system a description of the run. A combo box will specify whether the run was performed fully automatically or with human assistance in the processing of the queries. Then, a text area should contain a short description of the method used, to help differentiate the runs submitted by the same group; participants are asked to indicate whether they used a method based on:
- only AUDIO
- only METADATA
- both AUDIO & METADATA
For instance:
Only AUDIO, using provided MFCC features, multiple multi-class Support Vector Machines with probabilistic outputs

Optionally, you can add one or several BibTeX reference(s) to publication(s) describing the method in more detail.

Metric

The metric used will be the mean Average Precision (mAP), considering each audio file of the test set as a query, computed as:
mAP = \frac{1}{Q} \sum_{q=1}^{Q} AveP(q)
where Q is the number of test audio files and AveP(q) for a given test file q is computed as:
AveP(q) = \frac{\sum_{k=1}^{n} P(k) \times rel(k)}{\text{number of relevant species for } q}
where k is the rank in the sequence of returned species, n is the total number of returned species, P(k) is the precision at cut-off k in the list, and rel(k) is an indicator function equaling 1 if the item at rank k is a relevant species (i.e. one of the species in the ground truth) and 0 otherwise.
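
For concreteness, here is a minimal sketch of this computation, assuming that for each test file one has the ranked list of predicted species and the set of ground-truth species:

def average_precision(ranked, relevant):
    """AveP(q): 'ranked' is the ordered list of predicted species for one
    test file, 'relevant' the set of ground-truth species for that file."""
    hits, precision_sum = 0, 0.0
    for k, species in enumerate(ranked, start=1):
        if species in relevant:            # rel(k) = 1
            hits += 1
            precision_sum += hits / k      # P(k), accumulated at relevant ranks
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """'queries' is a list of (ranked, relevant) pairs, one per test file."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

# Toy check: relevant species found at ranks 1 and 3 -> (1/1 + 2/3) / 2 ≈ 0.833
print(average_precision(["a", "b", "c"], {"a", "c"}))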

Working notes

Submitting a working note with a full description of the methods used in each run is mandatory (deadline: May 31st). Any run that cannot be reproduced from its description in the working notes might be removed from the official publication of the results. Working notes are published within the CEUR-WS proceedings, resulting in the assignment of an individual DOI (URN) and indexing by many bibliography systems, including DBLP. According to the CEUR-WS policies, a light review of the working notes will be conducted by the LifeCLEF organizing committee to ensure quality.

Results

The task overview working note summarizing the results of the whole task is available HERE (pdf). Individual working notes of the participants can be found within CLEF 2015 CEUR-WS proceedings.

Among the 130 registered groups worldwide who downloaded the data, 6 submitted a total of 17 runs. Thanks to all of them for their efforts.

[Figure: BirdMAPs — chart of the mAP scores of the submitted runs; the same scores are tabulated below]

Run                 | Run filename                            | Type             | MAP 2 (without Background Species) | MAP 1 (with Background Species)
MNB TSA Run 4       | MarioTsaBerlin_run4.txt                 | AUDIO            | 0.454 | 0.414
MNB TSA Run 3       | MarioTsaBerlin_run3.txt                 | AUDIO            | 0.442 | 0.411
MNB TSA Run 2       | MarioTsaBerlin_run2.txt                 | AUDIO            | 0.442 | 0.405
MNB TSA Run 1       | MarioTsaBerlin_run1.txt                 | AUDIO            | 0.424 | 0.388
INRIA ZENITH Run 2  | inria-zenith-svm-acontrario-scoring.run | AUDIO            | 0.334 | 0.291
QMUL Run 1          | danstowell_run1.run                     | AUDIO            | 0.302 | 0.262
INRIA ZENITH Run 3  | inria-zenith-svm-inverse_rank.run       | AUDIO            | 0.292 | 0.259
INRIA ZENITH Run 1  | inria-zenith-knn-classifier.run         | AUDIO            | 0.265 | 0.240
GOLEM Run 2         | res_nomax_120_test.txt                  | AUDIO            | 0.171 | 0.149
GOLEM Run 1         | res_nomax_100_test.txt                  | AUDIO            | 0.161 | 0.139
CHIN. AC. SC. Run 1 | djcasia_run_m128.txt                    | AUDIO            | 0.010 | 0.009
CHIN. AC. SC. Run 3 | djcasia_run_scatt128.txt                | AUDIO            | 0.009 | 0.010
CHIN. AC. SC. Run 2 | djcasia_run_scatt32.txt                 | AUDIO            | 0.007 | 0.008
MARF Run 1          | marf_run1.run                           | AUDIO            | 0.006 | 0.005
MARF Run 2          | marf_run2.run                           | METADATA         | 0.003 | 0.002
MARF Run 3          | marf_run3.run                           | AUDIO & METADATA | 0.005 | 0.005
MARF Run 4          | marf_run4.run                           | AUDIO            | 0.000 | 0.000