PlantCLEF 2017

Usage scenario

Crowdsourced initiatives such as iNaturalist, Tela Botanica, or iSpot produce large amounts of biodiversity data that are intended, in the long term, to renew today’s ecological monitoring approaches with much more timely and cheaper raw input data. At the same time, recent advances in computer vision have enabled increasingly effective mobile search tools, allowing the set-up of large-scale data collection platforms such as the popular Pl@ntNet initiative. This platform is already used by more than 500K people who produce tens of thousands of validated plant observations each year. This explicitly shared and validated data is only the tip of the iceberg. The real potential lies in the millions of raw image queries submitted by the users of the mobile application for which there is no human validation. People make such requests to get information on a plant encountered along a hike or found in their garden that they know nothing about. Exploiting such content in a fully automatic way could scale up the world-wide collection of plant observations by several orders of magnitude, and potentially provide a valuable resource for ecological monitoring studies.

Data collection and evaluated challenge

The test data to be analyzed is a large sample of the raw query images submitted by the users of the mobile application Pl@ntNet (iPhone & Android). It covers a large number of wild plant species, mostly from the Western European flora and the North American flora, but also plant species used all around the world as cultivated or ornamental plants, or even species that are endangered precisely because of their non-regulated commerce.

As training data, we will provide two main sets, both based on the same list of 10,000 plant species:
- a “trusted” training set based on the online collaborative Encyclopedia Of Life (EOL)
- a “noisy” training set built from web crawlers (more precisely, from Google and Bing image search results)

The main idea of providing both datasets is to evaluate to what extent machine learning and computer vision techniques can learn from noisy data compared to trusted data (as usually done in supervised classification).

The EOL pictures themselves come from several public databases (such as Wikimedia, iNaturalist or Flickr) or from institutions and less formal websites dedicated to botany. All the pictures can potentially be revised and rated on the EOL website.

On the other hand, the noisy training set will contain more images for many species, but with several types and levels of noise that are basically impossible to fully control and clean automatically: a picture can be associated with the wrong species but the correct genus or family, a picture can be a portrait of a botanist working on the species, a picture can be associated with the correct species but be a drawing or a herbarium sheet of a dried specimen, etc.

Task description

The task will consist of automatically detecting, in the Pl@ntNet query images, specimens of plants belonging to the species covered by the provided training data. More practically, the run file to be submitted has to contain as many lines as the number of predictions, each prediction being composed of an ObservationId (the identifier of a specimen, which can itself be composed of several images), a ClassId, a Probability and a Rank (used in case of equal probabilities). Each line should have the following format:
<ObservationId;ClassId;Probability;Rank>

where Probability is a real value in [0,1] representing the confidence of the system in that recognition (Probability=1 means that the system is very confident) and Rank is an integer in [1,100] (i.e. a single test ObservationId can be associated with at most 100 species predictions).

Here is a short fake run example respecting this format for only 3 observations:
myTeam_PlantCLEF2017_run2.txt
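
For illustration, the short sketch below writes predictions in the required format; the observation identifiers, class identifiers and probabilities are invented for the example, and only the <ObservationId;ClassId;Probability;Rank> format matters.

# Hypothetical observation-level predictions: ObservationId -> list of (ClassId, Probability).
# A real system would output up to 100 species per observation.
predictions = {
    "obs_001": [("30752", 0.82), ("30711", 0.10), ("12940", 0.05)],
    "obs_002": [("4526", 0.64), ("4527", 0.30)],
    "obs_003": [("88123", 0.99)],
}

with open("myTeam_PlantCLEF2017_run2.txt", "w") as run_file:
    for obs_id, species in predictions.items():
        # Sort by decreasing probability; the Rank field then disambiguates equal probabilities.
        for rank, (class_id, prob) in enumerate(sorted(species, key=lambda s: -s[1]), start=1):
            run_file.write(f"{obs_id};{class_id};{prob};{rank}\n")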

Each participating group is allowed to submit up to 4 runs built from different methods. Semi-supervised, interactive or crowdsourced approaches are allowed but will be compared independently from fully automatic methods. Any human assistance in the processing of the test queries therefore has to be signaled in the submitted runs.

We encourage participants to compare the use of the noisy and the trusted training sets within their runs, and it is required to mention which training set was used for each run (EOL, WEB or EOL+WEB). Please note that the two training sets can have some pictures in common (even though we excluded the EOL domain from the web crawl).

Participants are allowed to use complementary training data (e.g. for pre-training purposes) but on the condition that (i) the experiment is entirely reproducible, i.e. the external resource used is clearly referenced and accessible to any other research group in the world, (ii) the use (or not) of external training data is mentioned for each run, and (iii) the additional resource does not contain any of the test observations.

Metric

The metric used will be the Mean Reciprocal Rank (MRR). The MRR is a statistical measure for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer. The MRR is the average of the reciprocal ranks over the whole test set:

MRR = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{rank_i}

where |Q| is the total number of query occurrences in the test set and rank_i is the rank of the first correct answer for the i-th query.
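
As an illustration, here is a minimal sketch of how this score can be computed from a run file in the format described above and a ground-truth mapping from ObservationId to the true ClassId; the file name and the ground_truth dictionary are assumptions for the example, and this is not the official evaluation tool.

# Minimal MRR computation sketch (not the official evaluation tool).
from collections import defaultdict

def mean_reciprocal_rank(run_path, ground_truth):
    # ground_truth: dict mapping each test ObservationId to its true ClassId.
    # Collect the predictions of each observation, ordered by their Rank field.
    predictions = defaultdict(list)  # ObservationId -> [(Rank, ClassId), ...]
    with open(run_path) as run_file:
        for line in run_file:
            obs_id, class_id, probability, rank = line.strip().split(";")
            predictions[obs_id].append((int(rank), class_id))

    total = 0.0
    for obs_id, true_class in ground_truth.items():
        ranked_classes = [c for _, c in sorted(predictions.get(obs_id, []))]
        if true_class in ranked_classes:
            total += 1.0 / (ranked_classes.index(true_class) + 1)
        # an observation whose true species is never predicted contributes 0
    return total / len(ground_truth)  # |Q| = number of test observations

# Hypothetical usage:
# mrr = mean_reciprocal_rank("myTeam_PlantCLEF2017_run2.txt", {"obs_001": "30752"})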


Results

A total of 8 participating groups submitted 29 runs. Thanks to all of you for your efforts and your constructive feedback regarding the organization!
[Figure: PlantCLEF 2017 results]
The following table gives the score (MRR), Top1 and Top5 of each run. The detailed results figure below additionally reports which dataset(s) were used as training set(s), denoted as follows:
- E: trusted training set EOL
- P: trusted training set PlantCLEF 2016
- W: noisy training set Web
- FW: filtered noisy training set Web

Run Name | Run | Score (MRR) | Top1 | Top5
MarioTsaBerlin Run 4 | MarioTsaBerlin_04_EolAndWeb_Avr_All_v4 | 0.92 | 0.885 | 0.962
MarioTsaBerlin Run 2 | MarioTsaBerlin_02_EolAndWeb_Avr_6x5 | 0.915 | 0.877 | 0.96
MarioTsaBerlin Run 3 | MarioTsaBerlin_03_EolAndFilteredWeb_Avr_3x5_v4 | 0.894 | 0.857 | 0.94
KDETUT Run 4 | bluefield.average | 0.853 | 0.793 | 0.927
MarioTsaBerlin Run 1 | MarioTsaBerlin_01_Eol_Avr_3x5_v2 | 0.847 | 0.794 | 0.911
CMP Run 1 | CMP_run1_combination | 0.843 | 0.786 | 0.913
KDETUT Run 3 | bluefield.mixed | 0.837 | 0.769 | 0.922
KDETUT Run 2 | bluefield.noisy | 0.824 | 0.754 | 0.911
CMP Run 3 | CMP_run3_eol | 0.807 | 0.741 | 0.887
FHDO_BCSG Run 2 | FHDO_BCSG_2_finetuned_inception-resnet-v2_top-5-subset-web_eol | 0.806 | 0.738 | 0.893
FHDO_BCSG Run 3 | FHDO_BCSG_3_ensemble_1_2 | 0.804 | 0.736 | 0.891
UM Run 2 | UM_WEB_ave_run2 | 0.799 | 0.726 | 0.888
UM Run 3 | UM_COM_ave_run3 | 0.798 | 0.727 | 0.886
FHDO_BCSG Run 1 | FHDO_BCSG_1_finetuned_inception-resnet-v2 | 0.792 | 0.723 | 0.878
UM Run 4 | UM_COM_max_run4 | 0.789 | 0.715 | 0.882
KDETUT Run 1 | bluefield.trusted | 0.772 | 0.707 | 0.85
CMP Run 2 | CMP_run2_combination_prior | 0.765 | 0.68 | 0.87
CMP Run 4 | CMP_run4_eol_prior | 0.733 | 0.641 | 0.849
UM Run 1 | UM_EOL_ave_run1 | 0.7 | 0.621 | 0.795
SabanciUGebzeTU Run 4 | Sabanci-GebzeTU_Run4 | 0.638 | 0.557 | 0.738
SabanciUGebzeTU Run 1 | Sabanci-GebzeTU_Run1 | 0.636 | 0.556 | 0.737
SabanciUGebzeTU Run 3 | Sabanci-GebzeTU_Run3 | 0.622 | 0.537 | 0.728
PlantNet Run 1 | PlantNet_PlantCLEF2017_runTrusted-repaired | 0.613 | 0.513 | 0.734
SabanciUGebzeTU Run 2 | Sabanci-GebzeTU_Run2_EOLonly | 0.581 | 0.508 | 0.68
UPB HES SO Run 3 | UPB-HES-SO_PlantCLEF2017_run3 | 0.361 | 0.293 | 0.442
UPB HES SO Run 4 | UPB-HES-SO_PlantCLEF2017_run4 | 0.361 | 0.293 | 0.442
UPB HES SO Run 1 | UPB-HES-SO_PlantCLEF2017_run1 | 0.326 | 0.26 | 0.406
UPB HES SO Run 2 | UPB-HES-SO_PlantCLEF2017_run2 | 0.305 | 0.239 | 0.383
FHDO_BCSG Run 4 | FHDO_BCSG_4_finetuned_inception-v4 | 0 | 0 | 0
[Figure: PlantCLEF 2017 detailed results]