GeoLifeCLEF 2019

Location-Based Species Recommendation

Registration and data access

  • Each participant has to register on the challenge platform with a username, an email address and a password. A representative team name should be used
    as the username.
  • In order to be compliant with the CLEF requirements, participants also have to fill in the following additional fields on their profile:
    • First name
    • Last name
    • Affiliation
    • Address
    • City
    • Country
  • Once set up, participants will have access to the CrowdAI GeoLifeCLEF challenge page.

  • Usage scenario

    Automatically predicting the list of species most likely to be observed at a given location is useful for many scenarios in biodiversity informatics. First, it could improve species identification processes and tools (be they automated, semi-automated or based on classical field guides or floras) by reducing the list of candidate species observable at a given location. More generally, it could facilitate biodiversity inventories through the development of location-based recommendation services (typically on mobile phones) as well as the involvement of non-expert nature observers. Last but not least, it might serve educational purposes through biodiversity discovery applications that provide functionalities such as contextualized educational pathways.


    The aim of the challenge is to predict the list of species that are the most likely to be observed at a given location. To this end, we will provide a large training set of species occurrences, each occurrence being associated with a multi-channel image characterizing the local environment. Indeed, it is usually not possible to learn a species distribution model directly from spatial positions because of the limited number of occurrences and the sampling bias. The usual practice in ecology is to predict the distribution from a representation in the environmental space, typically a feature vector composed of climatic variables (average temperature at that location, precipitation, etc.) and other variables such as soil type, land cover, distance to water, etc. The originality of GeoLifeCLEF is to generalize such niche modeling approaches to an image-based environmental representation space. Instead of learning a model from environmental feature vectors, the goal of the task will be to learn a model from k-dimensional image patches, each patch representing the value of an environmental variable in the neighborhood of the occurrence (see figure below for an illustration).
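
To make the difference between a feature vector and a k-dimensional image patch concrete, here is a minimal sketch in plain Python. It is an illustration only, not the organizers' extraction protocol: the function name `extract_patch`, the toy grids, the patch size and the zero-filling of cells outside the raster are all assumptions made for this example.

```python
from typing import List, Sequence

# One environmental variable sampled on a regular grid (rows x columns).
Raster = Sequence[Sequence[float]]


def extract_patch(rasters: Sequence[Raster], row: int, col: int,
                  half_size: int, fill: float = 0.0) -> List[List[List[float]]]:
    """Return a k x (2h+1) x (2h+1) patch centred on grid cell (row, col).

    Cells outside the raster are set to `fill`. This border handling is an
    assumption of the sketch, not something stated by the challenge.
    """
    patch = []
    for raster in rasters:  # one channel per environmental variable
        n_rows, n_cols = len(raster), len(raster[0])
        channel = []
        for r in range(row - half_size, row + half_size + 1):
            line = []
            for c in range(col - half_size, col + half_size + 1):
                inside = 0 <= r < n_rows and 0 <= c < n_cols
                line.append(raster[r][c] if inside else fill)
            channel.append(line)
        patch.append(channel)
    return patch


# Toy example: k = 2 variables on a 4 x 4 grid (all values are made up).
temperature = [[float(10 + r + c) for c in range(4)] for r in range(4)]
elevation = [[float(100 * r + c) for c in range(4)] for r in range(4)]

# Image-based representation: a 2-channel 3 x 3 patch around cell (1, 1).
patch = extract_patch([temperature, elevation], row=1, col=1, half_size=1)

# Classical representation: only the value of each variable at the cell itself.
center_vector = [channel[1][1] for channel in patch]
```

The classical feature vector corresponds to the centre pixel of each channel, while the patch additionally keeps the spatial context around the occurrence.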


    A detailed description of the protocol used to build the datasets will be provided soon. In a nutshell, the dataset was built from different sources of occurrence data, including the GBIF and Pl@ntNet platforms. Each occurrence is characterized by a set of local environmental images. These environmental images are constructed from various open datasets, including CHELSA climate data [1], ESDB soil pedology data [2,3,4], Corine Land Cover 2012 data, CGIAR-CSI evapotranspiration data [5,6], USGS elevation data (available from the U.S. Geological Survey) and BD Carthage hydrologic data. The dataset is split into 3/4 for training and 1/4 for testing. Coming soon.
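
As an illustration of the 3/4 and 1/4 proportions, the sketch below splits a list of occurrence identifiers. The text does not say how the official split was drawn, so the uniform random shuffle, the fixed seed and the name `split_occurrences` are assumptions of this example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_occurrences(occurrences: Sequence[T], train_fraction: float = 0.75,
                      seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the occurrences and cut them into a training and a test part.

    A uniform random split is assumed here; the official split may follow
    a different (for example spatial) strategy.
    """
    shuffled = list(occurrences)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy example with 1000 made-up occurrence identifiers.
occurrence_ids = list(range(1000))
train_ids, test_ids = split_occurrences(occurrence_ids)
```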

    External data
    Participants are allowed to use external training data on the condition that (i) the experiment is entirely reproducible, i.e. the external resource used is clearly referenced and accessible to any other research group in the world, (ii) participants submit at least one run without external training data so that we can study the contribution of such resources, and (iii) the additional resource does not contain any of the test observations.


    Coming soon

    Registration and data access

    Please refer to the general LifeCLEF registration instructions


    [1] Karger, Dirk Nikolaus, Conrad, Olaf, Böhner, Jürgen, Kawohl, Tobias, Kreft, Holger, Soria-Auza, Rodrigo Wilber, Zimmermann, Niklaus, Linder, H Peter, & Kessler, Michael. 2016. Climatologies at high resolution for the earth’s land surface areas. arXiv preprint arXiv:1607.00217.
    [2] Panagos, Panos. 2006. The European soil database. GEO: connexion, 5(7), 32–33.
    [3] Panagos, Panos, Van Liedekerke, Marc, Jones, Arwyn, & Montanarella, Luca. 2012. European Soil Data Centre: Response to European policy support and public data requirements. Land Use Policy, 29(2), 329–338.
    [4] Van Liedekerke, M, Jones, A, & Panagos, P. 2006. ESDBv2 Raster Library - a set of rasters derived from the European Soil Database distribution v2.0. European Commission and the European Soil Bureau Network, CDROM, EUR, 19945.
    [5] Zomer, Robert J, Bossio, Deborah A, Trabucco, Antonio, Yuanjie, Li, Gupta, Diwan C, & Singh, Virendra P. 2007. Trees and water: smallholder agroforestry on irrigated lands in Northern India. Vol. 122. IWMI.
    [6] Zomer, Robert J, Trabucco, Antonio, Bossio, Deborah A, & Verchot, Louis V. 2008. Climate change mitigation: A spatial analysis of global land suitability for clean development mechanism afforestation and reforestation. Agriculture, ecosystems & environment, 126(1), 67–80.