GeoLifeCLEF 2024

Tentative Schedule

  • December 2023: Registration opens for all LifeCLEF challenges. Registration is free of charge.
  • 28 February 2024: Competition Start
  • 24 May 2024: Competition Deadline
  • 31 May 2024: Deadline for submission of working note papers by participants [CEUR-WS proceedings]
  • 21 June 2024: Notification of acceptance of working note papers [CEUR-WS proceedings]
  • 8 July 2024: Camera-ready deadline for working note papers.
  • 9-12 Sept 2024: CLEF 2024, Grenoble, France

All deadlines are at 11:59 PM CET of the corresponding day unless otherwise stated.

The competition organizers reserve the right to update the contest timeline if they deem it necessary.

Motivation

Predicting plant species composition and its change in space and time at a fine resolution is useful for many biodiversity management and conservation scenarios, for improving species identification and inventory tools, and for educational purposes.
This challenge aims to predict the plant species present at a given location and time using a range of predictors: satellite images and time series, climatic time series, and other rasterized environmental data (land cover, human footprint, bioclimatic, and soil variables).
To do so, we provide a large-scale training set of about 5M plant occurrences in Europe (single-label, presence-only data), as well as a validation set of about 5K plots and a test set of 20K plots, each annotated with all the species present (multi-label, presence-absence data).
The difficulties of the challenge include multi-label learning from single positive labels, strong class imbalance, multi-modal learning, and the large scale of the data.

How to participate?

1. Subscribe to CLEF (GeoLifeCLEF task) by filling in this form
2. Go to the Kaggle GeoLifeCLEF 2024 challenge page and compete: Kaggle challenge

Data collection

The training data comprises species observations and environmental data. Below, we explain the data in detail.

Observations data

The species-related training data comprises:

  1. Presence-Absence (PA) surveys: around 90 thousand surveys covering roughly 10,000 species of the European flora. The PA data is provided to compensate for the false absences in the PO data and to calibrate models so as to avoid the associated biases.
  2. Presence-Only (PO) occurrences: around five million observations combined from numerous datasets gathered from the Global Biodiversity Information Facility (GBIF, www.gbif.org). This data constitutes the larger part of the training data and covers all countries of our study area, but it has been sampled opportunistically (without a standardized sampling protocol), leading to various sampling biases. The local absence of a species from the PO data does not mean it is truly absent. An observer might not have reported it because it was difficult to "see" or to identify at that time of the year, because it was not a monitoring target, or simply because it was unattractive.
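To make the difference between the two label types concrete, here is a minimal sketch of one possible way to encode them as training targets. It is not the organizers' method: the helper names `po_target` and `pa_target`, the `known` mask, and the toy species indices are all hypothetical. The idea is that a PO occurrence carries one known positive and says nothing about the other species, while a PA survey gives a known presence or absence for every species.

```python
import numpy as np

def po_target(species_index, n_species):
    """One PO occurrence: a single known positive, every other entry unknown."""
    target = np.zeros(n_species, dtype=np.float32)
    known = np.zeros(n_species, dtype=bool)
    target[species_index] = 1.0
    known[species_index] = True
    return target, known

def pa_target(present_indices, n_species):
    """One PA survey: listed species are present, all others are absent."""
    target = np.zeros(n_species, dtype=np.float32)
    target[list(present_indices)] = 1.0
    known = np.ones(n_species, dtype=bool)  # absences are informative here
    return target, known

# Toy example with 5 species (hypothetical indices, not real species ids).
po_y, po_known = po_target(2, n_species=5)
pa_y, pa_known = pa_target([0, 2], n_species=5)
```

A loss function could then, for instance, ignore the entries where `known` is false for PO samples instead of treating them as absences. That is only one of several possible strategies.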

Two CSV files with species occurrence data are available for training on Seafile. A detailed description is also provided on Seafile, in separate README files in the relevant folders.

  • The PO metadata are available in PresenceOnlyOccurences/GLC24_PO_metadata_train.csv.
  • The PA metadata are available in PresenceAbsenceSurveys/GLC24_PA_metadata_train.csv.
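As a rough illustration of how the PA metadata could be turned into multi-label targets, the sketch below groups rows by survey and collects the species recorded in each one. The column names `surveyId` and `speciesId` are assumptions made for this example (check the README files for the actual schema), and a tiny in-memory table stands in for the real CSV so the snippet runs on its own.

```python
import io
import pandas as pd

# Tiny stand-in for GLC24_PA_metadata_train.csv. The column names
# surveyId and speciesId are assumptions, not confirmed by this page.
pa_csv = io.StringIO(
    "surveyId,speciesId\n"
    "1,10\n"
    "1,12\n"
    "2,10\n"
)
pa = pd.read_csv(pa_csv)

def species_per_survey(df, survey_col="surveyId", species_col="speciesId"):
    """Return {survey id: sorted list of species ids recorded in that survey}."""
    grouped = df.groupby(survey_col)[species_col].apply(lambda s: sorted(set(s)))
    return grouped.to_dict()

labels = species_per_survey(pa)
```

With the real file, the `io.StringIO` object would be replaced by the path to the downloaded CSV, and the column arguments adjusted to match its header.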

Environmental data

Besides species data, we provide spatialized geographic and environmental data as additional input variables (see Figure 1). More precisely, for each species observation location, we provide:

  1. Satellite image patches: 3-band (RGB) and 1-band (NIR) 128x128 images at 10m resolution.
  2. Satellite time series: Up to 20 years of values for six satellite bands (R, G, B, NIR, SWIR1, and SWIR2).
  3. Environmental rasters: Various climatic, pedologic, land use, and human footprint variables at the European scale. We provide scalar values, time series, and the original rasters from which you may extract local 2D images.
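The satellite patches described above can be combined into a single multi-channel input. The sketch below stacks a 3-band RGB patch and a 1-band NIR patch into a 4-channel array and derives NDVI, a standard vegetation index defined as (NIR - R) / (NIR + R). The function names, the channel order, and the random arrays are assumptions for illustration; loading the actual image files is left out because their on-disk format is not described here.

```python
import numpy as np

PATCH_SIZE = 128      # pixels per side, from the description above
RESOLUTION_M = 10     # metres per pixel, from the description above

def stack_rgb_nir(rgb, nir):
    """Stack an (H, W, 3) RGB patch and an (H, W) NIR patch into (H, W, 4)."""
    if rgb.shape[:2] != nir.shape:
        raise ValueError("RGB and NIR patches must share height and width")
    return np.concatenate([rgb, nir[..., None]], axis=-1).astype(np.float32)

def ndvi(patch, eps=1e-6):
    """NDVI = (NIR - R) / (NIR + R), per pixel, on a 4-channel patch."""
    red, nir = patch[..., 0], patch[..., 3]
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero

# Synthetic stand-ins for one location's patches (not real challenge data).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(PATCH_SIZE, PATCH_SIZE, 3), dtype=np.uint8)
nir = rng.integers(0, 256, size=(PATCH_SIZE, PATCH_SIZE), dtype=np.uint8)

patch = stack_rgb_nir(rgb, nir)
index = ndvi(patch)
footprint_m = PATCH_SIZE * RESOLUTION_M  # ground distance covered by one side
```

At 10 m per pixel, a 128x128 patch covers 1280 m on each side, which gives a sense of the spatial context available around each observation.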

Three separate folders with the relevant data are available for training on Seafile. A detailed description is provided below and again on Seafile, in separate README files in the relevant folders.

New Seafile repository❗: repository containing all the data. To optimize download times, see the data downloading section at the bottom of this page.

GLC GitHub repository❗: Useful code for manipulating the data, with simple data loaders, examples, and sample data. More data loaders may be added after the challenge starts.

CVPR24 and CLEF24 Context

This competition is held jointly as part of:

As this competition is part of scientific research, participants are encouraged to take part in both events.
In particular, only participants who submit a working note paper to LifeCLEF (see below) will be part of the officially published ranking used for scientific communication.

Credit

This project has received funding from the European Union’s Horizon research and innovation program under grant agreement No 101060639 (MAMBO project) and No 101060693 (GUARDEN project).