Data-Centric Land Cover Classification Challenge
Real-world semantic segmentation datasets often present multiple challenges, such as limited labeled data, severe class imbalance, and multi-modality. These issues can significantly hinder the performance of machine learning models, especially when combined. It is therefore crucial to develop methods that handle such data-related challenges robustly, extracting as much useful information as possible and improving the final results.
Challenge Summary
The goal of this Data-centric Land Cover Classification Challenge, part of the Workshop on Machine Vision for Earth Observation and Environment Monitoring at the British Machine Vision Conference (BMVC) 2025, is to design and develop AI-based models that achieve the highest possible performance on a small, highly imbalanced semantic segmentation dataset. Participants will be provided with training and test sets, each containing multispectral and synthetic aperture radar data. Segmentation masks with 14 classes (encoded with integer values from 0 to 13) will be provided for the training set only.
Your task is to develop a method and generate prediction masks for the test set. These masks must follow the same 0–13 class encoding as the training labels and be submitted for evaluation. Success is measured by comparing these submitted prediction masks with hidden reference labels using the Jaccard score.
Please check CodaBench for more information.
Dataset
The training set is composed of approximately 20,000 samples, each of size 128×128 pixels. Each sample includes:
one Sentinel-1 synthetic aperture radar (SAR) image,
one Sentinel-2 multispectral (MSI) image, and
one corresponding segmentation mask, with 14 classes encoded with integer values from 0 to 13.
The test set contains around 3,000 unlabeled images of the same dimensions, with both SAR and MSI data available for all samples.
Participants have the flexibility to choose their input data and modeling approach. They may use only SAR data, only MSI data, or a combination of both in their proposed solutions.
You can download all data here. The dataset is organized into five folders:
train_val_sar_images/ -- SAR data, 2 bands
train_val_msi_images/ -- MSI, 12 bands
train_val_masks/
test_sar_images/ -- SAR data, 2 bands
test_msi_images/ -- MSI, 12 bands
where:
train_val_sar_images contains Sentinel-1 synthetic aperture radar (SAR) images with 2 bands, for training and validation.
train_val_msi_images contains Sentinel-2 multispectral images (MSI) with 12 bands, for training and validation.
train_val_masks contains the corresponding ground-truth segmentation masks, for training and validation.
test_sar_images contains Sentinel-1 SAR images (2 bands) for testing.
test_msi_images contains Sentinel-2 multispectral images (12 bands) for testing.
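As a concrete illustration of the data layout described above, the sketch below builds synthetic arrays with the stated shapes (2-band SAR, 12-band MSI, 128×128 tiles, masks with classes 0–13) and fuses both modalities into a single multi-channel input. This is a minimal example, not the organizers' loading code; the channel-first layout and early-fusion strategy are assumptions, and real tiles would be read from the folders above.

```python
import numpy as np

# Synthetic stand-ins matching the challenge description:
# 2-band Sentinel-1 SAR, 12-band Sentinel-2 MSI, 128x128 tiles,
# and an integer mask with class ids 0-13.
sar = np.random.rand(2, 128, 128).astype(np.float32)
msi = np.random.rand(12, 128, 128).astype(np.float32)
mask = np.random.randint(0, 14, size=(128, 128), dtype=np.uint8)

# Early fusion (one possible choice): concatenate SAR and MSI along
# the channel axis, giving a 14-channel input for a segmentation model.
# Using only SAR or only MSI, as the rules allow, would skip this step.
fused = np.concatenate([sar, msi], axis=0)
assert fused.shape == (14, 128, 128)
assert 0 <= mask.min() and mask.max() <= 13
```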
Submission and Evaluation
Prediction masks for the test set must follow the same encoding as the training data, with class values from 0 to 13. All prediction masks should be compressed into a single .zip file and submitted for evaluation.
Please follow these requirements:
File naming: Each prediction mask must have exactly the same filename as its corresponding test image.
Format: Accepted file formats are .png, .tiff, or .tif.
Folder structure: The zip file should contain only the image files (no folders or subfolders).
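The packaging rules above (same filenames as the test images, accepted formats, flat zip with no subfolders) can be sketched as follows. This is a hypothetical helper assuming numpy and Pillow are available; `write_submission` and its argument names are illustrative, not part of the challenge tooling.

```python
import io
import zipfile

import numpy as np
from PIL import Image


def write_submission(predictions, zip_path):
    """Write prediction masks into a flat .zip for upload.

    predictions: dict mapping a test-image filename (e.g. "tile_0001.png")
    to a (128, 128) uint8 array of class ids in [0, 13].
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, mask in predictions.items():
            buf = io.BytesIO()
            # PNG keeps the integer class encoding losslessly.
            Image.fromarray(mask.astype(np.uint8)).save(buf, format="PNG")
            # writestr with a bare filename keeps the archive flat,
            # with no folders or subfolders, as required.
            zf.writestr(name, buf.getvalue())
```

Each mask must keep exactly the filename of its corresponding test image; a mismatched name would not be matched against the hidden reference labels.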
Performance will be measured using the macro-averaged Jaccard score, which accounts for class imbalance by weighting every class equally.
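The metric can be reproduced locally with scikit-learn's `jaccard_score`: per-class IoU is computed first, then averaged with equal weight per class, so rare classes count as much as frequent ones. The exact evaluation script used on CodaBench is not specified; this is a sketch on synthetic masks, assuming scikit-learn is installed.

```python
import numpy as np
from sklearn.metrics import jaccard_score

# Synthetic ground truth and a perfect prediction, for illustration only.
y_true = np.random.randint(0, 14, size=(4, 128, 128))
y_pred = y_true.copy()

# Flatten the masks; "macro" averages the 14 per-class IoUs equally.
score = jaccard_score(
    y_true.ravel(),
    y_pred.ravel(),
    labels=list(range(14)),
    average="macro",
)
print(score)  # identical masks yield a perfect score of 1.0
```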
Please submit your solutions on CodaBench.
Timeline
Participants may make up to 5 submissions per day and 100 in total. Each participant's best submission will automatically be listed on the leaderboard.
| Subject | Date |
|---|---|
| Submission Deadline | Tuesday, 11 November 2025 |
| Workshop | Thursday, 27 November 2025 |
Results, Presentation, Awards, and Prizes
The final results of this challenge will be presented during the Workshop. The authors of the top-ranked methods will be invited to present their approaches at the Workshop in Sheffield, UK, on 27 November 2025. These authors will also be invited to co-author a journal paper summarizing the outcome of the challenge, to be submitted with open access to IEEE JSTARS.
Organizers
Keiller Nogueira, University of Liverpool, UK
Ronny Hänsch, German Aerospace Center (DLR), Germany
Ribana Roscher, Research Center Jülich and University of Bonn, Germany