
Methodology

This page covers how the analysis actually works: where the data comes from, how the model is calibrated, what resolution you’re getting, and what the system can and can’t tell you.

What the model does

The model takes satellite imagery of a field and predicts the chemical and physical properties of the topsoil. It does this by combining multispectral reflectance from Sentinel-2 with topographic, climatic, and geological context, then applying the relationships it learned from the tens of thousands of laboratory soil samples used to train it.

The output is a per-pixel prediction at 10-metre resolution for nitrogen, phosphorus, potassium, pH, organic matter, cation exchange capacity, and soil texture (sand, silt, clay).

Data sources

The model uses several layers of input data, all open and globally available:

Data	Source	Resolution or coverage
Satellite imagery	Sentinel-2 SR Harmonised (Copernicus)	10 m
Elevation and terrain	FABDEM, ASTER GDEM, HAND	250 m
Topographic indices	Slope, aspect, TWI (derived)	10 m
Climate	BIOCLIM bioclimatic variables	Global
Geology	Bedrock age and parent material	Global
Landform	Terrain classification	Global

For each analysis date, the system pulls the 8 months of Sentinel-2 imagery preceding your target date, builds one cloud-masked composite per month, and feeds the full stack into the model. This temporal depth is what lets the model work around partial cloud cover and seasonal vegetation changes: a pixel hidden in one month is usually visible in another.
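The compositing step can be sketched in a few lines of Python. This is an illustration only, not Siora's implementation: the function names are hypothetical, the window is assumed to include the target date's own month, and the median is an assumed choice of composite statistic.

```python
from datetime import date
from statistics import median


def month_windows(target: date, months: int = 8) -> list[tuple[int, int]]:
    """Return (year, month) pairs for the window ending at the target month.

    Assumption: the window includes the target date's own month.
    Oldest month comes first.
    """
    windows = []
    year, month = target.year, target.month
    for _ in range(months):
        windows.append((year, month))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return list(reversed(windows))


def composite(observations):
    """Composite one pixel for one month.

    `observations` is a list of (reflectance, is_cloudy) pairs.
    Cloudy observations are dropped; the median of the rest is returned,
    or None when every observation in the month was cloudy.
    """
    clear = [value for value, is_cloudy in observations if not is_cloudy]
    return median(clear) if clear else None


windows = month_windows(date(2024, 3, 15))
march_pixel = composite([(0.10, False), (0.80, True), (0.12, False), (0.11, False)])
```

A month with no clear observations yields `None` for that pixel, which is why stacking several months matters: the gap in one composite is covered by the others.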

Resolution and what “10-metre” actually means

Source imagery is at 10 metres per pixel: each pixel represents a 10 × 10 m patch on the ground. To keep the report readable, the system aggregates pixels into Voronoi cells within each cluster of the field. A small field gets fewer, larger cells; a large field gets many smaller ones. Each cell shows one representative value per property in the data table.
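To make the aggregation concrete, here is a minimal sketch of grouping pixels into Voronoi cells. It assumes that seed points are already chosen and that the representative value is the median; the page does not say how either is done, so both are assumptions, and `aggregate_to_cells` is a hypothetical name.

```python
from statistics import median


def aggregate_to_cells(pixels, seeds):
    """Group pixels into Voronoi cells and summarise each cell.

    `pixels` maps (row, col) grid positions to a predicted value.
    `seeds` is a list of (row, col) cell centres.
    A pixel belongs to the cell of its nearest seed, which is the
    definition of a Voronoi partition. The median as the representative
    value is an assumption made for this sketch.
    """
    cells = {index: [] for index in range(len(seeds))}
    for (row, col), value in pixels.items():
        nearest = min(
            range(len(seeds)),
            key=lambda i: (row - seeds[i][0]) ** 2 + (col - seeds[i][1]) ** 2,
        )
        cells[nearest].append(value)
    return {index: median(values) for index, values in cells.items() if values}


# Six pH pixels on a small grid, two seeds at opposite ends of the top row.
ph_pixels = {
    (0, 0): 6.0, (0, 1): 6.2, (1, 0): 6.4,
    (0, 2): 7.0, (0, 3): 7.2, (1, 3): 7.4,
}
cell_values = aggregate_to_cells(ph_pixels, seeds=[(0, 0), (0, 3)])
```

Here the three left-hand pixels fall into the first cell and the three right-hand pixels into the second, so the report would show one pH value for each.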

The underlying GeoTIFF (available in the export) preserves the full 10-metre grid for users who want to work with it directly.

Regional calibration

Soil chemistry varies by region: the same chemical reading can mean different things on Bulgarian chernozems vs Hungarian alluvial soils vs the Mediterranean basin. The model accounts for this in two ways:

A single European model is trained on samples from across the continent and runs on every order.
Regional calibration tables are applied to phosphorus and potassium classification thresholds based on the field’s location. See Soil properties for the specific Bulgarian and Hungarian scales.
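The second step amounts to a table lookup keyed on region. The sketch below shows the idea with placeholder numbers: the thresholds, class names, and region codes are invented for illustration and are not the Bulgarian or Hungarian scales, which are documented in Soil properties.

```python
from bisect import bisect_right

# HYPOTHETICAL thresholds in mg/kg, for illustration only.
# These are NOT the real regional scales.
THRESHOLDS = {
    ("BG", "phosphorus"): [10.0, 20.0, 40.0],
    ("HU", "phosphorus"): [15.0, 30.0, 60.0],
}

# HYPOTHETICAL class labels, one more than the number of thresholds.
CLASSES = ["low", "medium", "high", "very high"]


def classify(region: str, nutrient: str, value: float) -> str:
    """Map a predicted value to a class using the region's threshold table."""
    cuts = THRESHOLDS[(region, nutrient)]
    # bisect_right counts how many thresholds the value meets or exceeds.
    return CLASSES[bisect_right(cuts, value)]


bulgarian_class = classify("BG", "phosphorus", 25.0)
hungarian_class = classify("HU", "phosphorus", 25.0)
```

With these placeholder tables, the same reading of 25 mg/kg lands in a different class in each region, which is the effect regional calibration is meant to capture.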

Currently supported regions: Europe, including Ukraine. Other regions are not officially supported.

Training data

The model is trained against laboratory soil samples published as part of European open-data programmes (notably LUCAS, the EU’s pan-European soil monitoring dataset) and supplemented with regional sampling campaigns. Siora also publishes some of its own validation datasets openly: see open soil datasets for examples from Murcia (Spain) and Ukraine.

What the model is good at

Spatial completeness: every part of the field gets a value, not just the few points where someone took a sample.
Repeatability: ordering the same field on the same date gives the same answer, as long as the model version has not changed in between.
Affordability per hectare: the per-hectare cost is a fraction of physical sampling at the same density.
Historical analyses: you can look back to any date from 2020 onwards, useful for tracking change or auditing past management.

What the model is not good at

A few honest limitations:

It’s not a substitute for a certified lab test if you need legal compliance or contractual documentation. The output is a high-resolution prediction calibrated against lab data, not a lab test itself.
Heavy vegetation hides the soil. The model uses 8 months of imagery to mitigate this, but a field with continuous dense canopy will produce less precise predictions than one with bare-soil periods. Pick dates near tillage, post-harvest, or fallow periods for the best results.
Subsoil layers are not captured. The model represents the topsoil. It doesn’t tell you about deeper horizons, drainage layers, or compaction zones.
Outside Europe, results are not validated. The system may still run, but accuracy is not guaranteed.
No prescription or VRA output. The data is at the right resolution for variable-rate planning, but the system doesn’t generate application maps or ISOXML files. You can build those from the shapefile export in your VRA tool of choice.

Validation

Siora compares model predictions against held-out lab samples not used in training, reporting accuracy per property and per region. Some of this work is published in open datasets so users can verify performance independently. Specific accuracy figures are available on request.
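Users who want to run the same kind of check against the open datasets could group errors by property and region along these lines. The page does not name the accuracy metric Siora reports, so RMSE is an assumption here, and the record layout and function name are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def rmse_by_group(records):
    """Compute RMSE per (property, region) pair.

    `records` is an iterable of (property, region, predicted, measured)
    tuples, where `measured` is the held-out lab value.
    RMSE is an assumed metric for this sketch.
    """
    squared_errors = defaultdict(list)
    for prop, region, predicted, measured in records:
        squared_errors[(prop, region)].append((predicted - measured) ** 2)
    return {
        key: sqrt(sum(errors) / len(errors))
        for key, errors in squared_errors.items()
    }


# Made-up example records, not real validation data.
sample_records = [
    ("pH", "BG", 6.0, 7.0),
    ("pH", "BG", 8.0, 7.0),
    ("pH", "HU", 6.5, 6.5),
]
scores = rmse_by_group(sample_records)
```

Keeping the groups separate matters because a model can perform well on one property or region and poorly on another, which a single pooled figure would hide.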

Updating the model

The model is improved continuously: new training data, refined calibration, expanded regional coverage. When updates ship, they apply to new orders. Past orders preserve their original values so you can always reference what you saw at the time.

Next steps

Soil properties: how the model’s predictions are turned into the values you see in the report.
Open soil datasets: public datasets and validation work.
Place an order: try the product on one of your fields.