Smart Shaped Software

Posted on Feb 5 • Originally published at smartshaped.com

FARM-TECH: transforming multi-source data into operational decisions

#datascience #agriculture #agritech #database

ARTICLE BY AURELIANO D'AMICI
Open-source platform (SDSS) for precision agriculture: integration of geospatial data, vegetation indices, and predictive estimates with a pragmatic approach.

In agriculture and livestock farming, data come from many sources: satellites, drones, field sensors, and weather stations, to name a few. The recurring problem is not the availability of data, but its conversion into consistent indicators that are comparable over time and, consequently, into actionable outputs. FARM-TECH, funded by Spoke 3 (P.P. 3.1.1) of the Tech4You consortium, was created to integrate geospatial management, multi-temporal analysis, and targeted inference flows into a single open-source platform.

FARM-TECH is designed as a Spatial Decision Support System (SDSS) for agronomic and livestock scenarios: a system that combines geospatial data management, predictive analytics, and ancillary features within a single application context. It is open source, and the platform's open-source repository is itself one of the project deliverables.

The platform's role is to integrate diverse information sources (raster, vector, tabular data), organize them consistently, and deliver outputs in the form of map layers, dashboards, and processing results. FARM-TECH is designed to prioritize clear workflows and interpretable results, with a structure that allows for a progressive increase in complexity and accuracy as the quantity and quality of available data grow.

From this perspective, FARM-TECH aims to be an enabling platform that provides a foundation for recurring precision farming activities: consulting and comparing maps across time periods, managing datasets, and running targeted processing (in the pilot case, estimates on specific plots defined on the map).

Pragmatic Modeling

FARM-TECH adopts a pragmatic approach to predictive modeling. The core idea is that the value of a model is not determined by its degree of complexity or sophistication, but by how profitably it can be used in practice: it must be able to function in a repeatable manner, using inputs that are actually available, and producing interpretable results.

During the project requirements gathering phase, a typical constraint of real-world contexts emerged: building a supervised model trained ad hoc can prove impractical when (i) the dataset is numerically limited and (ii) the sources are too heterogeneous to guarantee consistent and replicable features during inference. Under these conditions, the risk is obtaining a model that lacks generalizability and is difficult to maintain in the field.

For these reasons, it was decided to prioritize two operational choices:

Starting from traceable and easily verifiable baselines (even with partial data);
Maintaining an evolutionary trajectory: models and pipelines can be progressively refined and replaced as data quality and availability increase, without compromising the overall structure of the platform.

Data: formats, geospatial primitives, and historization

FARM-TECH consistently manages data from typical precision farming sources: multi-parameter weather stations, satellite multispectral imagery, drone acquisitions, and agronomic/livestock datasets. The project's goal was not only to import this information into the platform but also to normalize it and keep it searchable over the long term, enabling multi-period analysis and comparisons.

From a geospatial data perspective, the platform explicitly handles:

Raster (e.g., GeoTIFF from satellite/drone, calculated indices);
Vector/tabular (perimeters, structured datasets, exportable sensor measurements).

Building on this foundation are two requirements that make the system truly usable in an operational context:

Historization: Data must be accessible and comparable over time to enable evolutionary analysis, moving beyond mere static snapshots.
Portability via export/download: The platform provides the ability to download datasets (for example, in CSV format) for offline analysis or external integrations, maintaining an open data management model consistent with the project requirements.
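The two requirements above can be illustrated with a minimal sketch. It assumes a hypothetical `Observation` record (plot, acquisition date, indicator name, value); the field names, the `history` filter, and the CSV column layout are illustrative assumptions, not the platform's actual schema.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class Observation:
    """Hypothetical historized record: one indicator value for one plot on one date."""
    plot_id: str
    acquired_on: date
    indicator: str
    value: float


def history(observations: Iterable[Observation], plot_id: str, indicator: str,
            start: date, end: date) -> List[Observation]:
    """Return the observations of one indicator for one plot, ordered by date."""
    selected = [
        o for o in observations
        if o.plot_id == plot_id
        and o.indicator == indicator
        and start <= o.acquired_on <= end
    ]
    return sorted(selected, key=lambda o: o.acquired_on)


def to_csv(observations: Iterable[Observation]) -> str:
    """Serialize observations to CSV text for offline analysis."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["plot_id", "acquired_on", "indicator", "value"])
    for o in observations:
        writer.writerow([o.plot_id, o.acquired_on.isoformat(), o.indicator, o.value])
    return buffer.getvalue()


# Illustrative values only, not project data.
sample = [
    Observation("plot-1", date(2024, 6, 27), "NDVI", 0.72),
    Observation("plot-1", date(2024, 6, 12), "NDVI", 0.61),
    Observation("plot-2", date(2024, 6, 12), "NDVI", 0.55),
]
csv_text = to_csv(history(sample, "plot-1", "NDVI", date(2024, 6, 1), date(2024, 6, 30)))
```

Keeping the acquisition date on every record is what makes the multi-period comparison possible: the same query can be repeated over any interval and exported in the same layout.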

Vegetation Indices

An important feature of FARM-TECH is the ability to transform multispectral images into quantitative georeferenced indicators, allowing them to be used as cartographic layers that are comparable over time. This step is essential to make the information more stable and interpretable: indices synthesize the state of the crop into readable numbers, reducing ambiguity and facilitating analysis.

The initial set of indices is built on well-established choices in agronomy, covering vegetative vigor, chlorophyll sensitivity, and stress detection. Specifically:

NDVI: The Normalized Difference Vegetation Index serves as a proxy for vegetative vigor and is calculated from Red (visible red) and NIR (Near InfraRed). NDVI = (NIR - Red) / (NIR + Red)
NDRE: The Normalized Difference Red Edge Index, based on the RE (Red Edge) band, is more stable in the presence of high biomass and more sensitive to chlorophyll. NDRE = (NIR - RE) / (NIR + RE)
GNDVI: The Green Normalized Difference Vegetation Index is used to estimate photosynthetic activity and is calculated from Green (visible green) and NIR (Near InfraRed). GNDVI = (NIR - Green) / (NIR + Green)
SAVI: The Soil-Adjusted Vegetation Index corrects for soil brightness effects, which is particularly useful in the early stages of analysis. SAVI = (1 + L)(NIR - Red) / (NIR + Red + L), where L is a constant soil-correction factor, usually set to 0.5.

From an engineering perspective, the extraction of these indices is set up as a modular and replicable process, ensuring that processing can be integrated into subsequent pipelines.
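As an illustration of the four formulas, the sketch below computes each index for a single pixel from its band reflectances. It is a minimal sketch, not the platform's implementation: the function names are assumptions, and returning NaN for a zero denominator is one possible convention for no-data pixels.

```python
import math


def normalized_difference(band_a: float, band_b: float) -> float:
    """Return (band_a - band_b) / (band_a + band_b), or NaN if the sum is zero."""
    total = band_a + band_b
    if total == 0:
        return math.nan  # assumed convention for pixels with no signal
    return (band_a - band_b) / total


def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndre(nir: float, red_edge: float) -> float:
    """NDRE = (NIR - RE) / (NIR + RE)."""
    return normalized_difference(nir, red_edge)


def gndvi(nir: float, green: float) -> float:
    """GNDVI = (NIR - Green) / (NIR + Green)."""
    return normalized_difference(nir, green)


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """SAVI = (1 + L)(NIR - Red) / (NIR + Red + L), with L = soil_factor."""
    denominator = nir + red + soil_factor
    if denominator == 0:
        return math.nan
    return (1 + soil_factor) * (nir - red) / denominator


# Illustrative reflectances for one pixel (not measured data).
pixel = {"nir": 0.5, "red": 0.1, "red_edge": 0.3, "green": 0.15}
indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "NDRE": ndre(pixel["nir"], pixel["red_edge"]),
    "GNDVI": gndvi(pixel["nir"], pixel["green"]),
    "SAVI": savi(pixel["nir"], pixel["red"]),
}
```

The three normalized-difference indices share one helper, which keeps the per-index code to a single line and makes it straightforward to add further indices of the same form. On real rasters the same arithmetic would be applied band-wise over whole arrays rather than pixel by pixel.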

Predictive functionality: tomato yield estimation

To make the platform useful even under suboptimal data conditions, FARM-TECH integrates a predictive feature built on a simple and verifiable baseline. In the project's use case, tomato yield is estimated starting from the average NDVI calculated on the plot of interest, assuming a linear relationship between vegetative state and final production.

The adopted model is:

Estimated Yield (kg/ha) = a · (mean NDVI) + b

with coefficients initialized as:

a = 63179
b = 20318 (Note the linear nature of this equation, equivalent to the line y = 63179 · x + 20318)

The value of this choice is twofold:

It allows for the generation of a decision-making output without requiring a large and homogeneous training set;
It keeps the process interpretable and controllable, facilitating verification and comparison with real-world data.

Furthermore, a preliminary validation was performed on 9 plantations, using multispectral acquisitions from June 27, 2024. Comparing the estimated average yield with the actual yield indicated that the model is suitable for operational use as a baseline.
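The baseline is simple enough to write out in a few lines. The sketch below uses the coefficients given above; the function names and the convention of marking masked pixels with `None` are assumptions made for illustration.

```python
from statistics import fmean
from typing import Iterable, Optional

# Coefficients as initialized in the article.
SLOPE_A = 63179.0
INTERCEPT_B = 20318.0


def mean_ndvi(pixel_values: Iterable[Optional[float]]) -> float:
    """Average the NDVI of the pixels inside the plot, skipping masked (None) pixels."""
    valid = [v for v in pixel_values if v is not None]
    if not valid:
        raise ValueError("no valid NDVI pixels inside the plot")
    return fmean(valid)


def estimate_yield_kg_per_ha(mean_ndvi_value: float,
                             a: float = SLOPE_A,
                             b: float = INTERCEPT_B) -> float:
    """Linear baseline: estimated yield (kg/ha) = a * mean NDVI + b."""
    return a * mean_ndvi_value + b


# Illustrative pixel values only; None marks a pixel outside the plot or masked.
plot_pixels = [0.4, None, 0.6]
estimate = estimate_yield_kg_per_ha(mean_ndvi(plot_pixels))
```

For example, a mean NDVI of 0.5 gives 63179 · 0.5 + 20318 = 51907.5 kg/ha. Exposing `a` and `b` as parameters reflects the evolutionary approach described earlier: the coefficients can be re-estimated as more validated data becomes available without changing the calling code.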

Operational inference workflow

Yield estimation is exposed as a function that users can invoke directly through a dedicated, guided workflow. Inference can be started in two modes, covering different operational scenarios:

GeoTIFF Upload: The user uploads a pre-existing georeferenced TIFF (for example, from drone campaigns or external pre-processing) and starts the inference.
Polygon + Time Interval: The user defines the perimeter of the plot and the period of interest; the platform retrieves the necessary data (Sentinel-2) via the Copernicus API, generates the GeoTIFF, and then starts the processing.

In both cases, the platform orchestrates the execution and returns a structured result, including the estimated yield and the geospatial elements required to visualize the result on the map.
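The two entry points and the common result can be sketched as follows. This is a hypothetical outline, not the platform's code: the request and result types are invented for illustration, and `build_geotiff` and `mean_ndvi_of` are injected placeholders standing in for the Sentinel-2 retrieval step and the raster processing step, whose real interfaces are not described here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Sequence, Tuple, Union

Coordinate = Tuple[float, float]  # (longitude, latitude), assumed ordering


@dataclass(frozen=True)
class GeoTiffUpload:
    """Mode 1: the user supplies an existing georeferenced TIFF."""
    geotiff_path: str


@dataclass(frozen=True)
class PolygonInterval:
    """Mode 2: the user supplies a plot perimeter and a period of interest."""
    perimeter: Sequence[Coordinate]
    start: date
    end: date


@dataclass(frozen=True)
class InferenceResult:
    """Structured output: the estimate plus the raster needed to map it."""
    estimated_yield_kg_ha: float
    geotiff_path: str


InferenceRequest = Union[GeoTiffUpload, PolygonInterval]


def resolve_geotiff(request: InferenceRequest,
                    build_geotiff: Callable[[PolygonInterval], str]) -> str:
    """Return the GeoTIFF to process, generating it first in polygon mode."""
    if isinstance(request, GeoTiffUpload):
        return request.geotiff_path
    if isinstance(request, PolygonInterval):
        if request.start > request.end:
            raise ValueError("start date must not follow end date")
        return build_geotiff(request)  # placeholder for retrieval + GeoTIFF generation
    raise TypeError(f"unsupported request type: {type(request).__name__}")


def run_inference(request: InferenceRequest,
                  build_geotiff: Callable[[PolygonInterval], str],
                  mean_ndvi_of: Callable[[str], float],
                  a: float = 63179.0,
                  b: float = 20318.0) -> InferenceResult:
    """Both modes converge on the same GeoTIFF, then on the same linear baseline."""
    path = resolve_geotiff(request, build_geotiff)
    return InferenceResult(a * mean_ndvi_of(path) + b, path)
```

The design point is that the two modes differ only in how the GeoTIFF is obtained. Once a raster exists, the processing and the shape of the result are identical, which keeps the downstream map visualization independent of the entry point.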

Architecture

FARM-TECH is built with a modular and open-source architecture that combines components for geospatial management with an application back-end and a layer dedicated to executing processing tasks. The goal is to keep technical roles separate (data management, layer publication, application logic, computation), avoiding rigid dependencies and simplifying potential evolution over time.

The main components can be viewed as a functional chain:

GeoNode acts as the access point for uploading, cataloging, and consuming geospatial datasets, enabling consultation (via maps and dashboards) and collaborative data management.
GeoServer handles the publication of raster and vector layers and their exposure, making data usable both through the platform's interface and external GIS tools.
Django + PostGIS provide the application back-end and persistence, supporting orchestration logic and structured geospatial data management.
ML_Runner is the execution layer that orchestrates and runs analytics/ML pipelines, integrating them into the application flow.
Object Storage is used for the operational management of datasets and files (particularly in workflows involving uploads and raster inputs), maintaining separation between storage and computing/publishing services.

This structure allows the platform to function as an SDSS: data is managed and published as layers, while processing (indices and inferences) is executed as services, with results returned in a structured format.

Operational Security

Given the need to use the platform in a multi-user and multi-role context, FARM-TECH features an access model based on roles and controls over sensitive functionalities. The fundamental elements of this framework are:

Separation of Privilege Levels: User categories (guest/registered/uploader/admin) are established with different permissions. The system includes checks and verifications to ensure that accessing specific functions without the correct authorization is impossible.
Integration with Centralized IAM: Authentication relies on an Identity & Access Management (Keycloak) layer, handling logins/tokens and integrating with the application system. The authorization logic (actual permissions on resources and operational flows) remains consistent with both internal platform policies and external project requirements.

The designed framework reduces the risk of unauthorized access and, at the same time, makes the use of the platform sustainable in collaborative scenarios.
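A role check of this kind can be sketched as below. The four roles come from the description above, but the action names and the assignment of actions to roles are hypothetical, chosen only to illustrate the pattern. The sketch covers authorization once a user's role is known; it does not model authentication, which the platform delegates to Keycloak.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Role(Enum):
    GUEST = "guest"
    REGISTERED = "registered"
    UPLOADER = "uploader"
    ADMIN = "admin"


# Hypothetical permission sets: each level extends the one below it.
_GUEST = frozenset({"view_maps"})
_REGISTERED = _GUEST | {"download_dataset", "run_inference"}
_UPLOADER = _REGISTERED | {"upload_dataset"}
_ADMIN = _UPLOADER | {"manage_users"}

PERMISSIONS: Dict[Role, FrozenSet[str]] = {
    Role.GUEST: _GUEST,
    Role.REGISTERED: _REGISTERED,
    Role.UPLOADER: _UPLOADER,
    Role.ADMIN: _ADMIN,
}


class PermissionDenied(Exception):
    """Raised when a role attempts an action it is not granted."""


def is_allowed(role: Role, action: str) -> bool:
    """Return True only if the action is explicitly granted to the role."""
    return action in PERMISSIONS.get(role, frozenset())


def require(role: Role, action: str) -> None:
    """Guard for sensitive functions: raise unless the role holds the permission."""
    if not is_allowed(role, action):
        raise PermissionDenied(f"role '{role.value}' may not perform '{action}'")
```

Calling `require(role, "upload_dataset")` at the start of an upload handler, for instance, rejects the request before any work is done. Because permissions are granted explicitly, an unknown action is denied by default.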

Conclusion

The value of FARM-TECH lies not in a single feature, but in the continuity of the workflow within a single governed platform. The decision to start with established indices and a transparent baseline is consistent with the real-world constraints of agritech projects. It is a solid starting point that allows for immediate utility without introducing fragility.

The most interesting prospect is its evolution: as the volume of validated data increases, FARM-TECH can integrate more accurate models and new use cases while keeping the architecture and usage patterns unchanged. In this sense, it is a platform designed for incremental growth, rather than an isolated prototype.