Data pipeline design

Description

The goal of this pipeline is to support the WFP (World Food Programme) with data for seasonal monitoring and early warning activities. To that end, the pipeline generates aggregations of COPERNICUS Vegetation Indicators compared to a reference period.

This pipeline packages an algorithm that processes LAI and fAPAR datasets. Each product is a dekad, meaning that it represents 10 days. After the data have been smoothed and gap-filled, the algorithm produces daily-resolution data aggregations for the area of interest, which are:
- Average value over the past N time steps (10-day periods), produced every 10 days from the filtered data (N = 3, 6, 9, 12, 15, 18, 27, 36)
- Maximum value over the past N time steps (10-day periods), produced every 10 days from the filtered data (N = 3, 6, 9, 12, 15, 18, 27, 36)
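The two aggregations described above can be illustrated with a minimal sketch. It is not the pipeline's actual code: the function names `aggregate_last_n` and `aggregate_all`, the `(time, rows, cols)` array layout, and the assumption that the input stack is already smoothed and gap-filled (no missing values) are all hypothetical choices made for illustration.

```python
import numpy as np

# Window lengths (in dekads) listed in the description above.
N_VALUES = (3, 6, 9, 12, 15, 18, 27, 36)


def aggregate_last_n(stack, n):
    """Return (average, maximum) over the last n time steps of a stack.

    stack: array of shape (time, rows, cols), ordered oldest to newest,
           one layer per dekad. Assumed to be smoothed and gap-filled.
    n:     number of most recent dekads to aggregate.
    """
    if n < 1 or n > stack.shape[0]:
        raise ValueError("n must be between 1 and the number of dekads available")
    window = stack[-n:]  # the n most recent dekads
    return window.mean(axis=0), window.max(axis=0)


def aggregate_all(stack, n_values=N_VALUES):
    """Compute both aggregations for every window length that fits the stack."""
    return {
        n: aggregate_last_n(stack, n)
        for n in n_values
        if n <= stack.shape[0]
    }


if __name__ == "__main__":
    # Synthetic example: 36 dekads of a 2x2 pixel area, values 0..35 over time.
    demo = np.repeat(np.arange(36, dtype=float), 4).reshape(36, 2, 2)
    for n, (avg, mx) in aggregate_all(demo).items():
        print(n, avg[0, 0], mx[0, 0])
```

In a run every 10 days, the newest dekad would be appended to the stack and the same call repeated, so each output always covers the N dekads ending at the latest product.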

Ellip Workflows archetype instantiated for the wfp-01-03-01 data pipeline

Data Sources

The data requirements are analyzed and their retrieval mechanisms assessed to make sure that the data is available in the system and can be consumed as expected. Because this pipeline computes aggregations, each output dataset is derived from N input products.

Catalogue endpoint: https://catalog.terradue.com/cgls/description

Repository: https://gitlab.com/ec-better/wfp/applications/ewf-wfp-01-03-01

Tools and Libraries

The tools and libraries needed to execute the application are analyzed, and their compatibility is evaluated taking into consideration the computational resources available. The following libraries were used: osgeo, geopandas, gzip, cioppy, shutil, sys, numpy, pandas, math, re and os.

Trigger

Queue
