better-wfp-00007 data pipeline results (CHIRPS Rainfall Estimates Aggregations): loading from pickle file and exploiting

This Notebook shows how to: * reload better-wfp-00007 data pipeline results previously stored in a pickle file * Create a new dataframe with data values cropped over the area of interest * Plot an RGB * Create a geotif

Import needed modules

In [1]:
import pandas as pd
from geopandas import GeoDataFrame
import gdal
import numpy as np
from shapely.geometry import Point
from shapely.geometry import Polygon
import matplotlib
import matplotlib.pyplot as plt
from PIL import Image
%matplotlib inline
from shapely.wkt import loads
from shapely.geometry import box
from urlparse import urlparse
import requests

Read picke data file

In [2]:
pickle_filename = 'better-wfp-00007_validation.pkl'
In [3]:
results = GeoDataFrame(pd.read_pickle(pickle_filename))
In [5]:
results.head(5)
Out[5]:
enclosure enddate identifier self startdate title wkt
0 https://store.terradue.com/better-wfp-00007/_r... 2017-01-31 2F972053F721342E729EC1FA7062C4DAC0587A64 https://catalog.terradue.com//better-wfp-00007... 2017-01-01 Output CHIRPSv2_SouthernAfrica_N30_daystotal_2... POLYGON((11.5030755518998 -11.1141633706909,41...
1 https://store.terradue.com/better-wfp-00007/_r... 2017-01-31 E1E2526083E64B52661E4E4CCCB0E77597031521 https://catalog.terradue.com//better-wfp-00007... 2017-01-01 Output CHIRPSv2_SouthernAfrica_N30_countaboveo... POLYGON((11.5030755518998 -11.1141633706909,41...
2 https://store.terradue.com/better-wfp-00007/_r... 2017-01-31 F44BE47DF19F96AB7E9F1AAC01F009CB6229FFE7 https://catalog.terradue.com//better-wfp-00007... 2017-01-01 Output CHIRPSv2_SouthernAfrica_N30_dryspell_20... POLYGON((11.5030755518998 -11.1141633706909,41...
3 https://store.terradue.com/better-wfp-00007/_r... 2016-01-31 670F163E595D0760E8FE51018117FCB0FD8ED9B0 https://catalog.terradue.com//better-wfp-00007... 2016-01-01 Output CHIRPSv2_SouthernAfrica_N30_daystotal_2... POLYGON((11.5030755518998 -11.1141633706909,41...
4 https://store.terradue.com/better-wfp-00007/_r... 2016-01-31 C44244E6BD37A41ECEE5FE949750FACF0ECEAA13 https://catalog.terradue.com//better-wfp-00007... 2016-01-01 Output CHIRPSv2_SouthernAfrica_N30_countaboveo... POLYGON((11.5030755518998 -11.1141633706909,41...

Credentials for ellip platform access

  • Provide here your ellip username and api key for platform access
In [6]:
import getpass

user = getpass.getpass("Enter you ellip username:")
api_key = getpass.getpass("Enter the API key:")

Define filter parameters

Time of interest

In [16]:
mydates = ['2015-01-01T00:00:00Z', '2015-01-31T23:59:59Z']

Area of interest (two approaches to determine VOI:)

  • The user can choose to define an AOI selecting a Point and a buffer size used to build a squared polygon around that point
In [ ]:
point_of_interest = Point(27.11,8.149)
In [ ]:
buffer_size = 0.60
In [ ]:
aoi_wkt1 = box(*point_of_interest.buffer(buffer_size).bounds)
In [ ]:
aoi_wkt1.wkt
  • Or creating a Polygon from a points list (in this case this is a point inside the intersect of all results’ polygons) or simply using a WKT Polygon:
In [8]:
#aoi_wkt2 = Polygon([(26.832, 9.5136), (28.6843, 9.5136), (28.6843, 7.8009), (26.832, 7.8009), (26.832, 9.5136)])
#aoi_wkt2 = Polygon([(43.721626, 4.577164), (45.956501, 5.041664), (46.261715, 3.533043), (44.031803 ,3.064234), (43.721626, 4.577164)])
#aoi_wkt2 = 'POLYGON((26.832 9.5136, 28.6843 9.5136, 28.6843 7.8009, 26.832 7.8009, 26.832 9.5136))'
aoi_wkt2 = loads('POLYGON ((11.50307555189977 -11.11416337069092, 41.03432555189977 -11.11416337069092, 41.03432555189977 -34.97636566938584, 11.50307555189977 -34.97636566938584, 11.50307555189977 -11.11416337069092))')
In [9]:
aoi_wkt2.wkt
Out[9]:
'POLYGON ((11.50307555189977 -11.11416337069092, 41.03432555189977 -11.11416337069092, 41.03432555189977 -34.97636566938584, 11.50307555189977 -34.97636566938584, 11.50307555189977 -11.11416337069092))'

POLYGON((43.721626 4.577164,45.956501 5.041664,46.261715 3.533043,44.031803 3.064234,43.721626 4.577164))’

Create a new dataframe with data values

  • Get the subframe data values related to selected tiles and the selected band, and create a new GeoDataFrame having the original metadata stack with the specific band data extended info (data values, size, and geo projection and transformation).

Define auxiliary methods to create new dataframe

  • The first method gets download reference, band number, cropping flag, AOI cropping area and returns the related band data array with related original geospatial infos
  • If crop = False the original band extent is returned
In [10]:

def get_band_as_array(url,band_name,crop,bbox):

    output = '/vsimem/clip.tif'

    ds = gdal.Open('/vsicurl/%s' % url)
    print url


    if crop == True:
        ulx, uly, lrx, lry = bbox[0], bbox[3], bbox[2], bbox[1]
        ds = gdal.Translate(output, ds, projWin = [ulx, uly, lrx, lry], projWinSRS = 'EPSG:4326', noData=-999)
        print 'data cropping : DONE'
    else:

        ds = gdal.Translate(output, ds)
    ds = None
    ds = gdal.Open(output)
    w = ds.GetRasterBand(1).XSize
    h = ds.GetRasterBand(1).YSize
    geo_transform = ds.GetGeoTransform()
    projection = ds.GetProjection()


    data = ds.GetRasterBand(1).ReadAsArray(0, 0, w, h).astype(np.float)

    data[data==-999] = np.nan



    band = None

    print 'band %s value got!' %band_name
    ds = None

    return data,geo_transform,projection,w,h
  • The second one selects the data to be managed (only the actual results related to the filtered metadata) and returns a new GeoDataFrame containing the requested bands extended info with the original metadata set
In [11]:
def get_GDF_with_datavalues(row, user, api_key, band_name, crop=False, bbox=None):


    bands_name = ['band_1','band_2']
    if band_name not in bands_name:
        raise ValueError('Selected band is not defined!')


    parsed_url = urlparse(row['enclosure'])

    url = '%s://%s:%s@%s/api%s' % (list(parsed_url)[0], user, api_key, list(parsed_url)[1], list(parsed_url)[2])
    data,geo_transform,projection,w,h = get_band_as_array(url,band_name,crop,bbox)
    print(url)
    extended_info = dict()


    extended_info=dict(row)
    extended_info['band'] = band_name
    extended_info['data'] = data
    extended_info['geo_transform'] = geo_transform
    extended_info['projection'] = projection
    extended_info['xsize'] = w
    extended_info['ysize'] = h


    print 'Get data values for %s : DONE!' %row['title']
    return extended_info
In [12]:
results.apply(lambda row: 'tif' in row['title'], axis=1)
Out[12]:
0    True
1    True
2    True
3    True
4    True
5    True
6    True
7    True
8    True
dtype: bool
In [13]:
pd.to_datetime(mydates)
Out[13]:
DatetimeIndex(['2015-01-10 00:00:00', '2015-01-31 23:59:59'], dtype='datetime64[ns]', freq=None)
In [15]:
results.apply(lambda row: row['startdate'] in pd.to_datetime(mydates) , axis=1)
Out[15]:
0    False
1    False
2    False
3    False
4    False
5    False
6    False
7    False
8    False
dtype: bool
  • Select from metadataframe a subframe containing the geotiff results of the datetimes of interest
In [17]:
mydate_results = results[(results.apply(lambda row: 'tif' in row['title'], axis=1)) & (results.apply(lambda row: row['startdate'] in pd.to_datetime(mydates) , axis=1))]
print(mydate_results)
                                           enclosure    enddate  \
6  https://store.terradue.com/better-wfp-00007/_r... 2015-01-31
7  https://store.terradue.com/better-wfp-00007/_r... 2015-01-31
8  https://store.terradue.com/better-wfp-00007/_r... 2015-01-31

                                 identifier  \
6  4AFD740D35CEB1D7CC5A7705AFB50B0FDFA7384A
7  988619EB8077CE1BDB12D1CC27ACFB019223A037
8  B6D5E5D46D30F5126629AC6CA7D8BB80CD1A76DA

                                                self  startdate  \
6  https://catalog.terradue.com//better-wfp-00007... 2015-01-01
7  https://catalog.terradue.com//better-wfp-00007... 2015-01-01
8  https://catalog.terradue.com//better-wfp-00007... 2015-01-01

                                               title  \
6  Output CHIRPSv2_SouthernAfrica_N30_countaboveo...
7  Output CHIRPSv2_SouthernAfrica_N30_daystotal_2...
8  Output CHIRPSv2_SouthernAfrica_N30_dryspell_20...

                                                 wkt
6  POLYGON((11.5030755518998 -11.1141633706909,41...
7  POLYGON((11.5030755518998 -11.1141633706909,41...
8  POLYGON((11.5030755518998 -11.1141633706909,41...

*PARAMETERS*

  • bands_list = all|the list of the bands names

For these products we have two bands: band_1 and band_2. band_1 contains the aggregations data, band_2 contains the amount of no_data_values for each region in the image.

  • crop = True|False

if crop=False the bbox can be omitted

In [18]:
myGDF = GeoDataFrame()
for row in mydate_results.iterrows():
    myGDF = myGDF.append(get_GDF_with_datavalues(row[1], user, api_key, 'band_1', True, bbox=aoi_wkt2.bounds), ignore_index=True)
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif
data cropping : DONE
band band_1 value got!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif
Get data values for Output CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif : DONE!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_daystotal_2015-01-01_2015-01-31.tif
data cropping : DONE
band band_1 value got!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_daystotal_2015-01-01_2015-01-31.tif
Get data values for Output CHIRPSv2_SouthernAfrica_N30_daystotal_2015-01-01_2015-01-31.tif : DONE!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_dryspell_2015-01-01_2015-01-31.tif
data cropping : DONE
band band_1 value got!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_dryspell_2015-01-01_2015-01-31.tif
Get data values for Output CHIRPSv2_SouthernAfrica_N30_dryspell_2015-01-01_2015-01-31.tif : DONE!
In [19]:
myGDF
Out[19]:
band data enclosure enddate geo_transform identifier projection self startdate title wkt xsize ysize
0 band_1 [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... https://store.terradue.com/better-wfp-00007/_r... 2015-01-31 (11.5030755519, 0.0499682741117, 0.0, -11.1141... 4AFD740D35CEB1D7CC5A7705AFB50B0FDFA7384A GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS... https://catalog.terradue.com//better-wfp-00007... 2015-01-01 Output CHIRPSv2_SouthernAfrica_N30_countaboveo... POLYGON((11.5030755518998 -11.1141633706909,41... 591.0 477.0
1 band_1 [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... https://store.terradue.com/better-wfp-00007/_r... 2015-01-31 (11.5030755519, 0.0499682741117, 0.0, -11.1141... 988619EB8077CE1BDB12D1CC27ACFB019223A037 GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS... https://catalog.terradue.com//better-wfp-00007... 2015-01-01 Output CHIRPSv2_SouthernAfrica_N30_daystotal_2... POLYGON((11.5030755518998 -11.1141633706909,41... 591.0 477.0
2 band_1 [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... https://store.terradue.com/better-wfp-00007/_r... 2015-01-31 (11.5030755519, 0.0499682741117, 0.0, -11.1141... B6D5E5D46D30F5126629AC6CA7D8BB80CD1A76DA GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS... https://catalog.terradue.com//better-wfp-00007... 2015-01-01 Output CHIRPSv2_SouthernAfrica_N30_dryspell_20... POLYGON((11.5030755518998 -11.1141633706909,41... 591.0 477.0
  • Show data values of band_1.
In [ ]:
list(myGDF[myGDF['band'] == 'band_1']['data'].values)
In [20]:
bands = None
bands =  myGDF[(myGDF['band'] == 'band_1')]['data'].values

numbands = bands.size
for i in range(numbands):
    bmin = np.nanmin(bands[i])
    bmax = np.nanmax(bands[i])
    if bmin != bmax:
        bands[i][~np.isnan(bands[i])] = (bands[i][~np.isnan(bands[i])] - bmin)/(bmax - bmin) * 255
In [21]:
bands[0]
Out[21]:
array([[ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ...,
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan]])
In [22]:
numbands
Out[22]:
3
In [23]:
fig = plt.figure(figsize=(50,50))

for i in range(numbands):
    a = fig.add_subplot(1, numbands, i+1)

    imgplot = plt.imshow(bands[i],
                         cmap='gray')

plt.tight_layout()
fig = plt.gcf()
plt.show()

fig.clf()
plt.close()
../../../../_images/pipelines_WFP_wfp-01-03-02_exploitation_wfp-00007_exploitation_from_pickle_42_0.png

Exporting geotif

In [26]:
band_number = 2
rg_bands=(bands[0],bands[1])
cols = int(myGDF['xsize'].values[0])
rows = int(myGDF['ysize'].values[0])

print (cols,rows)

geo_transform = myGDF['geo_transform'].values[0]
projection = myGDF['projection'].values[0]

drv = gdal.GetDriverByName('GTiff')

ds = drv.Create('export.tif', cols, rows, band_number, gdal.GDT_Float32)

ds.SetGeoTransform(geo_transform)
ds.SetProjection(projection)

ds.GetRasterBand(1).WriteArray(rg_bands[0], 0, 0)
ds.GetRasterBand(2).WriteArray(rg_bands[1], 0, 0)

ds.FlushCache()
(591, 477)

Download functionalities

Download a product

  • Define the download function
In [27]:
def get_product(url, dest, api_key):

    request_headers = {'X-JFrog-Art-Api': api_key}

    r = requests.get(url, headers=request_headers)

    open(dest, 'wb').write(r.content)

    return r.status_code
  • Get the reference download endpoint for the product related to the first date
In [28]:
enclosure = myGDF[(myGDF['startdate'] == mydates[0])]['enclosure'].values[0]

enclosure
Out[28]:
'https://store.terradue.com/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif'
In [29]:
output_name = myGDF[(myGDF['startdate'] == mydates[0])]['title'].values[0]

output_name
Out[29]:
'Output CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif'
In [30]:
get_product(enclosure,
            output_name,
            api_key)
Out[30]:
200

Bulk Download

  • Define the bulk download function
In [31]:
def get_product_bulk(row, api_key):

    return get_product(row['enclosure'],
                       row['title'],
                       api_key)
  • Download all the products related to the chosen dates
In [32]:
myGDF.apply(lambda row: get_product_bulk(row, api_key), axis=1)
Out[32]:
0    200
1    200
2    200
dtype: int64