better-wfp-00007 data pipeline results (CHIRPS Rainfall Estimates Aggregations): loading from pickle file and exploiting¶
This Notebook shows how to: * reload better-wfp-00007 data pipeline results previously stored in a pickle file * Create a new dataframe with data values cropped over the area of interest * Plot an RGB * Create a geotif
Import needed modules¶
In [1]:
import pandas as pd
from geopandas import GeoDataFrame
import gdal
import numpy as np
from shapely.geometry import Point
from shapely.geometry import Polygon
import matplotlib
import matplotlib.pyplot as plt
from PIL import Image
%matplotlib inline
from shapely.wkt import loads
from shapely.geometry import box
from urlparse import urlparse
import requests
Read picke data file¶
In [2]:
pickle_filename = 'better-wfp-00007_validation.pkl'
In [3]:
results = GeoDataFrame(pd.read_pickle(pickle_filename))
In [5]:
results.head(5)
Out[5]:
enclosure | enddate | identifier | self | startdate | title | wkt | |
---|---|---|---|---|---|---|---|
0 | https://store.terradue.com/better-wfp-00007/_r... | 2017-01-31 | 2F972053F721342E729EC1FA7062C4DAC0587A64 | https://catalog.terradue.com//better-wfp-00007... | 2017-01-01 | Output CHIRPSv2_SouthernAfrica_N30_daystotal_2... | POLYGON((11.5030755518998 -11.1141633706909,41... |
1 | https://store.terradue.com/better-wfp-00007/_r... | 2017-01-31 | E1E2526083E64B52661E4E4CCCB0E77597031521 | https://catalog.terradue.com//better-wfp-00007... | 2017-01-01 | Output CHIRPSv2_SouthernAfrica_N30_countaboveo... | POLYGON((11.5030755518998 -11.1141633706909,41... |
2 | https://store.terradue.com/better-wfp-00007/_r... | 2017-01-31 | F44BE47DF19F96AB7E9F1AAC01F009CB6229FFE7 | https://catalog.terradue.com//better-wfp-00007... | 2017-01-01 | Output CHIRPSv2_SouthernAfrica_N30_dryspell_20... | POLYGON((11.5030755518998 -11.1141633706909,41... |
3 | https://store.terradue.com/better-wfp-00007/_r... | 2016-01-31 | 670F163E595D0760E8FE51018117FCB0FD8ED9B0 | https://catalog.terradue.com//better-wfp-00007... | 2016-01-01 | Output CHIRPSv2_SouthernAfrica_N30_daystotal_2... | POLYGON((11.5030755518998 -11.1141633706909,41... |
4 | https://store.terradue.com/better-wfp-00007/_r... | 2016-01-31 | C44244E6BD37A41ECEE5FE949750FACF0ECEAA13 | https://catalog.terradue.com//better-wfp-00007... | 2016-01-01 | Output CHIRPSv2_SouthernAfrica_N30_countaboveo... | POLYGON((11.5030755518998 -11.1141633706909,41... |
Credentials for ellip platform access¶
- Provide here your ellip username and api key for platform access
In [6]:
import getpass
user = getpass.getpass("Enter you ellip username:")
api_key = getpass.getpass("Enter the API key:")
Define filter parameters¶
Time of interest¶
In [16]:
mydates = ['2015-01-01T00:00:00Z', '2015-01-31T23:59:59Z']
Area of interest (two approaches to determine VOI:)¶
- The user can choose to define an AOI selecting a Point and a buffer size used to build a squared polygon around that point
In [ ]:
point_of_interest = Point(27.11,8.149)
In [ ]:
buffer_size = 0.60
In [ ]:
aoi_wkt1 = box(*point_of_interest.buffer(buffer_size).bounds)
In [ ]:
aoi_wkt1.wkt
- Or creating a Polygon from a points list (in this case this is a point inside the intersect of all results’ polygons) or simply using a WKT Polygon:
In [8]:
#aoi_wkt2 = Polygon([(26.832, 9.5136), (28.6843, 9.5136), (28.6843, 7.8009), (26.832, 7.8009), (26.832, 9.5136)])
#aoi_wkt2 = Polygon([(43.721626, 4.577164), (45.956501, 5.041664), (46.261715, 3.533043), (44.031803 ,3.064234), (43.721626, 4.577164)])
#aoi_wkt2 = 'POLYGON((26.832 9.5136, 28.6843 9.5136, 28.6843 7.8009, 26.832 7.8009, 26.832 9.5136))'
aoi_wkt2 = loads('POLYGON ((11.50307555189977 -11.11416337069092, 41.03432555189977 -11.11416337069092, 41.03432555189977 -34.97636566938584, 11.50307555189977 -34.97636566938584, 11.50307555189977 -11.11416337069092))')
In [9]:
aoi_wkt2.wkt
Out[9]:
'POLYGON ((11.50307555189977 -11.11416337069092, 41.03432555189977 -11.11416337069092, 41.03432555189977 -34.97636566938584, 11.50307555189977 -34.97636566938584, 11.50307555189977 -11.11416337069092))'
POLYGON((43.721626 4.577164,45.956501 5.041664,46.261715 3.533043,44.031803 3.064234,43.721626 4.577164))’
Create a new dataframe with data values¶
- Get the subframe data values related to selected tiles and the selected band, and create a new GeoDataFrame having the original metadata stack with the specific band data extended info (data values, size, and geo projection and transformation).
Define auxiliary methods to create new dataframe¶
- The first method gets download reference, band number, cropping flag, AOI cropping area and returns the related band data array with related original geospatial infos
- If crop = False the original band extent is returned
In [10]:
def get_band_as_array(url,band_name,crop,bbox):
output = '/vsimem/clip.tif'
ds = gdal.Open('/vsicurl/%s' % url)
print url
if crop == True:
ulx, uly, lrx, lry = bbox[0], bbox[3], bbox[2], bbox[1]
ds = gdal.Translate(output, ds, projWin = [ulx, uly, lrx, lry], projWinSRS = 'EPSG:4326', noData=-999)
print 'data cropping : DONE'
else:
ds = gdal.Translate(output, ds)
ds = None
ds = gdal.Open(output)
w = ds.GetRasterBand(1).XSize
h = ds.GetRasterBand(1).YSize
geo_transform = ds.GetGeoTransform()
projection = ds.GetProjection()
data = ds.GetRasterBand(1).ReadAsArray(0, 0, w, h).astype(np.float)
data[data==-999] = np.nan
band = None
print 'band %s value got!' %band_name
ds = None
return data,geo_transform,projection,w,h
- The second one selects the data to be managed (only the actual results related to the filtered metadata) and returns a new GeoDataFrame containing the requested bands extended info with the original metadata set
In [11]:
def get_GDF_with_datavalues(row, user, api_key, band_name, crop=False, bbox=None):
bands_name = ['band_1','band_2']
if band_name not in bands_name:
raise ValueError('Selected band is not defined!')
parsed_url = urlparse(row['enclosure'])
url = '%s://%s:%s@%s/api%s' % (list(parsed_url)[0], user, api_key, list(parsed_url)[1], list(parsed_url)[2])
data,geo_transform,projection,w,h = get_band_as_array(url,band_name,crop,bbox)
print(url)
extended_info = dict()
extended_info=dict(row)
extended_info['band'] = band_name
extended_info['data'] = data
extended_info['geo_transform'] = geo_transform
extended_info['projection'] = projection
extended_info['xsize'] = w
extended_info['ysize'] = h
print 'Get data values for %s : DONE!' %row['title']
return extended_info
In [12]:
results.apply(lambda row: 'tif' in row['title'], axis=1)
Out[12]:
0 True
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
dtype: bool
In [13]:
pd.to_datetime(mydates)
Out[13]:
DatetimeIndex(['2015-01-10 00:00:00', '2015-01-31 23:59:59'], dtype='datetime64[ns]', freq=None)
In [15]:
results.apply(lambda row: row['startdate'] in pd.to_datetime(mydates) , axis=1)
Out[15]:
0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
dtype: bool
- Select from metadataframe a subframe containing the geotiff results of the datetimes of interest
In [17]:
mydate_results = results[(results.apply(lambda row: 'tif' in row['title'], axis=1)) & (results.apply(lambda row: row['startdate'] in pd.to_datetime(mydates) , axis=1))]
print(mydate_results)
enclosure enddate \
6 https://store.terradue.com/better-wfp-00007/_r... 2015-01-31
7 https://store.terradue.com/better-wfp-00007/_r... 2015-01-31
8 https://store.terradue.com/better-wfp-00007/_r... 2015-01-31
identifier \
6 4AFD740D35CEB1D7CC5A7705AFB50B0FDFA7384A
7 988619EB8077CE1BDB12D1CC27ACFB019223A037
8 B6D5E5D46D30F5126629AC6CA7D8BB80CD1A76DA
self startdate \
6 https://catalog.terradue.com//better-wfp-00007... 2015-01-01
7 https://catalog.terradue.com//better-wfp-00007... 2015-01-01
8 https://catalog.terradue.com//better-wfp-00007... 2015-01-01
title \
6 Output CHIRPSv2_SouthernAfrica_N30_countaboveo...
7 Output CHIRPSv2_SouthernAfrica_N30_daystotal_2...
8 Output CHIRPSv2_SouthernAfrica_N30_dryspell_20...
wkt
6 POLYGON((11.5030755518998 -11.1141633706909,41...
7 POLYGON((11.5030755518998 -11.1141633706909,41...
8 POLYGON((11.5030755518998 -11.1141633706909,41...
*PARAMETERS*
- bands_list = all|the list of the bands names
For these products we have two bands: band_1 and band_2. band_1 contains the aggregations data, band_2 contains the amount of no_data_values for each region in the image.
- crop = True|False
if crop=False the bbox can be omitted
In [18]:
myGDF = GeoDataFrame()
for row in mydate_results.iterrows():
myGDF = myGDF.append(get_GDF_with_datavalues(row[1], user, api_key, 'band_1', True, bbox=aoi_wkt2.bounds), ignore_index=True)
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif
data cropping : DONE
band band_1 value got!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif
Get data values for Output CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif : DONE!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_daystotal_2015-01-01_2015-01-31.tif
data cropping : DONE
band band_1 value got!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_daystotal_2015-01-01_2015-01-31.tif
Get data values for Output CHIRPSv2_SouthernAfrica_N30_daystotal_2015-01-01_2015-01-31.tif : DONE!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_dryspell_2015-01-01_2015-01-31.tif
data cropping : DONE
band band_1 value got!
https://better-wfp-00007:AKCp5bBXiiX2aL2wFq6EseHCGupNTJN2FYBr1CZGQHpU2v4rKMPEnLEcbLbdtVfLeyWrTXA2G@store.terradue.com/api/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_dryspell_2015-01-01_2015-01-31.tif
Get data values for Output CHIRPSv2_SouthernAfrica_N30_dryspell_2015-01-01_2015-01-31.tif : DONE!
In [19]:
myGDF
Out[19]:
band | data | enclosure | enddate | geo_transform | identifier | projection | self | startdate | title | wkt | xsize | ysize | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | band_1 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | https://store.terradue.com/better-wfp-00007/_r... | 2015-01-31 | (11.5030755519, 0.0499682741117, 0.0, -11.1141... | 4AFD740D35CEB1D7CC5A7705AFB50B0FDFA7384A | GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS... | https://catalog.terradue.com//better-wfp-00007... | 2015-01-01 | Output CHIRPSv2_SouthernAfrica_N30_countaboveo... | POLYGON((11.5030755518998 -11.1141633706909,41... | 591.0 | 477.0 |
1 | band_1 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | https://store.terradue.com/better-wfp-00007/_r... | 2015-01-31 | (11.5030755519, 0.0499682741117, 0.0, -11.1141... | 988619EB8077CE1BDB12D1CC27ACFB019223A037 | GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS... | https://catalog.terradue.com//better-wfp-00007... | 2015-01-01 | Output CHIRPSv2_SouthernAfrica_N30_daystotal_2... | POLYGON((11.5030755518998 -11.1141633706909,41... | 591.0 | 477.0 |
2 | band_1 | [[nan, nan, nan, nan, nan, nan, nan, nan, nan,... | https://store.terradue.com/better-wfp-00007/_r... | 2015-01-31 | (11.5030755519, 0.0499682741117, 0.0, -11.1141... | B6D5E5D46D30F5126629AC6CA7D8BB80CD1A76DA | GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS... | https://catalog.terradue.com//better-wfp-00007... | 2015-01-01 | Output CHIRPSv2_SouthernAfrica_N30_dryspell_20... | POLYGON((11.5030755518998 -11.1141633706909,41... | 591.0 | 477.0 |
- Show data values of band_1.
In [ ]:
list(myGDF[myGDF['band'] == 'band_1']['data'].values)
In [20]:
bands = None
bands = myGDF[(myGDF['band'] == 'band_1')]['data'].values
numbands = bands.size
for i in range(numbands):
bmin = np.nanmin(bands[i])
bmax = np.nanmax(bands[i])
if bmin != bmax:
bands[i][~np.isnan(bands[i])] = (bands[i][~np.isnan(bands[i])] - bmin)/(bmax - bmin) * 255
In [21]:
bands[0]
Out[21]:
array([[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
...,
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]])
In [22]:
numbands
Out[22]:
3
In [23]:
fig = plt.figure(figsize=(50,50))
for i in range(numbands):
a = fig.add_subplot(1, numbands, i+1)
imgplot = plt.imshow(bands[i],
cmap='gray')
plt.tight_layout()
fig = plt.gcf()
plt.show()
fig.clf()
plt.close()
Exporting geotif¶
In [26]:
band_number = 2
rg_bands=(bands[0],bands[1])
cols = int(myGDF['xsize'].values[0])
rows = int(myGDF['ysize'].values[0])
print (cols,rows)
geo_transform = myGDF['geo_transform'].values[0]
projection = myGDF['projection'].values[0]
drv = gdal.GetDriverByName('GTiff')
ds = drv.Create('export.tif', cols, rows, band_number, gdal.GDT_Float32)
ds.SetGeoTransform(geo_transform)
ds.SetProjection(projection)
ds.GetRasterBand(1).WriteArray(rg_bands[0], 0, 0)
ds.GetRasterBand(2).WriteArray(rg_bands[1], 0, 0)
ds.FlushCache()
(591, 477)
Download functionalities¶
Download a product¶
- Define the download function
In [27]:
def get_product(url, dest, api_key):
request_headers = {'X-JFrog-Art-Api': api_key}
r = requests.get(url, headers=request_headers)
open(dest, 'wb').write(r.content)
return r.status_code
- Get the reference download endpoint for the product related to the first date
In [28]:
enclosure = myGDF[(myGDF['startdate'] == mydates[0])]['enclosure'].values[0]
enclosure
Out[28]:
'https://store.terradue.com/better-wfp-00007/_results/workflows/ec_better_wfp_01_03_02_wfp_01_03_02_1_17/run/23a022f2-403d-11e9-a8b5-0242ac11000f/0024594-181221095105003-oozie-oozi-W/2a5af678-7dea-40ac-a45a-47af57fd8495/CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif'
In [29]:
output_name = myGDF[(myGDF['startdate'] == mydates[0])]['title'].values[0]
output_name
Out[29]:
'Output CHIRPSv2_SouthernAfrica_N30_countaboveone_2015-01-01_2015-01-31.tif'
In [30]:
get_product(enclosure,
output_name,
api_key)
Out[30]:
200
Bulk Download¶
- Define the bulk download function
In [31]:
def get_product_bulk(row, api_key):
return get_product(row['enclosure'],
row['title'],
api_key)
- Download all the products related to the chosen dates
In [32]:
myGDF.apply(lambda row: get_product_bulk(row, api_key), axis=1)
Out[32]:
0 200
1 200
2 200
dtype: int64