Step 4 - Implement the data transformation application steps

This step takes one Sentinel-1 product, staged in during Step 3 - Stage-in the EO data, and generates the backscatter using the Sentinel Application Platform (SNAP) Python bindings.

The data transformation application steps below use a Jupyter Notebook to:

  • Read the Sentinel-1 product using the SNAP reader classes
  • Apply the orbit file using the SNAP Apply-Orbit-File operator
  • Remove thermal noise using the SNAP ThermalNoiseRemoval operator
  • Calibrate using the SNAP Calibration operator
  • Filter speckle using the SNAP Speckle-Filter operator
  • Correct the terrain using the SNAP Terrain-Correction operator
  • Convert to dB using the SNAP linearToFromdB operator
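
The processing steps listed above form a chain in which each operator consumes the product produced by the previous one. The sketch below illustrates only that ordering: `run_chain` and `trace_operator` are hypothetical helpers, not part of the notebook, and the stand-in records operator names instead of processing data so that it runs without SNAP. With the SNAP Python bindings, the callback would presumably wrap a call such as `GPF.createProduct(name, parameters, product)`.

```python
# Operator names as given in the list above. No operator parameters are set
# here, which is a simplification: the notebook configures each operator.
OPERATORS = [
    'Apply-Orbit-File',
    'ThermalNoiseRemoval',
    'Calibration',
    'Speckle-Filter',
    'Terrain-Correction',
    'linearToFromdB',
]


def run_chain(product, operators, apply_operator):
    """Feed the output of each operator into the next one, in order."""
    for name in operators:
        product = apply_operator(name, product)
    return product


def trace_operator(name, product):
    """Hypothetical stand-in: append the operator name instead of processing."""
    return product + [name]


# The "product" is a plain list here, so the result is the processing history.
history = run_chain([], OPERATORS, trace_operator)
print(' -> '.join(history))
```

Passing the operator callback as an argument keeps the ordering logic independent of SNAP, which makes it easy to dry-run the chain before processing a real product.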

Procedure

This step provides an example of a data transformation application.

Obtain the notebook file

  • On the JupyterLab Launcher, start a new Terminal
  • Type:
cd /workspace
git clone https://gitlab.com/ellip/quick-start/jupyterlab/data-transformation.git
  • Copy the notebook file into your application. Type:
APP_NAME=<app-name>

cp -f data-transformation/input.ipynb ${APP_NAME}/src/main/app-resources/notebook/libexec/
  • Using the JupyterLab Left Sidebar, navigate to ${APP_NAME}/src/main/app-resources/notebook/libexec, and open the file input.ipynb

Change the service definition

The data transformation application will be exposed as a Web Processing Service. The service information, such as the title and the abstract, is defined in a Python dictionary.

  • Edit the cell containing the service definition and set the id to the value of APP_NAME. For instance:
[ ]:
service = dict([('title', 'Sentinel-1 backscatter timeseries'),
                ('abstract', 'Data transformation application - Sentinel-1 backscatter timeseries'),
                ('id', 'myapp')])
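
Because the id has to match APP_NAME, a quick consistency check before running the notebook can catch a mismatch early. The following is a minimal sketch: `check_service` is a hypothetical helper that is not part of the notebook, and `'myapp'` stands in for your own application name.

```python
APP_NAME = 'myapp'  # hypothetical application name, replace with your own

service = dict([('title', 'Sentinel-1 backscatter timeseries'),
                ('abstract', 'Data transformation application - Sentinel-1 backscatter timeseries'),
                ('id', APP_NAME)])


def check_service(service, app_name):
    """Raise ValueError if a field is missing or the id differs from app_name."""
    missing = [key for key in ('title', 'abstract', 'id') if not service.get(key)]
    if missing:
        raise ValueError('missing service fields: ' + ', '.join(missing))
    if service['id'] != app_name:
        raise ValueError('service id %r does not match APP_NAME %r'
                         % (service['id'], app_name))
    return True


check_service(service, APP_NAME)
```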

Run the notebook

  • Type:
input_identifier="S1A_IW_GRDH_1SDV_20171210T182024_20171210T182049_019644_021603_0A33"
  • Click Kernel/Restart Kernel and Run All Cells and wait for the run to finish.

    • This step can take several minutes. It has finished when every cell shows a number within the square brackets. The symbol [*] next to a cell means that the cell is still executing. The symbol [*] can hang on the last two cells of the notebook. If it stays stuck for more than 30 minutes, follow the procedure below to check whether the result is complete and valid.

      • Compute the MD5 hash value of the output file:
      md5sum /workspace/${APP_NAME}/src/main/app-resources/notebook/libexec/${input_identifier}_Beta0_VV.tif | awk '{print $1}'
      
      • The expected output is:
      26fe39e2d58eb4e283bd58b345917bd6
      

      If your output matches the value above, you can continue with the procedure. Otherwise, wait for the notebook to complete and repeat this check.
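
The same check can be done from Python with the standard library, which may be convenient from a notebook cell. This is a sketch only: `md5_of_file` and `output_is_valid` are hypothetical helper names, and the expected hash is the value quoted above.

```python
import hashlib

# Expected hash of the output file, as quoted in the procedure above.
EXPECTED_MD5 = '26fe39e2d58eb4e283bd58b345917bd6'


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def output_is_valid(path, expected=EXPECTED_MD5):
    """Return True when the file at path has the expected MD5 digest."""
    return md5_of_file(path) == expected
```

Calling `output_is_valid` with the path of the `_Beta0_VV.tif` file used in the `md5sum` command returns True once the result is complete. Reading in chunks avoids loading the whole GeoTIFF into memory.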

  • Using the JupyterLab Launcher, start a Terminal and type:

mv /workspace/${APP_NAME}/src/main/app-resources/notebook/libexec/${input_identifier}_Beta0_VV.tif /workspace
  • Using the JupyterLab Left Sidebar, navigate to the Home and find the product ${input_identifier}_Beta0_VV.tif.

Going further

Parameter Definition

The data transformation application may have to expose parameters that can be changed via the Web Processing Service interface at submission time. Each parameter is defined using a Python dictionary that holds its identifier, title, abstract and default value:

[ ]:
filterSizeX = dict([('id', 'filterSizeX'),
               ('value', '5'),
               ('title', 'Speckle-Filter filterSizeX'),
               ('abstract', 'Set the Speckle-Filter filterSizeX (defaults to 5)')])

To use the parameter value, simply do:

[ ]:
int(filterSizeX['value'])
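
The value is stored as a string, so it has to be cast before it is handed to an operator. The sketch below wraps the cast in a hypothetical helper, `param_as_int`, that names the offending parameter when the cast fails. Collecting the result into a plain dictionary keyed by the parameter id is an assumption about how you might organise operator parameters, not something the notebook prescribes.

```python
filterSizeX = dict([('id', 'filterSizeX'),
                    ('value', '5'),
                    ('title', 'Speckle-Filter filterSizeX'),
                    ('abstract', 'Set the Speckle-Filter filterSizeX (defaults to 5)')])


def param_as_int(param):
    """Return the parameter value as an int, naming the parameter on failure."""
    try:
        return int(param['value'])
    except (TypeError, ValueError):
        raise ValueError('parameter %r must be an integer, got %r'
                         % (param['id'], param['value']))


# Hypothetical layout: operator parameters keyed by the parameter id.
speckle_parameters = {filterSizeX['id']: param_as_int(filterSizeX)}
print(speckle_parameters)
```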

Runtime parameter definition

Runtime parameters are mandatory. Their values are replaced at runtime.

These are:

  • input_identifier - the Sentinel-1 product identifier. At runtime its value is replaced with the identifier of the Sentinel-1 product being processed
  • input_reference - the Sentinel-1 product catalogue entry URL. At runtime its value is replaced with the catalogue entry URL of the product being processed
  • data_path - the local path where the Sentinel-1 product was staged in during Step 3 - Stage-in the EO data. At runtime its value is replaced by a folder with a unique name
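
Since these values are replaced at runtime, everything that depends on them should be derived from the variables rather than hard-coded. The sketch below assumes the three runtime parameters are plain string variables. The catalogue URL and the data path are placeholders, and the layout of the staged-in product under `data_path` is an assumption. Only the `_Beta0_VV.tif` suffix comes from the procedure above.

```python
import os

# Values used while developing; all three are replaced at runtime.
input_identifier = 'S1A_IW_GRDH_1SDV_20171210T182024_20171210T182049_019644_021603_0A33'
input_reference = 'https://catalogue.example.org/search?uid=' + input_identifier  # placeholder URL
data_path = '/workspace/data'  # placeholder path


def staged_product_path(data_path, identifier):
    """Assumed layout: the product sits in data_path under its identifier."""
    return os.path.join(data_path, identifier)


def output_name(identifier):
    """Name of the result file, following the _Beta0_VV.tif pattern used above."""
    return identifier + '_Beta0_VV.tif'


print(staged_product_path(data_path, input_identifier))
print(output_name(input_identifier))
```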

Discover parameters for a given SNAP operator

You might be wondering how to discover what parameters should be used with a given SNAP operator. The cell below shows how to do it programmatically:

[ ]:
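# Assumes the SNAP Python bindings are importable as `snappy`;
# skip this import if GPF is already available in the notebook
from snappy import GPF

operator = 'ThermalNoiseRemoval'

# Look up the operator's service provider interface (SPI) in the SNAP registry
op_spi = GPF.getDefaultInstance().getOperatorSpiRegistry().getOperatorSpi(operator)

# The operator descriptor lists every parameter the operator accepts
op_params = op_spi.getOperatorDescriptor().getParameterDescriptors()

# Print each parameter name together with its default value
for param in op_params:
    print(param.getName(), param.getDefaultValue())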

Documenting the Jupyter Notebook streaming executable

One of the nice features of Jupyter Notebooks is that they can incorporate text that documents what is done.

The input.ipynb file already contains a proposal for documenting the Sentinel-1 backscatter timeseries data transformation streaming notebook. You can use it as a starting point.

Next step

The next step is to deploy the data transformation as a local Web Processing Service and test it against a Sentinel-1 product using the Sandbox resources (see Step 5 - Deploy and run the Web Processing Service locally).
