Hands-On Exercise 2: make a robust workflow and debug it

In this exercise we will install a more robust version of the basic workflow and learn how to debug it through the Hadoop JobTracker Web GUI.

Prerequisites

Install the Hands-On

  • To install Hands-On Exercise 2, type:
cd
cd dcs-hands-on
mvn clean install -D hands.on=2 -P bash

Inspect the run executable

  • Open the my_node/run executable with a text editor, or view it with the more command:
cd $_CIOP_APPLICATION_PATH
more my_node/run

You will see the cleanExit() function. It helps us trace the workflow and makes it more robust.
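
To make the idea concrete, here is a minimal sketch of what an exit handler of this kind could look like. It is not copied from the installed run executable: the exit codes, the messages and the helper names log_msg and exit_message are assumptions made for illustration, and log_msg falls back to plain stderr output when ciop-log is not available.

```shell
#!/bin/sh
# Hypothetical exit codes; the installed run executable may define different ones.
SUCCESS=0
ERR_NOINPUT=1

# Log through ciop-log when it is available (inside the Sandbox),
# otherwise fall back to stderr so the sketch runs anywhere.
log_msg() {
  if command -v ciop-log >/dev/null 2>&1; then
    ciop-log "$1" "$2"
  else
    echo "[$1] $2" >&2
  fi
}

# Map an exit code to a human-readable message (messages are illustrative).
exit_message() {
  case "$1" in
    "$SUCCESS")     echo "Processing successfully concluded" ;;
    "$ERR_NOINPUT") echo "No input provided" ;;
    *)              echo "Unknown error" ;;
  esac
}

# Report the final status of the script, then exit with that same status.
cleanExit() {
  retval=$?
  msg=$(exit_message "$retval")
  if [ "$retval" -ne 0 ]; then
    log_msg "ERROR" "Error $retval - $msg, processing aborted"
  else
    log_msg "INFO" "$msg"
  fi
  exit "$retval"
}

# Run cleanExit whenever the script terminates, whatever the reason.
trap cleanExit EXIT
```

With the trap in place, any later `exit $ERR_NOINPUT` in the script ends with a matching ERROR line in the task log, which is what makes a failed task easy to spot in the Web GUI.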

Run the node and debug the workflow

  • Execute the node my_node:
ciop-run my_node
  • From the output of the ciop-run command, copy the Tracking URL to the clipboard. It will look similar to this:
2016-01-19 12:31:57 [INFO ] - Tracking URL:
2016-01-19 12:31:57 [INFO ] - http://sb-10-16-10-50.dev.terradue.int:11000/oozie/?job=0000001-160119102214227-oozie-oozi-W
  • Open a browser and paste the Tracking URL you just copied.
  • You will see the workflow details in the Web GUI. On the screenshot, the field circled in red represents the node my_node. Click on this field and then on the lens icon:
Workflow summary
Node detail
  • You will see the job details in the Web GUI. On the screenshot, the link circled in red shows the number of parallel tasks (in Sandbox mode, the default is 2). Click on this link:
Job summary
  • We now have the list of tasks.
  • To see the details of one of them, click on its name in the Task column (the circled one):
Tasks details
  • Now we have the list of task attempts.
  • To see the output of one of them (in this case we have just one attempt), click on the All link in the Task Logs column (the circled one):
Attempts details
  • Finally we have the output list of the selected task attempt.
  • We can see the output of the ciop-log function:
Attempts output
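
If you prefer not to copy the Tracking URL by hand, it can be pulled out of saved ciop-run output with standard tools. The following is a minimal sketch that works on the two sample log lines shown in the steps above; the function names extract_tracking_url and extract_job_id are hypothetical helpers, and the sketch assumes the URL always contains `/oozie/` and a `job=` parameter as it does in that sample.

```shell
#!/bin/sh
# Print the first URL containing "/oozie/" found in the text read on stdin.
extract_tracking_url() {
  grep -o 'http[^ ]*/oozie/[^ ]*' | head -n 1
}

# Print the workflow job ID, i.e. everything after "job=" in the URL.
extract_job_id() {
  echo "${1##*job=}"
}

# Sample lines copied from the ciop-run output shown in this exercise.
sample_output='2016-01-19 12:31:57 [INFO ] - Tracking URL:
2016-01-19 12:31:57 [INFO ] - http://sb-10-16-10-50.dev.terradue.int:11000/oozie/?job=0000001-160119102214227-oozie-oozi-W'

url=$(printf '%s\n' "$sample_output" | extract_tracking_url)
echo "Tracking URL: $url"
echo "Job ID: $(extract_job_id "$url")"
```

On a real run you could save the output first, for example with `ciop-run my_node 2>&1 | tee run.log` (redirecting both streams, since this sketch does not assume which one ciop-run writes to), and then call `extract_tracking_url < run.log`.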

Congrats

You learned how to insert an exit function in your run executable, and how to view the log messages a task writes, including the one generated when it completes.

Here’s the related piece of code of the run executable:

    ciop-log "INFO" "The input file is: $inputfile"
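
For context, a line like this usually sits inside a loop over the inputs of the task. The sketch below assumes the run executable receives one input reference per line on standard input; that assumption, the function names log_msg and process_inputs, and the sample input names are all illustrative and not taken from the installed file. The log_msg helper falls back to stderr when ciop-log is not available, so the sketch also runs outside the Sandbox.

```shell
#!/bin/sh
# Log through ciop-log when available, otherwise print to stderr.
log_msg() {
  if command -v ciop-log >/dev/null 2>&1; then
    ciop-log "$1" "$2"
  else
    echo "[$1] $2" >&2
  fi
}

# Read one input reference per line from stdin and log each of them.
process_inputs() {
  while IFS= read -r inputfile; do
    log_msg "INFO" "The input file is: $inputfile"
    # Real processing of "$inputfile" would go here.
  done
}

# Example with two hypothetical input references.
printf '%s\n' "input-a" "input-b" | process_inputs
```

Each task attempt then shows one such INFO line per input it handled, which is the output you inspected in the Task Logs view.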

Hint

Try to debug the second task in the same way to see the output it generated.

Recap

  1. We installed a different version of the run executable that includes some additional features to make it more robust;
  2. We ran the node ‘my_node’ and we debugged its output in the Hadoop JobTracker Web GUI.