Data pipeline design
Parallelization Strategy
The service developer defines the complex directed acyclic graph (DAG) of the service, showing the jobs, inputs, and outputs of the workflow. This graph determines the parallelization strategy to be applied in the Cloud Environment.
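As a rough illustration of how a DAG exposes the parallelization strategy, the sketch below groups jobs into stages whose members do not depend on each other and could therefore run concurrently. The job names and the `parallel_stages` helper are hypothetical, chosen only for this example; the sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+) and is not the service's actual scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "features_a": {"clean"},
    "features_b": {"clean"},
    "train": {"features_a", "features_b"},
}


def parallel_stages(dag):
    """Group jobs into stages; jobs within one stage do not depend on each other."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    stages = []
    while sorter.is_active():
        # Every job returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


for index, stage in enumerate(parallel_stages(workflow)):
    print(f"stage {index}: {stage}")
```

In this example `features_a` and `features_b` land in the same stage, which is the point where the workflow can fan out across workers.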
Data Sources
The data requirements are analyzed and their retrieval mechanisms assessed to make sure that the data is available in the system and can be consumed as expected.
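One way to make this availability check concrete is a small pre-flight step that probes each required source before the workflow starts. The sketch below is only an assumption about how such a check might look: the `missing_sources` helper, the source names, and the file path are all hypothetical, and a real retrieval mechanism (database, object store, API) would need its own probe.

```python
from pathlib import Path


def missing_sources(sources):
    """Return the names of sources whose probe fails or reports the data as unavailable."""
    missing = []
    for name, probe in sources.items():
        try:
            available = bool(probe())
        except OSError:
            # Treat I/O failures (unreachable mount, permission error) as unavailable.
            available = False
        if not available:
            missing.append(name)
    return missing


# Hypothetical sources: each probe is a zero-argument callable returning True or False.
sources = {
    "working_dir": lambda: Path(".").exists(),
    "input_catalogue": lambda: Path("input_catalogue.csv").is_file(),
}

unavailable = missing_sources(sources)
if unavailable:
    print("Unavailable data sources:", ", ".join(unavailable))
else:
    print("All data sources are available.")
```

Keeping each probe as a plain callable means new retrieval mechanisms can be checked without changing the helper itself.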
Tools and Libraries
The tools and libraries needed to execute the applications are analyzed, and their compatibility is evaluated taking into consideration the computational resources available.