The DORII Middleware. Workflow Management System (WfMS)

Overview

The main goal of the Workflow Management System (WfMS) is to support users in defining, managing and monitoring measurement scenarios. What distinguishes our solution from other workflow systems is that it allows many computational and experimental jobs to be connected in one execution graph.

The workflow component was first introduced in the Virtual Laboratory (VLab) project [5]. The first prototype of the WfMS was deployed in the field of nuclear magnetic resonance spectroscopy and was used by scientists to design and manage their experiment scenarios. The user was able to define experiment components and the data flows between them, and to describe the experiment parameters and conditions. Moreover, the composed measurement scenario could be submitted for processing, e.g. to the grid environment.

The experience gained during design and implementation of the VLab project is currently being used in the EXPReS project. The overall objective of EXPReS is to create a production-level real-time, electronic VLBI, or e-VLBI, service in which the radio telescopes are reliably connected to the central supercomputer at JIVE in the Netherlands via a high-speed optical-fibre communication network. The skeleton of the WfMS has been used as a prototype of the application for controlling and managing data flows in the VLBI. The tool will be the central point of the VLBI, and will be used to design and control the observation.

Experiments executed in science laboratories are usually complex and consist of many stages. Thus, we can define a graph which describes the execution path specified by a user. Nodes in this graph correspond to experimental, computational and storage tasks. Edges (links) correspond to the path that the measurement execution follows. The type of application and its parameters are defined in the nodes.
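The execution graph described above can be sketched as a small data structure. This is only an illustration and not the WfMS data model: the class names, the application identifiers and the example parameters are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskKind(Enum):
    """The three kinds of task a node can represent."""
    EXPERIMENTAL = "experimental"
    COMPUTATIONAL = "computational"
    STORAGE = "storage"


@dataclass
class Node:
    name: str
    kind: TaskKind
    application: str                              # hypothetical application identifier
    parameters: dict = field(default_factory=dict)


@dataclass
class Workflow:
    nodes: dict = field(default_factory=dict)     # node name -> Node
    edges: list = field(default_factory=list)     # (source name, target name) pairs

    def add_node(self, node):
        self.nodes[node.name] = node

    def connect(self, source, target):
        # An edge may only join nodes that are already part of the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before connecting")
        self.edges.append((source, target))

    def successors(self, name):
        return [t for s, t in self.edges if s == name]


# Hypothetical three-stage measurement: acquire, process, archive.
wf = Workflow()
wf.add_node(Node("acquire", TaskKind.EXPERIMENTAL, "spectrometer", {"scans": 16}))
wf.add_node(Node("process", TaskKind.COMPUTATIONAL, "fourier_transform"))
wf.add_node(Node("archive", TaskKind.STORAGE, "store_results"))
wf.connect("acquire", "process")
wf.connect("process", "archive")
```

Keeping the application type and its parameters on the node, and only the ordering on the edges, mirrors the split described in the paragraph above.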

WfMS facilitates the measurement process from the preparation stage, through the experimental and computational processes, to the analysis of results (based on the obtained visualization data). WfMS is functionally divided into three parts: editor, manager and monitor.

In the Workflow Editor (WE) the user is given a list of building blocks. Each block represents an available resource. It is the user’s responsibility to design and construct the experiment workflow from the available building blocks. However, the application can be equipped with knowledge from a specific domain. In that case the WE is able to support users during the design phase of the workflow.

Each application available in the measurement scenario must first be analyzed from the functional point of view. Input and output parameters have to be taken into consideration, and the input and output file formats must be described. Once the application analysis is done, a connection diagram must be prepared. This diagram describes which paths between applications can be created.
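A connection diagram of this kind can be thought of as a lookup table that the editor consults before accepting a link. The sketch below is a minimal illustration under stated assumptions: the application names, the file formats and the `can_connect` helper are hypothetical and do not come from the WfMS.

```python
# Hypothetical connection diagram: application -> applications allowed to follow it.
CONNECTION_DIAGRAM = {
    "acquisition": {"processing"},
    "processing": {"visualization", "storage"},
    "visualization": {"storage"},
    "storage": set(),
}

# Hypothetical file formats produced and accepted by each application.
OUTPUT_FORMAT = {
    "acquisition": "raw",
    "processing": "spectrum",
    "visualization": "image",
}
INPUT_FORMATS = {
    "processing": {"raw"},
    "visualization": {"spectrum"},
    "storage": {"raw", "spectrum", "image"},
}


def can_connect(source, target):
    """Return True if the diagram allows the link and the file formats match."""
    if target not in CONNECTION_DIAGRAM.get(source, set()):
        return False
    return OUTPUT_FORMAT.get(source) in INPUT_FORMATS.get(target, set())
```

With these assumed tables, `can_connect("acquisition", "processing")` is accepted, while `can_connect("acquisition", "visualization")` is rejected because the diagram has no such path.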

The WE is designed as a standalone application that can be launched using Java Web Start technology. With Java Web Start, standalone Java applications can be deployed over the network with a single click. Java Web Start ensures that the most current version of the application is deployed and that the correct version of the Java Runtime Environment is used. The main advantage of this approach is an easy, consistent and intuitive interface. Moreover, the application can be run on any computer connected to the Internet and equipped with a web browser.

Fig. 1: Workflow conception

Equipping the user with a graphical interface for scenario definition has a big advantage: the user does not have to know a workflow description language. Using the WE, the user builds a workflow by connecting defined resources together and providing the required properties. The workflow is then submitted to the central management component, called the Workflow Manager (WM), where it is decomposed according to the user specification and each node is launched. All experimental tasks are directed to the Instrument Element (IE) service and, similarly, all computational tasks are submitted to the Computing Element (CE). Application results are stored on the Storage Element (SE).
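The routing rule in this paragraph (experimental tasks to the IE, computational tasks to the CE, results to the SE) can be sketched as a simple mapping. The function and task names below are hypothetical, and routing storage tasks to the SE is an assumption drawn from the statement that results are stored there; no real IE, CE or SE interface is called.

```python
# Assumed mapping from task kind to the service that receives it.
SERVICE_FOR_KIND = {
    "experimental": "IE",    # Instrument Element
    "computational": "CE",   # Computing Element
    "storage": "SE",         # Storage Element (assumption, see lead-in)
}


def route(tasks):
    """Group task names by target service, preserving submission order."""
    routed = {"IE": [], "CE": [], "SE": []}
    for name, kind in tasks:
        routed[SERVICE_FOR_KIND[kind]].append(name)
    return routed


# Hypothetical decomposed workflow: (task name, task kind) pairs.
plan = route([
    ("acquire", "experimental"),
    ("process", "computational"),
    ("archive", "storage"),
    ("fit", "computational"),
])
```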

Finally, the WM needs grid infrastructure information, which can be taken from the grid monitoring system. This information will be integrated with the active workflow and presented to the user.

The general concept of the workflow is presented in Fig. 1.

The concept of the WfMS allows the process of an experiment to be defined in any way, from pre-processing, through executing the experiment, to post-processing and visualization tasks. Defining the measurement scenario saves a lot of time during computation: the user does not have to wait for a given process stage to finish before submitting the next one, because this is done automatically. Thanks to the Workflow Editor, the user can easily define and submit the workflow and monitor the progress of its execution.

Architecture details

The Workflow Editor is launched by the user from the Virtual Control Room interface. The current session_id is sent to the Workflow Editor and, additionally, a package consisting of the session_id and user_id is sent to the Workflow Manager module.

We need to draw a distinction between a workflow and a scenario. In our implementation, a scenario consists of information about VO-specific applications, possible connections between applications, application parameters (their ranges, default values, step changes, etc.) and many other interface specification data. A workflow is a specific instance of a scenario, prepared and submitted by the user.
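The distinction can be illustrated with a small sketch in which the scenario holds parameter specifications (range, default value, step) and a workflow node is one validated set of user values. The `ParameterSpec` class, the `instantiate` function and the example application and parameter are hypothetical; the actual scenario format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterSpec:
    """Scenario-level description of one application parameter."""
    minimum: float
    maximum: float
    default: float
    step: float

    def accepts(self, value):
        # The value must lie in the range and sit on the step grid.
        if not self.minimum <= value <= self.maximum:
            return False
        steps = (value - self.minimum) / self.step
        return abs(steps - round(steps)) < 1e-9


# Hypothetical scenario: one application with one parameter.
SCENARIO = {
    "fourier_transform": {"zero_fill": ParameterSpec(0, 4, 1, 1)},
}


def instantiate(application, user_values):
    """Build the parameter set of one workflow node from the scenario."""
    specs = SCENARIO[application]
    unknown = set(user_values) - set(specs)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    resolved = {}
    for name, spec in specs.items():
        value = user_values.get(name, spec.default)   # fall back to the default
        if not spec.accepts(value):
            raise ValueError(f"{name}={value} is outside the scenario limits")
        resolved[name] = value
    return resolved
```

In this sketch the scenario is shared and read-only, while each call to `instantiate` yields the concrete values that would travel with a submitted workflow.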

The Workflow Manager is the module responsible for receiving the workflow prepared in the WE and launching a dedicated instance of the Workflow Updater (WU) to service it. Moreover, the WM provides all necessary information to the WE. All requests related to workflow submission, job monitoring, workflow changes and scenario information are sent and received via the WM. This solution limits the number of open communication ports, which improves security.

The Workflow Updater is responsible for the execution of a specific workflow. In the first stage, the workflow is decomposed into single jobs. Next, a job description is prepared for each job according to the execution graph and based on the application facade and the provided parameters. This description is forwarded to the Middleware Access Object module, which uses the current facade interface to submit the job to the gLite service.
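The decomposition step can be sketched as a topological ordering of the execution graph followed by the construction of one job description per node. Everything below is an assumption made for illustration: the facade table, the dictionary-based job description and the `RecordingMiddleware` stand-in are hypothetical, and no gLite interface is called.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical application facades: application name -> executable to run.
FACADES = {
    "spectrometer": "acquire.sh",
    "fourier_transform": "ft.sh",
    "store_results": "store.sh",
}


def decompose(nodes, edges):
    """Return node names in an order that respects every edge."""
    predecessors = {name: set() for name in nodes}
    for source, target in edges:
        predecessors[target].add(source)
    return list(TopologicalSorter(predecessors).static_order())


def describe_job(name, node):
    """Build a job description from the facade and the node parameters."""
    arguments = [f"--{key}={value}" for key, value in sorted(node["parameters"].items())]
    return {"job": name, "executable": FACADES[node["application"]], "arguments": arguments}


class RecordingMiddleware:
    """Stand-in for the Middleware Access Object: records instead of submitting."""

    def __init__(self):
        self.submitted = []

    def submit(self, description):
        self.submitted.append(description)


def run(nodes, edges, middleware):
    for name in decompose(nodes, edges):
        middleware.submit(describe_job(name, nodes[name]))


# Hypothetical linear workflow: acquire -> process -> archive.
nodes = {
    "acquire": {"application": "spectrometer", "parameters": {"scans": 16}},
    "process": {"application": "fourier_transform", "parameters": {}},
    "archive": {"application": "store_results", "parameters": {}},
}
edges = [("acquire", "process"), ("process", "archive")]
middleware = RecordingMiddleware()
run(nodes, edges, middleware)
```

This sketch forwards the descriptions one after another without waiting; a real updater would presumably wait for a job's predecessors to finish before submitting it.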

Fig. 2: Workflow architecture

The workflow database is functionally divided into three parts:

User database – user info, user VO, workflow relations, etc.
Workflow database – information about currently executed workflows, their status, etc.
Scenario database – information about applications, possible connections, parameters, etc.

The workflow architecture is presented in Fig. 2.

Source code

Binary data

Documentation and manuals

DORII project receives funding from the EC's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° RI-211693.