Comparative analysis of scientific workflow management systems
Mikhail L. Voskoboinikov, Alexander G. Feoktistov
Matrosov Institute for System Dynamics and Control Theory of SB RAS
The rapid evolution of parallel and distributed computing systems, telecommunication technologies, and cloud platforms has enabled the development and use of scientific applications for preparing and conducting large-scale experiments with large amounts of data. Such applications often implement a complex problem-solving scheme based on the integrated execution of processes for data transfer, processing and analysis, resource-intensive computation, and decision-making. At the same time, the mathematical models and software of an application may be developed by different groups of specialists from different organizations and may target heterogeneous computing resources. This calls for advanced tools for the design, implementation, deployment, and execution of scientific workflows within a single distributed computing environment, ultimately integrating algorithmic knowledge, the software and hardware used, data, and various services. Today, such tools are usually workflow management systems. In this context, the paper discusses the current state of well-known workflow management systems and the problems associated with the development and use of scientific workflows in different computing environments. Problems in the development and use of such systems that are not yet fully solved are highlighted. In particular, we point out the need to take into account subject domain specifics, computation scaling, the demand for service-oriented applications, and the efficient use of heterogeneous distributed environments that integrate high-performance user resources, cluster resources of shared-use centers, Grid systems, and cloud platforms.
distributed computing, scientific workflows, workflow management systems