Application of data mining methods to study the properties of supercomputing applications
Denis I. Shaikhislamov, Vadim V. Voevodin
Research Computing Center, Lomonosov Moscow State University
Efficient use of supercomputer resources requires constant analysis of various aspects of the quality of modern high-performance systems. One of the most important aspects is the execution efficiency of the parallel applications running on a supercomputer. To study this aspect, it is often useful to know how similar different applications are to each other. Previously, we proposed two approaches to comparing applications: one based on static information about executable files, and one based on application behavior during execution. In this paper, we show two practical ways of applying these approaches: clustering, and predicting metrics that assess the quality of supercomputer resource usage. Using clustering, we show how abnormal groups of job launches can be detected, for example, within the entire flow of supercomputing applications or within the launches of a single user. Using metric prediction, we show how statistics on the efficiency of user applications can be collected while minimizing the impact on running applications. These methods were successfully tested on the petaflop supercomputer Lomonosov-2.
high performance computing, application efficiency, supercomputing center, abnormal launches, data mining