Estimation of stochastic event flow parameters using machine learning methods
- Daria D. Salimzyanova, National Research Tomsk State University (Tomsk, Russia)
- Ekaterina Yu. Lisovskaya, National Research Tomsk State University (Tomsk, Russia)
- Sergey A. Samoilov, National Research Tomsk State University (Tomsk, Russia)
This paper addresses the problem of estimating the parameters of stochastic event flows from sample data using machine learning methods. Event flows, characterized by random intervals between event occurrences, are widely used in queuing theory and in the modeling of network traffic, telecommunications, and computing systems. Accurate estimation of flow parameters is crucial for subsequent analysis, forecasting, and load management in systems with uncertain input information. As training data for the models, we used event arrival times from two types of flows: a Poisson flow (with inter-arrival times following the exponential distribution) and a renewal process (with inter-arrival times following one of twelve probability distributions: gamma, hyperexponential, lognormal, uniform, inverse gamma, Weibull, Pareto, Lévy, Fisher, Fréchet, Lomax, and Burr XII). These distributions were selected for their diverse statistical properties (presence or absence of moments, asymmetry, heavy tails), which makes it possible to cover a broad range of practical scenarios. To solve the parameter estimation task, we employed fully connected neural networks and the CatBoost implementation of the gradient boosting algorithm. As input features for the models, we used the inter-arrival times and their numerical characteristics: mean, standard deviation, variance, coefficient of variation, and quantiles at several levels. Model performance was evaluated with standard regression metrics: mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²). The study also assessed the importance of the features used in training, using the built-in interpretation tools of gradient boosting, which quantify each feature's contribution to the parameter estimates.
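As a minimal illustration of the input features and evaluation metrics named in the abstract, the following Python sketch simulates a Poisson flow, computes summary statistics of its inter-arrival times, and scores a set of estimates with MAE, RMSE, and R². It is a sketch under stated assumptions, not the authors' pipeline: the function names, the quantile levels, the nearest-rank quantile rule, and the 1/mean baseline estimator are all hypothetical choices made here for illustration, and the baseline stands in for the neural network and CatBoost models, which are not reproduced.

```python
import math
import random
import statistics


def simulate_poisson_flow(rate, n_events, rng):
    """Return arrival times of a Poisson flow with the given rate."""
    t = 0.0
    arrivals = []
    for _ in range(n_events):
        t += rng.expovariate(rate)  # exponential inter-arrival time
        arrivals.append(t)
    return arrivals


def interarrival_features(arrivals, quantile_levels=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Summary statistics of inter-arrival times (assumed feature set)."""
    # Gaps between consecutive arrivals; the first gap is measured from t = 0.
    gaps = [b - a for a, b in zip([0.0] + arrivals[:-1], arrivals)]
    mean = statistics.fmean(gaps)
    std = statistics.pstdev(gaps)
    features = {"mean": mean, "std": std, "var": std ** 2, "cv": std / mean}
    ordered = sorted(gaps)
    for q in quantile_levels:
        # Nearest-rank empirical quantile (an assumption; other rules exist).
        idx = min(len(ordered) - 1, int(q * len(ordered)))
        features[f"q{int(q * 100)}"] = ordered[idx]
    return features


def regression_metrics(y_true, y_pred):
    """MAE, RMSE and R^2 for paired true and estimated parameter values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    y_bar = sum(y_true) / n
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {"MAE": mae, "RMSE": rmse, "R2": 1.0 - ss_res / ss_tot}


if __name__ == "__main__":
    rng = random.Random(42)
    true_rates = [0.5, 1.0, 2.0, 4.0]
    estimates = []
    for rate in true_rates:
        feats = interarrival_features(simulate_poisson_flow(rate, 2000, rng))
        # Baseline estimate of the rate, not a trained model.
        estimates.append(1.0 / feats["mean"])
    print(regression_metrics(true_rates, estimates))
```

For an exponential inter-arrival distribution the coefficient of variation is close to 1, so the `cv` feature alone already separates a Poisson flow from many of the heavier-tailed renewal processes listed above; a trained model would consume the full feature dictionary rather than the mean only.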
traffic identification, network traffic, parameter estimation, gradient boosting
2025-12-01