On the software implementation of the Monte Carlo method for estimating confidence bands of linear regression
Aleksandr V. Stepanov, D.I. Mendeleev Institute for Metrology (VNIIM), St. Petersburg, Russia
The paper addresses the problem of calculating uncertainty bands for linear regression with correlated input data. To estimate confidence bands, the generalized least squares method is applied. To determine their boundaries, a coverage factor is introduced; when multiplied by the standard uncertainty of the regression at specific points, this factor yields the required limits. The relevance of this research stems from the fact that standard methods for constructing confidence intervals, based on the assumption of independent errors, lead to a systematic underestimation of the uncertainty band width in the presence of autocorrelation. This, in turn, creates a false impression of forecast accuracy and may result in erroneous statistical conclusions. To construct confidence bands correctly, the structure of the temporal dependence of errors must be taken into account. This study considers the following models of correlated noise: autoregressive processes with exponential correlation decay, and colored noise characterized by power-law decay and long-term memory.
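The construction described above (a generalized least squares fit, followed by a band equal to the coverage factor times the standard uncertainty of the regression at each point) can be sketched as follows. This is a minimal illustration, not the paper's software: the function names `ar1_covariance`, `gls_line` and `band` are hypothetical, the error covariance is assumed known and of first-order autoregressive form with correlation `rho ** |i - j|`, and the coverage factor `k` is simply passed in.

```python
import numpy as np


def ar1_covariance(n, rho, sigma=1.0):
    """Assumed noise model: stationary AR(1), cov[i, j] = sigma^2 * rho^|i-j|."""
    idx = np.arange(n)
    return sigma**2 * rho ** np.abs(idx[:, None] - idx[None, :])


def gls_line(t, y, cov):
    """GLS fit of y = a + b*t; returns (a, b) and the 2x2 parameter covariance."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    W = np.linalg.solve(cov, X)            # cov^{-1} X
    cov_beta = np.linalg.inv(X.T @ W)      # (X^T cov^{-1} X)^{-1}
    beta = cov_beta @ (W.T @ y)            # W.T @ y = X^T cov^{-1} y (cov symmetric)
    return beta, cov_beta


def band(t_eval, beta, cov_beta, k):
    """Band limits fit -/+ k*u, where u is the standard uncertainty of the fit."""
    t_eval = np.asarray(t_eval, dtype=float)
    X0 = np.column_stack([np.ones_like(t_eval), t_eval])
    fit = X0 @ beta
    # u_i^2 = x_i^T cov_beta x_i for each evaluation point x_i = (1, t_i)
    u = np.sqrt(np.einsum("ij,jk,ik->i", X0, cov_beta, X0))
    return fit - k * u, fit + k * u, u


# Illustrative usage with synthetic data (all values here are made up).
rng = np.random.default_rng(0)
t = np.arange(20.0)
cov = ar1_covariance(len(t), rho=0.6, sigma=0.5)
y = 1.0 + 0.3 * t + np.linalg.cholesky(cov) @ rng.standard_normal(len(t))
beta, cov_beta = gls_line(t, y, cov)
lower, upper, u = band(np.linspace(0.0, 30.0, 61), beta, cov_beta, k=2.0)
```

The uncertainty `u` is smallest near the centre of the training interval and grows towards the forecast region, which gives the band its hyperbolic shape.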
Unlike the classical case of independent errors, where the coverage factor corresponds to a quantile of the normal distribution, no analytical expression exists for this factor in the presence of correlation. The value of the factor directly depends on the structure of the error covariance matrix, the training sample size, and the forecasting horizon. To determine it, the paper employs a numerical Monte Carlo method combined with an iterative bisection procedure, which allows finding the coverage factor with a specified accuracy.
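The combination of Monte Carlo simulation and bisection described above can be sketched in a few lines. The sketch rests on assumptions that are not taken from the paper: the covariance matrix is treated as known, the noise is AR(1), and the coverage criterion is that the true line lies inside the band at every point of an evaluation grid that extends past the training interval. The names `max_deviation_stats` and `coverage_factor` are hypothetical.

```python
import numpy as np


def ar1_covariance(n, rho):
    """Assumed noise model: unit-variance AR(1), cov[i, j] = rho^|i-j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def max_deviation_stats(t, t_eval, cov, n_sim=2000, seed=0):
    """For each simulated data set, the largest |fit - truth| / u over t_eval."""
    rng = np.random.default_rng(seed)
    t = np.asarray(t, dtype=float)
    t_eval = np.asarray(t_eval, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    X0 = np.column_stack([np.ones_like(t_eval), t_eval])
    W = np.linalg.solve(cov, X)                        # cov^{-1} X
    cov_beta = np.linalg.inv(X.T @ W)                  # GLS parameter covariance
    u = np.sqrt(np.einsum("ij,jk,ik->i", X0, cov_beta, X0))
    chol = np.linalg.cholesky(cov)
    noise = rng.standard_normal((n_sim, len(t))) @ chol.T   # rows ~ N(0, cov)
    # The true line is taken as zero without loss of generality, so the GLS
    # estimate computed from pure noise is the estimation error itself.
    beta_err = noise @ W @ cov_beta                    # shape (n_sim, 2)
    dev = np.abs(beta_err @ X0.T) / u                  # shape (n_sim, len(t_eval))
    return dev.max(axis=1)


def coverage_factor(stats, p=0.95, tol=1e-3):
    """Bisection on k until the empirical coverage mean(stats <= k) reaches p."""
    lo, hi = 0.0, float(stats.max())
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.mean(stats <= mid) < p:
            lo = mid        # band too narrow: coverage below target
        else:
            hi = mid        # band wide enough: try a smaller factor
    return 0.5 * (lo + hi)


# Illustrative usage: 30 training points, forecast horizon up to t = 40.
t = np.arange(30.0)
t_eval = np.linspace(0.0, 40.0, 81)
stats = max_deviation_stats(t, t_eval, ar1_covariance(len(t), rho=0.7))
k = coverage_factor(stats, p=0.95)
```

Because the simulated statistics are computed once and reused, each bisection step only costs one pass over the stored values. Changing the covariance matrix, the number of training points, or the extent of `t_eval` changes the resulting factor, in line with the dependence noted above.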
Specialized software has been developed in Python using the NumPy and SciPy libraries. It estimates the hyperbolic confidence bands of a linear regression when the error correlation structure is described by the aforementioned models. Examples of estimating the coverage factor and the regression bands are provided, along with a link to the software implementation. The modular architecture of the program allows it to be extended to other types of correlation structures. The work is of practical use wherever the statistical processing of experimental data obtained in solving measurement problems requires correct uncertainty estimation.
linear regression, confidence bands, generalized least squares, correlated noise, colored noise, Monte Carlo method, numerical methods
2026-03-05