Digital profile of regional development: analysis of grant projects using LDA and BERTopic
- Tatiana V. Mitrofanova, Chuvash State University named after I.N. Ulyanov (Cheboksary, Russia)
- Anastasia V. Khristoforova, Chuvash State University named after I.N. Ulyanov (Cheboksary, Russia)
Abstract. This article proposes an approach to building a digital profile of a region’s socio-economic development
through automated text analysis. Grant project descriptions from the Republic of Mari El (2023–2025) were
used as the key indicator. By combining classical latent Dirichlet allocation (LDA) with the neural BERTopic model,
the study identified and visualized the latent thematic patterns shaping the region’s current development agenda.
The resulting digital profile supports an objective evaluation of civic initiatives, highlights dominant areas such as
social support, human capital development, sports, and education, and assesses their alignment with regional strategic goals.
The study found that the target group "children" forms the core of the grant agenda, around which clusters of educational, rehabilitation, and inclusive projects are organized. LDA modeling identified five stable thematic areas, including support for families with children and the adaptation of people with disabilities. BERTopic, in turn, resolved narrower niche practices, such as inclusive sports, rehabilitation for people with visual impairments, and cultural and leisure programs for children with disabilities, confirming the high semantic sensitivity of the neural approach. Comparing the resulting thematic clusters with the draft Strategy for Socioeconomic Development of the Republic of Mari El revealed both areas where priorities fully overlap (childhood support, inclusion) and strategic imbalances: projects in the creative industries, tourism, and youth work are underrepresented. The proposed methodology shows that machine learning methods open up new opportunities for monitoring and diagnosing the state of regional and municipal systems, giving management teams a data-driven tool for decision-making and for adjusting grant policy. Topic modeling thus translates unstructured text data into objective analysis, turning descriptions of civic initiatives into a reliable tool for identifying real social priorities and a basis for strategic planning.
Keywords: digital regional profile, topic modeling, LDA, BERTopic, data-driven management, grant analysis, regional development, machine learning, natural language processing (NLP), Republic of Mari El
2026-03-05
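
The comparison of thematic clusters with strategic priorities described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the keyword sets, the topic word lists, the `min_shared` threshold, and the function names `coverage` and `underrepresented` are all hypothetical and chosen only for illustration. The priority labels are taken from the abstract. The topic word lists stand in for the top words that an LDA or BERTopic model might return.

```python
# Hypothetical keyword sets for strategic priorities (illustrative, not from the study).
PRIORITY_KEYWORDS = {
    "childhood support": {"children", "family", "school"},
    "inclusion": {"disability", "inclusive", "rehabilitation"},
    "creative industries": {"design", "media", "craft"},
    "tourism": {"tourism", "route", "heritage"},
    "youth work": {"youth", "volunteer", "career"},
}

# Hypothetical top words per topic, standing in for LDA or BERTopic output.
TOPIC_TOP_WORDS = {
    "topic_0": ["children", "family", "support", "school"],
    "topic_1": ["inclusive", "sport", "disability", "training"],
    "topic_2": ["rehabilitation", "vision", "children", "leisure"],
}


def coverage(priority_keywords, topic_top_words, min_shared=2):
    """Map each priority to the topics that share at least min_shared keywords with it."""
    result = {}
    for priority, keywords in priority_keywords.items():
        result[priority] = [
            topic
            for topic, words in topic_top_words.items()
            if len(keywords & set(words)) >= min_shared
        ]
    return result


def underrepresented(cov):
    """Return, in sorted order, the priorities that no topic matched."""
    return sorted(priority for priority, topics in cov.items() if not topics)


cov = coverage(PRIORITY_KEYWORDS, TOPIC_TOP_WORDS)
gaps = underrepresented(cov)
print(cov)
print(gaps)
```

In practice, plain keyword overlap is a crude matching rule. Embedding similarity or expert labeling could take its place, and the threshold would need tuning against the actual strategy text.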