Data mining when constructing a knowledge graph of a multidisciplinary journal

Olga M. Ataeva, Ludmila V. Massel, Vladimir A. Serebryakov, Natalia P. Tuchkova

FRC CSC RAS, Melentiev Energy Systems Institute SB RAS

The paper explores the thematic diversity of an interdisciplinary journal. The aim of the research is to build a knowledge graph of the journal for the thematic presentation and systematization of its electronic archive and new publications. The source data are journal articles on information and mathematical technologies in science and management, that is, interdisciplinary research. The texts are systematized using vector analysis methods. During thematic analysis of the journal's content, a division into thematic headings is proposed, and links are established between the headings and articles and the corresponding descriptions of the specialties of the Higher Attestation Commission. Topics are analyzed first by exploratory analysis of the source texts and then by data mining methods. The resulting division is submitted to the journal's experts, who then decide on forming a thematic heading and including specialties of the Higher Attestation Commission in it. The journal articles are integrated into the LibMeta semantic library; in the process, the library's ontology is completed and the journal's ontology is formed, and the journal's knowledge graph is built on this basis. A procedure is proposed for navigating the journal's content using the knowledge graph in the LibMeta semantic library, which can serve as the basis for information support of scientific research and for creating a digital assistant in an interdisciplinary subject area. The examples use the content of a specific journal, but the proposed technology can be extended to other journals, since most journals that belong to several specialties of the Higher Attestation Commission naturally cover several disciplines.
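As an illustration only, linking articles to specialty descriptions by vector similarity could be sketched as follows. The abstract does not say which vectorization or similarity measure is used, so TF-IDF weighting with cosine similarity is an assumption here, and the article texts, specialty labels, and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency (an assumed weighting scheme).
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_to_specialties(articles, specialties):
    """Map each article to its most similar specialty description."""
    names = list(specialties)
    texts = list(articles.values()) + [specialties[n] for n in names]
    vectors = tfidf_vectors(texts)
    article_vecs = vectors[: len(articles)]
    specialty_vecs = vectors[len(articles):]
    links = {}
    for title, vec in zip(articles, article_vecs):
        scores = [cosine(vec, s) for s in specialty_vecs]
        best = max(range(len(names)), key=scores.__getitem__)
        links[title] = (names[best], scores[best])
    return links


# Hypothetical data: not taken from the journal or from the official
# specialty list of the Higher Attestation Commission.
articles = {
    "A1": "ontology based knowledge graph for semantic library navigation",
    "A2": "mathematical modeling of energy systems with numerical methods",
}
specialties = {
    "S-informatics": "informatics knowledge representation ontology semantic technologies",
    "S-modeling": "mathematical modeling numerical methods and program complexes",
}

links = link_to_specialties(articles, specialties)
for title, (name, score) in links.items():
    print(title, "->", name, round(score, 3))
```

In the workflow described above, such automatically proposed links would be only a starting point: the journal's experts make the final decision on headings and specialties.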

knowledge graph, semantic library, ontology completion, clustering of scientific articles, text summarization
