Klyachin V.A., Khizhnyakova E.V. On the Possibility of Using the Wiener Index to Calculate Features of Natural Language Texts


https://doi.org/10.15688/mpcm.jvolsu.2025.3.3

Vladimir A. Klyachin
Doctor of Sciences (Physics and Mathematics), Head of the Department of Computer Sciences and Experimental Mathematics, Volgograd State University
https://orcid.org/0000-0003-1922-7849
Prosp. Universitetsky, 100, 400062 Volgograd, Russian Federation

Ekaterina V. Khizhnyakova
Senior Lecturer, Department of Computer Sciences and Experimental Mathematics, Volgograd State University
https://orcid.org/0000-0002-7914-9988
Prosp. Universitetsky, 100, 400062 Volgograd, Russian Federation

Abstract. The article demonstrates the application of the Wiener index to a problem in natural language text processing. The Wiener index is defined as the sum of all shortest-path distances in a weighted connected graph; this value characterizes the complexity of the graph. Two modifications of the index are introduced. In the first, the usual Wiener index of a connected graph with N vertices is divided by (N − 1)². In the second, the Wiener index of a Euclidean graph is divided by the sum of the distances between all pairs of distinct vertices. For text processing, the article introduces a graph of text sentences, in which an edge joins a pair of words that occur together in some sentence of the text. Word embeddings are used to calculate the Wiener index of the Euclidean graph, and the article briefly describes T. Mikolov's algorithm for learning word embeddings. The article also provides an algorithm for the approximate construction of a spanning tree with minimal Wiener index. The algorithm works by minimizing the new term that appears in the index each time an edge is added to the already constructed part of the tree. To identify uninformative text, four features are calculated from the Wiener index and its modifications. Classification is carried out using standard machine learning methods.

Key words: graph, Wiener index, spanning tree, word embedding, machine learning.
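
The first modification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: words are split on whitespace, every edge has weight 1, each unordered pair of vertices is counted once in the sum, and the function names (sentence_graph, wiener_index, normalized_wiener_index) are hypothetical.

```python
from collections import deque
from itertools import combinations


def sentence_graph(sentences):
    """Build an undirected graph whose vertices are words; two words are
    joined by an edge if they occur together in some sentence."""
    adj = {}
    for sentence in sentences:
        # Naive lowercase whitespace tokenization (an assumption of this sketch).
        words = set(sentence.lower().split())
        for w in words:
            adj.setdefault(w, set())
        for u, v in combinations(words, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def wiener_index(adj):
    """Sum of shortest-path lengths over unordered vertex pairs.
    Every edge has weight 1 (assumption), so BFS gives the distances."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("graph is not connected")
        total += sum(dist.values())
    return total // 2  # each unordered pair was counted from both ends


def normalized_wiener_index(adj):
    """Wiener index divided by (N - 1)^2, where N is the number of vertices."""
    n = len(adj)
    if n < 2:
        raise ValueError("need at least two vertices")
    return wiener_index(adj) / (n - 1) ** 2


if __name__ == "__main__":
    graph = sentence_graph(["the cat", "cat sat"])
    print(wiener_index(graph), normalized_wiener_index(graph))
```

For the two toy sentences above, the graph is the path the, cat, sat. Its pairwise distances are 1, 1 and 2, so the Wiener index is 4, and dividing by (3 − 1)² gives 1.0.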

On the Possibility of Using the Wiener Index to Calculate Features of Natural Language Texts by Klyachin V.A., Khizhnyakova E.V. is licensed under a Creative Commons Attribution 4.0 International License.

Citation in English: Mathematical Physics and Computer Simulation. Vol. 28 No. 3 2025, pp. 24-36

Attachments:
klyachin.pdf URL: https://mp.jvolsu.com/index.php/en/component/attachments/download/1252
