
Abstract

The subject of the article is the automated analysis of server logs from software systems using Natural Language Processing (NLP) methods. The goal is to develop an approach that improves the efficiency of log analysis by applying NLP algorithms and text compression methods. The tasks are: to study the features of logs as a source of technical information; to analyze modern methods of text data processing, including statistical, vector-based, and neural network approaches; to justify the use of NLP algorithms for automating log analysis; to substantiate the use of text compression methods to reduce data volume; and to develop the concept and architecture of an automated log analysis system. The methods used include NLP techniques such as TF-IDF, Word2Vec, FastText, and transformer-based models; text preprocessing methods; log compression and log template extraction approaches; and information retrieval methods applied to text corpora. The following results were obtained. A concept of an automated server log analysis system based on NLP methods and text compression algorithms was developed. A generalized system architecture was proposed, comprising modules for log collection, storage, filtering, preprocessing, text compression, NLP analysis, incident representation, and solution search. An operating algorithm for the system was developed that processes logs step by step while accounting for the textual nature and large volume of the data. It was established that NLP methods improve the accuracy of error detection and incident classification, while text compression reduces the computational load and increases system performance. Conclusions. The proposed approach improves the efficiency of software diagnostics, reduces the time required for error detection and resolution, and decreases the workload on developers.
The scientific novelty of the results is as follows: an approach to automated log analysis that combines NLP methods and text compression techniques was developed; a generalized architecture of a log analysis system that accounts for the specifics of unstructured textual data was proposed; existing log processing methods were further developed by integrating modern NLP models with data volume optimization stages, which improves analysis efficiency under large-scale information flows.
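
To make the processing chain described above more concrete, the following is a minimal sketch of three of its stages: preprocessing into log templates, volume reduction, and solution search. It is not the authors' implementation. The masking rules, the function names (`to_template`, `compress`, `best_match`), and the sample log lines are all hypothetical, and template deduplication stands in for the text compression stage, whose actual algorithm the abstract does not specify. TF-IDF is used for the search step only because it is the simplest of the methods the abstract lists.

```python
import math
import re
from collections import Counter

# Hypothetical masking rules: variable fields are replaced by placeholders
# so that lines differing only in parameters collapse into one template.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(line):
    """Preprocessing: turn a raw log line into a parameter-free template."""
    for pattern, placeholder in MASKS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def compress(lines):
    """Volume reduction (assumed stand-in): keep each template once with a count."""
    return Counter(to_template(line) for line in lines)


def tokenize(text):
    """Lowercase word tokens; placeholders such as <ip> are kept as tokens."""
    return re.findall(r"[a-z_<>]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) with a smoothed IDF."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        term_freq = Counter(toks)
        vectors.append({t: term_freq[t] * idf[t] for t in term_freq})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(query, known_incidents):
    """Solution search: index of the known incident most similar to the query."""
    vectors = tfidf_vectors(known_incidents + [query])
    query_vec = vectors[-1]
    scores = [cosine(query_vec, vec) for vec in vectors[:-1]]
    return max(range(len(known_incidents)), key=scores.__getitem__)


# Illustrative data only; these lines do not come from the article.
logs = [
    "Connection to 10.0.0.5 timed out after 30 seconds",
    "Connection to 10.0.0.7 timed out after 45 seconds",
    "Disk usage at 91 percent on volume 2",
]
known = ["connection timed out to remote host", "disk usage high on volume"]

templates = compress(logs)
top_template, _ = templates.most_common(1)[0]
print(top_template, "->", known[best_match(top_template, known)])
```

In this sketch the three raw lines reduce to two templates before any NLP step runs, which is the sense in which a volume reduction stage can lower the load on later analysis. A vector-based or transformer-based model could replace `tfidf_vectors` without changing the surrounding stages.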


Keywords

methods analysis text compression system
