Abstract
<jats:p>The paper proposes a method for selecting a communication channel in cognitive radio, based on information about the current state of all available communication channels, using the mathematical apparatus of reinforcement learning. The method consists of formalizing the channel-selection problem in "environment-agent" terms and training agents with the REINFORCE, SARSA and A2C algorithms. The memory cost of solving the channel-selection problem with classical methods is calculated: for the tabular Q-learning algorithm, the estimate is 4×2<jats:sup>2n</jats:sup> bytes for a random state of the channels (busy/free) and 4×n<jats:sup>2</jats:sup> bytes when exactly one channel is free at each step. Two formalizations of the agent's reward are presented for the problem solved with reinforcement learning: one for the trivial case (binary availability/unavailability of the frequency channel) and one for a more complex case that takes into account the power (in dB) in the selected communication channel. The first formalization is restricted to the case where exactly one of the available communication channels is free at each iteration; the second formalization imposes no such restriction and is more universal. Computational experiments are presented for the corresponding reward formalizations, with agents trained using the SARSA and A2C algorithms. On average, error-free solutions are achieved after 8,000 training episodes in a model problem for the various agent implementations. The REINFORCE algorithm does not reach error-free solutions, although the proposed reward formulation improves its training efficiency.
Theoretical estimates of the computational complexity of the considered methods are provided, which are consistent with the computational experiments.</jats:p>