Constrained Contextual Bandit Learning for Adaptive Radar Waveform Selection

Abstract

A sequential decision process in which an adaptive radar system repeatedly interacts with a finite-state target channel is studied. The radar is capable of passively sensing the spectrum at regular intervals, which provides side information for the waveform selection process. The radar transmitter uses the sequence of spectrum observations, as well as feedback from a collocated receiver, to select waveforms that accurately estimate target parameters. It is shown that the waveform selection problem can be effectively addressed using a linear contextual bandit formulation in a manner that is both computationally feasible and sample efficient. Stochastic and adversarial linear contextual bandit models are introduced, allowing the radar to achieve effective performance in broad classes of physical environments. Simulations in a radar-communication coexistence scenario, as well as in an adversarial radar-jammer scenario, demonstrate that the proposed formulation provides a substantial improvement in target detection performance when Thompson sampling and EXP3 algorithms are used to drive the waveform selection process. Further, it is shown that the harmful impacts of pulse-agile behavior on coherently processed radar data can be mitigated by adopting a time-varying constraint on the radar’s waveform catalog.
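The linear contextual bandit formulation with Thompson sampling and a time-varying catalog constraint can be illustrated with a minimal sketch. This is not the authors' implementation. The class `LinearThompsonSampler`, the helper `allowed_catalog`, the hop-limited constraint, the one-hot interference context, and the 0/1 collision reward are all hypothetical choices made for illustration, assuming a Gaussian prior and Gaussian reward noise.

```python
import numpy as np

class LinearThompsonSampler:
    """Per-waveform Bayesian linear regression with posterior sampling (illustrative)."""

    def __init__(self, n_waveforms, context_dim, noise_var=0.25, prior_var=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise_var = noise_var  # assumed reward-noise variance
        # Gaussian posterior per waveform, stored as precision matrix and weighted reward sum.
        self.precision = np.stack([np.eye(context_dim) / prior_var for _ in range(n_waveforms)])
        self.b = np.zeros((n_waveforms, context_dim))

    def select(self, context, allowed):
        """Sample a parameter vector for each allowed waveform and pick the best score."""
        best, best_score = None, -np.inf
        for k in allowed:
            cov = np.linalg.inv(self.precision[k])
            cov = (cov + cov.T) / 2.0  # guard against numerical asymmetry
            theta = self.rng.multivariate_normal(cov @ self.b[k], cov)
            score = float(theta @ context)
            if score > best_score:
                best, best_score = k, score
        return best

    def update(self, k, context, reward):
        """Conjugate Gaussian update for the waveform that was transmitted."""
        self.precision[k] += np.outer(context, context) / self.noise_var
        self.b[k] += reward * context / self.noise_var

def allowed_catalog(prev, n_waveforms, max_hop):
    """Hypothetical time-varying constraint: stay within max_hop of the last waveform."""
    return [k for k in range(n_waveforms) if abs(k - prev) <= max_hop]

def run_demo(n_steps=200, n_bands=5, max_hop=1, seed=0):
    """Toy coexistence scenario: reward 1 if the chosen band avoids the interferer."""
    env = np.random.default_rng(seed)
    agent = LinearThompsonSampler(n_bands, n_bands, seed=seed)
    prev, choices, allowed_sets, rewards = 0, [], [], []
    for _ in range(n_steps):
        interferer = int(env.integers(n_bands))
        context = np.zeros(n_bands)
        context[interferer] = 1.0  # one-hot spectrum observation (assumed side information)
        allowed = allowed_catalog(prev, n_bands, max_hop)
        k = agent.select(context, allowed)
        reward = 0.0 if k == interferer else 1.0
        agent.update(k, context, reward)
        choices.append(k)
        allowed_sets.append(allowed)
        rewards.append(reward)
        prev = k
    return choices, allowed_sets, rewards
```

Calling `run_demo()` returns the chosen waveform indices, the constrained catalog at each step, and the rewards. Shrinking `max_hop` tightens the constraint and limits pulse-to-pulse agility, at the cost of fewer options for avoiding interference.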

Similar resources

Q-Learning-Based Adaptive Waveform Selection in Cognitive Radar

Cognitive radar is a new radar system framework recently proposed by Simon Haykin. Adaptive waveform selection is an important problem for the intelligent transmitter in cognitive radar. In this paper, the problem of adaptive waveform selection is modeled as a stochastic dynamic programming model. Then Q-learning is used to solve it. Q-learning can solve the problems that we do not know the explici...

Adaptive Representation Selection in Contextual Bandit with Unlabeled History

We consider an extension of the contextual bandit setting, motivated by several practical applications, where an unlabeled history of contexts can become available for pre-training before the online decision-making begins. We propose an approach for improving the performance of contextual bandits in such a setting, via adaptive, dynamic representation learning, which combines offline pre-training o...

Research on Adaptive Waveform Selection Algorithm in Cognitive Radar

Cognitive radar is a new radar system framework recently proposed by Simon Haykin. Adaptive waveform selection is an important problem for the intelligent transmitter in cognitive radar. In this paper, the problem of adaptive waveform selection is modeled as a stochastic dynamic programming model. Then backward dynamic programming, temporal difference learning and Q-learning are used to solve this ...

Contextual Bandit Algorithms with Supervised Learning Guarantees

We address the problem of competing with any large set of N policies in the nonstochastic bandit setting, where the learner must repeatedly select among K actions but observes only the reward of the chosen action. We present a modification of the Exp4 algorithm of Auer et al. [2], called Exp4.P, which with high probability incurs regret at most O(√(KT ln N)). Such a bound does not hold for Exp4 ...

Contextual Bandit Learning with Predictable Rewards

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on the action and context. We consider this problem under a realizability assumption: there exists a function in a (known) function class, always capable of predicting the expected reward, given the action and context. Unde...

Journal

Journal title: IEEE Transactions on Aerospace and Electronic Systems

Year: 2022

ISSN: 1557-9603, 0018-9251, 2371-9877

DOI: https://doi.org/10.1109/taes.2021.3109110