Search results for: dql technique
Number of results: 611417
With the growth of the Internet, directory software has recently proliferated. The Lightweight Directory Access Protocol (LDAPv3 [18]) is the standard proposed by the Internet Engineering Task Force (IETF [9]) for modelling and querying network directory information, as well as for accessing network directory services. More recently, several extensions have been proposed to the directory model t...
This paper describes a new algorithm, called MDQL, for the solution of multiple objective optimization problems. MDQL is based on a new distributed Q-learning algorithm, called DQL, which is also introduced in this paper. In DQL, a family of independent agents, exploring different options, finds a common policy in a common environment. Information about action goodness is transmitted using traces o...
In this paper, we introduce GERMS, a dataset designed to accelerate progress on active object recognition in the context of human robot interaction. GERMS consists of a collection of videos taken from the point of view of a humanoid robot that receives objects from humans and actively examines them. GERMS provides methods to simulate, evaluate, and compare active object recognition approaches t...
Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs. However, the trade-off between quantization bitwidth and final accuracy is complex and non-convex, which makes it difficult to optimize directly. Minimizing the direct quantization loss (DQL) of the coefficient data is a local optimization method, but previous works often neglect accurate control of the DQL, resulting in a higher ...
This paper develops deep reinforcement learning (DRL) algorithms for optimizing the operation of a home energy system that consists of photovoltaic (PV) panels, a battery storage system, and household appliances. Model-free DRL can efficiently handle the difficulty of modeling and the uncertainty of PV generation. However, the discrete-continuous hybrid action space considered here challenges existing DRL algorithms for either discrete actions o...
Conventional anti-jamming methods mainly focus on preventing single-jammer attacks with an invariant jamming policy or attacks from multiple jammers with similar policies. These methods are ineffective against a jammer following several different policies or against jammers with distinct policies. Therefore, this article proposes a method that can adapt its policy to the current attack. Moreover, for the multiple-jammer scenario, it estimates future occupied channels using the jammers’ in p...
The rise of the new generation of cyber threats demands more sophisticated and intelligent defense solutions equipped with autonomous agents capable of learning to make decisions without the knowledge of human experts. Several reinforcement learning methods (e.g., Markov) for automated network intrusion tasks have been proposed in recent years. In this paper, we introduce a detection method, which combines Q-learnin...
Finding the optimal signal timing strategy is a difficult task for the problem of large-scale traffic signal control (TSC). Multiagent reinforcement learning (MARL) is a promising method to solve this problem. However, there is still room for improvement in extending it to large-scale problems and in modeling the behaviors of other agents for each individual agent. In this article, a new MARL method, called cooperative double Q-learning (Co-DQL), is proposed, which...
With the expansion of the communicative and perceptual capabilities of mobile devices in recent years, the number of complex and computationally demanding applications has also increased, rendering traditional methods of traffic management and resource allocation quite insufficient. Recently, mobile edge computing (MEC) has emerged as a new viable solution to these problems. It can provide additional features at the network edge and allow al...