Search results for: q algorithm
Number of results: 863118
This paper presents a reinforcement learning algorithm for solving infinite horizon Markov Decision Processes under the expected total discounted reward criterion when both the state and action spaces are continuous. This algorithm is based on Watkins’ Q-learning, but uses Nadaraya-Watson kernel smoothing to generalize knowledge to unvisited states. As expected, continuity conditions must be im...
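The Nadaraya-Watson smoothing mentioned in this abstract can be illustrated with a minimal sketch. It shows only the generic kernel-weighted average, not the paper's Q-learning algorithm; the one-dimensional state, the Gaussian kernel, the `bandwidth` parameter and all function names are assumptions made for illustration.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Unnormalized Gaussian kernel (an assumed choice; the abstract names no kernel)."""
    return math.exp(-0.5 * u * u)


def nadaraya_watson(x_query: float, xs: list[float], ys: list[float],
                    bandwidth: float) -> float:
    """Estimate a value at x_query as a kernel-weighted average of observed ys.

    xs are visited points (e.g. states), ys are the values observed there.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total
```

With values observed at visited points `xs`, calling `nadaraya_watson(x, xs, ys, bandwidth)` gives a smoothed estimate at an unvisited point `x`; a larger bandwidth averages over more distant samples.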
We propose a new simple and natural algorithm for learning the optimal Q-value function of a discounted-cost Markov Decision Process (MDP) when the transition kernels are unknown. Unlike the classical learning algorithms for MDPs, such as Q-learning and ‘actor-critic’ algorithms, this algorithm doesn’t depend on a stochastic approximation-based method. We show that our algorithm, which we call ...
Let Q(α) and Q(β) be algebraic number fields. We describe a new method to find (if they exist) all isomorphisms, Q(β) → Q(α). The algorithm is particularly efficient if there is only one isomorphism.
This paper proposes a new soft-decoding algorithm for the q-ary image of some q-ary Reed-Solomon codes. The algorithm uses permutations to improve the performance of conventional soft-decision decoding.
We describe an adaptation of the number field sieve to the problem of computing logarithms in a finite field. We conjecture that the running time of the algorithm, when restricted to finite fields of an arbitrary but fixed degree, is $L_q[1/3; (64/9)^{1/3} + o(1)]$, where $q$ is the cardinality of the field, $L_q[s; c] = \exp\left(c (\log q)^{s} (\log\log q)^{1-s}\right)$, and the $o(1)$ is for $q \to \infty$. The number field sieve fact...
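To get a feel for the L-notation in this abstract, the formula can be evaluated numerically. The following is a rough sketch that drops the o(1) term, so it gives only a heuristic order of magnitude and not an actual running time; the function names are hypothetical.

```python
import math


def l_notation(q: int, s: float, c: float) -> float:
    """Evaluate exp(c * (ln q)^s * (ln ln q)^(1 - s)), ignoring the o(1) term."""
    if q < 3:
        # ln ln q must be positive for the fractional power to be real.
        raise ValueError("q must be at least 3")
    ln_q = math.log(q)
    return math.exp(c * ln_q ** s * math.log(ln_q) ** (1 - s))


def conjectured_cost(q: int) -> float:
    """L_q[1/3; (64/9)^(1/3)] with the o(1) term dropped (an assumption)."""
    return l_notation(q, 1 / 3, (64 / 9) ** (1 / 3))
```

The two extremes help interpret the parameter s: with s = 1 the expression reduces to q^c (exponential in log q), and with s = 0 it reduces to (ln q)^c (polynomial in log q), so s = 1/3 lies between them and is subexponential.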
This paper considers a string matching algorithm based on a two-dimensional mesh. It has applications in areas such as string databases, cellular automata and computational biology. The method mainly aims to reduce the time spent on comparisons in string matching by using a mesh-connected network, which achieves constant time for a mismatch in a text string, and we obtained O(¥. -ti...
Quantum searching requires precise knowledge of problem parameters (such as the fraction of target states) for efficient operation. Recently an algorithm has been discovered, referred to as the Phase-π/3 search algorithm, which gets around this limitation. This algorithm can search a database with the fraction of target states equal to 1 − ε so that in q queries it produces a probability of err...
Guiding vehicles to their destination under dynamic traffic conditions is an important topic in the field of intelligent transportation systems (ITS). Nowadays, many complex systems can be controlled using multi-agent systems. Adapting to current conditions is an important feature of the agents. In this research, the formulation of dynamic guidance for vehicles has been investigated based...
We propose two variants of the Q-learning algorithm that (both) use two timescales. One of these updates Q-values of all feasible state-action pairs at each instant while the other updates Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A sketch of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms fo...