Search results for: extension policy
Number of results: 407573
Markov control algorithms that perform smooth, non-greedy updates of the policy have been shown to be very general and versatile, with policy gradient and Expectation Maximisation algorithms being particularly popular. For these algorithms, marginal inference of the reward weighted trajectory distribution is required to perform policy updates. We discuss a new exact inference algorithm for thes...
According to Articles 169 to 179 of the Civil Judgment Enforcement Law, the enforcement of foreign judgments in Iranian law is based on reciprocity. But the extension of international relations in the world today, above all in the commercial domain, requires that enforcement of a foreign judgment be accepted once the conditions for its validity are met on the basis of res judicata. Therefore...
Among well-known managerial practices to protect scarce biological resources, payment for ecosystem services is an efficient and effective policy tool for managing such resources, and its application is attracting growing interest worldwide. This is why, in addition to clarifying the role of such a policy tool to encourage farmers to modify their irrigation system to reduce pressures on water ...
We propose a sequential learning policy for ranking and selection problems, where we use a non-parametric procedure for estimating the value of a policy. Our estimation approach aggregates over a set of kernel functions in order to achieve a more consistent estimator. Each element in the kernel estimation set uses a different bandwidth to achieve better aggregation. The final estimate uses a weigh...
In this paper we present a new way of predicting the performance of a reinforcement learning policy given historical data that may have been generated by a different policy. The ability to evaluate a policy from historical data is important for applications where the deployment of a bad policy can be dangerous or costly. We show empirically that our algorithm produces estimates that often have ...
Solutions to non-cooperative multiagent systems often require achieving a joint policy which is as fair to all parties as possible. There are a variety of methods for determining the fairest such joint policy. One approach, min fairness, finds the policy which maximizes the minimum average reward given to any agent. We focus on an extension, leximin fairness, which breaks ties among candidate p...
Climate change—and, by extension, climate policy—is beset with unknowns and unknowables. This “Reflections” presents an overview of approaches to managing climate uncertainties, in the hopes of providing guidance for current policy decisions as well as future research. We propose the following guidance for policy makers: Treat climate change as a risk management problem; recognize that benefit-...
Attribute-based encryption (ABE) is an extension of traditional public key encryption in which the encryption and decryption phases are based on the user’s attributes. More precisely, we focus on ciphertext-policy ABE (CP-ABE), where the secret key is associated with a set of attributes and the ciphertext is generated with an access policy. It then becomes feasible to decrypt a ciphertext only if one’...
In the classical versions of “Best Choice Problem”, the sequence of offers is a random sample from a single known distribution. We present an extension of this problem in which the sequential offers are random variables but from multiple independent distributions. Each distribution function represents a class of investment or offers. Offers appear without any specified order. The objective is...