Search results for: discounted models

Number of results: 912542

2018
Jayakumar Subramanian Aditya Mahajan

In this paper, we present an online reinforcement learning algorithm, called Renewal Monte Carlo (RMC), for infinite horizon Markov decision processes with a designated start state. RMC is a Monte Carlo algorithm that retains the advantages of Monte Carlo methods, including low bias, simplicity, and ease of implementation, while at the same time circumventing their key drawbacks of high variance a...

Journal: :CoRR 2011
Krishnendu Chatterjee Luca de Alfaro Pritam Roy

Turn-based stochastic games and their important subclass, Markov decision processes (MDPs), provide models for systems with both probabilistic and nondeterministic behaviors. We consider turn-based stochastic games with two classical quantitative objectives: discounted-sum and long-run average objectives. The game models and the quantitative objectives are widely used in probabilistic verification, ...

Journal: :Discrete Applied Mathematics 2019

Journal: :Theoretical Economics 2018

Journal: :Bulletin of the American Mathematical Society 1971

Journal: :North American Actuarial Journal 2021

Journal: :Systems & Control Letters 2015

Journal: :Operations Research 2007
Manel Baucells Rakesh K. Sarin

In this paper, we propose a model of intertemporal choice that explicitly incorporates satiation due to previous consumption in the evaluation of the utility of current consumption. In the discounted utility (DU) model, the utility of consumption is evaluated afresh in each time period. In our model, the utility of current consumption represents an incremental utility from the past level. When ...

Journal: :Math. Oper. Res. 1998
Eilon Solan

We give an alternative proof of a result of Mertens and Parthasarathy, stating that every n-player discounted stochastic game with general setup, and with a norm-continuous transition, has a subgame perfect equilibrium. Author affiliation: Institute of Mathematics and Center for Rationality and Interactive Decision Theory, The Hebrew University, Givat Ram, 91904 Jerusalem, Israel. ...

2017

Recurrent neural network architectures excel at processing sequences by modelling dependencies over different timescales. The recently introduced Recurrent Weighted Average (RWA) unit captures long-term dependencies far better than an LSTM on several challenging tasks. The RWA achieves this by applying attention to each input and computing a weighted average over the full history of its comput...
