Search results for: finite planning horizon
Number of results: 479354
A typical discrete-time sequential decision problem involves a system whose state is assumed to evolve either deterministically or probabilistically over time-periods that are often called stages. This evolution is affected by the decisions a planner makes at the beginning of each stage after observing the system state. The decision maker’s goal then is to optimize some measure of system perfor...
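For a finite number of stages, a problem of this shape is commonly solved by backward induction. The sketch below is only an illustration under stated assumptions: the state and action sets are small and finite, the performance measure is total expected reward, and the names `backward_induction`, `transition`, and `reward` are hypothetical rather than taken from the abstract.

```python
from typing import List, Tuple

# transition[s][a] is a list of (next_state, probability) pairs (assumed layout).
Transition = List[List[List[Tuple[int, float]]]]


def backward_induction(
    n_states: int,
    n_actions: int,
    horizon: int,
    transition: Transition,
    reward: List[List[float]],
):
    """Maximize total expected reward over `horizon` stages.

    Returns (values, policy), where values[t][s] is the optimal expected
    reward-to-go from state s at stage t and policy[t][s] is a maximizing action.
    """
    # Terminal values are zero (an assumption; a salvage value could be used).
    values = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]

    # Work backward from the last stage to the first.
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_action, best_q = 0, float("-inf")
            for a in range(n_actions):
                expected_next = sum(p * values[t + 1][s2] for s2, p in transition[s][a])
                q = reward[s][a] + expected_next
                if q > best_q:
                    best_action, best_q = a, q
            values[t][s] = best_q
            policy[t][s] = best_action
    return values, policy


# Toy deterministic example: action 0 stays put, action 1 switches state.
toy_transition = [
    [[(0, 1.0)], [(1, 1.0)]],  # from state 0
    [[(1, 1.0)], [(0, 1.0)]],  # from state 1
]
toy_reward = [
    [0.0, 1.0],  # state 0: stay pays 0, switch pays 1
    [2.0, 0.0],  # state 1: stay pays 2, switch pays 0
]
toy_values, toy_policy = backward_induction(2, 2, 3, toy_transition, toy_reward)
```

In the toy example the planner moves to state 1 and stays there, since staying in state 1 pays the most per stage.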
We investigate the computability of problems in probabilistic planning and partially observable infinite-horizon Markov decision processes. The undecidability of the string-existence problem for probabilistic finite automata is adapted to show that the following problem of plan existence in probabilistic planning is undecidable: given a probabilistic planning problem, determine whether there ex...
Models for long-term planning often lead to infinite horizon stochastic programs that offer significant challenges for computation. Finite-horizon approximations are often used in these cases but they may also become computationally difficult. In this paper, we directly solve for value functions of infinite horizon stochastic programs. We show that a successive linear approximation method conve...
In this paper, we propose to combine imitation and reinforcement learning via the idea of reward shaping using an oracle. We study the effectiveness of the near-optimal cost-to-go oracle on the planning horizon and demonstrate that the cost-to-go oracle shortens the learner’s planning horizon as a function of its accuracy: a globally optimal oracle can shorten the planning horizon to one, leading t...
For large state-space Markovian Decision Problems, Monte-Carlo planning is one of the few viable approaches to find near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs the algorithm is shown to be consistent and finite sample bounds are derived on the estimation error due to sampling...
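The "bandit ideas" mentioned here can be illustrated with the standard UCB1 selection rule, which picks the action with the highest mean return plus an exploration bonus. This is a minimal sketch and not the full tree search: the function name `ucb1_select`, its argument layout, and the default exploration constant are assumptions made for illustration.

```python
import math
from typing import List


def ucb1_select(
    counts: List[int],
    value_sums: List[float],
    exploration: float = math.sqrt(2),  # assumed constant; tune per problem
) -> int:
    """Return the index of the action chosen by the UCB1 rule.

    counts[a] is how often action a has been tried at this node and
    value_sums[a] is the sum of returns observed after trying it.
    """
    # Try every action once before comparing confidence bounds.
    for action, n in enumerate(counts):
        if n == 0:
            return action

    total = sum(counts)
    scores = [
        value_sums[a] / counts[a]
        + exploration * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    # Pick the action with the largest upper confidence bound.
    return max(range(len(counts)), key=scores.__getitem__)
```

An action that has rarely been tried receives a large bonus, so it can be selected even when its observed mean return is lower than that of a well-explored alternative.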
Incorporating adaptive learning into macroeconomics requires assumptions about how agents incorporate their forecasts into their decision-making. We develop a theory of bounded rationality that we call finite-horizon learning. This approach generalizes the two existing benchmarks in the literature: Euler-equation learning, which assumes that consumption decisions are made to satisfy the one-step...
In this paper we present a Nash equilibrium problem of linear quadratic zero-sum dynamic games for descriptor systems. We assume that the players apply linear feedback in the game. For the game with a finite planning horizon we derive a differential Riccati type equation. For the game with an infinite planning horizon we consider an algebraic Riccati type equation. The connection of the game solutio...
In this paper, we consider a class of nonlinear regulator optimal control problems with an infinite planning horizon. By assuming a specific feedback structure for the controller, the resultant regular problem is reduced to an optimal parameter selection problem. Since the reduced problem is on an infinite planning horizon, we construct a sequence of finite time optimal parameter selection prob...