On Linear Programming for Constrained and Unconstrained Average-Cost Markov Decision Processes with Countable Action Spaces and Strictly Unbounded Costs
Authors
Abstract
We consider the linear programming approach for constrained and unconstrained Markov decision processes (MDPs) under the long-run average-cost criterion, where the class of MDPs in our study have Borel state spaces and discrete, countable action spaces. Under a strict unboundedness condition on the one-stage costs and a recently introduced majorization condition on the state transition stochastic kernel, we study infinite-dimensional linear programs for the average-cost MDPs and prove the absence of a duality gap and other optimality results. Our results do not require a lower-semicontinuous MDP model. Thus, they can be applied to countable action space MDPs where the dynamics and one-stage costs are discontinuous in the state variable. Our proofs make use of the continuity property of Borel measurable functions asserted by Lusin’s theorem.
Similar resources
Exact finite approximations of average-cost countable Markov decision processes
For a countable-state Markov decision process we introduce an embedding which produces a finite-state Markov decision process. The finite-state embedded process has the same optimal cost, and moreover, it has the same dynamics as the original process when restricting to the approximating set. The embedded process can be used as an approximation which, being finite, is more convenient for comput...
Countable State Markov Decision Processes with Unbounded Jump Rates and Discounted Cost: Optimality Equation and Approximations
This paper considers Markov decision processes (MDPs) with unbounded rates, as a function of state. We are especially interested in studying structural properties of optimal policies and the value function. A common method to derive such properties is by value iteration applied to the uniformised MDP. However, due to the unboundedness of the rates, uniformisation is not possible, and so value i...
Approximate Linear Programming for Constrained Partially Observable Markov Decision Processes
In many situations, it is desirable to optimize a sequence of decisions by maximizing a primary objective while respecting some constraints with respect to secondary objectives. Such problems can be naturally modeled as constrained partially observable Markov decision processes (CPOMDPs) when the environment is partially observable. In this work, we describe a technique based on approximate lin...
Controlled Markov Decision Processes with AVaR criteria for unbounded costs
In this paper, we consider the control problem with the Average-Value-at-Risk (AVaR) criteria of the possibly unbounded L1-costs in infinite horizon on a Markov Decision Process (MDP). With a suitable state aggregation and by choosing a priori a global variable s heuristically, we show that there exist optimal policies for the infinite horizon problem for possibly unbounded costs. Mathematics S...
Average Cost Semi-Markov Decision Processes
The Semi-Markov Decision model is considered under the criterion of long-run average cost. A new criterion, which for any policy considers the limit of the expected cost incurred during the first n transitions divided by the expected length of the first n transitions, is considered. Conditions guaranteeing that an optimal stationary (nonrandomized) policy exists are then presented. It is also ...
Journal
Journal title: Mathematics of Operations Research
Year: 2022
ISSN: ['0364-765X', '1526-5471']
DOI: https://doi.org/10.1287/moor.2021.1177