Age-based maintenance under population heterogeneity: Optimal exploration and exploitation

Abstract

We consider a system with a finite lifespan and a single critical component that is subject to random failures. An age-based replacement policy is applied to preventively replace the component before its failure. The components used for replacement come from either a weak population or a strong population, referred to as population heterogeneity. However, the true population type is unknown to the decision maker. By considering that the decision maker has a belief on the probability of having a weak population, we build a partially observable Markov decision process model with the objective of minimizing the total cost over the lifespan of the system. The resulting optimal policy updates the belief variable in a Bayesian fashion by using the data obtained over the course of the lifespan, and it denotes when to execute the preventive replacement. It optimally balances the trade-off between the cost of learning the true population type (via deliberately delaying the preventive replacement time to better learn the population type) and the cost of maintenance activities. By addressing this so-called exploration-exploitation trade-off, we generate insights on the optimal policy and compare its performance with existing heuristic approaches from the literature. We also characterize a lower bound on the optimal cost, allowing us to determine the value of resolving the uncertainty of the population type.
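The Bayesian belief update mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the Weibull lifetime distributions, the parameter values, and the names `update_belief`, `WEAK`, and `STRONG` are all assumptions made here for illustration. A failure at a given age contributes the lifetime density as its likelihood, while a preventive replacement (the component survived to that age) contributes the survival function.

```python
import math

# Hypothetical (shape, scale) Weibull parameters for each population.
# These values are illustrative assumptions, not taken from the paper.
WEAK = (2.0, 5.0)
STRONG = (2.0, 10.0)


def weibull_pdf(t, shape, scale):
    """Weibull lifetime density at age t > 0."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_sf(t, shape, scale):
    """Weibull survival function: probability the lifetime exceeds t."""
    return math.exp(-((t / scale) ** shape))


def update_belief(p_weak, age, failed, weak=WEAK, strong=STRONG):
    """Posterior probability that components come from the weak population.

    p_weak : prior belief of a weak population (between 0 and 1)
    age    : component age at failure or at preventive replacement
    failed : True if the component failed at `age`,
             False if it was replaced preventively (censored observation)
    """
    likelihood = weibull_pdf if failed else weibull_sf
    weak_term = p_weak * likelihood(age, *weak)
    strong_term = (1.0 - p_weak) * likelihood(age, *strong)
    total = weak_term + strong_term
    if total == 0.0:
        # The observation is uninformative under both models; keep the prior.
        return p_weak
    return weak_term / total


if __name__ == "__main__":
    belief = 0.5
    # An early failure makes the weak population more plausible.
    belief = update_belief(belief, age=2.0, failed=True)
    print(f"after early failure: {belief:.3f}")
    # Surviving to a late replacement age shifts belief toward strong.
    belief = update_belief(belief, age=8.0, failed=False)
    print(f"after long survival: {belief:.3f}")
```

Under these assumed parameters, delaying the preventive replacement age yields a more informative observation, which is the exploration side of the trade-off the abstract describes.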

Similar resources

Nearly Optimal Exploration-Exploitation Decision Thresholds

While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds for the multi-armed bandit problem, one for the infinite horizon discounted reward case and one for the finite horizon undiscounted reward case are derived, which make the link between the reward horizon, uncertainty ...

Shortest Path under Uncertainty: Exploration versus Exploitation

In the Canadian Traveler Problem (CTP), a traveler seeks a shortest path to a destination through a road network, but unknown to the traveler, some roads may be blocked. This paper studies the Bayesian CTP (BCTP), in which road states are correlated with known prior probabilities and the traveler can infer the states of an unseen road from past observations of other correlated roads. As general...

Human and Optimal Exploration and Exploitation in Bandit Problems

We consider a class of bandit problems in which a decision-maker must choose between a set of alternatives, each of which has a fixed but unknown rate of reward, to maximize their total number of rewards over a short sequence of trials. Solving these problems requires balancing the need to search for highly-rewarding alternatives with the need to capitalize on those alternatives already known to be...

Infomax strategies for an optimal balance between exploration and exploitation

Proper balance between exploitation and exploration is what makes good decisions, which achieve high rewards like payoff or evolutionary fitness. The Infomax principle postulates that maximization of information directs the function of diverse systems, from living systems to artificial neural networks. While specific applications are successful, the validity of information as a proxy for reward...

An Optimal Exploration-Exploitation Approach for Assortment Selection

We consider an online assortment optimization problem, where in every round, the retailer offers a K-cardinality subset (assortment) of N substitutable products to a consumer, and observes the response. We model consumer choice behavior using the widely used multinomial logit (MNL) model, and consider the retailer’s problem of dynamically learning the model parameters, while optimizing cumulativ...

Journal

Journal title: European Journal of Operational Research

Year: 2022

ISSN: 1872-6860, 0377-2217

DOI: https://doi.org/10.1016/j.ejor.2021.11.038