Search results for: geo grid reinforcement
Number of results: 139042
In the theory of supervised learning, the identical-distribution assumption, i.e. that the training and test samples are drawn from the same probability distribution, plays a crucial role. Unfortunately, this essential assumption is often violated in the presence of selection bias. Under such conditions, the standard supervised learning frameworks may suffer a significant bias. In this thesis, we use the impo...
We present a method for batch Q-learning when some of the data are missing. Our approach uses Bayesian multiple imputation to build Q-functions using all of the observed data. This is safer than using complete case analysis, i.e. throwing out incomplete training samples, because it can avoid non-response bias when possible. We also present a method for assessing confidence in the learned Qfunct...
We address the problem of inverse reinforcement learning in Markov decision processes where the agent is risk-sensitive. We derive a risk-sensitive reinforcement learning algorithm with convergence guarantees that employs convex risk metrics and models of human decision-making derived from behavioral economics. The risk-sensitive reinforcement learning algorithm provides the theoretical underpi...
In recent years, the proliferation of mobile devices has led to the emergence of mobile grid computing, that is, extending the reach of grid computing by enabling mobile devices both to contribute to and utilise grid resources. Thus, the pool of available computational and storage resources can be significantly enriched by leveraging idle capacities of mobile devices. Nevertheless, the emergence...
This work represents the first step towards a task library system in the reinforcement learning domain. Task libraries could be useful in speeding up the learning of new tasks through task transfer. Related transfer can increase learning rate and can help prevent convergence to sub-optimal policies in reinforcement learning. Unrelated transfer can be extremely detrimental to the learning rate. ...
We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state predict...
Hyper-heuristics can be identified as methodologies that search the space generated by a finite set of low level heuristics for solving search problems. An iterative hyper-heuristic framework can be thought of as requiring a single candidate solution and multiple perturbation low level heuristics. An initially generated complete solution goes through two successive processes (heuristic selectio...
– Qualitative examples showing recognition episodes in streaming environment. Also see our project webpage for video examples. (A) – Streaming recognition results with different detector speeds (B). – Un-trimmed detection results with different object detector speeds (C). – State feature and reward function design (D). – Details for policy iteration (E). – Details for observation imputation in ...
Spatial Information Grid is a kind of application grid that tries to connect resources such as computers, data sources, and processing algorithms, and to build a distributed, robust, flexible and powerful infrastructure for geocomputation. It needs a powerful and easy-to-use running environment. In this paper, an autonomic runtime environment for geocomputation is proposed and named SIGRE, the ...