Search results for: modified thompson

Number of results: 257891

Journal: Biometrics 2004
Jiahua Chen Mary E Thompson Changbao Wu

The fish abundance index over an ocean region is defined here to be the integral of expected catch per unit effort (CPUE), approximated by the sum of expected CPUE over grid squares. When trawl surveys are done within grid squares selected according to a probability sampling design, several other sources of variation such as the fish population dynamics and the catching process are also involve...

2011
Pedro A. Ortega Daniel A. Braun Simon J. Godsill

We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intractable planning problem reduces to a simple multi-armed bandit problem, where each lever stands for a potentially arbitrarily complex policy. Furthermore, we use the Bayesian control rule to construct an adaptive band...

2013

The following lemma is implied by Theorem 1 in Abbasi-Yadkori et al. (2011): Lemma 7 (Abbasi-Yadkori et al., 2011). Let $(\mathcal{F}'_t;\ t \ge 0)$ be a filtration, $(m_t;\ t \ge 1)$ be an $\mathbb{R}^d$-valued stochastic process such that $m_t$ is $\mathcal{F}'_{t-1}$-measurable, and $(\eta_t;\ t \ge 1)$ be a real-valued martingale difference process such that $\eta_t$ is $\mathcal{F}'_t$-measurable. For $t \ge 0$, define $\xi_t = \sum_{\tau=1}^{t} m_\tau \eta_\tau$ and $M_t = I_d + \sum_{\tau=1}^{t} m_\tau m_\tau^{\mathsf{T}}$, wh...

2010
Yusaku Kaneta Shin-ichi Minato Hiroki Arimura

In this paper, we extend the SHIFT-AND approach by Baeza-Yates and Gonnet (CACM 35(10), 1992) to the matching problem for network expressions, which are regular expressions without Kleene-closure and useful in applications such as bioinformatics and event stream processing. Following the study of Navarro (RECOMB, 2001) on the extended string matching, we introduce new operations called Scatter, ...
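
For orientation, the classic SHIFT-AND idea that this abstract builds on can be sketched for plain strings as follows. This is only the textbook bit-parallel matcher, not the network-expression extension described in the paper; the function name `shift_and_search` is a hypothetical choice made for this illustration.

```python
def shift_and_search(pattern: str, text: str) -> list[int]:
    """Return start indices of all occurrences of pattern in text.

    Textbook Shift-And for plain strings (illustrative only; it does not
    handle network expressions).
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that marks a full match
    state = 0              # bit i set: pattern[:i + 1] matches a suffix of the text read so far
    hits = []
    for j, ch in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits
```

Each text character costs one shift, one OR, and one AND on an integer whose bits track all partial matches at once, which is what makes the approach attractive for short patterns.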

2015
Saba Yahyaa

The multi-objective multi-armed bandit (MOMAB) problem is a sequential decision process with stochastic rewards. Each arm generates a vector of rewards instead of a single scalar reward. Moreover, these multiple rewards might be conflicting. The MOMAB-problem has a set of Pareto optimal arms and an agent’s goal is not only to find that set but also to play evenly or fairly the arms in that set....
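
As a rough illustration of what "the set of Pareto optimal arms" means, the sketch below filters arms by dominance over their mean reward vectors. It assumes the mean vectors are known and that larger is better in every objective; in the bandit setting they would have to be estimated from play. The names `dominates` and `pareto_optimal_arms` are hypothetical.

```python
def dominates(u, v):
    """True if u is at least as good as v in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_optimal_arms(means):
    """Return indices of arms whose mean reward vector is not dominated by any other arm."""
    return [
        i
        for i, u in enumerate(means)
        if not any(dominates(v, u) for j, v in enumerate(means) if j != i)
    ]


# Assumed example: four arms, two objectives each.
example_means = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5), (0.4, 0.4)]
front = pareto_optimal_arms(example_means)
```

Here the last arm is dominated by the third, so it falls outside the Pareto set, while the first three are mutually incomparable.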

2016
Erik Waingarten

Consider the problem of learning a parametric distribution from observations. A frequentist approach to learning considers parameters to be fixed, and uses the data to learn those parameters as accurately as possible. For example, consider the problem of learning a Bernoulli distribution’s parameter (a random variable distributed as Bernoulli(μ) is 1 with probability μ and 0 with probability 1 −...
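
A minimal sketch of the Bernoulli example, under stated assumptions: the frequentist estimate shown is the sample mean (the maximum-likelihood estimate of μ), and the Bayesian counterpart, which the truncated abstract does not spell out, is assumed here to use a Beta(a, b) prior. Function names are hypothetical.

```python
def bernoulli_mle(observations):
    """Frequentist point estimate of mu: the fraction of ones in the data."""
    return sum(observations) / len(observations)


def beta_posterior(observations, a=1.0, b=1.0):
    """Posterior Beta parameters for mu under an assumed Beta(a, b) prior."""
    successes = sum(observations)
    failures = len(observations) - successes
    return a + successes, b + failures


data = [1, 0, 1, 1]                      # assumed toy data
mle = bernoulli_mle(data)                # single fixed-parameter estimate
post_a, post_b = beta_posterior(data)    # a full distribution over mu
posterior_mean = post_a / (post_a + post_b)
```

The frequentist route returns one number, whereas the Bayesian route returns a distribution over μ that can later be sampled from, which is the ingredient Thompson sampling relies on.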

2006
Kurt Z. Long Jose Ignacio Santos Jorge L. Rosado Catalina Lopez-Saucedo Rocio Thompson-Bonilla Maricela Abonce Herbert L. DuPont Ellen Hertzmark Teresa Estrada-Garcia

Department of Nutrition and Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, and University of Texas Medical School and School of Public Health, Houston; Hospital Infantil de Mexico F...

2013
Pierce Edmiston Gary Lupyan

Although the word “dog” and an unambiguous barking sound may point to the same concept DOG, verbal labels and nonverbal cues appear to activate conceptual information in systematically different ways (Lupyan & Thompson-Schill, 2012). Here we investigate these differences in more detail. We replicate the finding that labels activate a more prototypical representation than do sounds, and find tha...

Journal: CoRR 2017
Yichi Zhou Jun Zhu Jingwei Zhuo

Thompson sampling has impressive empirical performance for many multi-armed bandit problems. However, current algorithms for Thompson sampling work only with conjugate priors, because they require inferring the posterior, which is often computationally intractable when the prior is not conjugate. In this paper, we propose a novel algorithm for Thompson sampling which only requires...
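
To make the conjugate-prior case concrete, here is a minimal sketch of standard Thompson sampling for Bernoulli rewards with Beta priors. It is the well-known baseline, not the algorithm proposed in the paper; the function name, the uniform Beta(1, 1) prior, and the simulated arm means are all assumptions made for illustration.

```python
import random


def thompson_bernoulli(true_means, horizon, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling and return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1.0] * k  # assumed Beta(1, 1) prior: 1 + observed successes
    beta = [1.0] * k   # 1 + observed failures
    pulls = [0] * k
    for _ in range(horizon):
        # Draw one plausible mean per arm from its posterior and play the best draw.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=samples.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugacy: the posterior stays a Beta, so the update is two additions.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_bernoulli([0.2, 0.8], horizon=2000)
```

The posterior update is trivial only because the Beta prior is conjugate to the Bernoulli likelihood; with a non-conjugate prior, the sampling step would need approximate inference, which is the difficulty the abstract points to.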
