Search results for: reward penalty scheme

Number of results: 265788

Journal: :CoRR 2018
Daniel A. Abolafia Mohammad Norouzi Quoc V. Le

We consider the task of program synthesis in the presence of a reward function over the output of programs, where the goal is to find programs with maximal rewards. We employ an iterative optimization scheme, where we train an RNN on a dataset of K best programs from a priority queue of the generated programs so far. Then, we synthesize new programs and add them to the priority queue by samplin...

1991
Thomas Kunz

This paper discusses a load balancing heuristic in a general-purpose distributed computer system. We implemented a task scheduler based on the concept of a Stochastic Learning Automaton on a network of Unix workstations. The heuristic used and our implementation are briefly described. Creating an executable artificial workload, a number of experiments examined different learning schemes. Using a ...

2016
Inna Kofman Nurul Huda

Because of the lack of infrastructure in mobile ad hoc networks (MANETs), their proper functioning must rely on co-operations among mobile nodes. However, mobile nodes tend to save their own resources and may be reluctant to forward packets for other nodes. One approach to encourage co-operations among nodes is to reward nodes that forward data for others. Such an incentive-based scheme require...

2002
Alex Reznik Sergio Verdú

The main question addressed in this paper is the problem of maximizing the "transport capacity" of a broadcast network in a Gaussian power-law channel, where by transport capacity we mean a quantity akin to the bandwidth-distance product as used in [6]. In the process of addressing this issue we also derive a transport-capacity maximizing resource allocation scheme for a general set of reward an...

Journal: :The Journal of Thoracic and Cardiovascular Surgery 2018

Journal: :Management Science 2010
Asunur Cezar Huseyin Cavusoglu Srinivasan Raghunathan

We examine the implications of a firm outsourcing both (i) security device management which attempts to prevent security breaches and (ii) security monitoring which attempts to detect security breaches to managed security service providers (MSSPs). In the context of security outsourcing, the firm not only faces the traditional moral hazard problem as it cannot observe an MSSP’s prevention or de...

Journal: :Journal of Advances in Computer Research 2015
Najmeh Hosseinpour Mohammad Mosleh Saeed Setayeshi

Nowadays, diabetes is one of the main problems in the health domain, and it is known as the fourth leading cause of death in the world. The main problem with this dangerous disease is late or weak diagnosis. Weak diagnosis occurs because doctors are sometimes unable to select the right patterns or cannot use the standard patterns very well, so the outcome is that the disease will b...

2017
Jinhee Kim Hackjin Kim Eunjoo Kang

Reward processing, which plays a critical role in adaptive behavior, is impaired in addiction disorders, which are accompanied by functional abnormalities in brain reward circuits. Internet gaming disorder, like substance addiction, is thought to be associated with impaired reward processing, but little is known about how it affects learning, especially when feedback is conveyed by less-salient...
