Sequential Learning without Feedback
Abstract
In many security and healthcare systems, a sequence of features, sensors, or tests is used for detection and diagnosis. Each test outputs a prediction of the latent state and carries an inherent cost. Our objective is to learn strategies for selecting tests that optimize accuracy and cost. Unfortunately, it is often impossible to acquire in-situ ground-truth annotations, and we are left with the problem of unsupervised sensor selection (USS). We pose USS as a version of the stochastic partial monitoring problem with an unusual reward structure (even noisy annotations are unavailable). Unsurprisingly, no learner can achieve sublinear regret without further assumptions. To this end, we propose the notion of weak dominance. This is a condition on the joint probability distribution of the test outputs and the latent state; it says that whenever a test is accurate on an example, a later test in the sequence is likely to be accurate as well. We empirically verify that weak dominance holds on real datasets and prove that it is a maximal condition for achieving sublinear regret. We reduce USS to a special case of the multi-armed bandit problem with side information and develop polynomial-time algorithms that achieve sublinear regret.
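The dominance idea in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's algorithm or its formal definition: the generative model, the function names, and the error rates are all assumptions made for illustration. It estimates how often a later test is correct given that an earlier test was correct, and it computes the disagreement rate between two tests, a quantity that can be observed without ground-truth labels.

```python
import random


def simulate_cascade(n_examples, error_rates, seed=0):
    """Draw synthetic (latent_state, test_outputs) pairs.

    Hypothetical model, not taken from the paper: the latent state is a
    fair coin and each example has a difficulty u in [0, 1). Test k errs
    exactly when u < error_rates[k]. With non-increasing error rates, any
    example an earlier test gets right is also right for every later test.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_examples):
        y = rng.randint(0, 1)
        u = rng.random()
        outputs = [1 - y if u < e else y for e in error_rates]
        data.append((y, outputs))
    return data


def conditional_accuracy(data, i, j):
    """Estimate P(test j correct | test i correct); uses the labels."""
    hits_i = [(y, out) for y, out in data if out[i] == y]
    if not hits_i:
        return float("nan")
    return sum(out[j] == y for y, out in hits_i) / len(hits_i)


def disagreement_rate(data, i, j):
    """Fraction of examples where tests i and j disagree; needs no labels."""
    return sum(out[i] != out[j] for _, out in data) / len(data)


# Assumed error rates: later tests are more accurate (and, in the USS
# setting, would also be more costly).
data = simulate_cascade(5000, error_rates=[0.3, 0.2, 0.1])
print("P(test 1 correct | test 0 correct):", conditional_accuracy(data, 0, 1))
print("Disagreement rate, tests 0 and 2:", disagreement_rate(data, 0, 2))
```

In this toy model two tests disagree exactly when the earlier one errs and the later one does not, so the label-free disagreement rate equals the gap between their error rates. Real data will satisfy such a condition only approximately, which is the kind of situation a weaker condition is meant to cover.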
Similar resources
The effect of self-control feedback on the learning of generalized motor program and parameters during physical and observational practice
The purpose of this study was to examine the effect of self-control feedback on the learning of generalized motor program and parameters during physical and observational practice. Participants (n=90) were randomly assigned to physical and observational practice (self-control, yoked and instructor KR) groups. They practiced a sequential timing task. The task required participants to press four k...
Impact of immediate feedback on the learning of medical students in pharmacology
Introduction: Providing feedback to students is an essential component in medical education and has been shown to improve the students’ learning. The purpose of this study is to evaluate the effect of computer-based immediate feedback on the medical students’ learning in a pharmacology course. Methods: In this prospective intervention study some feedback modules in pharmacology (FMP) were prepared in...
Sequential Learning for Dialog Act Classification in Tutorial Dialog
Dialog act classification or tagging is the task of assigning labels such as “question”, “assertion”, “positive feedback” and “negative feedback” to the turns in a dialog. In this project, we study the dialog act classification task as applied to human-human tutoring dialogs in the domain of thermodynamics. We initially establish a baseline by posing the task as a classification problem and app...
Learning Sequential Tasks by Incrementally Adding Higher Orders
An incremental, higher-order, non-recurrent network combines two properties found to be useful for learning sequential tasks: higher-order connections and incremental introduction of new units. The network adds higher orders when needed by adding new units that dynamically modify connection weights. Since the new units modify the weights at the next time-step with information from the previous s...
Augmenting Reinforcement Learning with Human Feedback
As computational agents are increasingly used beyond research labs, their success will depend on their ability to learn new skills and adapt to their dynamic, complex environments. If human users — without programming skills — can transfer their task knowledge to agents, learning can accelerate dramatically, reducing costly trials. The TAMER framework guides the design of agents whose behavior ...
A Reinforcement-and-Generalization Model of Sequential Effects in Identification Learning
Responses in identification-learning tasks depend on events from recent trials. A model for these sequential effects is proposed, based on previous work in category learning and founded on theories of reinforcement learning and generalization. The model is compared to two other theories in their predictions of the influence of previous stimuli and previous feedback. Two experimental paradigms a...
Journal title:
- CoRR
Volume abs/1610.05394, issue -
Pages -
Publication date 2016