Leon F. Guerrero, University of Central Florida
A Differentiation between Bayesian Updating and Model-Free Reinforcement Learning in Human Choice under Uncertainty
Abstract:
Advances in neuroscience and, more recently, in neuroeconomics have begun to reveal the neural and behavioral mechanisms that underlie simple decision-making. We study two major frameworks for learning under uncertainty, Bayesian updating and model-free reinforcement learning, and seek to differentiate them in human choice in a simple experiment. Bayesian updating can handle challenging problems, and it has even been proposed that the architecture of the nervous system is well suited to it. In model-free reinforcement learning, the learning rate (the adjustment of beliefs in response to prediction errors) is constant and, most importantly, symmetric in the prediction error: it is the same whether the prediction error is positive or negative. To differentiate it from Bayesian updating, we exploit an important property of the latter, Doob's martingale property, which states that the expected value of the probability assigned to an outcome after new evidence arrives equals the probability assigned to that outcome given past evidence alone. Choice data will be collected in the experiment, and an incentive-compatible mechanism will be used to elicit subjects' true beliefs in our stochastic environment.
Mentors: Dr. Peter Bossaerts and Dr. Mathieu D'Acremont (California Institute of Technology), Dr. Wolfram Schultz (University of Cambridge)
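
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the study's actual task or analysis, which the abstract does not specify. It assumes binary outcomes and, for the Bayesian learner, a Beta-Bernoulli model with a uniform prior; the function names and parameters are hypothetical. The delta rule moves the belief by a fixed fraction of the prediction error, while the Bayesian posterior mean satisfies the martingale property and has an implied learning rate that shrinks as evidence accumulates.

```python
from fractions import Fraction


def delta_rule_update(belief, outcome, learning_rate):
    """Model-free update: shift the belief by a constant fraction of the
    prediction error, regardless of the error's sign."""
    prediction_error = outcome - belief
    return belief + learning_rate * prediction_error


def beta_posterior_mean(successes, failures, prior_a=1, prior_b=1):
    """Posterior probability of a success under a Beta(prior_a, prior_b) prior
    (assumed here; a uniform prior by default)."""
    return Fraction(prior_a + successes,
                    prior_a + prior_b + successes + failures)


def expected_next_posterior_mean(successes, failures, prior_a=1, prior_b=1):
    """Expectation, taken before the next outcome is seen, of the posterior
    mean that will hold after it is seen."""
    p = beta_posterior_mean(successes, failures, prior_a, prior_b)
    after_success = beta_posterior_mean(successes + 1, failures, prior_a, prior_b)
    after_failure = beta_posterior_mean(successes, failures + 1, prior_a, prior_b)
    return p * after_success + (1 - p) * after_failure


def implied_bayesian_learning_rate(successes, failures, prior_a=1, prior_b=1):
    """Fraction of the prediction error absorbed by one Bayesian update
    (computed for a success; a failure gives the same ratio)."""
    before = beta_posterior_mean(successes, failures, prior_a, prior_b)
    after = beta_posterior_mean(successes + 1, failures, prior_a, prior_b)
    return (after - before) / (1 - before)
```

For example, `expected_next_posterior_mean(3, 1)` equals `beta_posterior_mean(3, 1)` exactly, which is the martingale property in this setting, and `implied_bayesian_learning_rate` decreases as the number of observations grows, whereas `delta_rule_update` applies the same `learning_rate` on every trial.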