Abstract The exploration versus exploitation dilemma is a critical issue in human information acquisition and sequential belief formation, and the multi-armed bandit problem has been widely used to address it. Because of its high descriptive accuracy, the SGU model, which combines SoftMax-type probabilistic selection, Gaussian process regression updating, and upper confidence interval evaluation, attracted...