The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function R from a policy pi. To do this, we need a model of how pi relates to R. In the current literature, the most common models are optimality, Boltzmann rationality, and causal entropy maximisation. One of the primary motivations behind IRL is to infer human preferences from human behaviour. However, the true relationship between preferences and behaviour is much more complex than any cur...