A Causal World Model Underlying Next Token Prediction: Exploring GPT in a Controlled Environment
Best AI papers explained - A podcast by Enoch H. Kang

This paper investigates whether Generative Pre-trained Transformer (GPT) models, trained solely for next-token prediction, implicitly learn a causal world model. By proposing a causal interpretation of GPT's attention mechanism, the authors suggest that these models can perform zero-shot causal structure learning on input sequences. Experiments in controlled game environments such as Othello and chess show that GPT is more likely to generate legal moves for out-of-distribution sequences when its attention mechanism strongly encodes a causal structure, pointing to a link between implicit causal learning and adherence to the rules of the game.
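The causal reading of attention described above can be illustrated with a minimal sketch. The paper's actual procedure is not reproduced here; this is only an assumed, simplified version in which a masked self-attention matrix is thresholded to produce a candidate causal graph over token positions (the function names and the threshold are illustrative, not from the paper):

```python
import numpy as np

def causal_attention(Q, K, V):
    """Standard masked self-attention: the upper-triangular mask ensures
    each token attends only to earlier positions (the "causal" direction)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf                      # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

def attention_to_graph(weights, threshold=0.2):
    """Read the attention matrix as a candidate causal graph:
    draw an edge j -> i when token i attends to an earlier token j
    with weight at or above the threshold (threshold is an assumption)."""
    T = weights.shape[0]
    return [(j, i) for i in range(T) for j in range(i)
            if weights[i, j] >= threshold]

rng = np.random.default_rng(0)
T, d = 5, 8
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))
_, W = causal_attention(Q, K, V)
edges = attention_to_graph(W)
print(edges)  # every edge points from an earlier to a later position
```

Because of the mask, every extracted edge necessarily respects the temporal ordering of the sequence, which is what makes the attention pattern interpretable as a (candidate) causal structure rather than an arbitrary dependency graph.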