Simulating Social Behavior with GPT-4: A Study of Predictive Accuracy in Social Science Experiments

Rhythm Blues AI - A podcast by Andrea Viliotti, digital innovation consultant (augmented edition)

A team of researchers from Stanford University and New York University tested whether large language models (LLMs) can predict the outcomes of social science experiments. Using GPT-4 on a dataset of 70 experiments covering 476 treatments and 105,165 participants, they simulated the responses of American citizens. GPT-4's predictions correlated strongly with the experimental results, with a correlation coefficient of 0.85, often matching or even surpassing earlier predictions made by humans. These findings highlight the potential of LLMs to replicate the effects observed in social experiments with human-level accuracy.
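
To give a feel for what a correlation coefficient of 0.85 measures, here is a minimal Python sketch that compares a list of predicted effects with a list of observed effects. It assumes the figure is a Pearson correlation, which the summary above does not state. The function name `pearson_r` and every number in the example are hypothetical, invented for illustration only. None of them come from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only; NOT data from the study.
predicted_effects = [0.10, 0.25, -0.05, 0.40, 0.15]  # e.g. model-simulated effects
observed_effects = [0.08, 0.30, 0.02, 0.35, 0.05]    # e.g. measured effects

r = pearson_r(predicted_effects, observed_effects)
print(f"r = {r:.2f}")
```

A value near 1 means the predicted effects rise and fall together with the observed ones. A value near 0 means there is no linear relationship between them.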
