Alexander Pan on the MACHIAVELLI benchmark

The Inside View - A podcast by Michaël Trazzi

I talked to Alexander Pan, a first-year student at Berkeley working with Jacob Steinhardt, about his paper "Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark", which was accepted as an oral presentation at ICML.

YouTube: https://youtu.be/MjkSETpoFlY

Paper: https://arxiv.org/abs/2304.03279