58 Episodes

  1. 34 - AI Evaluations with Beth Barnes

    Published: 7/28/2024
  2. 33 - RLHF Problems with Scott Emmons

    Published: 6/12/2024
  3. 32 - Understanding Agency with Jan Kulveit

    Published: 5/30/2024
  4. 31 - Singular Learning Theory with Daniel Murfet

    Published: 5/7/2024
  5. 30 - AI Security with Jeffrey Ladish

    Published: 4/30/2024
  6. 29 - Science of Deep Learning with Vikrant Varma

    Published: 4/25/2024
  7. 28 - Suing Labs for AI Risk with Gabriel Weil

    Published: 4/17/2024
  8. 27 - AI Control with Buck Shlegeris and Ryan Greenblatt

    Published: 4/11/2024
  9. 26 - AI Governance with Elizabeth Seger

    Published: 11/26/2023
  10. 25 - Cooperative AI with Caspar Oesterheld

    Published: 10/3/2023
  11. 24 - Superalignment with Jan Leike

    Published: 7/27/2023
  12. 23 - Mechanistic Anomaly Detection with Mark Xu

    Published: 7/27/2023
  13. Survey, store closing, Patreon

    Published: 6/28/2023
  14. 22 - Shard Theory with Quintin Pope

    Published: 6/15/2023
  15. 21 - Interpretability for Engineers with Stephen Casper

    Published: 5/2/2023
  16. 20 - 'Reform' AI Alignment with Scott Aaronson

    Published: 4/12/2023
  17. Store, Patreon, Video

    Published: 2/7/2023
  18. 19 - Mechanistic Interpretability with Neel Nanda

    Published: 2/4/2023
  19. New podcast - The Filan Cabinet

    Published: 10/13/2022
  20. 18 - Concept Extrapolation with Stuart Armstrong

    Published: 9/3/2022

AXRP (pronounced axe-urp) is the AI X-risk Research Podcast where I, Daniel Filan, have conversations with researchers about their papers. We discuss the paper, and hopefully get a sense of why it's been written and how it might reduce the risk of AI causing an existential catastrophe: that is, permanently and drastically curtailing humanity's future potential. You can visit the website and read transcripts at axrp.net.