Tom Burns, Cornell University | At the Interface of AI Safety and Neuroscience

The Foresight Institute Podcast - A podcast by Foresight Institute - Fridays

Speaker

Tom Burns graduated with First Class Honours in his Bachelor of Science from Monash University in 2013, exploring the mathematical features of human perception of non-linguistic sounds. Shifting to philosophy, he completed a Master of Bioethics in 2014, analyzing the ethics of euthanasia legislation in Australia, followed by a World Health Organization Bioethics Fellowship in 2015, contributing to the Ebola epidemic response. In 2023, as a Fall Semester Postdoc at Brown University's ICERM, he contributed to the 'Math + Neuroscience' program. Recently affiliated with Timaeus, an AI safety organization, Tom is continuing his research at Cornell University's new SciAI Center from March 2024.

Session Summary

Neuroscience is a burgeoning field with many opportunities for novel research directions. Due to experimental and physical limitations, however, theoretical progress relies on imperfect and incomplete information about the system. Artificial neural networks, for which perfect and complete information is possible, therefore offer those trained in the neurosciences an opportunity to study intelligence at a level of granularity far beyond what biological systems allow, while remaining relevant to them. Additionally, applying neuroscience methods, concepts, and theory to AI systems offers a relatively under-explored avenue to make headway on the daunting challenges posed by AI safety, both for present-day risks, such as enshrining biases and spreading misinformation, and for future risks, including those on existential scales.

In this talk, Tom presents two emerging examples of interactions between neuroscience and AI safety. In the direction of ideas from neuroscience being useful for AI safety, he demonstrates how associative memory has become a tool for interpretability of Transformer-based models.
In the opposite direction, he discusses how statistical learning theory and the developmental interpretability research program can be applied to understanding neuroscience phenomena, such as perceptual invariance and representational drift.

Full transcript, list of resources, and art piece: https://www.existentialhope.com/podcasts

Existential Hope was created to collect positive and possible scenarios for the future so that more people commit to creating a brighter future, and to begin mapping out the main developments and challenges that need to be navigated to reach it. Existential Hope is a Foresight Institute project.

Hosted by Allison Duettmann and Beatrice Erkers