Ep 11 - Technical alignment overview w/ Thomas Larsen (Director of Strategy, Center for AI Policy)
We speak with Thomas Larsen, Director of Strategy at the Center for AI Policy in Washington, DC, for a "speed run" overview of all the major technical research directions in AI alignment. It's a great way to quickly build a broad picture of the field of technical AI alignment.
In 2022, Thomas spent ~75 hours putting together an overview of what everyone in technical alignment was doing, and he has remained deeply engaged in AI safety since. In this episode, he shares an updated overview to help listeners quickly understand the technical alignment research landscape.
We talk to Thomas about a huge breadth of technical alignment areas, including:
* Prosaic alignment
* Scalable oversight (e.g. RLHF, debate, IDA)
* Interpretability
* Heuristic arguments (from ARC)
* Model evaluations
* Agent foundations
* Other areas, covered more briefly:
* Model splintering
* Out-of-distribution (OOD) detection
* Low impact measures
* Threat modelling
* Scaling laws
* Brain-like AI safety
* Inverse reinforcement learning (IRL)
* Cooperative AI
* Adversarial training
* Truthful AI
* Brain-machine interfaces (e.g. Neuralink)
Hosted by Soroush Pour. Follow me for more AGI content:
Twitter: https://twitter.com/soroushjp
LinkedIn: https://www.linkedin.com/in/soroushjp/
== Show links ==
-- About Thomas --
Thomas studied Computer Science & Mathematics at the University of Michigan, where he first did ML research in computer vision. After graduating, he completed the MATS AI safety research scholar program before a stint at MIRI as a Technical AI Safety Researcher. Earlier this year, he moved into AI policy by co-founding the Center for AI Policy, a nonprofit, nonpartisan organisation focused on getting the US government to adopt policies that would mitigate national security risks from AI. The Center for AI Policy is not connected to foreign governments or commercial AI developers and is committed to the public interest.
* Center for AI Policy - https://www.aipolicy.us
* LinkedIn - https://www.linkedin.com/in/thomas-larsen/
* LessWrong - https://www.lesswrong.com/users/thomas-larsen
-- Further resources --
* Thomas' post, "What Everyone in Technical Alignment is Doing and Why" https://www.lesswrong.com/posts/QBAjndPuFbhEXKcCr/my-understanding-of-what-everyone-in-technical-alignment-is
* Please note this post is from Aug 2022. The podcast episode is more up to date, but the post remains a valuable and relevant resource.