#66 – Michael Cohen on Input Tampering in Advanced RL Agents
Michael Cohen is a DPhil student at the University of Oxford, supervised by Mike Osborne. He will be starting a postdoc with Professor Stuart Russell at UC Berkeley's Center for Human-Compatible AI. His research considers the expected behaviour of generally intelligent artificial agents, with a view to designing agents that we can expect to behave safely.
You can see more links and a full transcript at www.hearthisidea.com/episodes/cohen.
We discuss:
- What is reinforcement learning, and how is it different from supervised and unsupervised learning? (A short sketch of the contrast follows this list.)
- Michael's recently co-authored paper titled 'Advanced artificial agents intervene in the provision of reward'
- Why might it be hard to convey what we really want to RL learners — even when we know exactly what we want?
- Why might advanced RL systems tamper with their sources of input, and why could this be very bad?
- What assumptions need to hold for this "input tampering" outcome?
- Is reward really the optimisation target? Do models "get reward"?
- What's wrong with the analogy between RL systems and evolution?
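
For readers new to the distinction raised in the first question: in supervised learning the training signal is a labelled target, while a reinforcement learning agent only receives a scalar reward for the actions it itself takes. A minimal illustrative sketch of that contrast (our own toy example, not code from the episode):

```python
import random

# Supervised learning: each example comes with the correct label,
# so the update can point directly at the target.
data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]  # (feature, label)
threshold = 0.0
for x, label in data:
    prediction = 1 if x > threshold else 0
    threshold += 0.1 * (prediction - label)  # move toward the labelled answer

# Reinforcement learning: no labels, only a scalar reward for the
# action the agent itself chose (a two-armed bandit here).
values = {"left": 0.0, "right": 0.0}  # running value estimate per action
for _ in range(200):
    action = random.choice(list(values))        # explore uniformly
    reward = 1.0 if action == "right" else 0.0  # environment's feedback
    values[action] += 0.1 * (reward - values[action])

print(f"supervised threshold: {threshold:.2f}")
print(f"bandit value estimates: {values}")
```

The point of the toy: the supervised update is told the right answer, whereas the RL update only learns how good its own choice turned out. That reward channel being the sole source of feedback is what makes tampering with it, the episode's central topic, a live concern.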
Key links:
- Michael's personal website
- 'Advanced artificial agents intervene in the provision of reward' by Michael K. Cohen, Marcus Hutter, and Michael A. Osborne
- 'Pessimism About Unknown Unknowns Inspires Conservatism' by Michael Cohen and Marcus Hutter
- 'Intelligence and Unambitiousness Using Algorithmic Information Theory' by Michael Cohen, Badri Vellambi, and Marcus Hutter
- 'Quantilizers: A Safer Alternative to Maximizers for Limited Optimization' by Jessica Taylor
- 'RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning' by Marc Rigter, Bruno Lacerda, and Nick Hawes
- Season 40 of Survivor