Flash Forward is a show about possible (and not so possible) future scenarios. What would the warranty on a sex robot look like? How would diplomacy work if we couldn’t lie? Could there ever be a fecal transplant black market? (Complicated, it wouldn’t, and yes, respectively, in case you’re curious.) Hosted and produced by award winning science journalist Rose Eveleth, each episode combines audio drama and journalism to go deep on potential tomorrows, and uncovers what those futures might re ...
…
continue reading
محتوای ارائه شده توسط The Thesis Review and Sean Welleck. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط The Thesis Review and Sean Welleck یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Player FM - برنامه پادکست
با برنامه Player FM !
با برنامه Player FM !
[07] John Schulman - Optimizing Expectations: From Deep RL to Stochastic Computation Graphs
Manage episode 302418438 series 2982803
محتوای ارائه شده توسط The Thesis Review and Sean Welleck. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط The Thesis Review and Sean Welleck یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
John Schulman is a Research Scientist and co-founder of Open AI. John co-leads the reinforcement learning team, researching algorithms that safely and efficiently learn by trial and error and by imitating humans. His PhD thesis is titled "Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs", which he completed in 2016 at Berkeley. We talk about his work on stochastic computation graphs and TRPO, how it evolved to PPO and how it's used in large-scale applications like Open AI Five, as well as his recent work on generalization in RL. Episode notes: https://cs.nyu.edu/~welleck/episode7.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.buymeacoffee.com/thesisreview
…
continue reading
47 قسمت
Manage episode 302418438 series 2982803
محتوای ارائه شده توسط The Thesis Review and Sean Welleck. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط The Thesis Review and Sean Welleck یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
John Schulman is a Research Scientist and co-founder of Open AI. John co-leads the reinforcement learning team, researching algorithms that safely and efficiently learn by trial and error and by imitating humans. His PhD thesis is titled "Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs", which he completed in 2016 at Berkeley. We talk about his work on stochastic computation graphs and TRPO, how it evolved to PPO and how it's used in large-scale applications like Open AI Five, as well as his recent work on generalization in RL. Episode notes: https://cs.nyu.edu/~welleck/episode7.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.buymeacoffee.com/thesisreview
…
continue reading
47 قسمت
همه قسمت ها
×به Player FM خوش آمدید!
Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.