56 subscribers
PyTorch's Combined Effort in Large Model Optimization // Michael Gschwind // #274
Dr. Michael Gschwind is a Director / Principal Engineer for PyTorch at Meta Platforms. At Meta, he led the rollout of GPU Inference for production services.

MLOps Podcast #274 with Michael Gschwind, Software Engineer, Software Executive at Meta Platforms.

// Abstract
Explore PyTorch's role in boosting model performance, on-device AI processing, and collaborations with tech giants like ARM and Apple. Michael shares his journey from gaming console accelerators to AI, emphasizing the power of community and innovation in driving advancements.

// Bio
Dr. Michael Gschwind is a Director / Principal Engineer for PyTorch at Meta Platforms. At Meta, he led the rollout of GPU Inference for production services. He led the development of MultiRay and Textray, the first deployment of LLMs at a scale exceeding a trillion queries per day shortly after its rollout. He created the strategy and led the implementation of PyTorch donation optimization with Better Transformers and Accelerated Transformers, bringing Flash Attention, PT2 compilation, and ExecuTorch into the mainstream for LLMs and GenAI models. Most recently, he led the enablement of large language models for on-device AI on mobile and edge devices.

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Website: https://en.m.wikipedia.org/wiki/Michael_Gschwind

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Michael on LinkedIn: https://www.linkedin.com/in/michael-gschwind-3704222/?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=ios_app
Timestamps:
[00:00] Michael's preferred coffee
[00:21] Takeaways
[01:59] Please like, share, leave a review, and subscribe to our MLOps channels!
[02:10] Gaming to AI Accelerators
[11:34] Torch Chat goals
[18:53] PyTorch benchmarking and competitiveness
[21:28] Optimizing MLOps models
[24:52] GPU optimization tips
[29:36] Cloud vs On-device AI
[38:22] Abstraction across devices
[42:29] PyTorch developer experience
[45:33] AI and MLOps-related antipatterns
[48:33] When to optimize
[53:26] Efficient edge AI models
[56:57] Wrap up
445 episodes
All episodes
The Creator of FastAPI’s Next Chapter // Sebastián Ramírez // #324 1:09:37
A Candid Conversation Around MCP and A2A // Rahul Parundekar and Sam Partee // #316 SF Live 1:04:42
Making AI Reliable is the Greatest Challenge of the 2020s // Alon Bochman // #312 1:01:37
Behavior Modeling, Secondary AI Effects, Bias Reduction & Synthetic Data // Devansh Devansh // #311 1:01:35
GraphBI: Expanding Analytics to All Data Through the Combination of GenAI, Graph, & Visual Analytics // Paco Nathan & Weidong Yang // #310 1:14:01
I Am Once Again Asking "What is MLOps?" // Oleksandr Stasyk // #308 1:07:22
Agents of Innovation: AI-Powered Product Ideation with Synthetic Consumer Testing // Luca Fiaschi // #306 1:02:23
We're All Finetuning Incorrectly // Tanmay Chopra // #304 1:00:30
From Rules to Reasoning Engines // George Mathew // #296 1:05:26
Machine Learning, AI Agents, and Autonomy // Egor Kraev // #282 1:05:20
Unleashing Unconstrained News Knowledge Graphs to Combat Misinformation // Robert Caulk // #279 1:15:24
AI-Driven Code: Navigating Due Diligence & Transparency in MLOps // Matt van Itallie // #275 57:01
LLMs to agents: The Beauty & Perils of Investing in GenAI // VC Panel // Agents in Production 33:24
GenAI Traffic: Why API Infrastructure Must Evolve... Again // Erica Hughberg // #296 1:06:24
Future of Software, Agents in the Enterprise, and Inception Stage Company Building // Eliot Durbin // #293 54:26
The Agent Landscape - Lessons Learned Putting Agents Into Production 1:08:40
Evolving Workflow Orchestration // Alex Milowski // #291 1:14:34
Navigating Machine Learning Careers: Insights from Meta to Consulting // Ilya Reznik // #286 1:00:36