56 subscribers
The Role of Infrastructure in ML // Niels Bantilan // #197
MLOps podcast #197 with Niels Bantilan, Chief Machine Learning Engineer at Union, The Role of Infrastructure in ML Leveraging Open Source, brought to us by Union.

// Abstract
When we start out building and deploying models in a new organization, life is simple: all we need to do is grab some data and iterate on a model that fits the data well and performs reasonably well on some held-out test set. Then, if you’re fortunate enough to get to the point where you want to deploy it, it’s fairly straightforward to wrap it in an app framework and host it on a cloud server. However, once you get past this stage, you’re likely to find yourself needing:
A more scalable data processing framework
Experiment tracking for models
Heavier-duty CPU/GPU hardware
Versioning tools to link models, data, code, and resource requirements
Monitoring tools for tracking data and model quality

There’s a rich ecosystem of open-source tools that solves each of these problems and more, but how do you unify all of them into a single view? This is where orchestration tools like Flyte can help. Flyte not only allows you to compose data and ML pipelines, but it also serves as “infrastructure as code” so that you can leverage the open-source ecosystem and unify purpose-built tools for different parts of the ML lifecycle on a single platform. ML systems are not just models: they are the models, data, and infrastructure combined.

// Bio
Niels is the Chief Machine Learning Engineer at Union.ai, a core maintainer of Flyte, an open-source workflow orchestration tool, the author of UnionML, an MLOps framework for machine learning microservices, and the creator of Pandera, a statistical typing and data testing tool for scientific data containers. His mission is to help data science and machine learning practitioners be more productive.
He has a Master's in Public Health with a specialization in sociomedical science and public health informatics, and prior to that a background in developmental biology and immunology. His research interests include reinforcement learning, AutoML, creative machine learning, and fairness, accountability, and transparency in automated systems.

// MLOps Jobs board
https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Website: https://github.com/cosmicBboy, https://union.ai/
Flyte: https://flyte.org/
MLOps vs ML Orchestration // Ketan Umare // MLOps Podcast #183 - https://youtu.be/k2QRNJXyzFg

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Niels on LinkedIn: https://www.linkedin.com/in/nbantilan/

Timestamps:
[00:00] Niels' preferred coffee
[00:17] Takeaways
[03:45] Shout out to our Premium Brand Partner, Union!
[04:30] Pandera
[08:12] Creating a company
[14:22] Injecting ML for Data
[17:30] ML for Infrastructure Optimization
[22:17] AI Implementation Challenges
[24:25] Generative DevOps movement
[28:27] Pushing Limits: Code Responsibility
[29:46] Orchestration in OpenAI's Dev Day
[34:27] MLOps Stack: Layers & Challenges
[42:45] Mature Companies Embrace Kubernetes
[45:29] Horizon Challenges
[47:24] Flexible Integration for Resources
[49:10] MLOps Reproducibility Challenges
[53:14] MLOps Maturity Spectrum
[57:48] First-Class Citizens in Design
[1:00:16] Delegating for Efficient Collaboration
[1:04:55] Wrap up
446 episodes
All episodes
The Creator of FastAPI’s Next Chapter // Sebastián Ramírez // #324 (1:09:37)
A Candid Conversation Around MCP and A2A // Rahul Parundekar and Sam Partee // #316 SF Live (1:04:42)
Making AI Reliable is the Greatest Challenge of the 2020s // Alon Bochman // #312 (1:01:37)
Behavior Modeling, Secondary AI Effects, Bias Reduction & Synthetic Data // Devansh Devansh // #311 (1:01:35)
GraphBI: Expanding Analytics to All Data Through the Combination of GenAI, Graph, & Visual Analytics // Paco Nathan & Weidong Yang // #310 (1:14:01)
I Am Once Again Asking "What is MLOps?" // Oleksandr Stasyk // #308 (1:07:22)
Agents of Innovation: AI-Powered Product Ideation with Synthetic Consumer Testing // Luca Fiaschi // #306 (1:02:23)
We're All Finetuning Incorrectly // Tanmay Chopra // #304 (1:00:30)
From Rules to Reasoning Engines // George Mathew // #296 (1:05:26)
GenAI Traffic: Why API Infrastructure Must Evolve... Again // Erica Hughberg // #296 (1:06:24)
Future of Software, Agents in the Enterprise, and Inception Stage Company Building // Eliot Durbin // #293 (54:26)
The Agent Landscape - Lessons Learned Putting Agents Into Production (1:08:40)
Evolving Workflow Orchestration // Alex Milowski // #291 (1:14:34)
Navigating Machine Learning Careers: Insights from Meta to Consulting // Ilya Reznik // #286 (1:00:36)
Machine Learning, AI Agents, and Autonomy // Egor Kraev // #282 (1:05:20)
Unleashing Unconstrained News Knowledge Graphs to Combat Misinformation // Robert Caulk // #279 (1:15:24)
AI-Driven Code: Navigating Due Diligence & Transparency in MLOps // Matt van Itallie // #275 (57:01)
LLMs to agents: The Beauty & Perils of Investing in GenAI // VC Panel // Agents in Production (33:24)
The Impact of UX Research in the AI Space // Lauren Kaplan // #272 (1:08:19)
Boosting LLM/RAG Workflows & Scheduling w/ Composable Memory and Checkpointing // Bernie Wu // #270 (55:18)
How to Systematically Test and Evaluate Your LLMs Apps // Gideon Mendels // #269 (1:01:42)
The AI Dream Team: Strategies for ML Recruitment and Growth // Jelmer Borst and Daniela Solis // #267 (58:42)
Unpacking 3 Types of Feature Stores // Simba Khadder // #265 (1:07:42)
Who's MLOps for Anyway? // Jonathan Rioux // #261 (1:10:14)
Building in Production Human-centred GenAI Solutions // Mohamed Abusaid & Mara Pometti // #177 (1:02:42)
MLOps for GenAI Applications // Harcharan Kabbay // #256 (1:07:18)
Design and Development Principles for LLMOps // Andy McMahon // #254 (1:10:17)
Red Teaming LLMs // Ron Heichman // #252 (1:09:52)
Evaluating the Effectiveness of Large Language Models: Challenges and Insights // Aniket Singh // #248 (35:40)
Extending AI: From Industry to Innovation // Sophia Rowland & David Weik // #247 (1:01:36)
All Data Scientists Should Learn Software Engineering Principles // Catherine Nelson // #245 (52:54)
Navigating the AI Frontier: The Power of Synthetic Data and Agent Evaluations in LLM Development // Boris Selitser // #241 (57:21)
How to Build Production-Ready AI Models for Manufacturing // [Exclusive] LatticeFlow Roundtable (56:37)
Managing Small Knowledge Graphs for Multi-agent Systems // Tom Smoker // #236 (1:04:40)
Just when we Started to Solve Software Docs, AI Blew Everything Up // Dave Nunez // #235 (1:01:38)
Data Engineering in the Federal Sector // Shane Morris // #223 (1:03:22)
What Business Stakeholders Want to See from the ML Teams // Peter Guagenti // #222 (1:21:27)
MLOps - Design Thinking to Build ML Infra for ML and LLM Use Cases // Amritha Arun Babu & Abhik Choudhury // #221 (1:00:17)
4 Years of the MLOps Community // Demetrios Brinkmann // #220 (1:04:29)
The Art and Science of Training LLMs // Bandish Shah and Davis Blalock // #219 (1:15:11)
[Exclusive] Zilliz Roundtable // Why Purpose-built Vector Databases Matter for Your Use Case (59:00)
The Real E2E RAG Stack // Sam Bean, Rewind AI // #217 (1:10:06)
Becoming an AI Evangelist // Alex Volkov // #215 (1:14:31)
Data Governance and AI // Alexandra Diem // #212 (1:05:45)
Powering MLOps: The Story of Tecton's Rift // Matt Bleifer & Mike Eastham // #209 (1:03:57)
The Myth of AI Breakthroughs // Jonathan Frankle // #205 (1:10:02)
Pioneering AI Models for Regional Languages // Aleksa Gordić // #203 (1:04:23)
Small Data, Big Impact: The Story Behind DuckDB // Hannes Mühleisen & Jordan Tigani // #202 (1:08:34)
Language, Graphs, and AI in Industry // Paco Nathan // #201 (1:18:28)
