Navigating the AI Frontier: The Power of Synthetic Data and Agent Evaluations in LLM Development // Boris Selitser // #241

MLOps.community


Added four years ago

Content provided by Demetrios. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://fa.player.fm/legal


About a year ago 57:21

Join us at our first in-person conference on June 25 all about AI Quality: https://www.aiqualityconference.com/

Navigating the AI Frontier: The Power of Synthetic Data and Agent Evaluations in LLM Development // MLOps podcast #241 with Boris Selitser, Co-Founder and CTO/CPO of Okareo.

A big thank you to LatticeFlow for sponsoring this episode! LatticeFlow - https://latticeflow.ai/

// Abstract
Explore the evolving landscape of building LLM applications, focusing on the critical roles of synthetic data and agent evaluations. Discover how synthetic data enhances model behavior description, prototyping, testing, and fine-tuning, driving robustness in LLM applications. Learn about the latest methods for evaluating complex agent-based systems, including RAG-based evaluations, dialog-level assessments, simulated user interactions, and adversarial models. This talk delves into the specific challenges developers face and the tradeoffs involved in each evaluation approach, providing practical insights for effective AI development.

// Bio
Boris is the Co-Founder and CTO/CPO at Okareo. Okareo is a full-cycle platform for developers to evaluate and customize AI/LLM applications. Before Okareo, Boris was Director of Product at Meta/Facebook, leading teams building internal platforms and ML products. Examples include a copyright classification system across the Facebook apps and an engagement platform for over 200K developers, 500K+ creators, and 12M+ Oculus users. Boris has a bachelor’s in Computer Science from UC Berkeley.

// MLOps Jobs board
https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
https://docs.okareo.com/blog/data_loop
https://docs.okareo.com/blog/agent_eval
The Real E2E RAG Stack // Sam Bean // MLOps Podcast #217 - https://youtu.be/8uZst7pgOw0
RecSys at Spotify // Sanket Gupta // MLOps Podcast #232 - https://youtu.be/byH-ARJA4gk

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Boris on LinkedIn: https://www.linkedin.com/in/selitser/

Timestamps:
[00:00] Boris' preferred coffee
[00:37] Takeaways
[02:32] Please like, share, leave a review, and subscribe to our MLOps channels!
[02:48] Software Engineering and Data Science
[06:01] AI Transformative Potential Explained
[10:31] Prompt Injection Protection Strategies
[17:03] Agent's metrics for Jira
[24:11] Data and Metrics Evolution
[27:54] Evaluation Focus Enhances Systems
[31:22 - 32:52] LatticeFlow Ad
[32:55] Custom Evaluation and Synthetic Data
[36:23] Synthetic data for expansion, evaluation, and map
[41:06] Diverse agents' personalities for readiness
[44:25] Agent functions
[46:17] Optimizing Routing Agents
[50:04] Adapting to tool output for decision-making
[52:56] Agent framework evolution
[55:41] Agent framework for delivering value
[57:03] Wrap up

457 episodes


All episodes

MLOps.community

The Hidden Bottlenecks Slowing Down AI Agents 47:59

3 days ago

Demetrios chats with Paul van der Boor and Bruce Martens from Prosus about the real bottlenecks in AI agent development: not tools, but evaluation and feedback. They unpack when to build vs. buy, the tradeoffs of external vendors, and how internal tools like Copilot are reshaping workflows.

Guest speakers: Paul van der Boor - VP AI at Prosus Group; Bruce Martens - AI Engineer at Prosus Group
Host: Demetrios Brinkmann - Founder of MLOps Community

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community: https://go.mlops.community/slack
Follow us on X/Twitter (@mlopscommunity, https://x.com/mlopscommunity) or LinkedIn (https://go.mlops.community/linkedin)
Sign up for the next meetup: https://go.mlops.community/register
MLOps Swag/Merch: https://shop.mlops.community/ …

MLOps.community

9 Commandments for Building AI Agents 1:20:33

7 days ago

Building AI agents that actually get things done is harder than it looks. Demetrios, Paul, and Dmitri break down what makes agents effective, from smart planning and memory to treating tools, systems, and even people as components. They cover the "react" loop, budgeting for long tasks, sandboxing, and learning from experience. It’s a sharp, practical look at what it really takes to design useful, adaptive AI agents.

Guest speakers: Paul van der Boor - VP AI at Prosus Group; Dmitri Jarnikov - Senior Director of Data Science at Prosus Group
Host: Demetrios Brinkmann - Founder of MLOps Community

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community: https://go.mlops.community/slack
Follow us on X/Twitter (@mlopscommunity, https://x.com/mlopscommunity) or LinkedIn (https://go.mlops.community/linkedin)
Sign up for the next meetup: https://go.mlops.community/register
MLOps Swag/Merch: https://shop.mlops.community/ …

MLOps.community

Enterprise AI Adoption Challenges 1:05:00

10 days ago

Building AI Agents that work is no small feat. In Agents in Production [Podcast Limited Series] - Episode Six, Paul van der Boor and Sean Kenny share how they scaled AI across 100+ companies with Toqan, a tool born from a Slack experiment and grown into a powerful productivity platform. From driving adoption and building super users to envisioning AI employees of the future, this conversation cuts through the hype and gets into what it really takes to make AI work in the enterprise.

Guest speakers: Paul van der Boor - VP AI at Prosus Group; Sean Kenny - Senior Product Manager at Prosus Group
Host: Demetrios Brinkmann - Founder of MLOps Community

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community: https://go.mlops.community/slack
Follow us on X/Twitter (@mlopscommunity, https://x.com/mlopscommunity) or LinkedIn (https://go.mlops.community/linkedin)
Sign up for the next meetup: https://go.mlops.community/register
MLOps Swag/Merch: https://shop.mlops.community/ …

MLOps.community

Real-time Feature Generation at Lyft // Rakesh Kumar // #334 58:04

14 days ago

Real-time Feature Generation at Lyft // MLOps Podcast #334 with Rakesh Kumar, Senior Staff Software Engineer at Lyft. Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // Abstract This session delves into real-time feature generation at Lyft. Real-time feature generation is critical for Lyft where accurate up-to-the-minute marketplace data is paramount for optimal operational efficiency. We will explore how the infrastructure handles the immense challenge of processing tens of millions of events per minute to generate features that truly reflect current marketplace conditions. Lyft has built this massive infrastructure over time, evolving from a humble start and a naive pipeline. Through lessons learned and iterative improvements, Lyft has made several trade-offs to achieve low-latency, real-time feature delivery. MLOps plays a critical role in managing the lifecycle of these real-time feature pipelines, including monitoring and deployment. We will discuss the practicalities of building and maintaining high-throughput, low-latency real-time feature generation systems that power Lyft’s dynamic marketplace and business-critical products. // Bio Rakesh Kumar is a Senior Staff Software Engineer at Lyft, specializing in building and scaling Machine Learning platforms. Rakesh has expertise in MLOps, including real-time feature generation, experimentation platforms, and deploying ML models at scale. He is passionate about sharing his knowledge and fostering a culture of innovation. This is evident in his contributions to the tech community through blog posts, conference presentations, and reviewing technical publications. 
// Related Links Website: https://englife101.io/ https://eng.lyft.com/search?q=rakesh https://eng.lyft.com/real-time-spatial-temporal-forecasting-lyft-fa90b3f3ec24 https://eng.lyft.com/evolution-of-streaming-pipelines-in-lyfts-marketplace-74295eaf1eba Streaming Ecosystem Complexities and Cost Management // Rohit Agrawal // MLOps Podcast #302 - https://youtu.be/0axFbQwHEh8 ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore Join our Slack community [https://go.mlops.community/slack] Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register] MLOps Swag/Merch: [https://shop.mlops.community/] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Rakesh on LinkedIn: /rakeshkumar1007/ Timestamps: [00:00] Rakesh preferred coffee [00:24] Real-time machine learning [04:51] Latency tricks explanation [09:28] Real-time problem evolution [15:51] Config management complexity [18:57] Data contract implementation [23:36] Feature store [28:23] Offline vs online workflows [31:02] Decision-making in tech shifts [36:54] Cost evaluation frequency [40:48] Model feature discussion [49:09] Hot shard tricks [55:05] Pipeline feature bundling [57:38] Wrap up…

MLOps.community

AI Agent Development Tradeoffs You NEED to Know 57:06

17 days ago

Sherwood Callaway, tech lead at 11x, joins us to talk about building digital workers, specifically Alice (an AI sales rep) and Julian (a voice agent), that are shaking up sales outreach by automating complex, messy tasks. He looks back on his YC days at Opkit, where he first got his hands dirty with voice AI, and compares the wild ride of building voice vs. text agents. We get into the use of LangGraph Cloud, integrating observability tools like LangSmith and Arize, and keeping hallucinations in check with regular evals. Sherwood and Demetrios wrap up with a look ahead: will today's sprawling AI agent stacks eventually simplify?

// Bio
Sherwood Callaway is an emerging leader in the world of AI startups and AI product development. He currently serves as the first engineering manager at 11x, a series B AI startup backed by Benchmark and Andreessen Horowitz, where he oversees technical work on "Alice", an AI sales rep that outperforms top human SDRs. Alice is an advanced agentic AI working in production and at scale. Under Sherwood’s leadership, the system grew from initial prototype to handling over 1 million prospect interactions per month across 300+ customers, leveraging partnerships with OpenAI, Anthropic, and LangChain while maintaining consistent performance and reliability. Alice is now generating eight figures in ARR. Sherwood joined 11x in 2024 through the acquisition of his YC-backed startup, Opkit, where he built and commercialized one of the first-ever AI phone calling solutions for a specific industry vertical (healthcare). Prior to Opkit, he was the second infrastructure engineer at Brex, where he designed, built, and scaled the production infrastructure that supported Brex’s application and engineering org through hypergrowth. He currently lives in San Francisco, CA.

// Related Links

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Sherwood on LinkedIn: /sherwoodcallaway/
#aiengineering

Timestamps:
[00:00] AI Takes Over Health Calls
[05:05] What Can Agents Really Do?
[08:25] Who’s in Charge—User or Agent?
[11:20] Why Graphs Matter in Agents
[15:03] How Complex Should Agents Be?
[18:33] The Hidden Cost of Model Upgrades
[21:57] Inside the LLM Agent Loop
[25:08] Turning Agents into APIs
[29:06] Scaling Agents Without Meltdowns
[30:04] The Monorepo Tangle, Explained
[34:01] Building Agents the Open Source Way
[38:49] What Production-Ready Agents Look Like
[41:23] AI That Fixes Code on Its Own
[43:26] Tracking Agent Behavior with OpenTelemetry
[46:43] Running Agents Locally with Phoenix
[52:55] LangGraph Meets Arize for Agent Control
[53:29] Hunting Hallucinations in Agent Traces
[56:45] Off-Script Insights Worth Hearing…

MLOps.community

From the Legal Trenches to Tech // Nick Coleman // #332 35:51

24 days ago

From the Legal Trenches to Tech // MLOps Podcast #332 with Nick Coleman, Attorney/Founder of LexMed. Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // Abstract Nick Coleman shares his journey from high-volume Social Security disability practice to founding LexMed, a legal tech startup leveraging AI to transform how attorneys handle complex cases. He'll discuss LexMed's dual AI platforms: Hearing Echo, which automates transcription and analysis of disability hearings with speaker identification and critical testimony validation, and ChartVision, which combines human medical abstraction with AI to extract and map medical evidence to disability criteria. Nick will explain how "vibe coding" has dramatically reduced friction between his subject matter expertise and technical implementation, enabling rapid prototyping that preserves legal insights through development. By bridging domain knowledge and technology, LexMed has created solutions that address the real-world challenges he experienced firsthand in his high-volume disability practice, offering valuable lessons for AI implementation in other specialized fields. // Bio Nick Coleman is the founder and CEO of LexMed, a legal tech startup applying advanced AI to transform the practice of law. As a Social Security disability attorney with extensive appellate experience, Nick identified critical inefficiencies in legal workflows that technology could solve. LexMed's flagship product, Hearing Echo, leverages speech recognition and natural language processing to automate the transcription and analysis of disability hearing audio, dramatically improving case management for attorneys. Nick holds an AV Preeminent rating from Martindale-Hubbell, has been recognized as a Super Lawyers Rising Star, and serves on the Arkansas Bar Artificial Intelligence Task Force. 
With deep expertise at the intersection of law and technology, Nick is passionate about democratizing access to justice through innovative AI solutions. // Related Links Website: www.lexmed.ai ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore Join our Slack community [https://go.mlops.community/slack] Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register] MLOps Swag/Merch: [https://shop.mlops.community/] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Nick on LinkedIn: /nicklcoleman/ Timestamps: [00:00] Disability Claims Advocacy [00:29] AI Native Startup [02:08] Disability Claims Process [07:56] Tech Journey [10:52] AI in Document Review [13:57] Building a Case for Appeal [19:26] Medical Claims Language Model [23:37] Tech-Driven Compliance Solutions [30:31] Claim Prioritization Strategy [34:57] Wrap up…

MLOps.community

The Rise of Sovereign AI and Global AI Innovation in a World of US Protectionism // Frank Meehan // MLOps Podcast #331 54:13

28 days ago

The Rise of Sovereign AI and Global AI Innovation in a World of US Protectionism // MLOps Podcast #331 with Frank Meehan, Founder and CEO of Frontier One AI. Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // Abstract “The awakening of every single country is that they have to control their AI intelligence and not outsource their data" - Jensen Huang. Sovereign AI is rapidly becoming a fundamental national utility, much like defense, energy or telecoms. Nations worldwide recognize that AI sovereignty—having control over your AI infrastructure, data, and models—is essential for economic progress, security, and especially independence - especially when the US is pushing protectionism and trying to prevent global AI innovation. Of course this has the opposite effect - DeepSeek created by a Hedge Fund in China; India building the world's largest AI data centre (3 GW), and global software teams scaling, learning and building faster than ever before. However most countries lack the talent, financing and experience to implement Sovereign AI for their requirements - and it is our belief at Frontier One, that one of the biggest markets for AI applications, cloud services and GPUs will be global governments. We see it already - with $10B of GPUs in 2024 bought directly by governments - and it's rapidly expanding. We will talk about what Sovereign AI is - both infrastructure and software details / why it is crucial for a nation / how to get involved as part of the MLOps community. // Bio Co-Founder of Frontier One - building Sovereign AI Factories and Cloud software for global markets. Frank is a 2X CEO | 2X CMO (with 2X exits + 1 IPO NYSE), Board Director (Spotify, Siri) and Investor (SparkLabs Group) with 20+ years of experience in creating and growing leading brands, products and companies. Chair of Improvability, automating due diligence and reporting for corporates, foundations and Governments with AI. 
Co-founder and partner at SparkLabs Group - investors in OpenAI, Anthropic, 88 Rising, Discord, Animoca, Andela, Vectara, Kneron, Messari, Lifesum + 400 companies in our portfolio. Investment Committee and LP at SparkLabs Cultiv8 with 56 investments in consumer food and regenerative agriculture companies. Co-founder and CMO - later CEO - of Equilibrium AI (Singapore), building it to one of the leading ESG and Carbon data management platforms globally. Equilibrium was acquired by FiscalNote in 2021, where he joined the senior leadership team, running the ESG business globally, and helping the company IPO in 2022 on the NYSE at $1.1B valuation. Board director at Spotify (2009-2012); Siri (2009-2010 exited to Apple); Lifesum (leading AI health app with 50 million users), seed investor in 88 Rising (Asia’s leading independent music label); CEO/CMO and co-founder at INQ Mobile (mobile internet pioneer); and Global Director for devices and products at 3 Mobile. Started as a software developer with Ericsson Mobile in Sweden, after graduating from KTH in Stockholm and the University of Sydney with a Bachelor of Mechanical Engineering, and Master of Science in Fluid Mechanics. // Related Links https://www.frontierone.ai/ and https://www.sparklabsgroup.com ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore Join our Slack community [https://go.mlops.community/slack] Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register] MLOps Swag/Merch: [https://shop.mlops.community/] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Frank on LinkedIn: /frankmeehan/…

MLOps.community

A New Way of Building with AI 1:04:49

4 weeks ago

Thanks to MLflow, the platform helping teams track, manage, and deploy ML and GenAI projects with ease, for supporting this episode. Try it free at mlflow.org.

What if AI could build and maintain your software, like a co-worker who never forgets state? In this episode, Jiquan Ngiam chats with Demetrios about agents that actually do the work: parsing emails, updating spreadsheets, and reshaping how we design software itself. Less hype, more hands-on AI. Tune in for a glimpse at the future of truly personalized computing.

// Bio
Jiquan Ngiam is the Co-Founder and CEO of Lutra AI, with deep expertise in artificial intelligence and machine learning. He was previously at Google Brain, Coursera, and in the Stanford CS Ph.D. program advised by Andrew Ng. He helped develop the first online courses in Machine Learning, and is now building agentic AI systems that can complete tasks for us.

// Related Links
https://www.youtube.com/@LutraAI
#api #llm #lutra #costefficiency #latentspace

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Jiquan on LinkedIn: /jngiam/

Timestamps:
[00:00] Agents That Actually Do Work
[08:21] Building Tables With AI Help
[12:54] Guardrails for Smarter Code
[16:35 - 18:00] MLflow Ad
[18:30] What’s Next for MCP?
[23:23] AI as Your Data Conductor
[31:13] Rethinking AI + Data Stacks
[32:10] Sandbox Security, Real Risks
[40:48] Smarter Reviews, Powered by Use
[46:08] Cost vs. Quality in AI
[52:00] Podcast Editing Gets Creative
[56:27] Transparent UIs, Powered by AI
[01:00:28] Can AI Learn Good Taste?
[01:04:45] Peeking Into Wild AI Futures…

MLOps.community

Inside Uber’s AI Revolution - Everything about how they use AI/ML 45:23

5 weeks ago

Kai Wang joins the MLOps Community podcast LIVE to share how Uber built and scaled its ML platform, Michelangelo. From mission-critical models to tools for both beginners and experts, he walks us through Uber’s AI playbook and teases plans to open-source parts of it.

// Bio
Kai Wang is the product lead of the AI platform team at Uber, overseeing Uber's internal end-to-end ML platform called Michelangelo that powers 100% of Uber's business-critical ML use cases.

// Related Links
Uber GenAI: https://www.uber.com/blog/from-predictive-to-generative-ai/
#uber #podcast #ai #machinelearning

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Kai on LinkedIn: /kai-wang-67457318/

Timestamps:
[00:00] Rethinking AI Beyond ChatGPT
[04:01] How Devs Pick Their Tools
[08:25] Measuring Dev Speed Smartly
[10:14] Predictive Models at Uber
[13:11] When ML Strategy Shifts
[15:56] Smarter Uber Eats with AI
[19:29] Summarizing Feedback with ML
[23:27] GenAI That Users Notice
[27:19] Inference at Scale: Michelangelo
[32:26] Building Uber’s AI Studio
[33:50] Faster AI Agents, Less Pain
[39:21] Evaluating Models at Uber
[42:22] Why Uber Open-Sourced Michelangelo
[44:32] What Fuels Uber’s AI Team…

MLOps.community

The Missing Data Stack for Physical AI 52:42

5 weeks ago

The Missing Data Stack for Physical AI // MLOps Podcast #328 with Nikolaus West, CEO of Rerun. Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // Abstract Nikolaus West, CEO of Rerun, breaks down the challenges and opportunities of physical AI—AI that interacts with the real world. He explains why traditional software falls short in dynamic environments and how visualization, adaptability, and better tooling are key to making robotics and spatial computing more practical. // Bio Niko is a second-time founder and software engineer with a computer vision background from Stanford. He’s a fanatic about bringing great computer vision and robotics products to the physical world. // Related Links Website: rerun.io ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore Join our Slack community [ https://go.mlops.community/slack ] Follow us on X/Twitter [ @mlopscommunity ]( https://x.com/mlopscommunity ) or [LinkedIn]( https://go.mlops.community/linkedin )] Sign up for the next meetup: [ https://go.mlops.community/register ] MLOps Swag/Merch: [ https://shop.mlops.community/ ] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Niko on LinkedIn: /NikolausWest Timestamps: [00:00] Niko's preferred coffee [00:35] Physical AI vs Robotics Debate [04:40] IoT Hype vs Reality [12:16] Physical AI Lifecycle Overview [20:05] AI Constraints in Robotics [23:42] Data Challenges in Robotics [33:37] Open Sourcing AI Tools [39:36] Rerun Platform Integration [40:57] Data Integration for Insights [45:02] Data Pipelines and Quality [49:19] Robotics Design Trade-offs [52:25] Wrap up…

MLOps.community

AI Reliability, Spark, Observability, SLAs and Starting an AI Infra Company 1:37:22

6 weeks ago

LLMs are reshaping the future of data and AI—and ignoring them might just be career malpractice. Yoni Michael and Kostas Pardalis unpack what’s breaking, what’s emerging, and why inference is becoming the new heartbeat of the data pipeline. // Bio Kostas Pardalis Kostas is an engineer-turned-entrepreneur with a passion for building products and companies in the data space. He’s currently the co-founder of Typedef. Before that, he worked closely with the creators of Trino at Starburst Data on some exciting projects. Earlier in his career, he was part of the leadership team at Rudderstack, helping the company grow from zero to a successful Series B in under two years. He also founded Blendo in 2014, one of the first cloud-based ELT solutions. Yoni Michael Yoni is the Co-Founder of typedef, a serverless data platform purpose-built to help teams process unstructured text and run LLM inference pipelines at scale. With a deep background in data infrastructure, Yoni has spent over a decade building systems at the intersection of data and AI — including leading infrastructure at Tecton and engineering teams at Salesforce. Yoni is passionate about rethinking how teams extract insight from massive troves of text, transcripts, and documents — and believes the future of analytics depends on bridging traditional data pipelines with modern AI workflows. At Typedef, he’s working to make that future accessible to every team, without the complexity of managing infrastructure. 
// Related Links Website: https://www.typedef.ai https://techontherocks.show https://www.cpard.xyz ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore MLOps Swag/Merch: [ https://shop.mlops.community/ ] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Kostas on LinkedIn: /kostaspardalis/ Connect with Yoni on LinkedIn: / yonimichael / Timestamps: [00:00] Breaking Tools, Evolving Data Workloads [06:35] Building Truly Great Data Teams [10:49] Making Data Platforms Actually Useful [18:54] Scaling AI with Native Integration [24:04] Empowering Employees to Build Agents [28:17] Rise of the AI Sherpa [36:09] Real AI Infrastructure Pain Points [38:05] Fixing Gaps Between Data, AI [46:04] Smarter Decisions Through Better Data [50:18] LLMs as Human-Machine Interfaces [53:40] Why Summarization Still Falls Short [01:01:15] Smarter Chunking, Fixing Text Issues [01:09:08] Evaluating AI with Canary Pipelines [01:11:46] Finding Use Cases That Matter [01:17:38] Cutting Costs, Keeping AI Quality [01:25:15] Aligning MLOps to Business Outcomes [01:29:44] Communities Thrive on Cross-Pollination [01:34:56] Evaluation Tools Quietly Consolidating…

MLOps.community

Greg Kamradt: Benchmarking Intelligence | ARC Prize 48:30

7 weeks ago

What makes a good AI benchmark? Greg Kamradt joins Demetrios to break it down, from human-easy, AI-hard puzzles to wild new games that test how fast models can truly learn. They talk hidden datasets, compute tradeoffs, and why benchmarks might be our best bet for tracking progress toward AGI. It’s nerdy, strategic, and surprisingly philosophical.

// Bio
Greg has mentored thousands of developers and founders, empowering them to build AI-centric applications. By crafting tutorial-based content, Greg aims to guide everyone from seasoned builders to ambitious indie hackers. Greg partners with companies during their product launches, feature enhancements, and funding rounds. His objective is to cultivate not just awareness, but also a practical understanding of how to optimally utilize a company's tools. He previously led Growth @ Salesforce for Sales & Service Clouds in addition to being early on at Digits, a FinTech Series-C company.

// Related Links
Website: https://gregkamradt.com/
YouTube channel: https://www.youtube.com/@DataIndependent

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Greg on LinkedIn: /gregkamradt/

Timestamps:
[00:00] Human-Easy, AI-Hard
[05:25] When the Model Shocks Everyone
[06:39] “Let’s Circle Back on That Benchmark…”
[09:50] Want Better AI? Pay the Compute Bill
[14:10] Can We Define Intelligence by How Fast You Learn?
[16:42] Still Waiting on That Algorithmic Breakthrough
[20:00] LangChain Was Just the Beginning
[24:23] Start With Humans, End With AGI
[29:01] What If Reality’s Just... What It Seems?
[32:21] AI Needs Fewer Vibes, More Predictions
[36:02] Defining Intelligence (No Pressure)
[36:41] AI Building AI? Yep, We're Going There
[40:13] Open Source vs. Prize Money Drama
[43:05] Architecting the ARC Challenge
[46:38] Agent 57 and the Atari Gauntlet…

MLOps.community

Bridging the Gap Between AI and Business Data // Deepti Srivastava // #325 57:13

7 weeks ago

Bridging the Gap Between AI and Business Data // MLOps Podcast #325 with Deepti Srivastava, Founder and CEO at Snow Leopard. Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // Abstract I’m sure the MLOps community is probably aware – it's tough to make AI work in enterprises for many reasons, from data silos, data privacy and security concerns, to going from POCs to production applications. But one of the biggest challenges facing businesses today, that I particularly care about, is how to unlock the true potential of AI by leveraging a company’s operational business data. At Snow Leopard, we aim to bridge the gap between AI systems and critical business data that is locked away in databases, data warehouses, and other API-based systems, so enterprises can use live business data from any data source – whether it's database, warehouse, or APIs – in real time and on demand, natively. In this interview, I'd like to cover Snow Leopard’s intelligent data retrieval approach that can leverage business data directly and on-demand to make AI work. // Bio Deepti is the founder and CEO of Snow Leopard AI, a platform that helps teams build AI apps using their live business data, on-demand. She has nearly 2 decades of experience in data platforms and infrastructure. As Head of Product at Observable, Deepti led the 0→1 product and GTM strategy in the crowded data analytics market. Before that, Deepti was the founding PM for Google Spanner, growing it to thousands of internal customers (Ads, PlayStore, Gmail, etc.), before launching it externally as a seminal cloud database service. Deepti started her career as a distributed systems engineer in the RAC database kernel at Oracle. 
// Related Links Website: https://www.snowleopard.ai/ AI SQL Data Analyst // Donné Stevenson - https://youtu.be/hwgoNmyCGhQ ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore Join our Slack community [ https://go.mlops.community/slack ] Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin) Sign up for the next meetup: [ https://go.mlops.community/register ] MLOps Swag/Merch: [ https://shop.mlops.community/ ] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Deepti on LinkedIn: /thedeepti/ Timestamps: [00:00] Deepti's preferred coffee [00:49] MLflow vs Kubeflow Debate [04:58] GenAI Data Integration Challenges [09:02] GenAI Sidecar Spicy Takes [14:07] Troubleshooting LLM Hallucinations [19:03] AI Overengineering and Hype [25:06] Self-Serve Analytics Governance [33:29] Dashboards vs Data Quality [37:06] Agent Database Context Control [43:00] LLM as Orchestrator [47:34] Tool Call Ownership Clarification [51:45] MCP Server Challenges [56:52] Wrap up…

MLOps.community

The Creator of FastAPI’s Next Chapter // Sebastián Ramírez // #324 1:09:37

7 weeks ago

The Creator of FastAPI’s Next Chapter // MLOps Podcast #324 with Sebastián Ramírez, Developer at FastAPI Labs. Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // Abstract The creator of FastAPI is back with a new chapter: FastAPI Cloud. From building one of the most loved dev tools to launching a company, Sebastián Ramírez shares how open source, developer experience, and a dash of humor are shaping the future of APIs. // Bio Sebastián Ramírez (also known as Tiangolo) is the creator of FastAPI, Typer, SQLModel, Asyncer, and several other widely used open-source tools. He has collaborated with companies and teams around the world (from Latin America to the Middle East, Europe, and the United States), building a range of products and custom solutions focused on APIs, data processing, distributed systems, and machine learning. Today, he works full time on FastAPI and its growing ecosystem. // Related Links Website: https://tiangolo.com/ FastAPI: https://fastapi.tiangolo.com/ FastAPI Cloud: https://fastapicloud.com/ FastAPI for Machine Learning // Sebastián Ramírez // MLOps Coffee Sessions #96 - https://youtu.be/NpvRhZnkEFg ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore Join our Slack community [https://go.mlops.community/slack] Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin) Sign up for the next meetup: [ https://go.mlops.community/register ] MLOps Swag/Merch: [ https://shop.mlops.community/ ] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Tiangolo on LinkedIn: /tiangolo Timestamps: [00:00] Sebastián's preferred coffee [00:15] Takeaways [01:43] Why Pydantic is Awesome [06:47] ML Background and FastAPI [10:44] NASA FastAPI Emojis [15:21] FastAPI Cloud Journey [26:07] FastAPI Cloud Open-Source Balance [31:45] Basecamp Design Philosophy 
[35:30] AI Abstraction Strategies [42:56] Engineering vs Developer Experience [51:40] Dogfooding and Docs Strategy [59:44] Code Simplicity and Trust [1:04:26] Scaling Without Losing Vision [1:08:20] FastAPI Cloud Signup [1:09:23] Wrap up…

MLOps.community

Everything Hard About Building AI Agents Today 47:02

8 weeks ago

Willem Pienaar and Shreya Shankar discuss the challenge of evaluating agents in production where "ground truth" is ambiguous and subjective user feedback isn't enough to improve performance. The discussion breaks down the three "gulfs" of human-AI interaction (Specification, Generalization, and Comprehension) and their impact on agent success. Willem and Shreya cover the necessity of moving the human "out of the loop" for feedback, creating faster learning cycles through implicit signals rather than direct, manual review. The conversation details practical evaluation techniques, including analyzing task failures with heat maps and the trade-offs of using simulated environments for testing. Willem and Shreya address the reality of a "performance ceiling" for AI and the importance of categorizing problems your agent can, can learn to, or will likely never be able to solve. // Bio Shreya Shankar is a PhD student in data management for machine learning. Willem Pienaar, CTO of Cleric, is a builder with a focus on LLM agents, MLOps, and open source tooling. He is the creator of Feast, an open source feature store, and contributed to the creation of both the feature store and MLOps categories. Before starting Cleric, Willem led the open source engineering team at Tecton and established the ML platform team at Gojek, where he built high-scale ML systems for the Southeast Asian decacorn. 
// Related Links https://www.google.com/about/careers/applications/?utm_campaign=profilepage&utm_medium=profilepage&utm_source=linkedin&src=Online/LinkedIn/linkedin_page https://cleric.ai/ ~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~ Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore MLOps Swag/Merch: [ https://shop.mlops.community/ ] Connect with Demetrios on LinkedIn: /dpbrinkm Connect with Shreya on LinkedIn: /shrshnk Connect with Willem on LinkedIn: /willempienaar Timestamps: [00:00] Trust Issues in AI Data [04:49] Cloud Clarity Meets Retrieval [09:37] Why Fast AI Is Hard [11:10] Fixing AI Communication Gaps [14:53] Smarter Feedback for Prompts [19:23] Creativity Through Data Exploration [23:46] Helping Engineers Solve Faster [26:03] The Three Gaps in AI [28:08] Alerts Without the Noise [33:22] Custom vs General AI [34:14] Sharpening Agent Skills [40:01] Catching Repeat Failures [43:38] Rise of Self-Healing Software [44:12] The Chaos of Monitoring AI…




