
Content provided by O'Reilly Media. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by O'Reilly Media or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://fa.player.fm/legal

Tools for machine learning development

39:24
 

This week’s episode of the Data Show features an interview that host Ben Lorica gave on the Software Engineering Daily podcast, where he spoke with Jeff Meyerson. Their conversation centered on data engineering, data architecture and infrastructure, and machine learning (ML).

Here are a few highlights:

Tools for productive collaboration

A data catalog, at a high level, basically answers questions around the data that’s available and who is using it so an enterprise can understand access patterns. … The term “data science platform” is generally used when you’ve gotten to the point where you have a team of data scientists and you need a place where they can use libraries in a setting where they can collaborate, and where they can share not only models but maybe even data pipelines and features. The more advanced data science platforms will have automation tools built in. … The ideal scenario is that the data science platform is not just for prototyping, but also for pushing things to production.
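
To make the data catalog idea concrete, here is a minimal sketch in Python of the two questions mentioned above: what data is available, and who is using it. The class, method, dataset, and user names are all hypothetical and chosen only for illustration; they do not correspond to any particular catalog product.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DataCatalog:
    """Toy catalog (hypothetical): tracks which datasets exist and who reads them."""

    datasets: dict = field(default_factory=dict)    # dataset name -> description
    access_log: dict = field(default_factory=dict)  # dataset name -> Counter of users

    def register(self, name, description):
        # Make a dataset discoverable and start an empty access record for it.
        self.datasets[name] = description
        self.access_log.setdefault(name, Counter())

    def record_access(self, name, user):
        # Count one read of a registered dataset by a given user.
        if name not in self.datasets:
            raise KeyError(f"unknown dataset: {name}")
        self.access_log[name][user] += 1

    def available(self):
        # "What data is available?"
        return sorted(self.datasets)

    def users_of(self, name):
        # "Who is using it, and how often?"
        return dict(self.access_log[name])


catalog = DataCatalog()
catalog.register("clickstream", "raw web events")
catalog.register("orders", "cleaned order records")
catalog.record_access("orders", "alice")
catalog.record_access("orders", "alice")
catalog.record_access("orders", "bob")

print(catalog.available())
print(catalog.users_of("orders"))
```

A real catalog would also carry schemas, ownership, and lineage; the sketch keeps only the availability and access-pattern parts described in the quote.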

Tools for ML development

We have tools for software development, and now we’re beginning to hear about tools for machine learning development. There’s a company here at Strata called Comet.ml, and there’s another startup called Verta.ai. But what has really caught my attention is an open source project from Databricks called MLflow. When it first came out, I thought, ‘Oh, yeah, so we don’t have anything like this. Might have a decent chance of success.’ But I didn’t pay close attention until recently; fast forward to today, there are 80 contributors from 40 companies, and more than 200 companies are using it.

What’s good about MLflow is that it has three components and you’re free to pick and choose: you can use one, two, or all three. Based on their surveys, the most popular component is the one for tracking and managing machine learning experiments. It’s designed to be useful for individual data scientists, but also for teams of data scientists, so they have documented use cases of MLflow where a company is managing thousands of models in production.
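
As a rough illustration of what tracking and managing experiments involves, the sketch below records the parameters and metrics of each training run and then picks the best run. It is a standard-library toy and is not MLflow's API; the `ExperimentTracker` and `Run` names are hypothetical, and the metric values are hard-coded placeholders rather than real results.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Run:
    """One training run: the settings used and the scores observed."""

    run_id: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)


class ExperimentTracker:
    """Toy tracker (hypothetical, not MLflow): keeps every run so they can be compared."""

    def __init__(self):
        self.runs = []

    def start_run(self, **params):
        # Record the parameters up front; metrics are filled in after training.
        run = Run(run_id=uuid.uuid4().hex[:8], params=dict(params))
        self.runs.append(run)
        return run

    def best_run(self, metric, lower_is_better=True):
        # Compare only the runs that actually logged the requested metric.
        scored = [r for r in self.runs if metric in r.metrics]
        if not scored:
            return None
        pick = min if lower_is_better else max
        return pick(scored, key=lambda r: r.metrics[metric])


tracker = ExperimentTracker()
# Placeholder (learning rate, error) pairs standing in for real training runs.
for lr, rmse in [(0.1, 0.42), (0.01, 0.31), (0.001, 0.37)]:
    run = tracker.start_run(learning_rate=lr)
    run.metrics["rmse"] = rmse

best = tracker.best_run("rmse")
print(best.params, best.metrics)
```

The same record-then-compare pattern is what makes a tracking tool useful to both a single data scientist and a team: every run leaves behind its settings and scores in one shared place.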


168 episodes
