Content provided by Jonas Christensen. All podcast content, including episodes, graphics and podcast descriptions, is uploaded and provided directly by Jonas Christensen or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://fa.player.fm/legal

The Dos and Don’ts of Synthetic Data with Minhaaj Rehman

43:56

Ever heard of ‘synthetic data’?

Synthetic data is data that is artificially created (from statistical models), rather than generated by actual events. It contains all the characteristics of production data, minus the sensitive stuff.
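
As a rough illustration of "artificially created from statistical models", the sketch below fits a simple model to a handful of records and then samples new rows from it. It is a minimal sketch under stated assumptions, not the method discussed in the episode: the records, the column meanings (age, annual income) and the helper names `fit_column_models` and `sample_synthetic` are all hypothetical, and the model is deliberately naive (one independent normal distribution per column).

```python
from statistics import NormalDist


def fit_column_models(rows):
    """Fit one normal distribution per numeric column of the real data."""
    columns = list(zip(*rows))
    return [NormalDist.from_samples(col) for col in columns]


def sample_synthetic(models, n, seed=0):
    """Draw n synthetic rows; each column is sampled independently."""
    columns = [m.samples(n, seed=seed + i) for i, m in enumerate(models)]
    return [tuple(values) for values in zip(*columns)]


# Hypothetical "real" records: (age, annual_income). The values are made up.
real_rows = [
    (34, 52000.0),
    (45, 61000.0),
    (29, 48000.0),
    (52, 75000.0),
    (41, 58000.0),
]

models = fit_column_models(real_rows)
synthetic_rows = sample_synthetic(models, n=1000)

# The synthetic rows follow the fitted per-column distributions,
# but none of them is a copy of an actual record.
print(synthetic_rows[:3])
```

Because each column is sampled independently here, any relationship between age and income in the original records is lost. That is a small example of how the modelling choices shape what the synthetic data can and cannot represent.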

By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated, according to Gartner.

Organisations may choose synthetic data over actual data because it can be obtained more quickly, easily and cheaply.

But there are concerns with this approach, because synthetic data is based on models and algorithms designed by humans, and so it can carry their biases.

More data doesn’t necessarily equal better data.

Is synthetic data a brilliant tool for improving data quality, reducing data acquisition costs, managing privacy and reducing overfitting?

Or does synthetic data put us on a slippery slope of hard-to-interrogate models that are technically replacing fact with fiction?

To answer these questions, I recently spoke to Minhaaj Rehman, who is CEO & Chief Data Scientist at Psyda, an AI-enabled academic and industrial research agency.

In this episode of Leaders of Analytics, you will learn:

  • What synthetic data is and how it is generated
  • The most common uses for synthetic data
  • The arguments for and against using synthetic data
  • When synthetic data is most helpful and when it is most risky
  • How to implement best practices for mitigating the risks associated with synthetic data, and much more.

Episode timestamps:

00:00 Intro

03:00 What Psyda Does

04:23 Academic Work and Modern Education

06:38 Getting into Data Science

11:30 What is Synthetic Data

13:30 Common Applications for Synthetic Data

18:50 Pros & Cons of using Synthetic Data

21:29 Risks of using Synthetic Data

23:48 When should Synthetic Data be Used

29:23 Synthetic Data is Cleaner than Real Data

34:05 Using Synthetic Data for Risk Mitigation

36:05 Resources on Learning More about Synthetic Data

38:05 Human Biases in Decision Making

Connect with Minhaaj:

Minhaaj on LinkedIn: https://www.linkedin.com/in/minhaaj/

Minhaaj's website and podcast: https://minhaaj.com/

61 episodes
