
Content provided by Real Python. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Real Python or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://fa.player.fm/legal

Speeding Up Your DataFrames With Polars

57:57
 

How can you get more performance from your existing data science infrastructure? What if a DataFrame library could take advantage of your machine’s available cores and provide built-in methods for handling larger-than-RAM datasets? This week on the show, Liam Brannigan is here to discuss Polars.

Liam is an experienced data scientist working in finance, technology, and environmental analysis. He’s recently started contributing to the documentation for Polars and developing a training course for the library.

We talk about the library’s overall speed and lack of additional dependencies. Liam explains the advantages of lazy vs eager mode and which to choose when performing data exploration or attempting to load a dataset larger than your RAM.

We also discuss potential barriers to switching to Polars from a pandas workflow. Across our conversation, we explore several other libraries and technologies, including Apache Arrow, DuckDB, query optimization, and the “rustification” of Python tools.

Course Spotlight: Graph Your Data With Python and ggplot

In this course, you’ll learn how to use ggplot in Python to build data visualizations with plotnine. You’ll discover what a grammar of graphics is and how it can help you create plots in a very concise and consistent way.

Show Topics:

  • 00:00:00 – Introduction
  • 00:02:06 – Liam’s background and intro to Polars
  • 00:03:37 – Hurdles to switching to Polars
  • 00:05:23 – Creating training resources
  • 00:08:15 – No index
  • 00:09:46 – Data science 2025 predictions
  • 00:12:02 – Contributions to Polars
  • 00:15:07 – Eager vs lazy mode & query optimization
  • 00:19:25 – Sponsor: Anaconda Nucleus
  • 00:20:00 – Apache Arrow and parquet
  • 00:24:43 – DuckDB and column orientation
  • 00:29:27 – The “rustification” of libraries
  • 00:34:49 – Video Course Spotlight
  • 00:36:16 – GPUs and memory requirements
  • 00:45:49 – No additional library requirements
  • 00:47:37 – Development of the ecosystem
  • 00:51:33 – Chaining operations
  • 00:53:39 – How can people follow your work?
  • 00:54:51 – What are you excited about in the world of Python?
  • 00:56:09 – What do you want to learn next?
  • 00:56:58 – Thanks and goodbye

Show Links:

Level up your Python skills with our expert-led courses:

Support the podcast & join our community of Pythonistas


272 episodes

