Player FM - Internet Radio Done Right
33 subscribers
Checked 1d ago
Added eight years ago
Content provided by The Data Flowcast. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The Data Flowcast or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://fa.player.fm/legal
Season One Teaser
A sneak peek at our upcoming podcast about Apache Airflow. Featured in this clip (in order of appearance):
Pete DeJoy - Product Specialist at Astronomer
Patrick Atwater - Water Data Projects Manager at ARGO Labs
Maksime Pecherskiy - Chief Data Officer of the City of San Diego
Bolke de Bruin - Head of Advanced Analytics at ING
78 episodes
All episodes
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
The life sciences industry relies on data accuracy, regulatory insight and quality intelligence. Building a unified system that keeps these elements aligned is no small feat. In this episode, we welcome Shankar Mahindar, Senior Data Engineer II at Redica Systems. We discuss how the team restructures its data platform with Airflow to strengthen governance, reduce compliance risk and improve customer experience.

Key Takeaways:
00:00 Introduction.
01:53 A focused analytics platform reduces compliance risk in life sciences.
07:31 A centralized warehouse orchestrated by Airflow strengthens governance.
09:12 Managed orchestration keeps attention on analytics and outcomes.
10:32 A modern transformation stack enables scalable modeling and operations.
11:51 Event-driven pipelines improve data freshness and responsiveness.
14:13 Asset-oriented scheduling and versioning enhance reliability and change control.
16:53 Observability and SLAs build confidence in data quality and freshness.
21:04 Priorities include partitioned assets and streamlined developer tooling.

Resources Mentioned:
Shankar Mahindar: https://www.linkedin.com/in/shankar-mahindar-83a61b137/
Redica Systems | LinkedIn: https://www.linkedin.com/company/redicasystems/
Redica Systems | Website: https://redica.com
Apache Airflow: https://airflow.apache.org/
Astronomer: https://www.astronomer.io/
Snowflake: https://www.snowflake.com/
AWS: https://aws.amazon.com/

Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations. #AI #Automation #Airflow #MachineLearning…
How Airflow and AI Power Investigative Journalism at the Financial Times with Zdravko Hvarlingov (24:28)
The Financial Times leverages Airflow and AI to uncover powerful stories hidden within vast, unstructured data. In this episode, Zdravko Hvarlingov, Senior Software Engineer at the Financial Times, discusses building multi-tenant Airflow systems and AI-driven pipelines that surface stories that might otherwise be missed. Zdravko walks through entity extraction and fuzzy matching, linking the UK Register of Members’ Financial Interests with Companies House, and how this work cuts weeks of manual analysis to minutes.

Key Takeaways:
00:00 Introduction.
02:12 What computational journalism means for day-to-day newsroom work.
05:22 Why a shared orchestration platform supports consistent, scalable workflows.
08:30 Tradeoffs of one centralized platform versus many separate instances.
11:52 Using pipelines to structure messy sources for faster analysis.
14:14 Turning recurring disclosures into usable data for investigations.
16:03 Applying lightweight ML and matching to reveal entities and links.
18:46 How automation reduces manual effort and shortens time to insight.
20:41 Practical improvements that make backfilling and reliability easier.

Resources Mentioned:
Zdravko Hvarlingov: https://www.linkedin.com/in/zdravko-hvarlingov-3aa36016b/
Financial Times | LinkedIn: https://www.linkedin.com/company/financial-times/
Financial Times | Website: https://www.ft.com/
Apache Airflow: https://airflow.apache.org/
UK Register of Members’ Financial Interests: https://www.parliament.uk/mps-lords-and-offices/standards-and-financial-interests/parliamentary-commissioner-for-standards/registers-of-interests/register-of-members-financial-interests/
UK Companies House: https://www.gov.uk/government/organisations/companies-house
Doppler: https://www.doppler.com/
Kubernetes: https://kubernetes.io/
Airflow Kubernetes Executor: https://airflow.apache.org/docs/apache-airflow/stable/executor/kubernetes.html
GitHub: https://github.com/
The shift from monolithic to decentralized data workflows changes how teams build, connect and scale pipelines. In this episode, we feature Oscar Ligthart, Lead Data Engineer, and Rodrigo Loredo, Lead Analytics Engineer, both at Vinted, as we unpack their YAML-driven abstraction that generates Airflow DAGs and standardizes cross-team orchestration.

Key Takeaways:
00:00 Introduction.
05:28 Challenges of decentralization.
06:45 YAML-based generator standardizes pipelines and dependencies.
12:28 Declarative assets and sensors align cross-DAG dependencies.
17:29 Task-level callbacks enable auto-recovery and clear ownership.
21:39 Standardized building blocks simplify upgrades and maintenance.
24:52 Platform focus frees domain work.
26:49 Container-only standardization prevents sprawl.

Resources Mentioned:
Oscar Ligthart: https://www.linkedin.com/in/oscar-ligthart/
Rodrigo Loredo: https://www.linkedin.com/in/rodrigo-loredo-410a16134/
Vinted | LinkedIn: https://www.linkedin.com/company/vinted/
Vinted | Website: https://www.vinted.com/?srsltid=AfmBOor87MGR_eLOauCO93V9A-aLDaAhGYx9cnu_oN8s1SAXMlCRuhW7
Apache Airflow: https://airflow.apache.org/
Kubernetes: https://kubernetes.io/
dbt: https://www.getdbt.com/
Google Cloud Vertex AI: https://cloud.google.com/vertex-ai
Airflow Datasets & Assets (concepts): https://www.astronomer.io/docs/learn/airflow-datasets
Airflow Summit: https://airflowsummit.org/
The shift from simple cron jobs to orchestrated AI-powered workflows is reshaping how startups scale. For a small team, these transitions come with unique challenges and big opportunities. In this episode, Naseem Shah, Head of Engineering at Xena Intelligence, shares how he built data pipelines from scratch, adopted Apache Airflow and transformed Amazon review analysis with LLMs.

Key Takeaways:
00:00 Introduction.
03:28 The importance of building initial products that support growth and investment.
06:16 The process of adopting new tools to improve reliability and efficiency.
09:29 Approaches to learning complex technologies through practice and fundamentals.
13:57 Trade-offs small teams face when balancing performance and costs.
18:40 Using AI-driven approaches to generate insights from large datasets.
22:38 How unstructured data can be transformed into actionable information.
25:55 Moving from manual tasks to fully automated workflows.
28:05 Orchestration as a foundation for scaling advanced use cases.

Resources Mentioned:
Naseem Shah: https://www.linkedin.com/in/naseemshah/
Xena Intelligence | LinkedIn: https://www.linkedin.com/company/xena-intelligence/
Xena Intelligence | Website: https://xenaintelligence.com/
Apache Airflow: https://airflow.apache.org/
Google Cloud Composer: https://cloud.google.com/composer
Techstars: https://www.techstars.com/
Docker: https://www.docker.com/
AWS SQS: https://aws.amazon.com/sqs/
PostgreSQL: https://www.postgresql.org/
Scaling Geospatial Workflows With Airflow at Overture Maps Foundation and Wherobots with Alex Iannicelli and Daniel Smith (24:03)
Using Airflow to orchestrate geospatial data pipelines unlocks powerful efficiencies for data teams. The combination of scalable processing and visual observability streamlines workflows, reduces costs and improves iteration speed. In this episode, Alex Iannicelli, Staff Software Engineer at Overture Maps Foundation, and Daniel Smith, Senior Solutions Architect at Wherobots, join us to discuss leveraging Apache Airflow and Apache Sedona to process massive geospatial datasets, build reproducible pipelines and orchestrate complex workflows across platforms.

Key Takeaways:
00:00 Introduction.
03:22 How merging multiple data sources supports comprehensive datasets.
04:20 The value of flexible configurations for running pipelines on different platforms.
06:35 Why orchestration tools are essential for handling continuous data streams.
09:45 The importance of observability for monitoring progress and troubleshooting issues.
11:30 Strategies for processing large, complex datasets efficiently.
13:27 Expanding orchestration beyond core pipelines to automate frequent tasks.
17:02 Advantages of using open-source operators to simplify integration and deployment.
20:32 Desired improvements in orchestration tools for usability and workflow management.

Resources Mentioned:
Alex Iannicelli: https://www.linkedin.com/in/atiannicelli/
Overture Maps Foundation | LinkedIn: https://www.linkedin.com/company/overture-maps-foundation/
Overture Maps Foundation | Website: https://overturemaps.org
Daniel Smith: https://www.linkedin.com/in/daniel-smith-analyst/
Wherobots | LinkedIn: https://www.linkedin.com/company/wherobots
Wherobots | Website: https://www.wherobots.com
Apache Airflow: https://airflow.apache.org/
Apache Sedona: https://sedona.apache.org/
GitHub repo: https://github.com/wherobots/airflow-providers-wherobots
PepsiCo’s data platform drives insights across finance, marketing and data science. Delivering stability, scalability and developer delight is central to its success, and engineering leadership plays a key role in making this possible. In this episode, Kunal Bhattacharya, Senior Manager of Data Platform Engineering at PepsiCo, shares how his team manages Airflow at scale while ensuring security, performance and cost efficiency.

Key Takeaways:
00:00 Introduction.
02:31 Enabling developer delight by extending platform capabilities.
03:56 Role of Snowflake, dbt and Airflow in PepsiCo’s data stack.
06:10 Local developer environments built using official Airflow Helm charts.
07:13 Pre-staging and PR environments as testing playgrounds.
08:08 Automating labeling and resource allocation via DAG factories.
12:16 Cost optimization through pod labeling and Datadog insights.
14:01 Isolating dbt engines to improve performance across teams.
16:12 Wishlist for Airflow 3: Improved role-based grants and database modeling.

Resources Mentioned:
Kunal Bhattacharya: https://www.linkedin.com/in/kunaljubce/
PepsiCo | LinkedIn: https://www.linkedin.com/company/pepsico/
PepsiCo | Website: https://www.pepsico.com
Apache Airflow: https://airflow.apache.org/
Snowflake: https://www.snowflake.com
dbt: https://www.getdbt.com
Kubernetes: https://kubernetes.io
Great Expectations: https://greatexpectations.io
Monte Carlo: https://www.montecarlodata.com
The orchestration of data workflows at scale requires both flexibility and security. At Pattern, decoupling scheduling from orchestration has reshaped how data teams manage large-scale pipelines. In this episode, we are joined by William Graham, Senior Data Engineer at Pattern, who explains how his team leverages Apache Airflow alongside their open-source tool Heimdall to streamline scheduling, orchestration and access management.

Key Takeaways:
00:00 Introduction.
02:44 Structure of Pattern’s data teams across acquisition, engineering and platform.
04:27 How Airflow became the central scheduler for batch jobs.
08:57 Credential management challenges that led to decoupling scheduling and orchestration.
12:21 Heimdall simplifies multi-application access through a unified interface.
13:15 Standardized operators in Airflow using Heimdall integration.
17:13 Open-source contributions and early adoption of Heimdall within Pattern.
21:01 Community support for Airflow and satisfaction with scheduling flexibility.

Resources Mentioned:
William Graham: https://www.linkedin.com/in/willgraham2/
Pattern | LinkedIn: https://www.linkedin.com/company/pattern-hq/
Pattern | Website: https://pattern.com
Apache Airflow: https://airflow.apache.org
Heimdall on GitHub: https://github.com/patterninc/heimdall
Netflix Genie: https://netflix.github.io/genie/
The evolution of Airflow continues to shape data orchestration and monitoring strategies. Leveraging it beyond traditional ETL use cases opens powerful new possibilities for proactive support and internal operations. In this episode, we are joined by Collin McNulty, Sr. Director of Global Support at Astronomer, who shares insights from his journey into data engineering and the lessons learned from leading Astronomer’s Customer Reliability Engineering (CRE) team.

Key Takeaways:
00:00 Introduction.
03:07 Lessons learned in adapting to major platform transitions.
05:18 How proactive monitoring improves reliability and customer experience.
08:10 Using automation to enhance internal support processes.
12:09 Why keeping systems current helps avoid unnecessary issues.
15:14 Approaches that strengthen system reliability and efficiency.
18:46 Best practices for simplifying complex orchestration dependencies.
23:24 Anticipated innovations that expand orchestration capabilities.

Resources Mentioned:
Collin McNulty: https://www.linkedin.com/in/collin-mcnulty/
Astronomer | LinkedIn: https://www.linkedin.com/company/astronomer/
Astronomer | Website: https://www.astronomer.io
Apache Airflow: https://airflow.apache.org/
Prometheus: https://prometheus.io/
Splunk: https://www.splunk.com/
The shift to a unified data platform is reshaping how pharmaceutical companies manage and orchestrate data. Establishing standards across regions and teams ensures scalability and efficiency in handling large-scale analytics. In this episode, Evgenii Prusov, Senior Data Platform Engineer of Daiichi Sankyo Europe GmbH, joins us to discuss building and scaling a centralized data platform with Airflow and Astronomer.

Key Takeaways:
00:00 Introduction.
02:49 Building a centralized data platform for 15 European countries.
05:19 Adopting SaaS to manage Airflow from day one.
07:01 Leveraging Airflow for data orchestration across products.
08:16 Teaching non-Python users how to work with Airflow is challenging.
12:25 Creating a global data community across Europe, the US and Japan.
14:04 Monthly calls help share knowledge and align regional teams.
15:47 Contributing to the open-source Airflow project as a way to deepen expertise.
16:32 Desire for more guidelines, debugging tutorials and testing best practices in Airflow.

Resources Mentioned:
Evgenii Prusov: https://www.linkedin.com/in/prusov/
Daiichi Sankyo Europe GmbH | LinkedIn: https://www.linkedin.com/company/daiichi-sankyo-europe-gmbh/
Daiichi Sankyo Europe GmbH | Website: https://www.daiichi-sankyo.eu
Apache Airflow: https://airflow.apache.org/
Astronomer: https://www.astronomer.io/
Snowflake: https://www.snowflake.com/
StyleSeat is revolutionizing how beauty and wellness professionals grow their businesses through data-driven tools. From streamlining scheduling to optimizing marketing, their platform empowers professionals to focus on their craft while expanding their client base. In this episode, Paschal Onuorah, Senior Data Engineer at StyleSeat, shares how the company leverages Airflow, dbt, and Cosmos to drive marketplace intelligence, improve client connections and deliver measurable growth for professionals.

Key Takeaways:
00:00 Introduction.
05:44 The role of the data engineering team in driving business success.
08:52 Leveraging technology for real-time business intelligence.
10:52 Data-driven strategies for improving marketing outcomes.
13:05 How adopting the right tools can increase revenue growth.
14:25 Advantages of simplifying and integrating technical workflows.
18:45 Benefits of multi-environment configurations for development and production.
20:17 Foundational skills and best practices for learning Airflow effectively.
22:33 Opportunities for deeper tool integration and improved data visualization.

Resources Mentioned:
Paschal Onuorah: https://www.linkedin.com/in/onuorah-paschal/
StyleSeat | LinkedIn: https://www.linkedin.com/company/styleseat/
StyleSeat | Website: https://www.styleseat.com
Apache Airflow: https://airflow.apache.org/
dbt: https://www.getdbt.com/
Astronomer Cosmos: https://www.astronomer.io/cosmos/
The evolution of orchestration in Airflow continues with innovations that address both scalability and security. From improving executor reliability to enabling remote execution, these advancements reshape how organizations manage data pipelines. In this episode, we’re joined by Ian Buss, Principal Software Engineer at Astronomer, and Piotr Chomiak, Principal Product Manager at Astronomer, who share insights into the Astro Executor and remote execution.

Key Takeaways:
00:00 Introduction.
04:13 How product leadership drives scalability for enterprise needs.
08:23 Architectural changes that improve reliability and remove bottlenecks.
10:15 Metrics that enhance visibility into system performance.
12:54 The role of remote execution in addressing security requirements.
15:56 Differences between open-source solutions and managed offerings.
19:04 Broad industry adoption and applicability of remote execution.
20:39 Future advancements in language support and multi-tenancy.

Resources Mentioned:
Ian Buss: https://www.linkedin.com/in/ian-buss/
Piotr Chomiak: https://www.linkedin.com/in/piotr-chomiak-b1955624/
Astronomer | Website: https://www.astronomer.io
Apache Airflow: https://airflow.apache.org/
Airflow Slack Community: https://airflow.apache.org/community/
Beyond Analytics conference: https://astronomer.io/beyond/dataflowcast
Scaling 2,000+ data pipelines isn’t easy. But with the right tools and a self-hosted mindset, it becomes achievable. In this episode, Sébastien Crocquevieille, Data Engineer at Numberly, unpacks how the team scaled their on-prem Airflow setup using open-source tooling and Kubernetes. We explore orchestration strategies, UI-driven stakeholder access and Airflow’s evolving features.

Key Takeaways:
00:00 Introduction.
02:13 Overview of the company’s operations and global presence.
04:00 The tech stack and structure of the data engineering team.
04:24 Running nearly 2,000 DAGs in production using Airflow.
05:42 How Airflow’s UI empowers stakeholders to self-serve and troubleshoot.
07:05 Details on the Kubernetes-based Airflow setup using Helm charts.
09:31 Transition from GitSync to NFS for DAG syncing due to performance issues.
14:11 Making every team member Airflow-literate through local installation.
17:56 Using custom libraries and plugins to extend Airflow functionality.

Resources Mentioned:
Sébastien Crocquevieille: https://www.linkedin.com/in/scroc/
Numberly | LinkedIn: https://www.linkedin.com/company/numberly/
Numberly | Website: https://numberly.com/
Apache Airflow: https://airflow.apache.org/
Grafana: https://grafana.com/
Apache Kafka: https://kafka.apache.org/
Helm Chart for Apache Airflow: https://airflow.apache.org/docs/helm-chart/stable/index.html
Kubernetes: https://kubernetes.io/
GitLab: https://about.gitlab.com/
KubernetesPodOperator (Airflow): https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html
Beyond Analytics Conference: https://astronomer.io/beyond/dataflowcast
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Managing financial data at scale requires precise orchestration and proactive monitoring to maintain operational efficiency. In this episode, we are joined by Adeolu Adegboye, Data Engineer at Moniepoint Group, who shares how his team uses data pipelines and workflow automation to manage high volumes of transactions, ensure timely alerts and support diverse stakeholders across the business.
Key Takeaways:
(00:00) Introduction.
(02:48) The role of data engineering in supporting all business operations.
(04:17) Leveraging workflow orchestration to manage daily processes.
(05:20) Proactively monitoring for anomalies to prevent potential issues.
(08:12) Simplifying complex insights for non-technical teams.
(13:01) Improving efficiency through dynamic and parallel workflows.
(14:19) Optimizing system performance to handle large-scale operations.
(17:19) Exploring creative and innovative uses for workflow automation.
Resources Mentioned:
Adeolu Adegboye https://www.linkedin.com/in/adeolu-adegboye/
Moniepoint Group | LinkedIn https://www.linkedin.com/company/moniepoint-inc/
Moniepoint Group | Website https://www.moniepoint.com
Apache Airflow https://airflow.apache.org/
ClickHouse https://clickhouse.com/
Grafana https://grafana.com/
Beyond Analytics Conference https://astronomer.io/beyond/dataflowcast
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
The evolution of Airflow has reached a milestone with the introduction of remote execution in Airflow 3, enabling flexible orchestration across distributed environments. In this episode, Jens Scheffler, Test Execution Cluster Technical Architect at Bosch, shares insights on how his team’s need for large-scale, cross-environment testing influenced the development of the Edge Executor and shaped this major release.
Key Takeaways:
(02:39) The role of remote execution in supporting large-scale testing needs.
(04:44) How community support contributed to the Edge Executor’s development.
(08:41) Navigating network and infrastructure limitations within secure environments.
(13:25) Transitioning from database-heavy processes to an API-driven model.
(14:16) How the new task SDK in Airflow 3 improves distributed task execution.
(16:54) What is required to set up and configure the Edge Executor.
(19:36) Managing multiple queues to optimize tasks across different environments.
(23:30) Examples of extreme distance use cases for edge execution.
Resources Mentioned:
Jens Scheffler https://www.linkedin.com/in/jens-scheffler/
Bosch | LinkedIn https://www.linkedin.com/company/bosch/
Bosch | Website https://www.bosch.com/
Apache Airflow https://airflow.apache.org/
Edge Executor (Edge3 Provider Package) https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/index.html
Astronomer’s Astro Executor https://www.astronomer.io/docs/astro/astro-executor/
Beyond Analytics Conference https://astronomer.io/beyond/dataflowcast
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Managing modern data platforms means navigating a web of complex infrastructure, competing team needs and evolving security standards. For data teams to truly thrive, infrastructure must become both accessible and compliant without sacrificing velocity or reliability. In this episode, we’re joined by Cory O’Daniel, CEO and Co-Founder at Massdriver, and Jacob Ferriero, Senior Software Engineer at Astronomer, to unpack what it takes to make data platform engineering scalable, sustainable and secure. They share lessons from years of experience working with DevOps, ML teams and platform engineers and discuss how Airflow fits into the orchestration layer of today’s data stacks.
Key Takeaways:
(03:27) Making infrastructure accessible without deep ops knowledge.
(07:23) Distinct personas and responsibilities across data teams.
(09:53) Infrastructure hurdles specific to ML workloads.
(11:13) Compliance and governance shaping platform design.
(13:27) Tooling mismatches between teams cause friction.
(15:13) Airflow’s orchestration role within broader system architecture.
(22:10) Creating reusable infrastructure patterns for consistency.
(24:13) Enabling secure access without slowing down development.
(26:55) Opportunities to improve Airflow with event-driven and reliability tooling.
Resources Mentioned:
Cory O’Daniel https://www.linkedin.com/in/coryodaniel/
Massdriver | LinkedIn https://www.linkedin.com/company/massdriver/
Massdriver | Website https://www.massdriver.cloud/
Jacob Ferriero https://www.linkedin.com/in/jacob-ferriero/
Astronomer https://www.linkedin.com/company/astronomer/
Apache Airflow https://airflow.apache.org/
Prequel https://www.prequel.co/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Telemetry has the potential to guide the future of Airflow, but only if it’s implemented transparently and with community trust. In this episode, we’re joined by Bolke de Bruin, Director at Metyis and a long-time Airflow PMC member. Bolke discusses how telemetry has been handled in the past, why it matters now and what it will take to get it right.
Key Takeaways:
(03:20) The role of foundations in establishing credibility and sustainability.
(04:52) Why data collection is critical to open-source project direction.
(07:24) Lessons learned from previous approaches to user data collection.
(10:23) The current state of telemetry in the project.
(10:53) Community trust as a prerequisite for technical implementation.
(12:54) The importance of managing sensitive data within trusted ecosystems.
(16:37) Ethical considerations in balancing participation and access.
(18:45) Forward-looking ideas for improving workflow design and usability.
Resources Mentioned:
Bolke de Bruin https://www.linkedin.com/in/bolke/
Metyis | LinkedIn https://www.linkedin.com/company/metyis/
Metyis | Website http://www.metyis.com
Apache Airflow https://airflow.apache.org/
Airflow Summit https://airflowsummit.org/
Airflow Dev List https://lists.apache.org/list.html?dev@airflow.apache.org
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Contributing to open-source projects can be daunting, but it can also unlock unexpected innovation. This episode showcases how one engineer’s journey with Apache Airflow led to impactful UI enhancements and infrastructure solutions at scale. Shubham Raj, Software Engineer II at Cloudera, shares how his team built a drag-and-drop DAG editor for non-coders, and describes contributions that helped shape the Airflow 3.0 UI and introduced features like external XCom control and bulk APIs.
Key Takeaways:
(02:30) Day-to-day responsibilities building platforms that simplify orchestration.
(05:27) Factors that make onboarding into large open-source projects accessible.
(07:35) The value of improved user interfaces for task state visibility and control.
(09:49) Enabling faster debugging by exposing internal data through APIs.
(13:00) Balancing frontend design goals with backend functionality.
(14:19) Creating workflow editors that lower the barrier to entry.
(16:54) Supporting a variety of task types within a visual DAG builder.
(19:32) Common infrastructure challenges faced by orchestration users.
(20:37) Addressing dependency management across distributed environments.
Resources Mentioned:
Shubham Raj https://www.linkedin.com/in/shubhamrajofficial/
Cloudera | LinkedIn https://www.linkedin.com/company/cloudera/
Cloudera | Website https://www.cloudera.com/
Apache Airflow https://airflow.apache.org/
2023 Airflow Summit https://airflowsummit.org/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Managing data pipelines at scale is not just a technical challenge. It is also an organizational one. At Lyft, success means empowering dozens of teams to build with autonomy while enforcing governance and best practices across thousands of workflows. In this episode, we speak with Yunhao Qing, Software Engineer at Lyft, about building a governed data-engineering platform powered by Airflow that balances flexibility, standardization and scale.
Key Takeaways:
(03:17) Supporting internal teams with a centralized orchestration platform.
(04:54) Migrating to a managed service to reduce infrastructure overhead.
(06:04) Embedding platform-level governance into custom components.
(08:02) Consolidating and regulating the creation of custom code.
(09:48) Identifying and correcting inefficient workflow patterns.
(11:17) Replacing manual workarounds with native platform features.
(14:32) Preparing teams for major version upgrades.
(16:03) Leveraging asset-based scheduling for smarter triggers.
(18:13) Envisioning GenAI and semantic search for future productivity.
Resources Mentioned:
Yunhao Qing https://www.linkedin.com/in/yunhao-qing
Lyft | LinkedIn https://www.linkedin.com/company/lyft/
Lyft | Website https://www.lyft.com/
Apache Airflow https://airflow.apache.org/
Astronomer https://www.astronomer.io/
Kubernetes https://kubernetes.io/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Understanding the complexities of Apache Airflow can be daunting for newcomers and seasoned data engineers alike. But with the right guidance, mastering the tool becomes an achievable milestone. In this episode, Marc Lamberti, Head of Customer Education at Astronomer, joins us to share his journey from Udemy instructor to driving education at Astronomer, and how he's helping over 100,000 learners demystify Airflow.
Key Takeaways:
(02:36) Early exposure to Airflow while addressing inefficiencies in data workflows.
(04:10) Common barriers to implementing open source tools in enterprise settings.
(06:18) The shift from part-time teaching to a full-time focus on Airflow education.
(07:53) A modular, guided approach to structuring educational content.
(09:57) The value of highlighting underused Airflow features for broader adoption.
(12:35) Certifications as a method to assess readiness and uncover knowledge gaps.
(13:25) Coverage of essential Airflow concepts in the Fundamentals exam.
(16:07) The DAG Authoring exam’s emphasis on practical, advanced features.
(20:08) A call for more visible integration of Airflow with AI workflows.
Resources Mentioned:
Marc Lamberti https://www.linkedin.com/in/marclamberti/
Astronomer | LinkedIn https://www.linkedin.com/company/astronomer/
Astronomer Academy https://academy.astronomer.io/
Airflow Fundamentals Certification https://www.astronomer.io/certification/
DAG Authoring Certification https://academy.astronomer.io/plan/astronomer-certification-dag-authoring-for-apache-airflow-exam
The Complete Hands-On Introduction to Airflow https://www.udemy.com/course/the-complete-hands-on-course-to-master-apache-airflow/?utm_source=adwords&utm_medium=udemyads&utm_campaign=Search_DSA_Beta_Prof_la.EN_cc.ROW-English&campaigntype=Search&portfolio=ROW-English&language=EN&product=Course&test=&audience=DSA&topic=&priority=Beta&utm_content=deal4584&utm_term=_._ag_162511579404_._ad_696197165418_._kw__._de_c_._dm__._pl__._ti_dsa-1677053911088_._li_9061346_._pd__._&matchtype=&gad_source=1&gad_campaignid=21168154305&gbraid=0AAAAADROdO3MpljfP-gssiYSmDEPdhZV9&gclid=Cj0KCQjw097CBhDIARIsAJ3-nxdjZA6G5-Y0-akk6Huksy2PLb04t92J4iNfUSIbMdrSAla_tb-o2N8aArOeEALw_wcB&couponCode=PMNVD3025
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Embracing Data Mesh and SQL Sensors for Scalable Workflows at lastminute.com with Alberto Crespi (30:09)
The flexibility of Airflow plays a pivotal role in enabling decentralized data architectures and empowering cross-functional teams. In this episode, we speak with Alberto Crespi, Data Architect at lastminute.com, who shares how his team scales Airflow across 12 teams while supporting both vertical and horizontal structures under a data mesh approach.
Key Takeaways:
(02:17) Defining responsibilities within data architecture teams.
(04:15) Consolidating multiple orchestrators into a single solution.
(07:00) Scaling Airflow environments with shared infrastructure and DevOps practices.
(10:59) Managing dependencies and readiness using SQL sensors.
(14:23) Enhancing visibility and response through Slack-integrated monitoring.
(19:28) Extending Airflow’s flexibility to run legacy systems.
(22:28) Integrating transformation tools into orchestrated pipelines.
(25:54) Enabling non-engineers to contribute to pipeline development.
(27:33) Fostering adoption through collaboration and communication.
Resources Mentioned:
Alberto Crespi https://www.linkedin.com/in/crespialberto/
lastminute.com | Website https://lastminute.com
Apache Airflow https://airflow.apache.org/
dbt Labs https://www.getdbt.com/
Astronomer Cosmos https://github.com/astronomer/astronomer-cosmos
GitLab
Slack https://slack.com/
Kubernetes https://kubernetes.io/
Confluence https://www.atlassian.com/software/confluence
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Innovation in orchestration is redefining how engineers approach both traditional ETL pipelines and emerging AI workloads. Understanding how to harness Airflow’s flexibility and observability is essential for teams navigating today’s evolving data landscape. In this episode, Anu Pabla, Principal Engineer at The ODP Corporation, joins us to discuss her journey from legacy orchestration patterns to AI-native pipelines and why she sees Airflow as the future of AI workload orchestration.
Key Takeaways:
(03:43) Engaging with external technology communities fosters innovation.
(05:05) Mentoring early-career engineers builds confidence in a complex tech landscape.
(07:51) Orchestration patterns continue to evolve with modern data needs.
(08:41) Managing AI workflows requires structured and flexible orchestration.
(10:35) High-quality, meaningful data remains foundational across use cases.
(15:08) Community-driven open source tools offer lasting value.
(16:59) Self-healing systems support both legacy and AI pipelines.
(20:20) Orchestration platforms can drive future AI-native workloads.
Resources Mentioned:
Anu Pabla https://www.linkedin.com/in/atomicap/
The ODP Corporation https://www.linkedin.com/company/the-odp-corporation/
The ODP Corporation | Website https://www.theodpcorp.com/homepage
Apache Airflow https://airflow.apache.org/
LlamaIndex https://www.llamaindex.ai/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
The orchestration layer is foundational to building robust AI- and ML-powered data pipelines, especially in complex hybrid enterprise environments. IBM’s partnership with Astronomer reflects a strategic alignment to simplify and scale Airflow-based workflows across industries. In this episode, we’re joined by IBM’s Senior Product Manager, BJ Adesoji, and GTM PM and Growth Leader, Ryan Yackel. We discuss how IBM customers are using Airflow in production, the challenges they face at scale and what the new IBM–Astronomer collaboration unlocks.
Key Takeaways:
(03:09) The growing importance of orchestration tools in enterprise environments.
(04:48) How organizations are expanding orchestration beyond traditional use cases.
(05:24) Common patterns across industries adopting orchestration platforms.
(07:16) Why orchestration is essential for supporting business-critical workloads.
(10:00) The role of orchestration in compliance and regulatory processes.
(13:02) Challenges enterprises face when managing orchestration infrastructure.
(14:58) Opportunities to simplify and centralize orchestration at scale.
(19:11) The value of integrating orchestration with broader data toolchains.
(20:54) How AI is shaping the future of orchestrated data workflows.
Resources Mentioned:
BJ Adesoji https://www.linkedin.com/in/bj-soji/
Ryan Yackel https://www.linkedin.com/in/ryanyackel/
IBM | LinkedIn https://www.linkedin.com/company/databand-ai/
IBM Databand https://www.ibm.com/products/databand
IBM DataStage https://www.ibm.com/products/datastage
IBM watsonx.governance https://www.ibm.com/products/watsonx-governance
IBM Knowledge Catalog https://www.ibm.com/products/knowledge-catalog
Apache Airflow https://airflow.apache.org/
watsonx Orchestrate https://www.ibm.com/products/watsonx-orchestrate
Domino https://domino.ai/
Astronomer https://www.astronomer.io/
Snowflake https://www.snowflake.com/en/
dbt Labs https://www.getdbt.com/
Amazon SageMaker https://aws.amazon.com/sagemaker/
Cloudera https://www.cloudera.com/
MongoDB https://www.mongodb.com/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Efficient orchestration and maintainability are crucial for data engineering at scale. Gil Reich, Data Developer for Data Science at Wix, shares how his team reduced code duplication, standardized pipelines, and improved Airflow task orchestration using a Python-based framework built within the data science team. In this episode, Gil explains how this internal framework simplifies DAG creation, improves documentation accuracy, and enables consistent task generation for machine learning pipelines. He also shares lessons from complex DAG optimization and maintaining testable code.
Key Takeaways:
(03:23) Code duplication creates long-term problems.
(08:16) Frameworks bring order to complex pipelines.
(09:41) Shared functions cut down repetitive code.
(17:18) Auto-generated docs stay accurate by design.
(22:40) On-demand DAGs support real-time workflows.
(25:08) Task-level sensors improve run efficiency.
(27:40) Combine local runs with automated tests.
(30:09) Clean code helps teams scale faster.
Resources Mentioned:
Gil Reich https://www.linkedin.com/in/gilreich/
Wix | LinkedIn https://www.linkedin.com/company/wix-com/
Wix | Website https://www.wix.com/
DS DAG Framework https://airflowsummit.org/slides/2024/92-refactoring-dags.pdf
Apache Airflow https://airflow.apache.org/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Modernizing Legacy Data Systems With Airflow at Procter & Gamble with Adonis Castillo Cordero (22:13)
Legacy architecture and AI workloads pose unique challenges at scale, especially in a global enterprise with complex data systems. In this episode, we explore strategies to proactively monitor and optimize pipelines while minimizing downstream failures. Adonis Castillo Cordero, Senior Automation Manager at Procter & Gamble, joins us to share actionable best practices for dependency mapping, anomaly detection and architecture simplification using Apache Airflow.
Key Takeaways:
(03:13) Integrating legacy data systems into modern architecture.
(05:51) Designing workflows for real-time data processing.
(07:57) Mapping dependencies early to avoid pipeline failures.
(09:02) Building automated monitoring into orchestration frameworks.
(12:09) Detecting anomalies to prevent performance bottlenecks.
(15:24) Monitoring data quality to catch silent failures.
(17:02) Prioritizing responses based on impact severity.
(18:55) Simplifying dashboards to highlight critical metrics.
Resources Mentioned:
Adonis Castillo Cordero https://www.linkedin.com/in/adoniscc/
Procter & Gamble | LinkedIn https://www.linkedin.com/company/procter-and-gamble/
Procter & Gamble | Website http://www.pg.com
Apache Airflow https://airflow.apache.org/
OpenLineage https://openlineage.io/
Azure Monitor https://azure.microsoft.com/en-us/products/monitor/
AWS Lookout for Metrics https://aws.amazon.com/lookout-for-metrics/
Monte Carlo https://www.montecarlodata.com/
Great Expectations https://greatexpectations.io/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Building reliable data pipelines starts with maintaining strong data quality standards and creating efficient systems for auditing, publishing and monitoring. In this episode, we explore the real-world patterns and best practices for ensuring data pipelines stay accurate, scalable and trustworthy. Joseph Machado, Senior Data Engineer at Netflix, joins us to share practical insights gleaned from supporting Netflix’s Ads business as well as over a decade of experience in the data engineering space. He discusses implementing audit publish patterns, building observability dashboards, defining in-band and separate data quality checks, and optimizing data validation across large-scale systems.
Key Takeaways:
(03:14) Supporting data privacy and engineering efficiency within data systems.
(10:41) Validating outputs with reconciliation checks to catch transformation issues.
(16:06) Applying standardized patterns for auditing, validating and publishing data.
(19:28) Capturing historical check results to monitor system health and improvements.
(21:29) Treating data quality and availability as separate monitoring concerns.
(26:26) Using containerization strategies to streamline pipeline executions.
(29:47) Leveraging orchestration platforms for better visibility and retry capability.
(31:59) Managing business pressure without sacrificing data quality practices.
(35:46) Starting simple with quality checks and evolving toward more complex frameworks.
Resources Mentioned:
Joseph Machado https://www.linkedin.com/in/josephmachado1991/
Netflix | LinkedIn https://www.linkedin.com/company/netflix/
Netflix | Website https://www.netflix.com/browse
Start Data Engineering https://www.startdataengineering.com/
Apache Airflow https://airflow.apache.org/
dbt Labs https://www.getdbt.com/
Great Expectations https://greatexpectations.io/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Creating consistency across data pipelines is critical for scaling engineering teams and ensuring long-term maintainability. In this episode, Snir Israeli, Senior Data Engineer at Next Insurance, shares how enforcing coding standards and investing in developer experience transformed their approach to data engineering. He explains how implementing automated code checks, clear documentation practices and a scoring system helped drive alignment across teams, improve collaboration and reduce technical debt in a fast-growing data environment.
Key Takeaways:
(02:59) Inconsistencies in code style create challenges for collaboration and maintenance.
(04:22) Programmatically enforcing rules helps teams scale their best practices.
(08:55) Performance improvements in data pipelines lead to infrastructure cost savings.
(13:22) Developer experience is essential for driving adoption of internal tools.
(19:44) Dashboards can operationalize standards enforcement and track progress over time.
(22:49) Standardization accelerates onboarding and reduces friction in code reviews.
(25:39) Linting rules require ongoing maintenance as tools and platforms evolve.
(27:47) Starting small and involving the team leads to better adoption and long-term success.
Resources Mentioned:
Snir Israeli https://www.linkedin.com/in/snir-israeli/
Next Insurance | LinkedIn https://www.linkedin.com/company/nextinsurance/
Next Insurance | Website https://www.nextinsurance.com/
Apache Airflow https://airflow.apache.org/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
Airflow’s adaptability is driving Tekmetric’s ability to unify complex data workflows, deliver accurate insights and support both internal operations and customer-facing services, all within a rapidly growing startup environment. In this episode, Ipsa Trivedi, Lead Data Engineer at Tekmetric, shares how her team is standardizing pipelines while supporting unique customer needs. She explains how Airflow enables end-to-end data services, simplifies orchestration across varied sources and supports scalable customization. Ipsa also highlights early wins with Airflow, its intuitive UI and the team's roadmap toward data quality, observability and a future self-serve data platform.
Key Takeaways:
(02:26) Powering auto shops nationwide with a unified platform.
(05:17) A new data team was formed to centralize and scale insights.
(07:23) Flexible, open source and made to fit: Airflow wins.
(10:42) Pipelines handle anything from email to AWS.
(12:15) Custom DAGs fit every team’s unique needs.
(17:01) Data quality checks are built into the plan.
(18:17) Self-serve data mesh is the end goal.
(19:59) Airflow now fits so well, there's nothing left on the wishlist.
Resources Mentioned:
Ipsa Trivedi https://www.linkedin.com/in/ipsatrivedi/
Tekmetric | LinkedIn https://www.linkedin.com/company/tekmetric/
Tekmetric | Website https://www.tekmetric.com/
Apache Airflow https://airflow.apache.org/
AWS RDS https://aws.amazon.com/free/database/?trk=fc551e06-56b0-418c-9ddd-5c9dba18569b&sc_channel=ps&ef_id=CjwKCAjwzMi_BhACEiwAX4YZULS4jV2Xpnpcac_Q3eS9BAg-klKUDyCt6XSdOul8BLHkmWzFFh4NXRoCGhQQAvD_BwE:G:s&s_kwcid=AL!4422!3!548989592596!e!!g!!amazon%20sql%20database!11543056228!112002958549&gclid=CjwKCAjwzMi_BhACEiwAX4YZULS4jV2Xpnpcac_Q3eS9BAg-klKUDyCt6XSdOul8BLHkmWzFFh4NXRoCGhQQAvD_BwE
Astro by Astronomer https://www.astronomer.io/product/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
The Airflow 3.0 release marks a significant leap forward in modern data orchestration, introducing architectural upgrades that improve scalability, flexibility and long-term maintainability. In this episode, we welcome Vikram Koka, Chief Strategy Officer at Astronomer, and Jed Cunningham, Principal Software Engineer at Astronomer, to discuss the architectural foundations, new features and future implications of this milestone release. They unpack the rationale behind DAG versioning and the task execution interface, explain how Airflow now integrates more seamlessly within broader data ecosystems and share how these changes lay the groundwork for multi-cloud deployments, language-agnostic workflows and stronger enterprise security.

Key Takeaways:
(02:28) Modern orchestration demands new infrastructure approaches.
(05:02) Removing legacy components strengthens system stability.
(06:26) Major releases provide the opportunity to reduce technical debt.
(08:31) Frontend and API modernization enable long-term adaptability.
(09:36) Event-based triggers expand integration possibilities.
(11:54) Version control improves visibility and execution reliability.
(14:57) Centralized access to workflow definitions increases flexibility.
(21:49) Decoupled architecture supports distributed and secure deployments.
(26:17) Community collaboration is essential for sustainable growth.

Resources Mentioned:
Astronomer Website https://www.astronomer.io
Apache Airflow https://airflow.apache.org/
Git Bundle https://git-scm.com/book/en/v2/Git-Tools-Bundling
FastAPI https://fastapi.tiangolo.com/
React https://react.dev/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/

Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations. #AI #Automation #Airflow #MachineLearning…
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
The evolution of data orchestration at Instacart highlights the journey from fragmented systems to robust, standardized infrastructure. This transformation has enabled scalability, reliability and democratization of tools for diverse user personas. In this episode, we’re joined by Anant Agarwal, Software Engineer at Instacart, who shares insights into Instacart's Airflow journey, from its early adoption in 2019 to the present-day centralized cluster approach. Anant discusses the challenges of managing disparate clusters, the implementation of remote executors, and the strategic standardization of infrastructure and DAG patterns to streamline workflows.

Key Takeaways:
(03:49) The impact of external events on business growth and technological evolution.
(04:31) Challenges of managing decentralized systems across multiple teams.
(06:14) The importance of standardizing infrastructure and processes for scalability.
(09:51) Strategies for implementing efficient and repeatable deployment practices.
(12:17) Addressing diverse user personas with tailored solutions.
(14:47) Leveraging remote execution to enhance flexibility and scalability.
(18:36) Benefits of transitioning to a centralized system for organization-wide use.
(20:57) Maintaining an upgrade cadence to stay aligned with the latest advancements.
(23:35) Anticipation for new features and improvements in upcoming software versions.
Resources Mentioned:
Anant Agarwal https://www.linkedin.com/in/anantag/
Instacart | LinkedIn https://www.linkedin.com/company/instacart/
Instacart | Website https://www.instacart.com
Apache Airflow https://airflow.apache.org/
Amazon ECS https://aws.amazon.com/ecs/
Terraform https://www.terraform.io/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/

Thanks for listening to “The Data Flowcast: Mastering Airflow for Data Engineering & AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations. #AI #Automation #Airflow #MachineLearning…
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI
From ETL to Airflow: Transforming Data Engineering at Deloitte Digital with Raviteja Tholupunoori (27:42)
Data orchestration at scale presents unique challenges, especially when aiming for flexibility and efficiency across cloud environments. Choosing the right tools and frameworks can make all the difference. In this episode, Raviteja Tholupunoori, Senior Engineer at Deloitte Digital, joins us to explore how Airflow enhances orchestration, scalability and cost efficiency in enterprise data workflows.

Key Takeaways:
(01:45) Early challenges in data orchestration before implementing Airflow.
(02:42) Comparing Airflow with ETL tools like Talend and why flexibility matters.
(04:24) The role of Airflow in enabling cloud-agnostic data processing.
(05:45) Key lessons from managing dynamic DAGs at scale.
(13:15) How hybrid executors improve performance and efficiency.
(14:13) Best practices for testing and monitoring workflows with Airflow.
(15:13) The importance of mocking mechanisms when testing DAGs.
(17:57) How Prometheus, Grafana and Loki support Airflow monitoring.
(22:03) Cost considerations when running Airflow on self-managed infrastructure.
(23:14) Airflow’s latest features, including hybrid executors and dark mode.

Resources Mentioned:
Raviteja Tholupunoori https://www.linkedin.com/in/raviteja0096/?originalSubdomain=in
Deloitte Digital https://www.linkedin.com/company/deloitte-digital/
Apache Airflow https://airflow.apache.org/
Grafana https://grafana.com/solutions/apache-airflow/monitor/
Astronomer Presents: Exploring Apache Airflow® 3 Roadshows https://www.astronomer.io/events/roadshow/
https://www.astronomer.io/events/roadshow/london/
https://www.astronomer.io/events/roadshow/new-york/
https://www.astronomer.io/events/roadshow/sydney/
https://www.astronomer.io/events/roadshow/san-francisco/
https://www.astronomer.io/events/roadshow/chicago/

Thanks for listening to “The Data Flowcast: Mastering Airflow for Data Engineering & AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations. #AI #Automation #Airflow #MachineLearning…