با برنامه Player FM !
SE Radio 696: Flavia Saldanha on Data Engineering for AI
Manage episode 521225985 series 215
Flavia Saldanha, a consulting data engineer, joins host Kanchan Shringi to discuss the evolution of data engineering from ETL (extract, transform, load) and data lakes to modern lakehouse architectures enriched with vector databases and embeddings. Flavia explains the industry's shift from treating data as a service to treating it as a product, emphasizing ownership, trust, and business context as critical for AI-readiness. She describes how unified pipelines now serve both business intelligence and AI use cases, combining structured and unstructured data while ensuring semantic enrichment and a single source of truth. She outlines key components of a modern data stack, including data marketplaces, observability tools, data quality checks, orchestration, and embedded governance with lineage tracking. This episode highlights strategies for abstracting tooling, future-proofing architectures, enforcing data privacy, and controlling AI-serving layers to prevent hallucinations. Saldanha concludes that data engineers must move beyond pure ETL thinking, embrace product and NLP skills, and work closely with MLOps, using AI as a co-pilot rather than a replacement.
Brought to you by IEEE Computer Society and IEEE Software magazine.
1057 قسمت
SE Radio 696: Flavia Saldanha on Data Engineering for AI
Software Engineering Radio - the podcast for professional software developers
Manage episode 521225985 series 215
Flavia Saldanha, a consulting data engineer, joins host Kanchan Shringi to discuss the evolution of data engineering from ETL (extract, transform, load) and data lakes to modern lakehouse architectures enriched with vector databases and embeddings. Flavia explains the industry's shift from treating data as a service to treating it as a product, emphasizing ownership, trust, and business context as critical for AI-readiness. She describes how unified pipelines now serve both business intelligence and AI use cases, combining structured and unstructured data while ensuring semantic enrichment and a single source of truth. She outlines key components of a modern data stack, including data marketplaces, observability tools, data quality checks, orchestration, and embedded governance with lineage tracking. This episode highlights strategies for abstracting tooling, future-proofing architectures, enforcing data privacy, and controlling AI-serving layers to prevent hallucinations. Saldanha concludes that data engineers must move beyond pure ETL thinking, embrace product and NLP skills, and work closely with MLOps, using AI as a co-pilot rather than a replacement.
Brought to you by IEEE Computer Society and IEEE Software magazine.
1057 قسمت
Toate episoadele
×به Player FM خوش آمدید!
Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.