
Content provided by Francesco Gadaleta. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Francesco Gadaleta or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://fa.player.fm/legal

When AI Hears Thunder But Misses the Fear (Ep. 291)

46:37
 

Sanjoy Chowdhury discusses a hidden weakness of AI: systems can recognize objects and sounds well in isolation, but they cannot reason across senses the way humans do. His research at the University of Maryland, College Park, including the Meerkat model and AVTrustBench, shows why AI recognizes a worried face and thunder separately but fails to connect them, and what this means for self-driving cars and medical AI.

Sponsors

This episode is proudly sponsored by Amethix Technologies. At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve. With a focus on dual-use innovation, Amethix is shaping a future where intelligent machines extend human capability, not replace it. Discover more at https://amethix.com

This episode is brought to you by Intrepid AI. From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence. Whether it's in the sky, on the ground, or in orbit—if it's intelligent and mobile, Intrepid helps you build it. Learn more at intrepid.ai

Resources:

  1. The first audio-visual LLM with fine-grained understanding: Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time (Accepted at ECCV 2024)
  2. Benchmark for evaluating robustness to adversarial attacks and compositional reasoning: AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs (Accepted at ICCV 2025)
  3. First audio-visual reasoning evaluation benchmark and test-time reasoning distillation pipeline: AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs (Accepted at ICCV 2025)
  4. For a detailed list of Sanjoy's work, please visit his webpage: https://schowdhury671.github.io/

