با برنامه Player FM !
SWE-bench & SWE-agent | Data Brew | Episode 44
Manage episode 477558040 series 2814833
In this episode, Kilian Lieret, Research Software Engineer, and Carlos Jimenez, Computer Science PhD Candidate at Princeton University, discuss SWE-bench and SWE-agent, two groundbreaking tools for evaluating and enhancing AI in software engineering.
Highlights include:
- SWE-bench: A benchmark for assessing AI models on real-world coding tasks.
- Addressing data leakage concerns in GitHub-sourced benchmarks.
- SWE-agent: An AI-driven system for navigating and solving coding challenges.
- Overcoming agent limitations, such as getting stuck in loops.
- The future of AI-powered code reviews and automation in software engineering.
43 قسمت
Manage episode 477558040 series 2814833
In this episode, Kilian Lieret, Research Software Engineer, and Carlos Jimenez, Computer Science PhD Candidate at Princeton University, discuss SWE-bench and SWE-agent, two groundbreaking tools for evaluating and enhancing AI in software engineering.
Highlights include:
- SWE-bench: A benchmark for assessing AI models on real-world coding tasks.
- Addressing data leakage concerns in GitHub-sourced benchmarks.
- SWE-agent: An AI-driven system for navigating and solving coding challenges.
- Overcoming agent limitations, such as getting stuck in loops.
- The future of AI-powered code reviews and automation in software engineering.
43 قسمت
همه قسمت ها
×به Player FM خوش آمدید!
Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.