Worst-Case Thinking in AI Alignment
بایگانی مجموعه ها ("فیدهای غیر فعال" status)
When? This feed was archived on February 21, 2025 21:08 (
Why? فیدهای غیر فعال status. سرورهای ما، برای یک دوره پایدار، قادر به بازیابی یک فید پادکست معتبر نبوده اند.
What now? You might be able to find a more up-to-date version using the search function. This series will no longer be checked for updates. If you believe this to be in error, please check if the publisher's feed link below is valid and contact support to request the feed be restored or if you have any other concerns about this.
Manage episode 424744781 series 3498845
Alternative title: “When should you assume that what could go wrong, will go wrong?” Thanks to Mary Phuong and Ryan Greenblatt for helpful suggestions and discussion, and Akash Wasil for some edits. In discussions of AI safety, people often propose the assumption that something goes as badly as possible. Eliezer Yudkowsky in particular has argued for the importance of security mindset when thinking about AI alignment. I think there are several distinct reasons that this might be the right assumption to make in a particular situation. But I think people often conflate these reasons, and I think that this causes confusion and mistaken thinking. So I want to spell out some distinctions. Throughout this post, I give a bunch of specific arguments about AI alignment, including one argument that I think I was personally getting wrong until I noticed my mistake yesterday (which was my impetus for thinking about this topic more and then writing this post). I think I’m probably still thinking about some of my object level examples wrong, and hope that if so, commenters will point out my mistakes.
Original text:
https://www.lesswrong.com/posts/yTvBSFrXhZfL8vr5a/worst-case-thinking-in-ai-alignment
Narrated for AI Safety Fundamentals by Perrin Walker of TYPE III AUDIO.
---
A podcast by BlueDot Impact.
Learn more on the AI Safety Fundamentals website.
فصل ها
1. Worst-Case Thinking in AI Alignment (00:00:00)
2. My list of reasons to maybe use worst-case thinking (00:01:26)
3. Differences between these arguments (00:09:07)
85 قسمت