Artwork

محتوای ارائه شده توسط The Nonlinear Fund. تمام محتوای پادکست شامل قسمت‌ها، گرافیک‌ها و توضیحات پادکست مستقیماً توسط The Nonlinear Fund یا شریک پلتفرم پادکست آن‌ها آپلود و ارائه می‌شوند. اگر فکر می‌کنید شخصی بدون اجازه شما از اثر دارای حق نسخه‌برداری شما استفاده می‌کند، می‌توانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Player FM - برنامه پادکست
با برنامه Player FM !

AF - aimless ace analyzes active amateur: a micro-aaaaalignment proposal by Luke H Miles

1:45
 
اشتراک گذاری
 

بایگانی مجموعه ها ("فیدهای غیر فعال" status)

When? This feed was archived on October 23, 2024 10:10 (1y ago). Last successful fetch was on September 19, 2024 11:06 (1y ago)

Why? فیدهای غیر فعال status. سرورهای ما، برای یک دوره پایدار، قادر به بازیابی یک فید پادکست معتبر نبوده اند.

What now? You might be able to find a more up-to-date version using the search function. This series will no longer be checked for updates. If you believe this to be in error, please check if the publisher's feed link below is valid and contact support to request the feed be restored or if you have any other concerns about this.

Manage episode 430041427 series 3337166
محتوای ارائه شده توسط The Nonlinear Fund. تمام محتوای پادکست شامل قسمت‌ها، گرافیک‌ها و توضیحات پادکست مستقیماً توسط The Nonlinear Fund یا شریک پلتفرم پادکست آن‌ها آپلود و ارائه می‌شوند. اگر فکر می‌کنید شخصی بدون اجازه شما از اثر دارای حق نسخه‌برداری شما استفاده می‌کند، می‌توانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: aimless ace analyzes active amateur: a micro-aaaaalignment proposal, published by Luke H Miles on July 21, 2024 on The AI Alignment Forum.
This idea is so simple that I'm sure it's been had by someone somewhere.
Suppose we have some method to make really smart honest AIs that do not have goals. Let's say it's a yes/no oracle. Our aimless ace.
But we want to accomplish stuff! AIcorp wants the
printmoneynow.py. I'm worried that if I train the ace to fold fitted sheets and unkink the garden hose then it will go overboard and fold my femur bone and unkink the Yellowstone super volcano.
So let's use a bit-less-smart model to do tasks. The active amateur.
We'll ask the ace if the amateur is plotting something. If yes, we shut down the amateur. Every day we also ask "is today the last day I can shut down the amateur?"
(We do not generate code with the ace!)
So we get super-smart supervision of a decent task AI. The temptation (ie competitive pressure) to agentify the oracle is there still, but no longer overwhelming.
Maybe we can even have the amateur write ace_v2, have ace_v1 check ace_v2, switch to ace_v2, have amateur_v1 write amateur_v2, check it with ace_v2, then switch to amateur_v2.
If this looks like it's avoiding the hard parts of the alignment problem, that's because it is! But does something actually go wrong? Was it too big an assumption that we could build an aimless honest ace? (Or if you think the idea is good, then a bit of positive encouragement would be wonderful.)
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
  continue reading

392 قسمت

Artwork
iconاشتراک گذاری
 

بایگانی مجموعه ها ("فیدهای غیر فعال" status)

When? This feed was archived on October 23, 2024 10:10 (1y ago). Last successful fetch was on September 19, 2024 11:06 (1y ago)

Why? فیدهای غیر فعال status. سرورهای ما، برای یک دوره پایدار، قادر به بازیابی یک فید پادکست معتبر نبوده اند.

What now? You might be able to find a more up-to-date version using the search function. This series will no longer be checked for updates. If you believe this to be in error, please check if the publisher's feed link below is valid and contact support to request the feed be restored or if you have any other concerns about this.

Manage episode 430041427 series 3337166
محتوای ارائه شده توسط The Nonlinear Fund. تمام محتوای پادکست شامل قسمت‌ها، گرافیک‌ها و توضیحات پادکست مستقیماً توسط The Nonlinear Fund یا شریک پلتفرم پادکست آن‌ها آپلود و ارائه می‌شوند. اگر فکر می‌کنید شخصی بدون اجازه شما از اثر دارای حق نسخه‌برداری شما استفاده می‌کند، می‌توانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: aimless ace analyzes active amateur: a micro-aaaaalignment proposal, published by Luke H Miles on July 21, 2024 on The AI Alignment Forum.
This idea is so simple that I'm sure it's been had by someone somewhere.
Suppose we have some method to make really smart honest AIs that do not have goals. Let's say it's a yes/no oracle. Our aimless ace.
But we want to accomplish stuff! AIcorp wants the
printmoneynow.py. I'm worried that if I train the ace to fold fitted sheets and unkink the garden hose then it will go overboard and fold my femur bone and unkink the Yellowstone super volcano.
So let's use a bit-less-smart model to do tasks. The active amateur.
We'll ask the ace if the amateur is plotting something. If yes, we shut down the amateur. Every day we also ask "is today the last day I can shut down the amateur?"
(We do not generate code with the ace!)
So we get super-smart supervision of a decent task AI. The temptation (ie competitive pressure) to agentify the oracle is there still, but no longer overwhelming.
Maybe we can even have the amateur write ace_v2, have ace_v1 check ace_v2, switch to ace_v2, have amateur_v1 write amateur_v2, check it with ace_v2, then switch to amateur_v2.
If this looks like it's avoiding the hard parts of the alignment problem, that's because it is! But does something actually go wrong? Was it too big an assumption that we could build an aimless honest ace? (Or if you think the idea is good, then a bit of positive encouragement would be wonderful.)
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
  continue reading

392 قسمت

همه قسمت ها

×
 
Loading …

به Player FM خوش آمدید!

Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.

 

راهنمای مرجع سریع

در حین کاوش به این نمایش گوش دهید
پخش