Open Pre-Trained Transformer Language Models (OPT): What does it take to train GPT-3?
Andrew Yates (Assistant Professor at the University of Amsterdam) and Sergi Castella i Sapé discuss the recent "Open Pre-trained Transformer (OPT) Language Models" paper from Meta AI (formerly Facebook). In this replication work, Meta developed and trained a 175-billion-parameter Transformer very similar to OpenAI's GPT-3, documenting the process in detail to share their findings with the community. The code, pretrained weights, and logbook are available in their GitHub repository (links below).
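If you want to poke at the released weights yourself, here is a minimal sketch of loading one of the smaller OPT checkpoints. It assumes the Hugging Face transformers library and the publicly hosted "facebook/opt-125m" checkpoint; the episode itself points to Meta's metaseq repository, so treat this as one convenient way to try the models, not the setup discussed in the episode.

```python
# Sketch: load a small OPT checkpoint and generate a continuation.
# Assumes `pip install transformers torch` and access to the
# "facebook/opt-125m" checkpoint on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # smallest of the released OPT sizes
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Open Pre-trained Transformers are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```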
Links
❓Feedback Form: https://scastella.typeform.com/to/rg7a5GfJ
📄 OPT paper: https://arxiv.org/abs/2205.01068
👾 Code: https://github.com/facebookresearch/metaseq
📒 Logbook: https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/chronicles/OPT175B_Logbook.pdf
✍️ OPT Official Blog Post: https://ai.facebook.com/blog/democratizing-access-to-large-scale-language-models-with-opt-175b/
OpenAI Embeddings API: https://openai.com/blog/introducing-text-and-code-embeddings/
Nils Reimers' critique of OpenAI Embeddings API: https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9
Timestamps:
00:00 Introduction and housekeeping: new feedback form, ACL conference highlights
02:42 The convergence between NLP and Neural IR techniques
06:43 Open Pretrained Transformer motivation and scope, reproducing GPT-3 and open-sourcing
08:16 Basics of OPT: architecture, pre-training objective, teacher forcing, tokenizer, training data (the objective is sketched in code after the timestamps)
13:40 Findings from the preliminary experiments: hyperparameters, training stability, loss spikiness
20:08 Problems that appear at scale when training with 992 GPUs
23:01 Using temperature to check whether GPUs are working
25:00 Training the largest model: what to do when the loss explodes? (which happens quite often)
29:15 When they switched from AdamW to SGD
32:00 Results: successful but not quite GPT-3 level. Toxicity?
35:45 Replicability of Large Language Models research. Was GPT-3 replicable? What difference does it make?
37:25 What makes a paper replicable?
40:33 Directions in which Large Language Models are applied to Information Retrieval
45:15 Final thoughts and takeaways
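For listeners who want the 08:16 discussion in code form, here is a minimal, hypothetical sketch of the next-token prediction objective with teacher forcing that decoder-only models like OPT and GPT-3 are trained on. The variable names and the `model` interface are illustrative assumptions, not taken from metaseq.

```python
# Sketch: causal language-modeling loss with teacher forcing.
# "Teacher forcing" means the model always conditions on the ground-truth
# prefix from the training data, never on its own generated tokens.
import torch
import torch.nn.functional as F

def causal_lm_loss(model, token_ids):
    # token_ids: (batch, seq_len) tensor of ground-truth token ids.
    inputs = token_ids[:, :-1]   # model sees tokens t_0 .. t_{n-2}
    targets = token_ids[:, 1:]   # and must predict t_1 .. t_{n-1}
    logits = model(inputs)       # (batch, seq_len - 1, vocab_size), assumed interface
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
```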