Content provided by Machine Learning Street Talk (MLST). All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Machine Learning Street Talk (MLST) or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://fa.player.fm/legal

Patrick Lewis (Cohere) - Retrieval Augmented Generation

1:13:46
 

Dr. Patrick Lewis, who coined the term RAG (Retrieval Augmented Generation) and now works at Cohere, discusses the evolution of language models, RAG systems, and challenges in AI evaluation.

MLST is sponsored by Brave:

The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.

Key topics covered:

- Origins and evolution of Retrieval Augmented Generation (RAG)

- Challenges in evaluating RAG systems and language models

- Human-AI collaboration in research and knowledge work

- Word embeddings and the progression to modern language models

- Dense vs sparse retrieval methods in information retrieval

The discussion also explored broader implications and applications:

- Balancing faithfulness and fluency in RAG systems

- User interface design for AI-augmented research tools

- The journey from chemistry to AI research

- Challenges in enterprise search compared to web search

- The importance of data quality in training AI models

Patrick Lewis: https://www.patricklewis.io/

Cohere Command Models, check them out - they are amazing for RAG!

https://cohere.com/command

TOC

00:00:00 1. Intro to RAG

00:05:30 2. RAG Evaluation: PoLL framework & model performance

00:12:55 3. Data Quality: Cleanliness vs scale in AI training

00:15:13 4. Human-AI Collaboration: Research agents & UI design

00:22:57 5. RAG Origins: Open-domain QA to generative models

00:30:18 6. RAG Challenges: Info retrieval, tool use, faithfulness

00:42:01 7. Dense vs Sparse Retrieval: Techniques & trade-offs

00:47:02 8. RAG Applications: Grounding, attribution, hallucination prevention

00:54:04 9. UI for RAG: Human-computer interaction & model optimization

00:59:01 10. Word Embeddings: Word2Vec, GloVe, and semantic spaces

01:06:43 11. Language Model Evolution: BERT, GPT, and beyond

01:11:38 12. AI & Human Cognition: Sequential processing & chain-of-thought

Refs:

1. Retrieval Augmented Generation (RAG) paper / Patrick Lewis et al. [00:27:45]

https://arxiv.org/abs/2005.11401

2. LAMA (LAnguage Model Analysis) probe / Petroni et al. [00:26:35]

https://arxiv.org/abs/1909.01066

3. KILT (Knowledge Intensive Language Tasks) benchmark / Petroni et al. [00:27:05]

https://arxiv.org/abs/2009.02252

4. Word2Vec algorithm / Tomas Mikolov et al. [01:00:25]

https://arxiv.org/abs/1301.3781

5. GloVe (Global Vectors for Word Representation) / Pennington et al. [01:04:35]

https://nlp.stanford.edu/projects/glove/

6. BERT (Bidirectional Encoder Representations from Transformers) / Devlin et al. [01:08:00]

https://arxiv.org/abs/1810.04805

7. 'The Language Game' book / Nick Chater and Morten H. Christiansen [01:11:40]

https://amzn.to/4grEUpG

Disclaimer: This is the sixth video from our Cohere partnership. We were not told what to say in the interview. Filmed in Seattle in June 2024.


