Zeta-Alpha-E5-Mistral: Finetuning LLMs for Retrieval (with Arthur Câmara)
In the 30th episode of Neural Search Talks, we have our very own Arthur Câmara, Senior Research Engineer at Zeta Alpha, presenting a 20-minute guide on how we fine-tune Large Language Models for effective text retrieval. Arthur discusses the common issues with embedding models in a general-purpose RAG pipeline, how to tackle the lack of retrieval-oriented data for fine-tuning with InPars, and how we adapted E5-Mistral to rank in the top 10 on the BEIR benchmark.
## Sources
InPars
- https://github.com/zetaalphavector/InPars
- https://dl.acm.org/doi/10.1145/3477495.3531863
- https://arxiv.org/abs/2301.01820
- https://arxiv.org/abs/2307.04601
Zeta-Alpha-E5-Mistral
- https://zeta-alpha.com/post/fine-tuning-an-llm-for-state-of-the-art-retrieval-zeta-alpha-s-top-10-submission-to-the-the-mteb-be
- https://huggingface.co/zeta-alpha-ai/Zeta-Alpha-E5-Mistral
NanoBEIR
21 episodes
All episodes
- The Promise of Language Models for Search: Generative Information Retrieval (1:07:31)
- Task-aware Retrieval with Instructions (1:11:13)
- Generating Training Data with Large Language Models w/ Special Guest Marzieh Fadaee (1:16:14)
- Evaluating Extrapolation Performance of Dense Retrieval: How does DR compare to cross encoders when it comes to generalization? (58:30)
- Few-Shot Conversational Dense Retrieval (ConvDR) w/ special guest Antonios Krasakis (1:23:11)
- Transformer Memory as a Differentiable Search Index: memorizing thousands of random doc ids works!? (1:01:40)
- Shallow Pooling for Sparse Labels: the shortcomings of MS MARCO (1:07:17)