Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)
Similar Tracks
Mistral / Mixtral Explained: Sliding Window Attention, Sparse Mixture of Experts, Rolling Buffer
Umar Jamil
Legion Retreat 2024 - Low-Latency, High-Performance LLM Serving and Fine-tuning - Zhihao Jia
Legion Programming System
Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math
Umar Jamil
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, with Patrick Lewis, Facebook AI
Natural Language Processing presentations