[LLM] InfiniGen: Efficient Generative Inference of LLMs with Dynamic KV Cache Management (OSDI 2024)