[LLM] InfiniGen: Efficient Generative Inference of LLMs with Dynamic KV Cache Management (OSDI 2024)
Similar Tracks
[Fault Tolerance] Exploiting Nil-Externality for Fast Replicated Storage (SOSP 2021)
Data Lakehouse Systems for Data Science
[KV store] FluidKV: Seamlessly Bridging the Gap between Indexing Performance and Memory-Footprint
Data Lakehouse Systems for Data Science
[LLM Serving] Llumnix: Dynamic Scheduling for Large Language Model Serving (OSDI 2024)
Data Lakehouse Systems for Data Science
[Linear Algebra] DistME: A Fast and Elastic Distributed Matrix Computation using GPUs (SIGMOD 2019)
Data Lakehouse Systems for Data Science