Similar Tracks
Kolmogorov Arnold Networks (KAN) Paper Explained - An exciting new paradigm for Deep Learning?
Neural Breakdown with AVB
Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math
Umar Jamil
Rui Xiong, short talk, "Motivic Lefschetz Theorem for Twisted Milnor Hypersurfaces"
Schubert Seminar
Abstracting Failures Away From Stateful Dataflow Systems | KTH MSc Thesis Defense 2024
Aleksey Veresov
Mistral / Mixtral Explained: Sliding Window Attention, Sparse Mixture of Experts, Rolling Buffer
Umar Jamil
LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU
Umar Jamil
ML Interpretability: feature visualization, adversarial example, interp. for language models
Umar Jamil
Victor Blanco - Autocatalysis with Mathematical Optimization Lens
Autocatalysis in Reaction Networks Seminar (ARN)