Do we need Attention? - Linear RNNs and State Space Models (SSMs) for NLP