[Classic] Word2Vec: Distributed Representations of Words and Phrases and their Compositionality