ML Interpretability: feature visualization, adversarial example, interp. for language models
