AI by Hand ✍️

Frontier

RMS, Group, Layer, Batch Norm, Tensor Parallelism
Frontier Model Math by hand ✍️
5 hrs ago • 
Prof. Tom Yeh
RoPE vs PE in QKV Self-Attention
Frontier Model Math by hand ✍️
Sep 30 • 
Prof. Tom Yeh
MLP Parallelism: Data, Context, Row, Column, Pipeline
Frontier Model Math by hand ✍️
Sep 23 • 
Prof. Tom Yeh
Inference Batching, Request-vs-Token Level
Frontier Model Math by hand ✍️
Sep 16 • 
Prof. Tom Yeh
EmbeddingGemma, MRL, InfoNCE, Embed vs. Decode
Frontier Model Math by hand ✍️
Sep 9 • 
Prof. Tom Yeh
KV Cache, Prefill, Decode
Frontier Model Math by hand ✍️
Sep 1 • 
Prof. Tom Yeh
QLoRA, DoRA, BitFit, NF4 vs INT4
Frontier Model Math by hand ✍️
Aug 26 • 
Prof. Tom Yeh
LoRA, Fine-Tune, Pre-Train
Frontier Model Math by hand ✍️
Aug 18 • 
Prof. Tom Yeh
MXFP4, FP4, FP8
Frontier Model Math by hand ✍️
Aug 14 • 
Prof. Tom Yeh
MHA, MQA, GQA, MoE-A: More Attention!
Frontier Model Math by hand ✍️
Aug 11 • 
Prof. Tom Yeh
New GPT-OSS Trick to Ignore Tokens
Frontier Model Math by hand ✍️
Aug 9 • 
Prof. Tom Yeh
"Expert Choice" Mixture of Experts (MoE)
Frontier Model Math by hand ✍️
Aug 5 • 
Prof. Tom Yeh
© 2025 Tom Yeh