AI by Hand ✍️

Feature Extraction + Head

Fine-Tuning series: 6 of 8

Prof. Tom Yeh
Apr 24, 2026

Fine-Tuning Series:

  1. Weight Update
  2. Pretrain vs Fine-Tune
  3. Full Fine-Tuning
  4. Freezing Layers
  5. Linear Probe
  6. Feature Extraction + Head
  7. Adapter Layers
  8. LoRA

A feature head is a small trainable MLP bolted onto a frozen pretrained backbone. Think of it as pursuing a PhD on top of a master's degree. The master's degree, your pretrained backbone, stays exactly as it was: frozen, with nothing revisited. You aren't retaking Linear Algebra or Probability; you're building something specialized on top of that foundation. The PhD adds its own coursework, its own nonlinearity, and its own thesis layer.
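To make the idea concrete, here is a minimal sketch in plain Python, worked out by hand in the spirit of this series. Everything in it is an assumption chosen for illustration and not taken from the post: the "pretrained" backbone is a stand-in (a fixed random projection followed by tanh), the head is a two-layer MLP with a ReLU in between, and the data is a toy regression target. The point is only that the gradient updates touch the head's weights and never the backbone's.

```python
import math
import random

random.seed(0)

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rand_matrix(rows, cols, scale):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

# Frozen backbone (hypothetical stand-in for a real pretrained network):
# a fixed 2 -> 4 projection followed by tanh. It is never updated below.
W_backbone = rand_matrix(4, 2, 1.0)

def backbone(x):
    return [math.tanh(v) for v in matvec(W_backbone, x)]

# Trainable head: Linear(4 -> 8), ReLU, Linear(8 -> 1).
HIDDEN, FEATS = 8, 4
W1 = rand_matrix(HIDDEN, FEATS, 0.5)
b1 = [0.0] * HIDDEN
W2 = rand_matrix(1, HIDDEN, 0.5)
b2 = [0.0]

def head_forward(f):
    z = [v + b for v, b in zip(matvec(W1, f), b1)]  # pre-activations
    h = [max(0.0, v) for v in z]                    # ReLU
    y = matvec(W2, h)[0] + b2[0]                    # scalar output
    return z, h, y

# Toy data (assumption): a nonlinear target, 1 + x0 * x1, on a small grid.
grid = (-1.0, -0.5, 0.5, 1.0)
data = [([a, b], 1.0 + a * b) for a in grid for b in grid]

def mean_loss():
    return sum((head_forward(backbone(x))[2] - t) ** 2 for x, t in data) / len(data)

backbone_before = [row[:] for row in W_backbone]  # snapshot to check freezing
loss_before = mean_loss()

lr = 0.05
for epoch in range(300):
    for x, t in data:
        f = backbone(x)            # features; treated as constants
        z, h, y = head_forward(f)
        dy = 2.0 * (y - t)         # d(squared error)/dy
        # Gradient w.r.t. hidden activations, taken before W2 changes.
        dh = [dy * W2[0][j] for j in range(HIDDEN)]
        # Update the output layer.
        for j in range(HIDDEN):
            W2[0][j] -= lr * dy * h[j]
        b2[0] -= lr * dy
        # Backpropagate through ReLU and update the hidden layer.
        for j in range(HIDDEN):
            dz = dh[j] if z[j] > 0 else 0.0
            for k in range(FEATS):
                W1[j][k] -= lr * dz * f[k]
            b1[j] -= lr * dz
        # Backpropagation stops here: nothing is computed for W_backbone.

loss_after = mean_loss()
print("loss before:", loss_before)
print("loss after: ", loss_after)
print("backbone unchanged:", W_backbone == backbone_before)
```

In a deep learning framework the same pattern usually amounts to disabling gradients on the backbone's parameters and passing only the head's parameters to the optimizer; the hand-written loop above just makes explicit where backpropagation stops.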
