AI by Hand ✍️

SiLU

Activation series: 7 of 12

Prof. Tom Yeh
May 11, 2026

Activation Series:

  1. Softmax

  2. Sigmoid

  3. Tanh

  4. ReLU

  5. Leaky ReLU

  6. ELU

  7. SiLU

  8. GELU

  9. Log-Sum-Exp

  10. Softplus

  11. GLU

  12. SwiGLU

SiLU (Sigmoid Linear Unit, also called Swish) is the activation inside the feed-forward layers of Llama, Mistral, Mixtral, and many other modern open-weight LLMs. The whole idea is one move on top of sigmoid: multiply the input by its own sigmoid value, SiLU(x) = x · σ(x), so that the sigmoid value acts as the fraction of the input that passes through.
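
To make the "fraction that passes through" idea concrete, here is a minimal sketch in plain Python. It assumes the standard definition SiLU(x) = x · sigmoid(x); the function names `sigmoid` and `silu` are illustrative choices, not taken from any particular library.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written in a numerically stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) stays in (0, 1), so nothing overflows.
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x: float) -> float:
    """SiLU: the input scaled by its own sigmoid 'gate'."""
    return x * sigmoid(x)


# Print the gate (fraction passed) next to the resulting SiLU value.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  gate={sigmoid(x):.3f}  silu={silu(x):+.3f}")
```

For large positive inputs the gate approaches 1, so nearly all of x passes through; for large negative inputs the gate approaches 0, so the output shrinks toward 0; at x = 0 the gate is exactly 0.5 and the output is 0.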
