AI by Hand ✍️

New GPT-OSS Trick to Ignore Tokens

Frontier AI Drawings: 3 of 13

Prof. Tom Yeh
Aug 09, 2025

Frontier AI Drawings: the series

  1. "Expert Choice" Mixture of Experts (MoE)

  2. MHA, MQA, GQA, MoE-A: More Attention!

  3. New GPT-OSS Trick to Ignore Tokens

  4. MXFP4, FP4, FP8

  5. LoRA, Fine-Tune, Pre-Train

  6. QLoRA, DoRA, BitFit, NF4 vs INT4

  7. KV Cache, Prefill, Decode

  8. EmbeddingGemma, MRL, InfoNCE, Embed vs. Decode

  9. Inference Batching, Request-vs-Token Level

  10. MLP Parallelism: Data, Context, Row, Column, Pipeline

  11. RoPE vs PE in QKV Self-Attention

  12. RMS, Group, Layer, Batch Norm, Tensor Parallelism

  13. Qwen 3

Big news this week: OpenAI released GPT-OSS, its latest open-weight model, along with a tech report. The release is likely a prelude to the long-awaited GPT-5.

On top of the usual one early-access issue per week, here’s a bonus issue to respond quickly to a release that may become a reference point for future architectures.

Over the next few issues, I’ll prioritize architectures and techniques cited in the OpenAI tech reports.

We’re starting with: how transformers can ignore tokens.

Drawings

I created four new drawings to explore different approaches to handling irrelevant tokens in attention:

  1. Baseline Attention: the standard Softmax formulation, whose weights always sum to 1, so every token is assigned some attention.

  2. Learned Bias in the Denominator: the new method used by OpenAI to allow true “ignoring.”

  3. Off-by-One Softmax: adds a fixed 1 to the Softmax denominator to suppress weak matches.

  4. Sink Tokens: introduces a special token to absorb low-similarity attention.

The last three address the same limitation of the baseline, each in a different way. You’ll learn how they work, why they matter, and what tradeoffs they introduce.
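To make the four ideas concrete before the drawings, here is a minimal Python sketch of how each one could be written for a single query. It is an illustration under stated assumptions, not a reproduction of the drawings or of OpenAI's implementation: the function names, the example scores, and the sink values (`0.0`, `2.0`) are all hypothetical, and a real model would learn the bias and compute the sink token's score from the query.

```python
import math

def softmax(scores):
    """Baseline: weights are positive and always sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_with_denominator_bias(scores, sink_logit):
    """Add one extra term, exp(sink_logit), to the denominator only.

    sink_logit = 0.0 adds exp(0) = 1, i.e. the off-by-one form.
    A trainable sink_logit would give the learned-bias form (assumption:
    here it is just a plain float passed in by the caller).
    """
    m = max(max(scores), sink_logit)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps) + math.exp(sink_logit - m)
    return [e / total for e in exps]

def softmax_with_sink_token(scores, sink_score):
    """Sink token: prepend a real token's score, then apply the baseline."""
    return softmax([sink_score] + scores)

# Hypothetical query-key scores where every token is a weak match.
scores = [-4.0, -5.0, -4.5]

baseline = softmax(scores)
off_by_one = softmax_with_denominator_bias(scores, 0.0)
learned = softmax_with_denominator_bias(scores, 2.0)  # stand-in for a learned value
sink_token = softmax_with_sink_token(scores, 0.0)     # index 0 is the sink

print("baseline total:  ", sum(baseline))
print("off-by-one total:", sum(off_by_one))
print("learned total:   ", sum(learned))
print("sink token share:", sink_token[0])
```

In this sketch the baseline is forced to spread a full unit of attention over three weak matches, while the two denominator variants let the total fall well below 1. The sink token with score 0 gives the real tokens the same weights as the off-by-one form; the difference is that the sink is an actual token in the sequence, so the attention it absorbs is still assigned to something.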

