Registration
This lecture has already been given. The recording is uploaded to our YouTube channel. See above.
Register: https://by-hand.ai/deepseek/register
Date: 1/28/2025 (Tuesday)
Time: 11am (Mountain Time)
Below you will find information to help you get the most out of the lecture:
Lecture Outline
Transformer
Attention
Self-Attention
Multi-Head Attention
Multi-Head Latent Attention (DeepSeek)
Feed-Forward
Single Expert
Mixture of Experts
Sparse Mixture of Experts
Shared+Routed Mixture of Experts (DeepSeek)
RoPE
Recommended Study Methods
[Best] Method 1: To get the most out of my lecture, many of my students recommend using two screens:
First screen: Watch the live lecture in full screen
Second screen: Download the blank Excel workbook and follow along as much as you can.
Method 2: Many students also reported that they simply sit back and watch the live lecture attentively, then wait for the recording to be posted, watch the lecture again, and pause at various moments along the way. This method takes significantly more time but can help with a thorough understanding of the material. Another downside is that it usually takes me and my staff quite some time to edit the recording before we can post it.
Q/A
Q: Who is this lecture for?
A: Originally for my students in the computer vision course, but now for anyone with a similar technical background and interest.
Q: What is my focus?
A: How it works, rather than what it can do. Many others have already commented on what it can do (i.e., benchmark results). I would like to take you inside the black box to understand how it works instead.
Q: Which algorithms?
A: Multi-Head Latent Attention + Mixture of Experts + RoPE
Q: How about RL (DeepSeek-R1)?
A: That would be too much for one lecture. Perhaps another lecture in the future.
Behind the Scenes
This is how I studied DeepSeek: deeply seeking a deeper understanding by sketching the diagram by hand and matching each visual component to its corresponding math equation. Pun intended. 😄
Download Excel
During my lecture, I plan to show you how to build a simplified version of the DeepSeek model using Excel.
You can download the blank Excel workbook below:
Notes from the Community
from Diana Wolf Torres
Yay!! Thanks for the details Dr. Yeh, happy to be going back to school
How do we get the recording? Please advise