GANs need no introduction. Introduced by Ian Goodfellow and his collaborators in 2014, GANs kick-started the field of generative AI for visual data.
Walkthrough
Goal: Generate realistic 4-D data from 2-D noise.
[1] Given
↳ 4 noise vectors in 2D (N)
↳ 4 real data vectors in 4D (X)
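The setup can be sketched in NumPy. The specific numbers below are stand-ins, not the values from the hand-worked exercise:

```python
import numpy as np

rng = np.random.default_rng(0)

# 4 noise vectors in 2-D, one per row (illustrative values only)
N = rng.normal(size=(4, 2))

# 4 real data vectors in 4-D, one per row (illustrative values only)
X = rng.normal(size=(4, 4))

print(N.shape, X.shape)  # (4, 2) (4, 4)
```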
[2] 🟩 Generator: First Layer
↳ Multiply the noise vectors by the weights and add the biases to obtain new feature vectors.
[3] 🟩 Generator: ReLU
↳ Apply the ReLU activation function, which has the effect of suppressing negative values. In this exercise, -1 and -2 are crossed out and set to 0.
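ReLU is a one-liner; here it zeroes out the negative entries (using -1 and -2 as in the exercise, embedded in otherwise illustrative features):

```python
import numpy as np

def relu(z):
    # Suppress negative values: elementwise max(0, z)
    return np.maximum(z, 0)

h = np.array([[3., -1.],
              [2., -2.]])
print(relu(h))  # -1 and -2 are set to 0; positives pass through unchanged
```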
[4] 🟩 Generator: Second Layer
↳ Multiply the features by the weights and add the biases to obtain new feature vectors.
↳ ReLU is applied. But since every value is positive, there's no effect.
↳ These new feature vectors are the "Fake" data (F) generated by this simple 2-layer Generator network.
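The whole 2-layer Generator is just two matrix multiplies with ReLUs. A minimal sketch, assuming a hypothetical hidden width of 3 and random weights (the exercise's actual weights are not reproduced here):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

rng = np.random.default_rng(0)
N = rng.normal(size=(4, 2))              # 4 noise vectors in 2-D

# Generator parameters: 2 -> 3 -> 4 (hidden width of 3 is an assumption)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 4)), np.zeros(4)

H = relu(N @ W1 + b1)                    # first layer + ReLU
F = relu(H @ W2 + b2)                    # second layer + ReLU

print(F.shape)                           # (4, 4): four fake 4-D data vectors
```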
[5] 🟦 Discriminator: First Layer
↳ Feed both the Fake data (F) and the Real data (X) into the first linear layer.
↳ Multiply F and X by the weights and add the biases to obtain new feature vectors.
↳ ReLU is applied. But since every value is positive, there's no effect.
[6] 🟦 Discriminator: Second Layer
↳ Multiply the features by a single set of weights and one bias to obtain new features.
↳ The intended effect is to reduce to just one feature value per data vector.
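A sketch of the Discriminator's two layers, again with stand-in data and a hypothetical hidden width of 3. Stacking F and X gives 8 input rows, and the second layer reduces each row to a single value:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

rng = np.random.default_rng(1)
F = rng.uniform(size=(4, 4))       # fake data (stand-in values)
X = rng.uniform(size=(4, 4))       # real data (stand-in values)
D_in = np.vstack([F, X])           # feed both: 8 rows of 4-D data

# Discriminator parameters: 4 -> 3 -> 1 (hidden width of 3 is an assumption)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

H = relu(D_in @ W1 + b1)           # first layer + ReLU
Z = H @ W2 + b2                    # second layer: one value per data vector

print(Z.shape)                     # (8, 1)
```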
[7] 🟦 Discriminator: Sigmoid σ
↳ Convert the features (Z) to probability values (Y) using the Sigmoid function.
↳ 1 means the Discriminator is 100% confident the data is real.
↳ 0 means the Discriminator is 100% confident the data is fake.
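The Sigmoid squashes any real value into the open interval (0, 1), which is what lets the outputs be read as probabilities:

```python
import numpy as np

def sigmoid(z):
    # Map any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

Z = np.array([-2.0, 0.0, 3.0])
Y = sigmoid(Z)
print(Y.round(3))  # [0.119 0.5   0.953]
```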
[8] 🏋️ Training: 🟦 Discriminator
↳ Compute the Discriminator's loss gradients with the simple equation Y - YD. Why so simple? Because when we use sigmoid and binary cross-entropy loss together, the math simplifies to exactly this expression.
↳ YD are the target predictions from the Discriminator's perspective. The Discriminator must learn to predict 0 for the four Fake data (F) and 1 for the four Real data (X). YD=[0,0,0,0,1,1,1,1].
↳ Note that the Discriminator's loss involves both the Fake data and Real data.
↳ With the loss gradients computed, we can kick off the backpropagation process to update the Discriminator's weights and biases (blue borders).
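The Y - YD shortcut in code, with stand-in Discriminator outputs for the 8 inputs (4 fake, then 4 real):

```python
import numpy as np

# Discriminator outputs for [4 fake, 4 real] inputs (stand-in values)
Y  = np.array([0.3, 0.6, 0.8, 0.4, 0.7, 0.9, 0.5, 0.6])

# Targets from the Discriminator's perspective: 0 for fake, 1 for real
YD = np.array([0., 0., 0., 0., 1., 1., 1., 1.])

# For L = -[yd*log(y) + (1-yd)*log(1-y)] with y = sigmoid(z),
# dL/dz collapses to y - yd
dZ = Y - YD
print(dZ)
```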
[9] 🏋️ Training: 🟩 Generator
↳ Compute the Generator's loss gradients with the simple equation Y - YG.
↳ YG are the target predictions from the Generator's perspective. The Generator must fool the Discriminator into predicting 1 for the four Fake data (F). YG=[1,1,1,1].
↳ Note that the Generator's loss involves only the Fake data.
↳ With the loss gradients computed, we can kick off the backpropagation process to update the Generator's weights and biases (green borders).
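The Generator's gradient uses only the 4 fake outputs and flips the targets; with the same stand-in values as above, every gradient entry is negative, pushing the Discriminator's outputs on fake data toward 1:

```python
import numpy as np

# Discriminator outputs on the 4 fake vectors only (stand-in values)
Y  = np.array([0.3, 0.6, 0.8, 0.4])

# Targets from the Generator's perspective: fool D into predicting 1
YG = np.array([1., 1., 1., 1.])

# Same sigmoid + binary cross-entropy shortcut; this gradient then flows
# back through the (frozen) Discriminator into the Generator's weights
dZ = Y - YG
print(dZ)
```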