A friendly introduction to
Generative Adversarial Networks
Luis Serrano
thispersondoesnotexist.com
Python (no packages!)
www.github.com/luisguiserrano/gans
General idea
[Figure: the GAN game. The generator paints an image and shows it to the discriminator, which turns it down ("Nope!") round after round, until one image finally fools it ("Yup!"). The discriminator then compares against real images, catches the fake ("AHA!!"), and the game continues with both networks improving.]
Build the simplest GAN
Slanted Land: the people are slanted, the screens are 2×2 pixels, and the neural networks have one layer. The task: tell faces apart from images with no faces (noise).
[Figure: sample 2×2 images with pixel values between 0 and 1. Faces have big values on the diagonal and small values off it, like [1, 0 / 0, 1]; noise images have arbitrary values, like [0.25, 1 / 0.5, 0.75].]
Building the Discriminator
Compare a face like [1, 0 / 0, 1] with noise like [0.25, 1 / 0.5, 0.75]: in a face, the diagonal pixels (top-left, bottom-right) are big and the off-diagonal pixels are small; in noise, any pixel can be anything.
So the discriminator gives the diagonal pixels weight +1, gives the off-diagonal pixels weight −1, and adds up the weighted pixel values:

Face [1, 0 / 0, 1]: 1·1 + 0·(−1) + 0·(−1) + 1·1 = 2
Noise [0.25, 1 / 0.5, 0.75]: 0.25·1 + 1·(−1) + 0.5·(−1) + 0.75·1 = −0.5

Threshold = 1. More than 1: face. Less than 1: no face.
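This check is a few lines of pure Python (a quick sketch in the talk's "no packages" spirit; the weights and threshold are the hand-picked values above):

weights = [1, -1, -1, 1]   # (top-left, top-right, bottom-left, bottom-right)

def score(image):
    # Weighted sum of the four pixel values.
    return sum(w * x for w, x in zip(weights, image))

print(score([1, 0, 0, 1]))          # 2    -> more than 1: face
print(score([0.25, 1, 0.5, 0.75]))  # -0.5 -> less than 1: no face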
Discriminator
To turn the score into a probability, add a bias of −1 and apply the sigmoid σ(x) = 1 / (1 + e^(−x)):

Face [1, 0 / 0, 1]: σ(1·1 + 0·(−1) + 0·(−1) + 1·1 − 1) = σ(1) = 0.73
Noise [0.25, 1 / 0.5, 0.75]: σ(0.25·1 + 1·(−1) + 0.5·(−1) + 0.75·1 − 1) = σ(−1.5) = 0.18
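The same computation in pure Python (a sketch; only the standard math module is needed):

from math import exp

def sigmoid(x):
    return 1 / (1 + exp(-x))

weights, bias = [1, -1, -1, 1], -1

def discriminator(image):
    # Probability that a 2x2 image (flattened to 4 values) is a face.
    return sigmoid(sum(w * x for w, x in zip(weights, image)) + bias)

print(round(discriminator([1, 0, 0, 1]), 2))          # 0.73 -> face
print(round(discriminator([0.25, 1, 0.5, 0.75]), 2))  # 0.18 -> noise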
Building the Generator
The generator works backwards: starting from a random number z, it should paint an image with big diagonal pixels and small off-diagonal pixels, so that it looks like a face.
Generator
Pick a random number z, say z = 0.7, and feed it through four sigmoid neurons, one per pixel. Give the diagonal neurons weight +1 and bias +1, and the off-diagonal neurons weight −1 and bias −1:

Top-left: σ(+1·0.7 + 1) = σ(1.7) = 0.85
Top-right: σ(−1·0.7 − 1) = σ(−1.7) = 0.15
Bottom-left: σ(−1·0.7 − 1) = σ(−1.7) = 0.15
Bottom-right: σ(+1·0.7 + 1) = σ(1.7) = 0.85

Output: [0.85, 0.15 / 0.15, 0.85], a face.
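In pure Python (a sketch reusing the sigmoid above; the ±1 weights and biases are the hand-picked values):

v = [1, -1, -1, 1]   # one weight per pixel
c = [1, -1, -1, 1]   # one bias per pixel

def generator(z):
    # Map a random number z to a 2x2 image (flattened to 4 pixel values).
    return [sigmoid(vi * z + ci) for vi, ci in zip(v, c)]

print([round(p, 2) for p in generator(0.7)])  # [0.85, 0.15, 0.15, 0.85] -- a face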
The training process:
Error functions
Log-loss error function
Error = −ln(prediction) when the label is 1:
Label 1, prediction 0.1: error is large, −ln(0.1) = 2.3
Label 1, prediction 0.9: error is small, −ln(0.9) = 0.1

Error = −ln(1 − prediction) when the label is 0:
Label 0, prediction 0.1: error is small, −ln(1 − 0.1) = 0.1
Label 0, prediction 0.9: error is large, −ln(1 − 0.9) = 2.3
Summary
If we want a prediction to be 1: log-loss = −ln(prediction). [Plot of y = −ln(x): high error near x = 0, low error near x = 1.]
If we want a prediction to be 0: log-loss = −ln(1 − prediction). [Plot of y = −ln(1 − x): low error near x = 0, high error near x = 1.]
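Both losses in pure Python (a quick sketch reproducing the numbers above):

from math import log

def loss_want_1(prediction):
    # Log-loss when the label is 1: small when the prediction is near 1.
    return -log(prediction)

def loss_want_0(prediction):
    # Log-loss when the label is 0: small when the prediction is near 0.
    return -log(1 - prediction)

print(round(loss_want_1(0.1), 1))  # 2.3 -- large error
print(round(loss_want_1(0.9), 1))  # 0.1 -- small error
print(round(loss_want_0(0.1), 1))  # 0.1 -- small error
print(round(loss_want_0(0.9), 1))  # 2.3 -- large error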
The training process:
Backpropagation
[Figure: backpropagation. Compute the error of the prediction, then take a step down the error plot, nudging each weight in the direction that shrinks the error.]
Training the generator and the discriminator
Pick a random z, generate an image, say G(z) = [0.25, 1 / 0.5, 0.75], and feed it to the discriminator, which outputs a prediction, say D(G(z)) = 0.68.
The discriminator wants this prediction to be 0 (the image is fake): Error = −ln(1 − 0.68).
The generator wants it to be 1 (it is trying to fool the discriminator): Error = −ln(0.68).
Generator error: −ln(D(G(z))). Discriminator error: −ln(1 − D(G(z))).
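Plugging in the example prediction (a quick check in Python):

from math import log

p = 0.68                      # D(G(z)): the discriminator's output on the fake
print(round(-log(1 - p), 2))  # 1.14 -- discriminator's error (it wanted 0)
print(round(-log(p), 2))      # 0.39 -- generator's error (it wanted 1)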
Now feed the discriminator a real image, say x = [1, 0 / 0, 1], and suppose it outputs D(x) = 0.44. The discriminator wants this prediction to be 1 (the image is real): Error = −ln(0.44).
Discriminator error on real images: −ln(D(x)).
Repeat many times…
Alternate the two steps: generate an image and train both networks on it, then train the discriminator on a real image.
After many of these iterations (epochs)
[Figure: the trained generator and discriminator, with learned weights (e.g. 2.5, 2.8, −3.4, −2.9) replacing the hand-picked ±1 values. The generator now turns any z into a face-like image.]
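The whole loop fits in a short pure-Python script (a minimal sketch, not the exact repo code; the learning rate, epoch count, seed, and tiny two-face dataset are illustrative choices, and the update rules are the derivatives worked out in the "Math and Code" section that follows):

import random
from math import exp

def sigmoid(x):
    return 1 / (1 + exp(-x))

random.seed(42)
lr, epochs = 0.1, 1000

dw, db = [random.uniform(-1, 1) for _ in range(4)], 0.0       # discriminator
gv, gc = [random.uniform(-1, 1) for _ in range(4)], [0.0] * 4  # generator

faces = [[1, 0, 0, 1], [0.9, 0.1, 0.2, 0.8]]  # tiny "real" dataset

for _ in range(epochs):
    # Discriminator on a real face: wants D(x) = 1, error -ln(D(x)).
    x = random.choice(faces)
    d = sigmoid(sum(w * p for w, p in zip(dw, x)) + db)
    dw = [w - lr * (-(1 - d) * p) for w, p in zip(dw, x)]
    db -= lr * (-(1 - d))

    # Generate a fake image from a random z.
    z = random.random()
    g = [sigmoid(v * z + c) for v, c in zip(gv, gc)]

    # Discriminator on the fake: wants D(G(z)) = 0, error -ln(1 - D(G(z))).
    d = sigmoid(sum(w * p for w, p in zip(dw, g)) + db)
    dw = [w - lr * (d * p) for w, p in zip(dw, g)]
    db -= lr * d

    # Generator: wants D(G(z)) = 1, error -ln(D(G(z))).
    d = sigmoid(sum(w * p for w, p in zip(dw, g)) + db)
    for i in range(4):
        grad = -(1 - d) * dw[i] * g[i] * (1 - g[i])
        gv[i] -= lr * grad * z
        gc[i] -= lr * grad

# The generated image should now lean face-like: big diagonal, small off-diagonal.
print([round(sigmoid(v * 0.5 + c), 2) for v, c in zip(gv, gc)])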
Math and Code
www.github.com/luisguiserrano/gans
Discriminator
Input pixels x1, x2, x3, x4; weights w1, w2, w3, w4; bias b.
Prediction: D(x) = σ(x1w1 + x2w2 + x3w3 + x4w4 + b)
Loss function (error) from real images: E = −ln(D(x))

Derivatives:
∂E/∂wi = (∂E/∂D)·(∂D/∂wi)
       = (−1/D(x)) · σ(∑j xjwj + b)[1 − σ(∑j xjwj + b)] xi
       = (−1/D(x)) · D(x)[1 − D(x)] xi
       = −[1 − D(x)] xi
∂E/∂b = (∂E/∂D)·(∂D/∂b) = −[1 − D(x)]
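As a gradient-descent step in Python (a sketch reusing sigmoid from above; the learning rate 0.1 is an illustrative choice):

def update_on_face(weights, bias, image, lr=0.1):
    # One descent step on a real image; error is -ln(D(x)).
    d = sigmoid(sum(w * x for w, x in zip(weights, image)) + bias)
    weights = [w - lr * (-(1 - d) * x) for w, x in zip(weights, image)]
    bias = bias - lr * (-(1 - d))
    return weights, bias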
Discriminator
Same network, now fed a noise (or generated) image such as [0.25, 1 / 0.5, 0.75].
Prediction: D(x) = σ(x1w1 + x2w2 + x3w3 + x4w4 + b)

Loss function (error) from noise: E = −ln(1 − D(x))

Derivatives:
∂E/∂wi = (∂E/∂D)·(∂D/∂wi)
       = (1/(1 − D(x))) · σ(∑j xjwj + b)[1 − σ(∑j xjwj + b)] xi
       = (1/(1 − D(x))) · D(x)[1 − D(x)] xi
       = D(x) xi
∂E/∂b = (∂E/∂D)·(∂D/∂b) = D(x)
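The matching step in Python (same sketch conventions as above):

def update_on_noise(weights, bias, image, lr=0.1):
    # One descent step on a noise or generated image; error is -ln(1 - D(x)).
    d = sigmoid(sum(w * x for w, x in zip(weights, image)) + bias)
    weights = [w - lr * (d * x) for w, x in zip(weights, image)]
    bias = bias - lr * d
    return weights, bias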
Generator
Input z; weights v1, v2, v3, v4; biases c1, c2, c3, c4.
Predictions: G(z) = (G1, G2, G3, G4) = (σ(v1z + c1), σ(v2z + c2), σ(v3z + c3), σ(v4z + c4))
D(G(z)) = σ(G1w1 + G2w2 + G3w3 + G4w4 + b)

Loss function (error): E = −ln(D(G(z)))

Derivatives:
∂E/∂vi = (∂E/∂D)·(∂D/∂Gi)·(∂Gi/∂vi)
       = (−1/D(G(z))) · D(G(z))[1 − D(G(z))] wi · Gi(1 − Gi) z
       = −[1 − D(G(z))] wi Gi(1 − Gi) z
∂E/∂ci = −[1 − D(G(z))] wi Gi(1 − Gi)
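And the generator's step in Python (same sketch conventions; d_weights and d_bias are the discriminator's current parameters):

def update_generator(v, c, d_weights, d_bias, z, lr=0.1):
    # One descent step for the generator; error is -ln(D(G(z))).
    g = [sigmoid(vi * z + ci) for vi, ci in zip(v, c)]
    d = sigmoid(sum(w * gi for w, gi in zip(d_weights, g)) + d_bias)
    new_v, new_c = [], []
    for vi, ci, wi, gi in zip(v, c, d_weights, g):
        grad = -(1 - d) * wi * gi * (1 - gi)  # common factor of dE/dv_i, dE/dc_i
        new_v.append(vi - lr * grad * z)
        new_c.append(ci - lr * grad)
    return new_v, new_c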
Error function plots
[Figure: the generator and discriminator error functions plotted over the training epochs.]
Acknowledgements
With a little help from my friends…
Sahil Juneja — @sjuneja90
Diego Gomez Mosquera — https://medium.com/@diegoalejogm
Alejandro Perdomo — @aperdomoortiz
Conclusion
Grokking Machine Learning, by Luis G. Serrano
https://www.manning.com/books/grokking-machine-learning
Discount code: serranoyt
Thank you!
Subscribe, like, share, comment!
@luis_likes_math
youtube.com/c/LuisSerrano
http://serrano.academy
