This document summarizes generative models, focusing on variational autoencoders (VAEs) and generative adversarial networks (GANs). It begins with an introduction to information theory, defining key concepts such as entropy, and to maximum likelihood estimation. It then characterizes generative models as estimating the joint distribution P(X, Y), in contrast to discriminative models, which estimate the conditional distribution P(Y|X). VAEs are presented as maximizing the evidence lower bound (ELBO), which fits an approximation to the latent posterior P(Z|X) and allows new values of X to be generated. GANs are covered as a minimax game between a generator G and a discriminator D, in which G learns to produce samples that resemble the empirical data distribution P_emp.
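
As a small illustration of the entropy and ELBO concepts named in this summary, the following Python sketch checks numerically, on a toy model with a binary latent variable Z, that the gap between log P(x) and the ELBO equals KL(q || P(Z|x)). The prior, the likelihood values, and the approximate posterior `q` are made-up numbers chosen for illustration. They are assumptions and do not come from the summarized document.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in nats of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl(q, p):
    """KL(q || p) in nats for two discrete distributions on the same support."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


# Hypothetical toy model: binary latent Z and a single fixed observation x.
prior = [0.5, 0.5]        # P(Z = z) for z = 0, 1 (assumed values)
likelihood = [0.9, 0.2]   # P(X = x | Z = z) for z = 0, 1 (assumed values)

joint = [pz * px for pz, px in zip(prior, likelihood)]  # P(x, Z = z)
evidence = sum(joint)                                    # P(x)
posterior = [j / evidence for j in joint]                # exact P(Z | x)


def elbo(q):
    """ELBO for this x: E_q[log P(x, Z)] + H(q)."""
    expected_log_joint = sum(
        qi * math.log(ji) for qi, ji in zip(q, joint) if qi > 0
    )
    return expected_log_joint + entropy(q)


q = [0.6, 0.4]  # an arbitrary approximate posterior (assumed values)

# The gap between the log evidence and the ELBO is the KL divergence
# from q to the exact posterior, so the ELBO never exceeds log P(x).
gap = math.log(evidence) - elbo(q)

print("log P(x)        =", math.log(evidence))
print("ELBO(q)         =", elbo(q))
print("gap             =", gap)
print("KL(q || P(Z|x)) =", kl(q, posterior))
```

When `q` is set to the exact posterior, the gap vanishes and the ELBO equals log P(x). That is the sense in which maximizing the ELBO pushes the approximate posterior toward P(Z|X).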