Generalized Loss Functions for Generative Adversarial Networks
Unsupervised learning, Generative models, Machine learning, Artificial intelligence, Optimization, Information theory
This thesis investigates the use of parameterized families of information-theoretic measures to generalize the loss functions of generative adversarial networks (GANs), with the objective of improving performance. A new generator loss function, called least kth-order GAN (LkGAN), is introduced; it generalizes that of least squares GANs (LSGANs) by using a kth-order absolute error distortion measure with k greater than or equal to 1, and recovers the LSGAN loss function when k = 2. It is shown that minimizing this generalized loss function under an (unconstrained) optimal discriminator is equivalent to minimizing the kth-order Pearson-Vajda divergence. A novel loss function for the original GANs, using Rényi information measures with parameter alpha, is presented next: the GAN generator loss function is generalized in terms of Rényi cross-entropy functionals. For any alpha > 0, this generalized loss function is shown to preserve the equilibrium point satisfied by the original GAN loss based on the Jensen-Rényi divergence, a natural extension of the Jensen-Shannon divergence. It is also proved that the Rényi-centric loss function reduces to the original GAN loss function as alpha approaches 1. Experiments on the MNIST and CelebA datasets, under both DCGAN and StyleGAN architectures, indicate that the proposed LkGAN and RényiGAN systems confer performance benefits by virtue of the extra degrees of freedom provided by the parameters k and alpha, respectively. More specifically, extensive simulations show improvements in the quality of the generated images, as measured by the Fréchet Inception Distance (FID) score, and in training stability.
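
As an illustration of the kth-order absolute error distortion described above, the following is a minimal sketch and not the thesis implementation. It assumes the generator loss is the mean of |D(G(z)) - c|^k over a batch, where the function name `lk_generator_loss`, the argument `d_fake` (discriminator outputs on generated samples), and the target value `c` are hypothetical names chosen for this example. Under that assumption, k = 2 gives a least-squares form, up to any constant scaling factor.

```python
def lk_generator_loss(d_fake, k=2.0, c=1.0):
    """Mean k-th order absolute error between discriminator outputs and a target.

    d_fake: list of discriminator outputs D(G(z)) on generated samples
            (hypothetical input; any sequence of floats works).
    k:      order of the absolute error, assumed to satisfy k >= 1.
    c:      value the generator wants the discriminator to output
            (an assumption of this sketch).
    """
    if k < 1:
        raise ValueError("k must be greater than or equal to 1")
    if len(d_fake) == 0:
        raise ValueError("d_fake must contain at least one value")
    # Average |D(G(z)) - c|^k over the batch.
    return sum(abs(d - c) ** k for d in d_fake) / len(d_fake)


if __name__ == "__main__":
    batch = [0.0, 0.5, 1.0]
    # Sweep a few orders to see how k reweights large versus small errors.
    for order in (1.0, 2.0, 3.0):
        print(order, lk_generator_loss(batch, k=order))
```

With errors below 1 in magnitude, larger k shrinks small errors faster than large ones, which is one way the extra parameter changes the gradient signal compared with the fixed k = 2 case.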