Image Generation Using Variational Autoencoder — With Code — Part 1

Praveen Krishna Murthy
3 min read · Jun 19, 2021


What if we have only a few images and would like to generate more, with slight variations of the originals? This blog tells you how.


This blog talks about the autoencoder and its applications, in order to build up to the variational autoencoder, which will be discussed in Part 2.

The contents of this blog are as follows:

  1. Introduction
  2. Autoencoder
  3. Applications of Autoencoder
  4. Variational Autoencoder

Introduction

In a lot of applications, we feed a whole bunch of data into a machine learning architecture. But this data can often be represented in a much simpler form, in a lower-dimensional space than the one we actually observe. So many techniques in machine learning try to compress the data into a much simpler form. One of the most common techniques in the literature is the Variational Autoencoder. We will get a bit technical, and I will also explain how this architecture can be used to generate new images. Hope you are ready to dive in.

Before we jump into Variational Autoencoders, we will first understand the plain autoencoder.

Autoencoders

I will assume you have general knowledge of neural networks, cost functions, and the backpropagation performed to train a neural network. What an autoencoder does is this: it takes input data of very high dimensionality, runs it through a neural network, and tries to compress the data into a smaller representation. It does this with two principal components: an encoder and a decoder.

The encoder is simply a stack of layers, which can be fully connected or convolutional. These layers take the input and compress it into a representation of lower dimension than the input; this is called the bottleneck. From the bottleneck, the decoder reconstructs the input. The reconstruction loss is calculated, the weights of the encoder and decoder are adjusted via their gradients, and the forward pass is repeated iteratively until a good bottleneck representation is obtained.
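To make this concrete, here is a minimal sketch of such an encoder and decoder in PyTorch (my choice of framework; the layer sizes, the 32-dimensional bottleneck, and the dummy batch are illustrative assumptions, not values from any particular paper):

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, bottleneck_dim=32):
        super().__init__()
        # Encoder: compress the high-dimensional input down to the bottleneck
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, bottleneck_dim),
        )
        # Decoder: reconstruct the input from the bottleneck
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, 128),
            nn.ReLU(),
            nn.Linear(128, input_dim),
            nn.Sigmoid(),  # pixel values assumed to lie in [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)      # bottleneck representation
        return self.decoder(z)   # reconstruction of the input

model = Autoencoder()
criterion = nn.MSELoss()  # reconstruction loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training iteration: forward pass, reconstruction loss, backprop
x = torch.rand(64, 784)        # dummy batch of flattened 28x28 images
loss = criterion(model(x), x)  # compare the reconstruction to the input itself
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note that the target of the loss is the input itself; that is what makes the training unsupervised.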

Fig. 1 — Autoencoder architecture

Note that this representation of compressed data is a form of data-specific compression. Google is reportedly exploring this on mobile: the compressed image is sent to the phone, and a decoder network on the device reconstructs the high-resolution image. Image segmentation works more or less the same way. The image is forward-passed through convolution layers, and deconvolution layers reconstruct the image, but this time only the segmentation map is reconstructed. Segmentation is the grouping of similar pixels, and that is exactly what we teach the deconvolution layers.

Applications

Image denoising: Denoising images is one big application of the autoencoder. Start with the MNIST data and add noise to the images. Pass these noisy images through the encoder, so that the bottleneck holds a representation of the noisy image. Now, instead of reconstructing the noisy image, we reconstruct the denoised one by computing the loss against the clean image. Fig. 2 shows a representation of image denoising using an autoencoder, and a code sketch follows the figure.

Fig. 2 — Autoencoder-based image denoising
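Here is a minimal sketch of that idea, reusing the Autoencoder class from the sketch above. The Gaussian noise level of 0.3 and the five training epochs are illustrative assumptions of mine:

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms

# MNIST digits as tensors in [0, 1]
mnist = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(mnist, batch_size=64, shuffle=True)

model = Autoencoder()  # defined in the sketch above
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5):
    for images, _ in loader:
        clean = images.view(images.size(0), -1)        # flatten 28x28 -> 784
        noisy = clean + 0.3 * torch.randn_like(clean)  # corrupt the input
        noisy = noisy.clamp(0.0, 1.0)
        reconstruction = model(noisy)                  # encode/decode the noisy image
        loss = criterion(reconstruction, clean)        # target is the CLEAN image
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The crucial detail is that the network sees the noisy image while the loss is computed against the clean one, so the bottleneck learns a noise-free representation.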

Face reconstruction: Imagine the input data consists of faces with black spots or missing pixels. Use the autoencoder to compress these faulty images, and while reconstructing, use the original, intact images to calculate the loss. Once trained to a minimum loss, the network acts as a face reconstructor. The sketch below shows the corruption step.
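Compared to the denoising loop above, the only change is the corruption step: instead of adding noise, we zero out a random square patch of each image and still compute the loss against the intact original. This helper, its name, and the 8x8 patch size are my own illustrative choices:

```python
import torch

def mask_random_patch(images, patch=8):
    """Zero out one random patch per flattened 28x28 image to simulate missing pixels."""
    corrupted = images.clone().view(-1, 28, 28)
    for img in corrupted:
        top = torch.randint(0, 28 - patch, (1,)).item()
        left = torch.randint(0, 28 - patch, (1,)).item()
        img[top:top + patch, left:left + patch] = 0.0  # knock out a square region
    return corrupted.view(images.size(0), -1)

# Inside the training loop, swap the noising step for masking:
#   corrupted = mask_random_patch(clean)
#   loss = criterion(model(corrupted), clean)  # loss against the intact image
```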

Neural painting: Similar to face reconstruction, except that the removed part or missing pixels are in a painting.

Watermark removal: The same network also serves in applications such as removing watermarks.

References:

https://www.youtube.com/watch?v=9zKuYvjFFS8
