Abstract

In this project, I explored two techniques for restoring black-and-white images of faces: principal component analysis ("eigenfaces") and convolutional autoencoders. The first method writes an image dataset as a matrix in which each row is a flattened image vector and finds the principal components of that matrix (the eigenvectors of its covariance matrix). A noisy image can then be approximated by reconstructing it from these image eigenvectors. The second method involves training a deep network consisting of an encoder stack, which maps the input image to a smaller vector, and a decoder stack, which recreates the input image from the vector produced by the encoder.

Both methods produce compressed versions of the dataset and use them for approximating the initial version of a deteriorated image. Principal component analysis is fast and does not involve training a network. Autoencoders, however, are more flexible, since they can be trained to ignore specific kinds of noise in the training data. 

 

The Linear Method

My interest in image restoration was sparked by Tim Chartier's When Life is Linear [2]. The author performs principal component analysis on 12 black and white portraits of American presidents (including John Fitzgerald Kennedy), finds 6 "eigenfaces", and uses them to restore a modified image of JFK.

 

Fig. 1 A modified version of an image of JFK (left) and an approximation of the original with principal component analysis (right). From When Life is Linear by Tim Chartier, 2015, Washington: American Mathematical Society.

Chartier finds the principal components of a face image dataset consisting of 12 black-and-white portraits of American presidents with similar pose and lighting and a blank background. He then approximates the query image, a modified version of an image in the original dataset, as the sum of the average image in the dataset and a linear combination of 6 principal components. The coefficient in front of each principal component in this approximation is determined by a dot product of the principal component and the query image.
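The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used in the book or in this project: the function names `fit_eigenfaces` and `restore` are hypothetical, the data is random rather than real portraits, and I assume the average image is subtracted from the query before taking the dot products, which is the usual PCA convention.

```python
import numpy as np

def fit_eigenfaces(images, k):
    """images: (n_images, n_pixels) array with one flattened image per row.
    Returns the average image and the first k principal components."""
    mean = images.mean(axis=0)
    centered = images - mean
    # The rows of vt are orthonormal principal directions,
    # sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def restore(query, mean, components):
    """Approximate a flattened query image as the average image plus a
    linear combination of the principal components."""
    # One coefficient per component: a dot product with the centered query
    # (centering first is an assumption of this sketch).
    coeffs = components @ (query - mean)
    return mean + coeffs @ components

# Toy stand-in for a dataset of 12 images with 64 pixels each.
rng = np.random.default_rng(0)
images = rng.random((12, 64))
mean, eigenfaces = fit_eigenfaces(images, k=6)

query = images[0].copy()
query[:8] = 0.0                      # crude stand-in for a "modified" image
approximation = restore(query, mean, eigenfaces)
print(approximation.shape)
```

Because the components are orthonormal, the dot products give the best least-squares fit of the query within the span of the chosen eigenfaces, which is why no further solving step is needed.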

 

I tried to reproduce the method on a subset of the CMU Faces dataset that includes only portraits with a straight pose and no sunglasses.

Fig. 2 Left to right: the average image and the first six "eigenfaces" in the subset of the CMU dataset.

 

 

 

Fig. 3 Left to right: original image in dataset, manually deteriorated image, restoration using principal component analysis.

 

My initial results were a bit underwhelming, but not entirely surprising – I was working with a small dataset (62 images) which had a significant amount of variation in pose and background. I tried to find out if working with a bigger dataset that is better suited for principal component analysis could produce better results.

 

For my second experiment, I used the LFWcrop dataset, a cropped, black-and-white version of Labeled Faces in the Wild. This second dataset has 1070 images that feature a significant amount of pose variation but less background variation, because the faces are cropped in the image.

Fig. 4 Left to right: the average image and the first six "eigenfaces" of the LFWcrop dataset.

 

Fig. 5 Left to right: original picture of Bill Gates in the dataset, manually deteriorated version, result of "eigenface" approximation performed on the deteriorated image.

 

The results were much better on the larger, cleaner dataset. However, one problem with the "eigenface" approach is that, provided a large enough dataset, the algorithm can use eigenfaces to model any detail in the query image, including noise. Here is an illustration of the "eigenface" algorithm's ability to recreate detail:

Fig. 6 (Left) picture of me (not included in the dataset used for finding the "eigenfaces") and an approximation of it using "eigenfaces" extracted from the LFWcrop dataset.

 

However, the PCA approach will not discriminate between desirable and undesirable detail in the image and will try to recreate both:

 

Fig. 7 (Left) deteriorated picture of me and an attempt to restore it with "eigenfaces" extracted from the LFWcrop dataset.

 

A more powerful approach would enable restoration that keeps desired features and ignores undesired ones. 

 

The Non-Linear Method

An alternative approach to image restoration is using autoencoders. An autoencoder is a neural network that can learn to map an input image to a lower-dimensional vector and create an approximation of the original image from this latent vector. Autoencoders are commonly used for data denoising – the autoencoder is fed the original images as the desired output and noisy versions of them as the input, so that it learns to extract the important features from a noisy image.

 

An autoencoder consists of two parts – an encoder stack, which learns to map an image to a lower-dimensional vector, and a decoder stack, which learns to recreate the original image from the latent vector. For my image restoration task, I implemented a convolutional autoencoder of a kind commonly used for image denoising. Its encoder stack has a convolutional layer with 64 filters, a 2x2 max pooling layer, another convolutional layer with 64 filters, and a final 2x2 max pooling layer which produces the encoding. The decoder stack has a convolutional layer with 64 filters, a 2x2 upsampling layer, another convolutional layer with 64 filters, another 2x2 upsampling layer, and a final convolutional layer with 1 filter that produces the query image approximation.
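To make the pooling and upsampling layers concrete, here is a small NumPy sketch of what a 2x2 max pooling layer and a 2x2 upsampling layer do to the resolution of a single-channel 64x64 image. It deliberately leaves out the convolutional layers and all learned weights, so it is not the network itself. The function names are hypothetical, and nearest-neighbour upsampling is an assumption on my part.

```python
import numpy as np

def max_pool_2x2(x):
    """Halve the resolution of an (H, W) array by keeping the maximum of
    each non-overlapping 2x2 block (H and W must be even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_2x2(x):
    """Double the resolution of an (H, W) array by repeating every pixel
    in a 2x2 block (nearest-neighbour upsampling)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

image = np.arange(64 * 64, dtype=float).reshape(64, 64)

# Encoder side: two pooling steps, 64x64 -> 32x32 -> 16x16.
encoded = max_pool_2x2(max_pool_2x2(image))

# Decoder side: two upsampling steps, 16x16 -> 32x32 -> 64x64.
decoded = upsample_2x2(upsample_2x2(encoded))

print(image.shape, encoded.shape, decoded.shape)
```

Upsampling alone cannot bring back the detail discarded by pooling; in the real network, the convolutional layers between these steps are what learn to fill it in.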

 

For training, I decided I would like to use noise that is similar to the mustache I drew on Bill Gates' portrait. The mustache most closely resembles two completely black horizontal lines, so I implemented a method that draws between 1 and 5 random black lines on a query image with a preference for horizontal lines (70% of the time). 
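A noise function along these lines might look like the sketch below. Only the line count (1 to 5) and the 70% preference for horizontal lines come from the description above. The function name `add_line_noise`, the line thickness, the random line lengths, and the choice to make every non-horizontal line vertical are assumptions made for illustration.

```python
import numpy as np

def add_line_noise(image, rng, p_horizontal=0.7, max_thickness=3):
    """Return a copy of a 2-D grayscale image with 1 to 5 black lines drawn
    on it. The input image is left unchanged."""
    noisy = image.copy()
    h, w = noisy.shape
    n_lines = rng.integers(1, 6)          # upper bound is exclusive: 1..5
    for _ in range(n_lines):
        thickness = rng.integers(1, max_thickness + 1)   # assumed range
        if rng.random() < p_horizontal:
            # Horizontal line: random row, random start and end column.
            row = rng.integers(0, h - thickness + 1)
            start = rng.integers(0, w - 1)
            end = rng.integers(start + 1, w + 1)
            noisy[row:row + thickness, start:end] = 0.0
        else:
            # Assumption: every other line is vertical.
            col = rng.integers(0, w - thickness + 1)
            start = rng.integers(0, h - 1)
            end = rng.integers(start + 1, h + 1)
            noisy[start:end, col:col + thickness] = 0.0
    return noisy

rng = np.random.default_rng(42)
clean = np.ones((64, 64))                 # stand-in for a 64x64 face image
noisy = add_line_noise(clean, rng)
print(int((noisy == 0).sum()), "pixels blacked out")
```

During training, each clean image would serve as the target and a freshly generated noisy copy as the input, so the network sees a different set of lines every time.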

 

Fig. 8 Example of the image deterioration I tried to model for the training.

 

 

 

 

 

Fig. 9 Left: noisy testset image input, right: autoencoder output

 

Here is the autoencoder performance on the manually deteriorated Bill Gates image:

Fig. 10 Left: manually deteriorated image input, right: autoencoder output

 

The autoencoder does a fairly good job of ignoring the black boxes in the foreground and reconstructing the face behind them. The autoencoder outputs are a bit blurry in this implementation, but it is worth keeping in mind that I worked with fairly small images, 64x64 pixels; we could reasonably expect less of a difference in overall image clarity between input and output in an implementation based on larger images.

 

Discussion

Both principal component analysis and autoencoders can be used for image restoration. Autoencoders, however, yield much better performance when we can model the particular type of noise that we want to remove, owing to their capacity to learn the important features in an image. Principal component analysis and autoencoders have many other applications in image processing. They are essentially methods of learning a particular set of features from a data set and finding a more compact representation of them.

 

Because autoencoders produce a more compact representation of images, they can be used for image compression. In practice, however, hard-coded approaches to image compression, such as JPEG, are much more common. One of the biggest problems of traditional autoencoders is that they require input images of fixed size. The authors of [1] propose a new autoencoding method that addresses the fixed-size problem by attending over the image in a window of fixed size and producing codewords that are stored in a variable-size array. The authors show how the autoencoder approach produces a higher quality reconstruction of the image at fewer bits per pixel. In addition, they argue for the usefulness of this learned approach over hard-coded approaches citing the growing diversity of image-processing needs.

 

[3] presents a cool use of image compression – content-based image retrieval. The authors propose a method of mapping images to short binary codes that are then used for hashing in content-based image retrieval. In order to map the images to good binary codes, the algorithm needs to extract meaningful features from the image, and it does that using – you guessed it – autoencoders. The authors show that measures of distance in the binary code space are much more semantically useful than ones computed at pixel level. And the ability to produce meaningful short codes for each image makes for an extremely powerful semantic-hashing tool, one that can extract a meaningful image in a length of time independent of the database size.
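The retrieval step can be illustrated with a toy example. The sketch below assumes each image has already been mapped to a short binary code, stored here as a Python integer; the autoencoder that would produce the codes is not shown. The names `build_index` and `lookup`, the 4-bit codes, and the use of Hamming distance with a radius of one bit are illustrative assumptions, not details taken from [3].

```python
from collections import defaultdict

def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")

def build_index(codes):
    """Map each binary code to the list of image ids that share it."""
    index = defaultdict(list)
    for image_id, code in codes.items():
        index[code].append(image_id)
    return index

def lookup(index, query_code, n_bits, radius=1):
    """Return the ids whose code differs from query_code in at most
    `radius` bits (this sketch supports radius 0 or 1 only)."""
    candidates = [query_code]
    if radius >= 1:
        # Flip each bit in turn to enumerate all codes at distance 1.
        candidates += [query_code ^ (1 << bit) for bit in range(n_bits)]
    return sorted(i for code in candidates for i in index.get(code, []))

# Hypothetical 4-bit codes for three images.
codes = {"img_a": 0b1011, "img_b": 0b1010, "img_c": 0b0100}
index = build_index(codes)

print(lookup(index, 0b1011, n_bits=4))
print(hamming(codes["img_a"], codes["img_c"]))
```

The number of dictionary lookups depends only on the code length and the search radius, not on how many images are stored, which is the sense in which retrieval time is independent of the database size.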

 

Autoencoding and principal component analysis are methods of data compression – methods that can be used in a wide variety of applications, from image restoration to content-based image retrieval.

 

Conclusion

In this project, I explored using principal component analysis and autoencoders for image restoration. I found that both can restore an image based on a dataset of similar images, but also that autoencoders produce significantly better results when the type of noise in the data is known beforehand and can be used for training.

 

Linear Image Restoration (PCA):

Non-Linear Image Restoration (Autoencoder):

 

References

[1] A. K. Ashok & N. Palani (2018). Autoencoders with Variable Sized Latent Vector for Image Compression. CVPR Workshops. Retrieved from http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w50/Ashok_Autoencoders_with_Variable_CVPR_2018_paper.pdf

[2] T. Chartier (2015). When life is linear: From computer graphics to bracketology. Retrieved from https://ebookcentral.proquest.com

[3] A. Krizhevsky & G. E. Hinton (2011). Using very deep autoencoders for content-based image retrieval. ESANN. Retrieved from https://pdfs.semanticscholar.org/64b5/4bdf023624da4f261cdd18ac57716658e81f.pdf

 

Datasets

Mitchell, T. (1999). CMU Face Images Data Set. Retrieved from:

http://archive.ics.uci.edu/ml/datasets/cmu+face+images

Sanderson, C. (2009). LFWcrop Face Dataset. Retrieved from:

http://conradsanderson.id.au/lfwcrop/


While working on this project, I discussed my ideas with Prof. Maxwell and Maan Qraitem. I also consulted the following online resources:

Eigenfaces:
https://sandipanweb.wordpress.com/2018/01/06/eigenfaces-and-a-simple-face-detector-with-pca-svd-in-python/

PCA and SVD:
https://medium.com/@jonathan_hui/machine-learning-singular-value-decomposition-svd-principal-component-analysis-pca-1d45e885e491

Introduction to Autoencoders:
https://towardsdatascience.com/deep-inside-autoencoders-7e41f319999f

Using Autoencoders:
https://www.learnopencv.com/understanding-autoencoders-using-tensorflow-python/

Building Autoencoders:
https://blog.keras.io/building-autoencoders-in-keras.html