
What are autoencoders

Autoencoders are a type of neural network architecture that belongs to the family of unsupervised learning methods. The architecture extracts the most important features of the input and discards the rest as irrelevant.

This allows autoencoders to learn a reduced representation of the information, making them an excellent method for compressing information.

From this reduced representation, the architecture can reconstruct an approximation of the original information, which is why autoencoders are often discussed alongside generative algorithms.

Generative and discriminative models

The statistical algorithms used in artificial intelligence can be classified into generative methods and discriminative methods.

Discriminative algorithms are those that learn a function that maps the independent variables to the dependent variable of interest. For example, a car-versus-motorcycle classifier is a discriminative model: given some input variables, in this case the pixels of an image, it can discriminate whether the image shows a car or a motorcycle.

A sales prediction model is also a discriminative model since, based on some variables, it is capable of predicting the value of sales at a given time.

In other words, discriminative algorithms learn the conditional probability of Y given X: P(Y|X).

Generative algorithms, on the other hand, such as variational autoencoders, latent Dirichlet allocation or Boltzmann machines, generate information from an internal representation called the latent space. In this case, the model learns the joint probability P(X,Y).
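
The two quantities are related by the chain rule of probability, which is why a model of the joint distribution implicitly contains the discriminative one:

```latex
P(X, Y) = P(Y \mid X)\,P(X)
\qquad\Longrightarrow\qquad
P(Y \mid X) = \frac{P(X, Y)}{P(X)}
```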

Autoencoders as generative and discriminative models

As we have seen, one of the strong points of autoencoders is learning a new, more compressed representation of information.

To perform this task, the autoencoder learns to reproduce its own input. This may seem trivial, but if we add noise or limit the size of the internal representation, we obtain a model that learns the most important characteristics of the information so that it can be compressed efficiently.

Most autoencoders are not considered generative models. However, as we will see later, there is one in particular, the variational autoencoder or VAE, that learns the distribution of the independent variables and can generate, with high probability, outputs very similar to the training data. For this reason, VAEs are considered generative algorithms.

Parts and architecture of the autoencoder

As we will see in the next section, there are different types of autoencoders. However, they all share a basic architecture.

The first part is known as the encoder and is responsible for compressing the information. This is achieved by stacking several layers of neurons, reducing the number of neurons in each successive layer.

The second part of the architecture is the latent space, the smallest representation of the information within the neural network.

The last part is known as the decoder, which mirrors the encoder in reverse: each layer increases the number of neurons until the last layer has the same number of neurons as the first.

[Figure: autoencoder architecture]
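
To make the three parts concrete, here is a minimal sketch in PyTorch. The layer sizes, the 784-dimensional input (e.g., a flattened 28x28 image) and the 32-dimensional latent space are illustrative assumptions, not fixed choices.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: each layer has fewer neurons than the previous one.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),            # output lives in the latent space
        )
        # Decoder: mirrors the encoder, growing back to the input size.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)        # compressed representation
        return self.decoder(z)     # reconstruction of the input

model = Autoencoder()
x = torch.rand(16, 784)            # dummy batch standing in for real data
# The target of the loss is the input itself: the network learns to reproduce it.
loss = nn.functional.mse_loss(model(x), x)
```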

Types of autoencoders

There are different structures that autoencoders can adopt. Each has its own name and specializes in a particular task.

Stacked autoencoder

Stacked autoencoders, or deep autoencoders, are simply architectures that use several layers of neurons to learn more complex representations of the information.

They have a symmetrical structure with respect to the latent space layer and are very effective at reconstructing information.
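
A sketch of a deeper, symmetric stack with a short training loop; the layer sizes and the dummy data are illustrative assumptions.

```python
import torch
import torch.nn as nn

# A deeper, symmetric stack: encoder 784 -> 256 -> 64 -> 16,
# decoder mirrored as 16 -> 64 -> 256 -> 784.
encoder = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 16),
)
decoder = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Sigmoid(),
)

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3
)
x = torch.rand(16, 784)                # dummy batch standing in for real data
for _ in range(10):                    # a few illustrative training steps
    reconstruction = decoder(encoder(x))
    loss = nn.functional.mse_loss(reconstruction, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```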

Variational autoencoder

Variational autoencoders, or VAEs, are generative algorithms that infer a continuous distribution over the training data using techniques from a branch of statistics known as variational inference.

The encoder outputs a mean and a standard deviation that define a Gaussian distribution over the latent space. Samples drawn from this distribution are passed to the decoder, which can then create synthetic data very similar to the real data.
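
A minimal sketch of the encoder side of a VAE, assuming the same illustrative 784-dimensional input as above. The reparameterization trick shown here is the standard way to make the sampling step differentiable; the full training loss would be the reconstruction error plus the KL term.

```python
import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    """Maps an input to the mean and log-variance of a Gaussian over the latent space."""
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)       # mean of the Gaussian
        self.log_var = nn.Linear(128, latent_dim)  # log-variance (more stable than predicting std directly)

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.log_var(h)

def sample_latent(mu, log_var):
    # Reparameterization trick: z = mu + std * eps with eps ~ N(0, I),
    # which keeps the sampling step differentiable for backpropagation.
    std = torch.exp(0.5 * log_var)
    return mu + std * torch.randn_like(std)

def kl_divergence(mu, log_var):
    # KL(N(mu, std^2) || N(0, I)); added to the reconstruction loss so the
    # learned latent distribution stays close to a standard Gaussian.
    return -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())

encoder = VAEEncoder()
mu, log_var = encoder(torch.rand(16, 784))
z = sample_latent(mu, log_var)   # a decoder would map z back to data space
```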

Sparse autoencoder

This architecture limits the number of neurons that activate at once, forcing the algorithm to learn more complex patterns. This is achieved through a sparsity constraint: a term added to the cost function that pushes the autoencoder to reduce the number of active neurons during training.
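
One common way to implement the sparsity constraint is an L1 penalty on the latent activations (penalizing the divergence between the average activation and a target sparsity level is another classic option). A minimal sketch, with an illustrative weight:

```python
import torch

def sparsity_penalty(latent, weight=1e-4):
    # L1 penalty on the latent activations: pushes most of them toward zero,
    # so only a few neurons stay active for any given input.
    return weight * latent.abs().sum()

# Added to the reconstruction loss during training, reusing the
# encoder/decoder pair from the earlier sketches:
# z = encoder(x)
# loss = nn.functional.mse_loss(decoder(z), x) + sparsity_penalty(z)
```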

Denoising autoencoder

Denoising autoencoders use a regularization technique that avoids trivial learning and forces the network to learn complex patterns: random Gaussian noise is added to corrupt the input data, and the network is trained to reconstruct the clean input, forcing it to extract the relevant patterns from the information.
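
A minimal sketch of the corruption step, assuming the encoder/decoder pair from the earlier sketches; the noise level is an illustrative value.

```python
import torch

def add_gaussian_noise(x, std=0.1):
    # Corrupt the input with zero-mean Gaussian noise.
    return x + std * torch.randn_like(x)

# The network sees the noisy input but is scored against the clean one,
# so it must learn to strip the noise away:
# noisy = add_gaussian_noise(x)
# loss = nn.functional.mse_loss(decoder(encoder(noisy)), x)
```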

Autoencoder applications

Autoencoders have multiple uses across different sectors. We will mention a few, but there are many more: many problems can be framed in such a way that an autoencoder architecture is able to solve them.

  1. Information compression: the main strength of these algorithms is their ability to extract the most important features of the information and create a new representation of it in a reduced dimension.
  2. Fraud detection: autoencoders have been used by companies like PayPal to build fraud detection systems by extracting the key features that determine whether a given transaction is fraudulent.
  3. Image generation: another possible application is image generation, which can be very useful for design tasks and for creating new freely distributed image datasets.