Stop everything: the Godfather of Deep Learning published a paper!

Geoffrey Hinton, who has been called “the Godfather of Deep Learning”, published a paper last week. Hinton is one of the most influential researchers in machine learning, so when he publishes something new, people get quite excited!

Who is he?

Hinton is one of the researchers behind backpropagation (the original article, from 1986, is here). Backpropagation went on to become the most widely used method for training deep learning models.

More recently, however, he explained that he doesn’t believe backpropagation is the best way to do AI: it is a method that works, but not necessarily the best one. More about his point of view here.

Other concerns have also appeared recently: changing just a few pixels in an image can completely fool a deep learning classification model. Several papers have been published on the subject, here, here, or here!

What is the paper about?

The paper aims to reduce the influence of individual pixels and to preserve the spatial relationships between elements. In a nutshell: to be more robust.

The paper introduces a capsule model called CapsNet containing 3 layers: two convolutional layers and one fully connected (dense) layer. This is the architecture of CapsNet:

a) Convolutional layer

The first layer is a traditional convolutional layer.
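In the paper this first layer has 256 feature maps with 9×9 kernels, a stride of 1, and a ReLU activation. A minimal PyTorch sketch (the layer sizes come from the paper; the variable name is mine):

import torch.nn as nn

# Conv1: 256 feature maps, 9x9 kernels, stride 1, ReLU, applied to a
# 1-channel 28x28 MNIST image -> output of shape (batch, 256, 20, 20)
conv1 = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=256, kernel_size=9, stride=1),
    nn.ReLU(),
)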

b) Capsule layer

The second layer is a convolutional capsule layer containing 32 channels of 8D capsules. A capsule is basically a small group of units whose output is a vector rather than a single scalar. In practice, we apply a convolutional operation 32 times (once per channel) and concatenate all these outputs.
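To make this concrete, here is one way such a primary capsule layer could be sketched in PyTorch: a single convolution with 32 × 8 output channels, regrouped into 8D vectors. This is a sketch under my own naming, not the official implementation (in the paper these vectors are then passed through the squashing function described below):

import torch
import torch.nn as nn

class PrimaryCapsules(nn.Module):
    # 32 channels of 8D capsules, implemented as one convolution with
    # 32 * 8 = 256 output channels whose output is regrouped into vectors
    def __init__(self, in_channels=256, n_channels=32, capsule_dim=8):
        super().__init__()
        self.n_channels = n_channels
        self.capsule_dim = capsule_dim
        self.conv = nn.Conv2d(in_channels, n_channels * capsule_dim,
                              kernel_size=9, stride=2)

    def forward(self, x):
        out = self.conv(x)                  # (batch, 32*8, 6, 6) on MNIST
        batch, _, h, w = out.size()
        # split the channels into 32 groups of 8, then flatten so that
        # each 8D slice at each spatial position becomes one capsule
        out = out.view(batch, self.n_channels, self.capsule_dim, h, w)
        return out.permute(0, 1, 3, 4, 2).reshape(batch, -1, self.capsule_dim)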

c) Routing algorithm and Digitcaps

The final layer is DigitCaps, which uses the routing-by-agreement algorithm. Hinton replaced max pooling with this routing algorithm. Instead of squashing the output of a single unit, it squashes an entire vector.
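The squashing function from the paper keeps a vector’s orientation while shrinking its length into [0, 1), so that the length can act as a probability. A minimal PyTorch version (the eps term is my addition for numerical stability):

import torch

def squash(s, dim=-1, eps=1e-8):
    # v = (||s||^2 / (1 + ||s||^2)) * (s / ||s||): short vectors shrink
    # towards zero, long vectors towards (just below) unit length
    squared_norm = (s ** 2).sum(dim=dim, keepdim=True)
    scale = squared_norm / (1.0 + squared_norm)
    return scale * s / torch.sqrt(squared_norm + eps)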

Implementations

There is an implementation available in TensorFlow and one in Keras.

Results

Hinton achieved state-of-the-art performance on MNIST and claims that his algorithm performs far better than a classic convolutional network on overlapping digits; the paper shows examples of these kinds of digits.


Conclusion

That was a quick overview of this exciting new paper by Hinton. I am looking forward to using the implementations on some new datasets! 🙂

Principal Component Analysis (PCA) implemented with PyTorch

What is PCA?

PCA is an algorithm that finds patterns in data; it is used to reduce the dimensionality of the data.

Let X be a matrix of size (m, n). We want to find an encoding function f such that f(X) = C, where C is a matrix of size (m, l) with l < n, and a decoding function g that can approximately reconstruct X, such that g(C) ≈ X.

C is a representation of X in a lower dimension; we want to find f so that the loss of information is minimal.
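For PCA, both functions are linear: f(X) = XD and g(C) = CDᵀ, where D is an (n, l) matrix with orthonormal columns. Minimizing the reconstruction error between X and g(f(X)) turns out to select, for the columns of D, the eigenvectors of XᵀX with the largest eigenvalues, which is exactly what the SVD below computes.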

PCA implementation steps

To understand each step of this article, you need to know what SVD and eigendecomposition are. However, if you don’t, you can still read it and use the implementation!

Data preprocessing

We suppose that the data is stored in a NumPy array (here iris.data, from scikit-learn's iris dataset loaded in the visualization section below) and convert it to a PyTorch tensor. k is the number of components we want after the transformation.

import torch

k = 3
X = torch.from_numpy(iris.data)  # iris is loaded from scikit-learn below

We need to center the data (subtract the per-feature mean):

# subtract the mean of each column (feature) from every row
X_mean = torch.mean(X, 0)
X = X - X_mean.expand_as(X)

Perform Singular Value Decomposition

With torch.svd() we obtain the singular value decomposition of the transposed data matrix. The columns of U are the eigenvectors of XᵀX (the covariance matrix of the centered data), and S contains the singular values in decreasing order, so U[:,:k] corresponds to the k largest singular values, i.e. the first k principal directions.

# SVD of the transposed data: the columns of U are the principal directions
U, S, V = torch.svd(torch.t(X))
# encode: project the centered data onto the first k directions
C = torch.mm(X, U[:, :k])
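Since the columns of U are orthonormal, the matching decoding function is just a multiplication by the transpose. A quick sanity check (X_rec is a name I introduce here):

# decode: g(C) = C U[:,:k]^T approximately reconstructs the centered data
X_rec = torch.mm(C, torch.t(U[:, :k]))
print(torch.norm(X - X_rec))  # small when k captures most of the variance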

Visualization

We will use our PCA function:

def PCA(data, k=2):
    # preprocess the data: convert to a tensor and center each feature
    X = torch.from_numpy(data)
    X_mean = torch.mean(X, 0)
    X = X - X_mean.expand_as(X)

    # SVD of the transposed data, then projection onto the first k directions
    U, S, V = torch.svd(torch.t(X))
    return torch.mm(X, U[:, :k])

Now we will visualize the PCA of the iris dataset from scikit-learn:

from sklearn import datasets
import matplotlib.pyplot as plt

iris = datasets.load_iris()
X = iris.data
y = iris.target

# project the data onto the first two principal components
X_PCA = PCA(X).numpy()

plt.figure()
for i, target_name in enumerate(iris.target_names):
    plt.scatter(X_PCA[y == i, 0], X_PCA[y == i, 1], label=target_name)
plt.legend()
plt.title('PCA of IRIS dataset')
plt.show()

PCA allowed us to visualize the iris dataset in two dimensions and to find combinations of attributes that identify each type of iris.