Non-negative matrix factorization using Autograd

In a previous post, we saw how to perform non-negative matrix factorization (NNMF) using TensorFlow. In this post, we will look at performing NNMF using Autograd. Like TensorFlow, Autograd provides automatic gradient computation: we can write the cost function in plain NumPy and get its gradients for free.
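For readers new to Autograd, here is a minimal illustration (an addition, not from the original post) of what automatic gradient calculation looks like: grad turns an ordinary NumPy function into a function that computes its derivative.

from autograd import grad
import autograd.numpy as np

def f(x):
    return x ** 2

df = grad(f)       # df(x) computes the derivative, 2 * x
print(df(3.0))     # 6.0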

Customary imports

In [1]:
import autograd.numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Creating the matrix to be decomposed

In [2]:
A = np.array([[3, 4, 5, 2],
              [4, 4, 3, 3],
              [5, 5, 4, 3]], dtype=np.float32).T
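Note that the trailing .T makes A a 4 x 3 matrix, so the factor W below will be 4 x rank and H will be rank x 3.

A.shape    # (4, 3) after the transpose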

Masking one entry

In [3]:
A[0, 0] = np.nan
In [4]:
A
Out[4]:
array([[ nan,   4.,   5.],
       [  4.,   4.,   5.],
       [  5.,   3.,   4.],
       [  2.,   3.,   3.]], dtype=float32)

Defining the cost function

In [5]:
def cost(W, H):
    pred = np.dot(W, H)
    mask = ~np.isnan(A)    # score only the observed entries
    return np.sqrt(np.mean((pred - A)[mask] ** 2))
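As a quick sanity check (not in the original post), the mask should keep the NaN entry from contaminating the cost; if it leaked through, the cost itself would be NaN:

# Hypothetical all-ones factors, with shapes matching A (4 x 3) and rank 2
assert not np.isnan(cost(np.ones((4, 2)), np.ones((2, 3))))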

Decomposition params

In [6]:
rank = 2
learning_rate = 0.01
n_steps = 10000
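As an aside (an observation, not from the original post): with rank 2, the factor W is 4 x 2 and H is 2 x 3, so we are fitting 14 parameters to the 11 observed entries of A.

n_params = A.shape[0] * rank + rank * A.shape[1]   # 8 + 6 = 14
n_observed = int((~np.isnan(A)).sum())             # 11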

Gradient of the cost with respect to the parameters W and H

In [7]:
from autograd import grad, multigrad
grad_cost = multigrad(cost, argnums=[0, 1])
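Note: newer releases of Autograd no longer ship multigrad. If the import fails, an equivalent (a sketch using only autograd.grad, which takes the argument index to differentiate with respect to) is:

from autograd import grad

grad_W = grad(cost, 0)   # gradient of cost with respect to W
grad_H = grad(cost, 1)   # gradient of cost with respect to H

def grad_cost(W, H):
    return grad_W(W, H), grad_H(W, H)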

Main gradient descent routine

In [8]:
shape = A.shape
# Random non-negative initialisation of the factors
H = np.abs(np.random.randn(rank, shape[1]))
W = np.abs(np.random.randn(shape[0], rank))
print("Iteration, Cost")
for i in range(n_steps):

    if i % 1000 == 0:
        print("*" * 20)
        print(i, ",", cost(W, H))
    del_W, del_H = grad_cost(W, H)
    W = W - del_W * learning_rate
    H = H - del_H * learning_rate

    # Project back onto the non-negative orthant so that W and H remain
    # non-negative. This is also called projected gradient descent.
    W[W < 0] = 0
    H[H < 0] = 0
Iteration, Cost
********************
0 , 2.54464061107
********************
1000 , 0.154436155862
********************
2000 , 0.101498903833
********************
3000 , 0.090621768384
********************
4000 , 0.0873420701633
********************
5000 , 0.086696747195
********************
6000 , 0.0865793255187
********************
7000 , 0.086557934527
********************
8000 , 0.0865540097305
********************
9000 , 0.0865532865438
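The cost has essentially plateaued around 0.0866 well before the last iteration, so 10000 steps are more than enough for this small problem.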
In [9]:
pd.DataFrame(W)
Out[9]:
          0         1
0  1.839280  1.034108
1  1.459627  1.546009
2  1.902534  0.163451
3  0.667133  1.456460
In [10]:
pd.DataFrame(H)
Out[10]:
          0         1         2
0  2.606525  1.436522  2.027277
1  0.151855  1.313192  1.229221
In [11]:
pred = np.dot(W, H)
pred_df = pd.DataFrame(pred).round()
pred_df
Out[11]:
     0    1    2
0  5.0  4.0  5.0
1  4.0  4.0  5.0
2  5.0  3.0  4.0
3  2.0  3.0  3.0
In [12]:
pd.DataFrame(A)
Out[12]:
     0    1    2
0  NaN  4.0  5.0
1  4.0  4.0  5.0
2  5.0  3.0  4.0
3  2.0  3.0  3.0
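The rounded reconstruction matches A on every observed entry, and the masked entry at (0, 0) is imputed as 5.0 (the value we removed at the start was 3).

pred[0, 0]    # ~5.0, the model's estimate for the held-out entry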