Non-negative matrix factorization using Autograd

In a previous post, we saw how to perform non-negative matrix factorization (NNMF) using TensorFlow. In this post, we will do the same using Autograd. Like TensorFlow, Autograd supports automatic differentiation, so we only have to write the cost function and let the library compute its gradients.

Customary imports

In [2]:
import autograd.numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Creating the matrix to be decomposed

In [3]:
A = np.array([[3, 4, 5, 2],
              [4, 4, 3, 3],
              [5, 5, 4, 3]], dtype=np.float32).T

Masking one entry

In [4]:
A[0, 0] = np.nan
In [5]:
A
Out[5]:
array([[ nan,   4.,   5.],
       [  4.,   4.,   5.],
       [  5.,   3.,   4.],
       [  2.,   3.,   3.]], dtype=float32)
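The NaN marks the held-out entry; the cost function below will use `~np.isnan(A)` as a boolean mask so that only the observed entries are scored. A small self-contained numpy sketch of that masking idea:

```python
import numpy as np

A_toy = np.array([[np.nan, 4.0],
                  [5.0,    2.0]])
pred = np.array([[3.0, 4.0],
                 [5.0, 2.0]])
mask = ~np.isnan(A_toy)                  # True where A_toy is observed
rmse = np.sqrt(((pred - A_toy)[mask] ** 2).mean())
print(rmse)  # 0.0 -- the NaN entry is excluded from the error
```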

Defining the cost function

In [210]:
def cost(W, H):
    pred = np.dot(W, H)
    mask = ~np.isnan(A)  # score only the observed entries
    # RMSE over the observed entries
    C = np.sqrt(((pred - A)[mask] ** 2).mean())
    # Regularization: largest column L1 norm of H
    D = max(np.sum(np.abs(H), axis=0))
    return C + D
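The cost is the sum of two terms: the RMSE over the observed entries, and a regularizer equal to the largest column L1 norm of H. A small numpy sketch of just the regularizer term, on made-up values:

```python
import numpy as np

H_toy = np.array([[1.0, -2.0],
                  [3.0,  0.5]])
col_l1 = np.sum(np.abs(H_toy), axis=0)  # per-column L1 norms: [4.0, 2.5]
D = max(col_l1)                         # regularizer keeps the largest one
print(D)  # 4.0
```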

Decomposition params

In [222]:
rank = 5
learning_rate = 0.5
n_steps = 4000

Gradient of cost wrt params W and H

In [223]:
from autograd import grad, multigrad
grad_cost = multigrad(cost, argnums=[0, 1])

Main gradient descent routine

In [224]:
shape = A.shape
H = np.abs(np.random.randn(rank, shape[1]))
W = np.abs(np.random.randn(shape[0], rank))
print("Iteration, Cost")
for i in range(n_steps):
    if i % 500 == 0:
        print("*" * 20)
        print(i, ",", cost(W, H))
    del_W, del_H = grad_cost(W, H)
    W = W - del_W * learning_rate
    H = H - del_H * learning_rate

    # Keep W and H non-negative by clipping after each step.
    # This is also called projected gradient descent.
    W[W < 0] = 0
    H[H < 0] = 0
Iteration, Cost
********************
0 , 11.1645842634
********************
500 , 3.18111432175
********************
1000 , 3.16056906983
********************
1500 , 3.1625775801
********************
2000 , 3.13009606497
********************
2500 , 3.26063346835
********************
3000 , 3.00275139422
********************
3500 , 3.17581052995
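The clipping step at the end of each iteration is the projection: setting negative entries to zero is the Euclidean projection onto the non-negative orthant. A trivial sketch of that step in isolation:

```python
import numpy as np

W_toy = np.array([-0.5, 0.2, 1.0])
W_toy[W_toy < 0] = 0  # project onto W >= 0 by clipping negatives
print(W_toy)  # [0.  0.2 1. ]
```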
In [225]:
pd.DataFrame(W)
Out[225]:
0 1 2 3 4
0 0.00000 0.00000 0.000000 3.838099 0.00000
1 0.00000 0.00000 0.000000 3.747767 0.00000
2 0.00000 0.00000 1.483943 3.358283 0.00000
3 0.87381 1.72481 0.000000 2.104210 0.41053
In [226]:
len(H[H>0])
Out[226]:
8
In [227]:
pd.DataFrame(H)
Out[227]:
0 1 2
0 0.000000 0.058065 0.000000
1 0.000000 0.114614 0.000000
2 0.336290 0.075604 0.000000
3 1.152227 1.291911 0.669434
4 0.000000 0.027280 0.000000
In [228]:
pred = np.dot(W, H)
pred_df = pd.DataFrame(pred).round()
pred_df
Out[228]:
0 1 2
0 4.0 5.0 3.0
1 4.0 5.0 3.0
2 4.0 4.0 2.0
3 2.0 3.0 1.0
In [229]:
pd.DataFrame(A)
Out[229]:
0 1 2
0 NaN 4.0 5.0
1 4.0 4.0 5.0
2 5.0 3.0 4.0
3 2.0 3.0 3.0
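Since the masked entry A[0, 0] was 3 before masking (see the original matrix above) and the reconstruction gives 4.0 at that position, the held-out error can be read off directly; a trivial check with those two values:

```python
held_out = 3.0  # value at (0, 0) before it was masked
imputed = 4.0   # pred_df.iloc[0, 0] from the run above
print(abs(imputed - held_out))  # absolute imputation error: 1.0
```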