# Non-negative matrix factorization using TensorFlow

In a previous post, we saw how to perform non-negative matrix factorization (NNMF) using non-negative least squares (NNLS). In this post, we will look at performing NNMF using TensorFlow. As before, we will factorize matrices that may contain missing entries (as in the movie-recommendation problem, for example). As explained in a previous post, we will use projected gradient descent: at each iteration, we compute the gradient, take an ordinary gradient descent step, and then project the weights back so that they stay non-negative.
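Before setting this up in TensorFlow, here is a minimal NumPy sketch of the same projected gradient descent scheme on a toy, fully observed matrix; all names in it are illustrative, and it is not part of the notebook that follows.

import numpy as np

# Toy projected gradient descent for NNMF (illustrative sketch)
A = np.abs(np.random.randn(4, 3))                  # toy fully observed matrix
rank, lr, steps = 2, 0.01, 1000
W = np.abs(np.random.randn(4, rank))
H = np.abs(np.random.randn(rank, 3))
for _ in range(steps):
    R = W @ H - A                                  # reconstruction residual
    W, H = W - lr * (R @ H.T), H - lr * (W.T @ R)  # gradient step (up to a constant factor)
    W, H = np.maximum(W, 0), np.maximum(H, 0)      # projection: clip negatives to zero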

### Customary imports

In [1]:
import tensorflow as tf
import numpy as np
import pandas as pd
np.random.seed(0)


### Creating the matrix to be decomposed

In [2]:
A_orig = np.array([[3, 4, 5, 2],
                   [4, 4, 3, 3],
                   [5, 5, 4, 4]], dtype=np.float32).T

A_orig_df = pd.DataFrame(A_orig)

In [3]:
A_orig_df #(4 users, 3 movies)

Out[3]:
     0    1    2
0  3.0  4.0  5.0
1  4.0  4.0  5.0
2  5.0  3.0  4.0
3  2.0  3.0  4.0

In [4]:
A_df_masked = A_orig_df.copy()
A_df_masked.iloc[0, 0] = np.nan  # hide one entry to simulate a missing rating

In [5]:
np_mask = A_df_masked.notnull()
np_mask

Out[5]:
       0     1     2
0  False  True  True
1   True  True  True
2   True  True  True
3   True  True  True
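The single False at position (0, 0) marks the rating we just hid; the factorization will have to impute it from the remaining entries.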

### Basic TensorFlow setup

In [6]:
# Boolean mask for computing cost only on valid (not missing) entries
tf_mask = tf.Variable(np_mask.values)

A = tf.constant(A_df_masked.values)
shape = A_df_masked.values.shape

#latent factors
rank = 3

# Initializing random H and W
temp_H = np.random.randn(rank, shape[1]).astype(np.float32)
temp_H = np.divide(temp_H, temp_H.max())

temp_W = np.random.randn(shape[0], rank).astype(np.float32)
temp_W = np.divide(temp_W, temp_W.max())

H = tf.Variable(temp_H)
W = tf.Variable(temp_W)
WH = tf.matmul(W, H)
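Note that `np.random.randn` can produce negative entries in the initial `W` and `H`; these are taken care of by the clipping (projection) operation defined below, which runs after every gradient step.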


### Cost function

In [7]:
# Cost: squared Frobenius norm of the reconstruction error,
# computed only over the valid (not missing) entries
cost = tf.reduce_sum(tf.pow(tf.boolean_mask(A, tf_mask) - tf.boolean_mask(WH, tf_mask), 2))


### Misc. TensorFlow

In [8]:
# Learning rate
lr = 0.001
# Number of steps
steps = 1000
# Plain gradient descent on the masked cost; the projection step comes next
train_step = tf.train.GradientDescentOptimizer(lr).minimize(cost)
init = tf.global_variables_initializer()


### Ensuring non-negativity

In [9]:
# Clipping operation. This ensures that the learnt W and H are non-negative
clip_W = W.assign(tf.maximum(tf.zeros_like(W), W))
clip_H = H.assign(tf.maximum(tf.zeros_like(H), H))
clip = tf.group(clip_W, clip_H)
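
Clipping at zero is exactly the Euclidean projection onto the non-negative orthant, so running `clip` after every gradient step is what turns plain gradient descent into projected gradient descent.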


### Main TensorFlow routine

In [10]:
steps = 1000
with tf.Session() as sess:
    sess.run(init)
    for i in range(steps):
        sess.run(train_step)
        sess.run(clip)
        if i % 100 == 0:
            print("\nCost: %f" % sess.run(cost))
            print("*" * 40)
    learnt_W = sess.run(W)
    learnt_H = sess.run(H)

Cost: 148.859848
****************************************

Cost: 3.930172
****************************************

Cost: 2.068570
****************************************

Cost: 1.418309
****************************************

Cost: 0.819721
****************************************

Cost: 0.399933
****************************************

Cost: 0.176080
****************************************

Cost: 0.079007
****************************************

Cost: 0.041353
****************************************

Cost: 0.027041
****************************************


### Computing the prediction

In [11]:
learnt_H

Out[11]:
array([[ 0.86129224,  1.3388027 ,  1.97224879],
       [ 2.16338873,  0.97277433,  1.17212451],
       [ 0.25879648,  1.07861733,  1.09541821]], dtype=float32)

In [12]:
learnt_W

Out[12]:
array([[ 1.15797794,  0.97454673,  1.41825044],
       [ 1.44136858,  1.16967547,  0.79135358],
       [ 0.81640321,  1.98227394,  0.02636297],
       [ 1.38819814,  0.29285902,  0.8031919 ]], dtype=float32)

In [13]:
pred = np.dot(learnt_W, learnt_H)
pred_df = pd.DataFrame(pred)
pred_df.round()

Out[13]:
     0    1    2
0  3.0  4.0  5.0
1  4.0  4.0  5.0
2  5.0  3.0  4.0
3  2.0  3.0  4.0

Not bad! For comparison, here is our original matrix again.

In [14]:
A_orig_df

Out[14]:
     0    1    2
0  3.0  4.0  5.0
1  4.0  4.0  5.0
2  5.0  3.0  4.0
3  2.0  3.0  4.0
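
Finally, recall that the (0, 0) entry was hidden during training. As a quick illustrative check (not a cell from the notebook above), we can compare the imputed value against the held-out truth:

# Compare the model's imputed value at the masked position with the true rating
print("Predicted: %.2f, actual: %.1f" % (pred_df.iloc[0, 0], A_orig_df.iloc[0, 0]))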