Non-negative matrix factorization with missing entries using CVXPY

Like my previous few posts, in this post I'll illustrate solving non-negative matrix factorization. I'll borrow an example provided by CVXPY and modify it in three ways:

  1. allowing missing entries
  2. modifying the cost function
  3. adding regularisation

This CVXPY implementation uses an alternating least squares (ALS) style method to compute the factor matrices: we hold one factor fixed, solve a convex problem for the other, and alternate, as sketched below.
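Concretely, writing $\Omega$ for the set of observed (non-missing) entries of $A$, the problem being solved is, roughly,

$$
\min_{W \ge 0,\; H \ge 0} \; \left\| \left( A - W H \right)_{\Omega} \right\|_2
$$

This objective is not jointly convex in $W$ and $H$, but it is convex in each factor when the other is held fixed, which is what makes the alternating scheme work.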

Imports

In [1]:
import cvxpy as cvx
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Creating the matrix to be decomposed

In [2]:
A = np.array([[3, 4, 5, 2],
              [4, 4, 3, 3],
              [5, 5, 4, 3]], dtype=np.float32).T

Masking one entry

In [3]:
A[0, 0] = np.nan
A
Out[3]:
array([[ nan,   4.,   5.],
       [  4.,   4.,   5.],
       [  5.,   3.,   4.],
       [  2.,   3.,   3.]], dtype=float32)

Unlike previous posts, I'll provide the source code in one go, with comments inline. We will support two cost functions, both defined below:

  1. minimise the absolute error between the given matrix and the product of the decomposed matrices
  2. minimise the relative error between the given matrix and the product of the decomposed matrices
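Concretely, for each observed entry $(i, j)$ these are (the small $10^{-3}$ term, taken from the code below, guards against division by zero):

$$
e^{\mathrm{abs}}_{ij} = A_{ij} - (WH)_{ij}, \qquad
e^{\mathrm{rel}}_{ij} = \frac{A_{ij} - (WH)_{ij}}{A_{ij} + 10^{-3}}
$$
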
In [4]:
def nmf_features(A, k, constant=0.01, regularisation=False, MAX_ITERS=30, cost='absolute'):
    """Factorise A into non-negative W and H via alternating minimisation.

    Parameters
    ----------
    A: matrix to be decomposed (m rows and n columns); missing entries are NaN
    k: number of latent factors
    constant: coefficient of regularisation
    regularisation: whether to use regularisation
    MAX_ITERS: maximum number of iterations
    cost: 'absolute' or 'relative'

    Returns
    -------
    W: m x k non-negative matrix
    H: k x n non-negative matrix
    residual: objective value after each iteration

    """

    np.random.seed(0)

    m, n = A.shape
    # Boolean mask of observed entries; the cost is computed only over these.
    mask = ~np.isnan(A)

    # Initialize W randomly.
    W = np.random.rand(m, k)

    # Perform alternating minimization.

    residual = np.zeros(MAX_ITERS)
    for iter_num in range(1, 1 + MAX_ITERS):
        # For odd iterations, treat W as constant, optimize over H.
        if iter_num % 2 == 1:
            H = cvx.Variable((k, n))
            constraint = [H >= 0]

        # For even iterations, treat H as constant, optimize over W.
        else:
            W = cvx.Variable((m, k))
            constraint = [W >= 0]

        # Errors are computed only over the observed entries.
        abs_error = A[mask] - (W @ H)[mask]
        # For the relative error, divide element-wise by the observed
        # entries; the 1e-3 guards against division by zero.
        one_A = cvx.Constant(1.0 / (A[mask] + 1e-3))
        rel_error = cvx.multiply(one_A, abs_error)
        if cost == 'absolute':
            error = abs_error
        else:
            error = rel_error

        # Build the objective, optionally penalising the norm of the
        # factor currently being optimised.
        if not regularisation:
            obj = cvx.Minimize(cvx.norm(error, 2))
        else:
            if iter_num % 2 == 1:
                obj = cvx.Minimize(cvx.norm(error, 2) + constant * cvx.norm(H))
            else:
                obj = cvx.Minimize(cvx.norm(error, 2) + constant * cvx.norm(W))

        # Solve the problem.
        prob = cvx.Problem(obj, constraint)
        prob.solve(solver=cvx.SCS)

        if prob.status != cvx.OPTIMAL:
            print("Iteration {}: solver status {}".format(iter_num, prob.status))

        residual[iter_num - 1] = prob.value
        # Convert variable to a NumPy array constant for the next iteration.
        if iter_num % 2 == 1:
            H = H.value
        else:
            W = W.value
    return W, H, residual

Testing our implementation over different parameters

In [5]:
out = {}
for cost in ['absolute','relative']:
    out[cost]={}
    for regularisation in [False, True]:
        out[cost][regularisation] = {}
        for constant in [0.001, 0.01, 0.1, 1, 10]:
            W, H, residual = nmf_features(A, 2, cost=cost, regularisation=regularisation, constant=constant)
            out[cost][regularisation][constant] = {'W':W,'H':H, 'residual':residual, 'pred':np.round(np.dot(W, H))}
            

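The residuals give a quick convergence check. This wasn't part of the original run, but since matplotlib is already imported, a minimal sketch for one configuration might look like:

residual = out['absolute'][False][0.1]['residual']
plt.plot(range(1, len(residual) + 1), residual)  # objective value per ALS iteration
plt.xlabel('Iteration')
plt.ylabel('Objective value')
plt.show()
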
Using relative cost

In [6]:
out['relative'][False][0.1]['pred']
Out[6]:
array([[5., 4., 5.],
       [4., 4., 5.],
       [5., 3., 4.],
       [2., 3., 3.]])

Great, looks like we recovered the matrix almost exactly: every observed entry matches, and the masked entry (originally 3) has been imputed as 5.

Using absolute cost

In [7]:
out['absolute'][False][0.1]['pred']
Out[7]:
array([[5., 4., 5.],
       [4., 4., 5.],
       [5., 3., 4.],
       [2., 3., 3.]])

Again, the observed entries are recovered exactly, and the masked entry is imputed as 5.
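
If you want to look at just the imputed entry, a quick check against the `out` dictionary built above is:

# The masked entry A[0, 0] was originally 3; see what the model imputed there.
print(out['absolute'][False][0.1]['pred'][0, 0])   # 5.0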

Using regularisation

In [8]:
out['absolute'][True][10]['pred']
Out[8]:
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

Whoa, all zeros! This is likely because the regularisation coefficient is so high that shrinking the factors dominates fitting the data. Let's check H to confirm.

In [9]:
out['absolute'][True][10]['H']
Out[9]:
array([[0., 0., 0.],
       [0., 0., 0.]])

Yes, this is indeed the case: the heavy penalty has driven H all the way to zero, so the reconstruction W·H is zero regardless of W.
