```python
import numpy as np
import pandas as pd
```
In a previous post, we saw how we can use CVXPY to perform non-negative matrix factorisation. In this post, I'll show how to add additional constraints that may arise from the problem domain. As a simple example, I'll take constraints in the form of less-than relationships among entries of the matrix: we may want to enforce that certain movies are always rated higher than others. We'll create a matrix of 30 users and 12 items, and enforce the constraint that the rating of each of the first 6 items is at least twice that of the corresponding item among the last 6.
Creating a ratings matrix
We will now create a ratings matrix in which this relationship among the items holds.
```python
K, N, M = 2, 12, 30
Y_gen = np.random.rand(M, K)
X_1 = np.random.rand(K, N/2)
# So that the first half is at least twice as much
X_2 = 2 * X_1 + np.random.rand(K, N/2)
X_gen = np.hstack([X_2, X_1])
# Normalizing
X_gen = X_gen / np.max(X_gen)
# Creating A (ratings matrix of size M, N)
A = np.dot(Y_gen, X_gen)
pd.DataFrame(A).head()
```
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.732046 | 0.613565 | 0.961128 | 0.920089 | 0.244323 | 0.506472 | 0.280477 | 0.251049 | 0.324418 | 0.378219 | 0.075556 | 0.131750 |
| 1 | 0.903630 | 0.340956 | 0.784109 | 0.919741 | 0.190856 | 0.433635 | 0.321932 | 0.135134 | 0.290862 | 0.394680 | 0.052976 | 0.081148 |
| 2 | 0.972145 | 0.576558 | 1.046197 | 1.098279 | 0.261103 | 0.562996 | 0.358574 | 0.233405 | 0.368118 | 0.460967 | 0.077286 | 0.128344 |
| 3 | 0.292231 | 0.263864 | 0.401968 | 0.377116 | 0.102567 | 0.210890 | 0.113070 | 0.108163 | 0.134489 | 0.154266 | 0.031993 | 0.056299 |
| 4 | 0.694038 | 0.803459 | 1.125454 | 0.987344 | 0.290605 | 0.582178 | 0.278848 | 0.331075 | 0.365935 | 0.397023 | 0.093088 | 0.168300 |
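Why does the relationship hold in A and not just in X_gen? Because Y_gen is non-negative, each column of A is a non-negative combination of the columns of X_gen, so the column-wise "at least twice" inequality carries over. A quick sketch to confirm (re-generating the data with a fixed seed for reproducibility, which is my assumption; the post does not set one here):

```python
import numpy as np

np.random.seed(0)  # assumed seed, for reproducibility only
K, N, M = 2, 12, 30
Y_gen = np.random.rand(M, K)
X_1 = np.random.rand(K, N // 2)
X_2 = 2 * X_1 + np.random.rand(K, N // 2)  # first half >= twice second half
X_gen = np.hstack([X_2, X_1])
X_gen = X_gen / np.max(X_gen)
A = np.dot(Y_gen, X_gen)

# A non-negative combination preserves the column-wise inequality:
# A[:, a] = Y_gen @ X_gen[:, a] >= 2 * Y_gen @ X_gen[:, a + N/2]
print(np.all(A[:, :N // 2] >= 2 * A[:, N // 2:]))  # True
```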
We can see that, for each user, the 0th item has a higher rating than the 6th, the 1st higher than the 7th, and so on. Now, in our alternating least squares implementation, we break A down as Y.X, where X has dimensions (K, N). To ensure the relationship among the items, we will put constraints on X of the form X[:, 0] > 2 * X[:, 6] and so on. We will create a simple for loop for the same.
= "["
e for a in range(N/2):
+="X[:,%d] > 2 * X[:,%d]," %(a, a+N/2)
e= e[:-1] + "]"
e e
'[X[:,0] > 2 * X[:,6],X[:,1] > 2 * X[:,7],X[:,2] > 2 * X[:,8],X[:,3] > 2 * X[:,9],X[:,4] > 2 * X[:,10],X[:,5] > 2 * X[:,11]]'
As we can see, we now have 6 constraints that we can feed into the optimisation routine. Whenever we learn X in the ALS iterations, we apply these constraints.
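As an aside, building the constraint list as a string and `eval`-ing it inside the solver works, but a callable that receives the variable is less fragile. A sketch of that alternative (the helper name `pair_constraints` is mine, not from the post): because NumPy arrays overload `>=` just like CVXPY variables do, we can smoke-test the same logic on a plain array.

```python
import numpy as np

def pair_constraints(X, n, factor=2):
    # One inequality per (item, counterpart) pair. X is intended to be a
    # CVXPY variable of shape (k, n), but any object overloading `>=`
    # works, which lets us test the logic on a NumPy array here.
    return [X[:, a] >= factor * X[:, a + n // 2] for a in range(n // 2)]

# Smoke test on a factor matrix built the same way as X_gen above:
# every pairwise inequality should hold elementwise.
np.random.seed(0)
K, N = 2, 12
X_1 = np.random.rand(K, N // 2)
X_gen = np.hstack([2 * X_1 + np.random.rand(K, N // 2), X_1])
cons = pair_constraints(X_gen, N)
print(len(cons), all(np.all(c) for c in cons))  # 6 True
```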
CVX routine for handling input constraints
```python
def nmf_features(A, k, MAX_ITERS=30, input_constraints_X=None, input_constraints_Y=None):
    import cvxpy as cvx
    np.random.seed(0)

    # Problem dimensions and mask of observed (non-NaN) entries.
    m, n = A.shape
    mask = ~np.isnan(A)

    # Initialize Y randomly.
    Y_init = np.random.rand(m, k)
    Y = Y_init

    # Perform alternating minimization.
    residual = np.zeros(MAX_ITERS)
    for iter_num in xrange(1, 1 + MAX_ITERS):
        # For odd iterations, treat Y constant, optimize over X.
        if iter_num % 2 == 1:
            X = cvx.Variable(k, n)
            constraint = [X >= 0]
            if input_constraints_X:
                constraint.extend(eval(input_constraints_X))
        # For even iterations, treat X constant, optimize over Y.
        else:
            Y = cvx.Variable(m, k)
            constraint = [Y >= 0]

        error = A[mask] - (Y * X)[mask]
        obj = cvx.Minimize(cvx.norm(error, 'fro'))
        prob = cvx.Problem(obj, constraint)
        prob.solve(solver=cvx.SCS)

        if prob.status != cvx.OPTIMAL:
            # SCS may stop at an inaccurate solution; carry on regardless.
            pass
        residual[iter_num - 1] = prob.value

        # Fix the freshly optimised factor as a constant matrix for the
        # next half-step.
        if iter_num % 2 == 1:
            X = X.value
        else:
            Y = Y.value
    return X, Y, residual
```
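For readers without CVXPY, the same alternation can be sketched in plain NumPy: each half-step is an unconstrained least-squares solve followed by clipping at zero. Note this clipping is a crude stand-in for the true non-negativity-constrained solve that CVXPY performs, and it cannot express the item constraints; it only illustrates the alternating structure.

```python
import numpy as np

def als_nmf_sketch(A, k, iters=40, seed=0):
    # Alternate: fix Y and solve for X, then fix X and solve for Y.
    # Clipping at zero roughly enforces X >= 0 and Y >= 0.
    rng = np.random.RandomState(seed)
    m, n = A.shape
    Y = rng.rand(m, k)
    X = None
    for _ in range(iters):
        X = np.clip(np.linalg.lstsq(Y, A, rcond=None)[0], 0, None)
        Y = np.clip(np.linalg.lstsq(X.T, A.T, rcond=None)[0].T, 0, None)
    return X, Y

np.random.seed(0)
A = np.random.rand(30, 12)
X, Y = als_nmf_sketch(A, k=3)
rel_err = np.linalg.norm(A - Y.dot(X)) / np.linalg.norm(A)
```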
```python
# Without constraints
X, Y, r = nmf_features(A, 3, MAX_ITERS=20)
# With constraints
X_c, Y_c, r_c = nmf_features(A, 3, MAX_ITERS=20, input_constraints_X=e)
```
```python
pd.DataFrame(X)
```
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.749994 | 0.112355 | 0.485850 | 0.674801 | 0.113004 | 0.281371 | 0.257239 | 0.04056 | 0.196474 | 0.297978 | 0.02745 | 0.033952 |
| 1 | 0.102384 | 0.222149 | 0.266055 | 0.199361 | 0.070403 | 0.133510 | 0.047174 | 0.09233 | 0.081233 | 0.076518 | 0.02375 | 0.045097 |
| 2 | 0.567213 | 0.558638 | 0.825066 | 0.756059 | 0.211427 | 0.430690 | 0.222174 | 0.22944 | 0.273260 | 0.307475 | 0.06659 | 0.118371 |
```python
pd.DataFrame(X_c)
```
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.749882 | 0.112384 | 0.485923 | 0.674778 | 0.113027 | 0.281399 | 0.257206 | 0.040566 | 0.196489 | 0.297960 | 0.027461 | 0.033971 |
| 1 | 0.102366 | 0.222080 | 0.266058 | 0.199353 | 0.070404 | 0.133511 | 0.047168 | 0.092298 | 0.081233 | 0.076513 | 0.023751 | 0.045091 |
| 2 | 0.567363 | 0.558700 | 0.825253 | 0.756242 | 0.211473 | 0.430789 | 0.222234 | 0.229470 | 0.273319 | 0.307549 | 0.066604 | 0.118382 |
OK, the obtained X matrix looks fairly similar. What if we reverse the constraints?
= "["
e_rev for a in range(N/2):
+=" 2* X[:,%d] < X[:,%d]," %(a, a+N/2)
e_rev= e_rev[:-1] + "]"
e_rev e_rev
'[ 2* X[:,0] < X[:,6], 2* X[:,1] < X[:,7], 2* X[:,2] < X[:,8], 2* X[:,3] < X[:,9], 2* X[:,4] < X[:,10], 2* X[:,5] < X[:,11]]'
```python
X_c_rev, Y_c_rev, r_c_rev = nmf_features(A, 3, MAX_ITERS=20, input_constraints_X=e_rev)
```
```python
pd.DataFrame(X_c_rev)
```
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.250945 | 0.038070 | 0.174189 | 0.252085 | 0.033251 | 0.069176 | 0.502026 | 0.076147 | 0.348450 | 0.504277 | 0.066521 | 0.138405 |
| 1 | 0.030757 | 0.088033 | 0.085947 | 0.065135 | 0.024395 | 0.045976 | 0.061398 | 0.176002 | 0.171773 | 0.130146 | 0.048760 | 0.091882 |
| 2 | 0.220256 | 0.183292 | 0.269014 | 0.282814 | 0.065713 | 0.128120 | 0.440553 | 0.366600 | 0.538065 | 0.565669 | 0.131436 | 0.256263 |
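As a sanity check, we can verify the reversed relation directly on the printed values. SCS is a first-order solver and satisfies constraints only to moderate accuracy, so a small tolerance is needed:

```python
import numpy as np

# X_c_rev values copied from the table above (rounded to 6 decimals).
X_c_rev = np.array([
    [0.250945, 0.038070, 0.174189, 0.252085, 0.033251, 0.069176,
     0.502026, 0.076147, 0.348450, 0.504277, 0.066521, 0.138405],
    [0.030757, 0.088033, 0.085947, 0.065135, 0.024395, 0.045976,
     0.061398, 0.176002, 0.171773, 0.130146, 0.048760, 0.091882],
    [0.220256, 0.183292, 0.269014, 0.282814, 0.065713, 0.128120,
     0.440553, 0.366600, 0.538065, 0.565669, 0.131436, 0.256263],
])
# Each of the last 6 columns should be at least twice the matching
# first-6 column, up to solver tolerance.
tol = 1e-3
print(np.all(X_c_rev[:, 6:] >= 2 * X_c_rev[:, :6] - tol))  # True
```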
There you go! We have now learnt latent factors that conform to our constraints.