import tensorly
from tensorly.decomposition import parafac, non_negative_parafac
import numpy as np
Nipun Batra
April 20, 2017
In a previous post, we looked at predicting for users who weren’t part of the original matrix factorisation. In this post, we’ll do the same for 3-d tensors. In case you want to learn more about tensor factorisations, have a look at my earlier post.

A general tensor factorisation of a 3-d tensor A (M X N X O) produces 3 factors: X (M X K), Y (N X K) and Z (O X K). The \(A_{ijl}\) entry can be found as follows (this is the CP/PARAFAC form; its matricised version is written with the Khatri-Rao product):
\[ A_{ijl} = \sum_k{X_{ik}Y_{jk}Z_{lk}}\]
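As a quick sanity check of the formula above, here is a minimal NumPy sketch that rebuilds a tensor from three factor matrices. The sizes and the random factors are made up purely for illustration; they are not the values used in this post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration
M, N, O, K = 10, 4, 3, 2
X = rng.random((M, K))
Y = rng.random((N, K))
Z = rng.random((O, K))

# A[i, j, l] = sum_k X[i, k] * Y[j, k] * Z[l, k]
A = np.einsum('ik,jk,lk->ijl', X, Y, Z)

# Check a single entry against the explicit sum over k
i, j, l = 2, 1, 0
entry = sum(X[i, k] * Y[j, k] * Z[l, k] for k in range(K))
print(A.shape, np.isclose(A[i, j, l], entry))
```

The `einsum` subscripts mirror the indices in the equation, which makes it easy to compare the code against the maths.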
However, we’d assume that the \(M^{th}\) entry isn’t a part of this decomposition. So, how do we obtain the X factors corresponding to the \(M^{th}\) entry? We learn the Y and Z factors from the tensor A (excluding the \(M^{th}\) row entries). We assume that the learnt Y and Z are shared across all rows of A (1 through M).

The above figure shows the latent factor for X (\(X_{M}\)) corresponding to the \(M^{th}\) entry of X that we wish to learn. On the LHS, we see the matrix corresponding to \(A_{M}\). The highlighted entry of \(A_{M}\) is created by element-wise multiplication of \(X_M, Y_0, Z_0\) and then summing. Thus, each of the N X O entries of \(A_M\) is created by multiplying \(X_M\) element-wise with a row from Y and a row from Z, and summing. In general,
\[A_{M, n, o} = \sum_k{X_{M, k} \times Y_{n, k} \times Z_{o, k}}\]
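Now, to learn \(X_M\), we plan to use least squares. For that, we need to reduce the problem to the form \(\alpha x = \beta\). We do this as follows: each row of \(\alpha\) is the element-wise product of one row of Y with one row of Z, giving an (N·O) X K matrix, and \(\beta\) is the matching vector of the N·O flattened entries of \(A_M\).
We can now write,
\[ \alpha X_M^T \approx \beta \]
Thus, \(X_M^T\) = Least Squares (\(\alpha, \beta\))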
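The reduction to a least-squares problem can be sketched in NumPy as below. This is only an illustration under assumed values: `Y`, `Z` and `x_true` are random or hypothetical stand-ins, not the factors learnt in this post, and the row ordering of `alpha` (n-major, o-minor) is one convenient choice that matches a row-major flatten of \(A_M\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared factors and a hypothetical "true" X_M
N, O, K = 4, 3, 2
Y = rng.random((N, K))
Z = rng.random((O, K))
x_true = np.array([1.5, -0.5])

# A_M[n, o] = sum_k x[k] * Y[n, k] * Z[o, k]
A_M = np.einsum('k,nk,ok->no', x_true, Y, Z)

# Row (n * O + o) of alpha is the element-wise product Y[n] * Z[o]
alpha = (Y[:, None, :] * Z[None, :, :]).reshape(N * O, K)

# Row-major flatten of A_M lines up with the rows of alpha
beta = A_M.reshape(N * O)

# Solve alpha @ x ~= beta in the least-squares sense
x_hat, *_ = np.linalg.lstsq(alpha, beta, rcond=None)
print("Shape of alpha =", alpha.shape)
print(x_hat)
```

Since `beta` here is generated without noise, the recovered `x_hat` should agree with `x_true` up to floating-point error.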
Of course, \(\beta\) can have missing entries, which we mask out. Thus, we can write:
\(X_M^T\) = Least Squares (\(\alpha [Mask], \beta [Mask]\))
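A minimal sketch of the masked variant follows. It assumes missing entries are marked with `NaN`; the factors, the "true" vector and the positions of the missing entries are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical shared factors and target vector
N, O, K = 4, 3, 2
Y = rng.random((N, K))
Z = rng.random((O, K))
x_true = np.array([2.0, 0.5])

alpha = (Y[:, None, :] * Z[None, :, :]).reshape(N * O, K)
beta = alpha @ x_true

# Pretend three entries of A_M were never observed
beta_obs = beta.copy()
beta_obs[[1, 5, 6]] = np.nan

# Keep only the rows whose target value is present
mask = ~np.isnan(beta_obs)
x_hat, *_ = np.linalg.lstsq(alpha[mask], beta_obs[mask], rcond=None)

# The learnt factor can then fill in the unobserved entries
beta_filled = alpha @ x_hat
print(x_hat)
```

The same boolean mask is applied to both `alpha` and `beta`, so each remaining equation still pairs the right row with the right observation.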
In case we’re doing a non-negative tensor factorisation, we can instead learn \(X_M^T\) as follows:
\(X_M^T\) = Non-negative Least Squares (\(\alpha [Mask], \beta [Mask]\))
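For the non-negative case, one option is `scipy.optimize.nnls`, which solves the same problem under the constraint that every coefficient is at least zero. The sketch below assumes SciPy is available; the factors, the non-negative "true" vector and the mask are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(2)

# Hypothetical non-negative factors and target vector
N, O, K = 4, 3, 2
Y = rng.random((N, K))
Z = rng.random((O, K))
x_true = np.array([3.0, 0.25])

alpha = (Y[:, None, :] * Z[None, :, :]).reshape(N * O, K)
beta = alpha @ x_true

# Hypothetical mask: two entries of A_M are treated as missing
mask = np.ones(N * O, dtype=bool)
mask[[0, 7]] = False

# nnls returns the solution and the norm of the residual
x_hat, residual_norm = nnls(alpha[mask], beta[mask])
print(x_hat, residual_norm)
```

Unlike `np.linalg.lstsq`, `nnls` expects a 1-d target vector, so each new entry along the first mode is solved for separately.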
array([[ 0.48012616, 1.13542261],
[ 0.49409014, 2.98947262],
[ 0.5072998 , 5.03375154],
[ 0.52051081, 7.07682331]])
[[ 0.27650081 1.76735291]
[ 0.27935339 2.00914552]
[ 0.28220934 2.25020479]
[ 0.28454255 4.65329218]
[ 0.28747809 5.28991186]
[ 0.29041711 5.92460074]
[ 0.29214989 7.83533407]
[ 0.29516391 8.9072908 ]
[ 0.29818151 9.9759964 ]
[ 0.299758 11.01549695]
[ 0.30285052 12.52253365]
[ 0.30594669 14.02499969]]
Shape of alpha = (12, 2)
array([[ 7.40340055e-01, 7.62705972e-01],
[ 4.14288653e+01, 7.57249713e-01],
[ 8.51282259e+01, 6.56239315e-01],
[ 1.29063811e+02, 5.46019997e-01],
[ 1.73739412e+02, 4.06496594e-01],
[ 2.19798887e+02, 2.11453297e-01],
[ 2.64609697e+02, 6.54705290e-02],
[ 3.01392149e+02, 2.39700484e-01],
[ 3.39963876e+02, 3.41824756e-01]])
It seems that the first column captures the increasing trend of values in the tensor.
array([[ 108., 109., 110.],
[ 111., 112., 113.],
[ 114., 115., 116.],
[ 117., 118., 119.]], dtype=float32)
Not bad! We’re exactly there!