Tensor decomposition using Autograd

In this post, I will compare an Autograd-based implementation of PARAFAC (CP) decomposition with the standard Tensorly-based implementation.

Imports

In [1]:
import autograd.numpy as np
import tensorly
from tensorly.decomposition import parafac
from autograd import multigrad

Creating a tensor of shape (2, 3, 4)

In [2]:
t = np.arange(24).reshape(2, 3, 4).astype('float32')
In [3]:
t
Out[3]:
array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]]], dtype=float32)

Using Tensorly to decompose it into rank-2 factor matrices

In [4]:
A, B, C = parafac(t, rank=2)
In [5]:
A
Out[5]:
array([[ 35.86886727,   4.33025058],
       [ 61.91392143,   1.1737134 ]])
In [6]:
B
Out[6]:
array([[ 0.48010198, -1.33704111],
       [ 0.58706644, -0.98212501],
       [ 0.6940309 , -0.62720892]])
In [7]:
C
Out[7]:
array([[ 0.48671098,  1.46524661],
       [ 0.51060759,  1.35143039],
       [ 0.53450419,  1.23761417],
       [ 0.5584008 ,  1.12379794]])

Defining the Khatri-Rao product to reconstruct a tensor from the factor matrices

In [8]:
def khatri(A, B, C):
    # CP reconstruction: sum over l of the outer products of the l-th columns
    return np.einsum('il,jl,kl->ijk', A, B, C)
In [9]:
khatri(A, B, C)
Out[9]:
array([[[ -0.10186127,   0.96861972,   2.03910072,   3.10958171],
        [  4.01740161,   5.00464518,   5.99188875,   6.97913232],
        [  8.13666448,   9.04067063,   9.94467678,  10.84868293]],

       [[ 12.16806623,  13.05700484,  13.94594346,  14.83488207],
        [ 16.00173049,  17.0015141 ,  18.0012977 ,  19.0010813 ],
        [ 19.83539476,  20.94602335,  22.05665194,  23.16728053]]])
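As a sanity check (a standalone NumPy sketch, using random factors rather than the notebook state above), the einsum reconstruction can be compared against an explicit sum of rank-1 outer products:

```python
import numpy as np

def khatri(A, B, C):
    # CP reconstruction: sum over l of the outer products of the l-th columns
    return np.einsum('il,jl,kl->ijk', A, B, C)

rng = np.random.RandomState(0)
A, B, C = rng.randn(2, 2), rng.randn(3, 2), rng.randn(4, 2)

# Explicit version: one rank-1 term per column, summed over the rank index
explicit = sum(np.einsum('i,j,k->ijk', A[:, l], B[:, l], C[:, l])
               for l in range(A.shape[1]))
assert np.allclose(khatri(A, B, C), explicit)
```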

Rounding off

In [10]:
np.round(khatri(A, B, C))
Out[10]:
array([[[ -0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]]])

Not bad! After rounding, we recover the original tensor exactly.

Cost function for gradient-descent-based optimisation

In [11]:
def cost(A, B, C):
    pred = khatri(A, B, C)
    gt = t
    mask = ~np.isnan(t)  # drop missing (NaN) entries from the error
    error = (pred - gt)[mask].flatten()
    return np.sqrt((error ** 2).mean())  # RMSE over observed entries
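The `~np.isnan` mask is what later lets the same cost handle missing entries: NaN cells are simply dropped from the error. A minimal standalone illustration:

```python
import numpy as np

gt = np.array([1.0, 2.0, np.nan, 4.0])
pred = np.array([1.0, 2.0, 99.0, 4.0])  # wildly wrong at the missing cell

mask = ~np.isnan(gt)          # keep only observed entries
error = (pred - gt)[mask]
rmse = np.sqrt((error ** 2).mean())
print(rmse)                   # 0.0 -- the NaN cell contributes nothing
```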

Main Autograd routine: multigrad returns a function that computes the gradient of cost with respect to arguments 0, 1 and 2 (the three factor matrices)

In [12]:
mg = multigrad(cost, argnums=[0, 1, 2])
rank = 2
lr = 0.01
In [13]:
np.random.seed(0)
m, n, o = t.shape
a = np.random.randn(m, rank)
b = np.random.randn(n, rank)
c = np.random.randn(o, rank)

for i in range(6000):
    del_a, del_b, del_c = mg(a, b, c)
    a-=lr*del_a
    b-=lr*del_b
    c-=lr*del_c
    if i%500==0:
        print(cost(a, b, c))
13.0867070223
1.54275312814
0.71339135722
0.529013021262
0.504080757325
0.469797308066
0.413525451111
0.301622273908
0.209627382602
0.213141337327
0.214516963017
0.215441239064
In [14]:
np.round(khatri(a, b, c))
Out[14]:
array([[[ -0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]]])

Decomposing with missing entries

In [15]:
t[0, 0, 0] = np.NAN
t[1, 1, 2] = np.NAN
t
Out[15]:
array([[[ nan,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  nan,  19.],
        [ 20.,  21.,  22.,  23.]]], dtype=float32)
In [16]:
rank = 2
lr = 0.01

np.random.seed(1)
m, n, o = t.shape
a = np.random.randn(m, rank)
b = np.random.randn(n, rank)
c = np.random.randn(o, rank)

for i in range(6000):
    del_a, del_b, del_c = mg(a, b, c)
    a-=lr*del_a
    b-=lr*del_b
    c-=lr*del_c
    if i%500==0:
        print(cost(a, b, c))
14.3657653138
11.2554542723
0.391440064205
0.320089741632
0.279150824088
0.254263517167
0.235274595444
0.217495424995
0.197602923558
0.218594628293
0.223672460718
0.225830360065
In [17]:
np.round(khatri(a, b, c))
Out[17]:
array([[[ -0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]]])

What if we used a different seed?

In [18]:
np.random.seed(0)
m, n, o = t.shape
a = np.random.randn(m, rank)
b = np.random.randn(n, rank)
c = np.random.randn(o, rank)

for i in range(6000):
    del_a, del_b, del_c = mg(a, b, c)
    a-=lr*del_a
    b-=lr*del_b
    c-=lr*del_c
    if i%500==0:
        print(cost(a, b, c))
13.1375238648
1.42377956103
1.26196020429
0.47681451677
0.476038335842
0.475054930056
0.47360718886
0.471433455007
0.468181835873
0.463411604108
0.456562523642
0.446768595735
In [19]:
np.round(khatri(a, b, c))
Out[19]:
array([[[  2.,   1.,   2.,   3.],
        [  5.,   5.,   6.,   6.],
        [  9.,   9.,  10.,  10.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]]])

Okay, not such a good result this time! With this seed, gradient descent plateaus at a noticeably higher cost (~0.45 instead of ~0.2), and the first slice of the reconstruction is visibly off.
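A common remedy is to restart from several random initialisations and keep the best run. The sketch below reproduces the same masked-RMSE descent with hand-derived NumPy gradients (an assumption on my part, so it runs without autograd's multigrad; the helper name fit_cp is hypothetical):

```python
import numpy as np

def khatri(A, B, C):
    # CP reconstruction: sum of rank-1 outer products of the factor columns
    return np.einsum('il,jl,kl->ijk', A, B, C)

def fit_cp(t, rank=2, lr=0.01, iters=6000, seeds=(0, 1, 2, 3)):
    # Hypothetical helper: run masked-RMSE gradient descent from several
    # seeds and keep the factors with the lowest final cost.
    mask = ~np.isnan(t)
    gt = np.where(mask, t, 0.0)
    n_obs = mask.sum()
    m, n, o = t.shape
    best = None
    for seed in seeds:
        rng = np.random.RandomState(seed)
        a, b, c = rng.randn(m, rank), rng.randn(n, rank), rng.randn(o, rank)
        for _ in range(iters):
            resid = (khatri(a, b, c) - gt) * mask  # zero residual at NaNs
            rmse = np.sqrt((resid ** 2).sum() / n_obs)
            # d(rmse)/dA_il = sum_jk resid_ijk B_jl C_kl / (rmse * n_obs), etc.
            scale = 1.0 / (rmse * n_obs)
            g_a = scale * np.einsum('ijk,jl,kl->il', resid, b, c)
            g_b = scale * np.einsum('ijk,il,kl->jl', resid, a, c)
            g_c = scale * np.einsum('ijk,il,jl->kl', resid, a, b)
            a, b, c = a - lr * g_a, b - lr * g_b, c - lr * g_c
        final = np.sqrt(((khatri(a, b, c) - gt)[mask] ** 2).mean())
        if best is None or final < best[0]:
            best = (final, a, b, c)
    return best

t = np.arange(24).reshape(2, 3, 4).astype('float32')
t[0, 0, 0] = np.nan
t[1, 1, 2] = np.nan
cost, a, b, c = fit_cp(t)
print(cost)
```

Keeping the lowest-cost run across seeds sidesteps the bad local optimum that seed 0 falls into above.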