Class 5
February 26th, 2020
An n-dimensional space, populated by data points
The principal component is a direction along which the points line up best
As if we had been mis-measuring the dimensions: an unknown (rigid) rotation of the axes would minimize the points' distance from the axes:
the coordinate along the new axis becomes the only value we need to keep.
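A quick illustration (not from the lecture): recovering the principal direction of a toy 2-D point cloud with plain numpy; all names and values here are made up.

import numpy as np

# toy 2-D point cloud, roughly lined up along the (1, 1) direction
rng = np.random.default_rng(0)
t = rng.normal(size=100)
points = np.column_stack([t, t + 0.1 * rng.normal(size=100)])

# centre the data; the first right singular vector of the centred matrix
# is the direction along which the points line up best
centred = points - points.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
print(vt[0])   # close to +/- [0.707, 0.707], i.e. the (1, 1) direction (sign is arbitrary)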
Please see Ch. 11 of Leskovec et al. for the complete numerical examples.
Decompose the data matrix and interpret its significantly non-zero (>> 0) singular values as concepts/topics for user and activity classification:
Conceptual hurdle of SVD: interpretation of negative values
A nonnegative decomposition of the activity matrix would be interpretable:
user/product profiling and recommender systems would be half-done already!
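For comparison, a sketch (mine, with made-up values) of what a plain SVD gives on a small activity matrix; the mixed signs in the factors are exactly the hurdle noted above.

import numpy as np

# small user x item activity matrix (illustrative values: two item groups,
# plus one user who consumes both)
V = np.array([[1, 1, 1, 0, 0],
              [3, 3, 3, 0, 0],
              [4, 4, 4, 0, 0],
              [5, 5, 5, 0, 0],
              [0, 0, 0, 4, 4],
              [0, 0, 0, 5, 5],
              [2, 2, 2, 2, 2]], dtype=float)

U, s, Vt = np.linalg.svd(V, full_matrices=False)
print(np.round(s, 3))       # the clearly non-zero singular values mark the concepts
print(np.round(Vt[:2], 3))  # the factors mix positive and negative entries,
                            # which resists a direct "degree of membership" reading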
Instance: a non-negative matrix V
Solution: non-negative matrix factors W and H s.t. V ≈ WH
Let A = BC:
each column of the result is a linear combination of the columns of B, with coefficients given by the corresponding column of C.
Let V ≈ WH.
If V is an activity matrix,
the consumption column v_i is given by a linear combination of the columns of W, with weights given by the components of the corresponding column h_i of H.
Each data vector is thus approximated by a linear combination of the columns of W.
W can be regarded as containing a basis that is optimized for the linear approximation of the data in V.
Since relatively few basis vectors are used to represent many data vectors, good approximation can only be achieved if the basis vectors discover structure that is latent in the data.
[Lee & Seung, "Learning the parts of objects by non-negative matrix factorization." Nature, 1999.]
Let V be a non-negative n x m matrix, W an n x r matrix and H an r x m matrix.
Minimize ||V - WH||^2 (squared reconstruction error)
subject to
W, H >= 0.
If V is n x m, choose r s.t. (n + m) r < n m, so that WH is a compressed representation of V.
For later: minimization under divergence
Let V ≈ WH as before.
Minimize the divergence D(V || WH) = Σ_ij ( V_ij log( V_ij / (WH)_ij ) - V_ij + (WH)_ij )
subject to W, H >= 0.
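A sketch of that divergence objective, under the usual convention 0·log 0 = 0 (the eps guard is mine):

import numpy as np

def nmf_divergence(V, W, H, eps=1e-12):
    # D(V || WH) = sum_ij ( V_ij * log(V_ij / (WH)_ij) - V_ij + (WH)_ij )
    WH = W @ H
    ratio = np.where(V > 0, V / (WH + eps), 1.0)       # 0 * log(0) treated as 0
    return np.sum(np.where(V > 0, V * np.log(ratio), 0.0) - V + WH)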
Iterated error balancing:
through the iterations,
the multiplicative update rules maintain non-negativity and force the reconstruction error to be non-increasing (convergence to a local minimum).
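A sketch of the multiplicative update rules for the squared-error objective, following Lee & Seung; the eps guard against division by zero, the random initialization and the fixed iteration count are my choices:

import numpy as np

def nmf_multiplicative(V, r, steps=200, eps=1e-9, seed=0):
    # minimize ||V - WH||^2 with W, H >= 0 via multiplicative updates
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r))
    H = rng.random((r, m))
    for _ in range(steps):
        # each update multiplies by a non-negative ratio, so non-negativity
        # is maintained and the squared error never increases
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H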
A probabilistic hidden-variables model:
Cols. of W are bases that are combined to form the reconstruction;
the influence of each basis is given by the corresponding hidden variable, i.e. the entry of H.
(W and H are shown in a 7x7 montage)
Eigenfaces may have negative values
N=7, M=5.
Fix K=2 and run non-negative matrix factorization:
@INPUT:
R: the matrix to be factorized, of dim. N x M
P: an initial matrix of dim. N x K
Q: an initial matrix of dim. M x K
K: the number of latent features
steps: the max number of steps to perform the optimisation
alpha: the learning rate
beta: the regularization parameter
@OUTPUT:
the final matrices P and Q
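A minimal sketch of such a routine, assuming plain gradient descent with L2 regularization over the observed (non-zero) entries; note that this simple version does not explicitly enforce non-negativity, and the stopping threshold is illustrative:

import numpy as np

def matrix_factorization(R, P, Q, K, steps=5000, alpha=0.0002, beta=0.02):
    Q = Q.T                                  # work with a K x M factor
    for step in range(steps):
        for i in range(R.shape[0]):
            for j in range(R.shape[1]):
                if R[i, j] > 0:              # update only on observed ratings
                    eij = R[i, j] - P[i, :] @ Q[:, j]
                    for k in range(K):
                        P[i, k] += alpha * (2 * eij * Q[k, j] - beta * P[i, k])
                        Q[k, j] += alpha * (2 * eij * P[i, k] - beta * Q[k, j])
        # regularized squared error over the observed entries
        e = 0.0
        for i in range(R.shape[0]):
            for j in range(R.shape[1]):
                if R[i, j] > 0:
                    e += (R[i, j] - P[i, :] @ Q[:, j]) ** 2
                    e += (beta / 2) * (np.sum(P[i, :] ** 2) + np.sum(Q[:, j] ** 2))
        if e < 0.001:                        # illustrative stopping threshold
            break
    return P, Q.T

Called, e.g., as nP, nQ = matrix_factorization(np.array(ratings, dtype=float), np.random.rand(7, 2), np.random.rand(5, 2), K=2); the exact figures below depend on the random initialization.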
np.dot(nP, nQ.T) = [[ 1.01541991 1.00333129 0.97277743 1.06287968 1.04171324]
[ 3.01303945 2.99879446 2.96279189 3.09527589 3.0083412 ]
[ 3.95992279 3.97901104 4.02726085 3.9655638 3.80912366]
[ 4.99247343 4.98812249 4.97712927 5.07657468 4.91104751]
[ 3.52001748 3.37358497 3.00347147 3.96773585 4.01098322]
[ 4.51837154 4.37142223 4.00000329 4.98195069 4.99170315]
[ 2.10390225 2.13556026 2.21557931 2.04860435 1.94148161]]
np.rint(np.dot(nP, nQ.T)) = [[ 1. 1. 1. 1. 1.]
[ 3. 3. 3. 3. 3.]
[ 4. 4. 4. 4. 4.]
[ 5. 5. 5. 5. 5.]
[ 4. 4. 4. 4. 4.]
[ 5. 5. 5. 5. 5.]
[ 2. 2. 2. 2. 2.]]
ratings = [[1, 1, 1, 0, 0],
[3, 3, 3, 0, 0],
[4, 4, 4, 0, 0],
[5, 5, 5, 0, 0],
[0, 0, 0, 4, 4],
[0, 0, 0, 5, 5],
[0, 0, 0, 2, 2]]
Try scikit-learn on the same instance.
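A minimal sketch of that run with scikit-learn's NMF (init and random_state are my choices; the exact values of W below depend on them):

import numpy as np
from sklearn.decomposition import NMF

R = np.array(ratings, dtype=float)   # the 7 x 5 ratings list defined above

model = NMF(n_components=2, init='random', random_state=0)
W = model.fit_transform(R)           # users x topics
H = model.components_                # topics x films
print(W)
print(H)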
W(user x topic) = [[ 0. 0.82037571]
[ 0. 2.46112713]
[ 0. 3.28150284]
[ 0. 4.10187855]
[ 1.62445593 0. ]
[ 2.03056992 0. ]
[ 0.81222797 0. ]]
W: users’ commitment to a topic.
H: films’ pertinence to a specific topic (binary, why?)