Curse of Dimensionality
- High dimensions
- Average distance between two points becomes large -> data is sparse (see the sketch after this outline)
- Predictions much less reliable -> need more data
- Reduce dimensionality
- Speed up training
- Helps visualization
- Approaches
- Feature selection
- Feature extraction
- Transforms dataset into a new feature space with lower dimensionality
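A minimal sketch (toy data, assumed setup not from the notes) of the sparsity claim above: the average distance between uniformly random points in the unit hypercube grows with the dimensionality.

```python
# Sketch: average pairwise distance grows with dimensionality (assumed toy setup).
import numpy as np

rng = np.random.default_rng(0)
n_points = 1_000

for d in (2, 10, 100, 1_000):
    X = rng.random((n_points, d))                   # points in the unit hypercube [0, 1]^d
    i, j = rng.integers(0, n_points, size=(2, 5_000))
    dists = np.linalg.norm(X[i] - X[j], axis=1)     # 5,000 random pairwise distances
    print(f"d = {d:5d}  mean distance ~ {dists.mean():.2f}")
# The mean distance grows roughly like sqrt(d / 6), so the same amount of data
# covers the space more and more sparsely.
```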
Principal Component Analysis
Feature extraction, unsupervised. Finds the directions of maximum variance (minimum information loss) in the data and projects the data onto a new subspace with equal or fewer dimensions.
- Principal components
- The orthogonal axes of the new subspace
Algorithm
Find $$d \times k$$ matrix $$W$$ to map vector $$x$$ of dimension $$d$$ to new $$k$$-dimensional feature space.
$$z = xW, \quad x \in \mathbb{R}^d,\; z \in \mathbb{R}^k,\; W \in \mathbb{R}^{d \times k}$$
- Standardize $$d$$-dimensional dataset
- Construct covariance matrix
- $$\Sigma = \begin{bmatrix}\sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 \end{bmatrix}$$
- Decompose covariance matrix into eigenvectors & eigenvalues
- $$\Sigma v = \lambda v$$
- Eigenvector: principal components
- Eigenvalue: the variance of the data along the corresponding principal component (its magnitude)
- Sort eigenvalues by decreasing order to rank the corresponding eigenvectors
- Select $$k$$ eigenvectors corresponding to the $$k$$ largest eigenvalues
- Construct projection matrix $$W$$ from the top $$k$$ eigenvectors
- Transform $$d$$-dimensional dataset into a new $$k$$-dimensional one
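A minimal NumPy sketch of these steps (random toy data and variable names are assumptions, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))                    # 100 samples, d = 5 features
k = 2                                            # target dimensionality

# 1. Standardize the d-dimensional dataset
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Construct the covariance matrix (d x d)
cov = np.cov(X_std, rowvar=False)

# 3. Decompose it into eigenvalues and eigenvectors (eigh: for symmetric matrices)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4./5. Sort eigenvalues in decreasing order and keep the top-k eigenvectors
order = np.argsort(eigenvalues)[::-1]
W = eigenvectors[:, order[:k]]                   # projection matrix, shape (d, k)

# 6. Transform the data into the new k-dimensional subspace: z = xW
Z = X_std @ W                                    # shape (100, k)

explained = eigenvalues[order[:k]].sum() / eigenvalues.sum()
print(Z.shape, f"variance explained: {explained:.2%}")
```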
Variance
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x^{(i)} - \mu\right)^2$$
Covariance
$$\sigma_{jk} = \frac{1}{n}\sum_{i=1}^{n}\left(x_j^{(i)} - \mu_j\right)\left(x_k^{(i)} - \mu_k\right)$$
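As a quick sanity check (toy numbers, assumed example), the elementwise formula matches `np.cov` when the $$\frac{1}{n}$$ normalization is used (`bias=True`):

```python
import numpy as np

X = np.array([[2.0, 0.0],
              [0.0, 2.0],
              [3.0, 6.0],
              [5.0, 4.0]])                        # n = 4 samples, 2 features
mu = X.mean(axis=0)
n = X.shape[0]

# Covariance of feature 0 and feature 1 via the formula above
sigma_01 = (X[:, 0] - mu[0]) @ (X[:, 1] - mu[1]) / n

print(sigma_01, np.cov(X, rowvar=False, bias=True)[0, 1])   # both 2.0
```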
PCA for Compression
Lossy compression: only keep the information that has the most effect on how the image looks.
Reconstruct with $$\hat{x} = zW^\top$$: $$W$$ is not square, so it has no true inverse, but its columns are orthonormal ($$W^\top W = I$$), so $$W^\top$$ projects back to the original space -> the reconstruction is approximate (lossy).
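A minimal sketch of compression and approximate reconstruction (random toy data, assumed setup):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))   # correlated features
X_centered = X - X.mean(axis=0)

# Projection matrix W: top-k eigenvectors of the covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_centered, rowvar=False))
W = eigenvectors[:, np.argsort(eigenvalues)[::-1][:3]]       # k = 3

Z = X_centered @ W          # compress:   (200, 10) -> (200, 3)
X_rec = Z @ W.T             # decompress: (200, 3)  -> (200, 10), approximate

print(f"mean squared reconstruction error: {np.mean((X_centered - X_rec) ** 2):.4f}")
```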