Overview
A direct, non-iterative estimate of the network parameters computed with the pseudo-inverse. Optimal in the sense of mean square error.
- The activation function must be continuous and invertible
- Number of nodes in the input layer = number of nodes in the first hidden layer
- The whole training set is applied to the algorithm at once (batch processing)
Moore-Penrose Pseudoinverse
$$\begin{aligned} H &\in \mathbb{R}^{m \times n} \\ H^+ &= (H^TH)^{-1}H^T &&\text{if } m > n \\ H^+ &= H^T(HH^T)^{-1} &&\text{if } m < n \end{aligned}$$
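A minimal NumPy sketch of the two cases, assuming $$H$$ has full rank so the relevant Gram matrix is invertible (the function name is mine; in practice the SVD-based `np.linalg.pinv` is numerically safer):

```python
import numpy as np

def pinv_normal_equations(H):
    """Moore-Penrose pseudoinverse via the normal equations."""
    m, n = H.shape
    if m > n:                                  # tall matrix: left inverse
        return np.linalg.inv(H.T @ H) @ H.T    # (H^T H)^{-1} H^T
    return H.T @ np.linalg.inv(H @ H.T)        # wide matrix: right inverse
```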
Algorithm
- Output normalization
  - $$C_{req} = \frac{C_{req}}{\max(C_{req})}$$
- Calculate $$W_{init}$$
  - Set $$B = A, C = C_{req}$$
  - $$f(AW_{init}) = C_{req} \implies W_{init} = A^+f^{-1}(C_{req})$$
- Calculate $$B_{req}$$
  - Set $$C = C_{req}, W = W_{init}$$
  - $$f(B_{req}W_{init}) = C_{req} \implies B_{req} = f^{-1}(C_{req})W_{init}^+$$
- Normalize $$B_{req}$$ so that its values lie within the valid range of the activation function
  - $$B_{req} = \frac{B_{req} \times 0.99}{\max(B_{req})}$$
- Calculate $$V$$
  - Set $$B = B_{req}$$
  - $$f(AV) = B_{req} \implies V = A^+f^{-1}(B_{req})$$
- Calculate the actual output $$B$$
  - $$B = f(AV)$$
- Recalculate $$W$$
  - $$W = B^+f^{-1}(C_{req})$$
- Use the derived $$V$$ and $$W$$ to obtain outputs for given inputs $$A$$ (a runnable sketch follows this list)
  - $$B = f(AV)$$
  - $$C = kf(BW)$$, where $$k$$ is the normalization factor
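A minimal NumPy sketch of the whole procedure, assuming a sigmoid activation (continuous and invertible on its range $$(0, 1)$$) and positive targets; `np.linalg.pinv` stands in for the explicit pseudoinverse formulas above, and the clipping inside `sigmoid_inv` is a numerical safeguard, not part of the algorithm itself:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_inv(y, eps=1e-7):
    # Clipping keeps arguments strictly inside (0, 1), where the
    # inverse sigmoid (logit) is defined -- a numerical safeguard.
    y = np.clip(y, eps, 1.0 - eps)
    return np.log(y / (1.0 - y))

def train(A, C_req):
    """Direct pseudoinverse training.

    A      -- input matrix, one training sample per row
    C_req  -- required (target) output matrix
    Returns the hidden weights V, the output weights W, and the
    output normalization factor k.
    """
    # Output normalization
    k = np.max(C_req)
    C_req = C_req / k

    # Calculate W_init from f(A W_init) = C_req
    W_init = np.linalg.pinv(A) @ sigmoid_inv(C_req)

    # Calculate B_req from f(B_req W_init) = C_req
    B_req = sigmoid_inv(C_req) @ np.linalg.pinv(W_init)

    # Normalize B_req into the valid range of the activation
    B_req = B_req * 0.99 / np.max(B_req)

    # Calculate V from f(A V) = B_req
    V = np.linalg.pinv(A) @ sigmoid_inv(B_req)

    # Actual hidden output, then recalculate W against it
    B = sigmoid(A @ V)
    W = np.linalg.pinv(B) @ sigmoid_inv(C_req)
    return V, W, k

def predict(A, V, W, k):
    B = sigmoid(A @ V)
    return k * sigmoid(B @ W)
```

`predict` reproduces the two-equation forward pass in the final step of the list above.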
Enhanced Input Representation
Increase the number of neurons in the hidden layer. Since the number of input nodes fixes the width of the first hidden layer, this amounts to widening the input representation (a sketch follows).
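A minimal sketch of one way to do this, assuming the enhancement consists of appending derived features to the input; the choice of squared features is illustrative, not prescribed by these notes:

```python
import numpy as np

def enhance_inputs(A):
    # Append squared features as extra columns. Because the input
    # width determines the first hidden layer's width, the enhanced
    # input yields a larger hidden layer. (Squared features are an
    # illustrative choice, not prescribed by the notes.)
    return np.hstack([A, A ** 2])
```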