Introduction

  • Neural networks
    • Low-level computational structure
    • Performs well with raw data
    • Can learn, but opaque to user
  • Fuzzy logic
    • High-level reasoning
    • Uses linguistic information from domain experts
    • Lacks the ability to learn & adjust to new environments
  • Fuzzy-neural network
    • Parallel computation & learning abilities + human-like knowledge representation & explanation abilities
    • Trained to develop IF-THEN fuzzy rules & determine membership functions

Fuzzy-Neural Network

  • Multi-layer neural network with 3 hidden layers
    • Input
    • Input membership functions
      • Activation function: triangular, Gaussian, etc.
    • Fuzzy rules
      • Conjunction of the rule antecedents, evaluated with a fuzzy intersection operator: product, etc.
      • $$\mu_{Rn}$$: firing strength of fuzzy rule neuron $$Rn$$
      • $$w_{Rn}$$: normalized degrees of confidence (certainty factors)
        • Adjusted during training
        • Normalized by dividing the values by the highest weight obtained at each iteration
    • Output membership functions
      • Combines fuzzy rule neurons with a fuzzy union operator: e.g. probabilistic sum
    • Defuzzification
      • Each neuron = single output of the system
  • Summary: what training achieves
    • To remove incorrect rules
    • To find all relevant rules
    • To find the relative importance of the rules
    • To tune the shape of the membership functions

Example: Mamdani Fuzzy Inference Model

Activation Function Example: Triangular Function

$$y = \begin{cases} 0 & \text{if } x \le a - \frac{b}{2} \\ 1 - \frac{2|x - a|}{b} & \text{if } a - \frac{b}{2} < x < a + \frac{b}{2} \\ 0 & \text{if } x \ge a + \frac{b}{2} \end{cases}$$

  • $$a$$: center
  • $$b$$: width of triangle
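
A direct numpy translation of this piecewise definition; the clamped form $$\max(0,\ 1 - 2|x - a|/b)$$ is equivalent, and the function name is illustrative:

```python
import numpy as np

def triangular(x, a, b):
    # 1 - 2|x - a| / b inside the triangle, 0 outside; a: center, b: width.
    return np.maximum(0.0, 1.0 - 2.0 * np.abs(x - a) / b)

print(triangular(np.array([0.0, 0.75, 1.0, 1.25, 2.0]), a=1.0, b=1.0))
# -> [0.  0.5 1.  0.5 0. ]
```
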
Defuzzification Example: Sum-Product Composition

Weighted average of the centroids of all output membership functions.

$$y = \frac{\sum_i \mu_{Ci}\, a_{Ci}\, b_{Ci}}{\sum_i \mu_{Ci}\, b_{Ci}}$$

  • $$\mu_{Ci}$$: firing strength of output membership function $$Ci$$
  • $$a_{Ci}$$: center of $$Ci$$
  • $$b_{Ci}$$: width of $$Ci$$
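
A minimal numpy sketch of this composition; the firing strengths, centers, and widths below are assumed example values:

```python
import numpy as np

def sum_product_defuzz(mu, a, b):
    # Weighted average of the centers a_Ci, weighted by the firing strength
    # mu_Ci and width b_Ci of each output membership function Ci.
    mu, a, b = (np.asarray(v, dtype=float) for v in (mu, a, b))
    return np.sum(mu * a * b) / np.sum(mu * b)

# Two output sets fired at 0.2 and 0.5 (values assumed):
print(sum_product_defuzz(mu=[0.2, 0.5], a=[3.0, 7.0], b=[2.0, 2.0]))  # ~5.86
```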

Example: XOR

Training
  1. Construct the FNN using the fuzzy rules already defined, then use BP to train (sketch below)
    1. Bad rules will be pruned (their certainty factors decay)
  2. Construct the FNN with random rules, then use BP to train
    1. Eventually the rules can be extracted from the numerical data
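
A minimal numpy sketch of approach 1 for XOR: enumerate every candidate rule over fixed Gaussian membership functions ("false" centered at 0, "true" at 1) and learn the certainty factors $$w$$ by gradient descent, which stands in here for full BP; weighted-average defuzzification stands in for the sum-product composition, and all names and parameter values are assumptions:

```python
import numpy as np
from itertools import product

def gauss(x, c, sigma=0.4):
    # Gaussian membership function: "false" has center 0, "true" has center 1.
    return np.exp(-((x - c) ** 2) / (2 * sigma ** 2))

X   = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_d = np.array([0, 1, 1, 0], dtype=float)

# One candidate rule per (antecedent, antecedent, consequent) combination.
rules = list(product([0, 1], repeat=3))        # (c1, c2, consequent center)
r = np.array([c for _, _, c in rules], dtype=float)
w = np.full(len(rules), 0.5)                   # certainty factors, learned below

alpha = 0.5
for epoch in range(2000):
    for x, t in zip(X, y_d):
        mu = np.array([gauss(x[0], c1) * gauss(x[1], c2) for c1, c2, _ in rules])
        f  = w * mu                                 # weighted firing strengths
        y  = np.sum(f * r) / np.sum(f)              # weighted-average defuzzification
        w += alpha * (t - y) * mu * (r - y) / np.sum(f)  # gradient step on E = e^2/2
    w = np.clip(w, 1e-6, None)
    w /= w.max()                                    # normalize by the highest weight

lab = {0: "false", 1: "true"}
for (c1, c2, c), wi in zip(rules, w):
    print(f"IF x1 is {lab[c1]} AND x2 is {lab[c2]} THEN y is {lab[c]}: w = {wi:.2f}")
```

The incorrect rules (e.g. false AND false THEN true) should end with small weights, which is the pruning behaviour described above.
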

Example: Adaptive Neuro-Fuzzy Inference System (ANFIS)

Based on Sugeno Fuzzy Inference Model.

Sugeno Fuzzy Rule

IF $$x_1$$ is $$A_1$$ AND ... AND $$x_m$$ is $$A_m$$
THEN $$y = f(x_1, ..., x_m)$$

  • $$y$$
    • Constant: zero-order Sugeno fuzzy model
    • First-order polynomial: first-order Sugeno fuzzy model
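
As a concrete (illustrative) instance with two inputs: IF $$x_1$$ is $$A_1$$ AND $$x_2$$ is $$A_2$$ THEN $$y = k_0 + k_1 x_1 + k_2 x_2$$ is a first-order rule; replacing the polynomial with a constant, $$y = k$$, gives the zero-order case.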

  • Layer 1: input layer
  • Layer 2: fuzzification layer
    • Bell activation function (used in the forward-pass sketch after this list)
      • $$y = \frac{1}{1 + \left(\frac{x - a}{c}\right)^{2b}}$$
      • $$a$$: center
      • $$b$$: slope (the exponent $$2b$$ controls how steep the sides are)
      • $$c$$: width ($$y = \frac{1}{2}$$ when $$|x - a| = |c|$$)
  • Layer 3: rule layer
    • Computes the firing strength (truth value) of each rule
    • Conjunction of rule antecedents: operator product
      • $$y_i = \prod^n_{j=1} x_{ji}$$
  • Layer 4: normalization layer
    • Normalized firing strength of a given rule
      • Represents the contribution of a rule to the final result
      • $$y_i = \frac{x_{ii}}{\sum^n_{j=1} x_{ji}} = \frac{\mu_i}{\sum^n_{j=1} \mu_j} = \overline{\mu_i}$$
  • Layer 5: defuzzification layer
    • Calculates weighted consequent value of a given rule
      • $$y_i = \overline{\mu_i}\left[k_{i0} + k_{i1} x_1 + \dots + k_{im} x_m\right]$$
      • $$k_{ij}$$: consequent parameters of rule $$i$$
  • Layer 6: summation neuron
    • $$y = \sum^n_{i=1} x_i$$
    • Functionally equivalent to a first-order Sugeno fuzzy model
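
A minimal numpy sketch of one forward pass through layers 1–6, for two inputs with two bell membership functions each (so $$n = 4$$ rules, $$m = 2$$); every parameter value is an assumed placeholder:

```python
import numpy as np

def bell(x, a, b, c):
    # Generalized bell MF: center a, slope b, width c (|.| guards non-integer b).
    return 1.0 / (1.0 + (np.abs(x - a) / c) ** (2 * b))

# Layer 1: two crisp inputs (values assumed).
x1, x2 = 0.2, 0.6
A = [dict(a=0.0, b=2.0, c=0.5), dict(a=1.0, b=2.0, c=0.5)]  # MFs on x1 (assumed)
B = [dict(a=0.0, b=2.0, c=0.5), dict(a=1.0, b=2.0, c=0.5)]  # MFs on x2 (assumed)
k = np.array([[0.1, 1.0, 1.0],      # consequent [k_i0, k_i1, k_i2] per rule i
              [0.5, 0.5, -1.0],     # (assumed values; learned by LSE in training)
              [0.0, -0.5, 1.0],
              [1.0, 0.0, 0.0]])

# Layer 2: fuzzification.
mu1 = np.array([bell(x1, **p) for p in A])
mu2 = np.array([bell(x2, **p) for p in B])

# Layer 3: rule firing strengths (product over antecedents).
mu = np.array([mu1[i] * mu2[j] for i in range(2) for j in range(2)])

# Layer 4: normalization.
mu_bar = mu / mu.sum()

# Layer 5: weighted first-order consequents; Layer 6: summation.
f = k[:, 0] + k[:, 1] * x1 + k[:, 2] * x2
y = np.sum(mu_bar * f)
print(y)
```
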
Training

Hybrid learning algorithm combining least-squares estimator & gradient descent method.

  1. Assign initial activation functions to each membership neuron
    1. Function centers connected to input are set s.t. the domains are divided equally
    2. Widths & slopes set to allow sufficient overlapping
  2. Epoch
    1. Forward pass: learn consequent parameters
      1. Input: $$x_i(p)$$, output: $$y_d(p)$$ for $$p = 1, ..., P$$ (number of input-output patterns)
      2. Forward propagation yields a system of equations linear in the consequent parameters: $$y_d = A k$$
        1. $$y_d$$: $$P \times 1$$
          1. $$y_d = [y_d(1) \dots y_d(P)]^T$$
        2. $$A$$: $$P \times n(1+m)$$
        3. $$k$$: $$n(1+m) \times 1$$ (number of consequent parameters)
          1. $$k = [k_{10}\, k_{11} \dots k_{1m}\, k_{20} \dots k_{nm}]^T$$
      3. $$P$$ usually greater than $$n(1+m)$$
        1. $$k^*$$: least-squares estimate of $$k$$ (a numpy sketch of this step follows the derivation below)
          1. $$k^* = (A^T A)^{-1} A^T y_d$$
          2. Minimizes the squared error $$\| A k - y_d \|^2$$
      4. Error vector $$e$$
        1. $$e = y_d - y = y_d - Ak^*$$
    2. Backward pass: learn antecedent/premise parameters
      1. $$\Delta a = -\alpha \frac{\partial E}{\partial a}$$
      2. $$E = \frac{1}{2}e^2$$
      3. $$\frac{\partial E}{\partial a} = \frac{\partial E}{\partial e}\frac{\partial e}{\partial y}\frac{\partial y}{\partial (\overline{\mu_i} f_i)}\frac{\partial (\overline{\mu_i} f_i)}{\partial \overline{\mu_i}}\frac{\partial \overline{\mu_i}}{\partial \mu_i}\frac{\partial \mu_i}{\partial \mu_{Aj}}\frac{\partial \mu_{Aj}}{\partial a}$$

$$\begin{aligned} \overline{\mu_i} &= \frac{\mu_i}{\sum^n_{j=1} \mu_j} \\ \frac{\partial \overline{\mu_i}}{\partial \mu_i} &= \frac{\left(\sum^n_{j=1} \mu_j\right) - \mu_i}{\left(\sum^n_{j=1} \mu_j\right)^2} \\ &= \frac{1}{\mu_i}\left(\frac{\mu_i}{\sum^n_{j=1} \mu_j} - \frac{\mu_i^2}{\left(\sum^n_{j=1} \mu_j\right)^2}\right) \\ &= \frac{1}{\mu_i}\left(\overline{\mu_i} - \overline{\mu_i}^2\right) = \frac{\overline{\mu_i}\left(1 - \overline{\mu_i}\right)}{\mu_i} \end{aligned}$$
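
A minimal numpy sketch of the forward-pass LSE step; the normalized firing strengths are random stand-ins for the layer-4 outputs, and all names and sizes are illustrative:

```python
import numpy as np

# Assumed available per pattern p: inputs x(p) (m values), desired output
# y_d(p), and normalized firing strengths mu_bar_i(p) for the n rules.
P, n, m = 100, 4, 2
rng = np.random.default_rng(0)
x = rng.random((P, m))
mu_bar = rng.dirichlet(np.ones(n), size=P)   # rows sum to 1, layer-4 stand-ins
y_d = rng.random(P)

# Build A (P x n(1+m)): row p holds mu_bar_i(p) * [1, x1(p), ..., xm(p)]
# for each rule i, so that y = A k reproduces the layer-5/6 output.
A = np.hstack([mu_bar[:, [i]] * np.hstack([np.ones((P, 1)), x]) for i in range(n)])

# k* = (A^T A)^{-1} A^T y_d; lstsq is the numerically stable route.
k_star, *_ = np.linalg.lstsq(A, y_d, rcond=None)

y = A @ k_star          # network output with the new consequent parameters
e = y_d - y             # error vector used by the backward (gradient) pass
print("squared error:", np.sum(e ** 2))
```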

Summary

Type 1: use ANN to determine the membership functions
  • Determine membership functions through ANN
  • Combined with predetermined set of fuzzy rules

Type 2: use ANN to determine the fuzzy rules
  • Extract fuzzy rules from training data (clustering approach)
  • Combined with predetermined fuzzy sets

Type 3: online adaptation of membership functions
  • Initialize fuzzy rules & membership functions
  • Update/learn the membership functions

Type 4: weighted rules for the fuzzy system
  • Update/learn the importance of each rule (weighted rules) in the fuzzy system
