
THE NEURAL NETWORKS TECHNIQUE




Neural networks (NNs), in their simplest form, are similar to a linear regression (see Principe et al., 2000). The explained variable y is the response, while the explanatory variables x are the covariates. The linear regression would be:

y = w_0 + \sum_i w_i x_i + \varepsilon

The diagram representing this model would be as in Figure 37.3. In the NN literature, this function and its diagram form a single-unit perceptron. However, an NN is more complex than this plain linear model. Each node processes several inputs and produces outputs, y, that serve as inputs to other nodes of the network, until we reach the final stage where we get the modelled output, as shown in Figure 37.4. A single neuron receives inputs and transforms them into outputs. We use standard italic letters for inputs to and outputs from a single neuron. The unit receives signals x and transforms them into signals y that serve as inputs to other neurons. The activation function f(u) takes a specified form, such as the logistic form f(u) = 1/[1 + exp(-u)], which produces an output between 0 and 1. Other functions with values within the 0 to 1 range might serve as well. The function u(x, w) might be a linear function of the inputs through the weights (w) plus a constant.

Figure: the single-unit neuronal model.
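As an illustration, a minimal sketch in Python of such a single unit with a logistic activation (the attribute values, weights, and the neuron_output helper are illustrative, not taken from the source):

    import math

    def logistic(u):
        # Activation function f(u) = 1 / [1 + exp(-u)], mapping u into the 0 to 1 range
        return 1.0 / (1.0 + math.exp(-u))

    def neuron_output(x, w, w0):
        # u(x, w): linear combination of the inputs through the weights plus a constant
        u = w0 + sum(wi * xi for wi, xi in zip(w, x))
        # y: the output signal, which would serve as input to other neurons in a network
        return logistic(u)

    # Hypothetical example: three observable attributes and arbitrary weights
    x = [0.5, -1.2, 3.0]
    w = [0.8, 0.1, -0.4]
    print(neuron_output(x, w, w0=0.2))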

There are several differences between simple statistical techniques and NNs:

• The relationship between inputs and outputs of each node is not linear, since it combines weights with non-linear functions. Therefore, the final relation between the inputs and the modelled variable is no longer linear.

• The NNs attempt to mimic any function between the original inputs (observable attributes) and the final outputs (modelled variable). To achieve this function approximation, they minimize the error, that is, the gap between the predicted values and the actual values.

Optimization consists of minimizing this error. Unlike simple regression, which fits its coefficients analytically in a single pass, NNs achieve this through an iterative training process.
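For instance, with a squared-error criterion (one common choice, not the only possible one), the quantity minimized over the N observations of the training sample would be:

    E(w) = \frac{1}{2} \sum_{k=1}^{N} \left( y_k - \hat{y}_k(w) \right)^2

where y_k is the observed value and \hat{y}_k(w) the output produced by the network with weights w for observation k.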

The training process uses a feedback loop. For a given set of inputs, the NN produces outputs that depend on the weights w. Changing the weights changes the outputs. At each stage, the distance between the actual value and the NN-predicted value determines, according to the size of the mismatch, whether the weights are changed. Modifying the weights results in a new set of outputs. If the error is large enough, the NN learns by adjusting the weights.

The rule triggering the change in weights is the delta rule. The magnitudes of the weight changes depend on a parameter called the learning rate. The delta rule moves the weights in the direction that reduces the mismatch between modelled and observed outputs fastest, a process known as steepest (gradient) descent. The process implies running through each observation of the training set a number of times to reach convergence.
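As an illustration, a minimal sketch of this training loop for a single logistic unit, assuming a squared-error criterion and an arbitrary learning rate (it reuses the neuron_output helper sketched above; all names and values are illustrative):

    def train_single_unit(samples, n_inputs, learning_rate=0.1, epochs=100):
        # samples: list of (x, target) pairs; weights start at zero (arbitrary choice)
        w = [0.0] * n_inputs
        w0 = 0.0
        for _ in range(epochs):                      # each observation is processed many times
            for x, target in samples:
                y = neuron_output(x, w, w0)          # output with the current weights
                # Delta rule: the weight change is proportional to the mismatch (target - y),
                # scaled by the learning rate and the slope of the logistic, y * (1 - y)
                delta = learning_rate * (target - y) * y * (1.0 - y)
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                w0 += delta
        return w, w0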

Because NNs can replicate closely any function, including complex non-linear functions, they can reproduce almost any kind of output. However, the result may become too dependent on the training sample: the NN could mimic the outputs of the training sample yet perform poorly on new observations. In other words, an NN can over-train. To avoid over-training, NNs use a cross-validation sample. The cross-validation sample does not serve for training the model; it serves to check that the error on this second sample does not increase while the NN trains itself through repeated passes over the training sample. The process prevents the model from over-training, since over-training would show up as an increasing error on the cross-validation sample.
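A minimal sketch of this safeguard, where train_one_epoch and validation_error stand for whatever training step and error measure the model uses (both are placeholders, not functions from the source):

    def train_with_early_stopping(train_one_epoch, validation_error,
                                  max_epochs=500, patience=10):
        # Stop training when the error on the cross-validation sample stops improving
        best, stale = float("inf"), 0
        for _ in range(max_epochs):
            train_one_epoch()             # adjusts weights on the training sample only
            err = validation_error()      # the cross-validation sample never trains the model
            if err < best:
                best, stale = err, 0
            else:
                stale += 1                # error on the second sample no longer decreases
            if stale >= patience:
                break                     # further passes would over-train the model
        return best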

NN models serve both for classification and for function approximation of ordinal or numerical variables. In classification mode, the NN tries to achieve the highest possible rate of correct classification. This is attractive for categorical variables, but it ignores how large the errors of the incorrect classifications are. When running in regression mode, the model recognizes that the size of the error counts: an error of five notches in a predicted rating weighs more than an error of only two notches. In this mode, the model minimizes the errors over the whole training sample, as ordered probit/logit models do.

This difference is similar to the distinction between using logit and probit models for classification and using them for modelling ordinal variables (ranks) or numerical variables (default frequencies). In general, to model ratings or default frequencies from observable attributes, it makes sense to use ordinal or numerical models.
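To illustrate the difference between the two modes, assuming ratings coded as integer notches (all numbers here are made up):

    def classification_error(predicted, actual):
        # Classification mode: only the rate of incorrect classifications matters;
        # a five-notch miss counts exactly like a one-notch miss
        return sum(p != a for p, a in zip(predicted, actual)) / len(actual)

    def regression_error(predicted, actual):
        # Regression (ordinal) mode: the size of the error counts;
        # a five-notch miss weighs more than a two-notch miss
        return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

    # Hypothetical ratings coded 1 (best) to 10 (worst)
    actual = [3, 5, 7, 2]
    predicted = [3, 7, 7, 7]
    print(classification_error(predicted, actual))   # 0.5: two ratings out of four are missed
    print(regression_error(predicted, actual))       # 7.25: the five-notch miss dominates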

NNs are appealing because of their ability to tackle very complex issues. For classification, the basic issue when modelling defaults or ratings, NNs have attractive properties relative to traditional statistics. The observed relationships between the ordinal variables that characterize credit states and the observable attributes are not linear, such as the relation between size and ratings. Moreover, many financial attributes correlate with each other, such as profitability measures, which all depend on common factors such as net income, profit before tax, or Earnings Before Interest, Taxes, Depreciation and Amortization (EBITDA). Multicollinearity is a weakness of linear models as well. NNs accommodate non-linear interdependencies and therefore have a higher potential for accuracy. On the other hand, understanding why the modelled risk is high or low from the input values becomes more complex. This drawback might neutralize the benefits of higher accuracy. It is still too early to say whether they beat classical statistical models, but their use is spreading in finance, notably for standalone credit risk modelling.


