GoldenGem

From GoldenGem, the free neural network


What are the main types of financial analysis?

There are two main types. Technical Analysis attempts to extrapolate a price using only earlier values of that same price. Only a few papers show a statistically significant advantage over random trading. The second type is Fundamental Analysis, which uses data from financial statements, interest rates, volumes, competitors' prices, prices of raw materials, or other variables known to affect the target prices. Neural networks can act as a 'bridge' between Technical Analysis and the more highly regarded Fundamental Analysis. Future values of prices are approximated, not by nonlinear extrapolation of earlier values, but by hypothesizing and testing actual causal connections. It is possible to load a set of prices and volumes of the most well-known shares, bonds, and indices by pressing one button. But unless one has the skill to find a meaningful relationship among these variables, this data is of little use, and the 'load from internet' button is really only for beginners. A competent user will graduate to using other types of data, available on the internet and also accessible by GoldenGem with a bit more work. The instructions link on the left leads to a series of links explaining further types of data that can be imported, and we're willing to modify the program to make it compatible with new formats as they come up.

In developing this implementation over time, we have needed to confront subtle questions before the algorithm began to work well, as it now does. The Verification link on the left shows GoldenGem predicting abstract mathematical functions. The program does not use the fact that the functions are repeating; only today's values of the input variables are used in making the last day of prediction. In cases where values earlier than the last known value may have an effect, you should also include 'stochastics' as input variables. The program does not take into account 'oscillations' such as those described in the Elliott wave 'theory,' a theory to which we do not subscribe; note, however, that the program is able to predict functions that oscillate. Finally, the program is able to predict any one of the functions knowing the others, but there is no preferred process of extrapolation that would work for a single function treated by itself.
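As an illustration of how earlier values can be folded into a single input variable, here is a minimal sketch that assumes 'stochastics' means the common stochastic oscillator (%K). The function name `stochastic_k`, the 14-day default window, and the use of closing prices alone (rather than daily highs and lows) are assumptions made for this example; they are not taken from GoldenGem itself.

```python
def stochastic_k(closes, window=14):
    """Return %K (0 to 100) for every day that has a full lookback window.

    Assumption: only closing prices are available, so the window's lowest
    and highest closes stand in for the usual daily lows and highs.
    """
    values = []
    for i in range(window - 1, len(closes)):
        recent = closes[i - window + 1 : i + 1]
        low, high = min(recent), max(recent)
        if high == low:
            # A flat window has no range; report the midpoint by convention.
            values.append(50.0)
        else:
            values.append(100.0 * (closes[i] - low) / (high - low))
    return values


closes = [10, 11, 12, 11, 13, 14, 13, 15]
print(stochastic_k(closes, window=5))
```

Each output value summarizes where today's price sits within its recent range, so loading it as an extra variable gives the net some information about earlier days while it still reads only today's inputs.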

How reliable is a prediction that this program gives me?

We have added a pair of indicator lights to help answer this in each case. The first indicator light refers to a number r computed during backtesting, which is defined to be the minimum of the correlation coefficient of predicted versus actual daily changes, on the one hand, and the correlation coefficient times the ratio (actual variance)/(predicted variance), on the other hand. To give some idea of the meaning of this: if a person were to buy or short the target every day during backtesting, in proportion to its predicted percentage change, then, assuming things are normally distributed, the percentage profit over n days of trading during that time in the past would have been

$$ r \left(\frac{\pi}{2}\right)^{1/2} \left(\frac{n}{365}\right)^{1/2} V $$

where V is the maximum of actual and predicted annual volatility expressed in percentage points.
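The definitions above can be sketched in a few lines of Python. This is only an illustration of the stated formulas, not GoldenGem's own code: the function names are hypothetical, population variance is assumed, and the volatility V is passed in by the caller as the larger of the actual and predicted annual volatility.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    # Population variance (an assumption; the text does not say which form is used).
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def correlation(xs, ys):
    # Pearson correlation coefficient of two equal-length series.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / math.sqrt(variance(xs) * variance(ys))


def r_value(predicted_changes, actual_changes):
    """Minimum of the correlation and the correlation scaled by
    (actual variance) / (predicted variance)."""
    rho = correlation(predicted_changes, actual_changes)
    ratio = variance(actual_changes) / variance(predicted_changes)
    return min(rho, rho * ratio)


def hypothetical_profit(r, n_days, volatility_pct):
    """Percentage profit r * (pi/2)^(1/2) * (n/365)^(1/2) * V."""
    return r * math.sqrt(math.pi / 2) * math.sqrt(n_days / 365) * volatility_pct


predicted = [1.0, -1.0, 2.0, -2.0]
actual = [0.5, -0.5, 1.0, -1.0]
r = r_value(predicted, actual)
print(r, hypothetical_profit(r, n_days=365, volatility_pct=20.0))
```

In this toy example the predicted changes are perfectly correlated with the actual ones but twice as large, so the variance ratio of 0.25 caps r well below the raw correlation of 1.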

Taking V to be the maximum of actual and predicted volatility may seem contrived, but it is exactly what one wants. If r and V were defined using only the actual volatility, then a strategy relying upon a posteriori information would exist to attain deceptively good returns during backtesting which do not really result from any prediction: a sluggish response, in which the green curve stays near the 2-year average, would correspond to a strategy of always predicting a sudden return to the 2-year average, which during backtesting includes knowledge of future days and would unfairly reward the correlation coefficient alone. If, on the other hand, r and V were defined using only the predicted volatility, there would be no intrinsic relation between r and the actual percentage gain: a large r value could arise from a prediction with very low variance. The value of r as we have defined it rules out both these problems, and appears to correspond with what looks like intuitively good backtesting.

The first light is yellow when the r value is larger than 0.39 and green when it is larger than 0.6. The second light goes from red to yellow to green as the training input is removed. You will need to try different combinations of input variables before you will be able to make both lights remain green at the same time. If the lights cannot be made to remain green, the answer to your question is that the prediction is meaningless. If the lights do remain green, then a relationship has been found which has been able to make successful predictions during the backtesting interval. When both lights have remained green, does this imply the prediction can be trusted? Not yet. Even taking into account the variance ratio as we have, the formulation in terms of profit shows that this number could be high enough to set the green light just because some of the hypothetical trades were extremely profitable and others not at all.
You also need to actually look at the behaviour of the prediction line, the part of the green line extending into the future, past the red line, throughout backtesting, and see qualitatively how consistently it is correct. When sensitivity is set to zero there is no training input, and the green graph is calculated using only data values of all variables from the time of the earlier red graph; any prediction you see therefore shows a real mathematical relationship during backtesting. Finally, you are still not quite done. Even when you have assessed both statistically and visually that the predictions throughout backtesting are good, to be really sure the variables you are looking at are related, you should set 'today's date' to a time in the past, or otherwise load data only up to various times in the past, and train the net to predict a range of values which you actually already know. This is a 'validation data set,' and the next version of GoldenGem will make this last stage of validation easier.
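The first light's thresholds and the final validation step can be sketched as follows. The function names and the list-of-pairs data layout are hypothetical, chosen only for this example. The 'red' state for low r values is inferred from the later remark that a red first light signifies inadequate correlation.

```python
from datetime import date


def first_light(r):
    """Colour of the first indicator light for a given r value."""
    if r > 0.6:
        return "green"
    if r > 0.39:
        return "yellow"
    return "red"  # inferred: inadequate correlation


def split_at(series, cutoff):
    """Split (date, value) pairs into training data up to and including
    the cutoff, and held-out validation data after it."""
    train = [(d, v) for d, v in series if d <= cutoff]
    held_out = [(d, v) for d, v in series if d > cutoff]
    return train, held_out


prices = [
    (date(2020, 1, 1), 100.0),
    (date(2020, 1, 2), 101.5),
    (date(2020, 1, 3), 99.8),
    (date(2020, 1, 6), 102.1),
]
train, held_out = split_at(prices, cutoff=date(2020, 1, 2))
print(first_light(0.5), len(train), len(held_out))
```

Training only on the first part and comparing the net's predictions with the held-out values mimics setting 'today's date' to a time in the past.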

There do exist relationships between variables which are known to affect prices, but the ones which are well-known can't be exploited unless you have knowledge of the input variables in advance of the trading public. It is not true that all existing relationships are well-known. Gaining an 'inside' edge is legal if you do it by exploiting public domain information through your own intelligence.

What will happen the first time I try it?

A good training strategy is to start with a high sensitivity, and to bring it down in stages. Assuming the variables actually were expected to be related, and your training strategy was correct, you are likely to end up with the first light turning red, signifying inadequate correlation, by the time the second light turns green. This is usually for one of three reasons:

1. If you see vertical green spikes, and a message 'Press the Reset button,' then you have traumatized the net. Like a human or animal, it will take a very long time to recover. Much as a good night's sleep might do for an animal, the Reset button gives a fresh start, and all is forgiven, but the net will need to be trained again from the beginning.

2. If the green line is flat, this is because it was trained inadequately. Raise the sensitivity slider again and wait a while before bringing it down (ideally in stages).

3. If the green line looks just like the red line, but shifted to the right by the amount on the Days slider, you are seeing a situation where the expected future value is always nothing but the last known value. If all graphs are like that, congratulations, you have found a 'Markovian' set of shares: interesting but with no opportunity for arbitrage, assuming the neural network has found the best possible solution.
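The second and third symptoms in the list above can also be checked numerically once you have the two curves as lists of numbers. The sketch below is hypothetical and is not part of GoldenGem: `diagnose`, its 5% tolerance, and the assumption that `predicted[t]` is the forecast for day `t` made `days` earlier are all choices made for this example.

```python
def diagnose(actual, predicted, days, rel_tol=0.05):
    """Classify a predicted (green) curve against the actual (red) curve.

    Returns "flat", "shifted copy", or "neither". The tolerance is
    relative to the range of the actual curve.
    """
    if not 0 < days < len(actual):
        raise ValueError("days must be positive and shorter than the series")
    spread_actual = max(actual) - min(actual)
    if spread_actual == 0:
        raise ValueError("actual series is constant")
    spread_predicted = max(predicted) - min(predicted)
    # Symptom 2: the green line hardly moves at all.
    if spread_predicted < rel_tol * spread_actual:
        return "flat"
    # Symptom 3: each prediction just repeats the value known `days` earlier.
    lag_error = max(
        abs(predicted[t] - actual[t - days]) for t in range(days, len(actual))
    )
    if lag_error < rel_tol * spread_actual:
        return "shifted copy"
    return "neither"


actual = [1, 2, 3, 4, 5, 6]
print(diagnose(actual, [3.5] * 6, days=2))
print(diagnose(actual, [1, 1, 1, 2, 3, 4], days=2))
```

A check like this only flags the symptom; deciding whether to retrain or to accept that the variables carry no usable relationship remains a judgment for the user.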

The program reports advice during training, and an automatic training button is included which helps avoid some of these pitfalls.

Conclusion

Financial analysis is something you do, not something you buy. A neural network requires involvement by the user. You have to choose what data you think is relevant, you have to learn how to train the net, and it is up to you to evaluate the backtesting. The most important thing to remember is that although the display shows only two graphs at a time (actual and predicted), the predicted graph is generated by taking account of the mathematical relations among the prices and volumes of all loaded variables simultaneously, so the choice of ungraphed tickers affects the quality of the match between the two graphs you are observing.