What is Machine Learning?
Machine learning is generally understood as the idea that computer programs can improve themselves over time through learning effects. For stock market trading, it represents the next step in the evolution of statistical and quantitative methods. Although human influence in the quant business was already diminishing, decision makers continued to play an important role - for example, by choosing the model to be used at the outset of a quantitative analysis.
In machine learning, this human influence is much smaller still or, ideally, disappears entirely. The algorithms can evaluate many models in parallel and then decide for themselves which framework to use for further analysis.
To many established asset managers this step seems revolutionary. Accordingly, opinions diverge widely, ranging from skepticism, confusion and outright incomprehension to the view that machine learning enables more progressive, better-informed decision making.
A new word for old ideas?
In the paper "Can Machines 'Learn' Finance?", Ronen Israel, Bryan Kelly and Tobias Moskowitz write that this wide range of opinions stems from the fact that machine learning is a rapidly developing but highly technical field that few market participants fully understand. As a result, the term is sometimes used in practice however it suits the user - for example, for marketing purposes.
But how does machine learning differ from previous quantitative concepts? The paper names three criteria that the algorithms fulfill: [1]
Criteria of machine learning:
1. Application of "large" models with many parameters (features) and/or complex non-linear relationships between inputs and outputs, with the aim of achieving maximum forecast quality given that the market's true pricing model is unknown
2. Selection of a preferred model from any number of candidate models, using regularization techniques to limit model complexity and cross-validation methods with simulated out-of-sample tests to avoid overfitting
3. Innovative approaches to efficient model optimization that reduce the computational effort in big-data environments, such as Stochastic Gradient Descent, which evaluates only random subsets of the data at each step without a large loss of accuracy
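The idea behind the third criterion can be sketched in a few lines. The example below is purely illustrative and not taken from the paper - the data, learning rate and step count are all assumed values. It fits a simple linear relationship by updating the parameters from one randomly drawn observation per step, rather than computing the gradient over the full data set.

```python
import random

random.seed(0)

# Illustrative data: 100 noiseless observations of y = 2*x + 1.
data = [(i / 100.0, 2 * (i / 100.0) + 1.0) for i in range(100)]

w, b = 0.0, 0.0   # model parameters to be learned
lr = 0.1          # learning rate (an assumed value)

for step in range(20000):
    x, y = random.choice(data)   # one random observation per step ("stochastic")
    err = (w * x + b) - y        # prediction error on that single point
    w -= lr * err * x            # gradient step for the slope
    b -= lr * err                # gradient step for the intercept

print(round(w, 2), round(b, 2))  # converges toward the true values 2 and 1
```

Because each update touches only one observation, the cost per step stays constant no matter how large the data set grows - which is precisely the appeal in big-data environments.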
The authors write that, ideally, "Goldilocks models" are used in machine learning. These are large enough to reliably identify the real, potentially complex predictive relationships in the data. At the same time, they are not so flexible that they overfit the historical data, which would lead to disappointing out-of-sample performance.
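One way to see this trade-off is to compare an unregularized flexible model with a mildly regularized one on the same data. The sketch below is a toy illustration with assumed data and penalty values, not an example from the paper: a degree-9 polynomial (a "large" model) is fit to noisy observations of a simple linear signal, with and without a ridge penalty.

```python
import numpy as np

# Illustrative training data: 15 points around the simple truth y = 2*x,
# perturbed by a fixed alternating "noise" pattern.
x_train = np.linspace(0.0, 1.0, 15)
noise = 0.3 * np.array([1, -1] * 7 + [1])
y_train = 2.0 * x_train + noise

# Out-of-sample test set: the noiseless truth on a fine grid.
x_test = np.linspace(0.0, 1.0, 200)
y_test = 2.0 * x_test

def fit_ridge(x, y, degree, lam):
    """Least squares on polynomial features with an L2 (ridge) penalty lam."""
    X = np.vander(x, degree + 1)
    return np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y)

mses = []
for lam in (0.0, 0.1):  # no regularization vs. a mild (assumed) penalty
    w = fit_ridge(x_train, y_train, 9, lam)
    pred = np.vander(x_test, 10) @ w
    mses.append(float(np.mean((pred - y_test) ** 2)))

print(mses)  # the regularized fit should generalize better out of sample
```

The unregularized polynomial has enough freedom to chase the noise in the training points, while the penalized version is pulled back toward the simpler underlying signal - a small-scale analogue of the Goldilocks balance described above.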
When does Machine Learning work?
According to the paper, machine learning's success stories to date - in areas such as speech recognition, strategic games and robotics - come from environments that combine two critical factors: [1]