If machine learning is the "latest rage" in quantitative asset management, then causal artificial intelligence is certainly the "very latest rage". In fact, developments in this area hold the greatest potential for the future.
Classic machine learning works extremely well in certain areas. The best-known success stories include speech recognition, strategic games and robotics. This has been made possible primarily by the sharp increase in available data and computing capacity.
But when it comes to using such models successfully in the financial markets, things are much more difficult. Here, users have to deal with problems such as a non-stationary environment, a limited amount of data and a poor signal-to-noise ratio. In addition, the best models are complicated black boxes that humans can neither understand nor explain.
In the meantime, structural modelling has established itself as a promising approach: machine learning is implemented within a higher-level, theoretically sound (economic) model accepted by human experts. This can currently be described as the state-of-the-art solution.
Curve Fitting instead of Intelligence
For Professor Judea Pearl, however, this does not go far enough. In an article, he writes that deep learning is ultimately curve fitting: the user does not learn why it works (or why it doesn't) and what should be changed. The fault could lie in the programme or the methodology, but the framework might also have changed. One simply does not know.
Now, one could argue that such transparency is not necessary; after all, we do not know exactly why our own brain works either. But Pearl rejects this comparison: the brain functions in the same way in different people. This makes it possible to communicate in a common language, to learn from each other and to motivate one another. Artificial intelligence is still a long way from that. So far, algorithms have only reached the first of three levels, the "seeing" statistical level. Applied to medicine, this level asks: "Which disease is best explained by the observed symptoms?"
I view machine learning as a tool to get us from data to probabilities. But then we still have to make two extra steps to go from probabilities into real understanding – two big steps.
Judea Pearl 
Two big steps
For the two higher steps on the way to true artificial intelligence, Pearl says the following aspects need to be considered: 
Effects of interventions ("Will my symptoms go away if I take this medicine?"): This information can be encoded in graphical models that describe which variables respond to which others.
Counterfactual reasoning ("Would my symptoms have gone away if I had not taken this medicine?"): This is the language scientists use, but it is even more difficult. Here we need equations that tell us how variables respond to changes in other variables.
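These two rungs are easy to illustrate with a simulation. The sketch below (a toy model with invented variable names and coefficients, not from the article) builds a linear structural model in which a confounder Z drives both a treatment T and an outcome Y: a naive observational regression overstates the causal effect of T, while simulating the intervention do(T) - cutting the arrow from Z to T - recovers it.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Illustrative linear structural model:
#   Z -> T, Z -> Y, and T -> Y with a true causal effect of 2.0
z = rng.normal(size=n)
t = 1.0 * z + rng.normal(size=n)               # treatment depends on the confounder Z
y = 2.0 * t + 3.0 * z + rng.normal(size=n)

# Rung 1 ("seeing"): observational regression slope of Y on T
obs_slope = np.cov(t, y)[0, 1] / np.var(t)     # ~3.5, biased upwards by Z

# Rung 2 ("doing"): simulate do(T), i.e. set T from outside, independently of Z
t_do = rng.normal(size=n)
y_do = 2.0 * t_do + 3.0 * z + rng.normal(size=n)
do_slope = np.cov(t_do, y_do)[0, 1] / np.var(t_do)  # ~2.0, the true causal effect

print(round(obs_slope, 2), round(do_slope, 2))
```

The gap between the two slopes is exactly what purely statistical "seeing" cannot detect on its own.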
The crucial point of these two stages is that they require causal models. The basis for this is the Bayesian network, which Pearl introduced as early as 1985. It is a probabilistic graphical model that represents a set of variables and their conditional dependencies. It can be used to determine the probability with which individual factors contributed to the occurrence of a particular event.
Here is an example. Two events can cause grass to be wet: rain or an active sprinkler system. Rain also directly affects the sprinkler, which is usually not active when it rains. In a Bayesian network, this can be modelled as shown below, where each variable can take two possible values (T = true, F = false).
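With the standard probability tables for this example (those given in the Wikipedia entry cited in the references), the "seeing" question can be answered by simple enumeration - for instance: given that the grass is wet, how likely is it that it rained? A minimal sketch in Python, the language mentioned later in the article:

```python
from itertools import product

# Conditional probability tables of the rain/sprinkler/grass-wet network
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},   # given rain
               False: {True: 0.40, False: 0.60}}  # given no rain
P_wet = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.80, (False, False): 0.00}  # given (sprinkler, rain)

def joint(rain, sprinkler, wet):
    """Joint probability factorised along the network structure."""
    p = P_rain[rain] * P_sprinkler[rain][sprinkler]
    p_wet = P_wet[(sprinkler, rain)]
    return p * (p_wet if wet else 1 - p_wet)

# P(Rain = T | GrassWet = T), summing out the hidden sprinkler state
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print(round(num / den, 4))  # ≈ 0.3577: rain explains the wet grass in ~36% of cases
```

Brute-force enumeration works here only because the network has three binary variables; larger networks require dedicated inference algorithms.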
The crux: recognising causalities
Now, however, it has turned out that recognising causal relationships in the time series of complex, dynamic systems in a statistically reliable way is a difficult undertaking. Intuitively, this must be the case - otherwise the approach would probably have become standard long ago.
In concrete terms, the challenges lie in strong mutual dependencies, time lags and the high dimensionality of the many different variables. Accordingly, the time series exhibit autocorrelation, and various individual variables that are not very meaningful on their own can suddenly achieve a clear effect collectively. This not only complicates the search for true causalities, but also increases the risk of false-positive correlations.
The following diagram schematically shows a complex system (A) for which the underlying causal dependencies (B) are to be estimated. Both linear and non-linear relationships and their time lags are taken into account. Pairwise correlations can lead to errors (grey arrows). Firstly, due to common drivers: for example, X2 affects both X1 and X3, so that there is also a correlation between X1 and X3, but it is not causal and is potentially misleading. Secondly, due to indirect pathways: X2 affects X3 and X3 affects X4, but the correlation between X2 and X4 is not causal.
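The common-driver pitfall is easy to reproduce. In the following sketch (simulated data with made-up coefficients), X2 drives both X1 and X3, producing a sizeable correlation between X1 and X3 even though neither causes the other; correlating the residuals after regressing the driver X2 out of both series makes the spurious link vanish.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# Common driver: X2 -> X1 and X2 -> X3, but no arrow between X1 and X3
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)
x3 = 0.8 * x2 + rng.normal(size=n)

raw_corr = np.corrcoef(x1, x3)[0, 1]       # ~0.39: looks like a relationship

def residual(y, x):
    """Remove the linear influence of x from y (OLS slope, zero-mean data)."""
    beta = np.dot(x, y) / np.dot(x, x)
    return y - beta * x

# Conditioning on the common driver: correlate what X2 cannot explain
partial_corr = np.corrcoef(residual(x1, x2), residual(x3, x2))[0, 1]  # ~0

print(round(raw_corr, 2), round(partial_corr, 2))
```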
Such misleading correlations are the main problem of classical machine learning. They have to be recognised and sorted out using suitable methods, so that only the few, but highly probable, causal relationships are retained in the model. The crucial question is: how does this work?
Specialised companies like causaLens use highly qualified teams to develop complex mathematical models that determine how likely the relationships in the data are to be causal and how strong they are estimated to be. To do this, for example, alternative scenarios are run through in order to determine chains of cause and effect - much like a human would, in the sense of a "machine scientist". However, the whole thing is still in its infancy, as the company's founder, Darko Matovski, said at the annual conference Portfolio Management in December 2020. The exact details of the process are very complicated and represent the "secret sauce" of causaLens, which offers its clients appropriately calibrated forecasting models. Like classic machine learning models, these are created in the Python programming language.
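The exact causaLens procedure is not public, but one classical building block for directed, time-lagged relationships is the Granger idea: X is a candidate cause of Y if lagged values of X improve the prediction of Y beyond what Y's own past achieves. A minimal sketch on simulated data (illustrative only, not causaLens' method):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000

# Simulated system: X drives Y with a one-period lag, but not vice versa
x = np.zeros(n); y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

def resid_var(target, lags):
    """Residual variance of an OLS fit of target[t] on the lagged series at t-1."""
    X = np.column_stack([s[:-1] for s in lags])
    beta, *_ = np.linalg.lstsq(X, target[1:], rcond=None)
    return np.var(target[1:] - X @ beta)

# Does adding the other series' past improve the prediction?
gain_x_to_y = 1 - resid_var(y, [y, x]) / resid_var(y, [y])   # large: X -> Y
gain_y_to_x = 1 - resid_var(x, [x, y]) / resid_var(x, [x])   # ~0: no Y -> X

print(round(gain_x_to_y, 2), round(gain_y_to_x, 3))
```

In practice such a pairwise test is only a starting point: as described above, common drivers and indirect pathways require conditioning on further variables.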
An interesting feature is the visualisation of causal relationships, which makes the "thought process" of the algorithms understandable. This in turn enables users to interact with the models. The causal relations thus represent the common language in which humans and algorithms communicate with each other.
And the tables can also be turned: not only can the algorithms learn causal relationships from the data, which are then evaluated by experts; the presumed relationships can also be specified by the experts and then tested by machine learning, analogous to classical structural modelling.
Cooperation between man and machine
Causal models can also integrate specific expert knowledge where reliable relationships exist, which enables higher significance. The advantage here is that this knowledge does not have to be available in digital form and can also be unstructured. In addition, experts can supplement the "causal map" accordingly if there is insufficient data in an area.
Input is also not limited to a single point in time and can be continuous. If framework conditions change, quick, well-founded adjustments are therefore possible over time. Admittedly, this means that the models no longer function autonomously. But at the same time, it enables the algorithms to draw consistent, contextual conclusions about cause and effect in the data. Thanks to the external input, the models can sometimes react to unexpected changes even faster than these changes become recognisable in the data. This shows how humans and machines ideally complement each other. However, it can also increase the potential for error.
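How such an editable "causal map" might look in code can be sketched as follows (a toy representation with invented data, following the variables of the earlier diagram): the graph is a plain data structure that experts can modify, and each node's structural equation is re-fitted only from its declared parents, so the data can judge an expert's hypothesis.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Toy data following the earlier diagram: X2 -> X1, X2 -> X3, X3 -> X4
data = {}
data["X2"] = rng.normal(size=n)
data["X1"] = 0.8 * data["X2"] + rng.normal(size=n)
data["X3"] = 0.6 * data["X2"] + rng.normal(size=n)
data["X4"] = 0.7 * data["X3"] + rng.normal(size=n)

# The "causal map": a plain structure that experts can edit directly
causal_map = {"X1": ["X2"], "X3": ["X2"], "X4": ["X3"]}

def fit_edges(causal_map, data):
    """OLS-fit each node's structural equation from its declared parents."""
    coefs = {}
    for node, parents in causal_map.items():
        X = np.column_stack([data[p] for p in parents])
        beta, *_ = np.linalg.lstsq(X, data[node], rcond=None)
        coefs[node] = dict(zip(parents, np.round(beta, 2)))
    return coefs

# An expert hypothesises a direct X2 -> X4 link; the data then judge it
causal_map["X4"].append("X2")
print(fit_edges(causal_map, data)["X4"])   # the X2 coefficient comes out near zero
```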
"We need to blend computer-based forecasting and subjective judgment in the future". (Philip Tetlock in his book "Superforecasting")
"Real" artificial intelligence
An additional success factor is the repeated execution of virtual experiments. These can also simulate events that never occurred in reality (artificial imagination). By observing the results, a learning effect can emerge. This would be very similar to how a person learns, whose complex network of neurons could likewise be simulated with a computer. Admittedly, it would not work exactly as it does in our heads. But according to Matovski, it can still work, even with a certain degree of abstraction - just as an aeroplane is an abstraction of its natural model, the bird.
Add the aforementioned input from experts, and playing out alternative scenarios becomes possible. Through the interface of the visualised representation, the causal relationships can be manually adjusted to see how this is likely to affect the results. This corresponds to the third stage of artificial intelligence according to Pearl: "What if?".
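Pearl's "What if?" stage can be sketched for a linear structural equation (a deliberately simple, invented example) using his three-step recipe: abduction (infer the unobserved noise from what actually happened), action (set the variable of interest to its hypothetical value), and prediction (re-evaluate the equation with the old noise).

```python
# Illustrative structural equation: outcome = 2.0 * exposure + noise
causal_effect = 2.0

def counterfactual(observed_exposure, observed_outcome, new_exposure):
    # 1. Abduction: recover the noise term consistent with the observation
    noise = observed_outcome - causal_effect * observed_exposure
    # 2. Action: set the exposure to its hypothetical value (the "do" step)
    # 3. Prediction: re-evaluate the structural equation with the old noise
    return causal_effect * new_exposure + noise

# "We observed an outcome of 5 with exposure 1 - what if exposure had been 0?"
print(counterfactual(1.0, 5.0, 0.0))  # 3.0: the part not caused by the exposure
```

The same recipe carries over to larger structural models, where the abduction step infers one noise term per equation.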
Machine learning alone has nothing to do with true artificial intelligence. The decisive factor is the development of causal models. Unlike in classical machine learning for image recognition or strategy games, however, no sudden, major breakthrough is to be expected here. Rather, it is a long process with many small advances that accumulate over time.
It is important to realise that perfectionism is the wrong approach. Nor do expectations need to be set particularly high: in the markets, even small but consistent advantages are enough to generate significant excess returns over time. So if causal models perform just a little better than the classic approaches of the competition, that is already worth its weight in gold.
On a meta-level, there is another interesting conclusion: at the end of the day, causal artificial intelligence does nothing other than identify the key drivers in the data over time and evaluate them using verified models. This is very similar to how the best investors determine the market regime based on comprehensive analyses and experience and then implement appropriate trading strategies for it. Both could lead to the same result in the end: A positive alpha.
The guest author's contribution does not necessarily reflect the opinion of the editorial team.
Pearl, J. (2019), "The Limitations of Opaque Learning Machines", in: Brockman, J. (ed.), Possible Minds – 25 Ways of Looking at AI, Penguin Press
Pearl, J. (1985), "Bayesian Networks: A Model of Self-Activated Memory for Evidential Reasoning", University of California
Wikipedia, "Bayesian Network", https://en.wikipedia.org/wiki/Bayesian_network, accessed 27.01.2021
causaLens (2020), "Empower Experts with Causal AI"
Would you like to use this article - in full or in part - for your purposes? Then please note the following Creative Commons licence.