Test first, then trade - Interview with Johann Chr. Lotter

Johann Christian Lotter is a classic career changer on the stock exchange. He studied law and physics, worked for a while in the film and television industry, lived in the USA, where he programmed video games. Later, he founded his own software company for film effects and computer games. Then, about ten years ago, he wanted to expand and so talked to various backers. He found an investor with a special idea: to convert the existing game development platform into a trading platform that could be used for both quick backtests and automated trading. And that's exactly what he did. In the meantime, Johann Christian Lotter and his constantly growing team have already tested more than 1300 strategies for customers. Reason enough for us to talk to him about it in detail.

Interview with Johann Christian Lotter, CTO at oP group Germany

Mr. Lotter, are video games and trading really so similar that you can adapt an existing development platform from one area to the other?

Lotter: Before I spoke to our investor, I had no contact with stock market topics at all. So I was a bit surprised when I heard that these two areas were supposedly very similar. In fact, it turned out to be true.

You can certainly see it both positively and negatively that there are so many parallels here. But since you are not a trader yourself, the focus was on programming from the outset. How has the whole thing developed over the last ten years?

Lotter: I was the specialist for fast programming that our investor was looking for at the time. After the development work, we had two platforms, one for games and the other for trading strategies on the stock exchange. We then quickly realized what to do with the latter: we offered strategy development and programming as a service for institutional and private traders. And that was exactly the right decision, because since then business has virtually exploded. To date, we have tested more than 1300 strategies for clients. I now employ twelve freelancers worldwide. Some of them I haven't even met in person, but in our networked world, the team still works well.

That sounds like a success story. What programming language do you work with?

Lotter: Exclusively C or its extension, C++. That is simply the fastest option. Other languages, such as R or Python, would certainly work too, but for us they would be a detour that costs efficiency and thus computing time. Speed is precisely the big advantage here.

Which strategies are particularly popular with your customers?

Lotter: It depends on who the customer is. In principle, there are four categories in terms of methods. The first is model-based strategies, which are particularly popular with institutional investors. Here, a market model is specified with certain parameters that indicate market inefficiencies and whose changes over time affect the expected prices. If there are sufficiently large deviations from pure chance, the market is traded accordingly. Classic examples are trend following, mean reversion, market cycles, statistical arbitrage and high-frequency trading. The second category consists of strategies that aim at risk premiums. These include, for example, option-writing strategies that repeatedly generate small profits over long phases, but every now and then also realize the ever-present risk of a large individual loss. These strategies are in demand among both private and institutional investors. The third category is data mining. Today, this mainly means machine learning or deep learning, which is more in demand in the institutional sector. These models are given certain inputs that are expected to influence prices in some way, and the computer is then left to recognize recurring patterns in them. To do this, one "trains" the algorithm with a vast amount of data, for example order book data, classical price data or fundamental data. The disadvantage, however, is that these strategies are a black box: in the end, signals come out, but no one can say exactly how they came about. The fourth major category, which I call "indicator soup," is by far the most popular among private investors. Here, various indicators are combined to generate trading signals without targeting any specific, explainable inefficiency in the market.

The term "indicator soup" doesn't sound very promising. In your experience, how well do such approaches work?

Lotter: To be honest, I thought such strategies were nonsense at first. After all, indicators are derived from market data, i.e. prices and trading volumes, and thus do not contain any further information. On the contrary, they are merely a simplification, often with a time lag as well. But contrary to my initial skepticism, I have to admit that some of these approaches actually work well, which really surprised me.

What are the general success rates of the strategies you test?

Lotter: Around 60 percent of all strategies work, at least according to the backtests. I think that is a positive result, and the number is perhaps even higher than some would expect. However, it includes the successful approaches of institutional investors. Among the strategies of retail investors, on the other hand, hardly anything works better than chance.

Figure 1) Losses in Random Trading
The chart shows a statistical evaluation of the results of four years of random trading with Forex / CFDs. One trade per day and transaction costs of 2.5 pips were assumed. In the end, 79 percent of the 1000 simulated traders were in the red.
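
The setup behind this figure can be approximated with a short simulation. The following is a minimal sketch in Python (chosen here for brevity), not the script that produced the chart. The number of traders, the four years of daily trades and the 2.5 pips of costs come from the caption; the normally distributed daily price move with a standard deviation of 100 pips is purely an assumption, and the function name is hypothetical.

```python
import random

def simulate_random_traders(n_traders=1000, n_trades=1000,
                            daily_move_pips=100.0, cost_pips=2.5, seed=1):
    """Return the fraction of purely random traders who end with a loss."""
    rng = random.Random(seed)
    losers = 0
    for _ in range(n_traders):
        balance = 0.0
        for _ in range(n_trades):                    # about four years of trading days
            direction = rng.choice((-1, 1))          # random long or short
            move = rng.gauss(0.0, daily_move_pips)   # ASSUMED daily price change
            balance += direction * move - cost_pips  # every trade pays the cost
        if balance < 0:
            losers += 1
    return losers / n_traders

if __name__ == "__main__":
    print(f"Share of losing traders: {simulate_random_traders():.0%}")
```

Under these assumptions each trader expects to lose 1000 x 2.5 = 2,500 pips in costs, while the standard deviation of the final balance is about 100 x sqrt(1000), or roughly 3,160 pips. The expected loss is therefore about 0.79 standard deviations, which puts a clear majority of traders in the red. A different volatility assumption shifts that share.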

So you take a positive view even of a 40 percent failure rate.

Lotter: Absolutely. Because through the tests, we can say pretty confidently which approaches don't work in practice. Admittedly, the customers who commissioned these strategies still have to pay for our work, so you could say the whole thing was for nothing. But I don't see it that way: if traders know it's not profitable, they're probably saving a lot of money over the alternative of trying it out in practice to find out.

So you charge for testing the strategies as contract work?

Lotter: Yes, we set a price for it beforehand based on the estimated effort. A distinction has to be made here as to whether an already established strategy is simply programmed one-to-one or whether a strategy really has to be "developed," including optimizations and so on.

How do you proceed with your backtests?

Lotter: About nine out of ten classical backtests lead to wrong or misleading results. This is the main reason why algorithmic trading systems often fail in live trading, and of course we have to take it into account when developing strategies. We start at the beginning of the available data and test and tune the rule set on the first segment. The settings determined there are then applied to the next, unseen data period. This process is repeated several times, so we combine in-sample and out-of-sample periods step by step by "pushing" them through the entire available data history (walk-forward analysis). This mitigates many problems such as over-optimization. However, a distorting selection bias often remains, because a strategy is usually chosen for testing in the first place because it looked potentially profitable. Even with out-of-sample data and walk-forward analysis, the backtest results are therefore on the optimistic side. So we can say: the majority of trading systems with a positive backtest are in fact unprofitable.
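
The walk-forward procedure described here can be sketched in a few lines of Python. This is an illustration under simple assumptions, not the implementation used on the platform: the window lengths, the list of candidate settings and the `score` callback are all hypothetical placeholders.

```python
def walk_forward_splits(n_bars, train_len, test_len):
    """Return (train_start, train_end, test_start, test_end) index tuples.

    End indices are exclusive. Each window is shifted by test_len, so the
    out-of-sample segments tile the history without overlapping.
    """
    splits = []
    start = 0
    while start + train_len + test_len <= n_bars:
        train_end = start + train_len
        splits.append((start, train_end, train_end, train_end + test_len))
        start += test_len
    return splits


def walk_forward(prices, candidates, score, train_len, test_len):
    """Pick the best setting in-sample, then score it on unseen data."""
    results = []
    for a, b, c, d in walk_forward_splits(len(prices), train_len, test_len):
        # In-sample: choose the candidate setting with the highest score.
        best = max(candidates, key=lambda p: score(prices[a:b], p))
        # Out-of-sample: record how that setting does on the next segment.
        results.append((best, score(prices[c:d], best)))
    return results
```

Only the out-of-sample scores collected in `results` count as the backtest result. The in-sample scores are used for selection alone.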

That's amazing. What is the reason for the high error rate?

Lotter: The problem is that you rarely get a zero result in a backtest. A purely random trading strategy will give a negative result 50 percent of the time and a positive result the other 50 percent of the time. But when the result is negative, people often tweak the code, vary the markets, or select different time horizons until the result finally "fits." Each of these changes can improve the backtest purely by chance. That's why there are so many unprofitable strategies that nevertheless do very well in the backtest.
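
The effect of repeated tweaking can be illustrated with a small experiment: if a strategy without any edge is varied many times and only the best variant is kept, that best variant almost always looks profitable. The sketch below assumes coin-flip trades of equal size. The function names and the number of trades are hypothetical.

```python
import random

def random_backtest(rng, n_trades=101):
    """Total profit of a strategy with no edge: every trade is a coin flip.

    An odd number of trades is used so that the total is never exactly zero.
    """
    return sum(rng.choice((-1.0, 1.0)) for _ in range(n_trades))

def share_of_positive_best(n_variants, n_experiments=200, seed=7):
    """How often is the best of n_variants random backtests profitable?"""
    rng = random.Random(seed)
    positive = 0
    for _ in range(n_experiments):
        best = max(random_backtest(rng) for _ in range(n_variants))
        if best > 0:
            positive += 1
    return positive / n_experiments
```

A single random variant is positive about half the time. With 20 variants, the chance that all of them are negative is 0.5 to the power of 20, less than one in a million, so `share_of_positive_best(20)` comes out at or very near 1.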

Can you mitigate that problem?

Lotter: To a large extent, yes. We apply methods for checking the backtest results, which we call reality checks. One example is the Monte Carlo reality check, which removes short-term price correlations and market inefficiencies but retains the long-term trend. In simple terms, this method shuffles the price data and then reassembles it (we do this by drawing from the historical data without replacement). The result is a new, artificial data series that has been "thrown together," so to speak. We then compare our original backtest result with the results on the randomized data. This yields a p-value, i.e. a measure of the probability that our test result was caused by chance. The lower the p-value, the more confidence we can have in the backtest result. In statistics, a result is usually considered significant if the p-value is below five percent. If the strategy works significantly better on the real data than on the shuffled data, it will probably also be profitable in later practical use - at least as long as the market structure does not change completely. So a residual risk always remains. To allow statistical assessments, the simulation is repeated many times, usually 1000 times, so that concrete statements can be derived from the resulting distribution.
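
A minimal sketch of such a check, assuming that the shuffling is applied to the bar-to-bar price changes and that the backtest returns a single performance number. This is an illustration of the idea, not the platform's own script. `momentum_profit` is a hypothetical toy strategy included only so that the example runs.

```python
import random

def shuffled_prices(prices, rng):
    """Rebuild a price curve from its own price changes in random order.

    Drawing without replacement keeps the start and end price (the long-term
    trend) but destroys any short-term serial correlation.
    """
    changes = [b - a for a, b in zip(prices, prices[1:])]
    rng.shuffle(changes)
    curve = [prices[0]]
    for change in changes:
        curve.append(curve[-1] + change)
    return curve

def reality_check_p_value(prices, backtest, n_runs=1000, seed=42):
    """Share of shuffled curves on which the backtest does at least as well."""
    rng = random.Random(seed)
    original = backtest(prices)
    as_good = sum(backtest(shuffled_prices(prices, rng)) >= original
                  for _ in range(n_runs))
    return as_good / n_runs

def momentum_profit(prices):
    """Hypothetical toy strategy: hold the direction of the previous change."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    return sum((1 if prev > 0 else -1) * cur
               for prev, cur in zip(changes, changes[1:]))
```

Calling `reality_check_p_value(prices, momentum_profit)` returns a value between 0 and 1. Following the five percent convention mentioned above, a value below 0.05 would count as significant.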

Figure 2) Monte Carlo Reality Check
The graph shows the Monte Carlo Reality Check for a trend following strategy that really works with high probability. The x-axis shows the profit factor and the y-axis the corresponding frequencies. For 1000 simulations, the p-value for the tested strategy (black bar) was less than one percent.

Can you give us an example of which strategies do not work well?

Lotter: One important aspect is the time horizon, quite independent of the chosen market. Below 1-hour candles, and especially below 30-minute candles, it becomes very difficult; from experience, almost nothing works there. In this range, the noise, i.e. the random component, is simply too large compared to the signal it contains, so the signal can hardly be identified reliably. It is interesting, however, that things change again in the high-frequency range, i.e. for time horizons below one second. There, profitable approaches are possible again, but of course the details are very important. The smallest errors can be very expensive.

And which approaches regularly rank among the best?

Lotter: We have drawn up a ranking list of all the systems that have been studied so far. Our customers clearly prefer some methods and markets over others. To determine the success/failure rate, we used backtests over eight years for non-optimized systems and a walk-forward analysis for optimized systems. A successful system must achieve at least an average annual return of twelve percent for stocks and options or 30 percent for forex, CFDs, or cryptocurrencies. In addition, the statistical coefficient of determination must be above 70 percent. If customers have commissioned a reality check, it must have passed with 95 percent certainty. If any of these criteria were not met, the system was classified as a failure.
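
The pass/fail rule described in this answer can be written down as a small function. This is a sketch: the function and dictionary names are hypothetical, and reading "95 percent certainty" as a p-value of at most 0.05 is an assumption.

```python
# Minimum average annual return per market, as stated in the interview.
MIN_ANNUAL_RETURN = {
    "stocks": 0.12, "options": 0.12,
    "forex": 0.30, "cfds": 0.30, "crypto": 0.30,
}

def is_successful(market, annual_return, r_squared, reality_check_p=None):
    """Classify a tested system as a success (True) or a failure (False).

    reality_check_p is None when the customer did not commission a
    reality check; that criterion is then skipped.
    """
    if annual_return < MIN_ANNUAL_RETURN[market]:
        return False
    if r_squared <= 0.70:                 # coefficient of determination
        return False
    if reality_check_p is not None and reality_check_p > 0.05:  # ASSUMED reading
        return False
    return True
```

For example, a forex system with a 25 percent annual return fails regardless of its other figures, while the same return would be sufficient for a stock system.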

How should these figures be assessed?

Lotter: The statistics are clouded by strategies from trading books and forums, most of which fail either a proper backtest or at least a reality check. However, we got a surprising result with many a strategy from the indicator soup. Normally, one would expect them all to fail big time, since they are not based on a market model. But in fact, almost every third indicator complex was successful, even in the reality check. It is also somewhat surprising that the most complex strategies of all, the data mining systems, which usually use deep learning algorithms, do not perform much better. While they have an acceptable success rate, some of them are outperformed by much simpler strategies. Of all the approaches we have tested so far, the big winners have been the long-term trading strategies for stocks, ETFs and options. In the case of the latter, again, the simpler systems have often done better.

Figure 3) Monte Carlo Fail
For comparison, this graph shows the Monte Carlo reality check for a deliberately designed placebo strategy that very probably does not work. In 1000 simulations, the p-value of this strategy (black bar) was about 40 percent. The Monte Carlo reality check is not always so clear-cut, and not all false backtests can be detected with it, which is also true for walk-forward analysis. But if these methods indicate that the tested strategy does not work, it is better to leave it alone.

One category of profitable approaches is trend-following strategies. The trick here is to take advantage of trends without giving too much back in sideways phases. How can traders achieve this feat?

Lotter: We have developed an algorithm for this, the Market Meanness Index (MMI). It shows whether the market is likely to be in a trend or not, which helps to prevent losses caused by false signals from trend indicators. It is a purely statistical approach that works independently of volatility, trends or cycles in the price curve. For this purpose, the index measures the mean reversion tendency of the market, that is, the tendency of prices to revert to the mean after what appears to be the start of a trend. If this happens too often, trend-following systems will fail.

Can you please explain the concept in more detail?

Lotter: In any series of independent random numbers, each value reverts toward the median of the distribution with a probability of 75 percent. Let's say you have a sequence of random, uncorrelated daily data. If Monday's data point was above the median, 75 percent of the time Tuesday's data will be lower than Monday's. And if Monday's data point was below the median, the probability of Tuesday being higher is also 75 percent. The MMI function now counts the number of pairs of data for which this is true and returns their percentage.

Why 75 percent in particular?

Lotter: It's important to distinguish: it is not the prices themselves that are random, but their changes. Therefore, the MMI function should return a smaller percentage, say 55 percent, when fed with prices. But as far as the changes in prices are concerned, i.e., the returns, there is no correlation between the price change from yesterday to today and the price change from today to tomorrow in a 100 percent efficient market. Thus, if the MMI function is fed with completely random price changes of an efficient market, it yields a value of 75 percent on average. The less efficient the market and the stronger the trend, the lower the MMI value. Therefore, a falling MMI is indicative of a trend. Conversely, a rising MMI indicates that the market is becoming more difficult for trend-following systems.

Figure 4) Market Meanness Index (MMI)
Shown is the application of the MMI to a synthetic price curve that initially moves sideways (black) and then with additional trend (blue). The example clearly shows how the MMI falls when a price trend exists.
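
The counting rule described in the interview can be sketched as follows. This is a minimal Python illustration of the stated idea (count the pairs that move back toward the median and return their percentage), with the data in chronological order. It is not the platform's own MMI script.

```python
import random
from statistics import median

def mmi(data):
    """Market Meanness Index sketch: percentage of mean-reverting pairs."""
    m = median(data)
    reverting = 0
    for prev, cur in zip(data, data[1:]):
        if prev > m and cur < prev:      # above the median, then lower
            reverting += 1
        elif prev < m and cur > prev:    # below the median, then higher
            reverting += 1
    return 100.0 * reverting / (len(data) - 1)

if __name__ == "__main__":
    rng = random.Random(3)
    changes = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # random price changes
    prices = [100.0]
    for change in changes:
        prices.append(prices[-1] + change)                # resulting random walk
    print(f"MMI of changes: {mmi(changes):.1f}")
    print(f"MMI of prices:  {mmi(prices):.1f}")
```

The 75 percent figure follows from a short calculation: a value above the median sits, on average, at the 75th percentile of the distribution, so an independent next value is lower with a probability of 75 percent on average. The same argument applies below the median. For independent random changes the function should therefore return a value close to 75, and a clearly lower value for the price curve built from them.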

Do some traders also use a portfolio of individual strategies that complement each other?

Lotter: Yes, relatively often. And this approach works well. This is to be expected when combining several strategies that are profitable on their own, more or less uncorrelated with each other and/or targeting different time horizons. However, such a portfolio of strategies also needs to be reviewed and adjusted from time to time, for example if one of the included strategies no longer works.

How can traders tell if the losses are temporary?

Lotter: This is an important question. Several reasons can cause a strategy to lose money initially. It may already be outdated because the underlying inefficiency in the market has disappeared. Or the system is worthless and the test has been distorted by a bias that has survived all reality checks. Or it is just a normal drawdown of an otherwise profitable strategy that one simply has to sit out. Such drawdowns are the reason why you need some initial capital to trade in the first place, apart from margin requirements and trading costs. The basic problem is that you can never fully trust the backtest results. The simplest, classic method of assessing the ongoing viability of a trading strategy is based on the maximum drawdown. To do this, you set the current drawdown in relation to the maximum drawdown of the test period, also taking into account its duration and the length of the time horizon. But this method has a catch: it is a late and inaccurate signal.
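
The classic drawdown comparison can be sketched as follows. It assumes that the current live drawdown is compared with the maximum drawdown of the backtest and, for simplicity, ignores the duration and time horizon mentioned in the answer. All function names are hypothetical.

```python
def max_drawdown(equity):
    """Largest peak-to-valley decline of an equity curve."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)            # highest equity seen so far
        worst = max(worst, peak - value)   # deepest decline from that peak
    return worst

def current_drawdown(equity):
    """Decline of the latest equity value from the highest value so far."""
    return max(equity) - equity[-1]

def drawdown_ratio(backtest_equity, live_equity):
    """Live drawdown relative to the backtest maximum; above 1 is a new record."""
    return current_drawdown(live_equity) / max_drawdown(backtest_equity)
```

A ratio above 1 means the live drawdown already exceeds anything seen in the backtest. As the answer notes, by the time that happens the signal comes late.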

Are there other ways?

Lotter: To find out whether you should stop a strategy immediately, we calculate the deviation of the current live trading situation from the behavior of the strategy in the backtest. For this, we do not use the maximum drawdown, but the backtest capital curve. The result is the so-called "Cold Blood Index", which compares the similarity of the live situation with the backtest and gives a probability value that everything is still within the historical expectation. Accordingly, as long as the drawdown is considered statistically expectable, the strategy should continue to be pursued. If, on the other hand, the deviations exceed a certain threshold, it can be concluded that the strategy is highly unlikely to work anymore.
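
The underlying idea can be illustrated with a deliberately simplified estimate: slide a window of the same length as the live period over the backtest equity curve and count how often the backtest lost at least as much. This is only a sketch of the principle under that assumption, not the actual Cold Blood Index formula, and the function name is hypothetical.

```python
def drawdown_probability(backtest_equity, live_days, live_loss):
    """Share of backtest windows of live_days length that lost at least live_loss.

    A simplified stand-in for the idea of comparing live results with the
    backtest equity curve; the real index is more elaborate.
    """
    n_windows = len(backtest_equity) - live_days
    if n_windows <= 0:
        raise ValueError("backtest must be longer than the live period")
    as_bad = sum(
        backtest_equity[i + live_days] - backtest_equity[i] <= -live_loss
        for i in range(n_windows)
    )
    return as_bad / n_windows
```

A value near zero means the backtest never showed a comparable loss over such a period, which would argue for stopping the strategy. A larger value suggests the drawdown is still within what the backtest would lead one to expect.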

Can traders also use your platform for trading themselves?

Lotter: The Zorro platform is freely available for development, demo trading and live execution of trading signals, both semi- and fully automated. It also includes many pre-built scripts for such things as the walk-forward analysis described above, the Monte Carlo Reality Check and the Market Meanness Index. Most traders run the scripts on a rented server, so that they do not have to keep their home computer online all the time. In many cases, the scripts could also run at the broker, but here there is often skepticism as to whether everything is really above board. You simply have a better feeling if the broker only receives the orders and does not have access to the signals behind them.

Do you also trade on the stock exchange yourself?

Lotter: If I trade myself, it's only for testing purposes. I don't consider myself suitable for trading and am far too risk-averse for it. In addition, the strategies we develop are on behalf of clients and are therefore not used elsewhere. Privately, I simply run a long-term portfolio of stocks and ETFs that I rebalance regularly. For my taste, that's quite enough. And that also frees up my mind for my work.

In contrast, do you sometimes perceive the world of trading as crazy?

Lotter: Not anymore. I have since heard and seen a lot, and I have been proven wrong on some points, such as the profitability of some indicator-based strategies. But in the beginning, the whole thing was quite bizarre for me, comparable to medieval alchemy. It's a completely different world from the one I knew before, because everything is in flux and always changing. And I've learned that individual traders have a very different sense of risk than I do. That is perhaps the most important thing you have to be able to do as a trader or investor: assess yourself well. This is the only way to find the personally "right" path on the stock market, one that is also feasible for you in the long term.

The article was first published in TRADERS' magazine. You can read the complete February 2023 issue (German only) here free of charge. You can also test TRADERS' magazine digital without obligation and read three issues completely free of charge.

Would you like to use this article - in full or in part - for your purposes? Then please consider the following Creative Commons license.