Lean Experiments

In theory there is no difference between theory and practice. In practice there is.
— Yogi Berra
The Scientific Method: A good idea is a hypothesis that has been validated in an experiment.

Human brains are wired for storytelling and for fight-or-flight decisions, not statistical uncertainty. We tell ourselves simple stories to explain complex things we do not and, most importantly, cannot know. The truth is that we have no idea which way the market is going most of the time, and whatever prediction we make is sure to be oversimplified, if not flat-out wrong. Unaided, most human traders perform poorly in today’s electronic markets.

This understanding is, of course, nothing new. Four hundred years ago, Francis Bacon warned that our minds not only struggle to interpret the world around us but are also wired to deceive us. “Beware the fallacies into which undisciplined thinkers most easily fall – they are the real distorting prisms of human nature.” Indeed, the human mind is programmed toward “assuming more order than exists in chaotic nature.” We are especially susceptible to self-deception when it comes to statistics. The problem exists because we place too much weight on the odds that past events will repeat, when unrepeatable chance is a better explanation.

As an extension of common sense, mathematical reasoning allows us to see the hidden structures underneath the messy and chaotic surface of our world. It’s a science of not being wrong, hammered out by centuries of hard work and argument. Humans are prone to confirmation bias – trying to confirm what we want to be true. So a good antidote to human biases would be: instead of looking for reasons why you could be right, look for reasons why you could be wrong. In other words, determine where your biases are and attempt to remove them.

Taking the above into consideration, we now have a reasonable basis for running experiments that screen for good ideas, i.e., hypotheses that have been validated in a battery of experiments. But how do we formulate such hypotheses in the first place? And where do we get ideas, and lots of them?

One approach, used by firms like Two Sigma Investments, involves programming machines to cull torrents of information from sources like newswires, earnings reports, weather bulletins, and Twitter, and then building trading models and algorithms that make trading decisions based on “signals” extracted from those data. Practitioners of this newer approach to quantitative investing differ from traditional “quants,” who program their machines to bet on statistical relationships among security prices but don’t bother much with real-world information. Instead, the goal of this newer breed of practitioners is to gain an advantage over human fund managers by writing algorithms that are smarter and faster than any human could be at scouring the world’s information, finding patterns, and making trading decisions in stocks, bonds, options, futures, or currencies.

To see how models are built using this approach, let’s consider supply and demand for coconuts in our island economy. Let's suppose that our island economy has by now developed the concept of money and has its own unit of currency. How the coconut market behaves is determined by the elasticities of supply and demand which, respectively, tell us how price sensitive coconut growers and coconut buyers are. Now suppose that from time to time there are spells of bad weather which make growing coconuts difficult. A casual observer on the island would notice that when the weather gets bad, the price of coconuts rises and the quantity produced and purchased falls. When the weather is good, prices are low and quantities are high. This islander will notice such patterns and the patterns will become part of her beliefs about the world she lives in. Of course, the islander may not understand why this pattern exists – she merely understands that the pattern does exist.

Now, let’s extend our supply and demand model a bit. Let’s now suppose that weather conditions on the island are somewhat persistent from year to year. If the weather is bad one year then it is likely to be bad the next year. In this case, when prices are high one year, they will tend to be high the next year. High prices this year, for example, mean that the weather must have been bad recently. Again, our casual observer will incorporate this pattern into her beliefs and again she would not be required to understand why this pattern exists. Suppose we add a futures market which co-exists with the coconut market. The futures market on the island operates every Friday and sells claims on future coconuts.

Suppose now the island soothsayer comes up with a model which explains the quantity and price variations in terms of supply and demand. Unbeknownst to this soothsayer, the model is actually true. The model provides a meaningful and accurate description of how the coconut market works. However, the model is not particularly useful for predicting future prices. The model says that if there is an adverse shift in supply, then quantities should fall and prices should rise. The sizes of the quantity and price changes are governed by two parameters, i.e., the elasticities of supply and demand. However, predicting future prices in this environment boils down to predicting the weather. On that score, the supply and demand model, despite being true, is of little help.
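To make the soothsayer's model concrete, here is a minimal sketch that assumes a log-linear market: demand is q = a - eps_d * p and supply is q = b + eps_s * p - shock, where p and q are log price and log quantity. The functional form, the parameter names, and every number below are our own illustrative assumptions, not part of the island story.

```python
def equilibrium(a, b, eps_d, eps_s, shock=0.0):
    """Solve a hypothetical log-linear coconut market.

    Demand: q = a - eps_d * p
    Supply: q = b + eps_s * p - shock   (shock > 0 stands for bad weather)
    Returns the market-clearing (log price, log quantity).
    """
    # Setting demand equal to supply and solving for p.
    p = (a - b + shock) / (eps_d + eps_s)
    q = a - eps_d * p
    return p, q

good_weather = equilibrium(a=5.0, b=1.0, eps_d=0.5, eps_s=1.5)
bad_weather = equilibrium(a=5.0, b=1.0, eps_d=0.5, eps_s=1.5, shock=1.0)

print(good_weather)  # (2.0, 4.0)
print(bad_weather)   # (2.5, 3.75)
```

The adverse supply shift raises the price and lowers the quantity, and the size of each move depends only on the two elasticities. The sketch says nothing about when the shock will arrive, which is why the model is of little help for forecasting.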

In contrast, quantifying the observable patterns in the data is definitely helpful for the purpose of forecasting future prices. In fact, the current price contains valuable information on the likely future price. A simple regression of the current price on the past price will provide futures market participants with enough information to price bets on future prices. If quantities and prices are measured with error, then the best forecast will make use of both quantity and price to predict the future price. In this environment, futures traders have no use for the supply and demand model, even though it provides key insights into how this coconut market works.
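The simple regression described above can be sketched as follows. The simulated data are an assumption made purely for illustration: weather follows a persistent process with a hypothetical persistence of 0.8, and the price is an average of 10 plus the weather effect. Only the standard library is used.

```python
import random

def simulate_prices(n, rho=0.8, seed=0):
    """Simulate coconut prices driven by persistent weather (hypothetical values)."""
    rng = random.Random(seed)
    weather = 0.0  # values above zero mean worse-than-usual weather
    prices = []
    for _ in range(n):
        # Weather persists: a fraction rho carries over, plus a fresh shock.
        weather = rho * weather + rng.gauss(0.0, 1.0)
        prices.append(10.0 + weather)  # bad weather pushes the price up
    return prices

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

prices = simulate_prices(5000)
# Regress the current price (y) on the previous price (x).
intercept, slope = fit_line(prices[:-1], prices[1:])
forecast = intercept + slope * prices[-1]
print(f"slope={slope:.2f}, next-period forecast={forecast:.2f}")
```

The fitted slope should land near the assumed persistence of the weather, even though the regression knows nothing about weather, supply, or demand.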

There is a caveat in that a change in the weather pattern on the island would render useless the statistical patterns that had prevailed in the past. If there were a subsidy to coconut growers on the island, the island soothsayer's supply and demand model would correctly predict that the average quantities would rise and average prices would fall. So the soothsayer's model is useful after all, but only at times when the regime changes. During normal times, ad hoc statistical forecasting methods – methods devoid of any structural economic content but which have substantial predictive power – work reasonably well in making market predictions.
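A rough sketch of this caveat, under our own assumptions: prices are an average level plus persistent weather, and a hypothetical subsidy lowers the average level from 10 to 8. A regression fitted before the subsidy keeps forecasting as if nothing had changed.

```python
import random

def simulate(level, n, rho=0.8, seed=0):
    """Prices = average level + persistent weather effect (hypothetical values)."""
    rng = random.Random(seed)
    weather, prices = 0.0, []
    for _ in range(n):
        weather = rho * weather + rng.gauss(0.0, 1.0)
        prices.append(level + weather)
    return prices

def fit_ar1(prices):
    """Least-squares fit of price[t+1] = intercept + slope * price[t]."""
    x, y = prices[:-1], prices[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mean_forecast_error(prices, intercept, slope):
    """Average of (forecast - actual) for one-step-ahead forecasts."""
    errors = [(intercept + slope * p0) - p1
              for p0, p1 in zip(prices[:-1], prices[1:])]
    return sum(errors) / len(errors)

old_regime = simulate(level=10.0, n=4000, seed=1)
new_regime = simulate(level=8.0, n=4000, seed=2)  # after the assumed subsidy

intercept, slope = fit_ar1(old_regime)
bias_old = mean_forecast_error(old_regime, intercept, slope)
bias_new = mean_forecast_error(new_regime, intercept, slope)
print(f"bias before: {bias_old:.3f}, bias after: {bias_new:.3f}")
```

Before the regime change the forecasts are unbiased by construction. Afterwards they sit persistently above the realized prices, and nothing in the regression itself explains why. The structural model is what tells us the average price should have fallen.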

With the increasing reliance on such a “big data approach,” critics have warned of the dangers of “backtest overfitting,” in which random correlations are wrongly interpreted as strong relationships, and of placing big bets on “spurious” relationships that do not exist in the real world. Such criticisms can be easily understood through the lens of the coconut market example above. “Pseudo-mathematics,” according to David Bailey, a research fellow at the University of California, Davis, “is a large part of the reason why so many algorithmic and systematic hedge funds do not live up to the elevated expectations generated by their managers.” Of course, not all firms are alike. At firms that are set up to work like laboratories, scientifically rigorous methodologies are applied, and one can easily tell from the results.
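Backtest overfitting is easy to reproduce. The sketch below is a toy illustration under our own assumptions, not a description of any firm's process: it generates 500 "strategies" whose daily returns are pure noise, picks the one with the best in-sample Sharpe ratio, and then checks the same strategy on fresh data.

```python
import random
import statistics

def sharpe(returns):
    """Per-period Sharpe ratio: mean return divided by its standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)

def best_of_random(n_strategies=500, n_days=250, seed=0):
    """Select the best in-sample strategy among pure-noise candidates."""
    rng = random.Random(seed)
    best_in_sample, best_out_of_sample = float("-inf"), None
    for _ in range(n_strategies):
        # Each "strategy" is noise: its true expected return is zero.
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        out_of_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        score = sharpe(in_sample)
        if score > best_in_sample:
            best_in_sample = score
            best_out_of_sample = sharpe(out_of_sample)
    return best_in_sample, best_out_of_sample

in_s, out_s = best_of_random()
print(f"best in-sample Sharpe: {in_s:.3f}, same strategy out of sample: {out_s:.3f}")
```

The winner looks attractive in the backtest only because it was selected from many tries. Out of sample its Sharpe ratio is just another draw around zero, which is the "spurious" relationship the critics describe.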

Look Ma! No Hands!

A more interesting approach to consider, we think, might be to constrain the universe of possible relationships based on a priori domain knowledge so as to allow an automated feedback loop to be introduced into the process workflow. For example, in the domain of currency trading, we might consider a universe of macroeconomic relationships amongst exchange rates, interest rates, capital flow, economic growth, inflation, unemployment, savings, and investment. Correlations and/or causal chains can be traced among these concepts based on our structural understanding of macroeconomics, augmented with published data and statistics. The corresponding strength of relationships can then be objectively measured and their significance ranked for automated decision-making by various currency trading models in the system.
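As a rough sketch of the measuring and ranking step, the code below scores a small, pre-specified set of candidate drivers of an exchange-rate return by their correlation t-statistics. The series names, the coefficients, and the synthetic data are hypothetical placeholders, and correlation is only one of several possible measures of relationship strength.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_stat(r, n):
    """t-statistic for testing whether a correlation differs from zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def rank_relationships(target, candidates):
    """Rank candidate drivers by the absolute t-statistic of their correlation."""
    n = len(target)
    scored = []
    for name, series in candidates.items():
        r = pearson(series, target)
        scored.append((name, r, t_stat(r, n)))
    return sorted(scored, key=lambda row: abs(row[2]), reverse=True)

def synthetic_data(n=1000, seed=0):
    """Made-up data: one strong driver, one weak driver, one unrelated series."""
    rng = random.Random(seed)
    rate = [rng.gauss(0.0, 1.0) for _ in range(n)]
    inflation = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    fx = [0.8 * r + 0.3 * i + rng.gauss(0.0, 1.0) for r, i in zip(rate, inflation)]
    candidates = {
        "rate_differential": rate,
        "inflation_differential": inflation,
        "unrelated_noise": noise,
    }
    return fx, candidates

fx_return, candidates = synthetic_data()
for name, r, t in rank_relationships(fx_return, candidates):
    print(f"{name:24s} r={r:+.2f}  t={t:+.1f}")
```

Because the candidate list is fixed in advance from domain knowledge, the number of relationships being tested stays small, which limits the room for spurious discoveries compared with an unconstrained search.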

We feel that trading strategies based on a deeper understanding of the logical cause-and-effect economic relationships that drive markets are a more prudent approach. Within such a constrained yet knowledge-rich universe, we can confidently let the machines perform independent experiments. That is, machines can come up with macroeconomics-driven hypotheses based on recognized data patterns, test them with powerful computers, interpret findings without human guidance, and learn to make improved hypotheses in the next iteration. In other words, machines drive the entire process of running scientific experiments on multiple sources of real-time market data, with human oversight but no human intervention. As a key benefit, continuous improvements to the trading models can be made in real time based on performance results even as real-world events unfold.

Looking ahead, we believe the re-invention of invention in computational finance can be realized by expanding the epistemic base of financial technology so that it delivers improved quantitative tools for the integrated study of a wide range of financial trading models embedded in a common macroeconomic simulation framework. This would in turn deepen our understanding of computational finance within the context of quantitative macroeconomics.

The problem with QE is that it works in practice, but it doesn’t work in theory.
— Ben Bernanke (2014)


  1. Hope, Bradley (2014, April 1). How Computers Trawl a Sea of Data for Stock Picks. Wall Street Journal. Retrieved from: http://www.wsj.com/articles/how-computers-trawl-a-sea-of-data-for-stock-picks-1427941801
  2. Strasburg, Jenny (2012, May 21). Computer Trading Takes Human Turn. Wall Street Journal. Retrieved from: http://www.wsj.com/articles/SB10001424052702304791704577418363843211828
  3. Schrage, Michael (2014). The Innovator’s Hypothesis: How Cheap Experiments are Worth More than Good Ideas. MIT Press.
  4. Ellenberg, Jordan (2014). How Not to Be Wrong: The Power of Mathematical Thinking. Penguin Press.