A path to causal inference

The aim of data analytics is to infer the relationships between variables in a system in order to predict and/or control that system. For example, we may wish to understand the relationship between a stock’s return and its volatility in order to profit from changes in these variables or to reduce risk. If we knew that the stock price went up whenever volatility went down, we could profit by buying the stock when we first saw signs of a drop in volatility.

Unfortunately, it is often not possible to directly infer causation because we are (usually) unable to directly perturb a given system, which is the only way to unambiguously determine causation. Consequently, we often settle for analysing simple correlations, which act as a crude proxy for causation. Just how crude can be seen in the chart from Spurious Correlations, which shows the total revenue generated by arcades in the US against the number of computer science doctorates. Whilst there is a 98.5% correlation, there is no plausible causal mechanism.

from http://www.tylervigen.com/spurious-correlations
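To see how easily a high correlation can arise without any causal link, here is a minimal sketch using NumPy. The two series are synthetic and purely illustrative (they are not the arcade or doctorate data); they share nothing except an upward trend, and all names and parameter values are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
t = np.arange(200)

# Two synthetic series that share only an upward trend;
# their noise terms are drawn independently.
series_a = 1.0 + 0.01 * t + rng.normal(0.0, 0.2, size=t.size)
series_b = 500.0 + 5.0 * t + rng.normal(0.0, 100.0, size=t.size)

# Correlation of the levels is dominated by the common trend.
corr_levels = np.corrcoef(series_a, series_b)[0, 1]

# Correlation of the period-to-period changes removes the trend
# and leaves only the (independent) noise.
corr_changes = np.corrcoef(np.diff(series_a), np.diff(series_b))[0, 1]

print(f"correlation of levels:  {corr_levels:.2f}")
print(f"correlation of changes: {corr_changes:.2f}")
```

The levels appear strongly related, whereas the changes do not, even though neither series has any influence on the other.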

There are alternative approaches to inferring causation, such as developing a mechanistic model, as is done in epidemiology, or building an experiment, as is done in behavioural economics. The former requires a good understanding of the underlying process, whilst the latter requires strict controls to prevent external influences whilst keeping the experiment as realistic as possible. Another approach is to consider causation in a statistical sense, as is done with Granger causality.

Granger causality is based upon the premise that a process X strictly Granger causes another process Y if future values of Y can be better predicted using the past values of both X and Y than using the past values of Y alone. This notion was originally introduced by Wiener (1956) and later formalized in terms of linear auto-regression by Granger (1969). As stated by Barnett et al. (2009), “identifying Granger causality is not identical to identifying a physically instantiated causal interaction in a system; this can only be unambiguously identified by perturbing the system. Instead, it is a causal relation in a statistical sense.”

A major problem with Granger causality is that most real problems, such as the return–volume relation, are nonlinear (Hiemstra and Jones, 1994; Chuang et al., 2009). This led researchers such as Baek and Brock (1992) and Hiemstra and Jones (1994) to develop nonlinear extensions to Granger causality. The Hiemstra and Jones (1994) test is now the most commonly used method among practitioners in finance and economics. Unfortunately, Diks and Panchenko (2005) show that this measure may not actually test Granger causality, and they identify numerous situations in which the test fails. An alternative approach is to use a truly nonlinear and nonparametric method such as information theory.
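To make the linear auto-regressive formulation of Granger causality concrete, here is a minimal sketch, assuming NumPy and ordinary least squares. It compares the residual sum of squares of a model that uses only the past of the target with one that also uses the past of the candidate cause. The function names, the single lag and the toy data are assumptions made for illustration; this is not a full statistical test (no significance level is computed).

```python
import numpy as np

def lagged_matrix(series, lags):
    """Columns are series[t-1], ..., series[t-lags] for t = lags .. n-1."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])

def granger_log_ratio(cause, effect, lags=1):
    """log(RSS using own past only / RSS using own past and past of cause).

    Values well above zero mean the past of `cause` improves the linear
    prediction of `effect`; values near zero mean it does not.
    """
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    target = effect[lags:]
    intercept = np.ones((len(target), 1))
    own_past = np.hstack([intercept, lagged_matrix(effect, lags)])
    both_pasts = np.hstack([own_past, lagged_matrix(cause, lags)])

    rss = []
    for design in (own_past, both_pasts):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        rss.append(residual @ residual)
    return float(np.log(rss[0] / rss[1]))

# Toy data (assumed): y depends on the previous value of x, but not vice versa.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + 0.5 * rng.normal(size=n - 1)

x_to_y = granger_log_ratio(x, y)
y_to_x = granger_log_ratio(y, x)
print(f"x -> y: {x_to_y:.3f}")
print(f"y -> x: {y_to_x:.3f}")
```

In this toy setting the statistic is clearly positive from x to y and close to zero in the reverse direction. Because both models are linear, a purely nonlinear dependence could go undetected, which is the limitation discussed above.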

Information theory was originally developed by Shannon (1948) to study problems in signal processing, such as data compression. However, it is now widely used in the physical sciences for problems such as statistical inference, owing to its ability to analyse nonlinear statistical dependencies and higher-order moments of the distribution.

Mutual information (MI) is a popular measure in the field of information theory. MI gives the mutual reduction in uncertainty of one variable given another. For example, one can calculate the reduction in uncertainty of the daily return at time t from knowing the daily volume at time t. If there is no reduction in uncertainty, then the daily returns and volumes are statistically independent. Unfortunately, since this measure is symmetric under the exchange of variables, it can only determine whether two variables are related, not which one influences the other. However, if one wishes to infer a direction of causation, one can simply add a time lag to one variable; this assumes that a causal effect cannot propagate backwards through time. For example, one can find the reduction in uncertainty in the daily return at time t+1 given the daily volume at time t, and vice versa. If there is a reduction in uncertainty in only one direction, or the reduction is substantially larger in one direction, this suggests that one variable is strongly influencing or causing the changes in the other.
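The time-lagged mutual information described above can be sketched as follows, assuming NumPy and a simple histogram (plug-in) estimator with equal-width bins. The estimator choice, the bin count, the function names and the toy "volume" and "return" series are all assumptions made for illustration; plug-in estimates carry a small positive bias, so values near zero should be read as "no detectable dependence".

```python
import numpy as np

def mutual_information(a, b, bins=8):
    """Plug-in (histogram) estimate of the mutual information I(A; B) in bits."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)  # marginal of A, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)  # marginal of B, shape (1, bins)
    nonzero = p_ab > 0
    return float(np.sum(p_ab[nonzero] * np.log2(p_ab[nonzero] / (p_a @ p_b)[nonzero])))

def lagged_mi(source, target, lag=1, bins=8):
    """I(source at time t ; target at time t + lag)."""
    return mutual_information(source[:-lag], target[lag:], bins=bins)

# Toy data (assumed): the return at t+1 depends nonlinearly on the volume at t.
rng = np.random.default_rng(2)
n = 5000
volume = rng.normal(size=n)
ret = rng.normal(scale=0.3, size=n)
ret[1:] += volume[:-1] ** 2

mi_volume_leads = lagged_mi(volume, ret)   # volume(t) -> return(t+1)
mi_return_leads = lagged_mi(ret, volume)   # return(t) -> volume(t+1)
linear_corr = np.corrcoef(volume[:-1], ret[1:])[0, 1]

print(f"MI, volume leading return: {mi_volume_leads:.3f} bits")
print(f"MI, return leading volume: {mi_return_leads:.3f} bits")
print(f"linear correlation, volume(t) vs return(t+1): {linear_corr:.3f}")
```

In this toy example the dependence is quadratic, so the linear correlation is close to zero whilst the lagged MI is clearly larger in the volume-to-return direction than in the reverse direction.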

Alternatively, one can use an asymmetric measure such as Transfer Entropy (TE) (Schreiber, 2000), an information theoretic measure of time-directed information transfer between jointly dependent processes. Barnett et al. (2009) state that TE is not framed in terms of prediction but in terms of resolution of uncertainty. The TE from Y to X is the degree to which Y disambiguates the future of X beyond the degree to which X already disambiguates its own future. This parallels the notion of Granger causality. In fact, Barnett et al. (2009) show that TE is equivalent to Granger causality for Gaussian distributed variables, and Hlaváčková-Schindler (2011) extended this to variables following exponential Weinman, log-normal and certain parametrizations of generalized Gaussian distributions.
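As a rough illustration of the definition above, the sketch below estimates TE with a history length of one step, assuming NumPy, equal-frequency binning and plug-in entropies. It uses the identity TE(source to target) = H(X_{t+1}, X_t) - H(X_t) - H(X_{t+1}, X_t, S_t) + H(X_t, S_t), where X is the target and S the source. The discretization, the bin count, the function names and the toy data are assumptions made for this example, not the estimator used in any of the cited papers.

```python
import numpy as np

def discretize(series, bins=4):
    """Map values to integer symbols 0..bins-1 using equal-frequency bins."""
    edges = np.quantile(series, np.linspace(0, 1, bins + 1)[1:-1])
    return np.searchsorted(edges, series)

def entropy(*symbols):
    """Plug-in joint Shannon entropy (bits) of one or more symbol sequences."""
    joint = np.column_stack(symbols)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def transfer_entropy(source, target, bins=4):
    """TE from source to target with one step of history on each side."""
    s = discretize(source, bins)
    x = discretize(target, bins)
    x_next, x_now, s_now = x[1:], x[:-1], s[:-1]
    # Uncertainty about x_next given x_now, minus that given (x_now, s_now).
    return (entropy(x_next, x_now) - entropy(x_now)
            - entropy(x_next, x_now, s_now) + entropy(x_now, s_now))

# Toy data (assumed): y is driven by the previous value of x, but not vice versa.
rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + 0.5 * rng.normal(size=n - 1)

te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
print(f"TE x -> y: {te_x_to_y:.3f} bits")
print(f"TE y -> x: {te_y_to_x:.3f} bits")
```

Unlike the lagged MI, the measure conditions on the target's own past, so it only credits the source with information that the target's history does not already provide.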

This article is based upon a recent paper I published in the journal Applied Economics entitled “An information theoretic analysis of stock returns, volatility and trading volumes” (Ong, 2015). In this paper I used information theory to show that the observed negative correlation between a stock’s returns and its volatility (known as the Leverage Effect; Black, 1976) is driven by trading volumes; this supports previous research by Avramov et al. (2006). This is important for trading and risk management purposes and supports the idea of a behavioural explanation for the Leverage Effect.


Avramov, D., Chordia, T. and Goyal, A. (2006) The impact of trades on daily volatility, Review of Financial Studies, 19, 1241–77

Baek, E. and Brock, W. (1992) A general test for nonlinear Granger causality: bivariate model, Working Paper, Iowa State University and University of Wisconsin at Madison

Barnett, L., Barrett, A. B. and Seth, A. K. (2009) Granger causality and transfer entropy are equivalent for Gaussian variables, Physical Review Letters, 103, 238701

Black, F. (1976) Studies in stock market volatility changes, in Proceedings of the 1976 Meeting of the Business and Economics Statistics Section, American Statistical Association, Alexandria, VA, pp. 177–81

Chuang, C.-C., Kuan, C.-M. and Lin, H.-Y. (2009) Causality in quantiles and dynamic stock return-volume relations, Journal of Banking & Finance, 33, 1351–60

Diks, C. and Panchenko, V. (2005) A note on the Hiemstra-Jones test for Granger non-causality, Studies in Nonlinear Dynamics and Econometrics, 9, 1558–3708

Hiemstra, C. and Jones, J. D. (1994) Testing for linear and nonlinear Granger causality in the stock price-volume relation, The Journal of Finance, 49, 1639–64

Hlaváčková-Schindler, K. (2011) Equivalence of Granger causality and transfer entropy: a generalization, Applied Mathematical Sciences, 5, 3637–48

Ong, M. (2015) An information theoretic analysis of stock returns, volatility and trading volumes, Applied Economics, 47, 3891–906

Schreiber, T. (2000) Measuring information transfer, Physical Review Letters, 85, 461–4

Shannon, C. E. (1948) A mathematical theory of communication, Bell System Technical Journal, 27, 379–423