Abstract :
We present an aggregation of causal identifiability techniques and their assumptions, as advanced in the extant literature, for datasets of odd origins: datasets that do not necessarily conform to the independent and identically distributed (i.i.d.), multinomial, or Gaussian settings. The data-generating process can sometimes yield datasets of the following forms: linear and non-Gaussian, nonlinear and non-Gaussian, datasets with missing values, datasets tainted by selection bias, datasets whose variables form cycles, datasets with heterogeneous or nonstationary variables, datasets with confounding or latent variables, time-series datasets, deterministic datasets, etc. Following the introduction, section 2 gives the basic background on the concept of causality with observational data. Section 3 explicates the concept of a graph as an embodiment of background knowledge, together with the structural causal model (SCM). Section 4 sets out the basic assumptions employed, especially in common observational data settings, and presents a categorization of the algorithms used in causality. Section 5 aggregates and expounds the causal identifiability techniques and their associated assumptions across varying datasets; this is the crux of the study, and a recapitulation is presented in Table 1. The main contribution of this study is an aggregate review of causal techniques and their assumptions across different data settings, especially data settings of odd origins, as reviews of this kind are grossly lacking in the extant literature.
Keywords :
causal identifiability techniques, causality, observational datasets