How to Make Causal Inferences Using Texts
Authors
Abstract
New text-as-data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories of interest from large collections of text. We introduce a conceptual framework for making causal inferences with discovered measures as a treatment or outcome. Our framework enables researchers to discover high-dimensional textual interventions and estimate the ways that observed treatments affect text-based outcomes. We argue that nearly all text-based causal inferences depend upon a latent representation of the text, and we provide a framework to learn the latent representation. But estimating this latent representation, we show, creates new risks: we may introduce an identification problem or overfit. To address these risks we describe a split-sample framework and apply it to estimate causal effects from an experiment on immigration attitudes and a study on bureaucratic response. Our work provides a rigorous foundation for text-based causal inferences.
∗We thank Edo Airoldi, Peter Aronow, Matt Blackwell, Sarah Bouchat, Chris Felton, Mark Handcock, Erin Hartman, Rebecca Johnson, Gary King, Ian Lundberg, Rich Nielsen, Thomas Richardson, Matt Salganik, Melissa Sands, Fredrik Sävje, Arthur Spirling, Alex Tahk, Endre Tvinnereim, Hannah Waight, Hanna Wallach, Simone Zhang and numerous seminar participants for useful discussions about making causal inferences with texts. We also thank Dustin Tingley for early conversations about potential SUTVA concerns with respect to STM and about sequential experiments as a possible way to address them.
†Ph.D. Candidate, Department of Politics, Princeton University
‡Ph.D. Candidate, Graduate School of Business, Stanford University
§Associate Professor, Department of Political Science, University of Chicago, JustinGrimmer.org
¶Assistant Professor, Department of Political Science, University of California San Diego
‖Assistant Professor, Department of Sociology, Princeton University, brandonstewart.org
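The split-sample idea mentioned in the abstract can be illustrated with a minimal sketch: discover a text measure on one half of the documents, then hold that measure fixed and estimate the treatment effect only on the other half. Everything below is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the responses are fabricated toy data, and the "most treatment-associated word" rule is a crude stand-in for whatever latent representation (for example a topic model) a researcher would actually fit.

```python
import random

def stratified_halves(docs, seed=0):
    """Randomly split (text, treatment) pairs into two halves within each arm."""
    rng = random.Random(seed)
    train, test = [], []
    for arm in (0, 1):
        group = [d for d in docs if d[1] == arm]
        rng.shuffle(group)
        half = len(group) // 2
        train += group[:half]
        test += group[half:]
    return train, test

def discover_measure(train):
    """Discovery step (toy stand-in): pick the word whose usage rate differs
    most between treated and control documents in the training half."""
    def rate(word, arm):
        group = [text.split() for text, t in train if t == arm]
        return sum(word in words for words in group) / len(group)
    vocab = sorted({w for text, _ in train for w in text.split()})
    best = max(vocab, key=lambda w: abs(rate(w, 1) - rate(w, 0)))
    # The returned function maps any text to a 0/1 outcome and is fixed from here on.
    return best, (lambda text: int(best in text.split()))

def difference_in_means(test, g):
    """Estimation step: apply the fixed measure g to the held-out half only."""
    treated = [g(text) for text, t in test if t == 1]
    control = [g(text) for text, t in test if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Fabricated toy responses, not data from the paper.
docs = [("people worry about crime", 1)] * 4 + [("people talk about work", 0)] * 4
train, test = stratified_halves(docs)
word, g = discover_measure(train)
effect = difference_in_means(test, g)
print(word, effect)
```

Because the measure is chosen without looking at the held-out half, the estimate on that half is not contaminated by the search over candidate measures, which is the overfitting risk the abstract describes.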
Similar resources
Syntactic and Causal Constraints on the Necessity of Conditional Inferences by Readers
The data of three experiments (Campion, in press) confirm that the readers of texts including conditional arguments process the conditional syntax as an asymmetric constraint which warrants the Modus Ponens, a logically valid inference. However, causal knowledge can raise doubt about that inference and warrant the validity of the reciprocal inference (Affirmation of the Consequent). Thus, accor...
Close Does Count: Evidence of a Proximity Effect in Inference from Causal Knowledge
Two studies are reported in which participants drew inferences about variables in systems of causal relationships. Previous work has shown that such inferences are influenced by information about variables that, on a normative account of causal reasoning, should be irrelevant. The present studies tested two hypotheses about how relevance is assigned to these normatively irrelevant variables. Th...
Transitive Reasoning Distorts Induction in Causal Chains
A probabilistic causal chain A→B→C may intuitively appear to be transitive: If A probabilistically causes B, and B probabilistically causes C, A probabilistically causes C. However, probabilistic causal relations are only transitive if the so-called Markov condition holds. In two experiments, we examined how people make probabilistic judgments about indirect relationships A→C in causal chains A...
Reasoning about causal relationships: Inferences on causal networks.
Over the last decade, a normative framework for making causal inferences, Bayesian Probabilistic Causal Networks, has come to dominate psychological studies of inference based on causal relationships. The following causal networks-[X→Y→Z, X←Y→Z, X→Y←Z]-supply answers for questions like, "Suppose both X and Y occur, what is the probability Z occurs?" or "Suppose you intervene and make Y occur, w...
Content Analysis
Background & Aim: Content analysis was used first in communication sciences. Today, it is frequently used in media analysis. In other sciences such as nursing, researchers apply this method in their studies. Material & Method: In spite of the importance of this method in nursing research, there was not enough Persian material on the subject. Therefore, this review study was conducted to clari...
Journal title:
- CoRR
Volume abs/1802.02163
Pages -
Publication date 2017