The following papers range from distributable working papers to published articles; this is not an exhaustive list of my research:

1) Forecasting Elections: Comparing Prediction Markets, Polls, and their Biases
Public Opinion Quarterly. 2009. Vol. 73, No. 5, pp. 895-916.

Using the 2008 elections, I explore the accuracy and informational content of forecasts derived from two different types of data: polls and prediction markets. Both types of data suffer from inherent biases, and this is the first analysis to compare the accuracy of these forecasts after adjusting for those biases. Moreover, the analysis expands on previous research by evaluating state-level forecasts in Presidential and Senatorial races, rather than just the national popular vote. Utilizing several different estimation strategies, I demonstrate that early in the cycle and in uncertain races, debiased prediction market-based forecasts provide more accurate probabilities of victory and more information than debiased poll-based forecasts. These results are significant because accurately documenting the underlying probabilities, at any given day before the election, is critical for enabling academics to determine the impact of shocks to the campaign, for the public to invest wisely, and for practitioners to spend efficiently.
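The debiasing step can be illustrated with a standard transformation for the favorite-longshot bias in prediction market prices. This is a minimal sketch: the probit-scaling form and the constant k = 1.64 are common choices in this literature, used here for illustration rather than as the paper's exact estimator.

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution

def debias_price(price: float, k: float = 1.64) -> float:
    """Map a raw prediction-market price into a debiased win probability.

    Applies Pr = Phi(k * Phi^-1(price)), which pushes prices away from
    0.5 to offset the favorite-longshot bias; k = 1.64 is a common
    choice in the debiasing literature (an illustrative assumption here).
    """
    return _N.cdf(k * _N.inv_cdf(price))

print(debias_price(0.70))  # ≈ 0.805: the favorite's price is pushed up
print(debias_price(0.30))  # ≈ 0.195: the longshot's price is pushed down
```

Because the transformation is symmetric around 0.5, debiased prices for an event and its complement still sum to one.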

Debiased Aggregated Polls and Prediction Market Prices
Chance. 2010. Vol. 23, No. 3, pp. 6-7.

2) Simplifying Market Access: A New Confidence-Based Interface with Florian Teschner
The Journal of Prediction Markets. 2012. Vol. 6, No. 3, pp. 27-41.

Markets are a strong instrument for aggregating dispersed information, yet they have flaws: they are too complex for some users, they fail to capture massive amounts of their users’ relevant information, and they suffer from some individual-level biases. Based on recent research in polling environments, we design a new market interface that captures both a participant’s point estimate and confidence. The new interface lowers the barrier to entry, asks the market’s implicit question more directly, and helps reduce known biases. We further utilize a novel market rule that complements the interface’s simplicity. We find that market participants using our new interface provide meaningful information and are more likely to submit profitable orders than those using a standard market interface.

3) A Combinatorial Prediction Market for the U.S. Elections with Miroslav Dudik, Sebastien Lahaie, and David Pennock
Economics and Computation. 2013.

We report on a large-scale case study of a combinatorial prediction market. We implemented a back-end pricing engine based on Dudík et al.’s [2012] combinatorial market maker, together with a wizard-like front end to guide users in constructing any of millions of predictions about the presidential, senatorial, and gubernatorial elections in the United States in 2012. Users could create complex combinations of predictions and, as a result, we obtained detailed information about the joint distribution and conditional estimates of election results. We describe our market, how users behaved, and how well our predictions compared with benchmark forecasts. We conduct a series of counterfactual simulations to investigate how our market might be improved in the future.

4) Lay understanding of probability distributions with Daniel G. Goldstein
Judgment and Decision Making. 2014. Vol. 9, No. 1, pp. 1–14.

How accurate are laypeople’s intuitions about probability distributions of events? The economic and psychological literatures provide opposing answers. A classical economic view assumes that ordinary decision makers consult perfect expectations, while recent psychological research has emphasized biases in perceptions. In this work, we test laypeople’s intuitions about probability distributions. To establish a ground truth against which accuracy can be assessed, we control the information seen by each subject to establish unambiguous normative answers. We find that laypeople’s statistical intuitions can be highly accurate, but depend strongly upon the elicitation method used. In particular, we find that eliciting an entire distribution from a respondent using a graphical interface, and then computing simple statistics (such as means, fractiles, and confidence intervals) on this distribution, leads to greater accuracy, at both the individual and aggregate levels, than the standard method of asking about the same statistics directly.

5) The Extent of Price Misalignment in Prediction Markets with David Pennock
Algorithmic Finance. 2014. Vol. 3, pp. 3-20.

We study misaligned prices for logically related contracts in prediction markets. First, we uncover persistent arbitrage opportunities for risk-neutral institutional investors between identical contracts on different exchanges. Examining the impact of several thousand dollars of transactions on the exchanges themselves in a randomized field trial, we document that price support extends well beyond what is seen in the published order book and that arbitrage opportunities are significantly larger than purely observational measurements indicate. Second, we demonstrate misalignment among identical and logically related contracts listed on the same exchange that cluster around moments of high information flow, when related contracts systemically shut down or fail to respond efficiently. Third, we document bounded rationality in prediction markets; examples include: consistent asymmetry between buying and selling, leaving the average return for selling higher than for buying; and persistent price lags between exchanges. Despite these signs of departure from theoretical optimality, the markets studied function well on balance, considering the sometimes complex and subtle relationships among contracts. Yet, we detail how to improve prediction markets by moving the burden of finding and fixing logical contradictions into the exchange and providing flexible trading interfaces, both of which free traders to focus on providing meaningful information in the form they find most natural.

6) Combining forecasts for elections: Accurate, relevant, and timely
International Journal of Forecasting. 2015. Vol. 31, pp. 952-964.

This paper increases the efficiency and understanding of forecasts for Electoral College and senatorial elections by generating, and then combining, forecasts based on voter intention polling, fundamental data, and prediction markets. The paper addresses the most relevant outcome variable, the probability of victory in state-by-state elections, while also solving for the traditional outcome variables and ensuring that the forecasts update easily and continuously over the course of the main election cycle. Attempting to maximize those attributes along with accuracy, I create efficient forecasts for each of these three types of raw data, with innovations in aggregating the data and then correlating the aggregated data with outcomes. The paper demonstrates that all three data types provide significant and meaningful contributions towards election forecasts. Varied stakeholders, including researchers, election investors, and election workers, can all benefit from the efficient combined forecasts described in this paper; the forecast is tested and excels out-of-sample during the 2012 elections.

7) Fundamental models for forecasting elections at the state level with Patrick Hummel
Electoral Studies. 2014. Vol. 35, pp. 123-139.

This paper develops new fundamental models for forecasting presidential, senatorial, and gubernatorial elections at the state level using fundamental data from six categories: past election results, incumbency, presidential approval ratings, economic indicators, ideological indicators, and biographical information about the candidates. Despite the fact that our models differ from other state-level forecasting models in that they can be used to make forecasts of elections earlier than existing models and they do not use data from pre-election polls on voting intentions, our models give rise to lower out-of-sample forecasting errors for both the binary outcomes of elections and the fraction of the major party vote received by each candidate. We further illustrate new ways of incorporating various economic and political indicators into forecasting models that enable us to obtain a better understanding of what types of fundamental data most meaningfully predict the outcomes of elections in each state. Among our results, we find that economic variables are most meaningful as trends rather than levels and that second quarter data is as predictive of election outcomes as third quarter data.

8) Forecasting elections with non-representative polls with Wei Wang, Andrew Gelman, and Sharad Goel
International Journal of Forecasting. 2015. Vol. 31, pp. 980-991.

Election forecasts have traditionally been based on representative polls, in which randomly sampled individuals are asked for whom they intend to vote. While representative polling has historically proven to be quite effective, it comes at considerable financial and time costs. Moreover, as response rates have declined over the past several decades, the statistical benefits of representative sampling have diminished. In this paper, we show that with proper statistical adjustment, non-representative polls can be used to generate accurate election forecasts, and often faster and at less expense than traditional survey methods. We demonstrate this approach by creating forecasts from a novel and highly non-representative survey dataset: a series of daily voter intention polls for the 2012 presidential election conducted on the Xbox gaming platform. After adjusting the Xbox responses via multilevel regression and poststratification, we obtain estimates in line with forecasts from leading poll analysts, which were based on aggregating hundreds of traditional polls conducted during the election cycle. We conclude by arguing that non-representative polling shows promise not only for election forecasting, but also for measuring public opinion on a broad range of social, economic and cultural issues.
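The poststratification step of multilevel regression and poststratification can be sketched as follows. This is a minimal illustration: the demographic cells, support estimates, and population counts are hypothetical, and the multilevel regression that would produce the per-cell estimates is assumed to have already been run.

```python
def poststratify(cell_estimates: dict, cell_populations: dict) -> float:
    """Weight per-cell model estimates by each cell's share of the
    target population (e.g., counts from census data)."""
    total = sum(cell_populations[cell] for cell in cell_estimates)
    weighted = sum(cell_estimates[cell] * cell_populations[cell]
                   for cell in cell_estimates)
    return weighted / total

# Hypothetical age-by-sex cells: model estimates of candidate support,
# reweighted to the population rather than the skewed survey sample.
estimates = {("18-29", "F"): 0.62, ("65+", "M"): 0.41}
population = {("18-29", "F"): 1_200_000, ("65+", "M"): 800_000}
print(poststratify(estimates, population))  # ≈ 0.536
```

This reweighting is what lets a highly non-representative sample, like the Xbox panel, yield estimates for the general voting population.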

9) Are Polls and Probabilities Self-Fulfilling Prophecies? with Neil Malhotra
Research and Politics. 2014. July-September, pp. 1-10.

Psychologists have long observed that people often conform to majority opinion. This bandwagon effect occurs in the political domain as people learn about prevailing public opinion via ubiquitous polls. A recent phenomenon, published probabilities derived from prediction market contract prices and aggregated polls, may play a similar role. Consequently, polls and probabilities can become self-fulfilling prophecies whereby majorities, whether in support of candidates or policies, grow in a cascading manner. Despite the increased attention to whether the measurement of public opinion can itself affect public opinion, the existing empirical literature is quite limited on the bandwagon effects of polls and non-existent on the effects of probabilities. To address this gap, we conducted an experiment on a diverse national sample in which we randomly assigned people to receive information about different levels of support for (or probability of passage of) three public policies. We find that public opinion as expressed through polls significantly impacts individual-level attitudes, whereas probabilities exhibit no effect. We also posit a mechanism underlying the bandwagon effect for polls: low public support decreases support for policies, but high public support does not increase support. In sum, our study shows that measuring public opinion has the potential to change public opinion.

10) A comparison of forecasting methods: fundamentals, polling, prediction markets, and experts with Deepak Pathak and Miroslav Dudik
Email for Copy of Paper. This paper is forthcoming in the Journal of Prediction Markets.

We compare and contrast how Oscar forecasts derived from four data types (fundamentals, polling, prediction markets, and experts) compete on four attributes: relevancy, accuracy, timeliness, and cost effectiveness. Fundamentals-based forecasts are relatively expensive to construct, an attribute the academic literature too often ignores, and update slowly over time, constraining their accuracy. However, fundamentals provide valuable insights into the relationship between key indicators for nominated movies and outcomes; other awards shows have a lot of predictive power and box office results very little. Polling-based forecasts have the potential to be both accurate and timely, but polling requires incentives for frequent responses by high-information users to stay timely, and a proper transformation of raw polls into forecasts to be accurate. Prediction market prices are accurate forecasts, and simple transformations of raw prices into forecasts are the most accurate in our study. Experts create something similar to fundamentals, but are generally not comparatively accurate or timely. We believe that the results of this study generalize to many domains.

Working Papers
1) Forecasting Elections: Voter Intentions versus Expectations with Justin Wolfers
This Draft: January 23, 2013.
My first interview, with Mark Blumenthal, May 16, 2010 at AAPOR.

In this paper, we explore the value of an underutilized political polling question: who do you think will win the upcoming election? We demonstrate that this expectation question points to the winning candidate more often than the standard political polling question of voter intention: if the election were held today, who would you vote for? Further, the results of the expectation question translate into more accurate forecasts of the vote share than the ubiquitous intent question. Our structural interpretation of the expectation question shows that every response is equivalent to a multi-person poll of intention; the power of the response is that it provides information about the respondent’s intent, as well as the intent of her friends and family. This paper has far-reaching implications for all disciplines that use polling.

2) Trading Strategies and Market Microstructure: Evidence from a Prediction Market with Rajiv Sethi
This Draft: September 8, 2013

We examine transaction-level data from Intrade’s presidential winner market for the two weeks immediately preceding the November 2012 election. The data allow us to compute key statistics, including volume, transactions, aggression, holding duration, directional exposure, margin, and profit for each of the over 3,200 unique trader accounts. We identify a diverse set of trading strategies that constitute a rich market ecology. These range from arbitrage-based strategies with low and fleeting directional exposure to strategies involving large accumulated positions in one of the two major party candidates. Most traders who make directional bets do so consistently in a single direction, unlike the information traders in standard models. We present evidence suggestive of market manipulation by a single large trader, and consider the possible motives for such behavior. Broader implications for the interpretation of prices in financial markets and the theory of market microstructure are drawn.

3) Expectations: Point-Estimates, Probability Distributions, and Forecasts
This Draft: September 20, 2012.

In this paper I test a new graphical, interactive interface that captures both “best estimate” point-estimates and probability distributions from non-experts. When supplementing an expectation, the standard data point is the respondent’s directly stated confidence or a confidence range. In contrast to those data points, my method induces respondents to reveal a level of precision, and there is a sizable and statistically significant positive relationship between the respondents’ revealed precision and the accuracy of their individual-level expectations. Beyond creating more meaningful individual-level estimates, researchers can use this positive correlation between precision and accuracy to create precision-weighted aggregated forecasts that are more accurate than the standard “consensus forecasts”. Varying financial incentives does not affect these findings.
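The precision-weighted aggregation described above can be sketched with inverse-variance weights. The numbers and the choice of inverse variance as the precision measure are illustrative assumptions, not the paper's exact estimator.

```python
def precision_weighted_forecast(estimates, variances):
    """Combine individual point estimates into one forecast,
    weighting each respondent by the precision (inverse variance)
    of their elicited probability distribution."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, estimates)) / total

# A respondent with a tighter elicited distribution (variance 0.01)
# pulls the aggregate toward their estimate more than a less precise
# respondent (variance 0.04) does.
print(precision_weighted_forecast([0.55, 0.70], [0.01, 0.04]))  # ≈ 0.58
```

An unweighted consensus of these two responses would be 0.625; the precision weighting pulls the aggregate toward the more confident respondent.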

4) A new way to think about confidence ranges with Daniel G. Goldstein and Florian Teschner
Email for Copy of Paper.

5) The Mythical Swing Voter with Andrew Gelman, Sharad Goel, and Doug Rivers
Email for Copy of Paper.

6) Online and social media data as an imperfect continuous panel survey with Fernando Diaz, Michael Gamon, Jake Hofman, and Emre Kiciman
Email for Copy of Paper.

There is a large body of research on utilizing online activity to predict various real world outcomes, ranging from outbreaks of influenza to outcomes of elections. There is considerably less work, however, on using this data to understand topic-specific interest and opinion amongst the general population and specific demographic subgroups, as currently measured by relatively expensive surveys. Here we investigate this possibility by studying a full census of all Twitter activity during the 2012 election cycle along with the comprehensive search histories of a large panel of internet users during the same period, highlighting the challenges in interpreting online and social media activity as the results of a survey. As noted in existing work, the online population is a non-representative sample of the offline world (e.g., the U.S. voting population). We extend this work to show how demographic skew and user participation are non-stationary and unpredictable over time. In addition, the nature of user contributions varies wildly around important events. Finally, we note subtle problems in mapping what people are sharing or consuming online to specific sentiment or opinion measures around a particular topic. These issues must be addressed before meaningful insight about public interest and opinion can be reliably extracted from online and social media data.

7) Manipulation in conditional decision markets with Florian Teschner and Henner Gimpel
Email for Copy of Paper.

8) Selection bias in documenting online conversations with Ran He
Email for Copy of Paper.

9) Expertise in the Field Fades in the Lab with Etan Green and Justin Rao

We conduct field and laboratory experiments on the same panel of experts, measuring the internal consistency of their predictions 1) in the field, in their domain of expertise, and 2) on a conceptually identical laboratory exercise. Experts make internally consistent predictions in the field, both in absolute terms and relative to a panel of novices, but they exhibit markedly less consistency on the isomorphic lab exercise. Possible explanations for this fading expertise include low motivation in the lab and a failure to transfer skills learned implicitly in the field to the more abstract lab setting.