BARP: Using Bayesian Additive Regression Trees to Improve Multi-Level Regression and Post-Stratification

Working Paper.

Multi-level regression and post-stratification (MRP) is the current gold standard for extrapolating opinion data from nationally representative surveys to smaller geographic units.  However, MRP requires researchers to model opinion as a linear function of individual- and state-level covariates, risking errors due to mis-specification. I propose a modified version of MRP that replaces the multi-level model with a non-parametric classification method called Bayesian Additive Regression Trees (BART or, when combined with post-stratification, BARP). I compare both methods across a number of data contexts, finding that BARP consistently outperforms MRP mainly due to its insulation from mis-specification. Both methods are equally vulnerable to data quality issues, affirming previous research on the limitations of MRP. In addition, I present evidence of systematic bias in MRP estimates when opinion is correlated with population, a novel finding with implications for applied research on representation. [BACK]