A Robust Machine Learning Approach for Variable Importance in Health Spending
Harvard University, Department of Health Care Policy
Tuesday, March 29, 2016
12:30 to 2:00 pm
180A Longwood Ave 224-E
Lunch will be provided.
Abstract: The impact of medical conditions on health care spending has almost exclusively been examined using parametric regression for health plan payment risk adjustment. This paper presents nonparametric machine-learning-based effect estimators for variable importance to understand the role of individual medical condition categories in health spending among commercially insured enrollees. We evaluate how much more, on average, enrollees with each medical condition cost after controlling for demographic information and other medical conditions. We do so within the targeted learning framework, using targeted maximum likelihood estimation and super learning to estimate the effects of these medical conditions. Our results demonstrate that multiple sclerosis, congestive heart failure, severe cancers, major depression and bipolar disorders, and chronic hepatitis are the most costly medical conditions on average per individual. In contrast, estimates from standard parametric regression formulas for plan payment risk adjustment differed nontrivially in both the size of the effects and their relative ranks. The health spending literature may be considerably underestimating the spending contributions of a number of medical conditions, which is a potentially critical oversight. If current risk-adjustment methods are not capturing the true incremental effect of medical conditions, undesirable incentives in health insurance markets may remain.