Explainable Health Risk Predictor with Transformer-based Medicare Claim Encoder

Abstract

In 2019, The Centers for Medicare and Medicaid Services (CMS) launched an Artificial Intelligence (AI) Health Outcomes Challenge seeking solutions to predict risk in value-based care for incorporation into CMS Innovation Center payment and service delivery models. Recently, modern language models have played key roles in a number of health related tasks. This paper presents, to the best of our knowledge, the first application of these models to patient readmission prediction. To facilitate this, we create a dataset of 1.2 million medical history samples derived from the Limited Dataset (LDS) issued by CMS. Moreover, we propose a comprehensive modeling solution centered on a deep learning framework for this data. To demonstrate the framework, we train an attention-based Transformer to learn Medicare semantics in support of performing downstream prediction tasks thereby achieving 0.91 AUC and 0.91 recall on readmission classification. We also introduce a novel data pre-processing pipeline and discuss pertinent deployment considerations surrounding model explainability and bias.