Transitioning machine learning models across electronic health record (EHR) versions can be improved by mapping different EHR encodings to a common vocabulary.

Clinical risk models based on EHR data can help stratify care, thereby improving outcomes while lowering costs. However, EHRs frequently represent patient data differently in order to tailor functionality to the needs of individual institutions, and even of different units within an institution. These differences hinder the development and use of clinical risk models that generalize across institutions and across versions of an EHR system over time.

To address this problem, we use auxiliary knowledge from the Unified Medical Language System (UMLS), a collection of medical ontologies, to build clinical risk models that span multiple EHRs. We evaluate our method across an EHR system transition on two clinically relevant prediction tasks: in-hospital mortality and prolonged length of stay. For both outcomes, a feature representation derived from EHR-specific events and the UMLS yields better results than one using EHR-specific events alone.
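
As a rough illustration of the idea, and not the implementation evaluated here, the sketch below maps EHR-specific event codes to shared concept identifiers so that two EHR versions end up with overlapping features. The lookup table `ITEM_TO_CUIS`, the item IDs, the concept labels, and the function names are all hypothetical placeholders; a real mapping would be derived from the UMLS itself.

```python
from collections import Counter

# Hypothetical lookup from (EHR version, local item ID) to shared concept IDs.
# Every ID and label here is a made-up placeholder, not a real UMLS code.
ITEM_TO_CUIS = {
    ("ehr_a", "item_211"): ["CUI_HEART_RATE"],
    ("ehr_b", "item_220045"): ["CUI_HEART_RATE"],
    ("ehr_a", "item_618"): ["CUI_RESP_RATE"],
    ("ehr_b", "item_220210"): ["CUI_RESP_RATE"],
}


def featurize(events, ehr):
    """Count EHR-specific items and the shared concepts they map to."""
    feats = Counter()
    for item in events:
        # EHR-specific feature: only meaningful within one EHR version.
        feats[f"{ehr}:{item}"] += 1
        # Shared concept features: comparable across EHR versions.
        for cui in ITEM_TO_CUIS.get((ehr, item), []):
            feats[f"cui:{cui}"] += 1
    return feats


def shared_features(a, b):
    """Return the feature names present in both feature sets."""
    return sorted(set(a) & set(b))


# Two admissions recorded under different EHR versions.
old = featurize(["item_211", "item_618", "item_211"], "ehr_a")
new = featurize(["item_220045"], "ehr_b")
print(shared_features(old, new))  # ['cui:CUI_HEART_RATE']
```

In this toy case the two EHR versions share no item IDs, so a model trained only on EHR-specific features would see none of its features after the transition, whereas the concept-level feature carries over.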