Health Language Blog

3 Ways Data Normalization is Key to Link Claims and Clinically-Sourced Data

Posted on 09/05/14


Working with electronic health data is no walk in the park.

Just ask the numerous healthcare organizations that want to share data among disparate IT systems or pull data from multiple data sources for analytical or reporting purposes.  A significant challenge is being able to leverage both claims and clinically-sourced data.

Claims data 101

When we refer to claims data, we are speaking specifically about the electronic data used to represent bills submitted by physicians to health plans. The cool thing about claims data is that, since providers generally want to get paid for their services, it is ubiquitous and easy to obtain. Claims data provides a broad view of a patient’s interaction with healthcare systems: nearly every encounter with a patient, whether it comes from a lab, a doctor’s office, a pharmacy, or a physical therapist, will generate a claim, and the data within that claim can be easily aggregated for analytical purposes. Despite growing adoption of EMRs, not all physicians use an EMR at this point, so in these cases only claims data may be available to describe diagnoses and procedures. But billing code sets such as ICD-9-CM, and even ICD-10-CM, are designed to categorize or group patients, so the actual diagnosis is often not captured. For example, a patient may have the ICD-10-CM code for Other specified heart block, which does not describe the patient’s actual condition of sinoauricular block.

Clinical data 101

Clinical data can derive from many sources but is now most readily available from the EMR. This clinical data is often more accurate than claims data in documenting what’s really going on with the patient. For example, the EMR will include an active patient problem list that describes the patient’s conditions in more clinical detail. As in the example above, the patient’s sinoauricular block will be on the problem list, allowing the cardiologist to understand exactly what kind of heart block the patient has. Further, clinical data is a richer dataset: it contains data elements such as vital signs, habits, lists of nonprescription drugs, survey results, and so forth. Clinical data also includes the vast amount of unstructured documents, such as progress notes, H&Ps, and discharge summaries, that can be mined for a wealth of information.

A study published by the Journal of the American Medical Association (JAMA) showed that of children with EMR blood pressure values that were high on at least three separate doctor visits, only 26 percent had a claim carrying a diagnosis code of hypertension. With clinical data, you can infer certain medical conditions (in this case, hypertension) even without a diagnosis code.
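To make the idea of inferring a condition from clinical data concrete, here is a minimal sketch in Python. It assumes each blood pressure reading has already been flagged as high or not (the `is_high` field is a hypothetical stand-in, since pediatric thresholds depend on age, sex, and height), and the record layouts and function names are illustrative only, not taken from the study or any particular product.

```python
from collections import defaultdict

def high_bp_patients(readings, min_visits=3):
    """Patients with a high reading on at least `min_visits` distinct visit dates."""
    high_dates = defaultdict(set)
    for r in readings:
        if r["is_high"]:  # assumed to be precomputed upstream
            high_dates[r["patient_id"]].add(r["visit_date"])
    return {pid for pid, dates in high_dates.items() if len(dates) >= min_visits}

def missing_diagnosis(readings, claims, dx_codes):
    """Patients who look hypertensive in the EMR but have no matching claim code."""
    coded = {c["patient_id"] for c in claims if c["dx"] in dx_codes}
    return high_bp_patients(readings) - coded

# Illustrative data only.
readings = [
    {"patient_id": "p1", "visit_date": "2014-01-10", "is_high": True},
    {"patient_id": "p1", "visit_date": "2014-03-02", "is_high": True},
    {"patient_id": "p1", "visit_date": "2014-06-15", "is_high": True},
    {"patient_id": "p2", "visit_date": "2014-01-20", "is_high": True},
    {"patient_id": "p2", "visit_date": "2014-02-20", "is_high": True},
    {"patient_id": "p2", "visit_date": "2014-05-05", "is_high": True},
    {"patient_id": "p3", "visit_date": "2014-02-01", "is_high": True},
    {"patient_id": "p3", "visit_date": "2014-04-01", "is_high": True},
    {"patient_id": "p3", "visit_date": "2014-07-01", "is_high": False},
]
claims = [{"patient_id": "p2", "dx": "I10"}]  # I10: essential hypertension

uncoded = missing_diagnosis(readings, claims, dx_codes={"I10"})
```

In this toy data, p1 and p2 both meet the three-visit rule, but only p2 has a hypertension claim, so p1 is the patient that claims data alone would miss.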

Finally, clinical data is more timely, a critical factor when treating sick patients. It’s well known that follow-up after a hospital admission drastically reduces the possibility of readmission. If you are a primary care doctor in an ACO and you want to be notified within seven days that one of your patients has been admitted to the hospital, you definitely don’t want to wait for the claim to be submitted and processed (which could take days or even months) before you are notified. Because providers interact with the EMR in near real time, a system that evaluates EMR data can allow a much more rapid response.

How do I utilize both sources? Data normalization to the rescue

The integration of clinical and claims data can generate a number of benefits for healthcare organizations. For one, they can obtain a holistic view of the patient and his or her care over time. In addition, integration can contribute to population health management and emerging healthcare delivery models such as ACOs.

However, the integration of claims data and clinically sourced data is one of the most difficult challenges in the field of data sharing.  A major roadblock in making this happen boils down to a Babel of languages.  The fact is, the industry lacks a single, universally accepted standard that defines the meaning of every type of healthcare data. There’s a lot of ambiguity to overcome.  In fact, as we push towards requiring standards in healthcare, we are seeing an increasing number of disparate terminology standards!  

Data normalization, however, seeks to harmonize data from different sources into standard terminologies. This technology relies on initial automated mappings to quickly match incompatible terminologies to a shared vocabulary. So, with data normalization, providers and payers are better equipped to bring those two datasets together.

Three ways that data normalization is key to linking claims and clinical data:

1. Overcoming Historical Isolation

Claims data offers information on the services provided to a patient, often over a period of many years. Clinical data provides a greater wealth of detail on each patient. The clinical side provides information on the outcome of services and procedures. While claims data will list medical tests administered to a patient, it’s the clinical data that provides the results of those tests. Combining the two datasets provides a comprehensive view of the patient.

But claims and clinical data have been collected separately over the years and maintained in separate IT systems. Claims and clinical data are as far apart as two data sources can be -- and it’s hard enough getting sources of clinical data to talk to each other. Data normalization, however, can rationalize the differences between claims and clinical data, providing the ability to match vastly different terminology and coding schemas to a system of shared meaning.

For example, if you were trying to identify all of your patients who had heart block, you would want to pull the ICD-10-CM code for Other specified heart block (I45.5) and the SNOMED CT code for sinoauricular block, along with many other SNOMED CT codes. Maps between SNOMED CT and ICD can enable the aggregation of both claims and clinical sources of data.
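As a rough illustration of how such a map lets you query both sources at once, consider the Python sketch below. The SNOMED CT identifiers are deliberately fake placeholders (not real concept IDs), the crosswalk is a hand-built dictionary rather than a licensed map, and the record layouts are assumptions made for the example.

```python
# Hypothetical crosswalk: SNOMED CT concept -> ICD-10-CM code.
# The "SCT-..." keys are placeholders, NOT real SNOMED CT concept IDs.
SNOMED_TO_ICD10 = {
    "SCT-SINOAURICULAR-BLOCK": "I45.5",
    "SCT-OTHER-HEART-BLOCK": "I45.5",
}

# ICD-10-CM codes that define the cohort of interest.
HEART_BLOCK_ICD10 = {"I45.5"}

def heart_block_patients(claims, problem_list):
    """Union of patients found via claims codes and via mapped problem-list codes."""
    from_claims = {
        c["patient_id"] for c in claims if c["icd10"] in HEART_BLOCK_ICD10
    }
    from_clinical = {
        p["patient_id"]
        for p in problem_list
        # Unmapped SNOMED codes return None and are simply not matched.
        if SNOMED_TO_ICD10.get(p["snomed"]) in HEART_BLOCK_ICD10
    }
    return from_claims | from_clinical

# Illustrative data only.
claims = [
    {"patient_id": "p1", "icd10": "I45.5"},
    {"patient_id": "p2", "icd10": "E11.9"},
]
problem_list = [
    {"patient_id": "p3", "snomed": "SCT-SINOAURICULAR-BLOCK"},
    {"patient_id": "p2", "snomed": "SCT-UNMAPPED-CONCEPT"},
]

cohort = heart_block_patients(claims, problem_list)
```

Here p1 is found only through claims and p3 only through the problem list, which is the point: neither source alone returns the full cohort.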

2. Speeding Up The Process

The process of cleaning, organizing and, finally, combining clinically sourced data and claims data can take months. The more data healthcare providers and payers maintain, the longer the task will take. Merging vastly divergent sets of data is never going to be easy. But the automation provided through a data normalization solution can greatly simplify -- and accelerate -- the linkage of clinical and claims data. A manual process of interpreting data just isn’t practical given the volumes of data confronting healthcare organizations. What’s more, the clock is ticking on some healthcare initiatives that rely on bridging the clinical-claims gap. Accountable care organizations (ACOs) with Medicare contracts, for example, must be able to reduce costs and meet quality measures within a limited timeframe.

3. Dealing With Exceptions

The job of joining clinical and claims data, as noted, involves a high degree of difficulty. A particularly troubling hurdle is the amount of custom content that needs to be combined. Advanced solutions provide the ability to create custom maps to reconcile the exceptions that would otherwise bog down an integration project. For example, lab tests and results pose a particular challenge. Your claims data will include CPT codes, but your clinical data will increasingly include LOINC codes and also local proprietary lab codes. So if you need to determine which of your patients with diabetes are receiving Hemoglobin A1C tests, you need to have the local codes mapped to LOINC and the ability to pull both the related LOINC and CPT codes. Some of the local lab test mappings to LOINC may require manual intervention to deal with the exceptions that could not be automapped.
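The lab example can be sketched in Python as follows. This is only an illustration under stated assumptions: the local lab codes are invented, the map is a small hand-built dictionary, and the LOINC (4548-4) and CPT (83036) values are the codes commonly used for Hemoglobin A1C, which you should verify against current code sets before relying on them. Local codes with no map entry are collected for manual review rather than silently dropped.

```python
# Hypothetical local-code-to-LOINC map; the local codes are invented.
LOCAL_TO_LOINC = {
    "LAB-HBA1C": "4548-4",
    "GLYHGB": "4548-4",
}

A1C_LOINC = {"4548-4"}  # assumed LOINC code for Hemoglobin A1C
A1C_CPT = {"83036"}     # assumed CPT code for Hemoglobin A1C

def a1c_tested(diabetic_ids, lab_results, claim_lines):
    """Return (diabetic patients with an A1C test, local codes needing manual mapping)."""
    tested, unmapped = set(), set()
    for r in lab_results:
        # Prefer a LOINC code already on the result, else try the local map.
        loinc = r.get("loinc") or LOCAL_TO_LOINC.get(r.get("local_code"))
        if loinc is None:
            unmapped.add(r.get("local_code"))  # exception: queue for review
        elif loinc in A1C_LOINC:
            tested.add(r["patient_id"])
    for c in claim_lines:
        if c["cpt"] in A1C_CPT:
            tested.add(c["patient_id"])
    return tested & diabetic_ids, unmapped

# Illustrative data only.
diabetics = {"p1", "p2", "p3", "p4"}
lab_results = [
    {"patient_id": "p1", "local_code": "LAB-HBA1C"},
    {"patient_id": "p2", "local_code": "XYZ-99"},  # no map entry yet
    {"patient_id": "p5", "loinc": "4548-4"},       # tested, but not diabetic
]
claim_lines = [
    {"patient_id": "p3", "cpt": "83036"},
    {"patient_id": "p4", "cpt": "80053"},          # a different lab panel
]

tested, needs_review = a1c_tested(diabetics, lab_results, claim_lines)
```

With this data, p1 is found through a mapped local code and p3 through a claim, while the unmapped code XYZ-99 lands in the review set, mirroring the manual intervention step described above.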

How are you addressing the integration of clinical and claims data? Leave your comments below.

Topics: data normalization

About the Author

Brian Levy, MD, is Vice President and Chief Medical Officer with Health Language, part of Wolters Kluwer Health. He holds an MD and BS from the University of Michigan. Go Blue!