Health Language Blog

Leveraging Administrative Data for Better Quality Assessment

Posted on 08/30/16

The Role of Data Collection Standards for Race, Ethnicity, Sex, Primary Language, and Disability Status

In this installment of our Terminology Standards series, we will explore the growing need for accurate tracking of patient demographics, evolving industry standards, and additional best practice steps for ensuring data is complete.

At a high level, healthcare industry movements aim to improve care delivery across regions and groups by assessing such quality indicators as barriers to access, health disparities, and the performance of community safety nets. Yet, obtaining meaningful data to measure these factors has proved daunting. As a result, there is increasing interest in examining "administrative" data housed in computerized records and billing processes.

Billing forms identify key demographics such as age, gender, city, state, and zip codes, which in turn provide information on specific population sectors and where they reside. However, effectively identifying more specific data such as race, ethnicity, and language is much harder across disparate systems where the terminologies used to identify these attributes vary.

To effectively close the gaps in healthcare quality, researchers and healthcare organizations need access to this granular information. Standardizing how this information is represented electronically is therefore essential.
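To make the idea of a standardized electronic representation concrete, here is a minimal sketch of a crosswalk that maps values from two source systems onto one set of labels. The system names, local codes, and standard labels are hypothetical and for illustration only; a real crosswalk would be built and maintained under your own data governance process.

```python
from collections import Counter

# Hypothetical crosswalk: (source system, local value) -> standard label.
# System names, local codes, and labels are illustrative assumptions.
CROSSWALK = {
    ("billing", "w"): "White",
    ("billing", "b"): "Black or African American",
    ("ehr", "caucasian"): "White",
    ("ehr", "african american"): "Black or African American",
}

UNMAPPED = "UNMAPPED"


def standardize(system, value):
    """Return the standard label for a local value, or UNMAPPED if none exists."""
    key = (system.strip().lower(), value.strip().lower())
    return CROSSWALK.get(key, UNMAPPED)


def summarize(records):
    """Count standardized labels across (system, value) records."""
    return Counter(standardize(system, value) for system, value in records)


records = [
    ("billing", "W"),
    ("EHR", "Caucasian "),
    ("ehr", "African American"),
    ("billing", "X"),  # not in the crosswalk, so it is flagged rather than guessed
]
counts = summarize(records)
print(counts)
```

Flagging unmapped values instead of silently dropping them gives a simple measure of how complete the crosswalk is for each source system.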

In response to this need, the U.S. Department of Health and Human Services – Office of Minority Health (OMH) published the Data Collection Standards for Race, Ethnicity, Sex, Primary Language, and Disability Status to support the Affordable Care Act health data collection and analysis strategy. This is an important step in the right direction, yet there are still challenges that healthcare organizations must overcome to comprehensively address standardization of all administrative data.

What are the administrative data types?

  • Age and gender
  • County, state, and zip code
  • Race, ethnicity, and language

The challenge of collecting race and ethnicity data

On a global scale, gathering race and ethnicity data advances research for treatment of diseases that affect specific populations. On the health system level, tracking of this information can improve population health management efforts.

Defining race and ethnicity is challenging because patients are often reluctant to respond to questionnaires or do not self-disclose information accurately. Also, representing race categories on hospital and outpatient intake forms is difficult because definitions differ by region. Patients may not agree with the broad categories offered, such as Hispanic, when they are multicultural or come from a country that is not considered “Hispanic” outside of the United States. For these reasons, national standardized categories for race and ethnicity do not always address the specific nuances of a region. Health networks must respond by expanding strategies to support accurate aggregation of all ethnicity categories.
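One way to support this kind of aggregation is to capture granular, regionally meaningful categories at intake and roll them up to broad reporting categories afterward. The sketch below assumes a hypothetical roll-up table; the category names are an illustrative subset and are not quoted from any published standard.

```python
# Illustrative subset only: granular category -> broad reporting category.
# Both the granular and the broad labels are assumptions for this sketch.
ROLLUP = {
    "Mexican": "Hispanic or Latino",
    "Puerto Rican": "Hispanic or Latino",
    "Cuban": "Hispanic or Latino",
    "Dominican": "Hispanic or Latino",  # example of a regional addition
    "Not of Hispanic origin": "Not Hispanic or Latino",
}

NEEDS_REVIEW = "Needs review"


def roll_up(selections):
    """Keep what the patient selected and derive the broad categories.

    Patients may select more than one category, so the result holds a
    list of granular values and a sorted list of distinct broad values.
    """
    broad = sorted({ROLLUP.get(choice, NEEDS_REVIEW) for choice in selections})
    return {"granular": list(selections), "broad": broad}


print(roll_up(["Cuban", "Dominican"]))
print(roll_up(["Cuban", "Unlisted category"]))
```

Storing the granular selections alongside the derived broad category means national reporting and local analysis can both draw on the same record.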

The challenge of collecting language data

Understanding language preferences is important to the treatment of ethnically diverse populations. Communication gaps cause patients to seek care less frequently or from multiple sources, increasing the likelihood of duplicate services or errors. When practitioners understand their patients' language preferences and communicate accordingly, using interpreters where needed, they build trust and improve patient engagement and, ultimately, care outcomes.

Similar to the challenges of race and ethnicity, accurately identifying all language preferences in one standardized list is difficult. There are currently more than 600 languages in use across the United States. Therefore, a single list of languages will not address all possible use cases in a particular region, and health systems must put processes in place to address these variances.
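As a rough sketch of how a health system might handle these variances, the example below pairs a short core language list with a regional extension and reports how much of the incoming data can be coded. The use of ISO 639 language codes is an assumption made for this example, and both lists are hypothetical and far from complete.

```python
# Hypothetical core list shared across the organization (ISO 639 codes assumed).
CORE_LANGUAGES = {"english": "en", "spanish": "es", "vietnamese": "vi"}

# Hypothetical regional extension maintained by one health system.
REGIONAL_LANGUAGES = {"hmong": "hmn", "somali": "so"}


def resolve_language(value, regional=REGIONAL_LANGUAGES):
    """Return a language code from the core list, then the regional list, else None."""
    key = value.strip().lower()
    if key in CORE_LANGUAGES:
        return CORE_LANGUAGES[key]
    return regional.get(key)


def coded_share(values, regional=REGIONAL_LANGUAGES):
    """Fraction of recorded language values that resolve to a code."""
    if not values:
        return 0.0
    coded = sum(1 for v in values if resolve_language(v, regional) is not None)
    return coded / len(values)


intake_values = ["English", "Hmong", "Klingon", "spanish"]
print(coded_share(intake_values))               # with the regional extension
print(coded_share(intake_values, regional={}))  # core list only
```

Comparing the two shares shows how much a regional extension adds for a given population, which can help decide where list maintenance effort is best spent.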


As health networks advance population health management strategies, the need for gathering and standardizing demographic data is increasingly important. Content sets and tools exist in the industry that can minimize the complexities of standardizing administrative data and aid IT leaders in their quest to improve analytics.



Topics: terminology, population health management, Administrative Data, Quality Assessment

About the Author

Beverly Holley has 14 years of medical industry experience working with both providers and payers. She is a Certified Professional Coder through the American Academy of Professional Coders (AAPC) and a certified ICD-10 Trainer through the American Health Information Management Association (AHIMA). Additionally, Beverly recently completed an American Medical Informatics Association (AMIA) course in Bioinformatics where she prepared a paper on the use of Value Sets. Beverly joined the Health Language team in 2013 and currently leads a team of professionals who are a vital part of our content consulting team. In that role, she oversees the professional services clinical implementation activities. She and her team help our clients map their data to standardized terminologies, create and maintain solid data governance practices, remediate for ICD-10, and provide consulting services for our 280+ content sets.