Evolution of Electronic Health Record (EHR) Data

Datavant

April 28, 2022

An interview with Stacey Long (Chief Strategy Officer, OMNY Health)

The volume of health data is growing exponentially. Even more notable is that patient data is becoming more readily available in privacy-compliant formats for a variety of healthcare stakeholders to use in research, care improvement and reimbursement. In my blog post on healthcare data supply trends, I highlighted how electronic health record (EHR) data extracted from EHR software, disease registries and health systems represented the most common data type that partners brought to the Datavant ecosystem last year. To understand this trend, I interviewed Stacey Long, Chief Strategy Officer at OMNY Health, about the evolution she has seen in the EHR data landscape.

Stacey, you have had a long career in real-world data and analytics. Can you describe your background and experiences?

Thanks, Su. I’ve been working in real-world data (RWD) for more than 25 years, first as a health services researcher, and later building products and tools to support researchers from government, provider, payer, and life sciences organizations through my positions with Thomson Reuters, Truven Health Analytics, and IBM. For the past year and a half, I have focused on building out product strategy and services operations at OMNY Health. OMNY Health’s RWD-focused platform connects provider organizations, life sciences companies and patients for the purpose of sharing data and insights, and establishing collaborative research and quality improvement initiatives.

Can you level-set for our audience — what information can you get from EHR data?

EHR data consists of deep clinical content to support a diverse set of care, reimbursement and research initiatives. EHR systems capture diagnoses, procedures, pharmacy orders and drug administrations relevant to specific patient encounters, as well as vitals (e.g. BMI, blood pressure, oxygen saturation), lab orders and results, care setting, and date of services delivered by the provider. Additionally, medical history, family history, details on the clinician delivering the care, and clinician notes describing other observations and rationale for decisions are often available to add clinical context to these data elements.

What do you think is driving the increasing availability of EHR data?

Foundational to this trend is the nearly universal adoption of EHRs by providers over the past decade, driven by federal incentives to adopt EHR systems and CMS quality reporting requirements. Beyond that, I believe there are two driving forces in the increased amount of EHR data entering the RWD landscape. First, the acceptance of EHR-derived data to support evidence-based decisions has been growing both within provider organizations and across the broader healthcare ecosystem, most notably with the recent FDA guidance on using EHR data to support clinical trials and evidence generation. Second, technological advances have made obtaining and gaining insights from EHR data easier than ever. The consolidation of EHR vendors from hundreds to fewer than 20 groups has resulted in more uniformity and standardization across providers, more complete and accurate data, larger and more diverse patient populations, and more efficient data extraction. In addition, advancements in data science tools that mine unstructured notes through natural language processing (NLP) and machine learning (ML) have opened a whole new set of analytics, increasing the robustness of insights derived from EHR data. Lastly, there are now privacy-preserving technologies that de-identify EHR data to maximize its utility while preserving patient privacy.

Beyond EHR software companies, health systems and specialty provider networks are becoming active collaborators in supporting research initiatives aimed at improving workflow efficiency, accelerating the development of new therapies, and improving patient outcomes. Health systems and specialty networks are investing in the data they generate to meet these goals, as well as using that data for their own care management initiatives, quality improvement goals, and population health programs.

Where are you seeing EHR data being used the most? Which disease areas and use cases?

With the trend toward precision medicine, we are seeing requests for EHR data across all disease areas, but especially in therapeutic areas where disease severity and treatment effectiveness are measured through biometric changes in vitals, lab values, imaging, physician-reported severity scores, and more recently patient-reported outcomes (PRO). We are seeing these measures used in dermatology, autoimmune disease, ophthalmology, orthopedics, respiratory, cardiometabolic disease, and oncology. This data is being used to help identify undiagnosed rare disease populations, improve diversity of clinical trials, and address healthcare inequity.

With the acceleration in data availability and technology capability, do you see new applications for EHR data in the future that may have an even bigger impact on patient outcomes?

Emerging use cases for EHR data have centered around improving clinical trial efficiency, such as identifying eligible patients faster or using RWD to create a synthetic control arm. EHR data is increasingly linked to data collected through clinical trials, resulting in hybrid datasets that combine clinical trial and real-world data for a deeper understanding of the patient journey outside the trial. Additionally, EHR data is informing pharmacovigilance and safety monitoring during care delivery to reduce provider reporting burden and offer more comprehensive reporting to manufacturers and regulatory bodies. For example, rather than relying on manual reporting systems such as FAERS or MAUDE, EHR data is being mined retrospectively for safety signals and long-term effectiveness, and EHR software vendors are adding capabilities to report events during encounters. For this use case, it is important to have transparency in the data source and data curation methodology in order to meet FDA auditability requirements. The FDA emphasizes the importance of data transparency in its recent guidance on using EHR-sourced real-world data and registry data. Another emerging use case is the growing number of AI companies leveraging EHR data to develop predictive models to detect disease earlier or predict major clinical events. We are also seeing a growing trend in developing and integrating quality initiatives with EHR data and technology. Lastly, we are seeing a trend of EHR data linked to complementary data sets, such as claims data, which can contextualize the whole patient journey.

There are a lot of EHR data providers in the landscape – EHR software companies, disease registries that extract data from many EHRs, or research networks of health systems. What are some considerations for buyers when choosing an EHR data provider?

The right EHR data source really depends on the intended use case, which determines the data variables needed. As an industry, we are fortunate to have an increasing number of options for RWD sources. Some of the evaluation criteria for EHR sources include:

Availability of the data variables needed for the analysis
Representativeness of the source to capture the relevant care providers and their treated patient population
Patient population size and whether it is large enough to yield statistically meaningful results for the study
Completeness and longitudinality of the data as it relates to the intended use case
Cleanliness and degree of normalization applied to the data
Auditability of source data as needed for regulatory use cases
Ease of contracting and cost to procure the data

Some general-purpose EHR systems are used by clinicians treating patients across a wide spectrum of inpatient or ambulatory settings, while other EHR systems capture elements specific to specialty areas. Specialty EHRs tend to have more depth of data in structured format relevant for specific disease areas such as oncology, cardiovascular, behavioral health, and dermatology. Academic medical centers or specialty hospitals (e.g. children’s hospitals, VA hospitals) may attract certain types of patients and treating providers, so EHR data from these care settings will reflect that patient and provider profile. EHR-derived registries usually capture data on patients with a specific disease and may have limited data fields unless manual abstraction is applied to capture more data elements in structured format. It is important to do your homework and evaluate the criteria listed above when considering different EHR data providers.

There are also trade-offs when considering whether to work with an EHR data originator that is closest to the point of care versus an EHR data aggregator. Data providers that are closest to the point of care may be able to provide source verification and auditability, which is needed for regulatory use cases. However, these sources may require data cleaning and standardization, which adds extra work for the researcher. Data aggregators with curated data sets can make the researcher’s job easier in terms of standardization of the data, although heavily curated data may limit some of the functionality and ability to detect differences across population groups.

We built the OMNY Health platform of curated research-ready de-identified clinical data sourced directly from a diverse set of provider organizations across the United States to make the process of EHR data selection and procurement both flexible and efficient from a contracting perspective. We designed our business operations and data models to address some of the considerations I mentioned above. For example, we work with specialty networks to pull out data that capture additional depth in disease-specific scores and measures that are not generally available. Our health system and specialty provider networks are active partners with OMNY to capture data to support a diverse set of retrospective and prospective research studies, as well as participate in quality initiatives.

How do you think about the use of unstructured data from EHRs? What is hindering further use of unstructured data which can provide a lot of research value?

Unstructured health data in EHR systems is a gold mine of information for understanding the ‘why’ of a patient’s diagnosis and treatment. Clinical notes capture the qualitative perspective from the patient, as well as the rationale behind health provider decisions in treatment selection and treatment changes. Common requests we receive are to understand reasons for changes in therapy, including dosing or therapy choice, documentation of genomic biomarkers, and availability of specific PRO measures. It is challenging to extract this information at scale today, although the advancement in natural language processing (NLP) capabilities is helping to increase the usability of this information. Once extracted, clinical notes are transformed into structured or semi-structured fields that are analysis-ready, and personal health information is removed before augmenting the transformed data with the structured data fields of the EHR.

As an industry, we have advanced quickly with NLP and ML to mine unstructured notes. This activity is scaling but we’re also moving beyond unstructured text to other sources of unstructured data, such as images. We’re starting to address the challenges of de-identifying and consuming imaging data so that they are analytically meaningful.
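As a rough illustration of the extraction and de-identification steps described above, the following is a minimal Python sketch, not OMNY Health's or any vendor's actual pipeline. It assumes simple regular expressions stand in for a trained NLP model; the patterns, the sample note, and the function names `redact` and `extract_therapy_change` are all hypothetical.

```python
import re

# Illustrative patterns only (assumption); real de-identification covers far more identifier types.
PHI_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "[PHONE]"),
    (re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE), "[MRN]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "[DATE]"),
]

# Hypothetical rule for one common request: the reason for a change in therapy.
CHANGE_PATTERN = re.compile(
    r"(?P<action>switched|discontinued|increased|decreased)\s+(?P<drug>[A-Za-z]+)"
    r".*?\b(?:due to|because of)\s+(?P<reason>[^.]+)",
    re.IGNORECASE,
)

def redact(note):
    """Replace each matched identifier with a placeholder."""
    for pattern, placeholder in PHI_PATTERNS:
        note = pattern.sub(placeholder, note)
    return note

def extract_therapy_change(note):
    """Turn the first therapy-change mention into a structured record, or None."""
    match = CHANGE_PATTERN.search(note)
    if match is None:
        return None
    return {
        "action": match.group("action").lower(),
        "drug": match.group("drug").lower(),
        "reason": match.group("reason").strip(),
    }

# Made-up note for demonstration.
note = ("MRN: 48213. Seen 03/04/2022. Discontinued methotrexate "
        "due to elevated liver enzymes. Call 555-123-4567.")
clean_note = redact(note)
structured = extract_therapy_change(clean_note)
print(clean_note)
print(structured)
```

Redacting before extracting means the structured record is built only from text that has already had identifiers removed, which mirrors the order described in the interview.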

What are you most excited for as the EHR data landscape continues to grow and evolve?

It’s exciting to see the growth of new data sources that can be used to connect the dots along the patient journey and outcomes. Linkages of EHR data to other data streams such as claims, registries, social determinants of health (SDOH) data, and now Internet of Things (IoT) data streams through tokenization are opening up new use cases as we strive to build a comprehensive picture of patient care and health outcomes. It is also exciting to contribute toward building out new data sources which incorporate information from under-represented patient populations to understand and address health inequity. I’m a firm believer that collaboration across the broader ecosystem will drive more evidence-based decisions and improve patient lives.
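To make the idea of linking data streams through tokenization concrete, here is a minimal Python sketch. It is not Datavant's tokenization method, whose details are not described in this post; it assumes a keyed hash (HMAC-SHA256) over normalized name and date of birth, and the key, field names, and sample records are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key (assumption); real systems manage keys far more carefully.
SECRET_KEY = b"example-key-not-for-production"
ID_FIELDS = ("first_name", "last_name", "dob")

def make_token(first_name, last_name, dob):
    """Derive a deterministic token from normalized identifiers."""
    normalized = "|".join(part.strip().lower() for part in (first_name, last_name, dob))
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

def tokenize(records):
    """Replace the identifying fields of each record with a token."""
    out = []
    for rec in records:
        token = make_token(rec["first_name"], rec["last_name"], rec["dob"])
        rest = {k: v for k, v in rec.items() if k not in ID_FIELDS}
        out.append({"token": token, **rest})
    return out

def link(ehr, claims):
    """Join two tokenized datasets on the shared token."""
    claims_by_token = {}
    for c in claims:
        claims_by_token.setdefault(c["token"], []).append(c)
    return [{**e, **c} for e in ehr for c in claims_by_token.get(e["token"], [])]

# Made-up records; the same person appears with different spacing and casing.
ehr_rows = tokenize([
    {"first_name": "Ana", "last_name": "Diaz", "dob": "1980-02-14", "hba1c": 7.9},
])
claim_rows = tokenize([
    {"first_name": " ana", "last_name": "DIAZ ", "dob": "1980-02-14", "paid_amount": 120.0},
])
linked_rows = link(ehr_rows, claim_rows)
print(linked_rows)
```

Because each source can be tokenized separately with the same key, the join happens on tokens alone and neither dataset has to expose names or birth dates to the other.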

Thank you for sharing your insights with me, Stacey!

If you would like to learn more, email Stacey Long at stacey@omnyhealth.com or Su Huang at su@datavant.com. Special thanks to Stella Chang (OMNY Health) and Elenee Argentinis (Datavant) for their review of this post.

Editor’s note: This post was updated on October 19, 2022 for accuracy and comprehensiveness.

Spotlight on AnalyticsIQ: Privacy Leadership in State De-Identification

AnalyticsIQ, a marketing data and analytics company, recently adopted Datavant’s state de-identification process to enhance the privacy of its SDOH datasets. By undergoing this privacy analysis prior to linking its data with other datasets, AnalyticsIQ has taken an extra step that could contribute to a more efficient Expert Determination (which is required when its data is linked with others in Datavant’s ecosystem).

AnalyticsIQ’s decision to adopt state de-identification standards underscores the importance of privacy in the data ecosystem. By addressing privacy challenges head-on, AnalyticsIQ and similar partners are poised to lead clinical research forward, providing datasets that are not only compliant with privacy requirements, but also ready for seamless integration into larger datasets.

"Stakeholders across the industry are seeking swift, secure access to high-quality, privacy-compliant SDOH data to drive efficiencies and improve patient outcomes,” says Christine Lee, head of health strategy and partnerships at AnalyticsIQ.

“By collaborating with Datavant to proactively perform state de-identification and Expert Determination on our consumer dataset, we help minimize potentially time-consuming steps upfront and enable partners to leverage actionable insights when they need them most. This approach underscores our commitment to supporting healthcare innovation while upholding the highest standards of privacy and compliance.”

Building Trust in Privacy-Preserving Data Ecosystems

As the regulatory landscape continues to evolve, Datavant’s state de-identification product offers an innovative tool for privacy officers and data custodians alike. By addressing both state-specific and HIPAA requirements, companies can stay ahead of regulatory demands and build trust across data partners and end-users. For life sciences organizations, this can lead to faster, more reliable access to the datasets they need to drive research and innovation while supporting high privacy standards.

As life sciences companies increasingly rely on SDOH data to drive insights, the need for privacy-preserving solutions grows. Data ecosystems like Datavant’s, which link real-world datasets while safeguarding privacy, are critical to driving innovation in healthcare. By integrating state de-identified SDOH data, life sciences organizations can gain a more comprehensive view of patient populations, uncover social factors that impact health outcomes, and ultimately guide clinical research that improves health.

The Power of SDOH Data with Providers and Payers to Close Gaps in Care

Both payers and providers are increasingly utilizing SDOH data to enhance care delivery and improve health equity. By incorporating SDOH data into their strategies, both groups aim to deliver more personalized care, address disparities, and better understand the social factors affecting patient outcomes.

Payers Deploy Targeted Care Using SDOH Data

Payers increasingly leverage SDOH data to meet health equity requirements and enhance care delivery:

Tailored Member Programs: Payers develop specialized initiatives like nutrition delivery services and transportation to and from medical appointments.
Identifying Care Gaps: SDOH data helps payers identify gaps in care for underserved communities, enabling strategic in-home assessments and interventions.
Future Risk Adjustment Models: The Centers for Medicare & Medicaid Services (CMS) plans to incorporate SDOH-related Z codes into risk adjustment models, recognizing the significance of SDOH data in assessing healthcare needs.

Payers’ consideration of SDOH underscores their commitment to improving health equity, delivering targeted care, and addressing disparities for vulnerable populations.

Example: CDPHP supports physical and mental wellbeing with non-medical assistance

Capital District Physicians’ Health Plan (CDPHP) incorporated SDOH, partnering with Papa, to combat loneliness and isolation in older adults, families, and other vulnerable populations. CDPHP aimed to address:

Social isolation
Loneliness
Transportation barriers
Gaps in care

By integrating SDOH data, CDPHP enhanced its services to deliver comprehensive care for its Medicare Advantage members.

Providers Optimize Value-Based Care Using SDOH Data

Value-based care organizations face challenges in fully understanding their patient panels. SDOH data helps providers address these challenges and improve patient care. Here are some examples of how:

Onboard Patients Into Care Programs: Providers use SDOH data to identify patients who require additional support and connect them with appropriate resources.
Stratify Patients by Risk: SDOH data combined with clinical information identifies high-risk patients, enabling targeted interventions and resource allocation.
Manage Transition of Care: SDOH data informs post-discharge plans, considering social factors to support smoother transitions and reduce readmissions.

By leveraging SDOH data, providers gain a more comprehensive understanding of their patient population, leading to more targeted and personalized care interventions.

While accessing SDOH data offers significant advantages, challenges can arise from:

Lack of Interoperability and Uniformity: Data exists in fragmented sources like electronic health records (EHRs), public health databases, social service systems, and proprietary databases. Integrating and securing data while ensuring data integrity and confidentiality can be complex, resource-intensive and risky.
Lag in Payer Claims Data: Payers can take weeks or months to release claims data. This delays informed decision-making, care improvement, analysis, and performance evaluation.
Incomplete Data Sets in Health Information Exchanges (HIEs): Not all healthcare providers or organizations participate in HIEs. This reduces the available data pool. Moreover, varying data sharing policies result in data gaps or inconsistencies.

To overcome these challenges, providers must have robust data integration strategies, standardization efforts, and access to health data ecosystems to ensure comprehensive and timely access to SDOH data.

SDOH data holds immense potential in transforming healthcare and addressing health disparities.

With Datavant, healthcare organizations are securely accessing SDOH data and further enhancing the efficiency of their datasets through state de-identification capabilities, empowering stakeholders across the industry to make data-driven decisions that drive care forward.