Datavant Incorporates Leading Referential Data Source for Privacy-Preserving Enterprise Identity Resolution

Publish Date: May 16, 2023

Datavant is proud to announce the integration of a new referential dataset into Datavant Match, the industry-leading privacy-preserving record linkage (PPRL) solution for identity resolution. Match addresses data fragmentation with end-to-end privacy technology and advanced machine learning algorithms. Whether you’re running a long-term safety study, public health research on health equity, or a medication adherence analytics exercise, highly accurate matching is critical. Referential data improves the quality and number of matches in a given linked dataset, and enables reliable identity resolution between disparate datasets across the enterprise.

Real-world data linkage challenges often revolve around matching strategy

Real-world data is highly variable: data is collected and standardized in different ways, and personal identifiers naturally change due to life events. These challenges are important to solve because the quality of data linkage can determine the success or failure of entire studies. Organizations often struggle to build a comprehensive matching strategy. Simple algorithms produce many false positive and false negative matches, while complex strategies are difficult to scale. On top of that, records must be linked in a way that preserves privacy.

Datavant has built expertise in data standardization, high-accuracy matching, and patient privacy to solve these problems. Our customers can use the highly stable identifier that Match generates, the Datavant ID (DVID), as the source of truth that defines which records across disparate datasets correspond to the same individual. The DVID simplifies enterprise-wide identity resolution by tagging every record with a person-level identifier, so that records belonging to the same individual share the same DVID.

DVIDs are optimized to remain stable, so customers rarely need to update them, which reduces manual work. Whenever Match encounters an individual it has already detected, it assigns that individual the same DVID. After the one-time setup, Match runs automatically when it sees new records, assigns DVIDs, and distributes those DVIDs at standard intervals.
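The stability property described above can be illustrated with a toy registry. This is a minimal sketch and not Datavant's implementation: the class name `StableIdRegistry`, the `ID-000001` format, and the `person-a` style keys are all hypothetical, and it assumes an upstream matching step has already decided which records belong to the same individual.

```python
import itertools


class StableIdRegistry:
    """Toy registry: one identifier per resolved individual, reused on every run."""

    def __init__(self):
        self._assigned = {}                 # identity key -> identifier
        self._counter = itertools.count(1)  # source of new identifier numbers

    def assign(self, identity_key):
        # Reuse the existing identifier when this individual was seen before;
        # mint a new one only for a previously unseen individual.
        if identity_key not in self._assigned:
            self._assigned[identity_key] = f"ID-{next(self._counter):06d}"
        return self._assigned[identity_key]


registry = StableIdRegistry()

# First batch: two records for person-a, one for person-b.
first_batch = [registry.assign(k) for k in ["person-a", "person-b", "person-a"]]

# Later batch: person-a reappears alongside a new individual.
second_batch = [registry.assign(k) for k in ["person-c", "person-a"]]

print(first_batch)
print(second_batch)
```

In this sketch, `person-a` keeps the identifier it received in the first batch when it reappears in the second, which is the behavior that spares downstream datasets from re-keying.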

What is referential data?

Referential data is a term used to describe a high-quality, comprehensive dataset that can be used to make inferences about other datasets. In this case, Match leverages a dataset with 30+ years of historical information, covering the entire United States population and containing billions of records. Using public records, online data, and proprietary sources alongside rigorous data hygiene practices, this curated dataset tracks changes in personal identifiers for individuals across time. Datavant Match maps tokenized records to tokenized identities in the referential dataset, linking records and assigning DVIDs with high accuracy.

Most matching strategies struggle to link records when the underlying personally identifiable information (PII) is completely different (e.g., a last name changes through marriage, a person moves to a new address, or a recorded gender changes). With referential data, however, Match can make probabilistic inferences to accurately link records even when the underlying PII has changed. Previously, high-precision matching may have meant far fewer matches. Referential data mitigates this trade-off by tracking personal identifiers over time, so that as many records as possible are linked and cohorts are maximized. The result is that Match uncovers more highly accurate matches.
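To make the idea concrete, here is a simplified sketch of how a history of identifiers can link records whose PII has changed. It is an illustration under stated assumptions, not Datavant's method: it uses an exact lookup on salted SHA-256 hashes rather than the probabilistic inference described above, and the salt, the `tokenize` and `resolve` helpers, and the sample identities are all hypothetical.

```python
import hashlib

SALT = "demo-salt"  # hypothetical; a real system would use managed secret keys


def tokenize(first, last, dob):
    """Hash normalized identifiers so raw PII never enters the matching step."""
    normalized = f"{first.strip().lower()}|{last.strip().lower()}|{dob}"
    return hashlib.sha256((SALT + normalized).encode("utf-8")).hexdigest()


# Hypothetical referential history: each identity lists the identifier
# combinations it has had over time (for example, before and after marriage).
REFERENTIAL_HISTORY = {
    "identity-001": [("Ana", "Lopez", "1985-02-11"), ("Ana", "Reed", "1985-02-11")],
    "identity-002": [("Sam", "Chu", "1990-07-30")],
}

# Index every historical token back to the identity it belongs to.
TOKEN_TO_IDENTITY = {
    tokenize(*identifiers): identity
    for identity, history in REFERENTIAL_HISTORY.items()
    for identifiers in history
}


def resolve(first, last, dob):
    """Return the identity for a record, or None when no token matches."""
    return TOKEN_TO_IDENTITY.get(tokenize(first, last, dob))


old_record = resolve("Ana", "Lopez", "1985-02-11")   # record under the earlier name
new_record = resolve(" ana ", "REED", "1985-02-11")  # later name, messy formatting
unknown = resolve("Pat", "Doe", "2000-01-01")        # not in the referential data

print(old_record, new_record, unknown)
```

Without the second entry in the history for `identity-001`, the two records for Ana would hash to unrelated tokens and never link. The referential history is what connects them.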

Datavant Match powers scientifically rigorous use cases

Match drives high accuracy in data linkage, achieving 99% precision and 95% recall in internal studies. Use cases such as external control arm development and real-world evidence generation for drug efficacy depend on highly accurate matching: false positive matches result in inaccurate patient tracking, jeopardizing the quality of the study. Achieving both high precision and high recall means there is less of a trade-off between match quality and match volume. With Match, pharmaceutical companies are better poised to find an adequate number of patients for rare disease studies; providers can more accurately track the effect of interventions on their patient population; and researchers can confidently conduct longitudinal studies, mitigating the risk of dropping patients due to the variable, fragmented nature of real-world data. Meanwhile, organizations simplify their datasets by using the DVID as the source-of-truth identifier for individuals. Leveraging the referential dataset, Datavant Match drives high-quality patient matching and innovation across siloed verticals in healthcare.
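For readers less familiar with the two metrics, precision is the share of reported matches that are correct, and recall is the share of true matches that were found. The sketch below shows the arithmetic. The counts are invented for illustration, chosen only so that the results come out to 99% and 95%, and are not taken from Datavant's internal studies.

```python
def precision(true_positives, false_positives):
    """Share of reported matches that are correct."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Share of true matches that the linkage actually found."""
    return true_positives / (true_positives + false_negatives)


# Illustrative counts only (hypothetical, not from any real evaluation).
true_positives = 9405   # correct links reported
false_positives = 95    # links reported between different individuals
false_negatives = 495   # true links that were missed

p = precision(true_positives, false_positives)
r = recall(true_positives, false_negatives)

print(f"precision={p:.2%} recall={r:.2%}")
```

In these made-up numbers, a stricter matching rule that removed the 95 false positives would typically also miss more true links, which is the trade-off the paragraph above refers to.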

Beyond single use cases, Match enables enterprise-wide identity resolution

Datavant Match goes beyond record matching for isolated use cases: it also enables reliable identity resolution between disparate datasets across the enterprise. For organizations with many identified datasets, sharing across the enterprise is often not possible due to privacy and security regulations. Even when data can be shared, creating an identified matching strategy is difficult if certain datasets have incomplete or disjointed PII fields. This creates data silos within organizations. By tokenizing and matching these datasets through Match, organizations remove the ambiguity from matching and create unified enterprise data assets. Using a highly stable identifier (DVID) across all datasets leads to a more thorough understanding of populations of interest, unlocking new trial participants, opportunities for targeted marketing campaigns, longitudinal tracking, and cross-disciplinary research.

Get started with Match

With referential data, advanced machine learning, and proprietary data standardization and cleaning methods, Datavant Match immediately adds value by surfacing more, higher-quality matches. Datavant Match is used by top pharma companies, data analytics companies, employers, non-profit and academic institutions, payers, and providers to compliantly match patient records across a range of use cases.

To explore how Match could power your use case or enterprise-wide data interoperability goals, contact our team today.

Spotlight on AnalyticsIQ: Privacy Leadership in State De-Identification

AnalyticsIQ, a marketing data and analytics company, recently adopted Datavant’s state de-identification process to enhance the privacy of its SDOH datasets. By undergoing this privacy analysis prior to linking its data with other datasets, AnalyticsIQ has taken an extra step that could contribute to a more efficient Expert Determination (which is required when its data is linked with others in Datavant’s ecosystem).

AnalyticsIQ’s decision to adopt state de-identification standards underscores the importance of privacy in the data ecosystem. By addressing privacy challenges head-on, AnalyticsIQ and similar partners are poised to lead clinical research forward, providing datasets that are not only compliant with privacy requirements, but also ready for seamless integration into larger datasets.

“Stakeholders across the industry are seeking swift, secure access to high-quality, privacy-compliant SDOH data to drive efficiencies and improve patient outcomes,” says Christine Lee, head of health strategy and partnerships at AnalyticsIQ.

“By collaborating with Datavant to proactively perform state de-identification and Expert Determination on our consumer dataset, we help minimize potentially time-consuming steps upfront and enable partners to leverage actionable insights when they need them most. This approach underscores our commitment to supporting healthcare innovation while upholding the highest standards of privacy and compliance.”

Building Trust in Privacy-Preserving Data Ecosystems

As the regulatory landscape continues to evolve, Datavant’s state de-identification product offers an innovative tool for privacy officers and data custodians alike. By addressing both state-specific and HIPAA requirements, companies can stay ahead of regulatory demands and build trust across data partners and end-users. For life sciences organizations, this can lead to faster, more reliable access to the datasets they need to drive research and innovation while supporting high privacy standards.

As life sciences companies increasingly rely on SDOH data to drive insights, the need for privacy-preserving solutions grows. Data ecosystems like Datavant’s, which link real-world datasets while safeguarding privacy, are critical to driving innovation in healthcare. By integrating state de-identified SDOH data, life sciences organizations can gain a more comprehensive view of patient populations, uncover social factors that impact health outcomes, and ultimately guide clinical research that improves health.

The Power of SDOH Data with Providers and Payers to Close Gaps in Care

Both payers and providers are increasingly utilizing SDOH data to enhance care delivery and improve health equity. By incorporating SDOH data into their strategies, both groups aim to deliver more personalized care, address disparities, and better understand the social factors affecting patient outcomes.

Payers Deploy Targeted Care Using SDOH Data

Payers increasingly leverage SDOH data to meet health equity requirements and enhance care delivery:

  • Tailored Member Programs: Payers develop specialized initiatives like nutrition delivery services and transportation to and from medical appointments.
  • Identifying Care Gaps: SDOH data helps payers identify gaps in care for underserved communities, enabling strategic in-home assessments and interventions.
  • Future Risk Adjustment Models: The Centers for Medicare & Medicaid Services (CMS) plans to incorporate SDOH-related Z codes into risk adjustment models, recognizing the significance of SDOH data in assessing healthcare needs.

Payers’ consideration of SDOH underscores their commitment to improving health equity, delivering targeted care, and addressing disparities for vulnerable populations.

Example: CDPHP supports physical and mental wellbeing with non-medical assistance

Capital District Physicians’ Health Plan (CDPHP) incorporated SDOH data and partnered with Papa to combat loneliness and isolation in older adults, families, and other vulnerable populations. CDPHP aimed to address:

  • Social isolation
  • Loneliness
  • Transportation barriers
  • Gaps in care

By integrating SDOH data, CDPHP enhanced its services to deliver comprehensive care for its Medicare Advantage members.

Providers Optimize Value-Based Care Using SDOH Data

Value-based care organizations face challenges in fully understanding their patient panels. SDOH data helps providers address these challenges and improve patient care. Here are some examples:

  • Onboard Patients Into Care Programs: Providers use SDOH data to identify patients who require additional support and connect them with appropriate resources.
  • Stratify Patients by Risk: SDOH data combined with clinical information identifies high-risk patients, enabling targeted interventions and resource allocation.
  • Manage Transition of Care: SDOH data informs post-discharge plans, considering social factors to support smoother transitions and reduce readmissions.

By leveraging SDOH data, providers gain a more comprehensive understanding of their patient population, leading to more targeted and personalized care interventions.

While accessing SDOH data offers significant advantages, challenges can arise from:

  • Lack of Interoperability and Uniformity: Data exists in fragmented sources like electronic health records (EHRs), public health databases, social service systems, and proprietary databases. Integrating and securing data while ensuring data integrity and confidentiality can be complex, resource-intensive, and risky.
  • Lag in Payer Claims Data: Payers can take weeks or months to release claims data. This delays informed decision-making, care improvement, analysis, and performance evaluation.
  • Incomplete Data Sets in Health Information Exchanges (HIEs): Not all healthcare providers or organizations participate in HIEs. This reduces the available data pool. Moreover, varying data sharing policies result in data gaps or inconsistencies.

To overcome these challenges, providers must have robust data integration strategies, standardization efforts, and access to health data ecosystems to ensure comprehensive and timely access to SDOH data.

SDOH data holds immense potential in transforming healthcare and addressing health disparities. 

With Datavant, healthcare organizations securely access SDOH data and further enhance the efficiency of their datasets through state de-identification capabilities, empowering stakeholders across the industry to make data-driven decisions that drive care forward.
