By Kathleen Gavin, Karin Eisinger
In the first half of 2023, the White House, the FDA, and the NIH all highlighted the importance of advanced data ecosystems to accelerate major achievements in human healthcare and research. Their publications point to federal recognition that we have reached a critical tipping point for data connectivity. The White House, in its National Strategy to Advance Privacy-Preserving Data Sharing and Analytics, laid out a proposed path to advance privacy-preserving data sharing and analytics (PPDSA), with the overarching goal of catalyzing American innovation and creativity by facilitating data linkage. Similarly, President Biden’s Bold Goals for U.S. Biotechnology and Biomanufacturing describes a data initiative with the goal of ensuring high-quality, wide-ranging, easily accessible, and secure biological datasets aimed at driving breakthroughs for the U.S. bioeconomy.
The White House is not alone in recognizing the critical importance of these linked datasets. In March, the FDA released its draft guidance, Clinical Trial Considerations to Support Accelerated Approval of Oncology Therapeutics, recognizing that the confirmatory studies required postmarketing to verify clinical benefit are often not submitted on time, or at all. In addition to recommending randomized controlled trials (RCTs) as the preferred approach to support an accelerated approval application, the agency holds that RCT participants should undergo long-term follow-up studies for verification of clinical benefit. This is a direct example of where data connectivity, in this case between a clinical trial and real-world health data, could play a central role in fulfilling long-term follow-up requirements by automating data collection, reducing participant and site burden, minimizing study costs, and limiting attrition, ultimately facilitating advances and hastening timelines for regulatory submissions.
With this example at the forefront, it is clear that operationalizing data linkage and interoperability while preserving privacy is a critical next step in maximizing the use of healthcare data.
Currently, multiple federally funded programs are developing data ecosystems by therapeutic area. For example, the Cancer Research Data Commons (CRDC) is the National Cancer Institute (NCI)-supported, cloud-based data science infrastructure aimed at facilitating data connectivity to drive discovery, surveillance, and clinical care in oncology. The CRDC, like other NIH-sponsored data repositories, is a reliable central hub for research-grade datasets, such as proteomic, genomic, imaging, and clinical trial data. Even so, the NCI recognizes that a more robust data ecosystem is necessary for meeting the ambitious goals of the Cancer Moonshot, and it published its National Cancer Plan in April. The plan provides a framework for collaboration across government and society and establishes eight goals that must be achieved for the Cancer Moonshot to be successful. Maximizing data utility is goal number seven, aspiring to a future where “secure sharing of privacy-protected health data is standard practice throughout research, and researchers share and use available data to achieve rapid progress against cancer.” To grow the Cancer Moonshot Data Ecosystem to this ambitious potential, driving scientific discovery and speeding the translation of precision medicine into clinical practice, data connectivity across both research and real world data (RWD) sources will be necessary, and it must protect patient privacy so the data can be broadly used.
Privacy Preserving Record Linkage Integration
Integrating a privacy preserving record linkage (PPRL) tool is well positioned to be a critical facilitator of these data sharing goals. PPRL allows secure and private linkage of data for an individual across different datasets, typically by matching records on encrypted or hashed tokens rather than on the identifying information itself. This approach could accelerate data connectivity among historically unlinkable datasets as well as facilitate availability of data to the broader research and clinical community. Precedent for PPRL in federally funded programming was set by the NCATS National COVID Cohort Collaborative (N3C), which created the largest national, publicly available patient-level limited dataset in U.S. history, harmonizing electronic health record (EHR) data from hundreds of health systems across the U.S. The N3C has unlocked numerous important insights into COVID-19, an example of what could be possible with similar data interoperability integrations in other therapeutic areas.
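To make the idea of linking records without exchanging identities more concrete, here is a minimal sketch of one common PPRL pattern: each data holder derives a token from normalized identifiers using a keyed hash, and records are then matched on the token alone. This is an illustration under stated assumptions, not the method used by N3C or any particular vendor. The field names, the sample records, and the `SHARED_KEY` value are all hypothetical, and a production system would add much more (key management, fuzzy matching, multiple token schemes).

```python
import hashlib
import hmac

# Hypothetical secret shared by the participating data holders (assumption).
SHARED_KEY = b"hypothetical-shared-secret"


def make_token(first_name, last_name, dob, secret_key):
    """Derive a linkage token from normalized identifiers with HMAC-SHA256."""
    # Normalize so trivial differences in case or spacing do not break a match.
    normalized = "|".join(
        part.strip().lower() for part in (first_name, last_name, dob)
    )
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize(records, secret_key):
    """Map each record's token to its non-identifying data, dropping the identifiers."""
    # Simplification: if two records share a token, the later one overwrites the earlier.
    tokenized = {}
    for rec in records:
        token = make_token(rec["first_name"], rec["last_name"], rec["dob"], secret_key)
        tokenized[token] = rec["data"]
    return tokenized


def link(tokenized_a, tokenized_b):
    """Join two tokenized datasets on the tokens they have in common."""
    shared_tokens = tokenized_a.keys() & tokenized_b.keys()
    return {t: (tokenized_a[t], tokenized_b[t]) for t in shared_tokens}


# Illustrative records only; each holder tokenizes its own data locally.
trial_records = [
    {"first_name": "Ana", "last_name": "Diaz", "dob": "1980-02-14",
     "data": {"arm": "treatment"}},
]
claims_records = [
    {"first_name": " ana ", "last_name": "DIAZ", "dob": "1980-02-14",
     "data": {"claims": 3}},
    {"first_name": "Bo", "last_name": "Chen", "dob": "1975-09-30",
     "data": {"claims": 1}},
]

linked = link(tokenize(trial_records, SHARED_KEY), tokenize(claims_records, SHARED_KEY))
print(list(linked.values()))
```

Only the tokenized output would leave each data holder in this sketch, so the party performing the linkage sees the trial arm and the claims count for the matched individual but never the name or date of birth behind them.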
As of January 25, 2023, the NIH officially requires data sharing for NIH-funded research. The intention of this policy, similar to the missions described above, is to accelerate discovery and promote data reuse in future research studies, which should be an exciting step for clinical research. However, unless data is shared in a form that can truly be reused and linked reliably, repeatably, and securely, most clinical research data will remain siloed in data repositories with no way of realizing its full value.
Implementing PPRL for federally funded clinical research results as part of the data sharing process would activate research data in a whole new way. Patients or study participants can (and should when possible) still provide informed consent for sharing their anonymized data using PPRL, which keeps the public informed of the important use cases for data sharing and ideally builds trust in its utility. Using this model, for the first time, linkage of clinical trial data and basic biology studies with RWD sources such as social determinants of health (SDOH) data; environmental, lifestyle, and other phenotypic data; EHR; claims; pharmacy; and digital wearable or remote patient monitoring data could generate truly novel health insights and spur discovery. Enabling clinically actionable patient classification, diagnosis, and therapy; discovering new personalized health biomarkers and therapeutic strategies; demonstrating their safety and efficacy; and implementing them in clinical practice rapidly and cost-effectively all require RWD linkage. PPRL is the most secure and reliable approach to meet this need.
It has never been more evident to physicians, scientists, and the federal government that data connectivity will play an important role in accelerating federally funded human health initiatives. Now comes the challenge of coming together on a unified data initiative that truly addresses the goals and disparate needs of federally funded programs to maximize the use of the rapidly expanding health data ecosystem.
AnalyticsIQ, a marketing data and analytics company, recently adopted Datavant’s state de-identification process to enhance the privacy of its SDOH datasets. By undergoing this privacy analysis prior to linking its data with other datasets, AnalyticsIQ has taken an extra step that could contribute to a more efficient Expert Determination (which is required when its data is linked with others in Datavant’s ecosystem).
AnalyticsIQ’s decision to adopt state de-identification standards underscores the importance of privacy in the data ecosystem. By addressing privacy challenges head-on, AnalyticsIQ and similar partners are poised to lead clinical research forward, providing datasets that are not only compliant with privacy requirements, but also ready for seamless integration into larger datasets.
“Stakeholders across the industry are seeking swift, secure access to high-quality, privacy-compliant SDOH data to drive efficiencies and improve patient outcomes,” says Christine Lee, head of health strategy and partnerships at AnalyticsIQ.
“By collaborating with Datavant to proactively perform state de-identification and Expert Determination on our consumer dataset, we help minimize potentially time-consuming steps upfront and enable partners to leverage actionable insights when they need them most. This approach underscores our commitment to supporting healthcare innovation while upholding the highest standards of privacy and compliance.”
As the regulatory landscape continues to evolve, Datavant’s state de-identification product offers an innovative tool for privacy officers and data custodians alike. By addressing both state-specific and HIPAA requirements, companies can stay ahead of regulatory demands and build trust across data partners and end-users. For life sciences organizations, this can lead to faster, more reliable access to the datasets they need to drive research and innovation while supporting high privacy standards.
As life sciences companies increasingly rely on SDOH data to drive insights, the need for privacy-preserving solutions grows. Data ecosystems like Datavant’s, which link real-world datasets while safeguarding privacy, are critical to driving innovation in healthcare. By integrating state de-identified SDOH data, life sciences organizations can gain a more comprehensive view of patient populations, uncover social factors that impact health outcomes, and ultimately guide clinical research that improves health.
Both payers and providers are increasingly utilizing SDOH data to enhance care delivery and improve health equity. By incorporating SDOH data into their strategies, both groups aim to deliver more personalized care, address disparities, and better understand the social factors affecting patient outcomes.
Payers increasingly leverage SDOH data to meet health equity requirements and enhance care delivery:
Payers’ consideration of SDOH underscores their commitment to improving health equity, delivering targeted care, and addressing disparities for vulnerable populations.
Capital District Physicians’ Health Plan (CDPHP) incorporated SDOH data, partnering with Papa to combat loneliness and isolation in older adults, families, and other vulnerable populations. CDPHP aimed to address:
By integrating SDOH data, CDPHP enhanced its services to deliver comprehensive care for its Medicare Advantage members.
Value-based care organizations face challenges in fully understanding their patient panels. SDOH data helps providers address these challenges and improve patient care. Here are some examples of how:
By leveraging SDOH data, providers gain a more comprehensive understanding of their patient population, leading to more targeted and personalized care interventions.
While accessing SDOH data offers significant advantages, challenges can arise from:
To overcome these challenges, providers must have robust data integration strategies, standardization efforts, and access to health data ecosystems to ensure comprehensive and timely access to SDOH data.
With Datavant, healthcare organizations are securely accessing SDOH data and further enhancing the efficiency of their datasets through state de-identification capabilities, empowering stakeholders across the industry to make data-driven decisions that drive care forward.