In our Ecosystem Explorer Series, we interview leaders from organizations who are advancing access to health data. Today’s interview is with Cliff Li, Senior Director of Consulting, and Bob Morrison, Vice President of Data Analytics at Clarivate.
Cliff Li is a Partner on the Commercial Strategy and Market Access team at Clarivate and leads the US RWD service line. Cliff acts as a Subject Matter Expert on a wide range of issues related to US pricing, coverage, brand strategy, and market access dynamics. Cliff’s expertise in patient-longitudinal analytics and leveraging a wide range of data assets allows him to assist clients in executing more sophisticated engagements to help develop actionable, value-driven strategies and tactics. Cliff holds an MPH in Health Policy & Law.
Bob Morrison brings 30+ years of consulting, technical product development, and senior leadership experience in healthcare, technology, and digital marketing. He brings 12+ years of experience designing, building, and operating high-volume patient data integration workflows linking disparate patient data types for high-impact biopharma analytics and commercial monitoring. He has led the data workflow build, operation, and client support for Symphony Health, DRG, and Clarivate. He holds an MBA from the University of Rochester.
Clarivate™ is a leading global provider of transformative intelligence. Clarivate offers enriched data, insights & analytics, workflow solutions, and expert services in the areas of Academia & Government, Intellectual Property, and Life Sciences & Healthcare. Clarivate’s connected data, deep expertise, and intelligence platforms empower Life Sciences and healthcare companies to deliver safe, effective, and commercially successful treatments to patients faster. Clarivate is home to Cortellis™, solutions for real-world data, medtech, market access and commercialization, and deep consulting expertise.
Cliff and Bob, thank you for joining us today. Starting with the basics, how do you define claims data?
Cliff: Claims data is information that captures services that occurred at the point of care - whether that be at a pharmacy, hospital, medical office, or other place of service - and also captures how it was adjudicated by payers. The power of claims data is the granularity it provides on the patient, payer, and treating provider.
What are the differences between open and closed claims, and why would a researcher choose one versus the other?
Bob: Open claims data include both medical claims and pharmacy claims primarily sourced from clearinghouses, pharmacies, and software platforms. Since open claims encompass multiple data types, they offer insights into patient touchpoints across the healthcare landscape with no limitations on timeframe. They are payer agnostic, so patients will not be lost when switching insurance plans. The recency of the data is also a differentiator: many medical, pharmacy, and lab visits within open claims data can be seen as recently as yesterday, offering near-real-time reporting and tracking.
Closed claims data, on the other hand, are derived from the insurance provider (or payer) and capture nearly all events that occur during a patient’s enrollment period, including medical and pharmacy visits and transactions for both retail and specialty settings. This provides a valuable view into the patient journey and connects patients’ diagnoses, actions, and decisions along the way with minimal gaps.
Cliff: Open data can be leveraged in ways that closed data is most commonly used, such as health outcomes or healthcare resource utilization studies, through patient stability panels, which allow the user to benefit from the scope and scale of an open dataset but provide strict continuous enrollment criteria to “close” the patient sample.
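As a rough illustration of the stability-panel idea Cliff describes, the sketch below keeps only patients who show activity in every quarter of a study window. The rule used here (at least one claim per calendar quarter) and the names `quarter_index` and `stable_panel` are assumptions made for this example; they are not Clarivate's actual criteria, which the interview does not specify.

```python
from datetime import date

def quarter_index(d):
    """Map a date to a running quarter number so quarters can be compared."""
    return d.year * 4 + (d.month - 1) // 3

def stable_panel(claims, start, end):
    """Return patient IDs with at least one claim in every quarter of [start, end].

    `claims` is an iterable of (patient_id, service_date) pairs.
    The per-quarter activity rule is a hypothetical stand-in for
    continuous enrollment, which open claims do not record directly.
    """
    required = set(range(quarter_index(start), quarter_index(end) + 1))
    seen = {}
    for patient_id, service_date in claims:
        if start <= service_date <= end:
            seen.setdefault(patient_id, set()).add(quarter_index(service_date))
    # Keep patients whose observed quarters cover every required quarter.
    return sorted(p for p, quarters in seen.items() if required <= quarters)

# Toy data: patient A is active in all four quarters of 2023, patient B is not.
claims = [
    ("A", date(2023, 1, 15)), ("A", date(2023, 5, 2)),
    ("A", date(2023, 8, 20)), ("A", date(2023, 11, 30)),
    ("B", date(2023, 2, 1)), ("B", date(2023, 12, 5)),
]
panel = stable_panel(claims, date(2023, 1, 1), date(2023, 12, 31))
```

In this toy data only patient A would remain in the panel. A real panel would likely use stricter or data-source-specific activity rules.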
Closed databases have some limitations. Closed claims data have at least a 90-day data lag and tend to be limited to specific enrolled populations, which means certain insurers or employers. This can lead to bias in analysis unless handled correctly. For instance, if a patient switches to a different healthcare plan, they may either disappear from the database or appear as a new patient in the new plan's database.
Another limitation of closed databases is that they do not provide any information about any supplementary insurance or services that may have been paid for by cash.
With so many data types available to researchers, what makes claims data uniquely valuable to researchers? What are some common and emerging applications of de-identified claims data?
Cliff: Longitudinal de-identified claims data is rich in granularity as it not only provides insight into a patient’s healthcare journey but also demographics related to the patient, demographics of the treating providers and the affiliated health systems, as well as the payer responsible for adjudicating and reimbursing claims. Other consumer industries may be touted for knowing their customer through data, but the healthcare industry has the richest source of information on patients through the power of claims data.
Common use cases of claims data within the biopharma industry are to generate evidence for emerging therapies, disease and market landscape assessments, patient journeys, and economic value stories related to patient access and health outcomes. It’s also used to measure market performance of a treatment and to stand up commercial programs to assist patients in getting access to treatments and promotional activities for relevant key stakeholders, such as treating physicians and specialists.
Many of the emerging applications of de-identified claims data are found in the integration with other data streams, such as lab, genomic, consumer, wearable, and exposure data, among others. An example of how this is creating innovative use cases for biopharma is with measuring ROI for digital promotion campaigns. By linking exposure metrics to target audiences within real-world data, companies can more effectively measure their marketing spend by prospectively looking at the impact of the campaigns on capturing new patients and keeping patients on therapy for longer.
Integration also allows biopharma to “fill in the gaps” of the patient journey and, as such, allows for more features to be leveraged in machine learning models to identify undiagnosed patients, specifically those with rare diseases, as well as drive high-value targeting models and other predictive analytics.
Speaking of ‘filling the gaps’ and creating a longitudinal view of the patient, are there certain types of data that are particularly powerful for fulfilling those applications when linked with claims data?
Cliff: Healthcare data is so rich at the moment with the growth in precision medicine, digital therapeutics, and wearable devices — when coupled with the wealth of information available in lab / genetic testing data and unstructured EMR, claims data can now provide a much broader picture of the patient’s diagnostic and treatment pathway, as well as assessing health outcomes. Having this enhanced view will enrich longitudinal patient studies that will evolve the way biopharma is demonstrating value to key stakeholders in meeting unmet clinical needs, increasing market access to needed treatments, and improving health outcomes.
Can you provide real-world examples of how Clarivate's connected, de-identified claims data has contributed to advancing our understanding of health and disease?
Bob: Recently, a global pharma company was seeking to understand the epidemiology of disease and treatment patterns for patients with a chronic autoimmune neuromuscular disease that causes weakness in the skeletal muscles, which are responsible for breathing and moving parts of the body, including the arms, legs, facial muscles, and others.
The primary objective of this study was to leverage RWD to refine the literature estimates of the indication prevalence (37K to 112K) and provide insight into drug treatment patterns.
Our team helped the client in understanding the overall diagnosed prevalence in the U.S., by age, gender, subtypes, capturing annual diagnosed incidence, and treatment patterns by class and line of therapy.
Read more about the way Clarivate Real-World Data fuels research breakthroughs: Peer-reviewed publication highlights featuring Real-World Data - Clarivate.
Bringing together data from multiple sources comes with its own set of challenges. How does Clarivate ensure the quality and consistency of data across different sources and healthcare systems, given the inherent heterogeneity in claims data?
Bob: There are a few processes we execute to ensure the quality and consistency of data across multiple sources:
All three processes are critical to ensuring that data blended from multiple sources is ready for analytic use. However, the QA suspense process is particularly effective in that it allows us to over-audit up front and then add back records that are found to be or become valid (e.g. invalid NDC due to lagged reference data).
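The QA suspense idea can be sketched in a few lines: records that fail an audit are held aside instead of being discarded, then re-audited when the reference data catches up. This is a minimal sketch under stated assumptions. The function name `audit`, the record fields, and the NDC values are hypothetical placeholders, not Clarivate's actual workflow or real drug codes.

```python
def audit(records, valid_ndcs):
    """Split records into accepted and suspended using the current NDC reference."""
    accepted, suspended = [], []
    for rec in records:
        if rec["ndc"] in valid_ndcs:
            accepted.append(rec)
        else:
            # Hold the record in suspense instead of deleting it.
            suspended.append(rec)
    return accepted, suspended

# Placeholder records; the NDC strings are made up for illustration.
records = [
    {"claim_id": 1, "ndc": "11111-1111-11"},
    {"claim_id": 2, "ndc": "22222-2222-22"},
]

# First pass: the reference data lags and does not yet list the second code.
reference_v1 = {"11111-1111-11"}
accepted, suspended = audit(records, reference_v1)

# Later pass: the reference is refreshed, so suspended records are re-audited.
reference_v2 = reference_v1 | {"22222-2222-22"}
released, still_suspended = audit(suspended, reference_v2)
accepted.extend(released)
```

After the second pass both claims are in `accepted` and nothing remains in suspense, which mirrors the "add back" step described above.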
How do you address privacy concerns surrounding the use of de-identified open claims data? Are there specific measures in place to ensure patient privacy and data security throughout the research process?
Bob: Clarivate only receives de-identified data from our data sources, and our data sets are regularly reviewed by an independent HIPAA Statistical Expert using the expert determination method. We build redaction rules and execute automatic ongoing redactions based on expert recommendations in order to ensure patient privacy is preserved in compliance with HIPAA standards.
If any data is added or removed from the core analytic dataset or if the core data is to be combined with any other data, then an updated Statistical Expert determination is performed and redaction rules are adjusted.
For data security, we maintain our data per industry standard best practices such as maintaining data encrypted at rest and enhanced user access and challenge controls.
A 2021 study used machine learning (ML) on claims data to predict hospital readmissions, improving patient care management. How do you think the growth of AI/ML will have an impact on research using claims data?
Bob: The accuracy and precision of ML will improve not only as models evolve, but as data lakes that drive the models become more robust and actionable. Broader integration of longitudinal claims data with various data streams will increase the availability and diversity of model features, thus enhancing the capabilities of ML approaches. This will be critical in research related to finding and assessing rare-disease patients, especially those with no diagnosis coding and complex diagnostic criteria — among a plethora of other use cases.
Applying RWD-driven machine learning algorithms to early detection, differential diagnosis, and risk stratification can shorten time to diagnosis. As an example, one of our clients had an algorithm that used EHR data to find potential patients with a specific rare disease but struggled with lower accuracy, missing many potential patients. Our analytics team leveraged RWD products by inputting symptoms, diagnoses, procedures, and treatments from EHR and claims data within a five-year period into ML algorithms. Across the six ML models, accuracy ranged from 75% to 80%, including the validation test. Our client now has an ML-based algorithm that effectively identifies suspect patients with this rare disease, with the goal of ultimately reducing rates of delayed or missed diagnosis.
Provided the technology is used responsibly, AI and ML will have a significant positive impact as claims data is integrated more extensively into precision medicine applications. Moreover, it is crucial to consider the emerging regulations around AI/ML that will provide important protections to ensure that the excitement of the technological capability does not preclude responsible use and management. As the regulatory landscape evolves, the responsible integration of AI and ML in healthcare will foster advancements in regimen efficacy analytics by further enriching claims datasets with patient data from diverse modalities and devices such as Electronic Health Records (EHR), wearables, and other sources. This evolution will facilitate a migration from RWD to RWE.
Thanks for the interview. Any recommendations for our readers if they want to learn more?
AnalyticsIQ, a marketing data and analytics company, recently adopted Datavant’s state de-identification process to enhance the privacy of its SDOH datasets. By undergoing this privacy analysis prior to linking its data with other datasets, AnalyticsIQ has taken an extra step that could contribute to a more efficient Expert Determination (which is required when its data is linked with others in Datavant’s ecosystem).
AnalyticsIQ’s decision to adopt state de-identification standards underscores the importance of privacy in the data ecosystem. By addressing privacy challenges head-on, AnalyticsIQ and similar partners are poised to lead clinical research forward, providing datasets that are not only compliant with privacy requirements, but also ready for seamless integration into larger datasets.
“Stakeholders across the industry are seeking swift, secure access to high-quality, privacy-compliant SDOH data to drive efficiencies and improve patient outcomes,” says Christine Lee, head of health strategy and partnerships at AnalyticsIQ.
“By collaborating with Datavant to proactively perform state de-identification and Expert Determination on our consumer dataset, we help minimize potentially time-consuming steps upfront and enable partners to leverage actionable insights when they need them most. This approach underscores our commitment to supporting healthcare innovation while upholding the highest standards of privacy and compliance.”
As the regulatory landscape continues to evolve, Datavant’s state de-identification product offers an innovative tool for privacy officers and data custodians alike. By addressing both state-specific and HIPAA requirements, companies can stay ahead of regulatory demands and build trust across data partners and end-users. For life sciences organizations, this can lead to faster, more reliable access to the datasets they need to drive research and innovation while supporting high privacy standards.
As life sciences companies increasingly rely on SDOH data to drive insights, the need for privacy-preserving solutions grows. Data ecosystems like Datavant’s, which link real-world datasets while safeguarding privacy, are critical to driving innovation in healthcare. By integrating state de-identified SDOH data, life sciences can gain a more comprehensive view of patient populations, uncover social factors that impact health outcomes, and ultimately guide clinical research that improves health.
Both payers and providers are increasingly utilizing SDOH data to enhance care delivery and improve health equity. By incorporating SDOH data into their strategies, both groups aim to deliver more personalized care, address disparities, and better understand the social factors affecting patient outcomes.
Payers increasingly leverage SDOH data to meet health equity requirements and enhance care delivery:
Payers’ consideration of SDOH underscores their commitment to improving health equity, delivering targeted care, and addressing disparities for vulnerable populations.
Capital District Physicians’ Health Plan (CDPHP) incorporated SDOH, partnering with Papa, to combat loneliness and isolation in older adults, families, and other vulnerable populations. CDPHP aimed to address:
By integrating SDOH data, CDPHP enhanced its services to deliver comprehensive care for its Medicare Advantage members.
Value-based care organizations face challenges in fully understanding their patient panels. SDOH data helps providers address these challenges and improve patient care. Here are some examples of how:
By leveraging SDOH data, providers gain a more comprehensive understanding of their patient population, leading to more targeted and personalized care interventions.
While accessing SDOH data offers significant advantages, challenges can arise from:
To overcome these challenges, providers must have robust data integration strategies, standardization efforts, and access to health data ecosystems to ensure comprehensive and timely access to SDOH data.
With Datavant, healthcare organizations are securely accessing SDOH data, and further enhancing the efficiency of their datasets through state de-identification capabilities - empowering stakeholders across the industry to make data-driven decisions that drive care forward.