Hackathon Preview: 4 Tips for Balancing Speed and Security in Hackathon Project Development

September 6, 2022

Spoiler alert: Don’t hard-code your credentials

Photo by Chris Barbalis courtesy of Unsplash

From September 8–11, 2022, Datavant will host our first annual Future of Healthcare Hackathon. Hackers can submit projects to one of three tracks: Privacy, Public Health, and Improving Patient Care. Regardless of the track you choose, security within your project is always a critical consideration. Below, Datavant’s Head of Security Ben Waugh and Product Security Engineer David Gold offer some considerations around hackathon project security.

Hackathons are an integral part of the ongoing growth of any technology company, as they create an opportunity for teams to self-organize for a fixed period of time and tackle a challenging business problem. A hackathon embodies the ideals of experimentation, speed over perfection, and of course, getting things done. From Twilio’s Tweak Week to Atlassian Ship It Days, we (Ben and David!) have had the opportunity to take part in many such events over the years, and as security practitioners, we would be lying if we said we had never seen things come up in these events that were…a little concerning. Issues often arise when teams prioritize development speed over security (read on for examples!). Yet we’ve noticed there is little guidance out there on how hackathon teams can keep security and privacy top of mind as they compete, without slowing down their projects.

Hence, in the lead up to Datavant’s Future of Healthcare Hackathon, we’re sharing some advice and tips for teams to best manage security and privacy risks in their projects. And since we’re talking about security, these tips apply just as much to day-to-day work as they do to a hackathon project.

Why care about security in a hackathon project?

Every project has its own risk profile. Rather than disregard security entirely, consider the specific risk profile of your project and avoid the issues it implies. Not only will this improve the actual security of your project, it might also impress the more security-minded judges. And if the project is to have any longevity, these are issues you would have to deal with down the line anyway.

Here are four key considerations (and one bonus suggestion!) to keep in mind while building your project:

Research software frameworks.

There’s a good chance you will want to experiment with a new framework while developing your hackathon project. Experimentation is great! That’s what a hackathon is for! But, it’s always a good idea to do some background research on a new tool to see if the developers produced any security best practices documentation. This is especially true if your project is going to be hosted publicly or will interact with any non-public data or systems. Where are you going to show your project? What kind of third-party tools do you need to bring in to make it happen? These decisions can potentially create an unforeseen sensitivity.

If your frameworks are touching things you don’t ordinarily work with (databases, infrastructure, etc.), it is even more imperative to seek security guidance for these tools. For example, Flask explains a couple of common web app vulnerabilities, such as cross-site scripting and man-in-the-middle attacks, and describes how to address them within the framework. Often, simple configuration settings or helper methods can be used to mitigate major security risks with little effort.
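As a rough illustration of how little code such mitigations can take, here is a minimal, framework-agnostic sketch using only the Python standard library. It is not Flask's API: the function names and the header values are assumptions made for this example, and your framework's own security documentation should take precedence. Escaping untrusted input addresses cross-site scripting, and a Strict-Transport-Security header helps against man-in-the-middle downgrades.

```python
import html

# Illustrative header values (assumptions for this sketch); tune them,
# especially the Content-Security-Policy, to what your app actually loads.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "SAMEORIGIN",
}


def render_greeting(user_supplied_name: str) -> str:
    """Escape untrusted input before placing it into HTML."""
    safe_name = html.escape(user_supplied_name, quote=True)
    return f"<p>Hello, {safe_name}!</p>"


def with_security_headers(headers: dict) -> dict:
    """Return a copy of the response headers with security defaults added.

    Headers the caller already set are left untouched.
    """
    merged = dict(headers)
    for name, value in SECURITY_HEADERS.items():
        merged.setdefault(name, value)
    return merged
```

Most web frameworks offer an equivalent hook (a template engine that escapes by default, a response middleware), so in practice this is usually a configuration change rather than new code.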

You might also consider whether your aim is to build something usable, or simply to show that something is possible. If you’re building a product feature or enhancement that relies on an open source library or software package, does it by extension require your entire project to be built in an open-source framework? If your goal is to create something actually usable, this strategy, while perhaps easier to get rolling, may limit the future use cases it could have. Of course, if the project doesn’t depend on the library, but just makes use of it as a proof of concept, then this is less problematic, and a future rebuild could be made without that library in it.

As a hackathon strategy, the more research you can do on a framework you intend to experiment with before the hackathon kicks off, the better. This allows you to go in prepared, and not spend valuable time doing this background research during the hackathon.

Pre-plan your infrastructure and environment.

You want to move quickly. You don’t want to slow yourself down building overhead or setting up unsexy plumbing that may or may not be a featured element of the final presentation. Creating wide open access policies (be they Security Group rules, IAM Policies, etc.) is likely the fastest route to deploying infrastructure. However, this approach means exposing your project (and any interconnected components) to the entire world. Be mindful of this exposure and ensure your systems, accounts, and infrastructure components (such as databases) are protected with strong authentication. Enable multi-factor authentication (MFA) for your IAM accounts, and use SSH key pairs instead of passwords to authenticate to your servers.
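One cheap safeguard is a quick script that flags rules open to the whole internet before you deploy. The sketch below assumes a hypothetical rule format (dictionaries with cidr, from_port, and to_port keys) and an illustrative list of sensitive ports; it does not call any cloud provider's API, so you would need to adapt it to however your rules are actually exported.

```python
import ipaddress


def is_world_open(cidr: str) -> bool:
    """True when the CIDR block covers every address (0.0.0.0/0 or ::/0)."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def find_open_rules(rules, sensitive_ports=(22, 3306, 5432)):
    """Return the rules that expose a sensitive port to the entire internet.

    `rules` is a list of dicts in a hypothetical format:
    {"cidr": "0.0.0.0/0", "from_port": 22, "to_port": 22}
    """
    flagged = []
    for rule in rules:
        exposes_sensitive_port = any(
            rule["from_port"] <= port <= rule["to_port"]
            for port in sensitive_ports
        )
        if is_world_open(rule["cidr"]) and exposes_sensitive_port:
            flagged.append(rule)
    return flagged
```

Here SSH (22), MySQL (3306), and PostgreSQL (5432) stand in for whatever your project should never expose publicly; a public HTTPS port would pass this check, while a database port open to 0.0.0.0/0 would be flagged.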

Integrating SaaS products like Slack, Google, or JIRA may be especially valuable if you’re working on a project to build a productivity tool for your company, but keep in mind that some of these tools ask for OAuth. Be thoughtful about what permissions you give those products. Does your hackathon project actually need admin access to these accounts? Are you inadvertently turning over your company contact list to Slack (and/or the world)?
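A simple way to keep yourself honest is to write down the permissions the project actually needs and compare every OAuth request against that list. The scope names below are hypothetical placeholders chosen for illustration; check each product's documentation for its real scope names.

```python
# Hypothetical scope names for illustration only; they are assumptions,
# not an authoritative list for any particular product.
NEEDED_SCOPES = frozenset({"channels:read", "chat:write"})


def excess_scopes(requested, needed=NEEDED_SCOPES):
    """Return the requested scopes that go beyond what the project needs."""
    return sorted(set(requested) - set(needed))
```

If excess_scopes returns anything (an admin scope, say), that is a prompt to ask whether the project really needs it before you click "Allow".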

Just as you can undertake some background research on best security practices for new tools ahead of the hackathon, you can strategize about your built environment ahead of time. Making important decisions about security protocols in the heat of the moment can lead you to make mistakes or forget about basic security practices. In short, leaving open access policies in the name of speedy building is a strategy that could easily backfire. Also, any security-minded judges on the panel will decidedly frown upon an approach that ignores risk management altogether. Figure out how to address it in a manner appropriate to the demands of the project.

Map out your use of data.

In the spirit of rapidly producing proofs of concept, it may be tempting to build your project directly on top of an existing production system using existing production data. In some ways, this is the most realistic “test” data to prove the validity of your idea. However, from a security standpoint, this is generally not a good idea, especially if you are working with sensitive data (as Datavant engineers regularly do). Given the points we made above about balancing speed and security, building on existing production data with ill-conceived security could quickly become highly problematic.

Ideally, you would be able to use a robust development or staging environment (without real data), but if not, it’s best to mock out API calls that otherwise operate on production data. If you can, aim to redact sensitive attributes, or use public or synthetic data. In the case of the upcoming Datavant Hackathon, we are making available diverse datasets: price transparency data by Turquoise, synthetic datasets by Syntegra, and a variety of datasets related to provider/payer information sharing provided by Datavant. There are tons of open-source tools available to help with mocking and test data generation across many languages, frameworks, and domains. For example, here is some open-source synthetic patient data. Finally, when attempting to identify what data may be sensitive, consider not only the data that is being operated upon (e.g. patient medical records in the case of a health app), but also its metadata, including, for example, location, hospital names, and IP addresses.
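To make the mocking and redaction ideas concrete, here is a minimal sketch using Python's built-in unittest.mock. The PatientApi class, the field names, and the synthetic record are all invented for this example; they stand in for whatever client and schema your project really uses.

```python
from unittest import mock

# Hypothetical field names; include metadata such as IP addresses and
# hospital names, not just the obvious identifiers.
SENSITIVE_FIELDS = {"name", "ssn", "address", "ip_address", "hospital_name"}


def redact(record: dict) -> dict:
    """Replace the values of sensitive fields with a placeholder."""
    return {
        key: "[REDACTED]" if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


class PatientApi:
    """Stand-in for a client that would normally call a production API."""

    def fetch_patient(self, patient_id: str) -> dict:
        raise RuntimeError("refusing to call production from a hackathon build")


def summarize_patient(api: PatientApi, patient_id: str) -> str:
    """Build a display string from a redacted patient record."""
    record = redact(api.fetch_patient(patient_id))
    return f"{record['name']} ({record['diagnosis_code']})"


# Entirely made-up record used in place of real data.
SYNTHETIC_PATIENT = {
    "name": "Test Patient",
    "ssn": "000-00-0000",
    "diagnosis_code": "TEST-CODE",
    "hospital_name": "Example General",
}

api = PatientApi()
# Swap the production call for one that returns synthetic data.
with mock.patch.object(api, "fetch_patient", return_value=SYNTHETIC_PATIENT) as fake:
    summary = summarize_patient(api, "synthetic-001")
    fake.assert_called_once_with("synthetic-001")
```

The demo code path never touches the real API, and even the synthetic record passes through redact, so swapping in a richer dataset later does not change what gets displayed.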

Be mindful of credentials and third-party services.

We see this way too often.

Perhaps the greatest cardinal sin of software development we see committed all the time in hackathon projects is credentials hard-coded into source code. These credentials could be for other systems, cloud environments (your AWS account), or third party APIs.

We get it. This is the fastest approach. But hard-coding credentials is never a good idea out in the real world, and continues to be a bad idea in a hackathon environment too. If your project code ends up on GitHub, it’s not a leap to imagine your credentials being lifted for nefarious purposes…like…let’s say…mining tens of thousands of dollars of cryptocurrency. Your hackathon project may be hypothetical, but the consequences of that strategy are quite real. While it would be ideal to use a secrets management system, such as HashiCorp Vault, AWS Secrets Manager, or AWS Systems Manager Parameter Store, even employing environment variables is a huge step up from hard-coding.
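The environment-variable approach takes only a few lines. In this minimal sketch, the helper name get_required_secret and the variable name HACKATHON_API_TOKEN are assumptions chosen for illustration.

```python
import os


def get_required_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running")
    return value


# Example usage (hypothetical variable name):
#   api_token = get_required_secret("HACKATHON_API_TOKEN")
```

Set the variable in your shell or deployment configuration rather than in the repository. If you keep local values in a file, add that file to .gitignore so it never reaches a public repo.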

Don’t put credentials in your code.

Bonus for committing to your hackathon project security and reading to the end!

Leverage your security team and your security-minded contacts. Explain your project concept, ask for insight on the particular risks involved, and invite their input. Finally, if you’re a security or privacy practitioner reading this, get involved in Datavant’s Future of Healthcare Hackathon! We’ve learned a lot both as participants in hackathons and by watching other teams develop projects. Join a team and contribute directly to make sure interesting projects get built safely.

We look forward to seeing some imaginative and innovative projects! See below for more information about the Future of Healthcare Hackathon.

About the Future of Healthcare Hackathon

Datavant has hosted several hackathons over the past few years. One major highlight of these was the 2020 Pandemic Response Hackathon, which drew over 1600 participants, 230 submissions, and involved 30+ co-partners. Have a look at the 2020 project showcase to see some especially impressive submissions.

The Future of Healthcare Hackathon is a virtual event taking place from September 8 to 11, 2022. Submissions will be reviewed by our judging panel, which includes David Shulkin, former U.S. Secretary of Veterans Affairs; Niall Brennan, Chief Analytics and Privacy Officer at Clarify (formerly at the Health Care Cost Institute); Clare Bernard, Ph.D., Senior Director, Data Sciences Platform at Broad Institute; and more.

Winners can bring their projects to life by leveraging our prize pool, which includes cash prizes and the opportunity to travel to Washington, D.C. to present at the annual Future of Health Data Summit (on September 15). Presenters at this conference will include former and current heads of the FDA, the former U.S. Secretary of Veterans Affairs, the Chief Data Officer of the Broad Institute, and the Federal CIO. About 250 high-profile leaders in healthcare, tech, and policy will be in attendance, as well as press.

Authored by Ben Waugh, David Gold, and Nicholas DeMaison.

Ben Waugh has a background in secure software engineering and security architecture and is currently the Head of Security at Datavant. Connect with Ben via LinkedIn.

David Gold has a background in chemistry, biology, and software engineering and is currently on the product security team at Datavant. Connect with David via LinkedIn.

Nicholas DeMaison enjoys telling simple stories about complex things. Connect with Nick via LinkedIn.

Looking to solve security and privacy challenges of healthcare tech at scale? We’re hiring!

Spotlight on AnalyticsIQ: Privacy Leadership in State De-Identification

AnalyticsIQ, a marketing data and analytics company, recently adopted Datavant’s state de-identification process to enhance the privacy of its SDOH datasets. By undergoing this privacy analysis prior to linking its data with other datasets, AnalyticsIQ has taken an extra step that could contribute to a more efficient Expert Determination (which is required when its data is linked with others in Datavant’s ecosystem).

AnalyticsIQ’s decision to adopt state de-identification standards underscores the importance of privacy in the data ecosystem. By addressing privacy challenges head-on, AnalyticsIQ and similar partners are poised to lead clinical research forward, providing datasets that are not only compliant with privacy requirements, but also ready for seamless integration into larger datasets.

"Stakeholders across the industry are seeking swift, secure access to high-quality, privacy-compliant SDOH data to drive efficiencies and improve patient outcomes,” says Christine Lee, head of health strategy and partnerships at AnalyticsIQ.

“By collaborating with Datavant to proactively perform state de-identification and Expert Determination on our consumer dataset, we help minimize potentially time-consuming steps upfront and enable partners to leverage actionable insights when they need them most. This approach underscores our commitment to supporting healthcare innovation while upholding the highest standards of privacy and compliance."

Building Trust in Privacy-Preserving Data Ecosystems

As the regulatory landscape continues to evolve, Datavant’s state de-identification product offers an innovative tool for privacy officers and data custodians alike. By addressing both state-specific and HIPAA requirements, companies can stay ahead of regulatory demands and build trust across data partners and end-users. For life sciences organizations, this can lead to faster, more reliable access to the datasets they need to drive research and innovation while supporting high privacy standards.

As life sciences companies increasingly rely on SDOH data to drive insights, the need for privacy-preserving solutions grows. Data ecosystems like Datavant’s, which link real-world datasets while safeguarding privacy, are critical to driving innovation in healthcare. By integrating state de-identified SDOH data, life sciences can gain a more comprehensive view of patient populations, uncover social factors that impact health outcomes, and ultimately guide clinical research that improves health.

The Power of SDOH Data with Providers and Payers to Close Gaps in Care

Both payers and providers are increasingly utilizing SDOH data to enhance care delivery and improve health equity. By incorporating SDOH data into their strategies, both groups aim to deliver more personalized care, address disparities, and better understand the social factors affecting patient outcomes.

Payers Deploy Targeted Care Using SDOH Data

Payers increasingly leverage SDOH data to meet health equity requirements and enhance care delivery:

Tailored Member Programs: Payers develop specialized initiatives like nutrition delivery services and transportation to and from medical appointments.
Identifying Care Gaps: SDOH data helps payers identify gaps in care for underserved communities, enabling strategic in-home assessments and interventions.
Future Risk Adjustment Models: The Centers for Medicare & Medicaid Services (CMS) plans to incorporate SDOH-related Z codes into risk adjustment models, recognizing the significance of SDOH data in assessing healthcare needs.

Payers’ consideration of SDOH underscores their commitment to improving health equity, delivering targeted care, and addressing disparities for vulnerable populations.

Example: CDPHP supports physical and mental wellbeing with non-medical assistance

Capital District Physicians’ Health Plan (CDPHP) incorporated SDOH, partnering with Papa, to combat loneliness and isolation in older adults, families, and other vulnerable populations. CDPHP aimed to address:

Social isolation
Loneliness
Transportation barriers
Gaps in care

By integrating SDOH data, CDPHP enhanced its services to deliver comprehensive care for its Medicare Advantage members.

Providers Optimize Value-Based Care Using SDOH Data

Value-based care organizations face challenges in fully understanding their patient panels. SDOH data helps providers address these challenges and improve patient care. Here are some examples of how:

Onboard Patients Into Care Programs: Providers use SDOH data to identify patients who require additional support and connect them with appropriate resources.
Stratify Patients by Risk: SDOH data combined with clinical information identifies high-risk patients, enabling targeted interventions and resource allocation.
Manage Transition of Care: SDOH data informs post-discharge plans, considering social factors to support smoother transitions and reduce readmissions.

By leveraging SDOH data, providers gain a more comprehensive understanding of their patient population, leading to more targeted and personalized care interventions.

While accessing SDOH data offers significant advantages, challenges can arise from:

Lack of Interoperability and Uniformity: Data exists in fragmented sources like electronic health records (EHRs), public health databases, social service systems, and proprietary databases. Integrating and securing data while ensuring data integrity and confidentiality can be complex, resource-intensive and risky.
Lag in Payer Claims Data: Payers can take weeks or months to release claims data. This delays informed decision-making, care improvement, analysis, and performance evaluation.
Incomplete Data Sets in Health Information Exchanges (HIEs): Not all healthcare providers or organizations participate in HIEs. This reduces the available data pool. Moreover, varying data sharing policies result in data gaps or inconsistencies.

To overcome these challenges, providers must have robust data integration strategies, standardization efforts, and access to health data ecosystems to ensure comprehensive and timely access to SDOH data.

SDOH data holds immense potential in transforming healthcare and addressing health disparities.

With Datavant, healthcare organizations are securely accessing SDOH data and further enhancing the efficiency of their datasets through state de-identification capabilities, empowering stakeholders across the industry to make data-driven decisions that drive care forward.
