This is the first of a two-part post on some of the research behind our recently launched Datavant Trials, a web-based trial tokenization solution that helps bridge the gap between clinical trials and real-world data (RWD) to accelerate research-grade evidence generation. In this post, we explain why computing overlap between datasets matters to us and outline the approaches to it that play a significant role in our trial tokenization product.