Presenter: Andee Kaplan, Assistant Professor in the Department of Statistics at Colorado State University

Description: Record linkage is the task of combining records from multiple files that refer to overlapping sets of entities when the records contain no unique identifying field. In streaming record linkage, files arrive sequentially in time and estimates of links are updated after the arrival of each file. This problem arises in settings such as longitudinal surveys, electronic health records, and online events databases, among others. The challenge in streaming record linkage is to efficiently update parameter estimates as each new data file arrives. In this talk, Andee Kaplan approaches the problem from a Bayesian perspective, with estimates in the form of posterior samples of parameters, and presents methods for updating link estimates after the arrival of a new file that are faster than refitting a joint model with each new data file. In this work, she generalizes a two-file Bayesian Fellegi-Sunter model to the multi-file case and proposes two methods to perform streaming updates. She examines the effect of the prior distribution on the resulting linkage accuracy, as well as the computational trade-offs between the methods when compared to a Gibbs sampler, using simulated and real-world survey panel data. Andee achieves near-equivalent posterior inference at a small fraction of the compute time.

Bio: Andee Kaplan is an assistant professor in the Department of Statistics at Colorado State University. Her research interests lie at the intersection of Bayesian methodology and statistical computing, particularly as applied to large social science and ecological problems with complex dependence and messy data structures. Prior to joining Colorado State University, Andee spent two years as a postdoctoral associate at Duke University after completing her Ph.D. in statistics at Iowa State University. In her free time, Andee enjoys riding bikes and rock climbing.

Hosted by: Professor Paul Parker

Zoom link: https://ucsc.zoom.us/j/98928577588?pwd=ZHFYVi93elRWdzUzYno5ZGtWMStGZz09
