Gateway to Think Tanks
Source type | Working Paper |
Standardized type | Report |
DOI | 10.3386/w26227 |
Source ID | Working Paper 26227 |
Combining Family History and Machine Learning to Link Historical Records | |
Joseph Price; Kasey Buckles; Jacob Van Leeuwen; Isaac Riley | |
Publication date | 2019-09-09 |
Publication year | 2019 |
Language | English |
Abstract | A key challenge for research on many questions in the social sciences is that it is difficult to link historical records in a way that allows investigators to observe people at different points in their life or across generations. In this paper, we develop a new approach that relies on millions of record links created by individual contributors to a large, public, wiki-style family tree. First, we use these “true” links to inform the decisions one needs to make when using traditional linking methods. Second, we use the links to construct a training data set for use in supervised machine learning methods. We describe the procedure we use and illustrate the potential of our approach by linking individuals across the 100% samples of the US decennial censuses from 1900, 1910, and 1920. We obtain an overall match rate of about 70 percent, with a false positive rate of about 12 percent. This combination of high match rate and accuracy represents a point beyond the current frontier for record linking methods. |
Subjects | Econometrics ; Data Collection ; Labor Economics ; Demography and Aging ; History |
URL | https://www.nber.org/papers/w26227 |
Source think tank | National Bureau of Economic Research (United States) |
Resource type | Think tank publication |
Item identifier | http://119.78.100.153/handle/2XGU8XDN/583899 |
Recommended citation (GB/T 7714) | Joseph Price, Kasey Buckles, Jacob Van Leeuwen, et al. Combining Family History and Machine Learning to Link Historical Records. 2019. |
Files in this item |
File name/size | Resource type | Version type | Access type | License |
w26227.pdf (458KB) | Think tank publication | | Restricted access | CC BY-NC-SA |