G2TT
Source type: Working Paper
Document type: Report
DOI: 10.3386/w26227
Source ID: Working Paper 26227
Combining Family History and Machine Learning to Link Historical Records
Joseph Price; Kasey Buckles; Jacob Van Leeuwen; Isaac Riley
Publication date: 2019-09-09
Publication year: 2019
Language: English
Abstract: A key challenge for research on many questions in the social sciences is that it is difficult to link historical records in a way that allows investigators to observe people at different points in their life or across generations. In this paper, we develop a new approach that relies on millions of record links created by individual contributors to a large, public, wiki-style family tree. First, we use these “true” links to inform the decisions one needs to make when using traditional linking methods. Second, we use the links to construct a training data set for use in supervised machine learning methods. We describe the procedure we use and illustrate the potential of our approach by linking individuals across the 100% samples of the US decennial censuses from 1900, 1910, and 1920. We obtain an overall match rate of about 70 percent, with a false positive rate of about 12 percent. This combination of high match rate and accuracy represents a point beyond the current frontier for record linking methods.
Subjects: Econometrics; Data Collection; Labor Economics; Demography and Aging; History
URL: https://www.nber.org/papers/w26227
Source think tank: National Bureau of Economic Research (United States)
Resource type: Think tank publication
Item identifier: http://119.78.100.153/handle/2XGU8XDN/583899
Recommended citation
GB/T 7714
Joseph Price, Kasey Buckles, Jacob Van Leeuwen, et al. Combining Family History and Machine Learning to Link Historical Records. 2019.
Files in this item
File name/size: w26227.pdf (458KB)
Resource type: Think tank publication
Access type: Restricted access
License: CC BY-NC-SA