G2TT
Source type: Working Paper
Standard type: Working paper
A flexible, scaleable approach to the international patent 'name game'
Mark Huberty; Amma Serwaah; Georg Zachmann
Publication date: 2014-09-28
Publication year: 2014
Language: English
Summary: The inventors in PATSTAT are often duplicates: the same person or company may be split into multiple entries in PATSTAT, each associated with different patents. In this paper, we address this problem with an algorithm that efficiently de-duplicates the data.
Abstract

The inventors in PATSTAT are often duplicates: the same person or company may be split into multiple entries in PATSTAT, each associated with different patents. In this paper, we address this problem with an algorithm that efficiently de-duplicates the data. It needs minimal manual input and works well even on consumer-grade computers. Comparisons between entries are not limited to their names, so this algorithm improves on earlier ones that required extensive manual work or overly cautious clean-up of the names.
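To make the idea of comparing entries on more than their names concrete, here is a minimal sketch in Python. It is not the authors' implementation: the record fields (`name`, `country`, `classes`), the weights, the threshold, and the first-letter blocking rule are all illustrative assumptions, and the sample records are invented. It scores each candidate pair on name similarity plus two non-name fields, then merges pairs above the threshold into clusters.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records: field names and values are illustrative only,
# not the PATSTAT schema and not data from the paper.
records = [
    {"id": 1, "name": "Siemens AG", "country": "DE", "classes": {"H01L", "G06F"}},
    {"id": 2, "name": "Siemens A.G.", "country": "DE", "classes": {"H01L"}},
    {"id": 3, "name": "Simmons Ltd", "country": "GB", "classes": {"A47C"}},
    {"id": 4, "name": "SIEMENS AG", "country": "DE", "classes": {"G06F", "H04L"}},
]


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a, b):
    """Overlap of two sets in [0, 1]; 0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def score(r1, r2):
    """Weighted similarity over name and non-name fields (assumed weights)."""
    same_country = 1.0 if r1["country"] == r2["country"] else 0.0
    return (0.6 * name_similarity(r1["name"], r2["name"])
            + 0.2 * same_country
            + 0.2 * jaccard(r1["classes"], r2["classes"]))


def deduplicate(records, threshold=0.7):
    """Group record ids whose pairwise score reaches the threshold."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Blocking: only compare records sharing a first letter, so the number
    # of comparisons stays far below all-pairs on large inputs.
    blocks = {}
    for r in records:
        blocks.setdefault(r["name"][:1].lower(), []).append(r)

    for block in blocks.values():
        for r1, r2 in combinations(block, 2):
            if score(r1, r2) >= threshold:
                parent[find(r1["id"])] = find(r2["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(sorted(c) for c in clusters.values())


print(deduplicate(records))
```

On these sample records the three Siemens variants end up in one cluster, while the similarly spelled "Simmons Ltd" stays separate because neither its country nor its classes match. In practice the weights and threshold would have to be tuned or learned rather than fixed by hand as they are here.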

Source code on GitHub.

Download data.

Topic: Innovation & Competition Policy
Keywords: patents
URL: https://bruegel.org/2014/09/a-flexible-scaleable-approach-to-the-international-patent-name-game/
Source think tank: Bruegel (Belgium)
Resource type: Think tank publication
Item identifier: http://119.78.100.153/handle/2XGU8XDN/429475
Recommended citation
GB/T 7714
Mark Huberty, Amma Serwaah, Georg Zachmann. A flexible, scaleable approach to the international patent 'name game'. 2014.
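Files in this item
File name / size | Resource type | Access type | License
WP_2014_10i.pdf.jpg (104KB) | Think tank publication | Restricted access | CC BY-NC-SA (thumbnail)
WP_2014_10i.pdf (2743KB) | Think tank publication | Restricted access | CC BY-NC-SA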
