A linked open data representation of patents registered in the US from 2005-2017

Mofeed M Hassan, Amrapali Zaveri*, Jens Lehmann

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Patents are widely used to protect intellectual property and serve as a measure of innovation output. Each year, the USPTO grants over 150,000 patents to individuals and companies all over the world. In fact, more than 280,000 patent grants were issued in the US in 2015. However, accessing, searching and analyzing those patents is often still cumbersome and inefficient. To overcome those problems, Google indexes patents and converts them to Extensible Markup Language (XML) files using Optical Character Recognition (OCR) techniques. In this article, we take this idea one step further and provide semantically rich, machine-readable patents using the Linked Data principles. We have converted the data spanning 12 years (2005-2017) from XML to the Resource Description Framework (RDF) format, conforming to the Linked Data principles, and made it publicly available for re-use. This data can be integrated with other data sources to further simplify use cases such as trend analysis, structured patent search and exploration, and societal progress measurements. We describe the conversion, publishing and interlinking processes, along with several use cases for the USPTO Linked Patent data.
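As a rough illustration of the XML-to-RDF conversion mentioned in the abstract, the following minimal sketch turns one simplified patent record into N-Triples using only the Python standard library. It is not the authors' pipeline: the element names (`doc-number`, `title`, `date`), the sample record, and the `http://example.org/` namespaces are all hypothetical and chosen for illustration, and the real USPTO XML schema and the vocabulary used in the published dataset are considerably richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified patent record (not the real USPTO schema).
SAMPLE_XML = """
<patent>
  <doc-number>1234567</doc-number>
  <title>Example widget</title>
  <date>20150106</date>
</patent>
"""

# Hypothetical namespaces, used here for illustration only.
BASE = "http://example.org/patent/"
VOCAB = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text: str) -> str:
    """Escape characters that are special inside an N-Triples string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def patent_xml_to_ntriples(xml_text: str) -> list:
    """Convert one <patent> element into a list of N-Triples lines."""
    root = ET.fromstring(xml_text.strip())
    doc_number = root.findtext("doc-number").strip()
    subject = f"<{BASE}{doc_number}>"

    # One typing triple, then one literal-valued triple per non-empty child.
    triples = [f"{subject} <{RDF_TYPE}> <{VOCAB}Patent> ."]
    for child in root:
        value = (child.text or "").strip()
        if value:
            predicate = f"<{VOCAB}{child.tag}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    for line in patent_xml_to_ntriples(SAMPLE_XML):
        print(line)
```

Each child element becomes a plain string literal in this sketch. A fuller conversion would typically assign datatypes (for example to dates) and link entities such as inventors or classifications to external resources, which is where the interlinking step described in the abstract comes in.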

Original language: English
Article number: 180279
Number of pages: 9
Journal: Scientific Data
Volume: 5
Publication status: Published - 4 Dec 2018