Named Entity Extraction and Disambiguation from an Uncertainty Perspective

Mena B. Habib; M. van Keulen

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic

Abstract

Named entity extraction and disambiguation have received much attention in recent years. Typical fields addressing these topics are information retrieval, natural language processing, and the Semantic Web. This work addresses two problems with named entity extraction and disambiguation. First, almost no existing work examines the interdependency between extraction and disambiguation. Second, existing disambiguation techniques mostly take extracted named entities as input without considering the uncertainty and imperfection of the extraction process. The aim of this work is to investigate both avenues and to show that explicit handling of annotation uncertainty has much potential for making both extraction and disambiguation more robust. We conducted experiments on a set of holiday home descriptions, with the aim of extracting and disambiguating toponyms as a representative example of named entities. We show that the effectiveness of extraction influences the effectiveness of disambiguation and, reciprocally, that retraining the extraction models with information automatically derived from the disambiguation results improves the extraction models. This mutual reinforcement is shown to still have an effect after several iterations.
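
The feedback loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the gazetteer, the confidence values (0.5 and 0.9), the capitalisation heuristic, and all function names are hypothetical, chosen only to show how uncertain extraction output can feed disambiguation and how disambiguation results can in turn inform the next extraction pass.

```python
from dataclasses import dataclass

# Hypothetical toy gazetteer: surface form -> possible countries (assumption).
GAZETTEER = {
    "Paris": ["FR", "US"],
    "Lyon": ["FR"],
    "Nice": ["FR"],
}


@dataclass
class Candidate:
    text: str
    confidence: float  # extraction confidence in [0, 1]


def extract(tokens, known_toponyms):
    """Propose capitalised tokens as toponym candidates, keeping uncertainty.

    Tokens confirmed by an earlier disambiguation pass get a higher
    confidence (0.9) than unconfirmed ones (0.5); both values are arbitrary.
    """
    candidates = []
    for tok in tokens:
        if tok[:1].isupper():
            conf = 0.9 if tok in known_toponyms else 0.5
            candidates.append(Candidate(tok, conf))
    return candidates


def disambiguate(candidates):
    """Pick the country with the highest confidence-weighted support."""
    votes = {}
    for cand in candidates:
        for country in GAZETTEER.get(cand.text, []):
            votes[country] = votes.get(country, 0.0) + cand.confidence
    if not votes:
        return None, {}
    best = max(votes, key=votes.get)
    # Keep only the candidates that have a referent in the winning country.
    resolved = {
        c.text: best for c in candidates if best in GAZETTEER.get(c.text, [])
    }
    return best, resolved


def run(tokens, iterations=3):
    """Alternate extraction and disambiguation, feeding results back."""
    known = set()
    best, candidates = None, []
    for _ in range(iterations):
        candidates = extract(tokens, known)
        best, resolved = disambiguate(candidates)
        known |= set(resolved)  # feedback: confirmed toponyms inform extraction
    return best, sorted(known), candidates


tokens = "Lovely house near Lyon and Paris".split()
country, toponyms, candidates = run(tokens)
print(country, toponyms)  # prints: FR ['Lyon', 'Paris']
```

In this sketch the false candidate "Lovely" keeps its low confidence because disambiguation never confirms it, while "Lyon" and "Paris" are reinforced across iterations. A real system would replace the heuristic extractor with a trained model and retrain it on the disambiguation output.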

Original language	English
Title of host publication	Proceedings of the Dutch Belgian Database Day, DBDBD 2011, University of Twente, The Netherlands
Place of Publication	Enschede
Publisher	Centre for Telematics and Information Technology, University of Twente
Pages	12
Number of pages	1
Publication status	Published - 1 Dec 2011
Externally published	Yes

Keywords

Named Entity Recognition, Named Entity Linking, Named Entity Extraction, Named Entity Disambiguation

Cite this

@inproceedings{f9cf8312918d4a79a8bc456aa355c24a,

title = "Named Entity Extraction and Disambiguation from an Uncertainty Perspective",

keywords = "Named Entity Recognition, Named Entity Linking, Named Entity Extraction, Named Entity Disambiguation",

author = "Habib, {Mena B.} and Keulen, {M. van}",

note = "http://eprints.eemcs.utwente.nl/21799/",

year = "2011",

month = dec,

day = "1",

language = "English",

pages = "12",

booktitle = "Proceedings of the Dutch Belgian Database Day, DBDBD 2011, University of Twente, The Netherlands",

publisher = "Centre for Telematics and Information Technology, University of Twente",

address = "Netherlands",

}

Named Entity Extraction and Disambiguation from an Uncertainty Perspective. / Habib, Mena B.; Keulen, M. van.
Proceedings of the Dutch Belgian Database Day, DBDBD 2011, University of Twente, The Netherlands. Enschede: Centre for Telematics and Information Technology, University of Twente, 2011. p. 12.

TY - GEN

T1 - Named Entity Extraction and Disambiguation from an Uncertainty Perspective

AU - Habib, Mena B.

AU - Keulen, M. van

N1 - http://eprints.eemcs.utwente.nl/21799/

PY - 2011/12/1

Y1 - 2011/12/1

KW - Named Entity Recognition

KW - Named Entity Linking

KW - Named Entity Extraction

KW - Named Entity Disambiguation

M3 - Conference article in proceeding

SP - 12

BT - Proceedings of the Dutch Belgian Database Day, DBDBD 2011, University of Twente, The Netherlands

PB - Centre for Telematics and Information Technology, University of Twente

CY - Enschede

ER -