Key information extraction from documents: Evaluation and generator

Oliver Bensch; Mirela Popa; Constantin Spille

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Extracting information from documents usually relies on natural language processing methods working on one-dimensional sequences of text. In some cases, for example, for the extraction of key information from semi-structured documents, such as invoice-documents, spatial and formatting information of text are crucial to understand the contextual meaning. Convolutional neural networks are already common in computer vision models to process and extract relationships in multidimensional data. Therefore, natural language processing models have already been combined with computer vision models in the past, to benefit from e.g. positional information and to improve performance of these key information extraction models. Existing models were either trained on unpublished data sets or on an annotated collection of receipts, which did not focus on PDF-like documents. Hence, in this research project a template-based document generator was created to compare state-of-the-art models for information extraction. An existing information extraction model "Chargrid" (Katti et al., 2019) was reconstructed and the impact of a bounding box regression decoder, as well as the impact of an NLP pre-processing step was evaluated for information extraction from documents. The results have shown that NLP based pre-processing is beneficial for model performance. However, the use of a bounding box regression decoder increases the model performance only for fields that do not follow a rectangular shape.

Original language	English
Title of host publication	Advances in Semantics and Explainability for NLP: Joint proceedings of the DeepOntoNLP and X-SENTIMENT Workshops
Pages	47-53
Number of pages	7
Volume	2918
Publication status	Published - 1 Jan 2021
Event	Joint 2nd International Workshop on Deep Learning Meets Ontologies and Natural Language Processing and 6th International Workshop on Explainable Sentiment Mining and Emotion Detection - Online, Hersonissos, Greece
Duration	6 Jun 2021 → 7 Jun 2021
Conference number	2

Publication series

Series	CEUR Workshop Proceedings
ISSN	1613-0073

Workshop

Workshop	Joint 2nd International Workshop on Deep Learning Meets Ontologies and Natural Language Processing and 6th International Workshop on Explainable Sentiment Mining and Emotion Detection
Abbreviated title	DeepOntoNLP and X-SENTIMENT 2021
Country/Territory	Greece
City	Hersonissos
Period	6/06/21 → 7/06/21

Keywords

Bounding Box Regression Decoder
Document Generator
Key Information Extraction

Cite this

@inproceedings{6487537cb2e542a0b150b3426531f99b,

title = "Key information extraction from documents: Evaluation and generator",

abstract = "Extracting information from documents usually relies on natural language processing methods working on one-dimensional sequences of text. In some cases, for example, for the extraction of key information from semi-structured documents, such as invoice-documents, spatial and formatting information of text are crucial to understand the contextual meaning. Convolutional neural networks are already common in computer vision models to process and extract relationships in multidimensional data. Therefore, natural language processing models have already been combined with computer vision models in the past, to benefit from e.g. positional information and to improve performance of these key information extraction models. Existing models were either trained on unpublished data sets or on an annotated collection of receipts, which did not focus on PDF-like documents. Hence, in this research project a template-based document generator was created to compare state-of-the-art models for information extraction. An existing information extraction model {"}Chargrid{"} (Katti et al., 2019) was reconstructed and the impact of a bounding box regression decoder, as well as the impact of an NLP pre-processing step was evaluated for information extraction from documents. The results have shown that NLP based pre-processing is beneficial for model performance. However, the use of a bounding box regression decoder increases the model performance only for fields that do not follow a rectangular shape.",

keywords = "Bounding Box Regression Decoder, Document Generator, Key Information Extraction",

author = "Oliver Bensch and Mirela Popa and Constantin Spille",

note = "Funding Information: Supported by organization KI Group GmbH. Publisher Copyright: {\textcopyright} 2021 CEUR-WS. All rights reserved.; Joint 2nd International Workshop on Deep Learning Meets Ontologies and Natural Language Processing and 6th International Workshop on Explainable Sentiment Mining and Emotion Detection, DeepOntoNLP and X-SENTIMENT 2021 ; Conference date: 06-06-2021 Through 07-06-2021",

year = "2021",

month = jan,

day = "1",

language = "English",

volume = "2918",

series = "CEUR Workshop Proceedings",

publisher = "Rheinisch-Westfaelische Technische Hochschule Aachen * Lehrstuhl Informatik V",

pages = "47--53",

booktitle = "Advances in Semantics and Explainability for NLP: Joint proceedings of the DeepOntoNLP and X-SENTIMENT Workshops",

}

Bensch, O, Popa, M & Spille, C 2021, Key information extraction from documents: Evaluation and generator. in Advances in Semantics and Explainability for NLP: Joint proceedings of the DeepOntoNLP and X-SENTIMENT Workshops. vol. 2918, CEUR Workshop Proceedings, pp. 47-53, Joint 2nd International Workshop on Deep Learning Meets Ontologies and Natural Language Processing and 6th International Workshop on Explainable Sentiment Mining and Emotion Detection, Hersonissos, Greece, 6/06/21.

Key information extraction from documents: Evaluation and generator / Bensch, Oliver; Popa, Mirela; Spille, Constantin.
Advances in Semantics and Explainability for NLP: Joint proceedings of the DeepOntoNLP and X-SENTIMENT Workshops. Vol. 2918 2021. p. 47-53 (CEUR Workshop Proceedings).

TY - GEN

T1 - Key information extraction from documents: Evaluation and generator

T2 - Joint 2nd International Workshop on Deep Learning Meets Ontologies and Natural Language Processing and 6th International Workshop on Explainable Sentiment Mining and Emotion Detection

AU - Bensch, Oliver

AU - Popa, Mirela

AU - Spille, Constantin

N1 - Conference code: 2

PY - 2021/1/1

Y1 - 2021/1/1

AB - Extracting information from documents usually relies on natural language processing methods working on one-dimensional sequences of text. In some cases, for example, for the extraction of key information from semi-structured documents, such as invoice-documents, spatial and formatting information of text are crucial to understand the contextual meaning. Convolutional neural networks are already common in computer vision models to process and extract relationships in multidimensional data. Therefore, natural language processing models have already been combined with computer vision models in the past, to benefit from e.g. positional information and to improve performance of these key information extraction models. Existing models were either trained on unpublished data sets or on an annotated collection of receipts, which did not focus on PDF-like documents. Hence, in this research project a template-based document generator was created to compare state-of-the-art models for information extraction. An existing information extraction model "Chargrid" (Katti et al., 2019) was reconstructed and the impact of a bounding box regression decoder, as well as the impact of an NLP pre-processing step was evaluated for information extraction from documents. The results have shown that NLP based pre-processing is beneficial for model performance. However, the use of a bounding box regression decoder increases the model performance only for fields that do not follow a rectangular shape.

KW - Bounding Box Regression Decoder

KW - Document Generator

KW - Key Information Extraction

M3 - Conference article in proceeding

VL - 2918

T3 - CEUR Workshop Proceedings

SP - 47

EP - 53

BT - Advances in Semantics and Explainability for NLP: Joint proceedings of the DeepOntoNLP and X-SENTIMENT Workshops

Y2 - 6 June 2021 through 7 June 2021

ER -