A unifying similarity measure for automated identification of national implementations of European Union directives

Rohan Nanda, Luigi Di Caro, Guido Boella, Hristo Konstantinov, Tenyo Tyankov, Daniel Traykov, Hristo Hristov, Francesco Costamagna, Llio Humphreys, Livio Robaldo, Michele Romano

Research output: Contribution to conference, Paper, Academic

Abstract

This paper presents a unifying text similarity measure (USM) for automated identification of national implementations of European Union (EU) directives. The proposed model retrieves the transposed provisions of national law at a fine-grained level for each article of the directive. USM incorporates methods for matching common words, common sequences of words and approximate string matching. It was used for identifying transpositions on a multilingual corpus of four directives and their corresponding national implementing measures (NIMs) in three different languages: English, French and Italian. We further utilized a corpus of four additional directives and their corresponding NIMs in English for a thorough test of the USM approach. We evaluated the model by comparing our results with a gold standard consisting of official correlation tables (where available) or correspondences manually identified by domain experts. Our results indicate that USM was able to identify transpositions with average F-score values of 0.808, 0.736 and 0.708 for French, Italian and English Directive-NIM pairs respectively in the multilingual corpus. A comparison with state-of-the-art methods for text similarity illustrates that USM achieves a higher F-score and recall across both corpora.
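The abstract names three ingredients of USM (common words, common word sequences, approximate string matching) but does not give the formula that combines them. The following is a minimal sketch of how such a combined score could look, not the authors' implementation. The function names, the use of Jaccard overlap, the bigram window, the character-level `difflib` ratio and the unweighted average are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase whitespace tokenisation (an assumed, simplistic preprocessing step)."""
    return text.lower().split()


def common_word_score(a, b):
    """Jaccard overlap of the two word sets (assumed stand-in for 'common words')."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def common_sequence_score(a, b, n=2):
    """Jaccard overlap of word n-grams (assumed stand-in for 'common sequences of words')."""
    def ngrams(text):
        toks = tokens(text)
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    grams_a, grams_b = ngrams(a), ngrams(b)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def approximate_score(a, b):
    """Character-level similarity ratio from difflib (assumed stand-in for approximate string matching)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def combined_similarity(a, b):
    """Unweighted mean of the three component scores; the weighting is hypothetical."""
    return (common_word_score(a, b)
            + common_sequence_score(a, b)
            + approximate_score(a, b)) / 3


if __name__ == "__main__":
    directive = "member states shall ensure that consumers receive clear information"
    related = "the member state shall ensure that consumers receive clear information"
    unrelated = "fishing quotas are set annually by the council"
    print(combined_similarity(directive, related))
    print(combined_similarity(directive, unrelated))
```

In a retrieval setting, each directive article would be scored against every candidate national provision, and provisions above some threshold would be reported as transpositions. The threshold and the matching strategy are not specified in the abstract.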

Original language: English
Pages: 149-158
Publication status: Published - 2017
Externally published: Yes
