Abstract
The amount of data generated during the execution of a business process is growing. As a consequence, it is increasingly hard to extract useful information from that data. Linguistic summarization helps point business analysts toward useful information by verbalizing interesting patterns that exist in the data. In previous work, we showed how linguistic summarization can be used to automatically generate diagnostic statements about event logs, such as 'for most cases that contained the sequence ABC, the throughput time was long'. However, we also showed that our technique produced too many of these statements to be useful in a practical setting. This paper therefore presents a novel technique for linguistic summarization of event logs. It generates summaries that are concise enough to be used in a practical setting, and it enriches them by also supporting conjunctive statements. The improved technique is based on pruning and clustering of linguistic summaries. We show that it reduces the number of summary statements by 80-100% compared to previous work. In a survey among 51 practitioners, we found that practitioners consider linguistic summarization useful and easy to use, and that they would intend to use it if it were commercially available. © 2017 Elsevier Ltd. All rights reserved.
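To make the example statement in the abstract concrete, the sketch below shows one common way a truth degree can be assigned to a summary such as 'for most cases that contained the sequence ABC, the throughput time was long'. It uses a generic fuzzy-quantifier calculation from the linguistic summarization literature and is not taken from the paper. The `Case` type, the membership functions `mu_long` and `mu_most`, their thresholds, and the toy log are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Case:
    activities: str          # trace as a string of activity labels, e.g. "ABCD"
    throughput_hours: float  # total duration of the case (assumed unit: hours)


def mu_long(hours, low=24.0, high=72.0):
    """Hypothetical fuzzy membership of 'long' throughput time (linear ramp)."""
    if hours <= low:
        return 0.0
    if hours >= high:
        return 1.0
    return (hours - low) / (high - low)


def mu_most(proportion, low=0.3, high=0.8):
    """Hypothetical fuzzy quantifier 'most' over a proportion in [0, 1]."""
    if proportion <= low:
        return 0.0
    if proportion >= high:
        return 1.0
    return (proportion - low) / (high - low)


def truth_of_summary(cases, sequence):
    """Truth degree of 'for most cases containing `sequence`, throughput time was long'."""
    # Crisp qualifier: keep only the cases whose trace contains the sequence.
    matching = [c for c in cases if sequence in c.activities]
    if not matching:
        return 0.0
    # Average degree to which the matching cases have a long throughput time.
    proportion = sum(mu_long(c.throughput_hours) for c in matching) / len(matching)
    # Map that proportion through the quantifier 'most'.
    return mu_most(proportion)


# Toy event log, invented for this example.
log = [
    Case("ABCD", 80.0),
    Case("ABCE", 72.0),
    Case("XABC", 48.0),
    Case("ABD", 10.0),
]
truth = truth_of_summary(log, "ABC")
```

A statement would typically be reported only when its truth degree exceeds some threshold. Because many sequences and attribute values can pass such a threshold, the number of generated statements grows quickly, which is the problem the pruning and clustering described in the abstract are meant to address.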
Original language | English |
---|---|
Pages (from-to) | 114-125 |
Journal | Information Systems |
Volume | 67 |
DOIs | |
Publication status | Published - 2017 |
Externally published | Yes |