ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds

Gijs Wijngaard; Elia Formisano; Bruno L. Giordano; Michel Dumontier

doi:10.23919/EUSIPCO58844.2023.10289793

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Automated Audio Captioning is a multimodal task that aims to convert audio content into natural language. The performance of audio captioning systems is evaluated on quantitative metrics applied to the text representations. Previously, researchers have applied metrics from machine translation and image captioning to evaluate a generated audio caption. Inspired by cognitive neuroscience research on auditory cognition, in this paper we present a novel metric approach that evaluates captions taking into account how human listeners derive semantic information from sounds: Audio Captioning Evaluation on Semantics of Sound (ACES).

Original language	English
Title of host publication	31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings
Publisher	IEEE
Pages	770-774
Number of pages	5
ISBN (Electronic)	9789464593600
DOIs	https://doi.org/10.23919/EUSIPCO58844.2023.10289793
Publication status	Published - 1 Jan 2023
Event	31st European Signal Processing Conference, EUSIPCO 2023 - Helsinki, Finland; Duration: 4 Sept 2023 → 8 Sept 2023; https://eusipco2023.org/

Publication series

Series	European Signal Processing Conference
ISSN	2219-5491

Conference

Conference	31st European Signal Processing Conference, EUSIPCO 2023
Abbreviated title	EUSIPCO2023
Country/Territory	Finland
City	Helsinki
Period	4/09/23 → 8/09/23
Internet address	https://eusipco2023.org/

Keywords

automated audio captioning
evaluation metric
semantics

Cite this

@inproceedings{33fd49335d3843bdba0db801c3f76f67,

title = "ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds",

abstract = "Automated Audio Captioning is a multimodal task that aims to convert audio content into natural language. The performance of audio captioning systems is evaluated on quantitative metrics applied to the text representations. Previously, researchers have applied metrics from machine translation and image captioning to evaluate a generated audio caption. Inspired by cognitive neuroscience research on auditory cognition, in this paper we present a novel metric approach that evaluates captions taking into account how human listeners derive semantic information from sounds: Audio Captioning Evaluation on Semantics of Sound (ACES).",

keywords = "automated audio captioning, evaluation metric, semantics",

author = "Gijs Wijngaard and Elia Formisano and Giordano, {Bruno L.} and Michel Dumontier",

note = "Publisher Copyright: {\textcopyright} 2023 European Signal Processing Conference, EUSIPCO. All rights reserved.; 31st European Signal Processing Conference, EUSIPCO 2023, EUSIPCO2023 ; Conference date: 04-09-2023 Through 08-09-2023",

year = "2023",

month = jan,

day = "1",

doi = "10.23919/EUSIPCO58844.2023.10289793",

language = "English",

series = "European Signal Processing Conference",

pages = "770--774",

booktitle = "31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings",

publisher = "IEEE",

address = "United States",

url = "https://eusipco2023.org/",

}

Wijngaard, G, Formisano, E, Giordano, BL & Dumontier, M 2023, ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds. in 31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings. IEEE, European Signal Processing Conference, pp. 770-774, 31st European Signal Processing Conference, EUSIPCO 2023, Helsinki, Finland, 4/09/23. https://doi.org/10.23919/EUSIPCO58844.2023.10289793

ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds. / Wijngaard, Gijs; Formisano, Elia; Giordano, Bruno L. et al.
31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings. IEEE, 2023. p. 770-774 (European Signal Processing Conference).

TY - GEN

T1 - ACES

T2 - 31st European Signal Processing Conference, EUSIPCO 2023

AU - Wijngaard, Gijs

AU - Formisano, Elia

AU - Giordano, Bruno L.

AU - Dumontier, Michel

PY - 2023/1/1

Y1 - 2023/1/1

N2 - Automated Audio Captioning is a multimodal task that aims to convert audio content into natural language. The performance of audio captioning systems is evaluated on quantitative metrics applied to the text representations. Previously, researchers have applied metrics from machine translation and image captioning to evaluate a generated audio caption. Inspired by cognitive neuroscience research on auditory cognition, in this paper we present a novel metric approach that evaluates captions taking into account how human listeners derive semantic information from sounds: Audio Captioning Evaluation on Semantics of Sound (ACES).

AB - Automated Audio Captioning is a multimodal task that aims to convert audio content into natural language. The performance of audio captioning systems is evaluated on quantitative metrics applied to the text representations. Previously, researchers have applied metrics from machine translation and image captioning to evaluate a generated audio caption. Inspired by cognitive neuroscience research on auditory cognition, in this paper we present a novel metric approach that evaluates captions taking into account how human listeners derive semantic information from sounds: Audio Captioning Evaluation on Semantics of Sound (ACES).

KW - automated audio captioning

KW - evaluation metric

KW - semantics

U2 - 10.23919/EUSIPCO58844.2023.10289793

DO - 10.23919/EUSIPCO58844.2023.10289793

M3 - Conference article in proceeding

T3 - European Signal Processing Conference

SP - 770

EP - 774

BT - 31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings

PB - IEEE

Y2 - 4 September 2023 through 8 September 2023

ER -