TED Talk Teaser Generation with Pre-Trained Models

G. Vico*, J. Niehues

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

While we have seen significant advances in automatic summarization for text, research on speech summarization is still limited. In this work, we address the challenge of automatically generating teasers for TED talks. In the first step, we create a corpus for automatic summarization of TED and TEDx talks consisting of the talks' recordings, their transcripts and their descriptions. The corpus is used to build a speech summarization system for the task. We adapt and combine pre-trained models for automatic speech recognition (ASR) and text summarization using the collected data. This initial work shows that it is more important to adapt the summarization model to the ASR transcripts than to adapt the ASR model to the talks.
Original language: English
Title of host publication: ICASSP 2022 - 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
Publisher: IEEE
Pages: 8067-8071
Number of pages: 5
ISBN (Print): 9781665405409
Publication status: Published - 2022
Event: 47th IEEE International Conference on Acoustics, Speech and Signal Processing - Online, Singapore, Singapore
Duration: 22 May 2022 to 27 May 2022
Conference number: 47
https://2022.ieeeicassp.org/

Publication series

Series: International Conference on Acoustics Speech and Signal Processing Proceedings
ISSN: 1520-6149

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2022
Country/Territory: Singapore
City: Singapore
Period: 22/05/22 to 27/05/22

Keywords

  • speech summarization
  • automatic speech recognition
  • abstractive summarization
