Automatic detection of intra-word code-switching

Dong Nguyen, Leonie Cornips

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceeding; Academic; peer-review

Abstract

Many people are multilingual and they may draw from multiple language varieties when writing their messages. This paper is a first step towards analyzing and detecting code-switching within words. We first segment words into smaller units. Then, words are identified that are composed of sequences of subunits associated with different languages. We demonstrate our method on Twitter data in which both Dutch and dialect varieties labeled as Limburgish, a minority language, are used.
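The two steps named in the abstract (segment words into subunits, then flag words whose subunits are associated with different languages) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the greedy longest-match segmenter stands in for whatever segmentation method the paper uses, the lexicon `SUBUNIT_LANGS` is hypothetical, and its entries are invented placeholder strings, not real Dutch or Limburgish morphemes.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy lexicon: subunit -> languages it is associated with.
# The strings are invented placeholders, not real Dutch ("nl") or
# Limburgish ("li") morphemes.
SUBUNIT_LANGS: Dict[str, Set[str]] = {
    "kata": {"nl"},
    "miru": {"li"},
    "so": {"nl", "li"},
}


def segment(word: str, vocab: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Greedy longest-match segmentation (a stand-in for a real segmenter).

    Returns None when the word cannot be fully covered by known subunits.
    """
    units: List[str] = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                units.append(word[i:j])
                i = j
                break
        else:
            return None  # no known subunit starts at position i
    return units


def is_intra_word_switch(word: str) -> bool:
    """Flag a word when no single language is shared by all its subunits."""
    units = segment(word, SUBUNIT_LANGS)
    if not units:
        return False
    shared = set.intersection(*(SUBUNIT_LANGS[u] for u in units))
    return len(shared) == 0
```

In this sketch, `is_intra_word_switch("katamiru")` flags the word because its two subunits share no language, while `"kataso"` is not flagged because `"so"` is associated with both languages. A subunit shared across languages therefore never triggers a flag on its own, which is one possible design choice among several.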
Original language: English
Title of host publication: Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Editors: M. Elsner, S. Kubler
Publisher: The Association for Computational Linguistics
Pages: 82-86
Number of pages: 5
ISBN (Print): 9781945626081
Publication status: Published - 1 Jan 2016
Externally published: Yes
Event: 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: 11 Aug 2016 to 11 Aug 2016
https://www.ling.ohio-state.edu/sigmorphon/

Workshop

Workshop: 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory: Germany
City: Berlin
Period: 11/08/16 to 11/08/16