Automatic detection of intra-word code-switching

Dong Nguyen; Leonie Cornips

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Many people are multilingual and they may draw from multiple language varieties when writing their messages. This paper is a first step towards analyzing and detecting code-switching within words. We first segment words into smaller units. Then, words are identified that are composed of sequences of subunits associated with different languages. We demonstrate our method on Twitter data in which both Dutch and dialect varieties labeled as Limburgish, a minority language, are used.
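The two steps named in the abstract (segment each word into smaller units, then flag words whose units are associated with different languages) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the unit inventories are made-up placeholder strings, the greedy longest-match segmenter stands in for whatever segmentation model the paper actually uses, and the names `UNIT_LEXICONS`, `segment`, `label_unit`, and `is_intra_word_switch` are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy inventories of subword units per language variety.
# These strings are placeholders for illustration, not data from the paper.
UNIT_LEXICONS: Dict[str, Set[str]] = {
    "dutch": {"ge", "werk", "huis"},
    "limburgish": {"hoes", "wirk", "ke"},
}


def segment(word: str, units: Set[str]) -> List[str]:
    """Greedy longest-match segmentation (a stand-in, not the paper's model).

    Characters that match no known unit become single-character pieces.
    """
    pieces: List[str] = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in units:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known unit starts at position i; emit one character.
            pieces.append(word[i])
            i += 1
    return pieces


def label_unit(unit: str) -> Optional[str]:
    """Return the single language a unit belongs to, or None if unknown or ambiguous."""
    langs = [lang for lang, units in UNIT_LEXICONS.items() if unit in units]
    return langs[0] if len(langs) == 1 else None


def is_intra_word_switch(word: str) -> bool:
    """Flag a word whose units are associated with more than one language."""
    all_units: Set[str] = set().union(*UNIT_LEXICONS.values())
    labels = {label_unit(u) for u in segment(word.lower(), all_units)}
    labels.discard(None)
    return len(labels) > 1


if __name__ == "__main__":
    for w in ["gewerk", "gewirk", "huiske"]:
        print(w, segment(w, set().union(*UNIT_LEXICONS.values())), is_intra_word_switch(w))
```

Units that are unknown or shared between inventories are ignored when deciding, so a word is flagged only when at least two of its units point unambiguously to different varieties.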

Original language	English
Title of host publication	Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Editors	M. Elsner, S. Kubler
Publisher	The Association for Computational Linguistics
Pages	82-86
Number of pages	5
ISBN (Print)	9781945626081
Publication status	Published - 1 Jan 2016
Externally published	Yes
Event	14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany. Duration: 11 Aug 2016 → 11 Aug 2016. https://www.ling.ohio-state.edu/sigmorphon/

Workshop

Workshop	14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory	Germany
City	Berlin
Period	11/08/16 → 11/08/16
Internet address	https://www.ling.ohio-state.edu/sigmorphon/

Access to Document

https://aclanthology.org/W16-2013/
Licence: CC BY

Cite this

Nguyen, D., & Cornips, L. (2016). Automatic detection of intra-word code-switching. In M. Elsner & S. Kubler (Eds.), Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 (pp. 82-86). The Association for Computational Linguistics. https://aclanthology.org/W16-2013/

Nguyen, Dong ; Cornips, Leonie. / Automatic detection of intra-word code-switching. Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016. editor / M. Elsner ; S. Kubler. The Association for Computational Linguistics, 2016. pp. 82-86

@inproceedings{9728478158c147f3ba683ddbd1e7f871,

title = "Automatic detection of intra-word code-switching",

abstract = "Many people are multilingual and they may draw from multiple language varieties when writing their messages. This paper is a first step towards analyzing and detecting code-switching within words. We first segment words into smaller units. Then, words are identified that are composed of sequences of subunits associated with different languages. We demonstrate our method on Twitter data in which both Dutch and dialect varieties labeled as Limburgish, a minority language, are used.",

author = "Dong Nguyen and Leonie Cornips",

note = "Funding Information: This research was supported by the Netherlands Organization for Scientific Research (NWO), grants 314-98-008 (Twidentity) and 640.005.002 (FACT). Publisher Copyright: {\textcopyright} 2016 Association for Computational Linguistics.; 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 ; Conference date: 11-08-2016 Through 11-08-2016",

year = "2016",

month = jan,

day = "1",

language = "English",

isbn = "9781945626081",

pages = "82--86",

editor = "M. Elsner and S. Kubler",

booktitle = "Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016",

publisher = "The Association for Computational Linguistics",

address = "United States",

url = "https://aclanthology.org/W16-2013/",

}

Nguyen, D & Cornips, L 2016, Automatic detection of intra-word code-switching. in M Elsner & S Kubler (eds), Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016. The Association for Computational Linguistics, pp. 82-86, 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, Berlin, Germany, 11/08/16. <https://aclanthology.org/W16-2013/>

Automatic detection of intra-word code-switching. / Nguyen, Dong; Cornips, Leonie.
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016. ed. / M. Elsner; S. Kubler. The Association for Computational Linguistics, 2016. p. 82-86.

TY - GEN

T1 - Automatic detection of intra-word code-switching

AU - Nguyen, Dong

AU - Cornips, Leonie

N1 - Funding Information: This research was supported by the Netherlands Organization for Scientific Research (NWO), grants 314-98-008 (Twidentity) and 640.005.002 (FACT). Publisher Copyright: © 2016 Association for Computational Linguistics.

PY - 2016/1/1

Y1 - 2016/1/1

N2 - Many people are multilingual and they may draw from multiple language varieties when writing their messages. This paper is a first step towards analyzing and detecting code-switching within words. We first segment words into smaller units. Then, words are identified that are composed of sequences of subunits associated with different languages. We demonstrate our method on Twitter data in which both Dutch and dialect varieties labeled as Limburgish, a minority language, are used.

M3 - Conference article in proceeding

SN - 9781945626081

SP - 82

EP - 86

BT - Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016

A2 - Elsner, M.

A2 - Kubler, S.

PB - The Association for Computational Linguistics

T2 - 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016

Y2 - 11 August 2016 through 11 August 2016

ER -

Nguyen D, Cornips L. Automatic detection of intra-word code-switching. In Elsner M, Kubler S, editors, Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016. The Association for Computational Linguistics. 2016. p. 82-86