Optimising a Language Recognition System Through Phoneme-Based Vector Representation

Francisco Charro; Marco Herrera; Nataly Pozo; Andrés Rosales

Authors

Francisco Charro Escuela Politécnica Nacional
Marco Herrera Escuela Politécnica Nacional
Nataly Pozo Escuela Politécnica Nacional
Andrés Rosales Escuela Politécnica Nacional

Keywords:

Vector Representation, Language Recognition, Skip-gram, n-grams, embeddings.

Abstract

This article analyzes the vector representation of phonemes as a way to improve a language identification (LID) system. The CBOW (Continuous Bag-of-Words) and Skip-gram architectures proposed by Mikolov are studied. These models learn n-dimensional vectors by predicting a word from its context (CBOW) or the context from a word (Skip-gram). In this work we analyze the application of these models to smaller phonetic units, or n-grams.
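To illustrate how Skip-gram and CBOW training examples could be built from phoneme n-grams, the following Python sketch may help. It is a minimal sketch under stated assumptions and not the authors' implementation: the helper names (`phoneme_ngrams`, `skipgram_pairs`, `cbow_examples`), the underscore used to join phonemes, the window size and the toy phoneme sequence are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def phoneme_ngrams(phonemes: List[str], n: int) -> List[str]:
    """Join every run of n consecutive phonemes into a single n-gram token."""
    return ["_".join(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def skipgram_pairs(tokens: List[str], window: int) -> List[Tuple[str, str]]:
    """Build (center, context) pairs: Skip-gram predicts each neighbour from the center token."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens: List[str], window: int) -> List[Tuple[List[str], str]]:
    """Build (context, target) examples: CBOW predicts the center token from its neighbours."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, target))
    return examples


# Hypothetical phoneme sequence, as a phoneme recognizer might output it (assumption).
phonemes = ["o", "l", "a", "m", "u", "n", "d", "o"]

bigrams = phoneme_ngrams(phonemes, 2)      # phoneme 2-grams used as "words"
pairs = skipgram_pairs(bigrams, window=1)  # training pairs for a Skip-gram model
cbow = cbow_examples(bigrams, window=1)    # training examples for a CBOW model

print(bigrams)
print(pairs[:3])
print(cbow[:2])
```

In this sketch each phoneme n-gram plays the role that a word plays in the original models, so the resulting pairs could be passed to any word-embedding trainer to obtain one vector per n-gram.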

References

E. Ambikairajah, H. Li, L. Wang, B. Yin, and V. Sethu. “Language Identification: A tutorial”. IEEE Circuits and Systems Magazine, pages 82-108, May 2011.

E. Singer, P. A. Torres-Carrasquillo, T. P. Gleason, W. M. Campbell, and D. A. Reynolds. “Acoustic, phonetic, and discriminative approaches to automatic language identification”. In Interspeech, 2003.

C. Salamea, L. F. D'Haro, R. de Córdoba, and M. A. Caraballo. “Incorporación de n-gramas discriminativos para mejorar un reconocedor de idioma fonotáctico basado en i-vectores”. Procesamiento del Lenguaje Natural, no. 51, pages 145-152, 2013.

M. A. Zissman et al. “Comparison of four approaches to automatic language identification of telephone speech”. IEEE Transactions on Speech and Audio Processing, pages 31-44, 1996.

L. J. Rodriguez-Fuentes, N. Brummer, M. Penagarikano, A. Varona, G. Bordel, and M. Diez. “The Albayzin 2012 language recognition evaluation”. In Interspeech, pages 1497-1501, 2013.

S. Lai, K. Liu, L. Xu, and J. Zhao. “How to Generate a Good Word Embedding”. National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, China, July 2015.

T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. “Distributed Representations of Words and Phrases and their Compositionality”. In Proceedings of NIPS, 2013.

M. Díez, A. Varona, M. Peñagarikano, L. J. Rodríguez-Fuentes, and G. Bordel. “On the use of phone log-likelihood ratios as features in spoken language recognition”. In SLT, pages 274-279, 2012.

P. Schwarz. “Phoneme Recognition based on Long Temporal Context”. PhD thesis, Brno University of Technology, 2009.

D. A. Reynolds. “A Gaussian mixture modeling approach to text-independent speaker identification”. PhD thesis, Georgia Inst. of Technol., 1992.

Published

2017-11-01

Issue

Vol. 4 No. 3 (2017): SPECIAL ISSUE

Section

Research Articles for the Regular Issue

License

Copyright Notice

Authors who publish in this journal agree to the following terms:

Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-Non-Commercial-Share-Alike 4.0 International license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.

Disclaimer

LAJC in no event shall be liable for any direct, indirect, incidental, punitive, or consequential copyright infringement claims related to articles that have been submitted for evaluation, or published in any issue of this journal. Find out more in our Disclaimer Notice.

How to Cite

[1] F. Charro, M. Herrera, N. Pozo, and A. Rosales, “Optimising a Language Recognition System Through Phoneme-Based Vector Representation”, LAJC, vol. 4, no. 3, pp. 49–54, Nov. 2017, Accessed: Jul. 26, 2026. [Online]. Available: https://lajc.epn.edu.ec/index.php/LAJC/article/view/130