Practical Steps for Building a Speech Recognition System with HTK

Authors

  • Zulkarnaen Hatala, Politeknik Negeri Ambon

DOI:

https://doi.org/10.36085/jsai.v2i2.314

Abstract

This paper presents a procedure for developing an Automatic Speech Recognition (ASR) system for online recognition. The procedure builds an ASR system quickly and efficiently using the Hidden Markov Model Toolkit (HTK). The practical steps are laid out clearly for implementing a small-vocabulary ASR system, using Indonesian digit recognition as the example case. Several techniques for improving performance are explained, including noise handling, multiple pronunciations, and the application of Principal Component Analysis. The final result is reported as a Word Error Rate.
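Because the result is stated as a Word Error Rate, a minimal sketch of how that metric is commonly computed may help: the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the Indonesian digit strings below are illustrative assumptions, not taken from the paper, and this is not HTK's own scoring code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not part of HTK.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Assumed example: four spoken digits, one substituted and one dropped.
print(word_error_rate("satu dua tiga empat", "satu tiga tiga"))  # 0.5
```

Note that the value can exceed 1.0 when the recognizer inserts many extra words, since the denominator is the reference length only.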

Author Biography

Zulkarnaen Hatala, Politeknik Negeri Ambon

Department of Electrical Engineering

Published

2019-06-28