A Review on Domain Based Automatic Speech Recognition Technology

International Journal of Computer Trends and Technology (IJCTT)          
 
© 2019 by IJCTT Journal
Volume-67 Issue-6
Year of Publication : 2019
Authors : Nobel Jacob Varghese, Dr. Cini Kurian
DOI : 10.14445/22312803/IJCTT-V67I6P120

MLA Style: Nobel Jacob Varghese, Dr. Cini Kurian. "A Review on Domain Based Automatic Speech Recognition Technology." International Journal of Computer Trends and Technology 67.6 (2019): 117-120.

APA Style: Nobel Jacob Varghese, Dr. Cini Kurian. (2019). A Review on Domain Based Automatic Speech Recognition Technology. International Journal of Computer Trends and Technology, 67(6), 117-120.

Abstract
Automatic Speech Recognition (ASR) technology is a multidisciplinary research area with tremendous potential. It has become an integral part of future intelligent systems, in which speech recognition and speech synthesis serve as the basic mode of communication with humans. In this paper, a survey of the technological development of domain-based automatic speech recognition is presented.

Reference
[1] B. H. Juang and L. R. Rabiner (2005), "Automatic Speech Recognition – A Brief History of the Technology," in Elsevier Encyclopaedia of Language and Linguistics, Second Edition, Elsevier.
[2] Wiqas Ghai and Navdeep Singh, "Literature Review on Automatic Speech Recognition," International Journal of Computer Applications, Vol. 41, March 2012.
[3] H. Dudley and T. H. Tarnoczy, The Speaking Machine of Wolfgang von Kempelen, J. Acoust. Soc. Am., Vol. 22, pp. 151-166, 1950.
[4] H. Dudley, R. R. Riesz, and S. A. Watkins, A Synthetic Speaker, J. Franklin Institute, Vol. 227, pp. 739-764, 1939.
[5] T. B. Martin, A. L. Nelson, and H. J. Zadell, Speech Recognition by Feature Abstraction Techniques, Tech. Report AL-TDR-64-176, Air Force Avionics Lab, 1964.
[6] T. K. Vintsyuk, Speech Discrimination by Dynamic Programming, Kibernetika, Vol. 4, No. 2, pp. 81-88, Jan.-Feb. 1968.
[7] H. Sakoe and S. Chiba, Dynamic Programming Algorithm Optimization for Spoken Word Recognition, IEEE Trans. Acoustics, Speech and Signal Proc., Vol. ASSP-26, No. 1, pp. 43-49, Feb. 1978.
[8] A. J. Viterbi, Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm, IEEE Trans. Information Theory, Vol. IT-13, pp. 260-269, April 1967.
[9] D. R. Reddy, "An Approach to Computer Speech Recognition by Direct Analysis of the Speech Wave," Tech. Report No. C549, Computer Science Dept., Stanford Univ., 1966.
[10] V. M. Velichko and N. G. Zagoruyko, Automatic Recognition of 200 Words, Int. J. Man-Machine Studies, Vol. 2, p. 223, June 1970.
[11] B. S. Atal and S. L. Hanauer, Speech Analysis and Synthesis by Linear Prediction of the Speech Wave, J. Acoust. Soc. Am. Vol. 50, No. 2, pp. 637-655, Aug. 1971.
[12] F. Itakura and S. Saito, A Statistical Method for Estimation of Speech Spectral Density and Formant Frequencies, Electronics and Communications in Japan, Vol. 53A, pp. 36-43, 1970.
[13] F. Itakura, Minimum Prediction Residual Principle Applied to Speech Recognition, IEEE Trans. Acoustics, Speech and Signal Proc., Vol. ASSP-23, pp. 57-72, Feb. 1975.
[14] L. R. Rabiner, S. E. Levinson, A. E. Rosenberg and J. G. Wilpon, Speaker Independent Recognition of Isolated Words Using Clustering Techniques, IEEE Trans. Acoustics, Speech and Signal Proc., Vol. ASSP-27, pp. 336-349, Aug. 1979.
[15] Jean Francois, Automatic Word Recognition Based on Second-Order Hidden Markov Models, IEEE Transactions on Speech and Audio Processing, Vol. 5, No. 1, Jan. 1997.
[16] Mark J. F. Gales, Katherine M. Knill, et al., State-Based Gaussian Selection in Large Vocabulary Continuous Speech Recognition Using HMMs, IEEE Transactions on Speech and Audio Processing, Vol. 7, No. 2, March 1999.
[17] Qiang Huo et al., Bayesian Adaptive Learning of the Parameters of Hidden Markov Model for Speech Recognition, IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 5, Sept. 1995.
[18] R. P. Lippmann, Review of Neural Networks for Speech Recognition, Readings in Speech Recognition, A. Waibel and K. F. Lee, Editors, Morgan Kaufmann Publishers, pp. 374-392,1990.
[19] B. H. Juang, C. H. Lee and Wu Chou, Minimum Classification Error Rate Methods for Speech Recognition, IEEE Trans. Speech & Audio Processing, Vol. 5, No. 3, pp. 257-265, May 1997.
[20] A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. Lang. Phoneme recognition using time-delay neural networks. IEEE Transactions on Acoustics, Speech and Signal Processing, 37:328–339, 1989.
[21] T. Robinson and F. Fallside. A recurrent error propagation network speech recognition system. Computer, Speech and Language, 5:259–274, 1991.
[22] H. Bourlard and N. Morgan. Continuous speech recognition by connectionist statistical methods. IEEE Transactions on Neural Networks, 4:893–909, 1993.
[23] H. Bourlard and N. Morgan. Connectionist speech recognition: a hybrid approach. Boston: Kluwer Academic, Norwell, MA (USA), 1994.
[24] T. Robinson, M. Hochberg, and S. Renals. The Use of Recurrent Neural Networks in Continuous Speech Recognition (Chapter 19), pages 159–184. Kluwer Academic Publishers, Norwell, MA (USA), 1995.
[25] W. Reichl and G. Ruske. A hybrid RBF-HMM system for continuous speech recognition. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3335–3338, Detroit, MI (USA), 1995.
[26] K. Iso and T. Watanabe. Speaker-Independent Word Recognition using a Neural Prediction Model. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 441–444, Albuquerque, New Mexico (USA), 1990.
[27] J. Tebelskis, A. Waibel, B. Petek, and O. Schmidbauer. Continuous Speech Recognition using Predictive Neural Networks. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 61–64, Toronto, Canada, 1991.
[28] D. Ellis, R. Singh, and S. Sivadas. Tandem acoustic modeling in large-vocabulary recognition. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 517–520, Salt Lake City, Utah (USA), 2001.
[29] B. H. Juang, C. H. Lee and Wu Chou, Minimum Classification Error Rate Methods for Speech Recognition, IEEE Trans. Speech & Audio Processing, Vol. 5, No. 3, pp. 257-265, May 1997.
[30] L. R. Bahl, P. F. Brown, P. V. de Souza and R. L. Mercer, Maximum Mutual Information Estimation of Hidden Markov Model Parameters for Speech Recognition, Proc. ICASSP 86, Tokyo, Japan, pp. 49-52, April 1986.
[31] V. N. Vapnik, Statistical Learning Theory, John Wiley and Sons, 1998.
[32] A. Ganapathiraju, J.E. Hamaker, and J. Picone. Applications of support vector machines to speech recognition. IEEE Transactions on Signal Processing, 52:2348–2355, 2004.
[33] N. Thubthong and B. Kijsirikul. Support vector machines for Thai phoneme recognition. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9:803–13, 2001.
[34] J. M. García-Cabellos, C. Peláez-Moreno, A. Gallardo-Antolín, F. Pérez-Cruz, and F. Díaz-de-María. SVM Classifiers for ASR: A Discussion about Parameterization. In Proceedings of EUSIPCO 2004, pages 2067–2070, Wien, Austria, 2004.
[35] A. Ech-Cherif, M. Kohili, A. Benyettou, and M. Benyettou. Lagrangian support vector machines for phoneme classification. In Proceedings of the 9th International Conference on Neural Information Processing (ICONIP'02), volume 5, pages 2507–2511, Singapore, 2002.
[36] D. Martín-Iglesias, J. Bernal-Chaves, C. Peláez-Moreno, A. Gallardo-Antolín, and F. Díaz-de-María. A Speech Recognizer based on Multiclass SVMs with HMM-Guided Segmentation, pages 256–266. Springer, 2005.
[37] R. Solera-Ureña, D. Martín-Iglesias, A. Gallardo-Antolín, C. Peláez-Moreno, and F. Díaz-de-María. Robust ASR using Support Vector Machines. Speech Communication, Elsevier, 2006.
[38] S.V. Gangashetty, C. Sekhar, and B. Yegnanarayana. Combining evidence from multiple classifiers for recognition of consonant-vowel units of speech in multiple languages. In Proceedings of the International Conference on Intelligent Sensing and Information Processing, pages 387–391, Chennai, India, 2005.
[39] H. Shimodaira, K. I. Noma, M. Nakai, and S. Sagayama. Support vector machine with dynamic time-alignment kernel for speech recognition. In Proceedings of Eurospeech, pages 1841–1844, Aalborg, Denmark, 2001.
[40] H. Shimodaira, K. Noma, and M. Nakai. Advances in Neural Information Processing Systems 14, volume 2, chapter Dynamic Time-Alignment Kernel in Support Vector Machine, pages 921–928. MIT Press, Cambridge, MA (USA), 2002.
[41] K.-F. Lee, Large-vocabulary speaker-independent continuous speech recognition: The Sphinx system, Ph.D. Thesis, Carnegie Mellon University, 1988.
[42] R. Schwartz and C. Barry and Y.-L. Chow and A. Derr and M.-W. Feng and O. Kimball and F. Kubala and J. Makhoul and J. Vandegrift, The BBN BYBLOS Continuous Speech Recognition System, in Proc. of the Speech and Natural Language Workshop, p. 94-99, Philadelphia, PA, 1989.
[43] H. Murveit, M. Cohen, P. Price, G. Baldwin, M. Weintraub and J. Bernstein, SRI's DECIPHER System, in Proceedings of the Speech and Natural Language Workshop, pp. 238-242, Philadelphia, PA, 1989.
[44] S. Young et al., The HTK Book, http://htk.eng.cam.ac.uk/.
[45] Adoram Erell et al., Energy Conditioned Spectral Estimation for Recognition of Noisy Speech, IEEE Transactions on Speech and Audio Processing, Vol. 1, No. 1, Jan. 1993.
[46] Adoram Erell et al., Filter Bank Energy Estimation Using Mixture and Markov Models for Recognition of Noisy Speech, IEEE Transactions on Speech and Audio Processing, Vol. 1, No. 1, Jan. 1993.
[47] Javier Hernando and Climent Nadeu, Linear Prediction of the One-Sided Autocorrelation Sequence for Noisy Speech Recognition, IEEE Transactions on Speech and Audio Processing, Vol. 5, No. 1, January 1997.
[48] C. J. Leggetter and P. C. Woodland, Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Computer Speech and Language, 9, 171-185, 1995.
[49] A. P. Varga and R. K. Moore, Hidden Markov model decomposition of speech and noise, Proc. ICASSP, pp.845-848, 1990.
[50] M. J. F. Gales and S. J. Young, Parallel Model Combination for Speech Recognition in Noise, Technical Report CUED/F-INFENG/TR135, 1993.
[51] K. Shinoda and C. H. Lee, A structural Bayes approach to speaker adaptation, IEEE Trans. Speech and Audio Proc., 9, 3, pp. 276-287, 2001.
[52] Mazin G. Rahim et al., Signal Bias Removal by Maximum Likelihood Estimation for Robust Telephone Speech Recognition, IEEE Transactions on Speech and Audio Processing, Vol. 4, No. 1, Jan. 1996.
[53] Ananth Sankar, A Maximum Likelihood Approach to Stochastic Matching for Robust Speech Recognition, IEEE Transactions on Speech and Audio Processing, Vol. 4, No. 3, May 1996.
[54] Doh-Suk Kim, Auditory Processing of Speech Signals for Robust Speech Recognition in Real-World Noisy Environments, IEEE Transactions on Speech and Audio Processing, Vol. 7, No. 1, January 1999.
[55] Mark J. F. Gales, Katherine M. Knill, et al., State-Based Gaussian Selection in Large Vocabulary Continuous Speech Recognition Using HMMs, IEEE Transactions on Speech and Audio Processing, Vol. 7, No. 2, March 1999.
[56] Jen-Tzung Chien, Online Hierarchical Transformation of Hidden Markov Models for Speech Recognition, IEEE Transactions on Speech and Audio Processing, Vol. 7, No. 6, November 1999.
[57] K. H. Davis, R. Biddulph, and S. Balashek, Automatic Recognition of Spoken Digits, J. Acoust. Soc. Am., 24(6): 637-642, 1952.
[58] K. Nagata, Y. Kato, and S. Chiba, Spoken Digit Recognizer for Japanese Language, NEC Res. Develop., No. 6, 1963.
[59] Md Sah Bin Hj Salam, Dzulkifli Mohamad, Sheikh Hussain Shaikh Salleh: Malay isolated speech recognition using neural network: a work in finding number of hidden nodes and learning parameters. Int. Arab J. Inf. Technol. 8(4): 364-371 (2011).

Keywords
Automatic Speech Recognition, Malayalam