Low-Resource Constraints for Speech Recognition using HDNN

International Journal of Computer Trends and Technology (IJCTT)          
 
© 2018 by IJCTT Journal
Volume-61 Number-2
Year of Publication : 2018
Authors : S. Kousar Bhanu
DOI : 10.14445/22312803/IJCTT-V61P113

MLA Style: S. Kousar Bhanu. "Low-Resource Constraints for Speech Recognition using HDNN." International Journal of Computer Trends and Technology 61.2 (2018): 70-73.

APA Style: S. Kousar Bhanu. (2018). Low-Resource Constraints for Speech Recognition using HDNN. International Journal of Computer Trends and Technology, 61(2), 70-73.

Abstract
In speech recognition, the acoustic model represents the relationship between the audio signal and the units of sound that make up speech. Neural networks have proven attractive for acoustic modelling and enable techniques such as speaker adaptation, in which the model is adapted to a speaker's acoustic features. This study considers the Highway Deep Neural Network (HDNN), which equips the hidden layers with two gate units that control the flow of information through the highway network. These gate units are shared across all the hidden layers, which reduces the number of model parameters, and all model parameters are updated during sequence training to improve results. In this paper, the HDNN gate functions are implemented using Stacked Autoencoders (SAE), a layer-wise approach to training deep neural networks, based on analysis of a speech repository. The autoencoders define a machine learning model that finds a low-dimensional representation of the model parameters obtained from Direct Voice Input (DVI), a voice command-and-control interface that reads the parameters from the speech utterances of each speaker.
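The shared-gate idea described above can be illustrated with a minimal numpy sketch: each highway layer computes its own candidate transform, but a single gate parameter set is reused by every layer, so gate parameters do not grow with depth. All names, shapes, and initial values here are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class SharedGateHighway:
    """Sketch of a highway stack whose transform gate is shared
    across all hidden layers (the HDNN parameter-sharing idea).
    Hypothetical minimal implementation; dims and init are arbitrary."""

    def __init__(self, dim, depth, seed=0):
        rng = np.random.default_rng(seed)
        # Per-layer transform weights H_l(x).
        self.Ws = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(depth)]
        self.bs = [np.zeros(dim) for _ in range(depth)]
        # One gate parameter set, shared by every layer in the stack.
        self.Wg = rng.standard_normal((dim, dim)) * 0.1
        self.bg = np.full(dim, -1.0)  # bias the gate toward carrying the input

    def forward(self, x):
        for W, b in zip(self.Ws, self.bs):
            h = np.tanh(x @ W + b)              # candidate transform H(x)
            t = sigmoid(x @ self.Wg + self.bg)  # shared transform gate T(x)
            x = t * h + (1.0 - t) * x           # highway combination
        return x
```

Because the gate is parameterised once, adding layers only adds transform weights; this is one way a highway network can keep a small footprint, and the gate parameters form a compact set to update during adaptation or sequence training.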

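The layer-wise SAE training mentioned in the abstract can be sketched as greedy pretraining: each tied-weight autoencoder is trained to reconstruct its input, and its codes become the training data for the next one. This is a generic numpy sketch of the technique under simple assumptions (sigmoid units, MSE loss, plain gradient descent); the function names and hyperparameters are illustrative, not the paper's.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, hidden, epochs=100, lr=0.5, seed=0):
    """Train one tied-weight sigmoid autoencoder on X with MSE loss."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((d, hidden)) * 0.1
    b = np.zeros(hidden)  # encoder bias
    c = np.zeros(d)       # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b)        # encode
        R = sigmoid(H @ W.T + c)      # decode with tied weights
        dR = (R - X) * R * (1 - R)    # grad at decoder pre-activation
        dH = (dR @ W) * H * (1 - H)   # backprop to encoder pre-activation
        W -= lr * (X.T @ dH + dR.T @ H) / n  # tied weights: two contributions
        b -= lr * dH.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    return W, b


def pretrain_stack(X, layer_sizes):
    """Greedy layer-wise pretraining: each autoencoder is trained on the
    codes produced by the previous layer."""
    params = []
    for hidden in layer_sizes:
        W, b = train_autoencoder(X, hidden)
        params.append((W, b))
        X = sigmoid(X @ W + b)  # codes feed the next autoencoder
    return params
```

The stacked encoders produce progressively lower-dimensional codes, which is the sense in which an SAE can locate a low-dimensional representation of parameters extracted from speech utterances.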
Reference
[1] Swietojanski, Pawel, Jinyu Li, and Steve Renals. "Learning hidden unit contributions for unsupervised acoustic model adaptation." IEEE/ACM Transactions on Audio, Speech, and Language Processing 24.8 (2016): 1450-1463.
[2] Dahl, George E., et al. "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition." IEEE Transactions on audio, speech, and language processing 20.1 (2012): 30-42.
[3] Lu, Liang, and Steve Renals. "Small-footprint highway deep neural networks for speech recognition." IEEE/ACM Transactions on Audio, Speech, and Language Processing 25.7 (2017): 1502-1511.
[4] Lu, Liang. "Sequence training and adaptation of highway deep neural networks." Spoken Language Technology Workshop (SLT), 2016 IEEE. IEEE, 2016.
[5] Srivastava, Rupesh K., Klaus Greff, and Jürgen Schmidhuber. "Training very deep networks." Advances in neural information processing systems. 2015.
[6] Sak, Haşim, Andrew Senior, and Françoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." Fifteenth annual conference of the international speech communication association. 2014.
[7] Abdel-Hamid, Ossama, et al. "Convolutional neural networks for speech recognition." IEEE/ACM Transactions on audio, speech, and language processing 22.10 (2014): 1533-1545.
[8] Zhou, Ju, Li Ju, and Xiaolong Zhang. "A hybrid learning model based on auto-encoders." Industrial Electronics and Applications (ICIEA), 2017 12th IEEE Conference on. IEEE, 2017.
[9] Shinoda, Koichi. "Speaker adaptation techniques for automatic speech recognition." Proc. APSIPA ASC 2011 Xi'an (2011).
[10] Cao, Zihong, et al. "Auto-encoder using the bi-firing activation function." Machine Learning and Cybernetics (ICMLC), 2014 International Conference on. Vol. 1. IEEE, 2014.
[11] Lee, Jae-Neung, and Keun-Chang Kwak. "A performance comparison of auto-encoder and its variants for classification." Signals and Systems (ICSigSys), 2017 International Conference on. IEEE, 2017.
[12] Lin, Szu-Yin, et al. "A Dynamic Data-Driven Fine-Tuning Approach for Stacked Auto-Encoder Neural Network." e-Business Engineering (ICEBE), 2017 IEEE 14th International Conference on. IEEE, 2017.

Keywords
Stacked Auto-Encoders, Automatic Speech Recognition, Speech Recognition, HDNN, Acoustic Models