Developing English to Dawurootsuwa Machine Translation Model using RNN

Elias Asefa; Hussien Seid

doi:https://doi.org/10.14445/22312803/IJCTT-V69I6P108

Research Article | Open Access

Volume 69 | Issue 6 | Year 2021 | Article Id. IJCTT-V69I6P108 | DOI : https://doi.org/10.14445/22312803/IJCTT-V69I6P108

Received	Revised	Accepted
07 May 2021	11 Jun 2021	17 Jun 2021

Citation :

Elias Asefa, Hussien Seid, "Developing English to Dawurootsuwa Machine Translation Model using RNN," International Journal of Computer Trends and Technology (IJCTT), vol. 69, no. 6, pp. 49-56, 2021. Crossref, https://doi.org/10.14445/22312803/IJCTT-V69I6P108

Abstract

Language translation has developed in recent years as a way to address the challenges of linguistic diversity. Translation systems from English into Amharic, Afaan Oromoo, Tigrigna, Chinese, French, and Somali have been developed. However, to the best of the researchers' knowledge, no English to Dawurootsuwa machine translation system has been developed. This thesis aims to develop a unidirectional English to Dawurootsuwa machine translation model using Neural Network (NN) approaches. A Recurrent Neural Network (RNN) predicts the output text based on the current input and the previous output. In the LSTM and GRU variants of the RNN, each neuron is replaced with a cell that has control gates; this memory cell uses its gates to manage the flow of the sentence in the correct order and is fully connected to the model. A parallel corpus of 20,345 sentence pairs was prepared from different sources and split into a 90% training set and a 10% test set. An RNN model with 22 input nodes and 27 output nodes was developed and implemented using the Keras toolkit for the Python programming language and the Adam optimization algorithm. Four models were evaluated with an automatic metric (BLEU score) and a manual evaluation technique (Arithmetic Mean Value, AMV), each with a hidden layer size of 2. The simple RNN model achieved a BLEU score of 0.5187 at a learning rate of 0.002, with an AMV result of 0.60914. The embedding RNN model achieved a BLEU score of 0.5245 at a learning rate of 0.003, with an AMV result of 0.60914. The bidirectional RNN model achieved a BLEU score of 0.5452 at a learning rate of 0.004, with an AMV result of 0.60914. Finally, the encoder-decoder model achieved a BLEU score of 0.555 at a learning rate of 0.005, with an AMV result of 0.60914. Beyond a learning rate of 0.005, similar scores were recorded, with a maximum of 100 epochs.
From these results, it is concluded that the encoder-decoder model, with a BLEU score of 0.555, achieved fairly good accuracy compared with the other models, although this score is lower than the AMV result. Future work should expand the English-Dawurootsuwa parallel corpus to further improve accuracy and minimize loss, and extend the model from a unidirectional to a multidirectional language model.
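
As a rough illustration of the evaluation setup described in the abstract, the sketch below shows a 90%/10% corpus split and a single-reference, unsmoothed sentence-level BLEU score in plain Python. The abstract does not state which BLEU implementation, n-gram order, or smoothing the authors used, so the function names (`split_corpus`, `sentence_bleu`), the maximum n-gram order of 4, and the in-order split are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter


def split_corpus(pairs, train_fraction=0.9):
    """Split sentence pairs into training and test sets, keeping their order.

    Hypothetical helper: the paper only states a 90%/10% split.
    """
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Single-reference BLEU with uniform weights and no smoothing (assumed)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    # Placeholder pairs standing in for the 20,345-pair corpus.
    corpus = [("src %d" % i, "tgt %d" % i) for i in range(20345)]
    train, test = split_corpus(corpus)
    print(len(train), len(test))

    reference = "a b c d e".split()
    candidate = "a b c d".split()
    print(sentence_bleu(reference, candidate))
```

A corpus-level BLEU, which is what translation papers usually report, pools n-gram counts over all test sentences before computing the precisions, so it is not simply the average of these per-sentence scores.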

Keywords

Artificial Neural Network, Dawurootsuwa, English, Machine Translation.
