Use of NLP Techniques for an Enhanced Mobile Personal Assistant: The Case of Turkish

Gulsen Eryigit, Gokhan Celikkaya
  • Gokhan Celikkaya, Istanbul Technical University

Abstract

This article introduces a Turkish mobile assistant application which produces state-of-the-art results for the Turkish language by using natural language processing (NLP) techniques. The voice-enabled mobile assistant application allows users to enter queries for nine pre-defined tasks: making calls, sending SMS messages and emails, getting directions, querying exchange rates, weather forecasts and traffic information, searching the internet, and launching applications on the phone. Users’ queries are processed in a multi-stage approach (viz., NLP, query classification and parameter extraction). The application responds either by performing the requested task or by displaying the requested information. The article presents the architecture of the introduced system, a comparison with some prominent mobile assistants, and the newly created data resources (viz., two query datasets annotated for classification and parameter extraction, and two datasets for domain adaptation of the named entity recognition and syntactic parsing NLP modules) to be used in further research. Evaluations of the impact of the NLP preprocessing layers on query classification performance reveal that the value added by NLP ranges from 0.2 to 10.7 percentage points, depending on the machine learning algorithm chosen for the query classification stage. NLP is also crucial for the parameter extraction stage, since the extraction rules systematically use the outputs of the NLP modules. The overall performance of the introduced approach is measured as 70.8%, which is very promising given that the system is trained with a very limited amount of annotated data. The technology introduced in this article is designed for a mobile assistant, but it can also be used in any voice-enabled control system, such as smart homes or smart televisions, to improve the user experience.
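
The multi-stage flow described in the abstract (preprocessing, query classification, parameter extraction) can be pictured with a minimal Python sketch. This is not the authors' implementation. The cue-word lists, regular expressions, task labels and function names are all hypothetical, and a simple keyword-overlap score stands in for the machine learning classifiers evaluated in the article. English queries are used purely for readability.

```python
import re

# Hypothetical cue words for a few of the nine tasks (invented for illustration).
TASK_CUES = {
    "call": {"call", "dial", "phone"},
    "sms": {"text", "sms", "message"},
    "weather": {"weather", "forecast", "rain"},
    "exchange_rate": {"dollar", "euro", "exchange", "rate"},
}

# Hypothetical rule-based extractors: one regular expression per task.
PARAM_RULES = {
    "call": re.compile(r"\b(?:call|dial|phone)\s+(?P<contact>.+)$"),
    "sms": re.compile(
        r"\b(?:text|sms|message)\s+(?P<contact>\w+)\s+(?:that|saying)\s+(?P<body>.+)$"
    ),
    "weather": re.compile(r"\bin\s+(?P<city>\w+)"),
}


def preprocess(query):
    """Stand-in for the NLP stage: lowercase the query and split it into tokens."""
    return re.findall(r"\w+", query.lower())


def classify(tokens):
    """Stand-in for the classifier: choose the task sharing the most cue words."""
    token_set = set(tokens)
    scores = {task: len(cues & token_set) for task, cues in TASK_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_parameters(task, query):
    """Apply the task's extraction rule; return {} when no rule exists or matches."""
    rule = PARAM_RULES.get(task)
    if rule is None:
        return {}
    match = rule.search(query.lower())
    return match.groupdict() if match else {}


def handle(query):
    """Run the three stages in order and return (task, parameters)."""
    tokens = preprocess(query)
    task = classify(tokens)
    params = extract_parameters(task, query) if task else {}
    return task, params


if __name__ == "__main__":
    for q in ["Call Ahmet", "What is the weather in Istanbul tomorrow"]:
        print(q, "->", handle(q))
```

In a fuller system, `preprocess` would be replaced by real morphological analysis, named entity recognition and parsing, and the extraction rules would read those outputs rather than raw text, as the abstract indicates.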

Keywords

natural language processing; NLP; question answer system; mobile assistant

Submitted: 2017-03-11 17:24:49
Published: 2017-09-29 16:13:25


Copyright (c) 2017 International Journal of Intelligent Systems and Applications in Engineering

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
 