Natural language processing is a field within artificial intelligence that studies how to model human language computationally. The representation of words as vectors, known as word embedding, has become popular in recent years through techniques such as Doc2Vec and Word2Vec. This study evaluates the use of Doc2Vec on a set of conversations collected by the ECU911 emergency center. The purpose was to classify incidents so that the operator can make the best decision about the actions to take when an emergency occurs. The data were recorded during 2020 at the emergency center located in Cuenca, Ecuador. In addition, Doc2Vec was compared with Word2Vec to assess its performance in terms of both accuracy and time. Based on the tests performed, it was concluded that Doc2Vec performs solidly when using models trained on a large corpus, outperforming Word2Vec.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.