Intelligent Chatbot for Identifying Scams and Malware on WhatsApp and Generating Recommendations for Users

Authors

  • José Gerardo Chacón Rangel
  • Juan Carlos Escalante
  • Brian David Acevedo Gomez
  • Geron José Vergara García

Keywords

Chatbot, Cybersecurity, Digital fraud, Optical Character Recognition, WhatsApp

Abstract

The growth of digital scams and the spread of malicious content on instant messaging applications, especially WhatsApp, have become a problem that compromises users' financial security and well-being, and older adults are particularly vulnerable. In this context, a chatbot was developed for people without technical knowledge. It analyzes messages and images received via WhatsApp, identifies signs of scams or possible malware, and offers clear, actionable preventive recommendations. A quantitative approach with iterative development was followed: a dataset of real and simulated messages was built, anonymized, and labeled; the text was normalized using natural language processing techniques and represented with term frequency–inverse document frequency (TF-IDF) weighting; and a classification model based on support vector machines (SVM) was then trained. For the visual component, a preprocessing flow combining computer vision and optical character recognition (OCR) was implemented to extract text from screenshots and evaluate it with the same analysis pipeline. The system was integrated through the official WhatsApp interface and a web service that receives and responds to messages. The results show 93% accuracy in message classification, a 96% success rate in functional tests (48 of 50 interactions processed correctly), and a usable text extraction rate of 85% in image analysis. In operational validation with real scam messages, a functional accuracy of 90% was obtained. Overall, the chatbot demonstrates technical viability and stable performance in detecting threats in text and images within WhatsApp, and it guides users with preventive recommendations, helping to reduce the risk of digital fraud in everyday messaging environments.
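
The text branch of the pipeline summarized in the abstract (normalization, TF-IDF representation, SVM classification) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the four toy messages, the token masks `urltoken` and `numtoken`, the smoothed IDF formula, the hand-written linear SVM trained by subgradient descent on the hinge loss, and its hyperparameters. A production system would more likely rely on an established machine learning library than on this training loop.

```python
import math
import re
import unicodedata
from collections import Counter


def normalize(text):
    """Lowercase, strip accents, mask URLs and numbers, and split into tokens."""
    text = unicodedata.normalize("NFKD", text.lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"https?://\S+|www\.\S+", " urltoken ", text)
    text = re.sub(r"\d+", " numtoken ", text)
    return re.findall(r"[a-z]+", text)


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in a tokenized corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    """L2-normalized TF-IDF vector stored as a sparse dict; unseen terms are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def decision(x, w, b):
    """Signed score of the linear model; positive means the scam class."""
    return sum(w.get(t, 0.0) * v for t, v in x.items()) + b


def train_linear_svm(vectors, labels, epochs=200, lam=0.01, lr=0.1):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1 or -1."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * decision(x, w, b)
            for t in w:                      # weight decay from the L2 penalty
                w[t] *= 1.0 - lr * lam
            if margin < 1.0:                 # hinge loss is active for this sample
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
                b += lr * y
    return w, b


# Hypothetical toy corpus; the article's dataset is not reproduced here.
messages = [
    "URGENT: your account is blocked, verify at http://fake-bank.example/login",
    "You won a prize! Claim now at www.prize.example and send 50 dollars",
    "Are we still meeting for lunch tomorrow?",
    "Mom says dinner is ready, come home",
]
labels = [1, 1, -1, -1]  # +1 = scam, -1 = legitimate

tokenized = [normalize(m) for m in messages]
idf = fit_idf(tokenized)
w, b = train_linear_svm([tfidf(doc, idf) for doc in tokenized], labels)


def classify(message):
    """Run one incoming message through the normalize -> TF-IDF -> SVM chain."""
    score = decision(tfidf(normalize(message), idf), w, b)
    return "scam" if score > 0 else "legitimate"


print(classify("Your account is blocked, verify now at http://x.example"))
print(classify("Lunch tomorrow with mom?"))
```

Because the image branch feeds OCR output into the same analysis pipeline, text extracted from a screenshot could be passed to `classify` unchanged. Masking URLs and numbers as single tokens is one possible design choice: it lets the classifier learn that the presence of a link or an amount matters, independently of its exact value.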

Published

2026-05-27

Section

Articles