
C3-Sex: a Chatbot to Chase Cyber perverts

Jossie Murcia Trivino, Sebastián Moreno Rodríguez, Daniel Díaz Lopez. Computer Science Faculty, Escuela Colombiana de Ingeniería Julio Garavito.

Félix Gómez Mármol, Faculty of Computer Science, University of Murcia.

1. Department of Information and Communication Engineering, University of Murcia, Spain

2. Faculty of Computer Science, Escuela Colombiana de Ingeniería Julio Garavito, Bogotá, Colombia

*The articles published in this section are academic publications whose property belongs to their authors and do not imply the cession of the author’s economic rights in favor of the CAP4CITY Project, its members or third parties.

Abstract: Artificial Intelligence (AI) has recently captured the attention of many researchers worldwide, encompassing areas such as Machine Learning (ML), Computer Vision, Knowledge-Based Systems, Planning, Robotics, and Natural Language Processing (NLP), among others. Specifically, NLP aims to perceive and understand human language and to replicate it with empathetic responses. Current NLP challenges include understanding complex structures of natural language, achieving extensibility through syntax adaptation, adapting responses to the course of the interaction, and extending the conversation scope to an open context.

In turn, NLP entails the development of Artificial Conversational Entities (ACE), i.e. chatbots, defined as autonomous components that interact dynamically with humans. A chatbot is generally built upon the following elements: an interaction channel (e-mail, instant messaging, web page, mobile app, etc.), a Natural Language Processor (NLP), a Natural Language Generator (NLG), a knowledge base, one or more machine learning models, and the business logic (see Figure 1).
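The elements listed above can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the architecture of C3-Sex: the class name `Chatbot`, its methods, the keyword lists (a stand-in for a trained machine learning model), and the knowledge base entries are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chatbot:
    # Knowledge base: maps an intent label to a stored fact (hypothetical content).
    knowledge_base: dict = field(default_factory=lambda: {
        "greeting": "Hello",
        "opening_hours": "We are open from 9:00 to 17:00",
    })
    # Keyword lists standing in for a trained intent model (assumption).
    intent_keywords: dict = field(default_factory=lambda: {
        "greeting": ("hello", "hi"),
        "opening_hours": ("open", "hours"),
    })

    def understand(self, message: str) -> str:
        """Natural language processing step: map raw text to an intent."""
        tokens = message.lower().replace("?", " ").replace("!", " ").split()
        for intent, keywords in self.intent_keywords.items():
            if any(token in keywords for token in tokens):
                return intent
        return "unknown"

    def decide(self, intent: str):
        """Business logic: look the intent up in the knowledge base."""
        return self.knowledge_base.get(intent)

    def generate(self, fact) -> str:
        """Natural language generation step: wrap the fact in a reply."""
        if fact is None:
            return "Sorry, I did not understand that."
        return f"{fact}."

    def reply(self, message: str) -> str:
        """Interaction channel entry point: text in, text out."""
        return self.generate(self.decide(self.understand(message)))


bot = Chatbot()
print(bot.reply("Hi there!"))
print(bot.reply("When are you open?"))
```

In a real system each method would be replaced by a dedicated component (for example, a statistical intent classifier in place of the keyword lookup), but the flow from channel to understanding, decision, and generation stays the same.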

Chatbots are used in a variety of fields for different purposes, such as i) support bots, designed to solve customer requests related to the delivery of a service or the use of a product, and ii) financial bots, aimed at resolving inquiries about financial services. Chatbots may be constrained in the requests they can respond to and the vocabulary they can employ, depending on the specific domain in which they operate. Furthermore, according to the Hype Cycle for emerging technologies by Gartner [2], conversational AI platforms remain in the phases of “innovation trigger” and “peak of inflated expectations”, meaning that they are getting substantial attention from the industry.

Besides the aforementioned use cases, cybersecurity is one of the newest fields in which chatbots are applied. There exist chatbots focused on training end-users [3] or cyber analysts [4] in security awareness and incident response. There are also malicious chatbots devoted to malware distribution through a human-machine conversation [5]. In addition, there is software designed to guide the user in terms of security and privacy, such as Artemis [6], a conversational interface to perform precision-guided analytics on endpoint data. Most of these security chatbots are implemented in a question-answering context [7] using a post-reply technique. As far as we know, the use of chatbots to actively profile suspects of child pornography has been little investigated, with only a few approaches [8, 9] employing a chatbot to emulate a victim such as a child or a teenager. Likewise, our investigation aims to emulate a vulnerable person while the suspect offers him/her illegal content.

The paper at hand proposes C3-Sex, a chatbot based on the application of Machine Learning and Knowledge-Based Systems, able to interact with suspects around topics related to child pornography. Once the conversation has finished, additional machine learning algorithms are employed to analyze the chat logs and classify the suspect into one of three profiles (indifferent, interested, and pervert).

The collected chats, together with the values for each of the defined metrics, could be used by a Law Enforcement Agency (LEA) to identify and prosecute a suspect of child pornography. The remainder of the paper is structured as follows. Section II describes some remarkable related works found in the literature. In Section III, the key goals and components of C3-Sex are introduced, while the main aspects of the data science lifecycle and the achieved proposal are presented in Section IV. Section V discusses the different user profiles that can be deduced from the interaction between the suspect and C3-Sex. Then, in Section VI we perform an exhaustive evaluation of the proposal and analyze the obtained results. Finally, Section VII contains some highlights derived from the work done and mentions some future research directions.

[1] Al Rahman, Abdullah Al Mamun, and Alma Islam. “Programming challenges of chatbot: Current and future prospective”. In: Dec. 2017, pp. 75–78. DOI: 10.1109/R10-HTC.2017.8288910.

[2] Mike Walker. “Hype Cycle for Emerging Technologies, 2018”. In: 2018 Hype Cycles: Riding the Innovation Wave (2018).

[3] Stewart Kowalski, Katarina Pavlovska, and Mikael Goldstein. “Two Case Studies in Using Chatbots for Security Training”. In: Information Assurance and Security Education and Training. Ed. by Ronald C. Dodge and Lynn Futcher. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 265–272. ISBN: 978-3-642-39377-8.

[4] B. A. Sabbagh et al. “A prototype For HI2Ping information security culture and awareness training”. In: 2012 International Conference on E-Learning and E-Technologies in Education (ICEEE). 2012, pp. 32–36. DOI: 10.1109/ICeLeTE.2012.6333397.

[5] Pan Juin Yang Jonathan, Chun Che Fung, and Kok Wai Wong. “Devious Chatbots – Interactive Malware with a Plot”. In: Progress in Robotics. Ed. by Jong-Hwan Kim et al. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 110–118. ISBN: 978-3-642-03986-7.

[6] Bobby Filar, Richard Seymour, and Matthew Park. “Ask Me Anything: A Conversational Interface to Augment Information Security Workers.” In: SOUPS. 2017.

[7] Simon Keizer and Harry Bunt. “Multidimensional Dialogue Management”. In: Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue. Sydney, Australia: Association for Computational Linguistics, 2006, pp. 37–45. URL: http://aclweb.org/anthology/W06-1306.
