Apoorv Raj Saxena

Posted on Jul 15, 2019 • Edited on Jul 31, 2019 • Originally published at blog.secxena.com

Chatbot Security Framework: Everything you need to know about Chatbot security

#machinelearning #security #architecture

I was hacking my chat-bot on a Sunday night when it struck me how unaware a development team can be of security while it is building features. I was intrigued by the idea of comparing my chatbot's security to that of popular chat-bots, so I decided to explore the security of two India-based chat-bot products, one in the healthcare domain and one in finance.

I started by fuzzing their chat-bots, but both used conversational interfaces such as buttons and custom UIs instead of an open keyboard. Getting malicious input in through the UI was almost impossible, so I reverse-engineered the app and extracted the API server details.

From there, things escalated rapidly to an insecure server: I got access to a publicly accessible .git/ repository on the production server.

It took almost an hour to reach the healthcare data of more than 100 thousand users. I reported it to the organization at 11 PM. They replaced all the credentials before 6 AM and pinged me to verify. That was the fastest security fix I have seen at odd hours.

I proxied the second chatbot through Burp Suite and found that it was using unsigned S3 bucket URIs. To my surprise, the bucket had public read/write permission. I browsed through it until I found a folder named 'portfolio-statements'. It was a dump of more than 10 thousand (11091 to be exact) portfolios containing all investment details. I sent an email reporting the problem and never got a reply.

After these incidents, I started thinking about formalizing a security structure for the chat-bot fraternity. I understand the pain of developing something on a strict timeline and, with pen-testing being a hobby, I also know the frustration of seeing trivial bugs. I've asked myself this question multiple times in the past: how can we achieve both product quality, in terms of covering security bugs, and speed of development, in terms of new features?

The answer is simple - we can do that by starting good security practices at an early stage of development. The problem is that when you start a development project, a common debate springs up between spending time on improving the quality of the software and concentrating on developing more valuable features.

Usually, the pressure to deliver functionality dominates the discussion, leading many developers to complain that they don't have time to work on architecture and code quality.

The assumption here is that increasing quality also increases cost, but the counter-intuitive reality is different. Good software quality removes the cruft that slows down the development of new features, hence decreasing the cost in the long run.

The solution is time segmentation. Give the developer enough time to design the solution before development and to test security scenarios after development.

So if I have convinced you that security matters a lot, here is a solid framework to cover chatbot security.

We should go back to the inception of chat-bot technology and ask why we are using it instead of mainstream form-driven apps. There are two differences when you compare a chatbot to a mainstream app. Firstly, it mimics human conversational behavior, which provides a sense of privacy. A user won't be alarmed when the bot asks for salary details if a back-and-forth conversation has already set the context and the reasoning behind the question. Secondly, it provides a sense of security because the user can ask questions whenever they doubt the bot's answers. This to-and-fro communication is missing in mainstream apps.

Two-way communication helps chatbots break the barrier of distrust common among users. When private questions are asked in a conversational setting with proper reasoning, the user finds it easier to trust the asker. As a result, chatbots are primed to ask questions like 'Do you have any critical disease?', 'How big is your emergency fund?', 'What's your salary package?' etc.

With this power, our prime responsibility is to secure the information our users have entrusted us with. Furthermore, we should comply with the regulations of our industry and implement them for the benefit of the user. Lastly, we have to secure our infrastructure.

I urge you to think about authentication and permissions for all users - non-customers, customers, and internal users - and about whether you are logging all system activity somewhere.

Below is a question-based structure for deciding whether your chat-bot is secure.

Domain-specific Security

Let's look at 2 separate use cases:

First, the wealth management use case - If users want to manage their wealth, they need to disclose their life savings for the sake of risk profiling. This information is important to your user. For example, a user enters salary details on the bot and sets a goal of saving 15% of that salary to buy a car in 2 years.

Second, the healthcare use case - If users want to track their health requirements, they need to disclose their prevailing conditions, ongoing symptoms, and medication. This is important private information about your user. For example, a user says that they are diabetic and looks up the best means to tackle the issue.

Ask these questions first.

Do you know which information given by your customer comes under Personally Identifiable Information (PII) or Protected Health Information (PHI)?
Do you have clear segregation of PII/PHI and non-PII/PHI in your business logic and data storage?
Is your customer's information accessible only to that customer? Can any internal team access it?
Is there any way that someone could access a conversation which is not meant for them?
Can you tell whether it is your customer using the chat-bot, and not someone else using their account?
Does an admin have additional privileges on the same bot? If so, why?

These questions are domain agnostic. You need to have an answer to each of them in order to design a secure chat-bot solution.

There are enough tools available to solve these problems once you have asked these questions. I have also designed a Chat-bot Security Checklist which will help you solve them.
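One way to act on the segregation and access questions above is to redact PII before a conversation turn is ever written to a log, and to key the log by a pseudonym instead of the real user ID. The following is a minimal sketch under stated assumptions: the two regular expressions, the names `redact`, `pseudonym`, and `log_turn`, and the in-memory list used as a log store are all hypothetical, and real PII/PHI detection needs a vetted tool rather than two patterns.

```python
import hashlib
import hmac
import re

# Assumed patterns for illustration only; they catch simple emails and
# 10-digit phone numbers, nothing more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{10}\b"),
}


def redact(text: str) -> str:
    """Replace every match of a known PII pattern with a placeholder tag."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def pseudonym(user_id: str, key: bytes) -> str:
    """Keyed hash, so a log row cannot be linked to a user without the key."""
    return hmac.new(key, user_id.encode(), hashlib.sha256).hexdigest()[:16]


def log_turn(store: list, user_id: str, utterance: str, key: bytes) -> None:
    """Append a redacted turn under a pseudonymous user reference."""
    store.append({"user": pseudonym(user_id, key), "text": redact(utterance)})
```

With this shape, an internal team member reading the log sees placeholders and an opaque reference, while the key that links a reference back to a customer can be held by a much smaller group.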

"Without requirements or design, programming is the art of adding bugs to an empty text file." -Louis Srygley

Technology Stack

There are three basic components in the context of a chatbot: client platforms, NLU tech, and backend infrastructure.

Each of these is equally important from a security perspective.

Ask these questions and you will find out whether your implementation is secure.

Since the client application runs on the client's machine, can someone reverse-engineer it? If they do, will it disclose any vulnerability, such as a hardcoded password or access token, an insecure API, or server access?
Can the PII your app keeps in local storage be accessed by other apps, or read without decryption?
A user can manually do at most 15 turns (query-response) per minute on your chat-bot, but do you allow unlimited queries per second? If yes, then why?
Does your intent classifier have different intents for different classes of users, say A - normal user and B - premium user? Can a class A user send an utterance that triggers a class B intent?
If a user can use your bot without logging in, is there any intent that withholds its answer and asks them to log in first?
Do you store conversation logs? Have you redacted or encrypted the PII in the logs, or dissociated the logs from their owners, so that no one on the internal team can access it?
Are you sure that best security practices are followed in your backend development and infrastructure deployment, in congruence with the chatbot security specification? Your system is only as secure as its weakest link.
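The question about 15 turns per minute can be turned into a per-user limit on the backend. Below is a minimal sketch of a sliding-window limiter. The class name `TurnLimiter`, its parameters, and the in-memory storage are assumptions made for illustration; a production setup would more likely keep the counters in a shared store or enforce the limit at the API gateway.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class TurnLimiter:
    """Allow at most `max_turns` turns per user in any `window_s` seconds."""

    def __init__(self, max_turns: int = 15, window_s: float = 60.0):
        self.max_turns = max_turns
        self.window_s = window_s
        self._turns = defaultdict(deque)  # user_id -> timestamps of recent turns

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        turns = self._turns[user_id]
        # Drop timestamps that have fallen out of the window.
        while turns and now - turns[0] >= self.window_s:
            turns.popleft()
        if len(turns) >= self.max_turns:
            return False  # over the limit: reject or delay this turn
        turns.append(now)
        return True
```

The `now` argument exists only so the behaviour can be checked with fixed timestamps; normal callers would use `limiter.allow(user_id)` and let the monotonic clock decide.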

These questions should be answered before the beta release.
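The two questions about user classes and login can be enforced with a small authorization step between the intent classifier and the dialog handler. The sketch below assumes hypothetical intent names, three hypothetical tiers, and a hypothetical `route` function; it is one possible shape for the check, not a description of any particular NLU product.

```python
# Hypothetical mapping from intent name to the lowest tier allowed to use it.
INTENT_MIN_TIER = {
    "greet": "anonymous",
    "check_balance": "normal",
    "portfolio_advice": "premium",
}
TIER_RANK = {"anonymous": 0, "normal": 1, "premium": 2}


def route(intent: str, user_tier: str) -> str:
    """Return the intent to execute, or a safe substitute if access is denied."""
    required = INTENT_MIN_TIER.get(intent)
    if required is None:
        return "fallback"  # unknown intents are denied by default
    if TIER_RANK[user_tier] >= TIER_RANK[required]:
        return intent
    # Logged-out users are asked to log in; logged-in users are refused.
    return "ask_login" if user_tier == "anonymous" else "deny"
```

For example, `route("portfolio_advice", "normal")` yields `"deny"`, so an utterance from a class A user cannot reach a class B handler even if the classifier predicts that intent.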

secxena / Chat-Bot-Security-Checklist

Chat-bot Security Checklist

The Chat-Bot Security Checklist is an exhaustive list of all elements you need to have before launching your chat-bot to production.

It is based on Chat-bot developers' years of experience, with the additions coming from some other open-source checklists.

How to use?

All items in the Chat-bot Security Checklist are a must for the majority of projects. Exceptions should only be made for regulatory reasons. You can use this checklist to implement a three-layered defense for your chat-bot product. Each point in the list is marked as low, medium, or high.

Low means that the item is recommended but can be omitted in some particular situations.
Medium means that the item is highly recommended and can eventually…

After the beta release, run a VAPT (vulnerability assessment and penetration testing) to catch the vulnerabilities that were left out. Then, in the long run, publish a disclosure guideline and start a bug bounty.

"Most people are starting to realize that there are only two different types of companies in the world: those that have been breached and know it and those that have been breached and don't know it."

Regulatory Concerns

When you work in a privacy-sensitive area, which is very common for industrial chat-bots, you need certain domain-related accreditations in order to operate in that industry, on top of the common government-imposed regulations.

Accreditation helps you achieve the domain-specific security required by the governing body and is a common industry practice, but compliance does not make you a perfectly secure organization.

There are two kinds of compliance:

Common:

Industry-agnostic compliances apply on the basis of demographic factors.

ISO/IEC 27002 - A code of practice for information security controls, covering company policies, operations, physical security, and incident management.
NIST - Categorisation, implementation, monitoring, and assessment of security controls.
GDPR - A customer's personal data can be stored only for as long as your business process needs it.

Industry-specific:

Industry-specific compliances apply on the basis of industry and demography.

HIPAA - It applies to health-tech and mainly focuses on the security of ePHI (electronic protected health information).
PCI-DSS - It applies to payments and mainly focuses on the secure storage, transmission, and processing of cardholder data.
FERPA - It applies to the ed-tech industry and mainly focuses on the privacy of students' educational records.

There are other non-tech accreditations for various industries, and depending on your demographic, you need to comply with them too. A good practice is to think of these three points when developing any solution.

You have an authentication mechanism in place for all users. You must know who is inside your system.
You have an authorization system in place - give each role the least permission it requires.
You have a monitoring system in place that logs every activity in your system, who is doing it, and whether they have permission to do it.

These three points will help you achieve a robust security system, and you won't need a lot of extra work to meet the compliances above.
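The three points can be combined in a single check that every request passes through. The sketch below is illustrative only: the roles, the permission names, the `authorize` function, and the in-memory `AUDIT_LOG` list are hypothetical, and a real system would write audit events to append-only storage.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical roles, each with the least set of permissions it needs.
ROLE_PERMISSIONS = {
    "customer": {"read_own_chat"},
    "support": {"read_own_chat", "read_redacted_logs"},
    "admin": {"read_own_chat", "read_redacted_logs", "manage_users"},
}

AUDIT_LOG: list = []  # stand-in for an append-only audit store


def authorize(user: Optional[dict], action: str) -> bool:
    """Decide whether `user` may perform `action`, and record the decision."""
    role = user["role"] if user else None  # no user means not authenticated
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    AUDIT_LOG.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "who": user["id"] if user else "unauthenticated",
        "action": action,
        "allowed": allowed,
    })
    return allowed
```

Because denied attempts are logged as well as allowed ones, the audit trail answers all three questions at once: who acted, what they tried to do, and whether they had the permission to do it.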

"Regulations are all very well for drill, but in the hour of danger, they are no more use. You have to learn to think." -Ferdinand Foch

Conclusion

A chat-bot system has all the common tech components found in any other system, so you can't treat it as anything less. You need to put all the necessary security measures in place. The chat-bot-specific tech is secure when the conversational UI is controlled by a Dialog Management System, the intent classifier has authorization in place, and log management is secured.
