DEV Community

Cover image for ChatGPT Systems: Prompt Injection and How to avoid ?
Shahwar Alam Naqvi
Shahwar Alam Naqvi

Posted on

2

ChatGPT Systems: Prompt Injection and How to avoid ?

Prompt Injection (definition)

  • Prompt injection refers to a technique used in natural language processing (NLP) models, where an attacker manipulates the input prompt to trick the model into generating unintended or biased outputs.

Prompt Injection (example)

  • A simple example is in the image bit below, User asks to forget the original instructions and tries to allot it a task of his own will.

A simple example of prompt injection

Prompt Injection (impact)

  • Prompt injection can have serious consequences, such as spreading misinformation, promoting biased views, or manipulating the model to generate outputs that may be harmful or unethical.

Prompt Injection (Code Implementation)

  • Delimiter: We will use delimiter, inorder to put the user message in a specific area always. And it should never become part of the original system message we have for the over all system.

For example-

An example delimiter

  • System message: There will be a system message, which is the main prompt for the overall system or let's say the application I have.

For example :

An example system message

  • Here the prompt says that , no matter what the response we get should always be in Arabic. And it also specifies the use of delimiter to wrap the user message.

  • User message:
    After the system message, follows the user message, this is where we will have a prompt which qualifies as a prompt injection.

For example:

User message : prompt injection

  • Here the user is instructing to ignore the original prompt and asks to respond in English. If you re-call the original prompt, it asks to respond in Arabic always.

  • Final user message:
    This is a precautionary step. One of the ways we tackle is this in the below image bit.

Final user message

Let's re-call prompts:

Recalling Prompts

  • Helper Function: Inorder to call the completion API and eventually get a response.

Completion

The Response:

The response suggests , we successfully evaded the english response and got it in Arabic.
Arabic Response

Follow me : https://www.linkedin.com/in/shahwaralamnaqvi?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=ios_app

Top comments (2)

Collapse
 
mcharytoniuk profile image
Mateusz Charytoniuk

Thank you! This is useful

Collapse
 
shahwar_ai profile image
Shahwar Alam Naqvi

You’re Welcome, Mateusz😃

AWS GenAI LIVE!

GenAI LIVE! is a dynamic live-streamed show exploring how AWS and our partners are helping organizations unlock real value with generative AI.

Tune in to the full event

DEV is partnering to bring live events to the community. Join us or dismiss this billboard if you're not interested. ❤️