

LLM App Security: Risks & Prevention for GenAI Development


Official Podcast

This blog is also officially distributed as a podcast!


Introduction

Hello, I am Sato (@Nick_nick310), a security engineer at GMO Flatt Security Inc.

In recent years, the evolution and widespread adoption of Large Language Models (LLMs) have been remarkable, and they are being used as generative AI in many services and business processes. While LLMs bring significant benefits, new security risks stemming from their characteristics have also been pointed out, so a sufficient understanding of these risks and appropriate countermeasures are essential for safe use. What kind of security challenges might arise when integrating LLMs into your company's services and operations?

In this article, we explain the major security risks to consider when developing and operating applications that use LLMs, following the internationally recognized "OWASP Top 10 for LLM Applications." We also briefly introduce GMO Flatt Security's own diagnostic approach to these risks, which evaluates concrete implementation problems through detailed inspection at the source code level.

OWASP Top 10 for LLM Applications 2025

"OWASP (The Open Web Application Security Project)," an international non-profit organization aimed at improving web application security, publishes "OWASP Top 10 for LLM Applications," a ranked list summarizing security risks specific to LLM applications.

Our LLM application assessment references the "OWASP Top 10 for LLM Applications" and adds its own unique perspectives to the diagnostic items.

The latest version as of April 2025 is the "OWASP Top 10 for LLM Applications 2025," listed below:

  1. Prompt Injection
  2. Sensitive Information Disclosure
  3. Supply Chain Risks
  4. Data and Model Poisoning
  5. Improper Output Handling
  6. Excessive Agency
  7. System Prompt Leakage
  8. Vector and Embedding Weaknesses
  9. Misinformation
  10. Unbounded Consumption

For details on "OWASP Top 10 for LLM Applications," please see the following:

Security Perspectives in General LLM Applications

Here, we delve deeper into the security perspectives for general LLM applications based on the "OWASP Top 10 for LLM Applications 2025".

Prompt Injection (LLM01)

"Prompt Injection" refers to issues where user instructions cause the LLM to perform actions not originally intended.

For example, with a chatbot that has been given a certain role via an LLM, a malicious user could input an adversarial prompt, potentially causing the LLM to perform actions that violate the terms of service or to output confidential information.

There are largely two types of Prompt Injection methods:

  • Direct Prompt Injection: The user inputs a prompt containing direct malicious commands to cause the LLM to perform unintended actions.
  • Indirect Prompt Injection: When the LLM loads external resources like Google search results, websites, or files, a user can prepare a malicious resource and indirectly cause the LLM to load it, thereby causing the LLM to perform unintended actions.

Furthermore, as more detailed techniques, countermeasures can sometimes be bypassed by writing instructions in another language, or by using special Unicode characters or emojis.

Examples of Prompt Injection incidents include the following:

Assessment from the "Prompt Injection" perspective evaluates what kind of negative impacts could occur due to malicious prompts. Examples of negative impacts include:

  • Being made to perform unintended operations using permissions granted to the LLM.
  • Being made to output confidential information held by the LLM or internal information accessible to it.
  • Overloading the service by making the LLM perform excessively heavy processing or fall into infinite loops (DoS).

Countermeasures and mitigation for "Prompt Injection" include the following:

  • Clarify and limit the LLM's actions and role to allow only restricted operations.
  • Validate input content as much as possible and reject adversarial prompts.
  • Establish and validate output formats as much as possible, or use structured output mechanisms to strictly limit output formats.
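
As a minimal sketch of the last point, a structured output mechanism can constrain the model's reply to a fixed schema. The example below assumes the langchain-openai and pydantic packages; the SupportAnswer schema and the sample question are illustrative and not part of the original sample app.

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

# Illustrative response schema: the model's reply is parsed into these fields only
class SupportAnswer(BaseModel):
  answer: str = Field(description="Answer to the user's question")
  needs_human: bool = Field(description="Whether a human operator should take over")

llm = ChatOpenAI(temperature=0)  # reads the OPENAI_API_KEY environment variable
structured_llm = llm.with_structured_output(SupportAnswer)

# Even if the input contains injected instructions, the reply is returned as a
# SupportAnswer object rather than free-form text
reply = structured_llm.invoke("How do I reset my password?")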

Given that LLM prompts use natural language, completely preventing Prompt Injection is currently difficult.

Therefore, it is important to implement measures to minimize the risk even if Prompt Injection occurs. Regarding such measures, our "LLM Application Assessment" can propose countermeasures based on the source code.

System Prompt Leakage (LLM07)

"System Prompt Leakage" refers to issues where unintended confidential information is included in the system prompt used to control the LLM's behavior. (A system prompt is important configuration content that instructs the LLM on its role, behavior, constraints, output format, etc..)

For example, if the system prompt includes credentials like API keys or access tokens, it can lead to the leakage of this information.

Even if not credentials, if information like the following is included in the system prompt, its leakage can lead to bypassing restrictions:

  • Internal rules and security control settings
  • Internal company information or logic information

We have prepared a sample app for demonstration. The following code is an example where authentication information is included in the system prompt:

from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

# Using ChatPromptTemplate to construct messages
chat_prompt = ChatPromptTemplate.from_messages([
  SystemMessagePromptTemplate.from_template(
    "id:secret@localhost password:secret123! You are a helpful assistant. Here is the relevant user information: \n {user_context}"
  ),
  HumanMessagePromptTemplate.from_template(
    "{message}"
  )
])

A crafted prompt can make the LLM output its system prompt. In this example, the authentication information contained in the system prompt can be extracted this way.


As with "Prompt Injection," complete prevention of System Prompt Leakage is difficult. Therefore, it is important not to treat the system prompt itself as confidential information in the first place. Countermeasures for "System Prompt Leakage" include the following:

  • Do not include confidential information in the system prompt.
  • Do not perform application security control using the system prompt; perform security control in a place unrelated to the LLM.
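
As a minimal sketch of the first point, the earlier sample can be rewritten so that credentials never appear in the prompt. The DB_PASSWORD environment variable below is illustrative: it is read and used only by the application layer and is never passed to the LLM.

import os
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

# Credentials stay in the application layer (environment variables, a secrets
# manager, etc.) and are never placed in the prompt
DB_PASSWORD = os.environ["DB_PASSWORD"]  # illustrative secret, used outside the LLM

chat_prompt = ChatPromptTemplate.from_messages([
  SystemMessagePromptTemplate.from_template(
    "You are a helpful assistant. Here is the relevant user information: \n {user_context}"
  ),
  HumanMessagePromptTemplate.from_template(
    "{message}"
  )
])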

Unbounded Consumption (LLM10)

"Unbounded Consumption" refers to issues where a user sends input that causes the LLM to excessively waste resources by increasing its output tokens.

For example, a feature that summarizes or composes text from user input could be abused by a malicious user who submits extremely long input or instructions to repeat output, intentionally wasting resources.

Let's check the behavior with a sample app. By commanding the LLM to "display the string repeatedly," it is easy to cause it to generate up to the output limit.


In this example, the countermeasure is to limit the output tokens.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
  temperature=0,
  openai_api_key=OPENAI_API_KEY,
  max_tokens=100 # Limit output tokens
)


Furthermore, even when the system prompt or the application side attempts to limit the LLM's output tokens, there might be cases where the limit can be bypassed due to logic flaws.

The security risks of "Unbounded Consumption" include the following:

  • Service overload (DoS attack)
  • Economic loss from the LLM's pay-per-use plan (EDoS attack)

Countermeasures for "Unbounded Consumption" include the following:

  • Set limits or rate limits on LLM input and output tokens.
  • Implement size limits or input validation for comments or files.
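
As a minimal sketch of the second point, oversized input can be rejected before it ever reaches the LLM. The endpoint, the ChatParams model, and the MAX_INPUT_CHARS value below are illustrative and assume a FastAPI handler similar in shape to the sample app's.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()
MAX_INPUT_CHARS = 2000  # illustrative limit

class ChatParams(BaseModel):
  message: str

@app.post("/chat")
def chat(params: ChatParams):
  # Reject oversized input before it is passed to the LLM
  if len(params.message) > MAX_INPUT_CHARS:
    raise HTTPException(status_code=400, detail="Input is too long")
  # ... invoke the LLM here with max_tokens set, as in the example above ...
  return {"reply": "..."}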

Security Perspectives for LLM Applications Performing RAG or Tool Linking

LLM applications are not limited to services released externally; they also include internal (company-only) applications, such as internal helpdesk agents and document creation agents.

These applications, being for internal use, tend to have more features and higher privilege levels compared to external LLM services.

Here, we introduce important security perspectives for such LLM applications with high privilege levels.

Data and Model Poisoning (LLM04)

"Data and Model Poisoning" refers to issues where training data or embedding data is poisoned, causing changes in behavior or output.

In LLM applications, training a proprietary model is rare, so here we focus mainly on data poisoning. Internal LLM applications often streamline work or generate documents based on data accumulated within the company. If the referenced data contains malicious content, it can be reflected in the LLM's behavior and output.

For example, if an internal technical knowledge base contains malicious content, that content could be reflected in documents generated by referencing the knowledge base.

Countermeasures for "Data and Model Poisoning" include the following:

  • Use only trusted data sources. Example: Define trusted data creators and verify whether the data is by such creators.
  • Use only trusted content. Example: Attach metadata indicating trustworthiness to data under certain conditions (manual review or mechanical validation, etc.) and use only data with this metadata attached.
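
As a minimal sketch of the second point, documents can be filtered on a trust flag before they are indexed for RAG. The "reviewed" metadata key and the sample documents below are illustrative.

from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

raw_docs = [
  Document(page_content="How to request a VPN account", metadata={"reviewed": True}),
  Document(page_content="(unreviewed wiki edit)", metadata={"reviewed": False}),
]

# Index only data that passed a manual review or mechanical validation
trusted_docs = [doc for doc in raw_docs if doc.metadata.get("reviewed") is True]

vectorstore = InMemoryVectorStore.from_documents(
  documents=trusted_docs,
  embedding=OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
)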

Excessive Agency (LLM06)

"Excessive Agency" refers to issues where unintended updates or deletions occur due to overly strong permissions given to the LLM.

Internal LLM applications often grant permissions to external services for their intended use. If the permissions granted are too broad, the LLM may perform unintended operations.

For example, consider the case of an LLM app that reads data from Google Drive. If, when granting Google Drive access permission, write permission is granted in addition to read permission, the LLM can write to Google Drive.

Since the LLM's behavior is not absolute, unintended writes to Google Drive could occur via prompt injection.

Countermeasures for "Excessive Agency" include the following:

  • Grant the LLM application only the minimum necessary permissions to fulfill the application's specifications. Example: If implementing a feature to reference cloud storage data in an LLM application, grant only read permission for the cloud storage.
  • If strong permissions are necessary for the LLM application, consider the following mitigation measures:
    • Obtain logs of operations performed by the LLM application. If possible, monitor and issue alerts if unexpected operations occur (e.g., destructive changes to a large number of existing documents).
    • Obtain the change history from the LLM application's operations and make it possible to roll back in case of unexpected operations.
    • If the LLM application needs to perform operations requiring strong permissions, have the content of the operation reviewed and approved by a human before it is applied.
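
As a minimal sketch of the last mitigation, a destructive operation can be gated behind human approval before it is applied. The delete_document tool and its approval flow below are hypothetical.

from langchain_core.tools import tool

@tool
def delete_document(doc_id: str) -> str:
  """Delete an internal document. Requires human approval before execution."""
  approved = input(f"The LLM requested deletion of document {doc_id}. Approve? [y/N] ")
  if approved.strip().lower() != "y":
    return "The deletion request was rejected by a human reviewer."
  # ... perform the actual deletion here ...
  return f"Document {doc_id} was deleted."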

Vector and Embedding Weaknesses (LLM08)

"Vector and Embedding Weaknesses" refers to issues where unintended data is included in the context when using Retrieval Augmented Generation (RAG). (Retrieval Augmented Generation (RAG) is a method where the LLM searches for information from external data and uses it as a basis for answering, summarizing, or generating text.)

Information included in the context can potentially leak unintentionally through prompt injection, so it is necessary to control the information included. If access control is insufficient, other users' information or internal information may be included in the context, potentially leading to information leakage as a result.

Let's look at the behavior with a sample app. The following code vectorizes user data into embeddings and performs a similarity search based on the input prompt:

# (Excerpt from a request handler: db, params, and OPENAI_API_KEY are defined elsewhere in the app)
from langchain_core.documents import Document
from langchain_core.prompts import (
  ChatPromptTemplate,
  HumanMessagePromptTemplate,
  SystemMessagePromptTemplate,
)
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Get user information and convert to documents
users = db.query(User).all()
docs = [
  Document(
    page_content=f"User: {user.username}, Email: {user.email}",
    metadata={"user_id": user.id}
  )
  for user in users
]

# Create embedding
embedding = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

# Create vector store
vectorstore = InMemoryVectorStore.from_documents(
  documents=docs, embedding=embedding
)

# Search for user information related to the message
relevant_docs = vectorstore.similarity_search(query=params.message, k=3)

# Build context from search results
user_context = "\n".join([doc.page_content for doc in relevant_docs])

# Use ChatPromptTemplate to construct messages
chat_prompt = ChatPromptTemplate.from_messages([
  SystemMessagePromptTemplate.from_template(
    """
    You are a helpful assistant.
    Below is relevant user information.
    {user_context}
    """
  ),
  HumanMessagePromptTemplate.from_template(
    "{message}"
  )
])

# Generate prompt
messages = chat_prompt.format_messages(
  user_context=user_context,
  message=params.message
)

llm = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)
reply = llm.invoke(messages)

return {"reply": reply}

In this example, since access control for data is not implemented, it is possible to enumerate the email addresses of users in the system with a simple prompt.


The countermeasure for this example is to restrict the retrieved user information to that of the logged-in user by specifying a filter when executing similarity_search.

# Define a filtering function
def filter_by_user_id(doc):
  return doc.metadata.get("user_id") == current_user['user_id']

# Search for user information related to the message
relevant_docs = vectorstore.similarity_search(
  query=params.message,
  k=3,
  filter=filter_by_user_id # Return only documents belonging to the logged-in user
)


Countermeasures for "Vector and Embedding Weaknesses" include the following:

  • Limit the data included in the context to the scope that the user has permission to view.

Misinformation (LLM09)

"Misinformation" refers to issues where the reliability of the LLM's output results is low.

Due to its training data and model characteristics, an LLM may generate non-factual information (hallucinations), and its responses may not align with reality. Using responses that do not align with reality in important situations like decision-making can lead to significant losses.

Countermeasures for "Misinformation" include the following:

  • Increase the likelihood that output is grounded in internal data by using RAG or similar methods.
  • When displaying the LLM's output, also show a message encouraging users to double-check it, such as "Please use this only as a reference".

Security Perspectives for LLM Frameworks

When creating an LLM application, if you want to include many features, you need to implement functions that LLMs cannot perform, such as RAG management or file operations.

There are frameworks to streamline this implementation. Even when using a framework, it is necessary to consider security perspectives specific to LLM apps, and furthermore, perspectives specific to the framework must also be considered.

Here, taking the particularly famous "LangChain" as an example, we introduce important security perspectives in frameworks.

Excessive Agency (LLM06)

This perspective was also covered for internal LLM applications, but it is just as important when using frameworks. Frameworks provide functions for linking with various external systems; DB linkage is one example. When linking with a DB, authentication information for accessing the DB must be passed to the framework. If that authentication information carries strong privileges, the LLM can access the DB with those privileges.

Therefore, operations that are not originally permitted to the user may be executed via the LLM.

The countermeasures are the same as for internal LLM applications.
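
For example, when linking a DB through LangChain, the framework can be handed a dedicated read-only role instead of the application's full credentials. The connection URI and role name below are illustrative.

from langchain_community.utilities import SQLDatabase

# llm_readonly is a role granted only SELECT on the tables this feature needs,
# so even a successful prompt injection cannot modify data through the LLM
db = SQLDatabase.from_uri(
  "postgresql://llm_readonly:example-password@localhost:5432/appdb"
)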

Template Injection

"Template Injection" refers to issues where arbitrary code execution is possible by exploiting the template engine used in prompt templates.

Frameworks often provide a prompt template function to streamline prompt construction. LangChain can use "jinja2" in addition to formats like "f-string" and "mustache".

The template syntax of jinja2 allows executing Python code, so using untrusted sources to build templates can lead to Template Injection.

LangChain's official documentation also recommends using f-string instead of jinja2.

Let's check the behavior with a sample app. The following code is an example using jinja2 for LangChain's template_format and embedding user input with an f-string in the prompt.

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
  messages=[
    ("system", "You are a helpful assistant."),
    # User input is embedded directly with a Python f-string, so it becomes part
    # of the jinja2 template itself and is interpreted as template syntax
    ("human", f"{params.message}"),
  ],
  template_format="jinja2",
)
prompt_value = prompt.invoke({"params": params})

print(prompt_value)
reply = llm.invoke(prompt_value)

return {"reply": reply}

When a string that jinja2 interprets as template syntax is sent, the LLM's generated output may not make this obvious, but checking the logs confirms that the template was evaluated.


The countermeasure for this example is to specify template_format="f-string" and to pass user input through a template variable instead of embedding it directly with a Python f-string.

prompt = ChatPromptTemplate.from_messages(
  messages=[
    ("system", "You are a helpful assistant."),
    ("human", "{params}"), # Embed using the framework's mechanism, not direct f-string
  ],
  template_format="f-string",
)
prompt_value = prompt.invoke({"params": params})

Security risks include the following:

  • (If executed in a sandbox) DoS or restriction bypass
  • (If executed outside a sandbox) Arbitrary code execution on the server

Countermeasures include the following:

  • When using a template engine that can lead to arbitrary code execution, sanitize user input.
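
As a minimal sketch of such sanitization, input containing jinja2 template syntax can be rejected before any template is built. The marker pattern below is illustrative and is not a complete defense on its own.

import re

# Reject input containing jinja2 template markers before it reaches a
# jinja2-formatted prompt template
JINJA2_MARKERS = re.compile(r"{{|}}|{%|%}")

def reject_template_syntax(user_input: str) -> str:
  if JINJA2_MARKERS.search(user_input):
    raise ValueError("Input containing template syntax was rejected")
  return user_input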

Use of Functions with Insecure Configuration

"Insecure Function Usage with Insecure Configuration" is a perspective on cases where functions provided by a framework are used with insecure configurations.

When actually pointing this out, the issue will be specific to the function used. Frameworks often allow configurations or options that are weaker in terms of security to support various use cases.

Taking LangChain as an example, LLMSymbolicMathChain provides the allow_dangerous_requests option; enabling it can lead to arbitrary code execution.

In File System tools, if the root_dir parameter is not set or an excessively broad range is specified, it can lead to information leakage or unauthorized file writes.

When using functions provided by a framework, it is important to verify that the configuration is secure.

Let's check the behavior with a sample app. The following code is an example where the root_dir parameter is not specified for File System tools.

from langchain_community.agent_toolkits import FileManagementToolkit

# root_dir is not specified, so file operations are not restricted to a specific directory
toolkit = FileManagementToolkit(
  selected_tools=["read_file"]
)

For demonstration, a file named "flatt" has been created in the "/etc/" directory.


By using a crafted prompt, it is possible to view files in the "/etc" directory.


The countermeasure for this example is to set an appropriate directory for the root_dir parameter.

toolkit = FileManagementToolkit(
  root_dir="/app/img/", # Limit the directory allowed for operations
  selected_tools=["read_file"]
)


Security risks in the above examples include the following:

  • Arbitrary code execution outside the sandbox
  • Unintended data reading, writing, or deletion

Countermeasures include the following:

  • Implement according to the security-related notes and warnings in the official documentation.

Conclusion

This article introduced the major security risks that should be considered when developing and operating LLM applications, as well as GMO Flatt Security's unique diagnostic approach.

Thank you for reading this far.


Security AI Agent "Takumi"

We're excited to announce the launch of our security AI agent, "Takumi"!

It's already making waves in the security world, having reported over 10 vulnerabilities in OSS projects like Vim.

Check it out!
