Karthigayan Devan

Google Cloud Model Armor - LLMs Protection

Cloud Armor:

Google Cloud Armor helps protect your infrastructure and applications from Layer 3/Layer 4 network or protocol-based volumetric distributed denial-of-service (DDoS) attacks, volumetric Layer 7 attacks, and other targeted application attacks. It leverages Google's global network and distributed infrastructure to detect and absorb attacks and filter traffic through user-configurable security policies at the edge of Google's network, far upstream of your workloads.

Model Armor, by contrast, screens LLM prompts and responses. It addresses several of the significant threats covered in the OWASP Top 10 for LLM Applications:

  • Malicious files and unsafe URLs
  • Prompt injection and jailbreaks
  • Sensitive data
  • Offensive material

Core Features:

Floor settings:

  • Floor settings establish the bare minimum security requirements that all of your template configurations must meet. They are the security bedrock. A code sketch after the level breakdown below shows how each scope is addressed.

Organization level:

  • A floor setting at this level adds minimum requirements to all templates associated with any project and any folder inside the organization.

Folder level:

  • A floor setting at this level adds a minimum requirement to all templates associated with any project inside the folder.

Project level:

  • A floor setting at this level adds a minimum requirement to all templates associated with a project.
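You can also inspect these scopes programmatically. The sketch below is a minimal example using the v1 client library, which exposes one floor-setting resource per scope; the resource-name patterns and field names follow the public client library, but treat them as assumptions to verify, and the IDs are placeholders.

# pip install google-cloud-modelarmor
from google.cloud import modelarmor_v1

client = modelarmor_v1.ModelArmorClient()

# One floor-setting resource exists per scope; the IDs below are placeholders
floor_setting_names = [
    "organizations/123456789012/locations/global/floorSetting",  # organization level
    "folders/345678901234/locations/global/floorSetting",        # folder level
    "projects/your-project-id/locations/global/floorSetting",    # project level
]

for name in floor_setting_names:
    floor_setting = client.get_floor_setting(
        request=modelarmor_v1.GetFloorSettingRequest(name=name)
    )
    print(floor_setting.name, floor_setting.enable_floor_setting_enforcement)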

Template:

  • A template is your control panel, letting you dial in exactly how Model Armor examines prompts and responses.

Confidence level:

Low and above:

  • Model Armor screens almost everything. At this level, it's going to identify issues with the smallest hint of alignment to the detection criteria.

Medium and above:

  • Model Armor is a bit more discerning. It flags things that are moderate matches to the detection criteria.

High and above:

  • Model Armor is pretty darn confident that the information is a strong match to the detection criteria.
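In the client library, these console choices correspond to a detection confidence level set on each filter. A minimal sketch, assuming the DetectionConfidenceLevel and RaiFilterType enum names used in the v1 client samples:

from google.cloud import modelarmor_v1

# "Low and above" -> LOW_AND_ABOVE, "Medium and above" -> MEDIUM_AND_ABOVE, "High and above" -> HIGH
hate_speech_filter = modelarmor_v1.RaiFilterSettings.RaiFilter(
    filter_type=modelarmor_v1.RaiFilterType.HATE_SPEECH,
    confidence_level=modelarmor_v1.DetectionConfidenceLevel.MEDIUM_AND_ABOVE,
)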

How to enable Model Armor in Google Cloud?

  • Navigate to Security Command Center -> Model Armor -> Enable API

Configure floor settings:

Detections:

Responsible AI:

Saved Floor settings:

Configure template settings:

After you create the template, it is saved and appears in the Model Armor templates list.
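The same template configuration can also be created from code instead of the console. The sketch below creates a template that enables only the prompt injection and jailbreak filter (similar to the "pijb-only" template used in the sample further down); the project ID is a placeholder, and the enum paths follow the public client library samples, so verify them against your installed version.

# pip install google-cloud-modelarmor
from google.cloud import modelarmor_v1

# The regional endpoint must match the template's location (us-central1 here)
client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options={"api_endpoint": "modelarmor.us-central1.rep.googleapis.com"},
)

# Enable only the prompt injection and jailbreak filter at medium-and-above confidence
template = modelarmor_v1.Template(
    filter_config=modelarmor_v1.FilterConfig(
        pi_and_jailbreak_filter_settings=modelarmor_v1.PiAndJailbreakFilterSettings(
            filter_enforcement=modelarmor_v1.PiAndJailbreakFilterSettings.PiAndJailbreakFilterEnforcement.ENABLED,
            confidence_level=modelarmor_v1.DetectionConfidenceLevel.MEDIUM_AND_ABOVE,
        ),
    ),
)

request = modelarmor_v1.CreateTemplateRequest(
    parent="projects/your-project-id/locations/us-central1",  # placeholder project ID
    template_id="pijb-only",
    template=template,
)

created_template = client.create_template(request=request)
print(created_template.name)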

Logs:

Model Armor is a multi-tasker. It's screening the text going in and out of the LLM, and it's also taking notes on the activities. These notes are surfaced to you in the form of logs.

  • Admin Activity audit logs capture details about create, read, update, and delete (CRUD) operations on templates and floor settings.

  • Data access audit logs capture details about screening operations. For example, what template was used to screen a prompt or response, what was the text, and what was the result?

Logs Explorer:

Below are a few filters:

  • protoPayload.serviceName="modelarmor.googleapis.com"
    • This filter shows you audit logs that track template actions like create or update.
  • protoPayload.methodName="google.cloud.modelarmor.v1.ModelArmor.SanitizeUserPrompt"
    • This filter shows you the Data Access audit logs that capture prompt and response screening.
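The same filters can be used outside the console, for example with the Cloud Logging client library. A minimal sketch, assuming a placeholder project ID:

# pip install google-cloud-logging
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="your-project-id")  # placeholder project ID

# Data Access audit logs for prompt screening, combining the two filters above
log_filter = (
    'protoPayload.serviceName="modelarmor.googleapis.com" AND '
    'protoPayload.methodName="google.cloud.modelarmor.v1.ModelArmor.SanitizeUserPrompt"'
)

for entry in client.list_entries(filter_=log_filter, max_results=10):
    print(entry.timestamp, entry.log_name)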

Sample Python code:

# pip install google-cloud-modelarmor
from google.cloud import modelarmor_v1
import sys

# Create a client
client = modelarmor_v1.ModelArmorClient(
    transport="rest",
    client_options={"api_endpoint": "modelarmor.us-central1.rep.googleapis.com"},
)

# Initialize request argument(s)
user_prompt_data = modelarmor_v1.DataItem()

# Get the prompt from command line argument
if len(sys.argv) > 1: # Check if an argument is provided
    prompt = sys.argv[1] # Take the first argument as the prompt
else:
    # Fallback to a default prompt if no argument is provided
    prompt = "Placeholder prompt."

# Set prompt data for model armor call
user_prompt_data.text = prompt
ma_request = modelarmor_v1.SanitizeUserPromptRequest(
    name="projects/xxx-armor-demo-012346/locations/us-central1/templates/pijb-only", # name contains the project and template
    user_prompt_data=user_prompt_data,
)

# Make the MA request
ma_response = client.sanitize_user_prompt(request=ma_request)

# Take action based on Model Armor's result
pijb_result = ma_response.sanitization_result.filter_results["pi_and_jailbreak"]
if pijb_result.pi_and_jailbreak_filter_result.match_state == modelarmor_v1.FilterMatchState.MATCH_FOUND:
    # A prompt injection / jailbreak (PIJB) match was found
    print("Query failed security check. Error.")
else:
    print("Query passed security check. Sending prompt to LLM.")

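Model Armor can screen the model's output the same way. Continuing from the script above (it reuses the same client, imports, and template), a minimal sketch assuming the SanitizeModelResponse method from the v1 client; the response text is a placeholder:

# Screen the LLM's response with the same template (reuses `client` and `modelarmor_v1` from above)
model_response_data = modelarmor_v1.DataItem()
model_response_data.text = "Text returned by the LLM."  # placeholder response text

response_request = modelarmor_v1.SanitizeModelResponseRequest(
    name="projects/xxx-armor-demo-012346/locations/us-central1/templates/pijb-only",
    model_response_data=model_response_data,
)
response_result = client.sanitize_model_response(request=response_request)

# Overall verdict across all enabled filters
print(response_result.sanitization_result.filter_match_state)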

Pricing model:

References:
