GCP doesn't log Vertex AI data access by default. Two Terraform resource types change that - Cloud Audit Logs for metadata, log sinks for long-term retention. Here's the full setup.
You've deployed your Vertex AI endpoint (Post 1) and added safety filters (Post 2). Your app is generating responses in production. Then your compliance team asks:
"Can you prove who called which model, when, and what was sent?"
GCP gives you two logging layers for this. Cloud Audit Logs capture metadata about every Vertex AI API call - who, when, which model, whether it succeeded. Request-response logging captures the actual prompt and response bodies into BigQuery. Both are disabled by default. Terraform makes sure they're enabled before your first production call. 🎯
🧱 Two Logging Layers, Two Problems They Solve
| Layer | What It Captures | Where It Goes | Terraform Resource |
|---|---|---|---|
| Cloud Audit Logs | Caller identity, model ID, method, timestamp, authorization | Cloud Logging | google_project_iam_audit_config |
| Request-Response Logging | Full prompt body, full response body, token counts | BigQuery | Endpoint config (API/SDK) |
Cloud Audit Logs answer "who called what model and when." Request-response logging answers "what did they send and what came back." Most compliance scenarios need both.
This article focuses on Cloud Audit Logs since they're fully Terraform-manageable and cover the audit trail requirements. Request-response logging is configured per-endpoint via the API and is covered at the end.
🏗️ Step 1: Enable Data Access Audit Logs
GCP Cloud Audit Logs come in four types. Admin Activity logs are always on and free - they capture resource creation and deletion. Data Access logs capture read and write operations like model predictions - these are off by default and are the ones you need for AI compliance. System Event logs capture GCP-initiated actions, and Policy Denied logs record requests blocked by policies such as VPC Service Controls.
For Vertex AI, every generateContent, predict, and streamGenerateContent call is a Data Access event. Without enabling these, you have no record of inference calls:
# logging/audit_config.tf
resource "google_project_iam_audit_config" "vertex_ai" {
  project = var.project_id
  service = "aiplatform.googleapis.com"

  # Metadata reads: listing models, describing endpoints
  audit_log_config {
    log_type = "ADMIN_READ"
  }

  # Inference reads: generateContent, predict, streamGenerateContent
  audit_log_config {
    log_type = "DATA_READ"
  }

  # Data writes: e.g. tuning jobs and dataset uploads
  audit_log_config {
    log_type = "DATA_WRITE"
  }
}
What this enables: Every Vertex AI API call now generates an audit log entry in Cloud Logging with the caller's identity, the model resource path, the method name, and the timestamp.
Cost note: Data Access logs can generate significant volume. In production with high call rates, use exempted members or log sinks with filters to control costs.
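If one noisy internal caller dominates your volume, the audit_log_config block accepts exempted_members. A minimal sketch of that cost-control variant - the load-test service account name is a placeholder:
# logging/audit_config.tf (cost-control variant of the resource above)
resource "google_project_iam_audit_config" "vertex_ai" {
  project = var.project_id
  service = "aiplatform.googleapis.com"

  audit_log_config {
    log_type = "DATA_READ"
    # Placeholder SA: skip audit entries for high-volume load tests
    exempted_members = [
      "serviceAccount:load-test-sa@${var.project_id}.iam.gserviceaccount.com",
    ]
  }
}
Exempted principals leave no audit trail at all, so use this sparingly in regulated environments.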
📁 Step 2: Log Sink to Cloud Storage
Audit logs in Cloud Logging have a default 30-day retention in the _Default bucket. For compliance, you need longer retention. A log sink exports Vertex AI audit logs to Cloud Storage:
# logging/sink_gcs.tf
resource "google_storage_bucket" "vertex_ai_logs" {
  name     = "${var.environment}-vertex-ai-audit-logs-${var.project_id}"
  location = var.region
  project  = var.project_id

  # Allow `terraform destroy` to delete non-empty buckets outside prod
  force_destroy               = var.environment != "prod"
  uniform_bucket_level_access = true

  versioning {
    enabled = true
  }

  # Tier down to Nearline once logs leave the hot-query window
  lifecycle_rule {
    condition {
      age = var.nearline_transition_days
    }
    action {
      type          = "SetStorageClass"
      storage_class = "NEARLINE"
    }
  }

  # Then to Coldline for long-term compliance storage
  lifecycle_rule {
    condition {
      age = var.coldline_transition_days
    }
    action {
      type          = "SetStorageClass"
      storage_class = "COLDLINE"
    }
  }

  # Finally delete once the retention window expires
  lifecycle_rule {
    condition {
      age = var.log_retention_days
    }
    action {
      type = "Delete"
    }
  }
}
resource "google_logging_project_sink" "vertex_ai_gcs" {
  name        = "${var.environment}-vertex-ai-audit-to-gcs"
  project     = var.project_id
  destination = "storage.googleapis.com/${google_storage_bucket.vertex_ai_logs.name}"

  # Only Vertex AI audit entries - keep other services out of this bucket
  filter = <<-EOT
    protoPayload.serviceName="aiplatform.googleapis.com"
    AND logName:"cloudaudit.googleapis.com"
  EOT

  # Dedicated service account for this sink, so the grant below stays scoped
  unique_writer_identity = true
}

# The sink's service account needs write access to the bucket
resource "google_storage_bucket_iam_member" "sink_writer" {
  bucket = google_storage_bucket.vertex_ai_logs.name
  role   = "roles/storage.objectCreator"
  member = google_logging_project_sink.vertex_ai_gcs.writer_identity
}
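For strict regimes where auditors require immutability, Cloud Storage also supports a bucket retention policy. A hedged sketch (the bucket name is a variant of the one above) - note that once is_locked is true, the policy can never be shortened or removed:
# logging/sink_gcs.tf (optional immutability for regulated environments)
resource "google_storage_bucket" "vertex_ai_logs_locked" {
  name                        = "${var.environment}-vertex-ai-audit-logs-locked-${var.project_id}"
  location                    = var.region
  project                     = var.project_id
  uniform_bucket_level_access = true

  retention_policy {
    # Objects cannot be deleted or overwritten before this age (seconds)
    retention_period = var.log_retention_days * 86400
    # Locking is irreversible - leave false until auditors sign off
    is_locked = false
  }
}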
📊 Step 3: Log Sink to BigQuery
For queryable analytics - cost tracking per model, usage patterns, anomaly detection - send the same logs to BigQuery:
# logging/sink_bigquery.tf
resource "google_bigquery_dataset" "vertex_ai_logs" {
dataset_id = "${var.environment}_vertex_ai_audit_logs"
project = var.project_id
location = var.region
description = "Vertex AI audit logs for analysis"
default_table_expiration_ms = var.bq_table_expiration_days * 86400000 # days to ms
labels = {
environment = var.environment
purpose = "ai-audit-logging"
}
}
resource "google_logging_project_sink" "vertex_ai_bigquery" {
name = "${var.environment}-vertex-ai-audit-to-bq"
project = var.project_id
destination = "bigquery.googleapis.com/projects/${var.project_id}/datasets/${google_bigquery_dataset.vertex_ai_logs.dataset_id}"
filter = <<-EOT
protoPayload.serviceName="aiplatform.googleapis.com"
AND logName:"cloudaudit.googleapis.com"
EOT
unique_writer_identity = true
bigquery_options {
use_partitioned_tables = true
}
}
resource "google_bigquery_dataset_iam_member" "sink_writer" {
project = var.project_id
dataset_id = google_bigquery_dataset.vertex_ai_logs.dataset_id
role = "roles/bigquery.dataEditor"
member = google_logging_project_sink.vertex_ai_bigquery.writer_identity
}
Partitioned tables are critical here. Without use_partitioned_tables, the sink creates one date-sharded table per day; with it, logs land in a single ingestion-time-partitioned table, so queries that filter on recent data scan fewer partitions and cost less.
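One optional refinement: google_bigquery_dataset also accepts default_partition_expiration_ms, so individual partitions age out instead of waiting for whole-table expiry. A sketch of the dataset with that extra attribute, assuming the same expiration window:
# logging/sink_bigquery.tf (variant with partition-level expiry)
resource "google_bigquery_dataset" "vertex_ai_logs" {
  dataset_id = "${var.environment}_vertex_ai_audit_logs"
  project    = var.project_id
  location   = var.region

  # Whole tables expire after N days (days to ms)
  default_table_expiration_ms = var.bq_table_expiration_days * 86400000

  # Individual partitions also age out, keeping long-lived tables lean
  default_partition_expiration_ms = var.bq_table_expiration_days * 86400000
}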
⚙️ Step 4: Variables and Environment Configs
# logging/variables.tf
variable "project_id" { type = string }
variable "environment" { type = string }
variable "region" { type = string }
variable "nearline_transition_days" {
type = number
default = 30
}
variable "coldline_transition_days" {
type = number
default = 90
}
variable "log_retention_days" {
type = number
default = 365
}
variable "bq_table_expiration_days" {
  type    = number
  default = 365
}
# Consumed by the alert policy in Step 6
variable "notification_channels" {
  type    = list(string)
  default = []
}
Per-environment configs:
# environments/dev.tfvars
nearline_transition_days = 15
coldline_transition_days = 30
log_retention_days = 90
bq_table_expiration_days = 90
# environments/prod.tfvars
nearline_transition_days = 90
coldline_transition_days = 365
log_retention_days = 2555 # 7 years for regulated industries
bq_table_expiration_days = 2555
🔍 Step 5: Query Your Audit Logs
Once the BigQuery sink is active, you can run SQL against your audit data:
-- Top models by invocation count (last 7 days)
SELECT
protopayload_auditlog.resourceName AS model,
COUNT(*) AS call_count
FROM `PROJECT.DATASET.cloudaudit_googleapis_com_data_access`
WHERE timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY model
ORDER BY call_count DESC;
-- All calls by a specific service account
SELECT
timestamp,
protopayload_auditlog.authenticationInfo.principalEmail,
protopayload_auditlog.methodName,
protopayload_auditlog.resourceName
FROM `PROJECT.DATASET.cloudaudit_googleapis_com_data_access`
WHERE protopayload_auditlog.authenticationInfo.principalEmail
LIKE '%my-cloud-function-sa%';
-- Daily token usage trend (from request-response logs)
SELECT
  DATE(logging_time) AS day,
  model,
  -- JSON_EXTRACT_SCALAR returns a STRING; cast before summing
  SUM(CAST(JSON_EXTRACT_SCALAR(response, '$.usageMetadata.totalTokenCount') AS INT64)) AS total_tokens
FROM `PROJECT.DATASET.request_response_logging`
GROUP BY day, model
ORDER BY day DESC;
🚨 Step 6: Alerting on Anomalies
Create log-based metrics and alerts for suspicious patterns:
# logging/alerts.tf
resource "google_logging_metric" "vertex_ai_errors" {
name = "${var.environment}-vertex-ai-error-rate"
project = var.project_id
filter = <<-EOT
protoPayload.serviceName="aiplatform.googleapis.com"
AND severity>=ERROR
EOT
metric_descriptor {
metric_kind = "DELTA"
value_type = "INT64"
}
}
resource "google_monitoring_alert_policy" "vertex_ai_errors" {
display_name = "Vertex AI High Error Rate"
project = var.project_id
combiner = "OR"
conditions {
display_name = "Error rate spike"
condition_threshold {
filter = "metric.type=\"logging.googleapis.com/user/${google_logging_metric.vertex_ai_errors.name}\""
comparison = "COMPARISON_GT"
threshold_value = 50
duration = "300s"
aggregations {
alignment_period = "300s"
per_series_aligner = "ALIGN_SUM"
}
}
}
notification_channels = var.notification_channels
}
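The policy expects channel IDs in var.notification_channels. If none exist yet, an email channel is the quickest to stand up - a minimal sketch, with a placeholder address:
# logging/notification.tf
resource "google_monitoring_notification_channel" "oncall_email" {
  display_name = "${var.environment}-vertex-ai-oncall"
  project      = var.project_id
  type         = "email"

  labels = {
    email_address = "oncall@example.com" # placeholder - use your on-call alias
  }
}
Then feed google_monitoring_notification_channel.oncall_email.name into var.notification_channels (channel names look like projects/PROJECT/notificationChannels/ID).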
📐 Production Architecture
┌──────────────────────────────────┐
│        Vertex AI API Call        │
│   (generateContent / predict)    │
└───────────────┬──────────────────┘
                │
         Cloud Audit Logs
          (Cloud Logging)
                │
    ┌───────────┼───────────┐
    │           │           │
    ▼           ▼           ▼
┌────────┐  ┌────────┐  ┌──────────┐
│  GCS   │  │  BQ    │  │ Alerting │
│ Bucket │  │ Dataset│  │ Policies │
│        │  │        │  │          │
│ Long-  │  │ SQL    │  │ Real-    │
│ term   │  │ query  │  │ time     │
│ archive│  │ & dash │  │ alerts   │
└────────┘  └────────┘  └──────────┘
Dual-sink pattern: GCS for long-term compliance retention (lifecycle to Coldline, years of data). BigQuery for queryable analytics (partitioned tables, SQL access, Looker dashboards).
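If you run Vertex AI across multiple projects, the same pattern lifts to the organization level with an aggregated sink. A hedged sketch, assuming a hypothetical var.org_id holding your numeric organization ID:
# logging/sink_org.tf (optional, multi-project setups)
resource "google_logging_organization_sink" "vertex_ai_all_projects" {
  name             = "org-vertex-ai-audit-to-gcs"
  org_id           = var.org_id # hypothetical variable: numeric org ID
  destination      = "storage.googleapis.com/${google_storage_bucket.vertex_ai_logs.name}"
  include_children = true # capture every project under the org

  filter = <<-EOT
    protoPayload.serviceName="aiplatform.googleapis.com"
    AND logName:"cloudaudit.googleapis.com"
  EOT
}
As with the project sink, grant the sink's writer_identity roles/storage.objectCreator on the destination bucket.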
📝 Request-Response Logging (Prompt/Response Bodies)
Cloud Audit Logs capture metadata but not the actual prompt and response content. For full prompt/response bodies, Vertex AI offers request-response logging to BigQuery. This is configured per-endpoint via the API, not through Terraform:
# Enable via Python SDK when creating the endpoint
from google.cloud import aiplatform

aiplatform.init(project=project_id, location=region)

endpoint = aiplatform.Endpoint.create(
    display_name="my-endpoint",
    enable_request_response_logging=True,
    # Log 100% of calls in prod; lower in dev to cut BigQuery cost
    request_response_logging_sampling_rate=1.0,
    request_response_logging_bq_destination_table=(
        f"bq://{project_id}.{dataset_name}.request_response_logging"
    ),
)
This captures the full JSON body of every prompt and response. In production, keep the sampling rate at 1.0 for full compliance coverage; in dev, lower it to reduce BigQuery costs.
💡 GCP vs AWS: Key Differences
| Aspect | GCP (Vertex AI) | AWS (Bedrock) |
|---|---|---|
| Metadata logging | google_project_iam_audit_config | aws_bedrock_model_invocation_logging_configuration |
| Prompt/response bodies | Request-response logging to BigQuery | Inline in CloudWatch/S3 logs |
| Scope | Per-project, per-service | Per-region, per-account |
| Long-term storage | Log sinks to GCS/BigQuery | S3 with lifecycle policies |
| Query engine | BigQuery SQL (native) | Athena (requires Glue catalog) |
| Real-time alerts | Log-based metrics + Cloud Monitoring | CloudWatch metric filters + alarms |
The biggest difference: GCP separates metadata from content logging. AWS bundles everything into one logging configuration. GCP's approach gives you finer cost control since Data Access logs (metadata) are cheaper than storing full prompt/response bodies in BigQuery.
⏭️ What's Next
This is Post 3 of the GCP AI Infrastructure with Terraform series.
- Post 1: Deploy Vertex AI: First AI Endpoint
- Post 2: Vertex AI Safety Filters 🛡️
- Post 3: Audit Logging (you are here) 📍
Every Vertex AI call now has a paper trail. Caller identity, model, timestamp in Cloud Audit Logs. Full prompts and responses in BigQuery. All managed by Terraform, all queryable with SQL. 🔒
Found this helpful? Follow for the full GCP AI Infrastructure with Terraform series! 💬