Ship FSx for ONTAP Audit Logs to New Relic via Serverless Lambda Pipeline

TL;DR

New Relic's 100 GB/month free tier makes it the lowest-barrier entry point for FSx for ONTAP audit log observability. Deploy a single CloudFormation stack, and your file access events are queryable in NRQL within 30 minutes — no credit card required.

FSx for ONTAP audit volume
  → S3 Access Point
    → EventBridge Scheduler (every 5 min)
      → Lambda (reads + formats)
        → New Relic Log API v1 (HTTP 202)
          → NRQL queryable in ~30 seconds

This is Part 7 of the Serverless Observability for FSx for ONTAP series. Previous parts covered Datadog, OTel Collector, and Grafana Cloud.


Why New Relic for FSx for ONTAP Audit Logs?

| Criterion | New Relic | Comparison / notes |
|---|---|---|
| Free tier | 100 GB/month (permanent, no expiry) | Grafana: 50 GB/month (14-day retention); Sumo Logic: 500 MB/day (~15 GB/month); Datadog: none |
| Retention (free) | 30 days | Grafana: 14 days |
| Query language | NRQL (SQL-like) | Familiar to anyone who knows SQL |
| Alert conditions | NRQL-based, free | Included in free tier |
| Credit card required | No | Some vendors require CC for trial |

For a typical FSx for ONTAP file server generating 1-10 GB/month of audit logs, New Relic's free tier covers the entire workload at zero cost.

Architecture

┌──────────────────────────────────────────────────────────┐
│ FSx for ONTAP                                            │
│                                                          │
│  Audit Volume ──→ S3 Access Point                        │
│                        │                                 │
│                        ▼                                 │
│  EventBridge Scheduler (rate: 5 min)                     │
│                        │                                 │
│                        ▼                                 │
│  Lambda (Python 3.12)                                    │
│    • Reads audit logs via S3 AP                          │
│    • Parses JSON/EVTX                                    │
│    • Formats for New Relic Log API                       │
│    • Sends gzip-compressed payload                       │
│    • Checkpoints in SSM Parameter Store                  │
│                        │                                 │
│                        ▼                                 │
│  New Relic Log API v1                                    │
│  https://log-api.newrelic.com/log/v1                     │
│  Header: Api-Key: <license-key>                          │
│  Response: HTTP 202 + requestId                          │
│                                                          │
│                        ▼                                 │
│  NRQL: SELECT * FROM Log WHERE source='fsxn-ontap'       │
└──────────────────────────────────────────────────────────┘
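The "sends gzip-compressed payload" step in the diagram can be sketched as follows. This is a minimal illustration, not the project's actual Lambda code: the function names `build_log_request` and `send` are hypothetical, and the body uses the Log API's detailed JSON format (a list of blocks, each with a `logs` array). The endpoint and the `Api-Key` header come from the diagram above.

```python
import gzip
import json
import urllib.request

LOG_API_URL = "https://log-api.newrelic.com/log/v1"  # US endpoint from the diagram


def build_log_request(entries: list[dict], license_key: str,
                      url: str = LOG_API_URL) -> urllib.request.Request:
    """Build (but do not send) a gzip-compressed POST for the Log API."""
    # Detailed JSON format: a list of blocks, each carrying a "logs" array.
    body = json.dumps([{"logs": entries}]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=gzip.compress(body),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",
            "Api-Key": license_key,
        },
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status (202 means accepted)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes the payload easy to inspect in a unit test without touching the network.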

Quick Start (30 Minutes)

Prerequisites

  • AWS account with FSx for ONTAP + S3 Access Point configured (see Prerequisites Guide if not yet set up — allow 1-2 hours for initial FSx for ONTAP + S3 AP configuration)
  • New Relic free account (sign up here — no credit card)

Step 1: Get Your License Key

New Relic removed the "Copy key" button from the UI in September 2024. Use the NerdGraph API:

# 1. Create a User API Key in New Relic UI (API Keys page)
# 2. Store it in an environment variable (never pass keys inline):
export NR_USER_API_KEY="NRAK-YOUR-USER-KEY"
export NR_ACCOUNT_ID="YOUR_ACCOUNT_ID"

# 3. Use it to query the License Key via NerdGraph:
curl -s -X POST "https://api.newrelic.com/graphql" \
  -H "Content-Type: application/json" \
  -H "API-Key: ${NR_USER_API_KEY}" \
  -d "{\"query\":\"{ actor { apiAccess { keySearch(query: {types: INGEST, scope: {accountIds: ${NR_ACCOUNT_ID}}}) { keys { ... on ApiAccessIngestKey { key name type } } } } } }\"}" \
  | jq -r '.data.actor.apiAccess.keySearch.keys[0].key'

# The output is your License Key — store it securely, do not log it.
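If the shell quoting in the curl command is awkward to maintain, the same NerdGraph request can be built in Python, where `json.dumps` handles the escaping. This is a sketch under the same assumptions as the curl example (same endpoint, same `keySearch` query); the function names are hypothetical.

```python
import json
import urllib.request

NERDGRAPH_URL = "https://api.newrelic.com/graphql"


def build_key_search_body(account_id: int) -> bytes:
    """Return the JSON request body for the keySearch query shown above."""
    query = (
        "{ actor { apiAccess { keySearch(query: {types: INGEST, "
        "scope: {accountIds: %d}}) "
        "{ keys { ... on ApiAccessIngestKey { key name type } } } } } }"
    ) % account_id
    return json.dumps({"query": query}).encode("utf-8")


def fetch_first_ingest_key(user_api_key: str, account_id: int) -> str:
    """Call NerdGraph and return the first ingest key (network call)."""
    request = urllib.request.Request(
        NERDGRAPH_URL,
        data=build_key_search_body(account_id),
        method="POST",
        headers={"Content-Type": "application/json", "API-Key": user_api_key},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    # Same path as the jq filter in the curl example.
    return payload["data"]["actor"]["apiAccess"]["keySearch"]["keys"][0]["key"]
```

As with the shell version, read the User API Key from the environment (for example `os.environ["NR_USER_API_KEY"]`) rather than hard-coding it, and avoid printing the returned key.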

Step 2: Store in Secrets Manager

aws secretsmanager create-secret \
  --name "new-relic/fsxn-license-key" \
  --secret-string '{"license_key":"YOUR_LICENSE_KEY_HERE"}' \
  --region ap-northeast-1
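On the Lambda side, the secret created above has to be read back and unpacked. The following is a hedged sketch of how that could look; the function names are hypothetical, and the only thing taken from this article is the secret's JSON shape (`{"license_key": "..."}`). `boto3` is imported lazily so the parsing helper can be tested without AWS access.

```python
import json


def parse_license_key(secret_string: str) -> str:
    """Extract the license key from the secret's JSON payload."""
    secret = json.loads(secret_string)
    key = secret.get("license_key", "")
    if not key:
        # Do not include the secret contents in the error message.
        raise ValueError("secret JSON has no 'license_key' field")
    return key


def load_license_key(secret_arn: str, region: str = "ap-northeast-1") -> str:
    """Fetch the secret from Secrets Manager and return the license key."""
    import boto3  # provided by the Lambda Python runtime

    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_arn)
    return parse_license_key(response["SecretString"])
```

Caching the result in a module-level variable would avoid a Secrets Manager call on every invocation, at the cost of picking up rotated keys only on a cold start.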

For automated credential rotation, see the Secrets Rotation Sample Template which provides a Lambda-based rotation pattern for all vendor integrations.

Step 3: Deploy

aws cloudformation deploy \
  --template-file integrations/new-relic/template.yaml \
  --stack-name fsxn-new-relic-integration \
  --parameter-overrides \
    S3AccessPointArn=arn:aws:s3:ap-northeast-1:123456789012:accesspoint/fsxn-audit \
    NewRelicLicenseKeySecretArn=<SECRET_ARN> \
    NewRelicRegion=US \
    S3BucketName=<YOUR_BUCKET> \
  --capabilities CAPABILITY_NAMED_IAM \
  --region ap-northeast-1

Step 4: Verify

-- In New Relic Query Builder:
SELECT count(*) FROM Log WHERE source = 'fsxn-ontap' SINCE 15 minutes ago

If you see results, you're done. The pipeline is working.

What Gets Shipped

Each audit log event is formatted as a New Relic log entry with structured attributes:

{
  "message": "{\"EventID\":\"4663\",\"UserName\":\"admin@corp.local\",...}",
  "timestamp": 1716508800000,
  "attributes": {
    "source": "fsxn-ontap",
    "service": "ontap-audit",
    "event_type": "4663",
    "svm": "svm-prod-01",
    "user": "admin@corp.local",
    "client_ip": "10.0.1.50",
    "operation": "ReadData",
    "path": "/vol/data/report.pdf",
    "result": "Success",
    "s3_key": "audit/svm-prod-01/2026/05/24/audit-001.json"
  }
}

Important: New Relic requires timestamp as Unix epoch in milliseconds (not seconds, not ISO 8601 strings). Our Lambda handles this conversion automatically.
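To make the mapping concrete, here is a minimal sketch of turning one parsed audit event into the entry shape shown above. It is illustrative only: `to_log_entry` is a hypothetical name, and apart from `EventID` and `UserName` (which appear in the sample message) the raw field names `ClientIP`, `Operation`, `ObjectName`, and `Result` are assumptions about the audit record, not confirmed ONTAP field names.

```python
import json


def to_log_entry(event: dict, timestamp_ms: int, svm: str, s3_key: str) -> dict:
    """Map one parsed audit event to a New Relic log entry."""
    return {
        # Keep the full original event as the message for later inspection.
        "message": json.dumps(event, separators=(",", ":")),
        "timestamp": timestamp_ms,  # Unix epoch in milliseconds
        "attributes": {
            "source": "fsxn-ontap",
            "service": "ontap-audit",
            "event_type": str(event.get("EventID", "")),
            "svm": svm,
            "user": event.get("UserName", ""),
            # The raw keys below are hypothetical; adjust to your audit schema.
            "client_ip": event.get("ClientIP", ""),
            "operation": event.get("Operation", ""),
            "path": event.get("ObjectName", ""),
            "result": event.get("Result", ""),
            "s3_key": s3_key,
        },
    }
```

Promoting the fields you filter on (such as `user` and `result`) to attributes is what lets the NRQL queries in the next section use plain `WHERE` and `FACET` clauses instead of parsing the message string.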

NRQL Query Examples

-- Failed access attempts (security investigation)
SELECT count(*) FROM Log
WHERE source = 'fsxn-ontap' AND result = 'Failure'
FACET user, path
SINCE 1 hour ago

-- Top users by file access volume
SELECT count(*) FROM Log
WHERE source = 'fsxn-ontap'
FACET user
SINCE 1 day ago

-- Operations breakdown (pie chart)
SELECT count(*) FROM Log
WHERE source = 'fsxn-ontap'
FACET operation
SINCE 1 hour ago

-- Time series for anomaly detection
SELECT count(*) FROM Log
WHERE source = 'fsxn-ontap'
TIMESERIES 5 minutes
SINCE 6 hours ago

-- Specific user investigation
SELECT * FROM Log
WHERE source = 'fsxn-ontap' AND user = 'admin@corp.local'
SINCE 1 hour ago
LIMIT 100

Alert Configuration

Create a NRQL alert condition to detect failed access spikes:

mutation {
  alertsNrqlConditionStaticCreate(
    accountId: YOUR_ACCOUNT_ID,
    policyId: YOUR_POLICY_ID,
    condition: {
      name: "FSxN Failed Access Spike",
      nrql: {
        query: "SELECT count(*) FROM Log WHERE source = 'fsxn-ontap' AND result = 'Failure'"
      },
      terms: [{
        threshold: 10,
        thresholdOccurrences: AT_LEAST_ONCE,
        thresholdDuration: 300,
        operator: ABOVE,
        priority: CRITICAL
      }]
    }
  ) { id }
}

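With AT_LEAST_ONCE and a thresholdDuration of 300 seconds, this opens an incident when the failed-access count exceeds 10 at any point within a 5-minute evaluation period. The count is computed per aggregation window, so check the condition's aggregation window setting if you want the threshold to apply to a full 5-minute total.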

Threshold rationale: In typical FSx for ONTAP environments, failed access attempts are 0-2 per 5-minute window during normal operations (e.g., stale NFS handles, permission misconfigurations). A threshold of 10 indicates a potential brute-force attempt or systematic misconfiguration. Adjust based on your baseline — monitor for 1 week before setting production thresholds.

Cost Analysis

| Monthly Log Volume | AWS Cost | New Relic Cost | Total |
|---|---|---|---|
| 1 GB | ~$2 | $0 (free tier) | $2 |
| 10 GB | ~$8 | $0 (free tier) | $8 |
| 50 GB | ~$25 | $0 (free tier) | $25 |
| 100 GB | ~$41 | $0 (at limit) | $41 |
| 200 GB | ~$80 | $35 (overage) | $115 |

For comparison, the EC2-based approach (syslog-ng + Universal Forwarder) costs ~$66/month in fixed infrastructure regardless of log volume.
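The totals in the table can be reproduced with a few lines of arithmetic. In this sketch the AWS figures are copied from the table, and the overage rate of $0.35/GB is an assumption inferred from the 200 GB row ($35 for 100 GB above the free tier); check New Relic's current pricing before relying on it.

```python
FREE_TIER_GB = 100         # New Relic free tier, per the comparison table
OVERAGE_USD_PER_GB = 0.35  # assumption: inferred from the 200 GB row
# Approximate AWS-side cost per monthly volume (GB), copied from the table.
AWS_COST_USD = {1: 2, 10: 8, 50: 25, 100: 41, 200: 80}


def new_relic_cost(volume_gb: float) -> float:
    """Ingest cost: only the volume above the free tier is billed."""
    billable_gb = max(0, volume_gb - FREE_TIER_GB)
    return round(billable_gb * OVERAGE_USD_PER_GB, 2)


def total_cost(volume_gb: int) -> float:
    """AWS cost plus New Relic overage for a volume listed in the table."""
    return AWS_COST_USD[volume_gb] + new_relic_cost(volume_gb)


if __name__ == "__main__":
    for volume_gb in AWS_COST_USD:
        print(f"{volume_gb:>4} GB  total ${total_cost(volume_gb):.2f}")
```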

Gotchas We Discovered

1. License Key Retrieval (Post-September 2024)

New Relic removed the "Copy key" button from the API Keys UI. You must use NerdGraph to retrieve the full License Key value using the Key ID.

2. Timestamp Format

New Relic's Log API rejects ISO 8601 strings with the error "Error unmarshalling message payload". Timestamps must be Unix epoch in milliseconds (integer).

# Wrong: "2026-01-15T12:00:00Z" → rejected
# Right: 1768478400000 → accepted
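A conversion helper along these lines covers the case above. It is a sketch rather than the Lambda's actual code: `to_epoch_ms` is a hypothetical name, and treating timestamps without an offset as UTC is an assumption you should verify against your audit log source.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(iso_timestamp: str) -> int:
    """Convert an ISO 8601 timestamp to Unix epoch milliseconds."""
    # Normalize a trailing "Z" so older Python versions can parse it too.
    parsed = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_ms("2026-01-15T12:00:00Z"))  # 1768478400000
```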

3. New Account Initialization Lag

First-time data ingestion on a new account may take 5-10 minutes to appear in the UI. Subsequent deliveries appear within 30 seconds.

4. US vs EU Region

Choose your region at signup. It cannot be changed later. For Japan-based workloads, US is recommended until the Tokyo data center launches in July 2026 (announced March 2026).
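The region choice also determines which Log API endpoint the Lambda must call, because EU accounts use a separate host. A small lookup like the one below is one way to handle it; how the template's `NewRelicRegion` parameter is actually consumed is not shown in this article, so the function name and the mapping from parameter value to URL are assumptions.

```python
LOG_API_ENDPOINTS = {
    "US": "https://log-api.newrelic.com/log/v1",
    "EU": "https://log-api.eu.newrelic.com/log/v1",
}


def log_api_url(region: str) -> str:
    """Return the Log API endpoint for a New Relic account region."""
    normalized = region.strip().upper()
    if normalized not in LOG_API_ENDPOINTS:
        raise ValueError(f"unsupported New Relic region: {region!r}")
    return LOG_API_ENDPOINTS[normalized]
```

Failing fast on an unknown region is deliberate: sending a license key to the wrong regional endpoint typically shows up only as rejected requests or missing data, which is harder to diagnose.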

What's Next

  • EMS webhooks: Receive ONTAP system events (ARP ransomware detection, quota alerts) in real-time
  • FPolicy events: Stream file operations with sub-second latency
  • OTel Collector path: If you later need multi-backend delivery, the same log format works with the Part 5 Collector
  • OTLP ingest: New Relic supports OTLP natively — if you adopt OTel Collector, you can switch from Log API v1 to OTLP without changing the Lambda code
  • logtype attribute: Add "logtype": "fsxn-ontap-audit" to enable New Relic's automatic Parsing Rules for structured field extraction
  • Production Readiness: Progress from Level 1 (this Quick Start) to Level 4 (Enterprise) — see the Pipeline SLO Definitions

Production Readiness

This integration follows the project's Production Readiness Levels:

| Level | What You Get | Go/No-Go to Next |
|---|---|---|
| Level 1 (this Quick Start) | Audit poller + DLQ | Logs arrive, checkpoint advances, DLQ empty 24h |
| Level 2 | + NRQL dashboards + alerts | SLOs met 7 days, security review done |
| Level 3 | + DynamoDB ledger + poison-pill | SLOs met 30 days, compliance pack |
| Level 4 | + OTel Collector + redaction | Multi-backend, PII redaction, DR tested |
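The Level 1 criterion "checkpoint advances" refers to the checkpoint the Lambda keeps in SSM Parameter Store. The sketch below shows one way such a checkpoint could work; it is not the project's implementation. It assumes the checkpoint is the last processed S3 key and that keys sort chronologically (as date-based paths like `audit/<svm>/2026/05/24/...` do). All function names are hypothetical.

```python
from typing import Callable, Iterable, Optional


def pending_keys(all_keys: Iterable[str], checkpoint: Optional[str]) -> list[str]:
    """Return keys newer than the checkpoint, oldest first."""
    return sorted(k for k in all_keys if checkpoint is None or k > checkpoint)


def process_batch(all_keys: Iterable[str], checkpoint: Optional[str],
                  ship: Callable[[str], None],
                  save: Callable[[str], None]) -> Optional[str]:
    """Ship each pending key, advancing the checkpoint only after success."""
    for key in pending_keys(all_keys, checkpoint):
        ship(key)   # raises on failure, so the checkpoint stays put
        save(key)   # persist progress after each successful delivery
        checkpoint = key
    return checkpoint


def ssm_saver(parameter_name: str) -> Callable[[str], None]:
    """Build a save() callback backed by SSM Parameter Store."""
    import boto3  # provided by the Lambda Python runtime

    client = boto3.client("ssm")

    def save(value: str) -> None:
        client.put_parameter(Name=parameter_name, Value=value,
                             Type="String", Overwrite=True)

    return save
```

Because the checkpoint is saved only after a key has shipped, a failed run resumes from the failed key on the next schedule tick. That can re-send a file if the failure happens between shipping and saving, so this gives at-least-once delivery.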

Data classification: New Relic receives user and path fields (PII/sensitive). New Relic currently processes data in US and EU regions. For Japan-based workloads requiring data residency, evaluate cross-border transfer requirements with your compliance team. See Data Classification Guide.

Full criteria: Pipeline SLO Definitions | DLQ Replay Runbook

Note on S3 AP network path: The Lambda in this integration is deployed outside a VPC (no VPC configuration) because that is the simplest way to reach the S3 Access Point. If your Lambda needs VPC access for other reasons, add a NAT Gateway; see the S3 AP Specification for network constraints.

Resources

Series Navigation


Questions about the New Relic integration or the 100 GB free tier? Drop a comment below.

Previous: Part 6 — Direct-to-Grafana: Shipping Logs via OTLP Gateway

GitHub: github.com/Yoshiki0705/fsxn-observability-integrations
