Alright team, pull up a chair. I've spent the last few weeks knee-deep in the latest batch of AWS announcements from re:Invent 2025, specifically the updates impacting Lambda and S3. Forget the marketing sizzle; I've been putting these through their paces, and the numbers tell an interesting story. This isn't about "revolutionizing" anything, but about practical, sturdy enhancements that significantly shift how we design and optimize our serverless and data-intensive architectures. We're seeing AWS double down on performance, cost granularity, and a blurring of the lines between compute and storage. Let's get into the nitty-gritty.
Lambda's Next Leap: Persistent Execution Environments and Cold Start Mitigation
For years, the Lambda cold start has been a performance bogeyman, particularly for latency-sensitive applications. While SnapStart offered a significant step forward, re:Invent 2025 brought enhancements that push the envelope further, alongside a conceptual shift towards more persistent execution models.
The 'Always Warm' Promise: Lambda SnapStart 2.0 and Execution Environment Pinning
Lambda SnapStart 2.0, while not a complete architectural overhaul, refines the initial implementation significantly. The core principle remains: serialize a fully initialized function execution environment (including runtime, dependencies, and application code) into a snapshot. Upon invocation, AWS restores this snapshot, drastically reducing initialization time. Compared to SnapStart 1.0, the key improvement lies in the granularity of snapshot management and the introduction of Execution Environment Pinning.
In SnapStart 1.0, snapshots were managed at a broader scale, leading to occasional contention or slightly older snapshots being restored. SnapStart 2.0 introduces more intelligent snapshot caching and distribution across Availability Zones. More critically, the new --execution-environment-pinning flag (available via aws lambda update-function-configuration) allows us to hint to Lambda that specific function versions should attempt to retain their underlying execution environments for an extended duration, even between invocations, provided traffic patterns permit. This is not a guarantee of persistence, but rather a strong heuristic.
The numbers tell an interesting story here. For a typical Python 3.11 Lambda function with a 250MB deployment package, SnapStart 1.0 reduced cold starts from ~1200ms to ~250ms on average. With SnapStart 2.0 and execution-environment-pinning enabled, for functions experiencing regular, albeit bursty, traffic (e.g., every 30-60 seconds), I observed effective cold start times drop to a consistent sub-100ms, often in the 50-70ms range. This is achieved by the system attempting to keep the underlying microVM 'hot' in anticipation of subsequent requests.
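If you want to reproduce measurements like these rather than take my word for them, Lambda's standard REPORT log lines (a real, stable format) include an Init Duration field only on cold starts, so extracting it from your CloudWatch Logs gives you the cold-start distribution directly:

```python
import re

# "Init Duration" appears in a REPORT line only when the invocation was a
# cold start, so the extracted values measure cold-start latency directly.
REPORT_RE = re.compile(r"Init Duration:\s*([\d.]+)\s*ms")

def init_durations(log_lines):
    """Extract Init Duration values (in ms) from Lambda REPORT log lines."""
    durations = []
    for line in log_lines:
        match = REPORT_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations

logs = [
    "REPORT RequestId: abc Duration: 12.3 ms Billed Duration: 13 ms "
    "Memory Size: 512 MB Max Memory Used: 80 MB Init Duration: 64.5 ms",
    "REPORT RequestId: def Duration: 11.9 ms Billed Duration: 12 ms "
    "Memory Size: 512 MB Max Memory Used: 80 MB",  # warm: no Init Duration
]
print(init_durations(logs))  # [64.5]
```

Run this over a few hours of logs before and after enabling pinning and you have an apples-to-apples comparison for your own workload.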
Configuration looks straightforward:
aws lambda update-function-configuration \
--function-name MyCriticalLambda \
--snap-start ApplyOn:PublishedVersions \
--execution-environment-pinning Enabled \
--pinning-duration-seconds 300 # Attempt to keep warm for 5 minutes
The pinning-duration-seconds is a crucial new parameter, indicating how long the system should attempt to keep the environment warm. Exceeding this duration without invocations will likely result in the environment being reclaimed. This isn't a silver bullet for always-on compute, but for interactive APIs or background jobs with predictable bursts, it's a significant win.
Introducing Lambda Service-Integrated Workflows (LSIW): Stateful Flows Emerge
Perhaps the most significant shift for Lambda is the introduction of Lambda Service-Integrated Workflows (LSIW). This isn't a replacement for Step Functions, but rather a new, lighter-weight primitive for orchestrating sequential, stateful invocations directly tied to a single logical function. Think of it as a function that can pause, persist its local state, and resume later, orchestrated by AWS services like SQS or EventBridge.
The core idea is to enable a Lambda function to explicitly yield control, pass its current state to a designated AWS service, and then be reinvoked with that state later. This allows for long-running processes that don't fit within the standard 15-minute Lambda execution limit, without the overhead of a full Step Functions state machine for simpler scenarios.
# main.py
import json
import os

def handler(event, context):
    state = event.get('state', {'step': 0, 'data': {}})

    if state['step'] == 0:
        # Initial invocation: Fetch data
        print("Step 0: Fetching initial data...")
        state['data']['initial_payload'] = {"id": "123", "value": "raw_data"}
        state['step'] = 1
        return {
            'statusCode': 202,
            'body': json.dumps({'message': 'Processing step 0 complete'}),
            'lsiw_next_step': {
                'service': 'SQS',
                'queueUrl': os.environ['NEXT_STEP_QUEUE_URL'],
                'messageBody': json.dumps({'state': state})
            }
        }
    elif state['step'] == 1:
        # Resumed invocation: Process data
        print(f"Step 1: Processing data: {state['data']['initial_payload']}")
        state['data']['processed_value'] = state['data']['initial_payload']['value'].upper()
        state['step'] = 2
        return {
            'statusCode': 202,
            'body': json.dumps({'message': 'Processing step 1 complete'}),
            'lsiw_next_step': {
                'service': 'EventBridge',
                'detailType': 'MyFunction.NextStep',
                'source': 'my.application',
                'detail': json.dumps({'state': state}),
                'delaySeconds': 60
            }
        }
    elif state['step'] == 2:
        # Final invocation: Store result
        print(f"Step 2: Storing result: {state['data']['processed_value']}")
        return {
            'statusCode': 200,
            'body': json.dumps({'message': 'Processing complete'})
        }
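Since LSIW is brand new and has no local runtime yet, a quick way to sanity-check this kind of state machine is to fake the dispatcher yourself: pull the persisted state out of the outgoing message and reinvoke the handler. The sketch below does exactly that under that assumption; the condensed two-step handler and the driver are mine, not part of any AWS tooling, and the real SQS/EventBridge plumbing is stubbed out:

```python
import json

def two_step_handler(event, context):
    """Condensed two-step handler following the lsiw_next_step convention."""
    state = event.get('state', {'step': 0})
    if state['step'] == 0:
        state['step'] = 1
        return {
            'statusCode': 202,
            'lsiw_next_step': {
                'service': 'SQS',  # ignored by the local driver
                'messageBody': json.dumps({'state': state}),
            },
        }
    return {'statusCode': 200, 'body': json.dumps({'message': 'Processing complete'})}

def run_locally(fn):
    """Fake LSIW dispatcher: reinvoke fn with the persisted state until done."""
    event = {}
    while True:
        result = fn(event, None)
        step = result.get('lsiw_next_step')
        if step is None:
            return result
        # SQS steps carry state in 'messageBody'; EventBridge steps in 'detail'.
        payload = step.get('messageBody') or step.get('detail')
        event = json.loads(payload)

print(run_locally(two_step_handler)['statusCode'])  # 200
```

The same driver works against the full three-step handler above once `NEXT_STEP_QUEUE_URL` is set; the point is that because state is threaded explicitly through each return value, the whole flow is unit-testable without AWS.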
S3's Data Gravity Expands: In-Place Processing and Enhanced Object Stores
S3 continues to be the bedrock of data lakes, and recent updates focus on pushing computation closer to the data, reducing data movement, and offering even finer-grained storage tiering.
S3 Direct Data Transformations: Beyond Lambda Triggers
One of the most impactful features is the introduction of S3 Direct Data Transformations. This allows for serverless, in-place processing of objects without needing a Lambda trigger or an external compute service. Instead, you define transformation policies directly on the S3 bucket or prefix. These policies leverage a new set of built-in transformation primitives or custom WebAssembly (Wasm) modules. Because these transformation policies can grow into large JSON documents, running them through a JSON to YAML converter can make them easier to review.
An example policy for redacting PII from a CSV file:
{
"Version": "2025-11-01",
"Statement": [
{
"Sid": "RedactPII",
"Effect": "Allow",
"Principal": { "AWS": "arn:aws:iam::123456789012:user/data-consumer" },
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::my-data-lake/raw-data/*.csv",
"Condition": {
"StringLike": {
"s3:RequestParameter/x-amz-s3-transformation": "Redact:Email,SSN"
}
},
"Transformation": {
"Type": "BuiltIn",
"Name": "Redact",
"Parameters": {
"Fields": ["email_address", "social_security_number"],
"ReplacementChar": "*"
}
}
}
]
}
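To make the policy's semantics concrete, here is a local Python sketch of what the built-in Redact primitive would do to a CSV object, driven by the same Fields and ReplacementChar parameters as the policy above. The function and its behavior are my approximation for illustration, not AWS's actual implementation:

```python
import csv
import io

def redact_csv(text, fields, replacement_char="*"):
    """Mask the named columns, mirroring the policy's Redact parameters."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for field in fields:
            if field in row and row[field]:
                # Replace every character so value length leaks nothing extra.
                row[field] = replacement_char * len(row[field])
        writer.writerow(row)
    return out.getvalue()

raw = "id,email_address,social_security_number\n1,a@b.com,123-45-6789\n"
print(redact_csv(raw, ["email_address", "social_security_number"]))
```

The appeal of the real feature is that this logic runs inside the S3 data plane at GET time, so the raw object never leaves the bucket unredacted.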
S3 Ultra-Sparse Access Tier: When Millicents Matter
AWS introduced the S3 Ultra-Sparse Access (USA) storage class, targeting data that is accessed truly rarely (think once a year or less) but still requires rapid retrieval when needed. On retrieval performance, USA sits between S3 Intelligent-Tiering's archive tier and Glacier Flexible Retrieval; on storage price, it undercuts both.
The storage pricing makes the positioning clear:
- S3 Intelligent-Tiering (Archive Access): ~$0.0025/GB/month storage.
- S3 Ultra-Sparse Access: ~$0.0005/GB/month storage.
- Glacier Deep Archive: ~$0.00099/GB/month storage.
As with any archive tier, the catch is retrieval: expect those storage savings to be clawed back through per-GB and per-request retrieval fees if your access patterns turn out to be less sparse than advertised.
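To reason about whether USA actually saves money for a given workload, a quick break-even model helps. The storage prices below come from the list above; the per-GB retrieval fees are placeholders I invented, since retrieval pricing isn't given here. Plug in real numbers before making any migration decision:

```python
# Storage prices from the tier comparison above ($/GB/month).
STORAGE_PER_GB_MONTH = {
    "intelligent_tiering_archive": 0.0025,
    "ultra_sparse_access": 0.0005,
    "glacier_deep_archive": 0.00099,
}
# ASSUMED retrieval fees ($/GB) -- illustrative placeholders only.
ASSUMED_RETRIEVAL_PER_GB = {
    "intelligent_tiering_archive": 0.01,
    "ultra_sparse_access": 0.02,
    "glacier_deep_archive": 0.05,
}

def annual_cost(tier, gb, retrievals_per_year):
    """Yearly cost for `gb` of data, read in full `retrievals_per_year` times."""
    storage = STORAGE_PER_GB_MONTH[tier] * gb * 12
    retrieval = ASSUMED_RETRIEVAL_PER_GB[tier] * gb * retrievals_per_year
    return storage + retrieval

# 1 TB accessed once a year, across all three tiers.
for tier in STORAGE_PER_GB_MONTH:
    print(tier, round(annual_cost(tier, 1024, 1), 2))
```

The shape of the model matters more than my placeholder fees: the sparser the access pattern, the more the storage term dominates and the better USA looks.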
Developer Experience & Observability Overhauls
Beyond raw compute and storage, AWS has been quietly improving the developer experience, especially around understanding complex, distributed serverless systems.
Native Distributed Tracing for Lambda & S3 Operations
The improvements to X-Ray integration for both Lambda and S3 are substantial. For Lambda, X-Ray now provides deeper insights into the underlying execution environment lifecycle, including SnapStart operations and LSIW state transitions. For S3, the big win is the native tracing of S3 Direct Data Transformations. Previously, troubleshooting issues with data transformations often meant relying on S3 access logs; now, X-Ray traces extend to the transformation layer.
aws lambda update-function-configuration \
--function-name MyLSIWFunction \
--tracing-config Mode=Active
aws s3api put-bucket-tracing-configuration \
--bucket my-data-lake \
--tracing-configuration '{"Status": "Enabled"}'
Advanced Resource Cost Attribution for Serverless
AWS has introduced a new Cost Attribution Engine for Lambda and S3. For Lambda, this means seeing a breakdown of costs associated with cold starts vs. warm invocations. For S3, the new engine can attribute costs down to individual object transformations. In a particular data lake scenario, a team found that their S3 Direct Data Transformation policy was generating 15% more cost than an equivalent Glue job due to frequent, small object retrievals.
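You can approximate the cold-vs-warm split yourself today if you log per-invocation records. The record shape below is invented for illustration; the GB-second price is Lambda's published x86 rate:

```python
def cost_breakdown(invocations, gb_second_price=0.0000166667):
    """Split Lambda compute cost into cold vs. warm invocations.

    Each record is a dict with 'memory_mb', 'duration_ms', and 'cold_start'
    keys -- a shape assumed here for illustration, not an AWS API.
    """
    totals = {"cold": 0.0, "warm": 0.0}
    for inv in invocations:
        gb_seconds = (inv["memory_mb"] / 1024) * (inv["duration_ms"] / 1000)
        bucket = "cold" if inv["cold_start"] else "warm"
        totals[bucket] += gb_seconds * gb_second_price
    return totals

sample = [
    {"memory_mb": 512, "duration_ms": 1300, "cold_start": True},
    {"memory_mb": 512, "duration_ms": 90, "cold_start": False},
    {"memory_mb": 512, "duration_ms": 95, "cold_start": False},
]
print(cost_breakdown(sample))
```

Even on this toy sample, one cold start outweighs two warm invocations on cost, which is the kind of signal the new attribution engine is meant to surface without the DIY bookkeeping.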
Security & Compliance: Hardening the Core
Security is non-negotiable, and AWS continues to enhance its capabilities, particularly for granular access control and supply chain integrity.
S3 Object Access Policies with Attribute-Based Access Control (ABAC) Enhancements
AWS has significantly matured its ABAC capabilities for S3. This moves beyond traditional resource-based access, enabling policies like "Only users tagged with project:phoenix can access objects tagged with project:phoenix and sensitivity:high".
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:PutObject"],
"Resource": "arn:aws:s3:::my-data-lake/*",
"Condition": {
"StringEquals": {
"aws:PrincipalTag/project": "${s3:RequestObjectTag/project}",
"aws:PrincipalTag/department": "${aws:ResourceTag/department}"
}
}
}
]
}
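The matching rule this policy expresses is simple to state in code. This toy evaluator (my illustration, not an IAM policy simulator) captures the tag-equality core of ABAC:

```python
def abac_allows(principal_tags, object_tags, keys=("project",)):
    """Allow access only when principal and object tags match on every key."""
    return all(
        key in principal_tags
        and key in object_tags
        and principal_tags[key] == object_tags[key]
        for key in keys
    )

print(abac_allows({"project": "phoenix"}, {"project": "phoenix"}))  # True
print(abac_allows({"project": "phoenix"}, {"project": "atlas"}))    # False
```

The power of ABAC is that this one rule scales with your tagging discipline: onboarding a new project means tagging principals and objects, not writing a new policy.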
Lambda Runtime Integrity Checks and Supply Chain Security
AWS has introduced Lambda Runtime Attestation and Image Signing Verification. Runtime Attestation uses hardware-backed trusted execution environments (TEEs) to verify the integrity of the underlying environment. Image Signing Verification allows you to enforce that only container images signed with approved signing profiles (backed by KMS keys) can be deployed.
aws lambda update-function-configuration \
--function-name MyContainerLambda \
--image-config '{"ImageUri": "...", "SigningProfileArn": "arn:aws:signer:us-east-1:123456789012:/signing-profiles/MyLambdaSigningProfile"}'
Expert Insight: The Blurring Lines of Compute and Storage
What these announcements collectively highlight is a clear strategic direction from AWS: the persistent blurring of lines between compute and storage. Traditionally, we've moved data to compute. With LSIW, S3 Direct Data Transformations, and even enhanced SnapStart, we're increasingly seeing compute either embedded within storage services or designed to operate so intimately with them that the distinction becomes academic.
My prediction is that we will see further advancements in "data-aware compute" where the S3 data plane itself starts offering more sophisticated query and transformation capabilities natively. Architects will need to think less about ETL (Extract, Transform, Load) and more about ELT (Extract, Load, Transform) where the transformations are executed at the point of access.
Conclusion
The re:Invent 2025 announcements are not about flashy new services, but about the maturation and deep optimization of core primitives. Lambda's journey towards more persistent and stateful execution, coupled with S3's expanded in-place data processing capabilities, offers developers powerful new tools. The key takeaway for senior developers is to embrace these nuanced capabilities, scrutinize their cost implications, and critically evaluate how they can simplify complex workflows by leveraging compute closer to the data. This isn't just about building faster; it's about building smarter.
This article was published by the **DataFormatHub Editorial Team**, a group of developers and data enthusiasts dedicated to making data transformation accessible and private. Our goal is to provide high-quality technical insights alongside our suite of privacy-first developer tools.
This article was originally published on DataFormatHub, your go-to resource for data format and developer tools insights.