Sesank Munukutla (Naga)

What Building a Zero Trust Serverless Architecture on GCP Taught Me (Including the Mistakes)

Coming from a stronger AWS background, I decided to stop comparing cloud providers from documentation and instead build something end-to-end on Google Cloud Platform.

The objective was simple:

Build a secure serverless architecture using:

  • Terraform
  • Cloud Run
  • Cloud Armor
  • Global Load Balancer
  • Cloud Logging
  • Cloud Monitoring

The actual experience turned out to be much more interesting than I expected.


The Original Plan

My initial architecture looked like this:

Internet --> Global Load Balancer --> Cloud Run

Simple. Deploy infrastructure.
Validate connectivity.
Write a blog.
Done.

That lasted about 30 minutes.


First Reality Check: Cloud Run Isn't AWS Lambda

One mistake I made initially was trying to think in AWS patterns.

In AWS, my instinct would be:

CloudFront
   │
WAF
   │
ALB
   │
Lambda

GCP approaches this differently.

To place Cloud Run behind a load balancer, I had to learn about:

Serverless Network Endpoint Groups (NEGs)

That was my first major lesson.

The cloud services may look similar.

The architecture patterns are not.
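
As a rough illustration of the serverless NEG pattern, the Terraform sketch below wraps a Cloud Run service in a serverless NEG and attaches it to a backend service. All names, the region, and the Cloud Run service name are hypothetical placeholders, not taken from the actual project, and a complete load balancer would also need a URL map, target proxy, and forwarding rule.

```hcl
# Minimal sketch: every name and the region here are assumptions.
resource "google_compute_region_network_endpoint_group" "cloud_run_neg" {
  name                  = "cloud-run-neg"
  region                = "us-central1"
  network_endpoint_type = "SERVERLESS"

  # Points the NEG at an existing Cloud Run service (hypothetical name).
  cloud_run {
    service = "my-cloud-run-service"
  }
}

# The backend service is what the load balancer's URL map routes to.
resource "google_compute_backend_service" "cloud_run_backend" {
  name = "cloud-run-backend"

  backend {
    group = google_compute_region_network_endpoint_group.cloud_run_neg.id
  }
}
```

The NEG plays roughly the role a target group plays in front of Lambda on AWS: it is the piece that lets the load balancer address a serverless service as a backend.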


The First Error

My Terraform deployment failed because I accidentally duplicated resources across multiple files.

Terraform wasn't happy.

Error:

Duplicate resource configuration

The issue wasn't Terraform.

The issue was my project structure.

I had the same resources declared in multiple files while reorganizing the code.

Lesson learned:

Keep resource ownership clear when splitting Terraform configurations.


The Authentication Surprise

At one point I exposed the application through the load balancer and expected traffic to work.

Instead I received:

403 Forbidden

Initially I assumed something was broken.

It wasn't.

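Cloud Run was enforcing IAM authentication exactly as designed: a request that does not come from an identity allowed to invoke the service is rejected, even when it arrives through the load balancer.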

That became one of the most important lessons of the project.

Just because a load balancer is public doesn't mean the backend becomes public.
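
For illustration, this is roughly what an explicit invoker grant looks like in Terraform. The post does not say how the 403 was resolved, so treat this as a sketch of the IAM control involved rather than the fix that was applied. The region, service name, and service account are hypothetical.

```hcl
# Minimal sketch: region, service, and member are assumptions.
resource "google_cloud_run_service_iam_member" "invoker" {
  location = "us-central1"
  service  = "my-cloud-run-service"
  role     = "roles/run.invoker"

  # Only this identity may invoke the service.
  # Using "allUsers" here instead would make the service public.
  member = "serviceAccount:caller@my-project.iam.gserviceaccount.com"
}
```

Keeping the member list narrow is what makes identity, not network position, the deciding factor for access.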


Adding Cloud Armor

The next step was implementing Cloud Armor.

I configured protections for:

  • SQL Injection
  • Cross-Site Scripting (XSS)
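
A policy covering these two protections might look something like the sketch below. It assumes the preconfigured `sqli-stable` and `xss-stable` rule sets; the policy name and priorities are hypothetical, and the actual policy used in this project may differ.

```hcl
# Minimal sketch: policy name and priorities are assumptions.
resource "google_compute_security_policy" "waf" {
  name = "zt-waf-policy"

  rule {
    action      = "deny(403)"
    priority    = 1000
    description = "Block SQL injection"
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('sqli-stable')"
      }
    }
  }

  rule {
    action      = "deny(403)"
    priority    = 1001
    description = "Block cross-site scripting"
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('xss-stable')"
      }
    }
  }

  # Default rule: allow anything the rules above did not deny.
  rule {
    action      = "allow"
    priority    = 2147483647
    description = "Default allow"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
  }
}
```

The policy only takes effect once it is attached to the load balancer's backend service, which in Terraform is done through the `security_policy` argument of `google_compute_backend_service`.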

Then I started testing.

Example:

curl -G "http://LOAD_BALANCER_IP/" --data-urlencode "id=' or 1=1 --"

Response:

403 Forbidden

Exactly what I wanted.

The attack never reached the backend service.

Cloud Armor blocked it first.


Monitoring Was Harder Than Expected

One of the more frustrating issues appeared while configuring Cloud Monitoring.

Terraform kept failing.

The alert policy looked correct.

The deployment did not.

After troubleshooting I discovered:

resource.type was missing from the monitoring filter

A small configuration issue.

A lot of wasted troubleshooting time.

But that's also where the learning happened.
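
To make the fix concrete, here is a sketch of an alert policy whose filter names both `resource.type` and `metric.type`. The metric, threshold, and display names are assumptions chosen for illustration; the post does not say which metric the original policy watched.

```hcl
# Minimal sketch: metric choice, threshold, and names are assumptions.
resource "google_monitoring_alert_policy" "cloud_run_5xx" {
  display_name = "Cloud Run 5xx responses"
  combiner     = "OR"

  conditions {
    display_name = "5xx request rate above threshold"

    condition_threshold {
      # The filter names the monitored resource type as well as the metric.
      filter          = "resource.type = \"cloud_run_revision\" AND metric.type = \"run.googleapis.com/request_count\" AND metric.labels.response_code_class = \"5xx\""
      comparison      = "COMPARISON_GT"
      threshold_value = 1
      duration        = "60s"

      aggregations {
        alignment_period   = "60s"
        per_series_aligner = "ALIGN_RATE"
      }
    }
  }
}
```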


What Surprised Me Most

Three things stood out.

1. Identity Is the Security Boundary

Cloud Run's IAM integration is powerful.

Instead of relying primarily on network controls, identity becomes the primary enforcement mechanism.


2. Cloud Armor Was Easier Than Expected

I expected WAF integration to be complicated.

The integration with the load balancer was straightforward.

The difficult part wasn't deployment.

It was understanding where protection was actually occurring: at the load balancer, before a request ever reaches Cloud Run.


3. Observability Matters More Than Blocking

Blocking attacks is useful.

Seeing blocked attacks is better.

The logs ended up being just as valuable as the WAF itself.

Without visibility:

  • you cannot investigate
  • you cannot validate
  • you cannot improve
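
One way to turn blocked requests into something you can chart and alert on is a log-based metric over the load balancer's request logs. The sketch below assumes request logging is enabled on the backend service and that Cloud Armor decisions appear under `jsonPayload.enforcedSecurityPolicy`; the metric name is hypothetical.

```hcl
# Minimal sketch: metric name is an assumption, and request logging
# must be enabled on the backend service for these entries to exist.
resource "google_logging_metric" "armor_denied" {
  name   = "cloud_armor_denied_requests"
  filter = "resource.type=\"http_load_balancer\" AND jsonPayload.enforcedSecurityPolicy.outcome=\"DENY\""

  # Counter metric: one increment per matching log entry.
  metric_descriptor {
    metric_kind = "DELTA"
    value_type  = "INT64"
  }
}
```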

What I Learned

This project wasn't really about learning Cloud Run.

It was about understanding how security architecture changes between cloud providers.

Some takeaways:

  • Cloud services with similar names often solve problems differently.
  • Identity-based security is more important than network-based assumptions.
  • WAF rules are only valuable if you can validate them.
  • Monitoring should be treated as part of security architecture.
  • terraform destroy is just as important as terraform apply.

If I Did It Again

I would:

  • Design the architecture first.
  • Map security controls before deployment.
  • Plan observability from the beginning.
  • Keep Terraform modules cleaner.
  • Capture evidence throughout implementation instead of after.

Final Thoughts

The biggest lesson wasn't learning another cloud platform.

It was realizing that cloud migrations are often security architecture migrations.

Applications are usually the easy part.

Identity models, monitoring, logging, ingress controls, and security enforcement are where the real work happens.

And that's where most of the interesting engineering challenges live.
