Maame Afua A. P. Fordjour

Posted on Mar 12

I Got Lost in Canary Wharf for 30 Minutes, But I Found the Future of SRE

#sre #cloud #opensource #networking

If you've ever been to Canary Wharf, you know it's less of a business district and more of a high-stakes escape room designed by someone who really dislikes Google Maps.

I arrived for SRE Day London 2026 at the Everyman Cinema feeling prepared. I had my bag, my notes, and a general idea of where I was going. Fast forward 30 minutes, and I was still wandering around Crossrail Place like a lost protagonist in a sci-fi movie. Between the multiple levels and the hidden "Level -2" entrance, the frustration was real. I spent nearly half an hour pacing back and forth, trying to figure out how to actually get into the venue. Honestly, I probably hit 2k of my daily steps just walking in circles looking for the exact location. For some weird reason, Google Maps and Apple Maps don't work well in Canary Wharf, so I did some research. Apparently it is due to something called the urban canyon effect: tall buildings block and reflect GPS signals, which throws off your position. If you are interested in reading more about how that affects GPS signals, you can read this article: urban canyon

But honestly? The second I stepped inside, the frustration evaporated.

The Swag Haul

First things first: the swag. As a student, you quickly learn that the quality of an event is often proportional to the stickers on the table. SRE Day did not disappoint. I loaded up on fridge magnets and stickers, but the highlight was definitely the SRE Day t-shirts. I managed to cop one, and let's just say it's going straight into my weekly rotation.

The Knowledge Drop: Morning to Midday

The talks kicked off at 09:00, and sitting on those comfortable Everyman sofas made it feel more like a movie premiere than a tech conference. Here is a breakdown of what I learned before 15:00:

Peter Marshall (Imply): He opened with a keynote on Decoupled Observability. The big takeaway here was that we are often limited by "tightly coupled" architectures where data is stuck to specific tools. By decoupling the data layer, we can scale our detection and investigation without the costs spiralling out of control.

Dewan Ahmed (Harness): He talked about Secure by Default in AI-driven delivery. As we move faster with AI, we run the risk of "automating insecurity at scale." He challenged us to look beyond just scanners and build confidence directly into the pipeline.

Matt Henderson (Phoebe): This was fascinating: he compared software reliability to the human immune system. Instead of just reacting to alerts and scrambling when things break, we need systems that can predict and prevent failures, just like our bodies handle threats before we even feel sick.

Tyler Hannan (ClickHouse): He asked a controversial question: Do Metrics Matter? While metrics are the fastest way to see if a system is unhealthy, Tyler pointed out that as systems get more complex, metrics alone aren't enough to understand the "why" behind unpredictable failures.

Birol Yildiz (ilert): He showed AI SRE in action. We're moving toward a world where AI agents can diagnose and remediate outages autonomously. Imagine an incident fixing itself without anyone getting paged at 3 AM!

Lunch and the Afternoon Sprint

After a much-needed pizza break (shoutout to incident.io for powering that!), we dove into the afternoon sessions:

Deniz Yalcin & William Ravensbergen (ING): They reminded us that reliability starts with Customer Data. In banking, if upstream data is malformed or delayed, everything else fails, from fraud prevention to user trust. It's an underrated SRE dependency.

Adriana Villela (Dynatrace): She emphasized that Observability is a Team Sport! We often fall into the trap of creating "Observability Silos" just like we did with DevOps. To succeed, observability needs to be integrated across the whole organization, not just a single team.

Heather Thacker (Gatling): She broke down the Performance Testing Arsenal. It's one thing for your app to work on a laptop, but another to handle 10x traffic during a marketing campaign. She covered load, stress, and soak testing—essential tools for any SRE.
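
To make the difference between those three test types concrete for myself, here is a minimal Python sketch of the traffic shape each one typically uses. This is not from Heather's talk and it does not use Gatling's API; the function names, the baseline rate, and the durations are all hypothetical values picked for illustration.

```python
from typing import List

BASELINE_RPS = 100  # hypothetical "normal" traffic, in requests per second


def load_profile(minutes: int, target_rps: int = BASELINE_RPS) -> List[int]:
    """Load test: hold the expected traffic level for a short window."""
    return [target_rps] * minutes


def stress_profile(steps: int, step_rps: int = BASELINE_RPS) -> List[int]:
    """Stress test: raise traffic step by step to find the breaking point."""
    return [step_rps * (i + 1) for i in range(steps)]


def soak_profile(hours: int, target_rps: int = BASELINE_RPS) -> List[int]:
    """Soak test: hold normal traffic for hours to surface slow leaks."""
    return [target_rps] * (hours * 60)


if __name__ == "__main__":
    print(load_profile(5))         # one entry per minute at baseline
    print(stress_profile(10))      # last step is 10x the baseline
    print(len(soak_profile(8)))    # one-minute samples across 8 hours
```

Each list is just requests per second, sampled once per minute (or once per step for the stress profile). A real tool would replay these shapes against your service and record latency and error rates along the way.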

Tasmia Niazi: This session was super relatable for me. She shared her journey from learner to leader, explaining that SRE is a mindset and a culture, not just a job title. Adopting that state of mind is what builds resilient teams.

Getting Involved: Tracer Cloud and Open Source

During the networking and sponsor crawl at 14:30, I made the coolest discovery: Tracer Cloud's Open SRE Agent. It's a tool focused on cloud-native alert investigation, using AI to figure out root causes before humans even have to step in.

Because I've been looking for ways to get more "hands-on" experience, I officially signed up to be a contributor! If you want to jump in and contribute to the repo as well, you can find it here:

Tracer-Cloud / opensre

Build your own AI SRE agents. The open source toolkit for the AI era ✨

Open SRE — Build Your Own AI SRE Agents

An open-source framework so you can build AI-powered SRE agents that automate incident investigation and root cause analysis. Plug in the alerting sources you already use (Slack, Grafana, Datadog, PagerDuty and more), and compose custom workflows tailored to your infrastructure.

Quick Start

git clone https://github.com/Tracer-Cloud/open-sre-agent
cd open-sre-agent
make install
make install-hooks
cp .env.example .env
# run opensre onboard to configure your local LLM provider
# and optionally validate/save Grafana, Datadog, Slack, AWS, GitHub MCP, and Sentry integrations
opensre onboard
make local-grafana-live

Choose a Path

Local Grafana RCA Demo: Run Tracer against a real local Grafana + Loki stack and get a first RCA report with one command. Start here: Local Grafana RCA Demo
Bundled Local RCA Demo: Skip Docker and run a bundled alert plus bundled evidence fixture…

Contributing to open source is one of the best ways to move from theory to reality.

What's Next?

Despite the 30-minute maze-running session at the start, SRE Day was a massive win. I came for the t-shirt, but I left with a contributor invite and a much clearer picture of where the industry is heading.

If you are curious about cloud or have a deep interest in reliability, it doesn't matter if you're not in London! They host events all around Europe. You should definitely subscribe to their page on Luma to check out their upcoming events.

Also, if you're looking to give back, they have an option to help out as a Community Hero! It's a great way to support the ecosystem while growing your own network.

Now, if someone could just build an agent to help me navigate Canary Wharf next time... that would be great because honestly, I became Maame 'the explorer' in those 30 minutes😂.

Top comments (23)

Harsh • Mar 12

Great write-up! The urban canyon effect is real. Canary Wharf is basically a GPS black hole. Next time use the indoor maps in the Crossrail Place app, they actually work underground.

The sessions you covered are gold. The decoupled observability point from Peter Marshall is crucial: we've been fighting tightly coupled telemetry pipelines at work and it's a nightmare. Also, Birol's point about AI agents fixing incidents without paging anyone? That's the dream. 😅

Quick question: any demos from the Tracer Cloud agent at the event? Curious how it handles multi-cloud scenarios.

Maame Afua A. P. Fordjour • Mar 12

With the Tracer Cloud agent, I only forked the repo today when I got home from the event, so I will play around with it a bit to see how efficient it is. From the debrief I got from one of the founders, he mentioned that it's an AI agent that investigates data pipeline incidents automatically: it pulls in logs, metrics, traces, and configs from tools like Airflow, Kafka, Grafana, and Datadog, then generates a root cause report. I think the whole concept is a smart on-call engineer that does the detective work for you. As for how it handles multi-cloud scenarios, I am not sure, since I haven't tried it myself. But as time goes on, I will share feedback on that, Harsh.

vincenthus • Mar 16

Thank you for the kind words :)

I hope we can set you up soon with a real-world demo so you can personally try it as well.

One small clarification: we started in data pipelines, and Tracer is indeed very strong there, but we have already expanded to a much broader set of use cases, including Kubernetes and web applications.

Have replied in a separate comment on the multi cloud question.

Maame Afua A. P. Fordjour • Mar 16

Nice one! Glad you were able to get back to Harsh :)

vincenthus • Mar 16

Hi Harsh, Vincent here, co-founder of Tracer.

Thanks for the great question!

Multi-cloud is generally not a problem for Tracer when the data comes through platforms like Grafana or Datadog, since they already aggregate across cloud environments.

For teams that prefer not to send all logs or telemetry to an observability vendor, whether for cost, privacy, or self-hosting reasons, Tracer also supports direct cloud integrations. Right now that is AWS, and we plan to add more cloud-provider connectors over the coming weeks and months.

What cloud providers are you using?

Richard Djarbeng • Mar 14 • Edited

Nice post! Took me a while to realise that the words in brackets are probably the organizations they work for.
With Matt Henderson (Phoebe), my first thought was that maybe Matt showed up with another person called Phoebe.

Maame Afua A. P. Fordjour • Mar 15

Haha no, it's the name of their cloud services 😂😂 Really strange names.

Victor Okefie • Mar 12

The urban canyon effect broke your GPS, but it also broke your assumption that directions are reliable. That's the SRE lesson wrapped in a metaphor: you don't know your system's failure modes until the signal drops. The lost 30 minutes taught you more about Canary Wharf than a map ever could.

Maame Afua A. P. Fordjour • Mar 12

I honestly knew nothing about the urban canyon effect till today. It is quite interesting how it can affect GPS. And I definitely will not depend 100% on directions from today 😂. Tbh, in situations like this I think a physical map would be more helpful.

vincenthus • Mar 16

Hi Maame, thank you so much for joining our open-source community and for opening the first few issues. We really appreciate it!

Also, thank you for the great write-up. You captured the event and the ideas behind it really well.

Great to have you with us and look forward to catching up soon 💪

Maame Afua A. P. Fordjour • Mar 16

Awesome Vincenthus! You would be the best person to answer the questions :)

Julien Avezou • Mar 12

Great post! I have really been enjoying your posts these last weeks Maame.
The day I realized I accumulated too much swag from events and my previous company was the day I had to move flats haha

Maame Afua A. P. Fordjour • Mar 12

Thank you Julien! Really glad you like it. I do agree about the swag being too much. I have so many stickers for both of my laptops that I'd have to put the rest on my forehead at this rate 😂

Maame Afua A. P. Fordjour • Mar 13

Exactly Luftie! It is quite interesting how we learn every day, and there are so many interesting theories behind stuff. Thanks for the support Luftie, and yeah, fingers crossed! Let's see where tech leads us :)

klement Gunndu • Mar 15

The "urban canyon effect" messing with GPS is a perfect metaphor for SRE — infrastructure that looks fine from outside but has hidden signal loss. Curious if the decoupled observability talk addressed how to detect those silent signal gaps before they become outages.

Maame Afua A. P. Fordjour • Mar 15

They didn’t address that from what I remember Klement

Harjot Singh • Jun 1

getting lost in a place like Canary Wharf sounds like quite the adventure. it's wild how technology can fail us in such a high-tech area. on another note, if you're ever looking to spin up a project quickly, Moonshift lets you deploy a full next.js + postgres + auth setup in about 7 minutes, with the code on your github. hit me up if you want to give it a free run.

sam jha • Mar 19

Love how you turned getting lost into a great observation about the future of SRE. The shift toward AI-assisted incident response is real — I've seen teams go from "alert fatigue" to actually meaningful pages by layering smarter routing and context-aware suppression. The "you can't automate what you don't understand" point is especially important. A lot of teams are jumping to AI ops without having solid runbooks first, and it shows. Thanks for writing this up — always appreciate when event writeups capture the hallway conversations, not just the keynotes.
