Taavi Rehemägi for Dashbird

Posted on Feb 25, 2021 • Edited on Apr 22, 2021 • Originally published at dashbird.io

Introducing: Dashbird's serverless Well-Architected Insights feature

#cloud #serverless #aws

TL;DR: Dashbird now scans your serverless infrastructure for industry best practices. It's the antidote for chaos.

We're excited to introduce the Dashbird Well-Architected Insights -- a continuous insights scanner combined with Well-Architected reports. The new feature provides serverless developers with insights and recommendations to continually improve their applications and keep them secure, compliant, optimized, and efficient.

Dashbird is an end-to-end observability platform with failure detection capabilities, so it already holds a great deal of data for its users. The logical next step was to build a new abstraction layer on top of that data: something like a trusted advisor that continuously oversees the posture and state of the application.

Over the last three years, we have worked with thousands of companies building serverless applications in the cloud, helping them scale and operate their online services. In that time, we have observed that as cloud applications grow more complex, it becomes increasingly difficult to manage security, cost, and performance, comply with best practices, and maintain the posture of the environment.

The reason for these challenges is twofold: the large number of moving parts in the infrastructure and a lack of best-practice know-how across engineering teams.

Deeply rooted in community know-how and the Well-Architected Framework

Inspired by the best practices of the AWS Well-Architected Framework, Dashbird continuously runs over 80 checks against users' serverless infrastructure, giving them actionable advice on how to align their application with the Well-Architected best practices in each of its five pillars:

Cost Optimization
Security
Reliability
Performance Efficiency
Operational Excellence

The checks range from detecting metric anomalies, such as increased latency, a rising error rate, or memory usage close to the limit, to reviewing configuration settings and posture, such as unused resources or missing security practices.
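To make the metric-based checks more concrete, here is a minimal Python sketch of two of them: memory usage close to the limit and an elevated error rate. This is not Dashbird's implementation. The `InvocationStats` record, the function names, and the thresholds (90% of the memory limit, a 5% error rate) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class InvocationStats:
    """Hypothetical summary of recent invocations for one function."""
    function_name: str
    memory_limit_mb: int
    max_memory_used_mb: int
    invocations: int
    errors: int


def check_memory_headroom(stats, threshold=0.9):
    """Flag a function whose peak memory use is at or above `threshold` of its limit."""
    ratio = stats.max_memory_used_mb / stats.memory_limit_mb
    return ratio >= threshold


def check_error_rate(stats, max_rate=0.05):
    """Flag a function whose error rate exceeds `max_rate`; no traffic means no finding."""
    if stats.invocations == 0:
        return False
    return stats.errors / stats.invocations > max_rate


# Made-up numbers: peak memory is 487 MB of a 512 MB limit, with 12 errors in 1000 calls.
stats = InvocationStats("checkout-handler", 512, 487, 1000, 12)
print(check_memory_headroom(stats))  # flagged: 487/512 is above the 0.9 threshold
print(check_error_rate(stats))       # not flagged: 1.2% is below the 5% threshold
```

A real scanner would pull these numbers from logs and metrics on a schedule instead of hard-coding them, and would tune thresholds per check.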

Checks are categorized by criticality and vertical, giving users a structured overview of the findings on a single pane of glass.

All of the checks are also published in the Events Library, with details on intervals, conditions, reasoning and, for some insights, remedy steps.

Instantly spot risky and inefficient parts of your cloud stack in a single pane of glass report

When the scanner finds misconfigurations, best practice violations, or inefficient or problematic resources across AWS Lambda, API Gateway, DynamoDB, ECS, Step Functions, SQS, and Kinesis streams, it aggregates them into a single report.

For each detected insight, the user is also presented with an explanation of the issue and a step-by-step guide for troubleshooting.

Broken down by the five pillars of the Well-Architected Framework, the report shows how well you're doing against each pillar on a progress bar, as well as a breakdown of your resources by region and service, which gives you a clear overview of where you have the most issues. Additionally, you can expand each listed event to see its details and the exact resources affected.

Results of all the checks Dashbird is constantly running against your system: how many of them were passed successfully, how many are critical, what you should keep your eye on.
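The kind of aggregation behind such a report can be sketched in a few lines of Python. Again, this is only an illustration under stated assumptions, not how Dashbird builds its report: the `Finding` record, the `summarize` and `pass_rate` helpers, and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one failed check on one resource."""
    pillar: str
    region: str
    service: str
    criticality: str
    resource: str


def summarize(findings):
    """Count findings per pillar, region, and service."""
    return {
        "by_pillar": Counter(f.pillar for f in findings),
        "by_region": Counter(f.region for f in findings),
        "by_service": Counter(f.service for f in findings),
    }


def pass_rate(passed, total):
    """Share of checks passed, as a 0-100 percentage suitable for a progress bar."""
    return 100.0 if total == 0 else round(100.0 * passed / total, 1)


# Made-up findings for illustration only.
findings = [
    Finding("Security", "us-east-1", "Lambda", "critical", "checkout-handler"),
    Finding("Cost Optimization", "us-east-1", "DynamoDB", "warning", "orders-table"),
    Finding("Security", "eu-west-1", "API Gateway", "warning", "public-api"),
]
summary = summarize(findings)
print(summary["by_region"].most_common(1))  # the region with the most findings
print(pass_rate(passed=76, total=80))       # percentage of checks passed
```

Counting by pillar, region, and service separately keeps each view cheap to compute and makes it easy to see where issues cluster.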

Arming teams with knowledge and eliminating heavy lifting

When testing early versions of the reports with our users, we saw knowledge inside teams increase significantly, with engineers becoming more aware of the state of their applications and of how to build serverless products.

It takes a lot of effort to work through the Well-Architected whitepapers and to internalize them inside the organisation. Therefore, one of the main reasons for releasing this feature was to break that knowledge down into bite-sized pieces that are as easy to understand as possible, so teams can learn serverless best practices on the go, by doing. We've already seen this educate our user base in various disciplines that contribute to security, reliability and compliance improvements.

Another benefit users are experiencing is a reduction in manual tasks. In complex cloud environments with hundreds or even thousands of resources, going over the metrics, logs, configurations and workings of each resource is a tremendous challenge, and problems often slip through unnoticed. In many cases, it's also hard for developers to detect waste or potential threats, which makes automated checks that run at short intervals crucial.

The importance and benefits of being Well-Architected:

Build and deploy reliable systems faster

Customers often start by experimenting, and workloads tend to grow organically as more is added. During this growth, deviations from best practice are common. For this reason, it's important to align what you already have with industry best practices to ensure faster deployment and a better security posture.

Lower or Mitigate Risks

Risks span all five pillars of the Well-Architected Framework (Reliability, Performance, Operational Excellence, Cost, and Security), and continuously optimizing your system will lower or mitigate those risks over time. It's common for serverless users not to know exactly what their risk profile is, which becomes a big worry for C-level executives who start asking "where do we sit in our risk profile?" and "how do we work to reduce that over a 6-12 month period?"

Make Informed Decisions

Tracking the activity, health, performance, cost and other crucial metrics of your system enables data-driven decision making. Having visibility into how changes to the infrastructure affect the customer-facing properties of the system is critical. This is also at the core of Dashbird's features.

Learn AWS Best Practices

We have found that the happiest serverless users are those who feel well-educated. By instilling and encouraging serverless best practices, Dashbird Well-Architected Insights offers a simple way to continuously learn about building truly resilient serverless infrastructures by doing, without having to spend hours or days going through lengthy whitepapers.

Dashbird is the leading serverless monitoring platform in the market

Started three years ago, Dashbird is an end-to-end monitoring platform for debugging, troubleshooting, monitoring, and optimizing serverless applications. So far, the platform has attracted thousands of serverless enthusiasts and has been adopted by the leading enterprises relying on serverless. The platform is the only completely frictionless service in the market, relying solely on the outputs of the system and not requiring any code instrumentation. Customers enjoy the 2-minute setup process, intuitive UI, and prebuilt alarms and insights, and can start debugging and working with their data immediately after signing up.