DEV Community: João Guimarães

Performance Development

João Guimarães — Tue, 09 Jun 2020 00:00:00 +0000

Abstract

Carefully planning all stages of development helps set realistic expectations in terms of architectural decisions, roadmaps, costs and technical debt, among other aspects.

Opening and keeping a communication channel between the engineering and product teams is an important part of the process and should be taken seriously.

The concepts of load testing and stress testing add valuable information about the performance of your application and how you can act to mitigate degradation of service you provide to the users.

Opinions are my own and not of my employer.

The concept

Performance has become an important piece in development as it provides historical and real-time value on how an application (ex: a micro-service) reacts to different demands. With this data, we can also tailor our infrastructure to be optimised to those demands.

Performance development is thinking through every step in our development and it should be considered as a daily practice.

We as developers already take advantage of many tools to help us comply with code standards (ex: ESLint or Commitlint), so why can't we use tools that enable us to comply with performance targets as well?

So how can we track performance?

Tracking performance

Performance can be impacted by two factors, the codebase and the provisioned infrastructure. Monitoring/Observability is implied in both.

The codebase can contain poor architectural decisions that affect it in many ways, such as maintainability and memory and CPU consumption, among others.

The infrastructure can also have a huge impact on performance, mainly if it is undersized, which can crash the application for no apparent reason, or if poor decisions lead to scaling the infrastructure vertically.

Both factors are strongly connected in the early stages, as you need to provision infrastructure to deploy the codebase.

Monitoring can have a heavy impact on the codebase, but with a well-thought-out observability implementation it can definitely provide valuable insights into the performance of the codebase and infrastructure.

If you are interested in Monitoring, please read my Modern Monitoring post for a more in-depth analysis.

Monitoring incurs additional costs. Logs can drastically affect performance (ex: throughput of a micro-service). Infrastructure monitoring also incurs costs with your cloud provider.

If you rely on a cloud provider to manage and deploy, consider starting with a small compute engine machine, or a low-impact image when using container orchestrators such as Kubernetes. These are also valid for on-premises solutions.

For serverless solutions, the scenario is slightly different since the infrastructure is mainly managed by the provider itself. But apart from this difference, the rest should also apply.

As new features are developed, chances are your application will get more "greedy" and demanding or perhaps the user demand itself changed. Running the load tests against these new scenarios will detect new issues.

Now we can act upon that!

We can review/refactor the codebase and/or apply changes to the provisioned infrastructure.

Development process

As mentioned above, starting a new project often involves setting up code analysis tooling and that gets defined somewhere or by someone, normally a company or a team policy.

With this common ground in mind, we also need "policies" for what we are targeting in terms of business value. This somewhere or someone is the stakeholders/product team. These "policies" can be defined as SLAs but we will discuss more about this later on.

This means we need numbers to track performance and develop our process of measuring it.

Continuing with a micro-service example, what's the expected maximum number of simultaneous requests hitting it? And how can we guarantee that our codebase is up to that challenge and our infrastructure is well provisioned to withstand that demand? The answer is load testing.

The stakeholders/product expectations define our threshold to work with.

With this threshold in mind, a well-implemented load testing step assures us that no modification to the codebase or infrastructure makes performance fall below the threshold. And if it comes to that, we can immediately take action and make amends or changes to surpass it.
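
As a minimal sketch of such a threshold check, assume the load test run gives us a list of latencies in milliseconds and a count of failed requests. The function names, the 95th percentile as the chosen marker and the numbers are all assumptions for illustration; your stakeholders define the real threshold.

```python
import statistics

def p95(latencies_ms):
    """Return the 95th percentile of the observed latencies (milliseconds)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]

def meets_threshold(latencies_ms, error_count, max_p95_ms, max_error_rate):
    """Check one load test run against the agreed threshold."""
    error_rate = error_count / len(latencies_ms)
    return p95(latencies_ms) <= max_p95_ms and error_rate <= max_error_rate

# Hypothetical run: 100 requests with latencies from 101 ms to 200 ms, 1 failure.
samples = [100 + i for i in range(1, 101)]
print(meets_threshold(samples, error_count=1, max_p95_ms=250, max_error_rate=0.02))
```

A pipeline step could fail the build whenever this check returns False.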

Don't run load testing blindly! Your consumers are your own. Act accordingly.

If stakeholders/product fail to provide solid and reliable load scenarios, then the architecture/development is crippled. The same goes the other way around: if the designed and implemented architecture is not optimised and maintainable, then most likely you will fail to deliver your service within the stipulated SLAs.

Load testing can be effective at identifying real markers such as latency, 3rd-party response timeouts, or any other particular dependency your application has.

How can we identify all of these?

In order to gather, digest, correlate and analyse all these markers we need an observable system, as mentioned above.

Monitoring & Observability

Observability describes how accurately we can observe the external outputs of a system and infer its internal state from them.

A system should have health checks, metrics, log entries and end-to-end tracing in order to be considered observable.

Observability is not Monitoring!

Performance development requires a good monitoring & observability system capable of retrieving valuable information from your codebase and infrastructure.

Every action depends on the accuracy and reliability of all the data types a system provides. Choose your service wisely and, above all, make sure it provides most of the tools you need. Spreading the data across different services makes it harder to maintain and keep in sync.

Load Testing

As we have been discussing, load testing evaluates the performance of an application allowing us to detect issues with the current infrastructure and codebase.

But how can we add this step to our development process?

Run against the production environment;
Provision an environment that mirrors production.

Please read Load testing micro-services by João Tiago for a deep dive on a real example of the first approach.

The approach you choose to take depends on how much value it adds to the application. I choose the second approach because it lays the foundation for stress testing, which is described in the next section.

If your application depends on 3rd party APIs you may need to contact your liaison to understand what impact running the tests will have on their servers. Solutions to this problem are for them to provision a sandbox or for you to mock their API.

There's a chance the 3rd party starts blocking requests from your load tests if you don't contact them for clarification on their limitations if any.

If you mock their API, pay special attention to mocking their limitations as well, such as maximum timeouts, throttling, etc.
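
A minimal sketch of a mock that reproduces a throttling limit is shown next. The class name, the fixed-window strategy and the limit values are assumptions made for illustration; a real 3rd party may throttle differently, so ask them for their actual rules.

```python
class ThrottledMock:
    """Fake 3rd-party API that rejects calls above a per-window request limit."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_start = None
        self.count = 0

    def handle(self, now):
        """Return an HTTP-like status code for a request arriving at time `now`."""
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            # A new fixed window starts with this request.
            self.window_start = now
            self.count = 0
        self.count += 1
        # 429 Too Many Requests once the window limit is exceeded.
        return 200 if self.count <= self.max_requests else 429

mock = ThrottledMock(max_requests=2, window_seconds=1.0)
statuses = [mock.handle(t) for t in (0.0, 0.1, 0.2, 1.5)]
print(statuses)  # [200, 200, 429, 200]
```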

The infrastructure provisioned MUST be identical to production but with the difference that it will be an ephemeral environment.

Consider using the CI/CD pipeline workflow to trigger the load testing step before the application is deployed to production, although it can also be a parallel step which won't affect the deployment to production. The key takeaway is that it's a really important step to keep tabs on the performance of your application, and should be part of the pipeline workflow.

One important thing to take into account is where you're making the requests to your application from. If you're targeting people all over the world and you have a multi-region deployment, consider adding these geographic constraints as well, as they influence the latency your consumers might experience.

Here's a simple example of an application.

After running a load test we detect that this machine type has too little CPU and more memory than the application uses, so it's easy to figure out that we may need to change the provisioned type to one with higher CPU and lower memory.

The next step, stress testing, is an additional step to load testing due to its unique purpose.

Stress Testing

A particularly interesting step of performance development is stress testing, which simulates scenarios of abnormal peaks until the application crashes, providing awareness and predictability of the application.

By knowing the hard limitations of the infrastructure, we can start thinking about ways to mitigate the loss of QoS - Quality Of Service (ex: scalability).

The purpose of any stress testing is to crash something - always!
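
The idea of ramping up until something breaks can be sketched as follows. The `serve` callable is a hypothetical stand-in for firing real requests at the application, and the fake service that collapses above 450 requests is invented purely for the example.

```python
def find_breaking_point(serve, start, step, limit):
    """Increase concurrent load until `serve` reports a failure.

    `serve(load)` is a hypothetical callable returning True while the
    application still answers correctly under `load` simultaneous requests.
    Returns the first load that failed, or None if `limit` was reached.
    """
    load = start
    while load <= limit:
        if not serve(load):
            return load
        load += step
    return None

# Stand-in for a real system: pretend it collapses above 450 requests.
def fake_service(load):
    return load <= 450

print(find_breaking_point(fake_service, start=100, step=100, limit=1000))  # 500
```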

First, we need to understand the importance of SLI, SLO and SLA.

SLI (Service Level Indicators) "...are well-defined metrics that describe the behaviour of the system";
SLO (Service Level Objectives) "...are specific targets for those SLI";
SLA (Service Level Agreement) "...lists SLO that define the performance guarantees to customers and its consequences if they are not met".
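
To make the relation between the three more concrete, here is a small sketch that computes an availability SLI and compares it with an SLO. The request counts and the 99.9% target are made-up numbers, and the "error budget" helper is a commonly used companion concept rather than something defined in this post.

```python
def availability_sli(good_requests, total_requests):
    """SLI: fraction of requests served successfully."""
    return good_requests / total_requests

def error_budget_remaining(sli, slo):
    """Share of the error budget still unused (negative when the SLO is missed)."""
    allowed_failure = 1 - slo
    actual_failure = 1 - sli
    return (allowed_failure - actual_failure) / allowed_failure

sli = availability_sli(good_requests=99_950, total_requests=100_000)
print(sli >= 0.999)  # is the hypothetical 99.9% SLO met?
print(round(error_budget_remaining(sli, 0.999), 2))
```

An SLA would then state what happens, contractually, when that objective is missed.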

Stress testing is extremely important to profile your infrastructure and identify signs of degradation in our application.

Through monitoring & observability, we can define, for example, when our infrastructure needs to scale up.

Here are some of the most common ways of scaling your application.

Proactive cyclic is defined by a periodic scaling that occurs at a fixed interval (daily, weekly, monthly, quarterly);
Proactive event-based scales just when we expect a big surge of traffic due to a scheduled business event (new product launch, marketing campaigns);
Auto-scaling based on demand requires a monitoring service so the system can send triggers to take appropriate actions such as scaling up or down based on metrics (utilisation of the servers or network i/o, for instance).
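
For the demand-based case, a scaling decision could be sketched like this. It uses a simple proportional rule, similar to the one Kubernetes documents for its Horizontal Pod Autoscaler; the function name, the replica bounds and the utilisation numbers are assumptions for illustration.

```python
import math

def desired_replicas(current_replicas, current_utilisation, target_utilisation,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count proportionally to the distance from the target."""
    wanted = math.ceil(current_replicas * current_utilisation / target_utilisation)
    # Never go below or above the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))

print(desired_replicas(4, current_utilisation=90, target_utilisation=60))  # 6
print(desired_replicas(4, current_utilisation=15, target_utilisation=60))  # 1
```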

Another "side effect" of running this step is that we can see similarities with a (D)DoS - Distributed Denial of Service.

There is a close resemblance between the havoc an infrastructure-layer DoS attack wreaks and what a stress test tries to accomplish, as both are meant to damage the availability of your service to the point of collapse.

This is a perfect step to validate all the mechanisms in place to stop a (D)DoS attack, for example, throttling requests, and so on.

This is all great, but there are a lot of nuances here. When running a stress test you don't want those mechanisms to act, because it can be a legitimate scenario of abnormal peak usage. This can be tricky to configure or provision.

Costs

Everything discussed so far comes with a price, and that price often is a steep one at the end of the month.

The key takeaway from all of this is that there will be a tradeoff between costs and the confidence developers will have deploying.

Conclusions

Although a bit extensive, the purpose of this post was not to be in-depth on each step. There are many approaches to each one and honestly, it may be different in your situation.

I hope you can relate to all of the above. 😊

Please share positive feedback about this topic. 🙏

Thank you! ❤️

Designing an API

João Guimarães — Thu, 22 Aug 2019 13:15:58 +0000

Over the last year I have been given the chance to work with some amazing people and we all have been developing micro-services that expose RESTful APIs.

Throughout this year we had to integrate with other APIs, such as RESTful and NVP (Name-Value Pair), from both internal and external 3rd parties.

Making sure the project runs without any major incident is hard work. It involves a lot of people from a lot of departments and their ability to share information is the most vital part before developing the API.

In no way am I stating that these are the right approaches to take when designing an API. I am just sharing what I have learned the hard way.

So here are some points I find important to keep in mind when given a new task, aka, build a new micro-service.

Understand the API dependencies;
- a dependency can be an internal API, external API or an AWS service you need to integrate with, etc...
Understand if those dependencies provide all the functionality the API requires (both tech and business);
- does it fit the purpose;
- how well it is documented/supported;
- is it too strict;
- how flexible regarding unforeseen/future changes.
Understand what the API MUST deliver;
- current delivery;
- future deliveries (although in an agile environment this is a fuzzy topic).
An API contract;
- a contract is an agreement between two or more parties that develop/consume the API.
API documentation;
API testing.

All above topics can be grouped into 3 main categories:

API dependencies
API specification
API development

API dependencies

As I stated above, this is where typically you will have the business and technical requirements and you or your team start brainstorming/spiking on the best course of action.

This means understanding what the API dependencies are and their added value.

From personal experience, you always miss some edge cases so it is a good practice to use UML sequence diagrams to help you structure how your endpoints will behave, such as request headers, payloads, responses, etc.

What is a UML sequence diagram?

A UML sequence diagram describes how operations are carried out in an interactive format. Sequence diagrams are time-based and show the order of those interactions. They also specify all the participants in the workflow.

A visual interpretation helps you keep track of the workflow and define happy paths as well as errors from any 3rd party API and how your own API should deal with that information.

How can we as developers take advantage of sequence diagrams? By using tools such as PlantUML or mermaidJS, which allow us to generate diagrams from textual representations.

A simple example with PlantUML (taken from the official website):

@startuml
Alice -> Bob: Authentication Request
Bob --> Alice: Authentication Response

Alice -> Bob: Another authentication Request
Alice <-- Bob: Another authentication Response
@enduml

which will generate the next image:

This is a cool feature because it can be version controlled.

I find that mermaidJS is still a little behind PlantUML in terms of integrations and functionalities but they are both powerful tools and I have used both in different contexts. You should use the one that best fits your needs.

If you use Confluence, there is a nice plugin for PlantUML.

API specification

After you have defined the diagrams, the next step is to start drafting the contract.

This contract should be "signed off" in a standard specification that most developers are familiar with. Luckily the OpenAPI Specification has been around for a while.

The specification is written in YAML and it can also be version controlled.

Once again and from personal experience, the drafted contract may suffer small to medium changes. This is normal: it's a contract where more than one team is involved and feedback is always a good thing.

Always be open to suggestions but don't forget that your team owns the API.

Discussion is healthy and allows us to see different angles to achieve the same goal.

Keep in mind that your API may also change in the future in ways that impact production environments. So think carefully about how you version your API, whether through path versioning such as /v1 or through headers, as in GitHub's example Accept: application/vnd.github.v3+json.
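
A minimal sketch of resolving the requested version from either style follows. The function name, the precedence (path first, then the Accept header) and the default version are assumptions of this example, not a recommendation from any specification.

```python
import re

PATH_VERSION = re.compile(r"^/v(\d+)(?:/|$)")
HEADER_VERSION = re.compile(r"application/vnd\.[^.]+\.v(\d+)\+json")

def requested_version(path, accept=None, default=1):
    """Resolve the API version from the path first, then from the Accept header."""
    match = PATH_VERSION.match(path)
    if match:
        return int(match.group(1))
    if accept:
        match = HEADER_VERSION.fullmatch(accept.strip())
        if match:
            return int(match.group(1))
    return default

print(requested_version("/v2/capture/abc"))                                 # 2
print(requested_version("/capture/abc", "application/vnd.github.v3+json"))  # 3
print(requested_version("/capture/abc"))                                    # 1
```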

If you are like me and your OCD kicks in when the topic "API versioning" comes to the table then read this interesting post about Evolvable APIs from Fagner Brack - To Create An Evolvable API, Stop Thinking About URLs.

API development

It's time to take all the value we gained before to start implementing it into code. Just make sure to follow the contract and protect the micro-service against unexpected 5xx popping into production.

But depending on the language you choose to code, a big part of the development is testing - unit, functional, etc...

With the right tools you can prepare functional test scenarios by using Postman or Insomnia.

Postman has a neat companion tool called Newman, a command-line collection runner, which you can use to run a collection and check if your endpoints follow the contract.

At this point I have shared tools whose files can be version controlled along with the code, making it easy to keep all of them in sync.

Demoing with an example

Nothing is better than an "almost real" example to demonstrate everything described above.

This example is based on making capture, void and refund transactions, given an authorization identifier.

An authorization identifier means that we locked some amount from the payment method used by a customer.

Fictional requirements

capture the authorization;
- charge the account an amount lower than or equal to the locked funds in a specified currency;
- returns a transaction identifier for possible refund.
void the authorization;
- release the locked funds.
refund the account;
- providing a valid transaction identifier;
- returns a refund transaction identifier.
all above actions MUST be validated against another fictional internal API;
- validate that accountId is linked to the authorizationId.
all above actions MUST have a required X-Api-Key header;
- for security reasons.
all above actions SHOULD have an X-Correlation-Id header.
- for keeping track of workflows.
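
The two header rules can be sketched as a small check. The function name, the 401 status for a missing key and the choice to generate a correlation id when none is sent are assumptions of this example; the requirements above only say MUST and SHOULD.

```python
import uuid

def check_headers(headers):
    """Return (status, response_headers) following the fictional requirements."""
    # Header names are case-insensitive, so compare them in lower case.
    normalised = {name.lower(): value for name, value in headers.items()}
    if not normalised.get("x-api-key"):
        # Assumption: a missing X-Api-Key (MUST) is rejected with 401.
        return 401, {}
    # Assumption: X-Correlation-Id (SHOULD) is generated when not provided.
    correlation_id = normalised.get("x-correlation-id") or str(uuid.uuid4())
    return 200, {"X-Correlation-Id": correlation_id}

print(check_headers({"X-Api-Key": "secret", "X-Correlation-Id": "abc-123"}))
print(check_headers({"X-Correlation-Id": "abc-123"})[0])
```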

Let's name the micro-service as process-transactions.

Planning with sequence diagrams

From the previous fictional requirements we can define 3 participants:

USER - The API consumers;
MS - The API micro-service;
API - The API consumed.

A draft of the diagram should resemble the following image:

Tooling for PlantUML

Consider the following file structure.

./images
./plantuml
├── capture.puml

Where capture.puml has the following content.

@startuml

participant "USER" as A
participant "MS" as B
participant "API" as C

title //process-transactions// micro-service capture workflow

rnote left A
**headers**
  X-Api-Key<font color="red">*</font> //<string>//
  X-Correlation-Id //<string>//
end note

activate A
A -> B: **POST** ""/capture/:authorizationId""

rnote left A
**payload**
  accountId:<font color="red">*</font> //<string>//
  amount:<font color="red">*</font> //<number>//
  currency:<font color="red">*</font> //<string>//
end note

rnote left B
**headers**
  X-Api-Key<font color="red">*</font> //<string>//
  X-Correlation-Id //<string>//
end note

activate B
B -> C: **POST** ""/validate""

rnote left B
**payload**
  accountId:<font color="red">*</font> //<string>//
  authorizationId:<font color="red">*</font> //<string>//
end note

alt success request

rnote right B
**headers**
  X-Correlation-Id //<string>//
end note

activate C
B <-- C: ""**200** OK""

rnote right B
**payload**
  success: //true//
end note

|||

B -> B: capture amount
activate B
deactivate B

rnote right A
**headers**
  X-Correlation-Id //<string>//
end note

A <-- B: ""**200** OK""

rnote right A
**payload**
  transactionId: //<string>//
end note

|||

else failure request

rnote right B
**headers**
  X-Correlation-Id //<string>//
end note

B <-- C: ""**200** OK""
deactivate C

rnote right B
**payload**
  success: //false//
end note

rnote right A
**headers**
  X-Correlation-Id //<string>//
end note

A <-- B: ""**422** UNPROCESSABLE ENTITY""
deactivate B

rnote right A
**payload**
  error: //true//
  reason: Conditions could not be met
end note

|||

end
deactivate A

@enduml

We can use the package node-plantuml for generating the sequence diagram as an image.

npm install node-plantuml
puml generate -s -o ./images/capture.svg ./plantuml/capture.puml

Now we have a version controlled file that describes our /capture endpoint.

Writing the contract in the OpenAPI Specification

PlantUML gives us a pretty good view of what the capture endpoint expects as a request and responses.

Remember that at this point the micro-service logic is still a black-box, and it should remain that way for now.

We are trying to achieve a contract that does what business is expecting.

It's also expected that all the dependencies of the micro-service regarding 3rd/internal parties APIs are clear on their purposes and which of their endpoints suits our needs.

In our capture endpoint we assume some generic response. But we could be calling x number of endpoints if needed before the capture replies with anything.

Anyway, the OpenAPI contract is defined as a YAML file with all the specifications.

But if we have a few endpoints and a lot of responses, it might be useful to have separate files for each section of the Specification.

Ultimately this will ease the burden of maintaining the specification.

Organising the contract structure

Updating the above file structure.

./images
├── capture.svg
./plantuml
├── capture.puml
./open-api
├── components
│   ├── headers
│   │   └── x-correlation-id.yaml
│   ├── headers.yaml
│   ├── parameters
│   │   ├── authorization-id.yaml
│   │   └── x-correlation-id.yaml
│   ├── responses
│   │   ├── capture-200.yaml
│   │   └── capture-422.yaml
│   ├── responses.yaml
│   ├── schemas
│   │   ├── capture-200.yaml
│   │   ├── capture-422.yaml
│   │   └── capture.yaml
│   └── schemas.yaml
├── components.yaml
├── index.yaml
├── info.yaml
├── paths
│   └── capture.yaml
└── paths.yaml

Instead of using the usual '#/components/...' form, each $ref is a relative link to a file, which after the compile step will be turned into a proper OAS reference.

Content of ./open-api/index.yaml:

openapi: 3.0.2
tags:
  - name: capture
info:
  $ref: './info.yaml'
paths:
  $ref: './paths.yaml'
components:
  $ref: './components.yaml'
security:
  - X-Api-Key: []

Content of ./open-api/paths.yaml:

/capture/{authorizationId}:
  post:
    $ref: './paths/capture.yaml'

Content of ./open-api/paths/capture.yaml:

summary: Capture an amount
tags:
  - capture
operationId: capturePost
parameters:
  - $ref: '../components/parameters/authorization-id.yaml'
  - $ref: '../components/parameters/x-correlation-id.yaml'
requestBody:
  content:
    application/json:
      schema:
        $ref: '../components/schemas/capture.yaml'
responses:
  '200':
    $ref: '../components/responses/capture-200.yaml'
  '422':
    $ref: '../components/responses/capture-422.yaml'

Tooling for OAS

We can use the package swagger-cli for generating the compiled Specification file.

npm install swagger-cli
swagger-cli bundle -o open-api.yaml --type yaml open-api/index.yaml

And the full specification:

openapi: 3.0.2
tags:
  - name: capture
info:
  version: 1.0.0
  title: Process Transactions Micro-service
  description: 'Capture, void and refund an account.'
paths:
  '/capture/{authorizationId}':
    post:
      summary: Capture an amount
      tags:
        - capture
      operationId: capturePost
      parameters:
        - name: authorizationId
          description: Authorization Id which allows to capture the locked funds
          in: path
          required: true
          schema:
            type: string
        - name: X-Correlation-Id
          description: Correlation Id to keep track of workflow
          in: header
          schema:
            type: string
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/capture'
      responses:
        '200':
          $ref: '#/components/responses/capture-200'
        '422':
          $ref: '#/components/responses/capture-422'
components:
  securitySchemes:
    X-Api-Key:
      type: apiKey
      in: header
      name: X-Api-Key
  responses:
    capture-200:
      description: Capture of funds has succeeded
      headers:
        X-Correlation-Id:
          $ref: '#/components/headers/X-Correlation-Id'
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/capture-200'
    capture-422:
      description: Capture did not succeed
      headers:
        X-Correlation-Id:
          $ref: '#/components/headers/X-Correlation-Id'
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/capture-422'
  schemas:
    capture:
      type: object
      required:
        - accountId
        - amount
        - currency
      properties:
        accountId:
          type: string
        amount:
          type: number
          example: 9.99
        currency:
          type: string
          example: EUR
    capture-200:
      type: object
      properties:
        transactionId:
          type: string
          description: The transaction id which will allow to refund
    capture-422:
      type: object
      properties:
        success:
          type: boolean
          default: false
        reason:
          type: string
          example: Conditions could not be met
  headers:
    X-Correlation-Id:
      schema:
        type: string
security:
  - X-Api-Key: []
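
As an illustration of what the capture schema above asks from a request body, here is a hand-written check of the three required fields. It is only a sketch: the names CAPTURE_SCHEMA and validate_capture are hypothetical, and a real service would more likely validate against the compiled specification with a dedicated library.

```python
# Required fields of the `capture` schema and the Python types they map to.
CAPTURE_SCHEMA = {"accountId": str, "amount": (int, float), "currency": str}

def validate_capture(payload):
    """Return a list of problems; an empty list means the payload is acceptable."""
    problems = []
    for field, expected in CAPTURE_SCHEMA.items():
        if field not in payload:
            problems.append(f"missing required field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(payload[field], bool) or not isinstance(payload[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems

print(validate_capture({"accountId": "acc-1", "amount": 9.99, "currency": "EUR"}))
print(validate_capture({"accountId": "acc-1", "amount": "9.99"}))
```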

Testing against the API

Now that we defined diagrams and the contract is settled, we are ready to implement it.

Let's say you have a server up and running with all of the requirements in place.

Wouldn't it be better to have a Postman collection based on the OAS instead of manually creating it?

Converting OpenAPI to Postman Collection

We can use the package openapi-to-postmanv2 for generating the Postman collection.

npm install openapi-to-postmanv2
openapi2postmanv2 -s open-api.yaml -o postman-collection.json -p

It will generate a Postman collection that is almost pre-filled.

Obviously you will need to fill in the blanks such as the X-Api-Key and the like.

Conclusions

Building the Postman collection from the OpenAPI Specification will help find places where the implementation breaches the contract.

Just keep in mind that changes to the contract often happen during development or even when all the functionality is being tested.

Hope this workflow helps in any way possible to speed up and tighten the development.

What are your thoughts? Please share your experience and all constructive feedback is welcome!


Modern monitoring

João Guimarães — Tue, 20 Aug 2019 08:08:07 +0000

Table Of Contents

Abstract
The concept
Monitoring
Conclusion

Abstract

I want to share my experience when dealing with monitoring applications such as micro-services and/or lambdas and how too much monitoring (or lack of it) causes an impact in ways beyond just analysis or pretty dashboards.

If you have another opinion on the concepts described here, I encourage you to provide constructive feedback so I too can learn from other people's ideas and thoughts.

Opinions are my own and not of my employer.

The concept

Monitoring is the combination of logging and adding metrics, but usually, they are treated as separate areas of our applications when in fact they can both work together and we can take advantage of this union.

Instead of logging everything (which is unhealthy) and adding metrics for everything (which is also unhealthy), we can complement a metric that has triggered an alert with detailed logs. This is modern monitoring!

Monitoring

Very briefly, logs represent information that can't be grouped but they provide unique details about an event that happened, while metrics represent information that can be grouped by events but don't provide unique details.

Consider that an HTTP request returned a 404 status code. We can use a counter metric called clientError, as an example, which will continue to increment whenever another 4xx error occurs. Detailed information about individual errors can be logged to provide additional information for troubleshooting purposes. You can correlate them by their timestamp.

Consider that the above error was caused by the following HTTP/1.1 request to my-service application:

GET /my-path?id=my-resource HTTP/1.1
Host: www.my-host.com
Content-Type: application/json

and its corresponding response:

HTTP/1.1 404 NOT FOUND
Date: Wed, 17 Jul 2019 10:36:20 GMT
Server: Apache/2.2.14 (Win32)

If your application, after processing the request, logs something like:

{
  "app"         : "my-service",
  "xcid"        : "uuid",
  "time"        : "2019-07-17T10:36:19.000Z",
  "host"        : "www.my-host.com",
  "method"      : "get",
  "path"        : "my-path",
  "statusCode"  : 404,
  "msg"         : "client error"
}

it would be a waste of resources, as it does not provide any value in understanding why it happened!

We SHOULD only log information that will help identify why a certain event occurs without exposing sensitive data.

Although the next log may give us data to understand why the event occurred, it leaks sensitive data and should be avoided:

{
  "app"         : "my-service",
  "xcid"        : "uuid",
  "time"        : "2019-07-17T10:36:20.000Z",
  "host"        : "www.my-host.com",
  "method"      : "get",
  "path"        : "my-path",
  "statusCode"  : 404,
  "msg"         : "client error",
  "data": {
    "status"        : "deleted",
    "sensibleKey1" : "sensibleValue1",
    "sensibleKey2" : "sensibleValue2"
  }
}

The 3rd log entry example can provide valuable information to understand why this event occurred.

{
  "app"         : "my-service",
  "xcid"        : "uuid",
  "time"        : "2019-07-17T10:36:20.000Z",
  "host"        : "www.my-host.com",
  "method"      : "get",
  "path"        : "my-path",
  "statusCode"  : 404,
  "msg"         : "'my-resource' does not exist"
}

This subtle difference allows you to investigate the reason why that resource was deleted without exposing sensitive data.

Bottom line:

The metric clientError triggered an alarm for the event;
The log entry provided a reason why it happened.

We now have all the information to troubleshoot this event outside the monitoring scope.
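
Putting the two together, a minimal sketch could look like the following. The in-memory Counter stands in for whatever metrics client you use, and record_response is a hypothetical helper; only the clientError name and the log fields come from the example above.

```python
import json
from collections import Counter

# Stand-in for a real metrics client: one counter per metric name.
metrics = Counter()

def record_response(status_code, path, reason):
    """Increment the grouped metric and build the detailed, non-sensitive log entry."""
    if 400 <= status_code < 500:
        metrics["clientError"] += 1
    return json.dumps({
        "app": "my-service",
        "path": path,
        "statusCode": status_code,
        "msg": reason,  # explains why, without request data or secrets
    })

entry = record_response(404, "my-path", "'my-resource' does not exist")
record_response(400, "my-path", "missing 'id' query parameter")
print(metrics["clientError"])  # 2
print(entry)
```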

This is a pretty simple example but it shows how we need to weigh metrics and logs according to the situation and business value.

Logs should tell you why an event occurred without spelling out every specific detail behind it, or you'll risk exposing, once again, sensitive data.

There is no need to have logs that won't serve any purpose; they'll just cost you or your company money.

If your application is part of a High Availability setup, it is most likely backed by a load balancer / Auto Scaling Group of some sort, or you are simply spinning up some containers yourself. In either case, your application SHOULD log only exit codes that aren't expected.

When your services are under heavy load, they will spin up more containers, and when that load drops, containers are going to be spun down. Logging those predictable shutdowns, again, adds no meaningful information.

AWS and/or Kubernetes signal the container (typically with SIGTERM) when it has been ordered to shut down, allowing the application to recognise an expected shutdown and log meaningful information only for the unexpected ones.
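
A minimal sketch of "log only unexpected exits", assuming the orchestrator stops containers with SIGTERM (the default in Kubernetes). The set of expected codes and the function name are assumptions for this example; adjust them to what your platform really sends.

```python
import signal

# Exit codes treated as expected in this sketch: a clean exit, and
# 128 + SIGTERM, the conventional code for a process ended by that signal.
EXPECTED_EXIT_CODES = {0, 128 + int(signal.SIGTERM)}

def should_log_exit(exit_code):
    """Only exits outside the expected set deserve a log entry."""
    return exit_code not in EXPECTED_EXIT_CODES

print(should_log_exit(143))  # orchestrated shutdown
print(should_log_exit(137))  # killed with SIGKILL, worth investigating
```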

Having predictable log objects can also help you manage and estimate the service daily capacity for your LMS - Log Management Service of choice. This is not the same described here.

What should a metric represent?

Business value to dashboards;
Information for triggering alerts (on its own or aggregated with other metrics).

Some LMS can add dimensions/tags/labels to metrics which is great but can turn into a nightmare in terms of costs.

A bad example of a dimension is the hostname; instead, it SHOULD be part of the logs (if applicable).

The region where your application is running is a good dimension (if applicable), as it can provide insight into the regions where some services have more load.

In the end it's a trade-off between costs and business value.

Any unique combination of a metric with its dimensions represents a new time series, which will increase the amount of data that will be stored in your provider. Once again, this also increases the overall costs.

A dimension MUST have a low cardinality. High cardinality means that the dimension will have many different values.
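
The effect of cardinality on the number of time series can be sketched with simple arithmetic. The dimension names and their cardinalities are made-up values, and the product is an upper bound, since not every combination necessarily occurs.

```python
from math import prod

def time_series_count(dimension_cardinalities):
    """Upper bound of time series for one metric: the product of all cardinalities."""
    return prod(dimension_cardinalities.values())

# Two low-cardinality dimensions keep the metric cheap.
print(time_series_count({"region": 4, "statusClass": 5}))                   # 20
# Adding a high-cardinality dimension such as the hostname multiplies the cost.
print(time_series_count({"region": 4, "statusClass": 5, "hostname": 500}))  # 10000
```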

Conclusion

We should not jump into adding logs and metrics for everything. We are tempted to do this while developing to find issues and bugs but we will certainly leave them wandering around in production as well.

This is not a topic to fire and forget. Keep them sane and, most importantly, secure, and make sure they provide the minimum information needed for proper troubleshooting outside the monitoring scope.
