Wes Nishio

Posted on Nov 22, 2022 • Edited on Nov 28, 2022

21 DevOps Output Metrics for Developers to Track

#devops #productivity #performance #saas

1. Introduction

There is a persistent opinion that metrics cannot measure the output of teams and organizations involved in system development. And when it comes to individual personnel evaluations, there is even an emotional backlash. The famous Martin Fowler also seems to be of the same opinion, judging from his comments on Twitter.

On the other hand, some believe that some kind of indicator should be able to measure the output of an organization. If we can't measure it, how can we determine if we have been doing well or if we will improve in the future? Google, one of the groups advocating DORA Metrics, seems to be of this opinion.

I believe that there are things that can be measured. Of course, I do not believe that everything can be measured correctly. In the first place, a balance between qualitative and quantitative indicators is important, and even for quantitative indicators, there is a trade-off between accuracy and the cost of measurement in the real world. So I have put together a list of concrete steps to discuss how to do this.

2. What We Measure

I feel that discussions often take place without a unanimous definition of what we want to measure.

What we want to measure this time is the output of a team or organization.

2-1. What is “Team Output”?

Although per-individual breakdowns may be necessary, the main unit of aggregation should be the team or organization developing the system, not the individual.

Then, what is the output? We would like to define output as the amount of action, the amount of activity, the amount of production, and the amount of contribution to the team or organization. More specific indicators will be discussed below. Output may be paraphrased as throughput, performance, or productivity.

2-2. Outputs are not Outcomes

Conversely, outputs are not outcomes or results. In other words, the output is not sales or user satisfaction.

Of course, if you can measure how much any task, or at least any project, has improved user satisfaction, maintained user retention, increased user spending, and ultimately contributed to sales, then that is better. However, we have assumed here that outcomes are hard to come by, either because they cannot be measured or because the cost of measuring them is unaffordably high.

While understanding that it is essentially important to look at outcomes, the difficulty of obtaining those outcomes is high, so as an alternative, we will consider how to make better use of the flawed outputs.

3. Why We Measure

The purpose, of course, is to address the issues.

We hypothesize that the following pains arise when the numbers and trends of a team's or organization's output are unknown.

3-1. Issue Hypothesis 1

Not knowing objectively whether the team or organization is doing well or poorly

We don't know if we should take action to improve them.
Even if we take action, we do not know if our team's or organization's performance has improved.

3-2. Issue Hypothesis 2

Less transparency and objectivity within the team or organization

Qualitative goals are too vague for team members.
Ambiguous criteria for individual performance evaluations may lead to high turnover rates.

3-3. Issue Hypothesis 3

Less transparency and objectivity to other teams and organizations within the company

It is difficult to explain and appeal to members who do not know anything about development how much output your team or organization is currently producing.

3-4. Issue Hypothesis 4

Less transparency and objectivity to potential hires outside the company

It is difficult to explain the speed and quality of the process.
As a result, it is hard to convey the appeal of the service, which makes joining a black-box risk for potential hires.
It is also difficult to ask potential hires the same questions, since we ourselves have no view of the numbers and trends of our own output.

3-5. Issue Hypothesis 5

Less transparency and objectivity toward users of the service.

We don't know how much the service is continuing to evolve, or whether it is at a standstill.
It is easy to understand when an innovative service like GitHub Copilot is created, but it is not always the case with other services.

We believe that the above issues must essentially exist. At this point, we have not yet had enough time to hear about these issues and pains, so these are hypotheticals.

4. What Can be Measured Specifically?

In this chapter, we will introduce specific indicators, definitions, and views.

4-1. Deployment Frequency (DORA)

Definition

Frequency of deployment to production, one of the DORA Metrics.

Applicable parts of the business cycle

Frequency of step 13
If the frequency of step 13 is difficult to measure, the frequency of step 12 is used.

Formula

Deploy frequency = f(A, B) x C / D

A = Last date of a target period
B = The newer of the first date of a target period and the merge date of the first pull request of the entire period
f(A, B) = Number of weekdays between dates B and A (both inclusive)
C = Business hours per day
D = Number of pull requests merged into production in a target period
Example: f(Nov 14, 2022, Max(Jan 1, 2022, Nov 1, 2022)) x 8 hours per day / 10 times = f(Nov 14, 2022, Nov 1, 2022) x 8 hours per day / 10 times = 10 days x 8 hours per day / 10 times = 8 hours per deployment
Note: The result is the average number of business hours between deployments, so a smaller value means more frequent deployments.
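The formula above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are hypothetical, weekdays are counted inclusively on both ends, and public holidays are ignored.

```python
from datetime import date, timedelta


def count_weekdays(start: date, end: date) -> int:
    """Count Monday-to-Friday days from start to end, both inclusive."""
    days = (end - start).days + 1
    return sum(
        1
        for offset in range(days)
        if (start + timedelta(days=offset)).weekday() < 5
    )


def hours_per_deployment(
    period_start: date,
    period_end: date,
    first_merge: date,
    merged_pull_requests: int,
    business_hours_per_day: float = 8.0,
) -> float:
    """Compute f(A, B) x C / D as defined in the formula."""
    if merged_pull_requests <= 0:
        raise ValueError("no pull requests were merged in the target period")
    # B: the newer of the period start and the first merge date
    start = max(period_start, first_merge)
    # f(A, B): weekdays between B and A
    weekdays = count_weekdays(start, period_end)
    return weekdays * business_hours_per_day / merged_pull_requests


# Same numbers as the example: 10 weekdays x 8 hours / 10 merges
print(hours_per_deployment(date(2022, 1, 1), date(2022, 11, 14), date(2022, 11, 1), 10))  # 8.0
```

If your team deploys on weekends or observes many holidays, the weekday counter is the part to replace.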

Sources

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

Compare the numbers in a target period to the baseline values provided by the State of DevOps (see figure below).
Compare a target period number or improvement with other teams in the same repository or other repositories.
Compare the number of a target period with the previous period in your team.

Improvement action

It is essential to discuss the hypotheses with the team to find out what the bottleneck is in the first place.

If there is too much rework for content confirmation in 3 and 4, then devise a way to create tickets and use them as templates.
Reduce the size of the release unit in 4 and 7 (this is not a bad thing).
If 5-6 are time-consuming, add requirement definition documentation as an essential part of 4 and 7 (strengthen the documentation in the relevant areas each time you deploy)
At 7, speed up 11 by using videos instead of screenshots for evidence.
At 8, introduce automated testing (in the short term, it will reduce the frequency of deployments, but in the medium to long term, it will contribute).
At 9, if there are a lot of reverts, improve the template of 7.
Introduce automatic deployment at 13 (not only improves deployment frequency but also reduces human errors)
Analyze the reasons for frequent deployments and share the results with other members.
Increase the number of developers.

Reference URL

4-2. Lead Time for Changes (DORA)

Definition

Lead time from 1st commit to deployment, one of the DORA Metrics.

Applicable part of the business cycle

Time from 6 to 13
Or, time from 6 to 12, if time 13 is difficult to measure.

Formula

Σ(A-B)/C

A = Merge time of the pull request merged into a target period
B = Time of the first commit of the pull request
C = Number of pull requests merged in a target period
Example: (2 hours + 4 hours + 6 hours) / 3 times = 4 hours
Note: Due to the ambiguity of the original definition, there is a wide range of interpretations of the formula. In particular, the starting point B is ambiguous, and it is sometimes defined as step 4 or 5 of the business cycle.
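As a minimal sketch of Σ(A-B)/C, assuming each pull request is represented as a pair of (first commit time, merge time); the function name and the sample timestamps are hypothetical:

```python
from datetime import datetime, timedelta


def mean_lead_time(pull_requests: list[tuple[datetime, datetime]]) -> timedelta:
    """Average of (merge time A - first commit time B) over C pull requests."""
    if not pull_requests:
        raise ValueError("no merged pull requests in the target period")
    total = sum(
        (merged_at - first_commit_at for first_commit_at, merged_at in pull_requests),
        timedelta(),
    )
    return total / len(pull_requests)


# Three pull requests that took 2, 4, and 6 hours from first commit to merge
base = datetime(2022, 11, 1, 9, 0)
sample = [
    (base, base + timedelta(hours=2)),
    (base, base + timedelta(hours=4)),
    (base, base + timedelta(hours=6)),
]
print(mean_lead_time(sample))  # 4:00:00
```

If you define the starting point differently (for example, ticket creation), only the first element of each pair changes.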

Sources

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

Compare the numbers in a target period to the baseline values provided by the State of DevOps (see figure below).
Compare a target period number or improvement with other teams in the same repository or other repositories.
Compare the number of a target period with the previous period in your team.

Improvement action

It is essential to discuss the hypotheses with the team to find out what the bottleneck is in the first place.

Reduce the size of the release unit in 4 and 7 (this is not a bad thing).
If 5-6 are time-consuming, add requirement definition documentation as an essential part of 4 and 7 (strengthen the documentation in the relevant areas each time you deploy)
At 7, speed up 11 by using videos instead of screenshots for evidence.
At 8, introduce automated testing (in the short term, it will reduce the frequency of deployments, but in the medium to long term, it will contribute).
At 9, if there are a lot of reverts, improve the template of 7.
If the reviewer’s reaction is slow at 9 or 11:
- Review the alerting method (e.g., I usually look at Slack, but didn't see the alerts in the email notifications)
- Review the TODO management method (e.g., there are Slack notifications, but they are not registered in Jira, the TODO management system, so they are lost in the flow and forgotten)
Introduce automatic deployment at 13 (not only improves deployment frequency but also reduces human errors)

Reference URL

4-3. Change Failure Rate (DORA)

Definition

Probability (%) of a pull request leading to a failure in the production environment; one of the DORA Metrics.

Applicable part of the business cycle

13 to 14

Formula

A/B

A = Number of reverted pull requests among the pull requests merged in a target period = MAX (C, D, E)
- C = Number of pull requests with "Revert" in the title of the pull request an odd number of times
  - If it is an even number of times, it is considered to be a pull request for modification
- D = Number of pull requests containing "incident" in the label
  - In most cases, pull requests are manually processed after an incident, so the accuracy is low.
- E = Number of pull requests with "bug" in the label
  - Since this is a label for a pull request for correction, it is not strictly speaking the pull request itself that led to the production failure, but since the timing is not after the fact, it may be more accurate than D.
B = Total number of pull requests among the pull requests merged in a target period
Example: 2/100 = 2%.
Notes:
- Deployment is treated as approximately the same as a merge (deployment ≈ merge).
- A revert is counted even if its cause was a simple operational mistake.
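The counting rules for A = MAX(C, D, E) can be sketched as follows. The `PullRequest` class is a hypothetical stand-in for whatever your Git hosting API returns, and the case-insensitive label matching is my assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    title: str
    labels: list[str] = field(default_factory=list)


def change_failure_rate(merged: list[PullRequest]) -> float:
    """Return A / B, where A = max(C, D, E) and B = all merged pull requests."""
    if not merged:
        return 0.0
    # C: "Revert" appears an odd number of times in the title
    reverts = sum(1 for pr in merged if pr.title.count("Revert") % 2 == 1)
    # D: a label contains "incident"
    incidents = sum(
        1 for pr in merged if any("incident" in label.lower() for label in pr.labels)
    )
    # E: a label contains "bug"
    bugs = sum(
        1 for pr in merged if any("bug" in label.lower() for label in pr.labels)
    )
    return max(reverts, incidents, bugs) / len(merged)


merged = (
    [PullRequest('Revert "Add export button"'), PullRequest('Revert "Change tax rounding"')]
    + [PullRequest('Revert "Revert "Add export button""')]  # even count, not a failure
    + [PullRequest("Fix login crash", ["bug"])]
    + [PullRequest(f"Feature {i}") for i in range(96)]
)
print(change_failure_rate(merged))  # 0.02
```

Here C = 2, D = 0, and E = 1, so A = 2 out of B = 100 merged pull requests, which matches the 2% example.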

Sources

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

Compare the number of a target period to the baseline provided by the State of DevOps (see below).
Compare a target period number or improvement with other teams in the same repository or other repositories.
Compare the number of a target period with the previous period in your team.

Improvement action

It is essential to discuss the hypotheses with the team to find out what the bottleneck is in the first place.

Reduce the size of the release unit in 4 and 7. This is also effective from a risk perspective.
At 7, make it easier to notice defects by using videos instead of screencaps as evidence.
At 8, introduce automated testing. This will prevent "we didn't test it in the first place".
At 9, make it so that merging is not possible unless all comments on the source are "resolved" (this can be done in the repository settings).
At 11, make review mandatory (can be done in repository settings)
At 11, set the number of reviewers to 2 or more (possible in repository settings)
Create a QA department and give it approval authority at 11.
Add a non-engineer from the business department who submitted the request in 3 to the approval process at 11.
At 13, introduce automatic deployment to eliminate deployment errors.

Reference URL

4-4. Time to Restore Service (DORA)

Definition

The time taken from failure to recovery in a production environment; one of the DORA Metrics.

Applicable part of the business cycle

The time from 13, which caused the failure, through 14, to 13, which has been addressed.

Formula

Σ(A-B)/C

A = Time when the pull request that corrected failure X was merged.
B = Time when the pull request that directly caused failure X was merged
- B is often not properly recorded.
- So, accepting some loss of accuracy, the pull request merged immediately before A is treated as B.
C = Number of times of failure X within a target period
Example: (2 hours + 4 hours + 6 hours) / 3 times = 4 hours
Notes:
- Although the definition of B is not precise, it is assumed that teams and organizations with poor performance in this indicator are often unable to identify the exact B and thoroughly implement the operation to link it to A.
- Even if the above operation is thoroughly implemented, it is also difficult to identify the above from APIs such as GitHub.
- In addition, failures caused by infrastructures, such as server downtime or 100% CPU utilization, are not included in this method of calculation.
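Under the compromise described above (B is approximated by the pull request merged immediately before the fix A), the calculation might look like this. The function name and the sample merge times are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_restore(
    merge_times: list[datetime], fix_times: list[datetime]
) -> timedelta:
    """merge_times: merge times of all pull requests.
    fix_times: merge times of the pull requests that fixed a failure (A).
    """
    ordered = sorted(merge_times)
    durations = []
    for fixed_at in fix_times:
        earlier = [t for t in ordered if t < fixed_at]
        if not earlier:
            continue  # no preceding pull request, so B cannot be approximated
        durations.append(fixed_at - earlier[-1])  # A - B
    if not durations:
        raise ValueError("no measurable failures in the target period")
    return sum(durations, timedelta()) / len(durations)


merges = [
    datetime(2022, 11, 1, 9),
    datetime(2022, 11, 1, 11),  # fix, 2 hours after the previous merge
    datetime(2022, 11, 1, 13),
    datetime(2022, 11, 1, 17),  # fix, 4 hours after the previous merge
    datetime(2022, 11, 2, 9),
    datetime(2022, 11, 2, 15),  # fix, 6 hours after the previous merge
]
fixes = [merges[1], merges[3], merges[5]]
print(mean_time_to_restore(merges, fixes))  # 4:00:00
```

Note that this measures elapsed clock time, not business hours, and it inherits all the accuracy caveats listed in the notes.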

Sources

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

Compare the numbers in a target period to the baseline values provided by the State of DevOps (see figure below).
Compare a target period number or improvement with other teams in the same repository or other repositories.
Compare the number of a target period with the previous period in your team.

Improvement action

It is essential to discuss the hypotheses with the team to find out what the bottleneck is in the first place.

If there is a problem with the time between noticing a problem and being able to contact the necessary team:

If the monitor only contacted the engineer in question via Slack and the engineer did not see it, then the method of contacting the engineer in case of failure has to be changed to phone calls.
If the monitor called the engineer in question, but the engineer did not notice the call, then we have no choice but to make a rule to call repeatedly until the engineer notices.
If the monitor called the engineer repeatedly but the engineer did not pick up the phone, then the only way is to increase the number of engineers who can be contacted.
If in-house resources are limited, consider outsourcing failure response.

If there is a problem with the time it takes to identify the cause of the failure or to resolve it:

Run a training drill that simulates an actual failure (often difficult or time-consuming to reproduce).
If the test environment is running separately from the production environment, keep the test environment healthy and clean regularly (although, even if the test environment is down, it is often not a priority depending on the team's guidelines)
If it is due to a lack of documentation of the product specifications, make URL attachment of the documentation mandatory in 7 of the work cycle (although, it is necessary to enforce the operational rules)

Reference URL

4-5. Number of Pull Requests

Definition

Number of times a pull request is merged

Applicable part of the business cycle

Number of times for 13 (≈ number of times for 12)

Formula

Number of pull requests merged during the period in question

Example: 10 times

Source

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

Same as 4-1

Deploy frequency alone does not give an intuitive picture of absolute volume, so it is useful to look at the number of pull requests together with deploy frequency.
Especially for business departments, absolute amounts may be easier to understand.
However, it is important to remember that not only the absolute volume but also the content of what was released is important.

Improvement actions

Same as 4-1

4-6. Number of Reviews

Definition

Number of pull requests reviewed

Applicable parts of the business cycle

Formula

Number of times a user approved a pull request that was merged within a target period.

Example API Endpoint: https://api.github.com/search/issues
Example query: is:pr is:merged repo:${ownerName}/${repoName} reviewed-by:${userName} merged:${since}..${until}
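As a rough sketch, the query can be assembled and URL-encoded like this. The function names and the sample owner, repository, and user are hypothetical; the date range uses GitHub's documented `..` range syntax. The search response includes a `total_count` field, so requesting a single result per page is enough to read the number of reviews.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/issues"


def build_review_query(owner: str, repo: str, user: str, since: str, until: str) -> str:
    """Build the search query for merged pull requests reviewed by a user."""
    return (
        f"is:pr is:merged repo:{owner}/{repo} "
        f"reviewed-by:{user} merged:{since}..{until}"
    )


def build_review_search_url(owner: str, repo: str, user: str, since: str, until: str) -> str:
    """Return the full request URL; only total_count is needed, so per_page=1."""
    query = build_review_query(owner, repo, user, since, until)
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query, 'per_page': 1})}"


print(build_review_search_url("octo-org", "octo-repo", "octocat", "2022-01-01", "2022-10-24"))
```

Keep in mind that `reviewed-by` matches any review by the user, not only approvals, so the count is an approximation of the definition above.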

Sources

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

The number of approvals is the number of reviews, and the higher the number of reviews, the more you are contributing to the team as a reviewer.
On the other hand, of course, the content of the review is also important.
For example, there may be small pull requests that need no comments from anyone, but in my experience, a pull request that needs no comment whatsoever is rare. Be especially careful if a reviewer only ever writes LGTM without any comments, or has approved a pull request that caused a failure without leaving any comments.

Improvement Action

If the Change Failure Rate of DORA Metrics is high, it means that the defects are slipping through the review process, so fundamentally, the review method needs to be improved or the number of reviews needs to be increased.
In the case of GitHub, it is possible to enforce approvers and the number of approvers on a per-repository basis. If you didn't have a mandatory setting before, set one first.
If possible, set it to two or more people where it makes sense
Otherwise, request an optional peer review for each pull request

4-7. Number of Lines of Code Added

Definition

Number of lines of code newly added or edited

Applicable part of business cycle

6 or 7

Formula

Number of lines of code added or edited in a target period

The GitHub API cannot determine whether it is newly added or newly edited.
Example of newly added code
Example of edited code

Sources

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

This is the old, so-called LOC (lines of code) metric, which has received a lot of emotional backlash, especially from those who argue that it is meaningless.

On the other hand, it is also a simple indicator: it was recently rumored that LOC was used as the basis for Elon Musk's global layoffs at Twitter.

Compare the number or improvement in a target period to other teams in the same repository or other repositories.
- It is meaningless to win or lose rankings by a narrow margin. For example, it is not possible to simply compare the output of a team that added 100 lines with that of a team that added only one line, since even a single line may elegantly achieve the goal.
- In the big picture, however, the number of lines of code added can be taken as the output itself. For example, if one team adds 10,000 lines and another team adds 100 lines, the former team has likely produced more output.
Compare the numbers in a target period with the previous period's figures for your team.
- As the team moves up its learning curve, this indicator is likely to increase as well.
- As the team becomes more efficient at reusing code, the number of additional lines per pull request may decrease, but the number of pull requests itself should increase, so this indicator is also likely to increase.

Improvement Action

Same as 4-1

4-8. Number of Lines of Code Deleted

Definition

Number of lines of code deleted or edited

Applicable part of the business cycle

6 or 7

Formula

Number of lines of code deleted or edited during the period

The GitHub API cannot determine whether the code was deleted or edited.
Example of deleted code: Code that is simply no longer needed, or code that has been refactored and is no longer needed. In the latter case, there is strictly replacement code.
Example of edited code: The only modification made was to change the padding value from 1.5 to 1, so it is one line, but due to formatters such as Prettier, it may be counted as a deletion of 5 lines (understood as noise and accepted, as a low-cost indicator).
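To see why an edit cannot be told apart from an addition plus a deletion, here is a small sketch using Python's standard `difflib` module. It is not how GitHub computes its numbers; the function name and the CSS snippet are hypothetical and only illustrate the padding example.

```python
import difflib


def count_changed_lines(old: str, new: str) -> tuple[int, int]:
    """Return (added, deleted) line counts from a line-based diff."""
    added = deleted = 0
    for line in difflib.ndiff(old.splitlines(), new.splitlines()):
        if line.startswith("+ "):
            added += 1
        elif line.startswith("- "):
            deleted += 1
    return added, deleted


old_css = ".card {\n  margin: 0;\n  padding: 1.5rem;\n}\n"
new_css = ".card {\n  margin: 0;\n  padding: 1rem;\n}\n"

# A one-line edit shows up as one added line and one deleted line
print(count_changed_lines(old_css, new_css))  # (1, 1)
```

If a formatter also re-wraps neighboring lines, those lines are counted the same way, which is the noise mentioned above.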

Sources

GitHub, GitLab, Bitbucket, etc.

How to read the numbers

Compare the numbers or improvements for a target period with other teams in the same repository or other repositories.
Compare the number of a target period with the previous period in your team.

Improvement action

Same as 4-1

Delete as much unnecessary code as possible.
Establish a culture that it is better to erase unnecessary code than to leave it alone (if code is left because it is somehow scary, no one will be able to erase it).
Establish a culture in which it is okay to delete or refactor code, even code written by others or your boss, if there is a reason to do so, and in fact, you should do so.

4-9. Number of Created Tasks

Definition

Number of times tickets are created

Applicable part of the business cycle

Number of times for 4

Formula

Number of tickets created in a target period

For Jira
- Example API endpoint: https://api.atlassian.com/ex/jira/${atlassianCloudId}/rest/api/3/search
- Example query: creator = currentUser() AND createdDate < 2022-10-01 AND createdDate >= 2022-09-01

Source

GitHub Issues, GitLab Issues, Bitbucket Issues
Jira, Trello, Asana, Monday.com, ServiceNow, ClickUp, Wrike, etc.

How to read the numbers

Tickets created in 4 are the source of features that will be developed later.
First, we can simply see how many seeds of future functionality have been created, and in particular who created them.
However, always keep in mind that both the content and the number matter.
In addition, there are projects where many instructions are still given over Slack or verbally, without work being requested through tickets in the first place. In such projects, looking at the number of tickets created by each member shows whether the manager created the tickets themselves or had the members create them.

Improvement Action

By creating a template of what to write on the ticket, the ticket creator can create tickets faster.
Ticket creation itself is often tedious. Therefore, making it clear that the number of tickets created is also counted as a contribution may motivate people to create tickets properly.
The number of tickets created will be increased by making a rule that work requests must be made in the form of a ticket. This rule may also increase work efficiency itself. If work requests can be made through various routes, it will be a breeding ground for work request interruptions, sneaky work requests, and messy work requests.

Reference URL

The 23 Best Task Management Software in 2022 (Free And Paid!) | ClickUp Blog | July 1, 2022 | Max 20min read

4-10. Number of Closed Tasks

Definition

Number of tickets completed

Applicable part of the business cycle

Number of times 13

Formula

Number of tickets completed in a target period

Condition: Tickets must still be completed at the end of a target period.
In the case of Jira
- Example API: https://api.atlassian.com/ex/jira/${atlassianCloudId}/rest/api/3/search
- Example query: assignee = currentUser() AND status CHANGED TO 10006 DURING (${since}, ${until}) AND status was 10006 ON ${until}

Sources

GitHub Issues, GitLab Issues, Bitbucket Issues
Jira, Trello, Asana, Monday.com, ServiceNow, ClickUp, Wrike, etc.

How to read the numbers

Compare the number or improvement in a target period to other teams in the same repository or other repositories.
- Since the granularity and difficulty of each ticket is different, simple comparisons are not possible.
- Therefore, it is completely pointless to discuss small differences, but you may be able to use the data to identify large differences in trends.
- Also, trying to compare will reveal teams or organizations that have no visibility into their tickets at all.
Compare the figures of a target period to the previous period in your team.
- This may be more meaningful than comparing with other teams or organizations.
- If both this period and the comparison period are based on the same rules, it will be a cheap indicator to get, even if it is not accurate.
- Above all, the number of tickets that can be completed in the future can be predicted from this record of completed tickets.

Improvement Action

Clarify completion conditions. I often see tickets that cannot be closed because the completion conditions are unclear, and they have turned into garbage.
Make team members aware that ticket management is being watched, so they will complete tickets properly.
Make it a habit to complete one ticket at a time. Do not simply stop a ticket because of a blocker, even if it is a "waiting" flag.
Managers should create an environment where members feel comfortable discussing blockers when they occur.

4-11. Number of Open Tasks

Definition

Number of uncompleted tickets

Applicable parts of the business cycle

Formula

Number of tickets not completed at the end of a target period

For Jira
- Example API: https://api.atlassian.com/ex/jira/${atlassianCloudId}/rest/api/3/search
- Example query: assignee = currentUser() AND status != DONE AND resolution = Unresolved

Sources

GitHub Issues, GitLab Issues, Bitbucket Issues
Jira, Trello, Asana, Monday.com, ServiceNow, ClickUp, Wrike, etc.

How to read the numbers

Compare the number or improvement in a target period to other teams in the same workspace or other workspaces.
- Since the granularity and difficulty of each ticket is different, it is not possible to make a simple comparison.
- Therefore, it is completely pointless to discuss small differences, but it may be useful to identify large differences in trends.
- It can be used to understand the difference in the number of tasks between teams and to help balance the resources of the teams.
Compare the number of a target period with the number of the previous period in your team.
- By looking at the trend, you can see whether it is increasing or decreasing.
- If you are in the early stages of a project when tasks are being identified or when the resolution of the project is increasing, the number of tasks will increase.
- However, if the number is still increasing in the latter half of the project or when the deadline is approaching, it may be a risk of delay.

Improvement Action

Same as 4-10. Number of Closed Tasks

Clarify completion conditions. I often see tickets that cannot be closed because the completion conditions are unclear, and they have turned into garbage.
Make team members aware that ticket management is being watched, so they will complete tickets properly.
Make it a habit to complete one ticket at a time. Do not simply stop a ticket because of a blocker, even if it is a "waiting" flag.
Managers should create an environment where members feel comfortable discussing blockers when they occur.

4-12. Closing Velocity

Definition

Velocity at which tickets are completed

Applicable parts of the business cycle

4 to 13

Formula

A/B

A = Number of tickets completed within a target period
B = Number of days in a target period
Example: 180 tickets/90 days = 2 tickets/day = 14 tickets/week
Note: To keep the numbers easy to calculate, weekends and after-hours are included in the calculation. If you do not wish to include them, exclude them.
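A minimal sketch of A/B in calendar days, with weekends included as noted. The function name is hypothetical, and I assume the end date of the period is exclusive.

```python
from datetime import date, timedelta


def closing_velocity(completed_tickets: int, period_start: date, period_end: date) -> float:
    """Tickets completed per calendar day; period_end is exclusive."""
    days = (period_end - period_start).days
    if days <= 0:
        raise ValueError("the target period must be at least one day long")
    return completed_tickets / days


start = date(2022, 7, 1)
end = start + timedelta(days=90)  # a 90-day target period

per_day = closing_velocity(180, start, end)
print(per_day)      # 2.0 tickets/day
print(per_day * 7)  # 14.0 tickets/week
```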

Sources

GitHub Issues, GitLab Issues, Bitbucket Issues
Jira, Trello, Asana, Monday.com, ServiceNow, ClickUp, Wrike, etc.

How to look at the numbers

Considering that we organize our to-dos at the beginning and end of each workday, many of us think of what we need to do today and what we need to do tomorrow as single chunks of work.

Following this idea, it is desirable to complete at least one ticket per day, or five tickets per week, as an absolute value. At one ticket per week, it becomes difficult for managers and other members around you to understand the progress.

Improvement Action

Same as 4-10. Number of Closed Tasks

Clarify completion conditions. I often see tickets that cannot be closed because the completion conditions are unclear, and they have turned into garbage.
Make team members aware that ticket management is being watched, so they will complete tickets properly.
Make it a habit to complete one ticket at a time. Do not simply stop a ticket because of a blocker, even if it is a "waiting" flag.
Managers should create an environment where members feel comfortable discussing blockers when they occur.

4-13. Estimated Completion Date

Definition

The date when all outstanding tickets are expected to be completed

Applicable parts of the business cycle

Formula

today()+A/B

today(): Today's date
A = Number of uncompleted tickets as of today
B = Speed of ticket completion in a target period, in tickets per day (weekends included)
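As a sketch of today() + A/B, assuming B is expressed in tickets per calendar day and that fractional days are rounded up; the function name and sample numbers are hypothetical:

```python
import math
from datetime import date, timedelta


def estimated_completion_date(today: date, open_tickets: int, tickets_per_day: float) -> date:
    """Return today + A / B, rounding the remaining days up to a whole day."""
    if tickets_per_day <= 0:
        raise ValueError("closing velocity must be positive to estimate a date")
    remaining_days = math.ceil(open_tickets / tickets_per_day)
    return today + timedelta(days=remaining_days)


# 60 open tickets at 2 tickets/day leaves 30 calendar days of work
print(estimated_completion_date(date(2022, 11, 14), 60, 2.0))  # 2022-12-14
```

The estimate ignores tickets created after today, so it drifts later whenever new tickets arrive faster than they are closed.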

Source

GitHub Issues, GitLab Issues, Bitbucket Issues
Jira, Trello, Asana, Monday.com, ServiceNow, ClickUp, Wrike, etc.

How to read the numbers

You can see at a glance when the tasks that have accumulated in your team or organization are expected to be completed. There are many situations where you cannot answer this question on the spot.

If it simply seems too far away (e.g., one year from now)
- The amount of A may be too large.
- The speed of B may be too slow.
- If the amount of A is appropriate, we need to devise a way to speed up B. See the improvement actions for B.
- Fundamentally, it may be the right time to increase the number of members to speed up B.
If it is getting farther and farther away over time
- The speed at which tasks accumulate (A) and the speed at which tasks are completed (B) are out of balance.

Improvement Action

Same as 4-10. Number of Closed Tasks

Check the contents of uncompleted tickets created long ago, and delete them if they are no longer needed.
Establish a rule to automatically mark incomplete tickets as "Won't Do" in the first place.
Sort A by priority, tally only high-priority tickets, and check again.
Clarify completion conditions. I often see tickets that cannot be closed because the completion conditions are unclear, and they have turned into garbage.
Make team members aware that ticket management is being watched, so they will complete tickets properly.
Make it a habit to complete one ticket at a time. Do not simply stop a ticket because of a blocker, even if it is a "waiting" flag.
Managers should create an environment where members feel comfortable discussing blockers when they occur.
If members are taking a long time to get up to speed, it may be time to rethink the onboarding process.
If everything above has already been done, increase the number of members.

4-14. Number of Times You Have Been Mentioned

Definition

Number of times you have been mentioned

Formula

Number of times you were mentioned in a target period

Slack: ${memberId}+after:${since}+before:${until}
- Example API: https://slack.com/api/search.messages
- Example Query: U02DK80DN9H+after:2022-01-01+before:2022-10-24

Sources

Slack, Microsoft Teams, Discord, Chatwork, etc.

How to read the numbers

See the next section

Improvement actions

See the next section

4-15. Intervals You Have Been Mentioned

Definition

Interval at which you have been mentioned

Formula

Σ(A-B)/C

A = Time of each mention in a target period
B = Time of the previous mention before A
C = Number of times you have been mentioned in a target period
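The interval formula can be sketched as follows. I assume here that the last mention before the target period is known, so that every mention in the period has a preceding B; the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def mean_interval(previous: datetime, events: list[datetime]) -> timedelta:
    """previous: the last event before the target period.
    events: the events (A) inside the target period.
    """
    if not events:
        raise ValueError("no events in the target period")
    ordered = [previous] + sorted(events)
    # A - B for each event: the gap to the event immediately before it
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(events)


previous_mention = datetime(2022, 10, 24, 8, 0)
mentions = [
    datetime(2022, 10, 24, 9, 0),   # 1 hour after the previous mention
    datetime(2022, 10, 24, 12, 0),  # 3 hours later
    datetime(2022, 10, 24, 14, 0),  # 2 hours later
]
print(mean_interval(previous_mention, mentions))  # 2:00:00
```

The same function applies to any list of timestamps, such as the times of your own replies.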

Sources

Slack, Microsoft Teams, Discord, Chatwork, etc.

How to read the numbers

This is an indicator of whether a person is needed, wanted, or relied upon by team members or other teams. It can also be viewed as a communication burden on the person.

Compare the number of a target period with other team members.
- For managers and above
  - If the numbers are relatively large when compared among managers, taking into account the size of the team
    - Members' psychological safety toward managers is high, and members' curiosity toward their work may be high.
    - On the other hand, high is not necessarily good. The team may be overloaded with questions and reports, and it may be time to put another manager in between.
  - If the numbers are relatively small when compared among managers, taking into account the size of the team
    - Psychological safety toward managers may be low, and members may be less curious about their work
    - On the other hand, low does not mean bad. It is also possible that there is no excessive reporting and the members can work freely. Other managers' workloads may just be too large, and this manager's workload may be appropriate
- In the case of players
  - If the numbers are relatively high compared to other players
    - High exposure is not a bad thing, but if it is interfering with the work, the load may need to be reduced
  - If the numbers are relatively small compared to other players
    - Low exposure is not necessarily a good thing, but it may be a state of being able to concentrate on the work
Compare the numbers for a target period to the previous period within your team.

Improvement actions

To increase
- Post more messages so that other team members know who you are.
- Comment on and react to other people's messages.

To decrease
- Leave channels you are participating in
- Delegate authority to team members
- Put a manager between you and your team members (not necessarily a bad thing)

4-16. Number of Times You Have Replied

Definition

Number of times you have replied to threads

Formula

Number of times you have replied to threads in a target period

Slack: from:@${memberId}+after:${since}+before:${until}+is:thread
- API example: https://slack.com/api/search.messages
- Example query: from:@U02DK80DN9H+after:2022-01-01+before:2022-10-24+is:thread

Sources

Slack, Microsoft Teams, Discord, Chatwork, etc.

How to read the numbers

See the next section

Improvement actions

See the next section

4-17. Intervals You Have Replied

Definition

Interval between your replies to threads

Formula

Σ(A-B)/C

A = Timestamp of each reply you posted to a thread within a target period
B = Timestamp of the reply you posted immediately before A
C = Number of times you have replied to threads in a target period

Sources

Slack, Microsoft Teams, Discord, Chatwork, etc.

How to read the numbers

This is a reference indicator of how much information, feedback, and collaboration you have provided to team members and to people outside your team. It can also be viewed as the person's communication load.

4-18. Number of Times You Have Sent New Posts

Definition

Number of times you posted new messages

Formula

Number of times you posted new messages in a target period

Slack: from:@${memberId}+after:${since}+before:${until}+-is:thread
- The difference from 4-16 is that is:thread is preceded by "-".
- API example: https://slack.com/api/search.messages
- Example query: from:@U02DK80DN9H+after:2022-01-01+before:2022-10-24+-is:thread
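
To make the difference from the reply query in 4-16 concrete, here is a small Python sketch that builds both variants from one function. The function name and the `replies_only` flag are hypothetical, and the sketch assumes that the "+" in the examples above is simply the URL-encoded form of a space.

```python
def build_post_query(member_id: str, since: str, until: str, replies_only: bool) -> str:
    """Build the query for thread replies or, with the filter negated, new posts."""
    # A leading "-" excludes thread replies, leaving only new posts.
    thread_filter = "is:thread" if replies_only else "-is:thread"
    return f"from:@{member_id} after:{since} before:{until} {thread_filter}"


replies = build_post_query("U02DK80DN9H", "2022-01-01", "2022-10-24", replies_only=True)
new_posts = build_post_query("U02DK80DN9H", "2022-01-01", "2022-10-24", replies_only=False)

# Show the queries in the "+"-joined form used in this article.
print(replies.replace(" ", "+"))
print(new_posts.replace(" ", "+"))
```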

Sources

Slack, Microsoft Teams, Discord, Chatwork, etc.

How to read the numbers

See the next section

Improvement actions

See the next section

4-19. Intervals You Have Sent New Posts

Definition

Interval at which you posted new messages

Formula

Σ(A-B)/C

A = Timestamp of each new message you posted within a target period
B = Timestamp of the new message you posted immediately before A
C = Number of new messages you posted in a target period

Sources

Slack, Microsoft Teams, Discord, Chatwork, etc.

How to read the numbers

This is a reference indicator of how often you spontaneously post messages, share information, ask someone a question, and so on.

Compare the numbers for a target period with those of other members.
- First, consider whether you think you are posting enough messages.
- Then, compare yourself with those who post more messages than you do.
- Then, find out how large the difference is between you and them.
- Ask them what they keep in mind when posting.

Compare the numbers for a target period with your own numbers up to the previous period.

Improvement actions

- Consider the possibility that information you think is meaningless may be meaningful to someone else.
- If you have done some research and learned something, there are likely people who don't know about it yet, or who feel the need to know but haven't looked into it. Share it.
- When you work on one thing, post one thing. What you are doing is also valuable information to those around you.
- Post casually, just as you would tweet on Twitter.
- When you simply come across news, feel free to share it without commenting on it.
- Create a channel that makes it easy to post.

4-20. Number of Meetings

Definition

Number of meetings

Formula

Number of meetings in a target period

Conditions: At least 2 participants

Sources

Google Calendar, Microsoft Outlook

How to read the numbers

See the next section

Improvement actions

See the next section

4-21. Meeting Occupancy Ratio

Definition

Ratio of meeting time to business hours

Formula

A/B

A = Total meeting time within a target period (h)
- Condition: Two or more participants
B = Total business hours in a target period (h)

Note: If a meeting runs longer or ends earlier than scheduled, update the calendar to reflect the actual duration so that the calculation is more realistic.
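
As a rough illustration, A/B could be computed like this in Python. The `Meeting` type and function name are hypothetical, the events are assumed to have already been exported from the calendar, and overlapping meetings are simply counted twice in this sketch.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Meeting:
    start: datetime
    end: datetime
    participants: int


def meeting_occupancy_ratio(meetings: list[Meeting], business_hours: float) -> float:
    """Return A / B: total meeting hours divided by total business hours."""
    if business_hours <= 0:
        raise ValueError("business_hours must be positive")
    # Condition from the formula: only meetings with two or more participants count.
    counted = [m for m in meetings if m.participants >= 2]
    meeting_hours = sum((m.end - m.start).total_seconds() for m in counted) / 3600
    return meeting_hours / business_hours


# Hypothetical week: two real meetings (3 h in total) and one solo calendar block.
week = [
    Meeting(datetime(2022, 10, 3, 10), datetime(2022, 10, 3, 11), participants=3),
    Meeting(datetime(2022, 10, 4, 13), datetime(2022, 10, 4, 15), participants=2),
    Meeting(datetime(2022, 10, 5, 16), datetime(2022, 10, 5, 17), participants=1),
]
ratio = meeting_occupancy_ratio(week, business_hours=40)
print(f"{ratio:.1%}")
```

The length of the filtered list inside the function is the "Number of Meetings" from 4-20, so both metrics can come from the same export.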

Sources

Google Calendar, Microsoft Outlook

How to read the numbers

There is no right answer for the ratio. However, it is common to hear that people could not concentrate on their work because meetings took up their time.

You may also hear that it is easy to waste time in meetings, especially meetings where you have no role, meetings where you only listen, or meetings where you do other work on the side and do not listen at all. Use this ratio as a reference indicator when reviewing such meetings.

Compare the numbers for a target period with those of other members.
- Teams: Because it is a ratio, it is affected by the proportion of managers to players, but it can be compared across teams to some extent.
- Individuals: Compare your ratio with that of people who are regarded as successful.

Compare the numbers for a target period with your own numbers up to the previous period.
- The numbers tend to rise slowly without you noticing.
- Use each review as an opportunity to reconsider unnecessary meetings.

Improvement actions

Reduce unnecessary meetings.
- If you only listen and never speak in a meeting, either stop attending or speak up proactively.
- Do not attend meetings during which you do other work. It is also disrespectful to the other participants.
- Propose eliminating meetings that have become routine but no longer have a specific agenda.
- The more people attend, the more of them do not actively participate. Do not let such meetings multiply; abolish them.
- Time for 1-on-1s and communication is important, so avoid cutting it as far as possible.

Even when meetings are necessary, keep each one to an hour or less.
- Decide the agenda for the meeting and put it on the calendar.
- If you have materials, share a link to them in advance. Let go of the urge to work on them until the last minute, and distribute them at least a day before the meeting, even if they are still in progress.
- Read and comment on the materials distributed in advance.

5. Common Things to Keep in Mind

5-1. Don't Overly Chase Numbers (Goodhart's Law)

"When a measure becomes a target, it ceases to be a good measure."

This phrase is often cited when talking about indicators. The idea is attributed to British economist Charles Goodhart, who expressed it in a 1975 article on British monetary policy. Later, in 1996, it was generalized further in a work by Keith Hoskin.

The metric that should ideally be tracked is the outcome, that is, how happy the customer is. As OpenAI also notes, this is much more complex and difficult to measure, which is why the simpler and less costly metrics above are used as alternatives. We therefore need to be careful not to get too hung up on optimizing those alternative metrics.

Source 1: Goodhart's law | Wikipedia
Source 2: Measuring Goodhart's Law (5 mins to read) | April 13, 2022 | OpenAI

5-2. Don't Take the Numbers at Face Value

Numbers often contain noise. Therefore, don't take them at face value.

For example, even if the deployment frequency in DORA Metrics was high last month, it is still too early to be satisfied. The following cases are possible:

- It may simply have been a series of minor releases (which in itself is not a bad thing).
- Your team may have picked only the easy items from the backlog (this depends on what is in it, but it is probably not a good trend).
- Your team may have skipped writing automated tests, despite your instructions, and relied on manual tests instead, even though automated tests would be easier in the medium to long term.
- Maybe worse, your team did not think the test scenarios through and got by with sloppy testing.
- Worst of all, your team may not have tested at all (believe it or not, this usually happens when you rush things excessively).

On the other hand, if you did not deploy often last month, do not be too quick to get discouraged. The following cases are possible:

- Maybe your team has a slightly larger release coming up.
- Your team had never done automated testing before but started last month, so it may be slowing them down temporarily.
- Maybe your team is slowing down because a new member has joined and you have intentionally devoted your most talented member's time to onboarding.
- Maybe your team has added checkpoints so that the code does not become a liability, which is leading to more rework for members who are not yet used to them.

The only way to avoid being fooled or misled by these cases is to discuss openly and properly how to read the numbers. For a more accurate view, look at trends rather than single months, and, of course, know what is actually going on in the team.
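
One simple way to look at a trend rather than a single month is a moving average. The sketch below is only an illustration: the function name is hypothetical, the monthly deployment counts are made-up sample data, and a three-month window is an arbitrary choice.

```python
def moving_average(values: list[float], window: int = 3) -> list[float]:
    """Average each run of `window` consecutive values to smooth out noise."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical monthly deployment counts; the fourth month is an outlier.
deployments = [12, 15, 9, 30, 14, 16]
trend = moving_average(deployments, window=3)
print(trend)
```

A single spike such as the fourth month stands out less in the smoothed series, which makes it a prompt to ask what happened that month rather than a conclusion in itself.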

5-3. Don't Chase All Numbers at Once (Focus on Specific Numbers)

There are a wide variety of indicators. This is because there are different indicators to look at depending on the various phases and tasks of the team or organization.

However, in a given period, it may be better for your team to track fewer, simpler indicators. Awareness spreads more easily across the team and the organization if you narrow the indicators down so that team members can memorize them and state the numbers at any time.

How many indicators to track should be discussed within the team, but it may be better to limit them to two or three per month, for example, and preferably to one goal per month. As the saying goes, "He who runs after two hares will catch neither."

6. At the End

Thank you for reading this far.

If you have any comments or suggestions, please feel free to leave feedback in the comments section, or contact me via Twitter DM or email (hnishio0105@gmail.com).

We have also developed and are operating WorkStats, a tool that makes it easy to measure these specific indicators. It is still an MVP, but you are welcome to try it out if you like. We are sure you will discover something new.

Let's work together to implement the PDCA cycle of knowing the numbers, identifying issues, taking action to improve, and looking at the numbers again.