Yishai Zehavi

Posted on Nov 27, 2023 • Updated on Nov 28, 2023

Building Git Commit Linter Type in Typescript (Medium Difficulty)

#webdev #typescript #tutorial #programming

Linting is the process of statically analyzing a piece of code and validating it against a set of rules.
Wouldn't it be cool to build a Git commit linter using TypeScript types? Join me in this challenge, and let's build a linter together in TypeScript!

In this challenge, we'll build a custom TypeScript type that validates a commit message against a commit template and a set of constraints.

Defining the challenge

Given three parameters:

A commit message.
A commit template.
A constraints object for the template's tokens.

Build a linter that validates the commit message against the template and returns either null if validation succeeded or an error message if validation failed.

Let's define these requirements in Typescript:

// Type definitions:
type CommitMessage = string;
type CommitTemplate = `${string}*${string}*${string}`;
type Constraints = Record<string, {
    required?: boolean;
    kind?: string|number; // union type
    minLength?: number;
    maxLength?: number;
}>

// The type we need to implement:
type CommitLinter<Message extends CommitMessage, Template extends CommitTemplate, C extends Constraints> = /* todo */;

Let's explain these types:

CommitTemplate should contain at least one token, where a token is a placeholder for some text. To indicate that a string is a token, we wrap it in asterisks.

Examples of commit templates:

type ValidTemplate<Template extends string> = Template extends CommitTemplate ? true : false;

type ValidTemplateResult1 = ValidTemplate<"[]">;
// => false

type ValidTemplateResult2 = ValidTemplate<"*message*">
// => true

type ValidTemplateResult3 = ValidTemplate<"*message* - <*author*>">;
// => true

Constraints is an object where each key corresponds to a token name in the template, and its value is an object with conditions that should apply to this token.
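To make this concrete, here is a small sketch of a constraints type for the template "[*change*] *message*". The name ExampleConstraints and the specific limits are assumptions made up for this illustration; the last lines only check, at compile time, that the shape is accepted by Constraints.

```typescript
type Constraints = Record<string, {
    required?: boolean;
    kind?: string | number; // union of allowed values
    minLength?: number;
    maxLength?: number;
}>;

// Hypothetical constraints for the template "[*change*] *message*".
// Each key matches a token name from the template.
type ExampleConstraints = {
    change: { required: true; kind: 'feat' | 'fix' };
    message: { required: true; minLength: 10; maxLength: 50 };
};

// Resolves to `true` when ExampleConstraints is assignable to Constraints.
type IsValidConstraints = ExampleConstraints extends Constraints ? true : false;

// This assignment only compiles if the check above resolved to `true`.
const isValidConstraints: IsValidConstraints = true;
```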

Now, we're ready to begin 💪

Stop here if you want to solve this challenge by yourself!

Thinking about a solution

Let's take a moment to think about how to approach this challenge. This is essentially a string matching problem: we need to match the parts of the template that are not tokens to the commit message in order. We also need to verify that each token in the commit message conforms to the constraints imposed by the Constraints object.

To help us complete these two tasks, we need a Token Parser.

What is a Token Parser?

A Parser is a tool (or a type, in our case) that takes a string and breaks it down into pieces called "tokens". Other tools can take these tokens and use them for various tasks (e.g., analyze the structure of the string). In our case, the parser will receive the commit template and return an array containing the token names (words wrapped in asterisks) along with the template parts preceding and following each token.

// Define the token parser signature:
type TokenParser<Template extends CommitTemplate> = /* todo */;

// Our parser should return these results:

type ParseResult1 = TokenParser<"*message*">;
// => [["message", ["", ""]]]

type ParseResult2 = TokenParser<"*message* - <*author*>">;
// => [["message", ["", " - <"]], ["author", ["", ">"]]]

type ParseResult3 = TokenParser<"[*type*] *message*">;
// => [["type", ["[", "] "]], ["message", ["", ""]]]

Don't worry if this looks a bit scary; let's break it down.

Our parser will return an array of TokenResults, where each TokenResult is a tuple containing two elements: the token name and a sub-tuple containing the strings that surround the token:

type TokenResult = [TokenName, [Prefix, Suffix]]

// Where

type TokenName = string;
type Prefix = string;
type Suffix = string;

So, if the commit template looks like this:

"--*value*--"

The parser will return:

[                       <-- array of TokenResults
    [                   <-- TokenResult tuple
        "value",        <-- token name
        ["--", "--"]    <-- token's prefix and suffix
    ]
]

Now, let's analyze the examples from earlier:

TokenParser<"*message*"> // => ?

The token name is "message". No other strings appear before or after the token name, so the prefix and suffix are both empty strings:

TokenParser<"*message*"> // => [["message", ["", ""]]]

Continuing with the next example:

TokenParser<"*message* - <*author*>"> // => ?

In this example, we've got two tokens, so our parser will return two TokenResults.

The first token name is "message". There is no string before the token name, so its prefix is empty. What about the suffix? Well, we'll take the part of the template from the end of the token name until the beginning of the next token, and we get that the suffix is " - <".

The name of the second token is "author". What is the prefix of this token? The answer is nothing! Since we already attributed the slice " - <" to the first token, no other string appears before this token's name, so the prefix is empty. The suffix will be the string following this token, ">".

In conclusion, we're getting the following answer:

TokenParser<"*message* - <*author*>">
// => [["message", ["", " - <"]], ["author", ["", ">"]]]

I'll leave the analysis of the last example for you as an exercise 😉

Phew, that was a lot of theory. Let's move to implementation.

Building the token parser

We'll start by defining the signature of the parser:

// `Tokens` parameter below is used to store the found tokens between calls.
type TokenParser<Template extends CommitTemplate, Tokens extends TokenResult[] = []> = /* todo */

Each TokenResult should contain the token name, its prefix, and suffix. So, we'll start by looking for the token name.
Remember, a token name is any string enclosed in asterisks:

type TokenParser<Template extends CommitTemplate, Tokens extends TokenResult[] = []> = 
    Template extends `${infer Prefix}*${infer Token}*${infer After}`
        ? /* todo */
        : never;

We split our Template type into three parts: first, the Prefix, which is the string slice that appears before the first asterisk in the template. After that comes the Token, which is the slice enclosed between the two asterisks. And last comes the rest of the string, which I named After. I didn't call it "Suffix" because this part of the string might contain another token (we'll check for that next).
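If you'd like to see the three slices in isolation, here is a small sketch. SplitOnFirstToken is a hypothetical helper introduced only for this illustration (it is not part of the linter); it simply exposes what each infer captured.

```typescript
// Hypothetical helper: exposes the three inferred slices as an object type.
type SplitOnFirstToken<Template extends string> =
    Template extends `${infer Prefix}*${infer Token}*${infer After}`
        ? { prefix: Prefix; token: Token; after: After }
        : never;

type Split1 = SplitOnFirstToken<"[*type*] *message*">;
// => { prefix: "["; token: "type"; after: "] *message*" }

// Compiles only if the inferred slices are exactly these literals.
const split1: Split1 = { prefix: "[", token: "type", after: "] *message*" };
```

Notice that After still contains the second token, which is exactly why it can't be used as the suffix yet.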

Since the CommitTemplate type requires at least one token, we return never if the template doesn't contain one, to indicate that the provided template is invalid.

Extracting the token's suffix

We have the token's name and its prefix. All that is left is to extract its suffix from the After type. But how?

Let's think about it for a moment. Either the After type contains one or more tokens, or it doesn't. If it does, the string slice before the next token is the suffix. If it doesn't, the suffix is the entire string.

Let's begin by checking if the After type contains a token or not:

type TokenParser<Template extends CommitTemplate, Tokens extends TokenResult[] = []> = 
    Template extends `${infer Prefix}*${infer Token}*${infer After}`
        ? After extends `${infer Suffix}*${string}*${string}` // notice the asterisks!
            ? /* todo */
            : [...Tokens, [Token, [Prefix, After]]]
        : never;

We look for a token in the After type. If no token is found, we know that our suffix is the entire string. We append our TokenResult to the Tokens array and return it.

If a token is found, we will extract the suffix from the string. To do that, we'll use a helper type, StripLeft. This type accepts two strings and "strips" the second string from the beginning of the first string:

type StripLeft<T extends string, U extends string> = T extends `${U}${infer V}` ? V : T;
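Here is a quick sketch of StripLeft on its own. The type names Stripped1 and Stripped2 are just labels chosen for this example.

```typescript
type StripLeft<T extends string, U extends string> = T extends `${U}${infer V}` ? V : T;

// The string starts with "] ", so that part is removed.
type Stripped1 = StripLeft<"] *message*", "] ">; // => "*message*"

// The string does not start with "] ", so it is returned unchanged.
type Stripped2 = StripLeft<"*message*", "] ">; // => "*message*"

// These assignments compile only if the types resolve to the literals shown.
const stripped1: Stripped1 = "*message*";
const stripped2: Stripped2 = "*message*";
```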

Now we can implement the last branch of our TokenParser type:

type TokenParser<Template extends CommitTemplate, Tokens extends TokenResult[] = []> = 
    Template extends `${infer Prefix}*${infer Token}*${infer After}`
        ? After extends `${infer Suffix}*${string}*${string}` // notice the asterisks!
            ? TokenParser<StripLeft<After, Suffix>, [...Tokens, [Token, [Prefix, Suffix]]]>
            : [...Tokens, [Token, [Prefix, After]]]
        : never;

Let's run some tests:

type TokenParseResult1 = TokenParser<"*message*">;
// => [["message", ["", ""]]]

type TokenParseResult2 = TokenParser<"*message* - <*author*>">;
// => [["message", ["", " - <"]], ["author", ["", ">"]]]

type TokenParseResult3 = TokenParser<"[*type*] *message*">;
// => [["type", ["[", "] "]], ["message", ["", ""]]]

Great, our parser is ready! 🥳
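To see that the recursion handles more than two tokens, here is a self-contained sketch with a three-token template. Two things are assumptions of this sketch rather than part of the challenge: the template "*type*(*scope*): *message*" is made up, and Template is constrained to string instead of CommitTemplate so that the recursive call needs no extra proof that the remainder is still a valid template.

```typescript
type TokenResult = [string, [string, string]];

type StripLeft<T extends string, U extends string> = T extends `${U}${infer V}` ? V : T;

// Assumption: `Template extends string` (relaxed from `CommitTemplate`).
type TokenParser<Template extends string, Tokens extends TokenResult[] = []> =
    Template extends `${infer Prefix}*${infer Token}*${infer After}`
        ? After extends `${infer Suffix}*${string}*${string}`
            ? TokenParser<StripLeft<After, Suffix>, [...Tokens, [Token, [Prefix, Suffix]]]>
            : [...Tokens, [Token, [Prefix, After]]]
        : never;

// Round 1: Token = "type",    Suffix = "(",   remainder = "*scope*): *message*"
// Round 2: Token = "scope",   Suffix = "): ", remainder = "*message*"
// Round 3: Token = "message", no further token, so After ("") is the suffix
type ThreeTokens = TokenParser<"*type*(*scope*): *message*">;

// Compiles only if the parser produced exactly this tuple.
const threeTokens: ThreeTokens = [
    ["type", ["", "("]],
    ["scope", ["", "): "]],
    ["message", ["", ""]],
];
```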

Before constructing our linter, remember we mentioned having a Constraints type? We need a Validator type to validate our commit message tokens against these constraints. Let's build it next 💪

Enforcing constraints with a validator

Let's call this type ValidateValue and define its signature:

type ValidateValue<C extends Constraints, TokenName extends keyof C, Value extends string, Checks = C[TokenName]> = /* todo */

Our validator accepts a Constraints object, a token name, and the token's value in the commit message.

To better understand what Value represents, let's imagine our commit template is [*kind*] *message*, and the actual commit message is [feat] implemented sso login. For the first token, kind, its Value will be feat; for the second token, message, its Value will be implemented sso login.

Last, we have the Checks type, which is set to the specific constraints of TokenName.

We begin by looking for various constraints in the Checks type. If they match, we'll validate the provided Value against the appropriate validator:

type ValidateValue<C extends Constraints, TokenName extends keyof C, Value extends string, Checks = C[TokenName]> = 
    TokenName extends string
        ? Checks extends { required: boolean }
            ? /* todo: validate required value */
            : Checks extends { kind: string|number }
                ? /* todo: validate value's kind (i.e., enum type) */
                : Checks extends { minLength: number }
                    ? /* todo: validate value's min length */
                    : Checks extends { maxLength: number }
                        ? /* todo: validate value's max length */
                        : null // all checks pass
        : 'Invalid key';

First, we ensure that TokenName is a string (and not a symbol, for example). This will be useful when we need to print an error message.

Then, we compare the Checks object to each type of constraint. If it matches, we perform the check. The validator returns null if all the checks pass. Otherwise, it returns an error message.

We'll build a separate validator for each type of check, which we'll use in the respective branch in our ValidateValue type. We'll start with the RequiredValidator type:

type RequiredValidator<Value extends string, IsRequired extends boolean> = IsRequired | Value extends true | '' ? false : true;

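This validator accepts a string value and an IsRequired boolean that denotes whether the value is required. It returns false only when the value is required and empty, and true in every other case. Note that the condition parses as (IsRequired | Value) extends (true | ''), so it compares one union against another rather than checking each operand separately.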

type RequiredValidatorResult1 = RequiredValidator<'', true>; // => false

type RequiredValidatorResult2 = RequiredValidator<'value', true>; // => true

type RequiredValidatorResult3 = RequiredValidator<'', false>; // => true
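The one-liner above is compact, so here is a sketch that spells out how it evaluates. It uses the same validator with explicit parentheses (which, as far as I can tell, is how the compiler groups the original expression) and lists all four combinations. The Case1 to Case4 names are just labels for this example.

```typescript
// Same validator, with explicit parentheses around both unions.
type RequiredValidator<Value extends string, IsRequired extends boolean> =
    (IsRequired | Value) extends (true | '') ? false : true;

//                                                 IsRequired | Value   fits true | ''?
type Case1 = RequiredValidator<'', true>;       // true | ''            yes => false
type Case2 = RequiredValidator<'value', true>;  // true | 'value'       no  => true
type Case3 = RequiredValidator<'', false>;      // false | ''           no  => true
type Case4 = RequiredValidator<'value', false>; // false | 'value'      no  => true

// These assignments compile only if each case resolves as described.
const case1: Case1 = false;
const case2: Case2 = true;
const case3: Case3 = true;
const case4: Case4 = true;
```

Only the combination "required and empty" fits entirely inside true | '', which is why it is the only one that fails.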

The next validator we'll look at is the KindValidator type:

type KindValidator<Value extends string, Options extends string|number> = Value extends Options ? true : false;

Options is a union that contains all the possible values for Value.

type KindValidatorResult1 = KindValidator<'debug', 'info'|'warning'|'error'>; // => false

type KindValidatorResult2 = KindValidator<'warning', 'info'|'warning'|'error'>; // => true

For the last two validators, MinLengthValidator and MaxLengthValidator, we'll first build a helper type, WordLength:

type WordLength<W extends string, Counter extends number[] = []> = W extends ''
    ? Counter['length']
    : W extends `${string}${infer T}`
        ? WordLength<T, [...Counter, 1]>
        : [...Counter, 1]['length'];

type WordLengthResult1 = WordLength<''>; // => 0
type WordLengthResult2 = WordLength<'typescript'>; // => 10

This type accepts a string and returns the length of that string.
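Types can't do arithmetic directly, so WordLength counts by growing a tuple and reading its length. Here is a sketch that traces the recursion for a short word; the word 'git' and the GitLength name are arbitrary choices for this example.

```typescript
type WordLength<W extends string, Counter extends number[] = []> = W extends ''
    ? Counter['length']
    : W extends `${string}${infer T}`
        ? WordLength<T, [...Counter, 1]>
        : [...Counter, 1]['length'];

// Trace for WordLength<'git'>:
//   W = 'git', Counter = []        -> drop 'g', push 1
//   W = 'it',  Counter = [1]       -> drop 'i', push 1
//   W = 't',   Counter = [1, 1]    -> drop 't', push 1
//   W = '',    Counter = [1, 1, 1] -> Counter['length'] is 3
type GitLength = WordLength<'git'>; // => 3

// Compiles only if GitLength is the literal type 3.
const gitLength: GitLength = 3;
```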

Now, we're ready to build our last two validators. Let's start with MinLengthValidator:

type MinLengthValidator<Value extends string, Length extends number, Counter extends number[] = []> = Counter['length'] extends Length
    ? true
    : Counter['length'] extends WordLength<Value>
        ? false
        : MinLengthValidator<Value, Length, [...Counter, 1]>;

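We grow Counter by one element every round until its length equals either Length or the length of Value.

The validator returns true if the counter reaches Length first (or at the same time as it reaches the length of Value), which means Value has at least Length characters. It returns false if the counter reaches the length of Value first.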

type MinLengthResult1 = MinLengthValidator<'hello', 4>; // => true

type MinLengthResult2 = MinLengthValidator<'', 4>; // => false

type MinLengthResult3 = MinLengthValidator<'', 0>; // => true
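Here is a sketch that traces one failing and one passing case round by round. The inputs and the TooShort / ExactlyMin names are arbitrary choices for this example.

```typescript
type WordLength<W extends string, Counter extends number[] = []> = W extends ''
    ? Counter['length']
    : W extends `${string}${infer T}`
        ? WordLength<T, [...Counter, 1]>
        : [...Counter, 1]['length'];

type MinLengthValidator<Value extends string, Length extends number, Counter extends number[] = []> =
    Counter['length'] extends Length
        ? true
        : Counter['length'] extends WordLength<Value>
            ? false
            : MinLengthValidator<Value, Length, [...Counter, 1]>;

// Trace for MinLengthValidator<'hi', 3> (WordLength<'hi'> is 2):
//   counter 0: not 3, not 2 -> grow
//   counter 1: not 3, not 2 -> grow
//   counter 2: not 3, but equals 2 -> false ('hi' ran out first)
type TooShort = MinLengthValidator<'hi', 3>; // => false

// For MinLengthValidator<'hi', 2>, counter 2 already satisfies the first check.
type ExactlyMin = MinLengthValidator<'hi', 2>; // => true

const tooShort: TooShort = false;
const exactlyMin: ExactlyMin = true;
```

The order of the two checks matters: because Length is tested first, a value whose length equals the minimum passes.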

For the last validator, MaxLengthValidator, instead of duplicating the code, we'll use MinLengthValidator and flip its result:

type MaxLengthValidator<Value extends string, Length extends number> = MinLengthValidator<Value, Length> extends true ? false : true;

Since we're reversing the result of MinLengthValidator there is an edge case we need to watch out for:

type MaxLengthResult = MaxLengthValidator<'hello', 5>; // => false???

Since the word "hello" is five letters long, MinLengthValidator has returned true - which our validator flipped, and we got false.

The solution is simple: first, check whether the length of Value equals Length. If it does, return true; otherwise, call MinLengthValidator and flip its result:

type MaxLengthValidator<Value extends string, Length extends number> = WordLength<Value> extends Length
    ? true
    : MinLengthValidator<Value, Length> extends true ? false : true;

type MaxLengthResult = MaxLengthValidator<'hello', 5>; // => true ✅
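For completeness, here is a self-contained sketch that checks the three interesting cases: shorter than, equal to, and longer than the limit. The Shorter / Exact / Longer names are just labels for this example.

```typescript
type WordLength<W extends string, Counter extends number[] = []> = W extends ''
    ? Counter['length']
    : W extends `${string}${infer T}`
        ? WordLength<T, [...Counter, 1]>
        : [...Counter, 1]['length'];

type MinLengthValidator<Value extends string, Length extends number, Counter extends number[] = []> =
    Counter['length'] extends Length
        ? true
        : Counter['length'] extends WordLength<Value>
            ? false
            : MinLengthValidator<Value, Length, [...Counter, 1]>;

type MaxLengthValidator<Value extends string, Length extends number> =
    WordLength<Value> extends Length
        ? true
        : MinLengthValidator<Value, Length> extends true ? false : true;

type Shorter = MaxLengthValidator<'hello', 10>; // 5 < 10 => true
type Exact = MaxLengthValidator<'hello', 5>;    // 5 = 5  => true (the edge case)
type Longer = MaxLengthValidator<'hello', 3>;   // 5 > 3  => false

// These assignments compile only if the types resolve as described.
const shorter: Shorter = true;
const exact: Exact = true;
const longer: Longer = false;
```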

Now we can finish our ValidateValue type:

type ValidateValue<C extends Constraints, TokenName extends keyof C, Value extends string, Checks = C[TokenName]> = 
    TokenName extends string
        ? Checks extends { required: boolean }
            ? RequiredValidator<Value, Checks['required']> extends true
                ? ValidateValue<C, TokenName, Value, Omit<Checks, 'required'>>
                : `${TokenName} must not be empty`
            : Checks extends { kind: string|number }
                ? KindValidator<Value, Checks['kind']> extends true
                    ? ValidateValue<C, TokenName, Value, Omit<Checks, 'kind'>>
                    : `Invalid value "${Value}"`
                : Checks extends { minLength: number }
                    ? MinLengthValidator<Value, Checks['minLength']> extends true
                        ? ValidateValue<C, TokenName, Value, Omit<Checks, 'minLength'>>
                        : `${TokenName} should be at least ${Checks['minLength']} characters long.`
                    : Checks extends { maxLength: number }
                        ? MaxLengthValidator<Value, Checks['maxLength']> extends true
                            ? ValidateValue<C, TokenName, Value, Omit<Checks, 'maxLength'>>
                            : `${TokenName} cannot exceed ${Checks['maxLength']} characters.`
                        : null // all checks pass
        : 'Invalid key';

This code should make sense to you now. After performing a check, if the result is positive, we call ValidateValue again, removing this check from the Checks object. If the result is negative, we return immediately with an appropriate error message. Let's run some tests:

type DemoConstraints = {
    level: { required: true, kind: 'debug'|'info'|'warn'|'error' },
    message: { required: true, minLength: 10, maxLength: 30 }
}

type ValidationResult1 = ValidateValue<DemoConstraints, 'level', ''>;
// => "level must not be empty"

type ValidationResult2 = ValidateValue<DemoConstraints, 'level', 'critical'>;
// => 'Invalid value "critical"'

type ValidationResult3 = ValidateValue<DemoConstraints, 'message', 'test'>;
// => "message should be at least 10 characters long."

By the way, now you also understand why we had to ensure that TokenName is a string: we can only interpolate it into an error message if it's a string or a number, whereas its original type (keyof C) may be string | number | symbol.
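The tests so far cover failing values only. Here is a self-contained sketch that also exercises the passing path and the maxLength branch. It repeats the types from this section so it can be pasted into a single file; the two sample messages and the Passing / TooLong names are assumptions made for this example.

```typescript
type Constraints = Record<string, {
    required?: boolean;
    kind?: string | number;
    minLength?: number;
    maxLength?: number;
}>;

type RequiredValidator<Value extends string, IsRequired extends boolean> =
    IsRequired | Value extends true | '' ? false : true;
type KindValidator<Value extends string, Options extends string | number> =
    Value extends Options ? true : false;
type WordLength<W extends string, Counter extends number[] = []> = W extends ''
    ? Counter['length']
    : W extends `${string}${infer T}`
        ? WordLength<T, [...Counter, 1]>
        : [...Counter, 1]['length'];
type MinLengthValidator<Value extends string, Length extends number, Counter extends number[] = []> =
    Counter['length'] extends Length
        ? true
        : Counter['length'] extends WordLength<Value>
            ? false
            : MinLengthValidator<Value, Length, [...Counter, 1]>;
type MaxLengthValidator<Value extends string, Length extends number> =
    WordLength<Value> extends Length
        ? true
        : MinLengthValidator<Value, Length> extends true ? false : true;

type ValidateValue<C extends Constraints, TokenName extends keyof C, Value extends string, Checks = C[TokenName]> =
    TokenName extends string
        ? Checks extends { required: boolean }
            ? RequiredValidator<Value, Checks['required']> extends true
                ? ValidateValue<C, TokenName, Value, Omit<Checks, 'required'>>
                : `${TokenName} must not be empty`
            : Checks extends { kind: string | number }
                ? KindValidator<Value, Checks['kind']> extends true
                    ? ValidateValue<C, TokenName, Value, Omit<Checks, 'kind'>>
                    : `Invalid value "${Value}"`
                : Checks extends { minLength: number }
                    ? MinLengthValidator<Value, Checks['minLength']> extends true
                        ? ValidateValue<C, TokenName, Value, Omit<Checks, 'minLength'>>
                        : `${TokenName} should be at least ${Checks['minLength']} characters long.`
                    : Checks extends { maxLength: number }
                        ? MaxLengthValidator<Value, Checks['maxLength']> extends true
                            ? ValidateValue<C, TokenName, Value, Omit<Checks, 'maxLength'>>
                            : `${TokenName} cannot exceed ${Checks['maxLength']} characters.`
                        : null // all checks pass
        : 'Invalid key';

type DemoConstraints = {
    level: { required: true, kind: 'debug' | 'info' | 'warn' | 'error' },
    message: { required: true, minLength: 10, maxLength: 30 }
};

// 13 characters: passes required, minLength and maxLength, so every check is peeled off.
type Passing = ValidateValue<DemoConstraints, 'message', 'fix login bug'>;
// 36 characters: passes required and minLength, then fails maxLength.
type TooLong = ValidateValue<DemoConstraints, 'message', 'this message is far too long to pass'>;

const passing: Passing = null;
const tooLong: TooLong = 'message cannot exceed 30 characters.';
```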

Assembling our linter

We have all the parts of our linter ready to be assembled. Let's start as usual by defining the signature of our linter:

type CommitLinter<Message extends CommitMessage, Template extends CommitTemplate, C extends Constraints, Tokens = TokenParser<Template>> = /* todo */;

Our linter inspects one token at a time and matches the beginning of Message to that token pattern. If all the tokens have been matched successfully (i.e., the Tokens array is empty), it returns a success value, null.

type CommitLinter<Message extends CommitMessage, Template extends CommitTemplate, C extends Constraints, Tokens = TokenParser<Template>> = 
    Tokens extends [infer Token, ...infer RestTokens]
        ? Token extends [infer TokenName extends string, [infer Prefix extends string, infer Suffix extends string]]
            ? /* todo */
            : 'Invalid token'
        : null;

We extract the first token's name, prefix, and suffix from the Tokens array. Then, we match Message to the pattern of this token:

type CommitLinter<Message extends CommitMessage, Template extends CommitTemplate, C extends Constraints, Tokens = TokenParser<Template>> = 
    Tokens extends [infer Token, ...infer RestTokens]
        ? Token extends [infer TokenName extends string, [infer Prefix extends string, infer Suffix extends string]]
            ? Message extends `${Prefix}${infer Value}${Suffix}${infer RestOfMessage}`
                ? /* todo */
                : 'Commit message does not match the template'
            : 'Invalid token'
        : null;

Notice the two infer keywords in the string we match to Message. This will become significant later on.

If the message does not match the token's pattern, we return an error message. If it does, we extract the token's value and the remainder of the commit message.
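To see this matching step on its own, here is a small sketch. MatchToken is a hypothetical helper written only for this illustration (the linter inlines the same pattern); it returns the extracted value and the rest of the message, or the error string when the pattern doesn't fit.

```typescript
// Hypothetical helper: the same pattern the linter uses, isolated for inspection.
type MatchToken<Message extends string, Prefix extends string, Suffix extends string> =
    Message extends `${Prefix}${infer Value}${Suffix}${infer RestOfMessage}`
        ? { value: Value; rest: RestOfMessage }
        : 'Commit message does not match the template';

// Token "change" from "[*change*] *message*." has prefix "[" and suffix "] ".
type Matched = MatchToken<'[feat] redesign login page.', '[', '] '>;
// => { value: "feat"; rest: "redesign login page." }

// No leading "[", so the pattern cannot match.
type NotMatched = MatchToken<'feat: redesign login page.', '[', '] '>;
// => 'Commit message does not match the template'

const matched: Matched = { value: 'feat', rest: 'redesign login page.' };
const notMatched: NotMatched = 'Commit message does not match the template';
```

Here the non-empty suffix acts as a delimiter, so Value stops right before it.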

type CommitLinter<Message extends CommitMessage, Template extends CommitTemplate, C extends Constraints, Tokens = TokenParser<Template>> = 
    Tokens extends [infer Token, ...infer RestTokens]
        ? Token extends [infer TokenName extends string, [infer Prefix extends string, infer Suffix extends string]]
            ? Message extends `${Prefix}${infer Value}${Suffix}${infer RestOfMessage}`
                ? ValidateValue<C, TokenName, Value> extends `${infer Error}`
                    ? Error
                    : CommitLinter<RestOfMessage, Template, C, RestTokens>
                : 'Commit message does not match the template'
            : 'Invalid token'
        : null;

We run the token's value through the ValidateValue validator. This validator returns either null or an error message. If it has produced an error message, we return it. Otherwise, we call CommitLinter recursively with the remainder of the commit message and the remainder of the tokens array.

It looks like our linter is ready, right? Let's test it:

type DemoConstraints = {
    change: { required: true, kind: 'feat'|'fix' },
    message: { required: true, minLength: 10, maxLength: 30 }
}

type CommitLinterResult1 = CommitLinter<'[feat] redesign login page.', '[*change*] *message*.', DemoConstraints>;
// => null

type CommitLinterResult2 = CommitLinter<'[foo] redesign login page.', '[*change*] *message*.', DemoConstraints>;
// => 'Invalid value "foo"'

type CommitLinterResult3 = CommitLinter<'redesign login page', '*message*', DemoConstraints>;
// => "message should be at least 10 characters long." ???

What happened in the third test? Why are we getting an error message? Our commit message is longer than ten characters!

Well, here's where the double infer keywords I mentioned earlier come into play 🤦

To understand the problem with two infer keywords, let's perform a little experiment:

type DoubleInfer<T extends string> = T extends `${infer Start}${infer End}` ? [Start, End] : never;

type Experiment1 = DoubleInfer<'abc'>; // => ['a', 'bc']
type Experiment2 = DoubleInfer<'a'>; // => ['a', '']

When we match a string to a pattern that contains two or more adjacent infer keywords (with nothing between them), TypeScript "slices" the string so that each inferred type matches a single character, except for the last one, which matches the remainder of the string.
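Here is a sketch of the same behavior with three adjacent infer positions, next to a pattern that has a delimiter between them. TripleInfer and WithDelimiter are hypothetical helpers introduced only for this comparison.

```typescript
// Three adjacent infer positions: the first two take one character each.
type TripleInfer<T extends string> =
    T extends `${infer A}${infer B}${infer C}` ? [A, B, C] : never;

type Triple1 = TripleInfer<'abcd'>; // => ['a', 'b', 'cd']

// With a literal between the infer positions, the first one extends up to that literal.
type WithDelimiter<T extends string> =
    T extends `${infer Head}.${infer Tail}` ? [Head, Tail] : never;

type Delimited1 = WithDelimiter<'redesign login page.'>; // => ['redesign login page', '']

// These assignments compile only if the types resolve to the tuples shown.
const triple1: Triple1 = ['a', 'b', 'cd'];
const delimited1: Delimited1 = ['redesign login page', ''];
```

This is why the first two linter tests behaved: their tokens had a non-empty suffix that acted as a delimiter, while the third test's token had an empty prefix and an empty suffix.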

Now, let's pass our commit message to DoubleInfer and compare the result to the result of the match statement of our linter:

// Our experiment:
DoubleInfer<'redesign login page'>; // ['r', 'edesign login page']

// Our linter:
Message extends `${Prefix}${infer Value}${Suffix}${infer RestOfMessage}`
// => Prefix='', Value='r', Suffix='', RestOfMessage='edesign login page'

And now let's try to validate this Value:

ValidateValue<C, TokenName, Value /* only 'r'! */> extends `${infer Message}`
// => Validation error: "message should be at least 10 characters long."

Because the problem is rooted in the double infer keywords and only in cases where the entire string matches one token, the solution will be to match the entire string by using a single infer keyword.

Here's a suggested solution:

type CommitLinter<Message extends CommitMessage, Template extends CommitTemplate, C extends Constraints, Tokens = TokenParser<Template>> = 
    Tokens extends [infer Token, ...infer RestTokens]
        ? Token extends [infer TokenName extends string, [infer Prefix extends string, infer Suffix extends string]]
            ? Message extends `${Prefix}${infer Value}${Suffix}${infer RestOfMessage}`
                ? ValidateValue<C, TokenName, Value> extends `${infer Error}` // first validation
                    ? Message extends `${Prefix}${infer Value}${Suffix}`
                        ? ValidateValue<C, TokenName, Value> extends `${infer Error}` // second validation
                            ? Error
                            : CommitLinter<'', Template, C, RestTokens>
                        : Error
                    : CommitLinter<RestOfMessage, Template, C, RestTokens>
                : 'Commit message does not match the template'
            : 'Invalid token'
        : null;

Notice that if the first validation fails, we don't return the error message immediately. Instead, we try to match the message to the token again, this time without the RestOfMessage inferred type. That leaves a single inferred type, which matches the entire message.

Let's run the previous tests again:

type DemoConstraints = {
    change: { required: true, kind: 'feat'|'fix' },
    message: { required: true, minLength: 10, maxLength: 30 }
}

type CommitLinterResult1 = CommitLinter<'[feat] redesign login page.', '[*change*] *message*.', DemoConstraints>;
// => null

type CommitLinterResult2 = CommitLinter<'[foo] redesign login page.', '[*change*] *message*.', DemoConstraints>;
// => 'Invalid value "foo"'

type CommitLinterResult3 = CommitLinter<'redesign login page', '*message*', DemoConstraints>;
// => null

This time, all the tests pass. 🥳

Congratulations, we've made it! All that's left is to incorporate that into Git hooks, but that's a topic for another post 😉

Thank you for reading this post. If you have any questions/suggestions/comments, please write them down in the comments. I'd love to hear your thoughts. 🤗

See you next article ✍️👋🏼
