DEV Community

B. Camphart

Integration Testing in Obsidian

When I was writing my own plugin for Obsidian, and when I contributed to others, I was never sure whether the unit tests I was writing were, well, accurate to how the plugin would actually behave when run inside Obsidian.

This article aims to show how I discovered a method of running tests inside Obsidian, and to help other plugin developers do so.

The Obsidian API has a lot of functionality, but it lacks a lot of documentation. Plus, there were questions I had about its behavior that documentation simply couldn't answer. I needed tests. I could build the plugin, load it into my local Obsidian test vault, and play with it manually to check that things worked, but this was time-consuming, error-prone, and unreliable.

Using something like Vitest, Jest, or Mocha didn't work because:

  1. They load test files dynamically at runtime, but an Obsidian plugin is limited to a single main.js file.
  2. Their test runners either aren't easily isolated or require external dependencies that can't be resolved at runtime.

I needed a way to embed the tests, and the testing framework, at build time.

I realized I had to build a framework myself. So I did.

If you don't care how I did it, and just want to use the framework, you can find it on GitHub. If you're curious, or want a more in-depth understanding of how to use the framework, keep reading.

I'm assuming you understand how to build an Obsidian plugin, how to write effective unit tests, and are familiar with the Obsidian API.

Test Builder Methods

If I was going to do this, the first thing I needed to do was define an interface for all the methods I would need in a testing framework. However, I also knew that I didn't want to implement a bunch of functionality that I didn't really need, so I started small:

export type Awaitable<T> = Promise<T> | T;

export interface TestSuite {
    describe(name: string, fn: () => void): void;
    it(name: string, fn: () => Awaitable<void>): void;
    beforeAll(fn: () => Awaitable<void>): void;
    beforeEach(fn: () => Awaitable<void>): void;
    afterEach(fn: () => Awaitable<void>): void;
    afterAll(fn: () => Awaitable<void>): void;
}

The tests, setup, and teardown methods could be synchronous, or asynchronous, but anything describing a suite of tests would need to happen synchronously.

This is really just a reduced version of the interface that Vitest, Jest, and the rest provide.

I could have just defined these functions at the top-level and, during my integration tests, had the import for Vitest get replaced with an import to where I'd defined these functions, but that would make it so that the test code was effectively lying. Any developer looking at the code later would think issues that arose came from Vitest, but in reality, the code with which I had magically replaced Vitest would be the source of the problem, making debugging a nightmare.

Implementing this wasn't actually very difficult. You're just constructing a tree of test suites, and the leaves are the tests, setup, and teardown methods.

type Suite = {
    readonly name: string;
    readonly parent?: Suite;
    children: (Suite | Test)[];
    beforeAll: BeforeAllFn[];
    beforeEach: BeforeEachFn[];
    afterEach: AfterEachFn[];
    afterAll: AfterAllFn[];
};

type Test = {
    readonly name: string;
    readonly parent: Suite;
    run(): Awaitable<void>;
};

Because all the builder methods happen synchronously, I could just keep track of which suite was currently being built, and go back to its parent after each describe block. Here's my implementation:

export default class TestBuilder implements TestSuite {
    private readonly root: Suite;
    private current: Suite;

    constructor(name: string = "") {
        this.root = {
            name,
            children: [],
            beforeAll: [],
            beforeEach: [],
            afterEach: [],
            afterAll: [],
        };
        this.current = this.root;
    }

    describe(name: string, fn: () => void): void {
        const parent = this.current;
        const child: Suite = {
            name,
            parent,
            children: [],
            beforeAll: [],
            beforeEach: [],
            afterEach: [],
            afterAll: [],
        };
        parent.children.push(child);
        this.current = child;
        fn();
        this.current = parent;
    }

    it(name: string, fn: TestFn): void {
        const test: Test = {
            name,
            parent: this.current,
            run: fn,
        };
        this.current.children.push(test);
    }

    beforeAll(fn: BeforeAllFn): void {
        this.current.beforeAll.push(fn);
    }

    beforeEach(fn: BeforeEachFn): void {
        this.current.beforeEach.push(fn);
    }

    afterEach(fn: AfterEachFn): void {
        this.current.afterEach.push(fn);
    }

    afterAll(fn: AfterAllFn): void {
        this.current.afterAll.push(fn);
    }

    /** 
     * Return a readonly copy of the test tree
     */
    definition() {
        return {
            ...this.root,
        };
    }

    /**
     * Bind all the methods to this instance for destructuring
     */
    bound(): TestSuite {
        return {
            describe: this.describe.bind(this),
            it: this.it.bind(this),
            beforeAll: this.beforeAll.bind(this),
            beforeEach: this.beforeEach.bind(this),
            afterEach: this.afterEach.bind(this),
            afterAll: this.afterAll.bind(this),
        }
    }
}

I used some type aliases to make the types of methods we're working with more apparent, but all the *Fn (TestFn, BeforeAllFn, etc.) types are functions that return an Awaitable<void>.
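
For reference, the aliases might be declared along these lines. This is a minimal sketch based only on the description above; the exact declarations in the framework may differ, and the two sample hooks at the end are purely illustrative.

```typescript
// A value that may or may not be wrapped in a Promise.
type Awaitable<T> = Promise<T> | T;

// Assumed shape: each alias is a zero-argument function returning Awaitable<void>.
type TestFn = () => Awaitable<void>;
type BeforeAllFn = () => Awaitable<void>;
type BeforeEachFn = () => Awaitable<void>;
type AfterEachFn = () => Awaitable<void>;
type AfterAllFn = () => Awaitable<void>;

// Both synchronous and asynchronous functions satisfy the aliases,
// so a runner can simply `await` whatever it is given.
const syncSetup: BeforeEachFn = () => {};
const asyncSetup: BeforeEachFn = async () => {};
const hooks: BeforeEachFn[] = [syncSetup, asyncSetup];
console.log(hooks.length);
```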

This interface and implementation now allowed me to write tests against the interface, then decide, based on the build configuration, which test framework to run them with.
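
The bound() method matters because a class method that is destructured off its instance loses its `this`. Here is a small standalone sketch of the problem, using a hypothetical MiniBuilder class as a stand-in for TestBuilder (it is not part of the framework):

```typescript
// MiniBuilder is a hypothetical stand-in for TestBuilder, reduced to one method.
class MiniBuilder {
    readonly names: string[] = [];

    it(name: string): void {
        this.names.push(name);
    }

    // Same idea as TestBuilder.bound(): return methods tied to this instance.
    bound(): { it: (name: string) => void } {
        return { it: this.it.bind(this) };
    }
}

const builder = new MiniBuilder();

// Destructuring the raw method detaches it from the instance,
// so `this` is not the builder when it runs and the call throws.
const { it: unboundIt } = builder;
let unboundFailed = false;
try {
    unboundIt("never recorded");
} catch {
    unboundFailed = true;
}

// Destructuring from bound() keeps the method attached to the instance.
const { it: boundIt } = builder.bound();
boundIt("recorded");

console.log(unboundFailed, builder.names);
```

Anything that destructures `{ describe, it, beforeEach }` from a suite, as the contracts in the next section do, needs the bound version when the suite is a class instance.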

Contracts

If you're unfamiliar with contracts, at least as they apply to software testing, let me quickly define them:

A contract describes the expected behavior of an interface.

It sounds just like any other test, but, in this case, the contract doesn't know what class it's testing, so it has to focus on the public methods alone. A contract doesn't know how the class it's testing is constructed, and it doesn't know anything about the underlying data. Because of this, a contract must be designed to work with anything that adheres to the interface it's testing. Crucially, that means it can be used to test multiple implementations of the same interface.

In my typical method of defining a contract, I would just create a function that accepts a factory, then define the tests within the function, using whatever testing framework I had available. Like this:

import { describe, it, beforeEach } from "vitest";
import { type TodoList } from "./todoList";

export function todoListContract(factory: () => TodoList) {
    describe(`Todo List Contract`, () => {
        let todoList: TodoList;
        beforeEach(() => {
            todoList = factory();
        });

        it(`does one thing`, () => {});
        it(`does something else`, () => {});
    });
}

This would allow me to test the contract against any implementation of the TodoList interface. But, because I couldn't use Vitest or anything else inside Obsidian, I had to change the way I did things to work with any test framework:

import { type TestSuite } from "my-framework";
import { type TodoList } from "./todoList";

export function todoListContract({ describe, it, beforeEach }: TestSuite) {
    return function (factory: () => TodoList) {
        describe(`Todo List Contract`, () => {
            let todoList: TodoList;
            beforeEach(() => {
                todoList = factory();
            });

            it(`does one thing`, () => {});
            it(`does something else`, () => {});
        });
    };
}

In my unit tests, I could just pass the methods that Vitest provides into the contract:

import { describe, it, beforeEach } from "vitest";
import { todoListContract } from "contracts/todoList";

describe(`Fake Todo List`, () => {

    todoListContract({ describe, it, beforeEach })(() => new FakeTodoList());

});

And then, in an integration test, I could do the following:

import type { App } from "obsidian";
import { todoListContract } from "contracts/todoList";
import type { TestSuite } from "integration/framework";

export function testRealTodoList(
    suite: TestSuite,
    app: App,
) {

    suite.describe(`Todo List in Obsidian File`, () => {

        todoListContract(suite)(() => new ObsidianFileTodoList(app));

    });

}

This allowed me to only execute these tests when the Obsidian app was available, and thus ensure that code talking to the Obsidian API was behaving the way I expected it to, as defined by the contract. I could also extend this test to make sure that certain things I knew about the ObsidianFileTodoList were correct, which would also be executed from within Obsidian.
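
To make the contract pattern concrete outside Obsidian, here is a self-contained sketch. Everything in it is an assumption made for illustration: the TodoList methods (add, items), the two in-memory implementations, and the throwaway suite that runs each test as soon as it is declared. The point is only that one contract exercises two implementations unchanged.

```typescript
// Hypothetical interface: the article never shows TodoList's real methods.
interface TodoList {
    add(item: string): void;
    items(): string[];
}

// Only the builder methods this contract needs (a subset of TestSuite).
type ContractSuite = {
    describe(name: string, fn: () => void): void;
    it(name: string, fn: () => void): void;
    beforeEach(fn: () => void): void;
};

function todoListContract({ describe, it, beforeEach }: ContractSuite) {
    return function (factory: () => TodoList) {
        describe("Todo List Contract", () => {
            let todoList: TodoList;
            beforeEach(() => {
                todoList = factory();
            });

            it("starts empty", () => {
                if (todoList.items().length !== 0) throw new Error("expected no items");
            });
            it("keeps items in insertion order", () => {
                todoList.add("a");
                todoList.add("b");
                if (todoList.items().join(",") !== "a,b") throw new Error("wrong order");
            });
        });
    };
}

// Two hypothetical implementations of the same interface.
class ArrayTodoList implements TodoList {
    private list: string[] = [];
    add(item: string) { this.list.push(item); }
    items() { return this.list.slice(); }
}
class SetTodoList implements TodoList {
    private set = new Set<string>();
    add(item: string) { this.set.add(item); }
    items() { return Array.from(this.set); }
}

// A throwaway suite that runs each test immediately and records the outcome.
const results: string[] = [];
const setups: (() => void)[] = [];
const immediate: ContractSuite = {
    describe: (_name, fn) => fn(),
    beforeEach: (fn) => { setups.push(fn); },
    it: (name, fn) => {
        try {
            setups.forEach((setup) => setup());
            fn();
            results.push(`pass: ${name}`);
        } catch {
            results.push(`fail: ${name}`);
        }
    },
};

todoListContract(immediate)(() => new ArrayTodoList());
setups.length = 0; // reset the throwaway suite between runs
todoListContract(immediate)(() => new SetTodoList());
console.log(results);
```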

But how do we actually do this in Obsidian?

Integration Test Build Configuration

I don't want the integration tests to be run, or even included, in the plugin. At first, I thought about using Vite's tree shaking and import.meta to dynamically include the integration tests at build time, but that ended up being overly complicated and fragile. Instead, I just made a new main.ts file and targeted that as the entry point for my bundler if the build mode was set to "integration".

import { defineConfig } from "vite";

export default defineConfig(env => {
    const config = // ... however you configure your build

    if (env.mode === "integration") {
        config.build.lib.entry = "integration/main.ts";
    }

    return config;
});

And the new main.ts file:

import { Plugin } from "obsidian";
import TestBuilder from "integration/framework/builder";

// import the tests
import { testRealTodoList } from "integration/tests/todoList";

export default class IntTestPlugin extends Plugin {

    onload() {
        const suite = new TestBuilder();
        // build the tests; bound() keeps `this` intact when a contract destructures the methods
        testRealTodoList(suite.bound(), this.app);
    }

}


This will bundle all of our tests into the plugin. I just imported them manually. You could probably do some dynamic importing/collecting of tests at build time with a script, or using Vite's glob imports, but, until I had a LOT of tests, I figured I didn't need to overcomplicate things.
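
If the manual imports ever do become unwieldy, the glob route could look something like the sketch below. Vite's import.meta.glob with `eager: true` produces an object that maps file paths to loaded modules; the rest is assumption: the `register` export name, the "./tests/*.ts" pattern, and the registerAll helper are all hypothetical. The glob call itself is left in a comment so the snippet runs outside a Vite build.

```typescript
// Assumed convention: each test module exports a `register` function.
type TestModule<Suite, App> = { register?: (suite: Suite, app: App) => void };

// Calls `register` on every module, in path order, and returns the paths used.
function registerAll<Suite, App>(
    modules: Record<string, TestModule<Suite, App>>,
    suite: Suite,
    app: App,
): string[] {
    const registered: string[] = [];
    for (const path of Object.keys(modules).sort()) {
        const register = modules[path]?.register;
        if (typeof register === "function") {
            register(suite, app);
            registered.push(path);
        }
    }
    return registered;
}

// Inside a Vite build, the map could come from:
//   const modules = import.meta.glob("./tests/*.ts", { eager: true });
// Here a hand-written map stands in for it.
const seen: string[] = [];
const fakeModules: Record<string, TestModule<string[], null>> = {
    "./tests/todoList.ts": { register: (suite) => { suite.push("todoList"); } },
    "./tests/helpers.ts": {}, // no `register` export, so it is skipped
};
const registeredPaths = registerAll(fakeModules, seen, null);
console.log(registeredPaths, seen);
```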

So, now that I had my framework, and my tests, wired up and bundled into a plugin, I needed to actually run those tests.

Running the Integration Tests

Somehow, I had to take the test tree that had been built, and do two things:

  1. Display all the tests that were going to be run to prove that they'd been collected.
  2. Show the actual outcome of each test.

Displaying the tests just involved walking the tree after construction and reading the names of each suite or test.

function collectTests(suite: DefinedSuite, reporter: NestedSuiteReporter) {
    reporter.collectSuite(suite.name);
    for (const child of suite.children) {
        if ('children' in child) {
            collectTests(child, reporter);
        } else {
            reporter.collectTest(child.name);
        }
    }
    reporter.suiteCollected(suite.name);
}

The NestedSuiteReporter interface just provides a separation between the structure of the defined test suite, and how it's being reported. If the reporter wants to display them in a hierarchy, it can. If it wants to only display the tests, it can. The simplest version of this would just be something that prints to the console:

export class NestedSuiteConsoleReporter implements NestedSuiteReporter {

    private nestingLevel = 0;

    collectSuite(name: string) {
        const padding = "".padEnd(this.nestingLevel*2, " ");
        console.log(padding + name);
        this.nestingLevel++;
    }

    collectTest(name: string) {
        const padding = "".padEnd(this.nestingLevel*2, " ");
        console.log(padding + name);
    }

    suiteCollected(name: string) {
        this.nestingLevel--;
    }
}
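
As an example of a reporter that ignores the hierarchy, here is a sketch that records only the tests, each with its full path. The NestedSuiteReporter, DefinedSuite, and DefinedTest shapes are inferred from how they are used above, so treat them as assumptions; FlatTestReporter is a hypothetical name.

```typescript
// Shapes inferred from the article's usage; the real definitions may differ.
interface NestedSuiteReporter {
    collectSuite(name: string): void;
    collectTest(name: string): void;
    suiteCollected(name: string): void;
}
type DefinedTest = { readonly name: string };
type DefinedSuite = {
    readonly name: string;
    readonly children: readonly (DefinedSuite | DefinedTest)[];
};

// Same tree walk as collectTests above.
function collectTests(suite: DefinedSuite, reporter: NestedSuiteReporter): void {
    reporter.collectSuite(suite.name);
    for (const child of suite.children) {
        if ("children" in child) {
            collectTests(child, reporter);
        } else {
            reporter.collectTest(child.name);
        }
    }
    reporter.suiteCollected(suite.name);
}

// Records "suite > suite > test" paths and prints nothing itself.
class FlatTestReporter implements NestedSuiteReporter {
    readonly tests: string[] = [];
    private readonly path: string[] = [];

    collectSuite(name: string): void {
        this.path.push(name);
    }
    collectTest(name: string): void {
        this.tests.push(this.path.concat(name).join(" > "));
    }
    suiteCollected(_name: string): void {
        this.path.pop();
    }
}

const tree: DefinedSuite = {
    name: "Todo List",
    children: [
        { name: "starts empty" },
        { name: "adding", children: [{ name: "appends an item" }] },
    ],
};
const flat = new FlatTestReporter();
collectTests(tree, flat);
console.log(flat.tests);
// → ["Todo List > starts empty", "Todo List > adding > appends an item"]
```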

So, then, we can show all the tests we've collected during our integration test run:

import { Plugin } from "obsidian";
import TestBuilder from "integration/framework/builder";
import { collectTests } from "integration/framework/collect";
import { NestedSuiteConsoleReporter } from "integration/framework/report";

// import the tests
import { testRealTodoList } from "integration/tests/todoList";

export default class IntTestPlugin extends Plugin {

    onload() {
        const suite = new TestBuilder();
        // build the tests; bound() keeps `this` intact when a contract destructures the methods
        testRealTodoList(suite.bound(), this.app);

        // report the collected tests
        collectTests(suite.definition(), new NestedSuiteConsoleReporter());
    }

}

Now all that's left is actually running the tests.

This becomes more complex because each test may have many ancestor suites, each with multiple beforeEach and afterEach methods defined. We also have to make sure that no test, or support method, can throw an error and break the entire framework. Additionally, each suite has to run its (possibly multiple) beforeAll and afterAll methods before allowing the next sibling suite to start. These too might throw errors. If a beforeAll method fails, then all descendant tests should be marked as failures and no additional beforeAll or beforeEach methods should be run. However, all afterEach and afterAll methods should always be run, in case something needs to be cleaned up.

Running a single test is relatively simple:

export async function runTest(
    beforeEach: readonly BeforeEachFn[],
    afterEach: readonly AfterEachFn[],
    test: DefinedTest, 
    reporter: NestedTestRunReporter
) {
    reporter.testStarting(test.name);
    let failure: any | undefined;
    try {
        for (const fn of beforeEach) {
            await fn();
        }
        await test.run();
    } catch (thrown) {
        failure = thrown;
    }

    let afterTestFailure: any | undefined;
    for (const fn of afterEach) {
        try {
            await fn();
        } catch (thrown) {
            afterTestFailure = thrown;
        }
    }

    reporter.testFinished(test.name, failure, afterTestFailure);

    return failure || afterTestFailure;
}

The first block runs each beforeEach function and stops as soon as anything throws, preventing subsequent beforeEach methods from running, and preventing the test from running with an invalid setup. The second block ensures that every afterEach method runs, regardless of any previous failures. Finally, we let the reporter know if the test failed, or if an afterEach method failed, then return either of those failures.
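
To see those rules in action, here is a condensed sketch with the same control flow but no reporter. The runOne name and the Hook alias are hypothetical, and the hooks are contrived so that one setup and one cleanup both fail.

```typescript
type Awaitable<T> = Promise<T> | T;
type Hook = () => Awaitable<void>;

// Condensed variant of runTest: same control flow, no reporter.
async function runOne(
    beforeEach: readonly Hook[],
    afterEach: readonly Hook[],
    test: Hook,
): Promise<unknown> {
    let failure: unknown;
    try {
        for (const fn of beforeEach) {
            await fn();
        }
        await test();
    } catch (thrown) {
        failure = thrown;
    }

    let afterFailure: unknown;
    for (const fn of afterEach) {
        try {
            await fn();
        } catch (thrown) {
            afterFailure = thrown;
        }
    }
    return failure || afterFailure;
}

const calls: string[] = [];
const outcome = runOne(
    [
        () => { calls.push("setup 1"); throw new Error("setup failed"); },
        () => { calls.push("setup 2"); }, // skipped: an earlier setup threw
    ],
    [
        async () => { calls.push("cleanup 1"); throw new Error("cleanup failed"); },
        () => { calls.push("cleanup 2"); }, // still runs after the failed cleanup
    ],
    () => { calls.push("test"); }, // skipped: setup never completed
);

outcome.then((failure) => {
    console.log(calls); // ["setup 1", "cleanup 1", "cleanup 2"]
    console.log((failure as Error).message); // "setup failed"
});
```

Note that the setup failure wins over the cleanup failure in the return value, because of the `failure || afterFailure` expression.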

It's when we run a test suite that things get more complex:

export async function runSuite(
    beforeEach: readonly BeforeEachFn[],
    afterEach: readonly AfterEachFn[],
    suite: DefinedSuite,
    reporter: NestedTestRunReporter
) {
    reporter.suiteStarting(suite.name);

    let combinedBeforeEach = beforeEach.concat(suite.beforeEach);
    const combinedAfterEach = suite.afterEach.concat(afterEach);

    let beforeAllFailure: any | undefined;
    try {
        for (const fn of suite.beforeAll) {
            await fn();
        }
    } catch (thrown) {
        beforeAllFailure = thrown;
    }

    if (beforeAllFailure) {
        combinedBeforeEach = [() => { throw beforeAllFailure; }];
    }

    const childFailures: any[] = [];
    for (const child of suite.children) {
        let childFailure: any | undefined;
        if ('children' in child) {
            childFailure = await runSuite(combinedBeforeEach, combinedAfterEach, child, reporter);
        } else {
            childFailure = await runTest(combinedBeforeEach, combinedAfterEach, child, reporter);
        }
        if (childFailure) {
            childFailures.push(childFailure);
        }
    }

    let afterAllFailures: any[] = [];
    for (const fn of suite.afterAll) {
        try {
            await fn();
        } catch (thrown) {
            afterAllFailures.push(thrown);
        }
    }

    reporter.suiteFinished(suite.name, childFailures, afterAllFailures);

    if (childFailures.length > 0 || afterAllFailures.length > 0) {
        return childFailures.concat(afterAllFailures);
    }

}

Some of the patterns here should be familiar from the runTest function, and from the collectTests function. If any child fails, or any of the afterAll methods fail, then the suite will return the combined array of all the failures to indicate that something went wrong during its execution. This will propagate all the way back up the hierarchy. Additionally, the reporter will be told of any failures when the suite is finished.

Some additional interesting observations:

  • The beforeEach methods defined in this suite are added to the end of the array of methods to run, but the afterEach methods of this suite are prepended. This is so that the afterEach methods in this suite will be run before the outer suites' afterEach methods.
  • If one of this suite's beforeAll methods fails, the array of beforeEach methods is hijacked to contain only a single, failing method. This will make all nested tests fail.
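
The ordering described in the first observation can be checked in isolation. This sketch reproduces only the two concat calls from runSuite for an outer and an inner suite and runs the hooks synchronously; the combine and hook helpers are hypothetical names introduced for the example.

```typescript
type Hook = () => void;

const order: string[] = [];
const hook = (label: string): Hook => () => { order.push(label); };

// Mirrors the two concat calls in runSuite for one level of nesting.
function combine(
    inherited: { beforeEach: Hook[]; afterEach: Hook[] },
    own: { beforeEach: Hook[]; afterEach: Hook[] },
) {
    return {
        beforeEach: inherited.beforeEach.concat(own.beforeEach), // own setup goes last
        afterEach: own.afterEach.concat(inherited.afterEach), // own teardown goes first
    };
}

const root = { beforeEach: [] as Hook[], afterEach: [] as Hook[] };
const outer = combine(root, {
    beforeEach: [hook("outer beforeEach")],
    afterEach: [hook("outer afterEach")],
});
const inner = combine(outer, {
    beforeEach: [hook("inner beforeEach")],
    afterEach: [hook("inner afterEach")],
});

// Run one test that lives in the inner suite.
inner.beforeEach.forEach((fn) => fn());
order.push("test");
inner.afterEach.forEach((fn) => fn());

console.log(order);
// → ["outer beforeEach", "inner beforeEach", "test", "inner afterEach", "outer afterEach"]
```

Setup runs from the outside in and teardown from the inside out, the same nesting order you would get from Vitest or Jest.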

Finally, we would use it like this:

import { Plugin } from "obsidian";
import TestBuilder from "integration/framework/builder";
import { collectTests } from "integration/framework/collect";
import { NestedSuiteConsoleReporter } from "integration/framework/report";
import { runSuite } from "integration/framework/run";

// import the tests
import { testRealTodoList } from "integration/tests/todoList";

export default class IntTestPlugin extends Plugin {

    onload() {
        const suite = new TestBuilder();
        // build the tests; bound() keeps `this` intact when a contract destructures the methods
        testRealTodoList(suite.bound(), this.app);

        const reporter = new NestedSuiteConsoleReporter();
        const defined = suite.definition();
        // report the collected tests
        collectTests(defined, reporter);

        // run the collected tests
        this.app.workspace.onLayoutReady(() => {
            runSuite([], [], defined, reporter);
        });
    }
}

I'll let you implement the NestedTestRunReporter in the NestedSuiteConsoleReporter class yourself. Or, you can just use the code I've already written for you.
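
If you do want to write it yourself, here is one possible shape. The NestedTestRunReporter interface below is inferred from the calls made in runTest and runSuite, so it is an assumption rather than the framework's actual definition. The IndentedRunReporter name and the injectable `log` function are my own additions so the output can be captured instead of printed.

```typescript
// Interface inferred from how runTest and runSuite call the reporter.
interface NestedTestRunReporter {
    suiteStarting(name: string): void;
    suiteFinished(name: string, childFailures: unknown[], afterAllFailures: unknown[]): void;
    testStarting(name: string): void;
    testFinished(name: string, failure?: unknown, afterTestFailure?: unknown): void;
}

class IndentedRunReporter implements NestedTestRunReporter {
    private nestingLevel = 0;
    private readonly log: (line: string) => void;

    // `log` defaults to console.log; accepting another function is an
    // assumption made here so the lines can be collected.
    constructor(log: (line: string) => void = console.log) {
        this.log = log;
    }

    private print(text: string): void {
        this.log(" ".repeat(this.nestingLevel * 2) + text);
    }

    suiteStarting(name: string): void {
        this.print(name);
        this.nestingLevel++;
    }

    suiteFinished(name: string, childFailures: unknown[], afterAllFailures: unknown[]): void {
        this.nestingLevel--;
        const failed = childFailures.length + afterAllFailures.length;
        if (failed > 0) {
            this.print(`${name}: ${failed} failure(s)`);
        }
    }

    testStarting(_name: string): void {
        // Nothing to print until the outcome is known.
    }

    testFinished(name: string, failure?: unknown, afterTestFailure?: unknown): void {
        const status = failure || afterTestFailure ? "FAIL" : "PASS";
        this.print(`${status} ${name}`);
    }
}

const lines: string[] = [];
const runReporter = new IndentedRunReporter((line) => lines.push(line));
runReporter.suiteStarting("Todo List");
runReporter.testStarting("starts empty");
runReporter.testFinished("starts empty");
runReporter.testStarting("adds an item");
runReporter.testFinished("adds an item", new Error("boom"));
runReporter.suiteFinished("Todo List", [new Error("boom")], []);
console.log(lines.join("\n"));
```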

And there you have it! If you build the plugin and launch Obsidian, you should see your tests print to the console. Now, you can write tests, and contracts, that you know will work in Obsidian, without having to do it manually each time.

This should hopefully be enough to get you up and running. Now, you should be able to run integration tests inside Obsidian to make sure small pieces of your plugin work as intended. But, if you want to be really, really sure that your plugin does what you expect, well, there are two other things to do:

  • Test Obsidian directly. How do we know that, say, the "modify" event that Obsidian fires is being called when we expect? If we test it manually to understand it, and then the behavior changes sometime in the future, that could break the way our plugin works. This is where consumer-driven contract testing comes in. We create a contract based on assumptions we have about Obsidian's behavior, and only tests that pass are allowed to exist in the contract. We'll explore this in a future article.
  • Really and truly test the plugin. We've tested the parts, we've tested Obsidian itself, but how do we know, when all the parts are put together, that something funky and weird doesn't happen? That's where end-to-end testing comes in. We'll explore this in a future article as well.

For now, given what you've learned in this article, see if you can put together your own way of doing the above two items.


Thanks for reading! Let me know if you have any questions, or if you feel like I missed something (or I'm just flat-out wrong). You can leave a comment here, or submit an issue on GitHub.

Want more from me? Check out my GitHub page, or subscribe here for future articles.
