Horia Coman

Posted on May 28, 2019 • Originally published at horia141.com on May 28, 2019

Language-Level Dependency Injection

#pl #ideas #lazy #post

I’ve recently read languages I want to write and Hull – an alternative to shell that I’ll never have time to implement and I came to the conclusion I rather like the idea of writing about some hypothetical language rather than going out and building a prototype. The latter is a lot of work after all and you should do it only for serious things, not any ol’ silly idea that pops into your head! So it’s much better to just write about the silly stuff and to get it out of my system. So what follows is my contribution to this sort of literature.

The language I’d like to exist would just be a version of ${generic_oo_programming_language] but with dependency injection (DI) as a language-level construct. As opposed to something bolted on via a library. I think that one of the best sources of language features is looking at common coding patterns and lifting them up at the language level. And for a certain class of applications – think backend services, interactive applications, highly configurable systems in general – Inversion of Control and its DI implementation are the way to go wrt organizing them. So why not make this official and provide facilities for DI at the language level?

What would this mean in practice? Well, first I need to point that DI and DI frameworks/containers are essentially just tools to construct and wire objects in a syntactically nice way. So our focus will actually be on syntactic-sugar[1] type features. You can very well not use them and just write a bunch of manual boilerplate in order to link everything up together. At Bolt we actually do this and with our microservices approach it’s not such a hassle. But that’s a topic for another blog post. And in general, the bigger an application gets, the less manageable it is to do everything manually.

I’m going to use TypeScript as a stand in for ${generic_oo_programming_language}, but the examples should be easy to understand without any prior TypeScript knowledge.

Suppose we have the following structure for a LibraryService class:

class LibraryService {

    private readonly bookRepository: BookRepository;
    private readonly userRepository: UserRepository;

    public LibraryService(bookRepository: BookRepository, userRepository: UserRepository) {
       this.bookRepository = bookRepository;
       this.userRepository = userRepository;
    }

    public countBorrowedBooks(): number {
        // Don’t code like this in real life though!
        return this.bookRepository.getAll().filter(b => b.isBorrowed).length;
    }
}

There’s a single constructor and two dependencies basically. The class otherwise looks like any regular TypeScript class. Nothing special needs to happen at definition time.

A service main function which sets up a small express-like web server making use of a LibraryService might look something like this:

public main(): void {

    const libraryService = new LibraryService;
    const app = new WebServer; // Assume this is another class
    app.get(“/count-borrowed”, (req, res) => {
        res.write(libraryService.countBorrowedBooks());
        res.end();
    });
    app.start();
}

The call to new LibraryService instructs the compiler to generate all the code necessary for building an instance of LibraryService. This means it takes care of instantiating UserRepository and BookRepository - there isn’t even an empty argument list used. These opeations might evolve recursively to generate quite a lot of code.

And again there’s no new syntax here, just new semantics for the existing object creation syntax.

When speaking of “an instance”, we actually refer to the instance of LibraryService. Calling new LibraryService again will return the originally created instance, much like the default behavior of every DI framework. In fact, you can image the DI system like a map from DI key to instances. A core component of the DI key is the type name, but as we’ll see later it can be more complex. When a new in the proper format is encountered and instance is either retrieved from the DI store if it’s available, or a new one is created. All the management and complexity of this store is pushed to the language implementation here.

One nice improvement over frameworks is that you’re building atop all the other language level features that exist. So you can reuse module visibility and code structure information and drop things like modules and deal only with types (or DI keys you need to resolve more precisely).

All this can live as much as possible at compile-time rather than run-time. Did not think things through completely, but it might be the case that everything could be resolved at compile-time without a great loss of expressive power.

Another advantage would be uniformity. Especially wrt standard libraries. These tend to be unaware of DI frameworks. So the code there is usually different than other code. But it could be more uniform. Especially about time and testing. But also with integrating other third-party code. You can assume they’re aware of the same DI system as the one you’re using.

There’s also no separation between DI and non-DI classes. You can call new on something like Point2D and it’ll try to work. If this has an empty constructor it’ll just invoke that, or if there’s something which can’t be satisfied you’ll know at compile time.

There’s some more sophisticated cases I’d like to cover here, by building atop language integration.

For example, factory or supplier methods can also be implemented as language level constructs. Let’s say we have a DB connection string which we can get from a config store, but which is stored encrypted and we must decrypt before building a connection object. The code might look like this:

factory buildConnection(configStore: ConfigStore): Connection {
    const dbConnEnc = configStore.get(“DB_CONN”);
    const dbConn = decrypt(dbConnEnc, BAKED_IN_SECRET);
    return new Connection(dbConn);
}

The above is not the kind of code you’d want in a constructor or even as a static method of a class cause it’s very application specific, so we have the new factory function type for it. Thanks to the fact that this is a language level construct, you don’t need to register the factory anywhere special, pull in a dependency’s module etc. But the way to build the object should be unique, so there should only be one factory function per DI key, and it will take precedence over constructors and the like.

An advanced example is using more than a type as an “injection” key. Having just the type is many times limiting, especially for value types. So there’s a practice of extending the DI key with some string or symbol type, besides the actual type. This might look like this:

factory getEnvironment(): [:ENV] string {
    return readFile(“.env”);
}

// later on you might have something like
class Logger {

    private readonly env: string;

    public Logger(env: [:ENV] string): void {
        this.env = env;
    }
}

The [:ENV] acts as a modifier for any type and together with the type serves to identify an object the DI system can provide. Most DI containers stop here as well. Here we could go much further, but it’s probably not worth it.

We can also tie in the library level notion of injection scope, with actual programming language scopes[2]. This is especially useful in server side software, where you might have things like the application level scope and a request scope. Consider again something like request time. For ease of debugging, data integrity, prevention of faults[3] etc. it might make sense that all places where we need the time when handling a particular request use the same “request” time, generated once when the request started to be handled deep in framework code.

class WebServer {

    // big codes here

    public get(url: string, handler: HandlerFn): void {

        discope(:Request) {
            factory getTime(): [:RequestTime] Date {
                return new Date();
            }

            handler(req, res);
        }
    }

}

The key [:RequestTime] Date could even only be available from the discope statement and static analysis could ensure that all methods which need a [:RequestTime] Date key are called by methods which can provide this at compile time.

Another nice thing might be partial specification of the argument list in constructors, as a mechanism to override whatever would be used by the DI system. Something like:

new LibraryService(userRepository=new MockUserRepository())

The behavior is intuitive - bookRepository is provided via the DI system, while userRepository is MockUserRepository.

Finally – why stop at object construction? There’s a lot of places where an object of a given type needs to be provided, so why not extend the mechanism there? This works well with the possibility of partially specifying arguments. For example, a common pattern in the Bolt codebase is using a “context” object which certain framework functions require. This gets passed from function to function, but is essentially static across something like a request’s lifetime. We could have code like:

const logger = new Logger(“MakePrediction”);

function makePrediction(ctx: Context, features: FeatureSet): number {

    if (features.user_id < 0) {
        logger.error(ctx, “One of those pesky negative user ids”);
        throw new Error(“Negative user id encountered”);
    }

    // do some other stuff with good features
}

// later on we might call makePrediction and the DI mechanisms could automatically provide
// the existing instances we want.

makePredictions(new, features);

This last one probably deserves more careful attention. It might cause too big of a breakdown in program structure. Too much global knowledge necessary for a local operation being provided implicitly rather than explicitly.

Anyway, that’s about all I could hash out for this blog post and perhaps the most it makes sense to explore without actual plans of making it happen. Happy I got it out of my system though.

PS. Turns out writing this sort of stuff on a bus is not the greatest idea - typos excluded. There’s already a bunch of these languages around in one form or another. Nothing big - else we’d all know about it, but not completely Terra incognita here.

[1] Caveat → syntactic sugar causes cancer of the semicolon.[2] Which I think is one of the more underutilized techniques in PL design. Using/try-with-resources is a bit of syntactic sugar Godsend, but I’ve been very impressed with the approach described in notes on structured concurrency in programming languages.[3] Ensuring time read during a request is monotonic.

Oldest comments (2)

Horia Coman • May 31 '19

Thank you && exactly. When enough people do a thing a certain way, it makes sense to treat it specially.