The Code Speaks On Its Own, NOT

#vsc #literarytheory #programming #visualstudiocode

If we receive a letter saying "you must stay home, you can only go out for grocery shopping", we won't just follow it. First we'll check who sent it, and only if it comes from an authority will we follow its orders.

To interpret a text we need to see its context, and context is provided not by the text itself but by the things that surround it: book title, publisher's info, jacket copy, author, publication date (April Fools', anyone?), and so on. In Literary Theory, these devices are called paratexts.

Paratexts are key to interpreting a text: they tell us, as readers, how we are supposed to use and understand it, and they guide us as we find meaning. We need paratexts because text doesn't speak on its own.

Code works the same way. Comments, type declarations, package names, compiler flags, and many other things provide clues on how to interpret the code, because code doesn't speak for itself.
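As a small illustration, here is the same logic written twice, once stripped of its paratexts and once with them. The function and its names are hypothetical and are not taken from VSC; this is only a sketch of the idea.

```typescript
// Without paratexts: what do `a` and `b` mean, and what comes back?
function f(a: any, b: any): any {
    return a * (1 - b);
}

/**
 * Applies a discount to a price.
 * @param price    Price in cents.
 * @param discount Fraction between 0 and 1 (0.2 means 20% off).
 * @returns The discounted price in cents.
 */
function applyDiscount(price: number, discount: number): number {
    return price * (1 - discount);
}
```

Both functions compute exactly the same thing. Only the second one tells the reader how it is meant to be used.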

Let's see some examples from Visual Studio Code to understand how context is important when reading code.

The two interfaces that were almost the same

Let's take a look at the interface OpenIssueReporterArgs, which appears twice in VSC. The first time is inside the file vscode/src/vs/workbench/contrib/issue/common/commands.ts:

export interface OpenIssueReporterArgs {
    readonly extensionId?: string;
    readonly issueTitle?: string;
    readonly issueBody?: string;
}

and the second one inside the file vscode/src/vs/workbench/api/common/apiCommands.ts:

export interface OpenIssueReporterArgs {
    readonly extensionId: string;
    readonly issueTitle?: string;
    readonly issueBody?: string;
}

These interfaces are used by the Issue Reporter Tool: they specify the shape of the arguments passed to the tool. They are almost identical; the only difference is that in the second one extensionId is required. So how do we know what each one is for? They don't seem to be saying much on their own, do they?

Remember we said that package names act as paratexts and help us interpret code by providing context? In this case the first interface lives inside vscode/src/vs/workbench/contrib, while the second one lives inside vscode/src/vs/workbench/api. Inside contrib we have contributions that add features to VSC, like full text search or git. One of those contributions is the Issue Reporter Tool, and contributions are used from within VSC itself. The api folder, on the other hand, holds what extensions can use. Extensions must provide their ID when reporting issues, while VSC doesn't need to. That's why extensionId is required in the apiCommands version of OpenIssueReporterArgs.
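The difference between the two callers can be sketched as follows. This is a hypothetical simplification, not VSC's actual code: the Contrib and Api prefixes stand in for the two folders, and both functions are made up for illustration.

```typescript
// Hypothetical sketch: the Contrib/Api prefixes stand in for the two folders.

// workbench/contrib: called from inside VSC, so every field is optional.
interface ContribOpenIssueReporterArgs {
    readonly extensionId?: string;
    readonly issueTitle?: string;
    readonly issueBody?: string;
}

// workbench/api: called by extensions, which must identify themselves.
interface ApiOpenIssueReporterArgs {
    readonly extensionId: string;
    readonly issueTitle?: string;
    readonly issueBody?: string;
}

// Hypothetical stand-in for the Issue Reporter Tool.
function openIssueReporter(args: ContribOpenIssueReporterArgs = {}): string {
    const reporter = args.extensionId ?? "VSC itself";
    return `Issue reported by ${reporter}`;
}

// Hypothetical API entry point: the stricter type is assignable to the looser one.
function openIssueReporterFromExtension(args: ApiOpenIssueReporterArgs): string {
    return openIssueReporter(args);
}

console.log(openIssueReporter());
// prints "Issue reported by VSC itself"
console.log(openIssueReporterFromExtension({ extensionId: "publisher.my-extension" }));
// prints "Issue reported by publisher.my-extension"

// openIssueReporterFromExtension({}); // compile error: extensionId is missing
```

The types alone look like an accidental duplicate. Once we know who calls each function, the required extensionId reads as a deliberate rule enforced by the compiler.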

Code Comments are Hard

Another place where we see paratexts is in code comments. An evergreen problem with comments is that they tend to lose touch with the code, falling behind the actual features or even describing things that are no longer there. But don't worry, programmers aren't the only ones with this problem. Not even Cervantes escaped this fate: in Don Quixote, the description for Chapter X doesn’t match the contents of the chapter. When the book was reformatted, the chapter description stopped matching its contents.

Visual Studio Code has some great docs for its APIs, detailing what each function does, the expected types and so on. But even there we might find things that are confusing. Take a look at this code from vscode/src/vs/base/common/collections.ts:

/**
 * Returns an array which contains all values that reside
 * in the given set.
 */
export function values<T>(from: IStringDictionary<T> | INumberDictionary<T>): T[] {
    const result: T[] = [];
    for (let key in from) {
        if (hasOwnProperty.call(from, key)) {
            result.push((from as any)[key]);
        }
    }
    return result;
}

The docs talk about a set, while the function accepts a dictionary. In my article Metaphors We Compute By I discuss how names are important because they augment the understanding we have of situations, or in this case, of pieces of code. A set implies uniqueness of elements. Moreover, a set offers operations like intersection and union, and it is not clear at first sight how they would make sense for a dictionary, if at all.
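A short sketch makes the mismatch concrete. It assumes a simplified StringDictionary type in place of VSC's IStringDictionary and INumberDictionary, and the sample data is made up.

```typescript
// Simplified stand-in for the dictionary types accepted by values().
interface StringDictionary<T> {
    [key: string]: T;
}

// Same logic as the values() function quoted above, restricted to string keys.
function values<T>(from: StringDictionary<T>): T[] {
    const result: T[] = [];
    for (const key in from) {
        if (Object.prototype.hasOwnProperty.call(from, key)) {
            result.push(from[key] as T);
        }
    }
    return result;
}

// Two different keys may hold the same value in a dictionary.
const wordLengths: StringDictionary<number> = { a: 1, b: 1, cc: 2 };

const fromDictionary = values(wordLengths);          // duplicates survive
const fromSet = Array.from(new Set(fromDictionary)); // a set collapses them

console.log(fromDictionary); // [1, 1, 2]
console.log(fromSet);        // [1, 2]
```

A reader who trusts the word "set" in the comment would expect the second result, while the function actually returns the first.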

So I opened a PR, updating the comments for those functions. While I was doing the final commit, I realized that there were more functions inside that API that mentioned set instead of dictionary. Again, keeping docs & comments in sync is hard. I think having just one function fixed saying dictionary while the others said set would have been even worse, since a future reader might be led to think they do different things.

About this issue, a colleague introduced me to a few tools that could help: electron/docs-parser, electron/typescript-definitions, and electron/archaeologist.

With docs-parser, the Electron team generates an AST out of their API documentation. That AST is then used by typescript-definitions to output actual TypeScript types. Finally, archaeologist makes sure that whenever a pull request is submitted against Electron, the types as they appear in the functions match the types described in the docs. For all this to work, Electron developers have to follow a documentation style guide. I think this is a step in the right direction.
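The general idea of checking docs against code can be sketched in a few lines. This is a toy illustration only: it does not use or reproduce docs-parser, typescript-definitions or archaeologist, every name in it is hypothetical, and the declared types are written by hand instead of being extracted from real code.

```typescript
// Hypothetical, simplified check; it is not how Electron's tools are implemented.
interface ParamTypes {
    [paramName: string]: string;
}

// Extracts `@param {type} name` pairs from a doc comment.
function parseDocumentedParams(docComment: string): ParamTypes {
    const documented: ParamTypes = {};
    const pattern = /@param\s+\{([^}]+)\}\s+(\w+)/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(docComment)) !== null) {
        const [, paramType, paramName] = match;
        if (paramType !== undefined && paramName !== undefined) {
            documented[paramName] = paramType;
        }
    }
    return documented;
}

// Lists every declared parameter whose documented type is missing or different.
function findMismatches(documented: ParamTypes, declared: ParamTypes): string[] {
    const mismatches: string[] = [];
    for (const paramName of Object.keys(declared)) {
        if (documented[paramName] !== declared[paramName]) {
            mismatches.push(paramName);
        }
    }
    return mismatches;
}

const docComment = `/**
 * @param {string} title
 * @param {number} retries
 */`;

// Hand-written stand-in for the types found in the real function signature.
const declaredTypes: ParamTypes = { title: "string", retries: "boolean" };

const mismatches = findMismatches(parseDocumentedParams(docComment), declaredTypes);
console.log(mismatches); // ["retries"]
```

A check like this only works if comments follow a predictable format, which is why a documentation style guide is part of the deal.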

Conclusion

In this article we showed that "code speaks on its own" is just a myth. All texts, including code, work because they are assisted by many devices that help us interpret them. In Literary Theory, some of those devices are called paratexts. We saw paratexts appear in code in the shape of comments, types, package names and so on.

We saw that to understand the purpose of an interface that was almost identical to another one, we had to resort to package names.

Finally, about comments we learnt that, as with Don Quixote, we need to make sure our paratexts match the content they are trying to describe; otherwise they just add confusion. When it comes to keeping comments and code in sync, we saw that the people behind Electron are creating tools that move us in the right direction.

DEV Community
