Building Git with Node.js & TypeScript - Part 2

Ethan Arrowood ・7 min read

This post covers chapter 4, the concept of history between commits. Follow along with the code available here.

Read the previous posts here:

Reminders:

  • code highlight text references actual pieces of code such as commands, properties, variables, etc.
  • boldface text references file and directory names.
  • italic text references higher level data structures such as commit, blob, database, etc.
  • Most classes will be referred to using italics, but may also appear as code highlights.
  • Imports are omitted from code examples. Assume all imports refer to other local files or Node.js core modules.
  • All code blocks have their respective file name commented at the top of the block.

Overview

Previously, I implemented the init and commit commands. Together they create a .git directory with a database that can track blobs of data through commits organized with trees. Additionally, it tracks the commit author, message, and timestamp. In the previous article I even demonstrated how you can get my implementation up and running! In this post I will be introducing two new structures: refs and lockfile. I'll be making some changes to the Commit and Database classes, and the commit command in jit.ts.

While working on this section I made some quick fixes to the existing code:

  • calls to database.store are now awaited
  • the slice call was removed from the database.generateTempName method as it was not necessary

History and Refs

If you have used git before, you'll already know that commits are connected in a chain-like structure. To create this chain, commits track their parent through a reference. There are more complex reference relationships which will come later in this series, but for now we are focussing on a flat, commit-to-commit chain.
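To make the chain idea concrete, here is a minimal in-memory sketch. The CommitNode shape, the walkHistory helper, and the short ids are all hypothetical and exist only for illustration; the real commits live in the database and are addressed by their full object ids.

```typescript
// history-sketch.ts (illustrative only, not part of jit)

// Hypothetical, simplified commit: only the fields needed to show the chain.
interface CommitNode {
    oid: string
    parent: string | null // null marks the root commit
    message: string
}

// Follow parent references from `head` back to the root commit.
function walkHistory(commits: Record<string, CommitNode>, head: string | null): CommitNode[] {
    const chain: CommitNode[] = []
    let current = head
    while (current !== null) {
        const commit = commits[current]
        if (commit === undefined) throw new Error(`Unknown commit: ${current}`)
        chain.push(commit)
        current = commit.parent
    }
    return chain
}

// Three made-up commits, each pointing at the one before it.
const commits: Record<string, CommitNode> = {
    aaa111: { oid: 'aaa111', parent: null, message: 'first' },
    bbb222: { oid: 'bbb222', parent: 'aaa111', message: 'second' },
    ccc333: { oid: 'ccc333', parent: 'bbb222', message: 'third' },
}

const messages = walkHistory(commits, 'ccc333').map((c) => c.message)
console.log(messages.join(' <- '))
```

Starting from the newest commit and following parent until it is null is all a flat history needs, which is why HEAD only has to store a single object id.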

Create a Refs class. We will come back and implement the updateHead method later. The readHead method first checks that the HEAD file exists and is readable; if it is not, the method returns null (this detail is important). Otherwise, it returns the trimmed contents of the HEAD file: the object id of the latest commit.

// refs.ts
export default class Refs {
    public pathname: string

    private headPath: string

    constructor(pathname: string) {
        this.pathname = pathname
        this.headPath = path.join(pathname, 'HEAD')
    }

    public async updateHead(oid: string) {}

    public async readHead() {
        try {
            await fs.promises.access(this.headPath, fs.constants.F_OK | fs.constants.R_OK)
            return (await fs.promises.readFile(this.headPath, 'utf8')).trim()
        } catch (err) {
            return null
        }
    }
}

In jit.ts, create a Refs instance alongside the Workspace and Database. Then get the latest commit using the readHead method (I do this after storing the tree in the database). Pass the parent commit object id to the new Commit constructor, and after writing the new commit to the database, update the HEAD file with refs.updateHead.

// jit.ts

// ...
const workspace = new Workspace(rootPath)
const database = new Database(dbPath)
const refs = new Refs(gitPath)
// ...
const parent = await refs.readHead()
// ...
const commit = new Commit(parent, tree.oid, author, message)
await database.store(commit)
await refs.updateHead(commit.oid)

Jump back over to refs.ts to start implementing the updateHead method. This method makes use of a new structure, lockfile.

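// refs.ts
// LockDenied is not shown in this post; it is assumed to be a custom Error subclass,
// e.g. class LockDenied extends Error {}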

public async updateHead(oid: string) {
    const lockfile = new Lockfile(this.headPath)

    if (!(await lockfile.holdForUpdate())) {
        throw new LockDenied(`Could not acquire lock on file: ${this.headPath}`)
    }

    await lockfile.write(oid)
    await lockfile.write("\n")
    await lockfile.commit()
}

Lockfile

A lockfile, in this context, is a mechanism that prevents two operations from working on the same file at the same time. If two operations wrote to the HEAD file simultaneously, the result could be unexpected behavior or even a crash. With a locking mechanism in place, the application can be certain that it is not operating on a file that something else is already operating on. This is the job of the Lockfile class.

The class contains three private properties, the most important one being the lock file handle. This file handle will not directly refer to the HEAD file, but a HEAD.lock one instead.

The holdForUpdate method first checks if the lock is null. If it is, this instance is not holding a lock yet, so it attempts to open HEAD.lock. The file flag constants combine to request the following behavior:

  • O_RDWR opens the file for both reading and writing
  • O_CREAT creates the file if it doesn't already exist
  • O_EXCL makes the open fail if the O_CREAT flag is set and the file already exists

The method returns true after successfully creating the file handle. Otherwise, it handles a set of error conditions:

  • If the file already exists, return false.
  • If the parent directory does not exist, throw a custom MissingParent error
  • If the application does not have the right access permissions, throw a custom NoPermission error
  • And finally, if we don't recognize the error, throw it so we can debug and later improve the code.
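The flag combination and two of the error cases above can be tried in isolation. This is a rough synchronous sketch, not the Lockfile class itself: the tryLock helper and its string results are hypothetical, and unlike holdForUpdate it closes the handle right away instead of keeping it open.

```typescript
// lock-flags-sketch.ts (illustrative only, not part of jit)
import * as fs from 'fs'
import * as os from 'os'
import * as path from 'path'

// Hypothetical result type standing in for the true / false / throw outcomes.
type LockResult = 'acquired' | 'held' | 'missing-parent'

function tryLock(lockPath: string): LockResult {
    // Same flags as holdForUpdate: read/write, create, fail if it already exists.
    const flags = fs.constants.O_RDWR | fs.constants.O_CREAT | fs.constants.O_EXCL
    try {
        // Close immediately; the sketch only cares whether the file could be created.
        fs.closeSync(fs.openSync(lockPath, flags))
        return 'acquired'
    } catch (err) {
        const code = (err as NodeJS.ErrnoException).code
        if (code === 'EEXIST') return 'held'
        if (code === 'ENOENT') return 'missing-parent'
        throw err // anything unrecognized is rethrown for debugging
    }
}

// Work in a throwaway temp directory so nothing real is touched.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'jit-lock-'))
const first = tryLock(path.join(dir, 'HEAD.lock'))
const second = tryLock(path.join(dir, 'HEAD.lock'))
const orphan = tryLock(path.join(dir, 'no-such-dir', 'HEAD.lock'))
console.log(first, second, orphan)
fs.rmSync(dir, { recursive: true, force: true })
```

The second call fails because O_EXCL turns "create the file" into "create the file only if nobody else has", which is exactly the property a lock needs.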

The method returns a boolean for the already-locked case because of how it is used. Referring back to refs.ts, you can see that if holdForUpdate returns false, we throw an error saying that we couldn't lock the HEAD file.

The write method checks that the lock is currently held (the file handle is not null) and then writes the data to the lock file.

The commit method performs the same check, then closes the file handle and renames the lock file to the original, non-.lock path. After that it resets the lock property to null.

// lockfile.ts

class MissingParent extends Error {}
class NoPermission extends Error {}
class StaleLock extends Error {}

export default class Lockfile {
    private filePath: string
    private lockPath: string
    private lock: fs.promises.FileHandle | null

    constructor(path: string) {
        this.filePath = path
        this.lockPath = `${path}.lock`
        this.lock = null
    }

    public async holdForUpdate () {
        try {
            if (this.lock === null) {
                const flags = fs.constants.O_RDWR | fs.constants.O_CREAT | fs.constants.O_EXCL
                this.lock = await fs.promises.open(this.lockPath, flags)
            }
            return true
        } catch (err) {
            switch (err.code) {
                case 'EEXIST':
                    return false
                case 'ENOENT':
                    throw new MissingParent(err.message)
                case 'EACCES':
                    throw new NoPermission(err.message)
                default:
                    throw err
            }
        }
    }

    public async write(data: string) {
        if (this.lock === null) {
            throw new StaleLock(`Not holding lock on file: ${this.lockPath}`)
        }
        await this.lock.write(data)
    }

    public async commit() {
        if (this.lock === null) {
            throw new StaleLock(`Not holding lock on file: ${this.lockPath}`)
        }
        await this.lock.close()
        await fs.promises.rename(this.lockPath, this.filePath)
        this.lock = null
    }
}
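The write-then-rename step in commit can also be seen on its own. The sketch below is a simplified synchronous version with placeholder ids and a temp directory; it assumes nothing about jit beyond the HEAD and HEAD.lock naming used above. On POSIX systems a rename replaces the destination atomically, so a reader sees either the old contents or the new ones, never a half-written file.

```typescript
// rename-sketch.ts (illustrative only, not part of jit)
import * as fs from 'fs'
import * as os from 'os'
import * as path from 'path'

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'jit-rename-'))
const headPath = path.join(dir, 'HEAD')
const lockPath = `${headPath}.lock`

// An existing HEAD that readers may be looking at (placeholder id).
fs.writeFileSync(headPath, 'old-oid\n')

// Write the new value to HEAD.lock; the 'wx' flag fails if the lock file already exists.
fs.writeFileSync(lockPath, 'new-oid\n', { flag: 'wx' })

// Until the rename happens, HEAD still holds the old value.
const beforeCommit = fs.readFileSync(headPath, 'utf8').trim()

// "Commit": move the lock file over the real path.
fs.renameSync(lockPath, headPath)
const afterCommit = fs.readFileSync(headPath, 'utf8').trim()
const lockStillExists = fs.existsSync(lockPath)

console.log({ beforeCommit, afterCommit, lockStillExists })
fs.rmSync(dir, { recursive: true, force: true })
```

Because the rename also removes HEAD.lock, committing the change and releasing the lock happen in one step.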

There is one major change I'd like to make in this class in the future: removing the use of null. I prefer to rely on undefined, as null has some strange behaviors in JavaScript. This is not a hard rule for JavaScript apps, but it is my preference. For now though, using null is okay as it better aligns with the Ruby implementation this is based on.

If you want to know more about why that is my preference, read this issue by @sindresorhus.

Now that we've completed both refs and lockfile, all that's left are some short changes to commit and database.

Commit Updates

Recall that in jit.ts we are now passing the parent commit as the first argument to the Commit class constructor. We must update the Commit constructor to accept it, and the generateData method must be updated too. The parent line is only added when parent is not null, which preserves the existing behavior for the root commit.

// commit.ts
export default class Commit extends Entity {
    public parent: string | null
    // ...

    constructor(parent: string | null, treeOid: string, author: Author, message: string) {
        super('commit', Commit.generateData(parent, treeOid, author, message))
        this.parent = parent
        // ...
    }

    private static generateData(parent: string | null, treeOid: string, author: Author, message: string) {
        const lines = []

        lines.push(`tree ${treeOid}`)
        if (parent !== null) lines.push(`parent ${parent}`)
        lines.push(`author ${author.toString()}`)
        lines.push(`committer ${author.toString()}`)
        lines.push("")
        lines.push(message)

        const data = lines.join("\n")

        return Buffer.from(data)
    }
}
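To see the effect of the optional parent line, here is a small stand-in for generateData. It is a sketch under a few assumptions: the commitText name is hypothetical, the author is passed as an already formatted string, the ids and author line are placeholders, and it returns plain text rather than a Buffer.

```typescript
// commit-data-sketch.ts (illustrative only, not part of jit)

// Hypothetical stand-in for Commit.generateData that works on plain strings.
function commitText(parent: string | null, treeOid: string, author: string, message: string): string {
    const lines: string[] = []
    lines.push(`tree ${treeOid}`)
    // The parent line is skipped entirely for the root commit.
    if (parent !== null) lines.push(`parent ${parent}`)
    lines.push(`author ${author}`)
    lines.push(`committer ${author}`)
    lines.push('')
    lines.push(message)
    return lines.join('\n')
}

const authorLine = 'name <email> 0 +0000' // placeholder author line
const rootCommit = commitText(null, 'tree-oid-1', authorLine, 'first')
const childCommit = commitText('commit-oid-1', 'tree-oid-2', authorLine, 'second')

console.log(rootCommit)
console.log('---')
console.log(childCommit)
```

The two outputs differ only by the parent line in the second commit, directly after the tree line.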

Database updates

In addition to the history feature, we can make a small edit to the database writeObject method that prevents it from storing objects that already exist. I've added a fileExists method to simplify the logic. This can probably be written better, so if you have any ideas, comment them below and we can discuss them together.

// database.ts
export default class Database {
    // ...
    private async writeObject(oid: string, content: Buffer) {
        const objectPath = path.join(this.pathname, oid.substring(0, 2), oid.substring(2))
        if (await this.fileExists(objectPath)) return
        // ...
    }
    // ...
    private async fileExists(path: string) {
        try {
            await fs.promises.access(path, fs.constants.F_OK)
            return true
        } catch (err) {
            return false
        }
    }
}
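Skipping the write is safe as long as an object's id is derived from its content, as it is in Git: the same id then always means the same bytes. The sketch below shows that idea with a hypothetical in-memory MemoryStore instead of the real Database, and it hashes the raw string with SHA-1 for brevity, which is an assumption and not necessarily how jit computes its object ids.

```typescript
// dedupe-sketch.ts (illustrative only, not part of jit)
import * as crypto from 'crypto'

// Hypothetical in-memory stand-in for the object database.
class MemoryStore {
    public writes = 0
    private objects: Record<string, string> = {}

    public store(content: string): string {
        // The key is a hash of the content (plain SHA-1 of the string; an assumption).
        const oid = crypto.createHash('sha1').update(content).digest('hex')
        // Same id means same content, so there is nothing new to write.
        if (oid in this.objects) return oid
        this.objects[oid] = content
        this.writes += 1
        return oid
    }
}

const memoryStore = new MemoryStore()
const firstOid = memoryStore.store('hello\n')
const repeatOid = memoryStore.store('hello\n')
const otherOid = memoryStore.store('world\n')

console.log(firstOid === repeatOid, firstOid === otherOid, memoryStore.writes)
```

Storing the same content twice yields the same id and only one write, which is the behavior the fileExists check adds to writeObject.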

Before finishing, there is one last change in jit.ts at the end of the commit command. This change improves the CLI output when creating a root vs non-root commit.

// jit.ts
const isRoot = parent === null ? "(root-commit) " : ""
// Print only the first line of the message; split also handles a message with no newline
console.log(`[${isRoot}${commit.oid}] ${message.split("\n")[0]}`)
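The two output shapes can be checked without running a full commit. This is a small sketch with a hypothetical commitSummary helper and made-up ids; it takes the first line with split so that a single-line message without a trailing newline is still printed.

```typescript
// summary-sketch.ts (illustrative only, not part of jit)

// Hypothetical helper that builds the line printed after a commit.
function commitSummary(parent: string | null, oid: string, message: string): string {
    const rootLabel = parent === null ? '(root-commit) ' : ''
    const firstLine = message.split('\n')[0]
    return `[${rootLabel}${oid}] ${firstLine}`
}

const rootSummary = commitSummary(null, 'aaa111', 'Initial revision\n\nLonger description')
const childSummary = commitSummary('aaa111', 'bbb222', 'Second commit')

console.log(rootSummary)
console.log(childSummary)
```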

Try it out

Clone the repo:

git clone git@github.com:Ethan-Arrowood/building-git-with-nodejs-and-typescript.git

Fetch and checkout the part-2 branch

git fetch origin part-2
git checkout part-2

Install dependencies, build src, and link the executable

npm i
npm run build
npm link

Set the current working directory to src and run the commit command with the initial commit message

cd src
jit init
export GIT_AUTHOR_NAME="name" GIT_AUTHOR_EMAIL="email" && cat ../COMMIT_EDITMSG | jit commit

Write a second commit

cat ../COMMIT_EDITMSG2 | jit commit

To see if everything worked correctly, use git log

git log --oneline

It should output two commits with their respective messages. Mine looked like this:

a6cfc02 (HEAD) Use HEAD to set the parent of the new commit
fd5602b Initial revision of "jit", the information manager from Boston

Conclusion

That is it for the initial history feature. Thank you for reading! I encourage you to ask questions and continue the discussion in the comments; I'll do my best to respond to everyone! If you enjoyed this post, make sure to follow me on Twitter (@ArrowoodTech). And don't forget to check out the book, Building Git.

Ethan Arrowood

@ethanarrowood

Microsoft Software Engineer by day, JavaScript/TypeScript/Node.js open source contributor by night.