loading...

Why Go modules are faster than GOPATH

#go
tbpalsulich profile image Tyler Bui-Palsulich ・6 min read

Downloading dependencies with Go modules can be significantly faster than using GOPATH-based dependency management. A fresh dependency download experiment for this post took 9 minutes and 33.845 seconds in GOPATH mode and 10.185 seconds in module mode. This post explains why. The key difference is that modules avoid deep repository clones and can use a module proxy.

For an introduction to Go modules, see https://blog.golang.org/using-go-modules.

GOPATH-based dependency management downloads every dependency into GOPATH/src, doing a deep clone of the version control repository in the process.

So, if you're in GOPATH mode and your project depends on A, which depends on B, which depends on C, the go command will git clone all three repositories (assuming they're all using git).

When using Go modules, you need two things when downloading a dependency:

  1. The dependency's source code.
  2. The dependency's go.mod file.

The go.mod file is used to figure out which version of every dependency you need. Once the go command collects all of the go.mod files from all of your dependencies, it can figure out which version of each dependency you need for your project.

For example, if you depend on modules A and B directly, and module A depends on v1.2.0 of module B, your module needs to depend on at a minimum B v1.2.0. This same idea is applied to every module you depend on, either directly or indirectly.

Dependency graph showing how A's dependency on B v1.2.0 means your project must depend on at least B v1.2.0.

Side note: notice that if a new version of module A or B is released, that doesn't change the minimum version of B you must depend on. That requirement is stable over time.

Module proxy

Module proxies are a huge reason why Go modules are so much faster. A module proxy understands the two components you need from every module dependency: the source code and the go.mod file. That understanding leads to two separate optimizations:

  1. Source code can be distributed as a zip file instead of a deep VCS clone. Downloading a single zip file is much faster than doing a full VCS clone, making module downloads more efficient than GOPATH.
  2. You can download a go.mod file without getting the rest of the source code. This makes using a module proxy more efficient than not using a proxy.

Optimization #1 means when the go command downloads a dependency, there is less to download compared to GOPATH. Instead of having to (usually) do a full git clone, the go command downloads a single zip file with the source code of the module version you're asking for.

If you are using modules without a proxy, the go command does a shallow clone whenever possible. A shallow clone is much faster than a deep clone because you are only retrieving a single commit, rather than the full repository history.

The result is that, when using modules, there is simply less stuff to download compared to GOPATH.

Optimization #2, understanding go.mod files, is more subtle.

Just because a module is in your module graph doesn't mean it's in your import graph. The module graph includes the entire list of modules included in your go.mod file, all your dependencies' go.mod files, all of their dependencies' go.mod files, and so on. The import graph contains all of the packages imported by your project, the packages that those packages import, and so on. A module you depend on can contain packages you don't need to compile your project.

For example, let's say you depend on a database helper with support for Postgres, MySQL, and MongoDB. The helper has a separate package for each supported database and each one depends on a third-party module/package to communicate with that database. Your project only uses the helper's Postgres package, so you don't need the MySQL or MongoDB packages to build your project. Further, you don't need the source code for the MySQL or MongoDB modules to compile your project—they aren't in the import graph!

Dependency graph for the database example showing the import graph as a subset of the module graph.

However, you do need the go.mod file for those modules. If a module is in your module graph, it must be taken into account when figuring out the minimum version of each dependency to use.

Here is where a module proxy comes to the rescue. A module proxy can give you just the go.mod file for any given module, without the source code. If you aren't using a proxy, you need to clone the entire repo (source code and all) just to get the go.mod file, knowing that code will never be compiled. Notably, as mentioned above, the go command in module mode does a shallow clone whenever possible.

When you're in GOPATH mode, the go command only downloads the dependencies that are in your import graph. So, this particular proxy optimization only improves performance compared to not using a module proxy. Lazy module loading (planned for Go 1.16) will speed up modules even more by reducing the number of go.mod files that the go command needs to fetch.

Results

The optimizations above mean that (1) the stuff you download is smaller and (2) there are fewer things to download.

Combined, a fresh module download can be around 5 times faster when using a proxy versus without. When compared to GOPATH, using modules with a proxy can be over 50 times faster.

It's theoretically possible for modules to be slower because you need to download each version of a dependency separately, rather than only having a single version of the dependency in your system. In practice, modules are much faster and the cache requires less space on disk, especially when you factor in standard GOPATH+vendoring patterns.

Try it out

To see these optimizations in action, try the following commands with a dependency of your choice. Starting with Go 1.13, the default module proxy is proxy.golang.org—if you're using Go 1.13 or later, using modules, and haven't adjusted GOPROXY, you're already getting these benefits.

I used Cloud Shell to run these tests. Your results may vary depending on the machine you're using, the Go version you're using, your network speed, the dependency you test with, and other factors.

  1. Start by creating a temporary, empty GOPATH to test with to avoid deleting the normal GOPATH contents.

    $ mkdir /tmp/tmp.GOPATH
    $ export GOPATH=/tmp/tmp.GOPATH
    $ go env GOPATH # Just to confirm.
    /tmp/tmp.GOPATH
    
  2. Now, try downloading a dependency in GOPATH mode. These commands use cloud.google.com/go/storage, but you can try with whatever dependency you'd like:

    # Force GOPATH mode. Be sure the current directory
    # doesn't have a go.mod.
    $ export GO111MODULES=off
    $ time go get cloud.google.com/go/storage
    real    9m33.845s
    user    4m1.197s
    sys     0m18.079s
    

    You can find the cloud.google.com/go/storage go.mod file on GitHub.

  3. Next, try using Go modules. Create a new module to test with:

    $ mkdir proxy-testing
    $ cd proxy-testing
    $ unset GO111MODULES # Back to the default.
    $ go mod init example.com/proxy-testing
    
  4. Now, try downloading a dependency without a proxy:

    $ go clean -modcache # Careful!
    $ go env -w GOPROXY=direct # direct means go directly to the source.
    $ go env GOPROXY
    direct
    $ time go get cloud.google.com/go/storage
    go: finding cloud.google.com/go/storage v1.10.0
    ...
    
    real    2m6.396s
    user    1m51.447s
    sys     0m18.311s
    
  5. Now try it with the proxy enabled (by resetting GOPROXY to the default):

    $ go env -w GOPROXY=
    $ go env GOPROXY
    https://proxy.golang.org,direct
    $ go clean -modcache # Careful!
    $ go mod tidy # To start from the same state as before.
    $ time go get cloud.google.com/go/storage
    go: finding cloud.google.com/go/storage v1.10.0
    ...
    
    real    0m10.185s
    user    0m9.610s
    sys     0m1.961s
    

Bar chart comparing the three experiments. Data summarized below.

Downloading cloud.google.com/go/storage took:

  • 9m33.845s in GOPATH mode,
  • 2m6.396s using modules without a proxy, and
  • 10.185s using modules with a proxy (the default).

If you run these tests on your machine, the results will vary. However, modules are dramatically faster than GOPATH, especially when using the default proxy.

Posted on by:

tbpalsulich profile

Tyler Bui-Palsulich

@tbpalsulich

Engineer at Google working on Go and Cloud.

Discussion

markdown guide