Shubham Rattra

Everything you need to know about GitHub Arctic Code Vault

You might have recently seen an Arctic Code Vault Contributor badge on your GitHub profile, or on someone else’s, and wondered what it is all about.
So let’s take a look at what an Arctic Code Vault Contributor is and who gets this badge.

GitHub, the world’s largest platform for open-source software, has locked away a huge and valuable body of data in a decommissioned coal mine near the Norwegian town of Longyearbyen, in the Arctic.


The GitHub Arctic Code Vault was first announced in November 2019.
The GitHub Arctic Code Vault is a data repository preserved in the Arctic World Archive (AWA), a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. The archive is located in a decommissioned coal mine in the Svalbard archipelago, closer to the North Pole than the Arctic Circle.
At the time, GitHub said that it planned to capture a snapshot of every active public repository on 02/02/2020 and preserve that data in the Arctic Code Vault.
The project began on February 2, 2020, when the firm took that snapshot of all of GitHub’s active public repositories. The team initially intended to travel to Norway and personally escort the world’s open-source technology to the Arctic, but those plans were derailed by the global pandemic, and the deposit had to wait until July 8.
GitHub announced that the code was successfully deposited in the Arctic Code Vault on July 8, 2020. Over the preceding months, GitHub worked with its archive partner Piql to write the 21 TB of GitHub repository data to 186 reels of piqlFilm (digital photosensitive archival film).
GitHub’s director of strategic programs, Julia Metcalf, wrote a blog post on the company’s website announcing that the deposit was completed on July 8. Discussing the objective of the Archive Program, Metcalf wrote, “Our mission is to preserve open-source software for future generations by storing your code in an archive built to last a thousand years.”
The Arctic Code Vault is only a small part of the wider GitHub Archive Program, in which the company partners with the Long Now Foundation, Internet Archive, Software Heritage Foundation, Microsoft Research, and others.


How will the cold storage last 1,000 years?

Svalbard is regulated by the international Svalbard Treaty as a demilitarized zone. Home to the world’s northernmost town, it is one of the most remote and geopolitically stable human habitations on Earth.
The AWA is a joint initiative between Norwegian state-owned mining company Store Norske Spitsbergen Kulkompani (SNSK) and very-long-term digital preservation provider Piql AS. AWA is devoted to archival storage in perpetuity. The film reels will be stored in a steel-walled container inside a sealed chamber within a decommissioned coal mine on the remote archipelago of Svalbard. The AWA already preserves historical and cultural data from Italy, Brazil, Norway, the Vatican, and many others.


What’s in the 02/02/2020 snapshot?

The 02/02/2020 snapshot archived in the GitHub Arctic Code Vault swept up every active public GitHub repository, in addition to significant dormant repos.
The snapshot included every repo with any commits between the announcement at GitHub Universe on November 13th, 2019 and 02/02/2020, every repo with at least 1 star and any commits from the year before the snapshot (02/03/2019–02/02/2020), and every repo with at least 250 stars.
The snapshot consisted of the HEAD of the default branch of each repository, minus any binaries larger than 100KB in size. Depending on available space, repos with more stars may have retained their binaries. Each repository was packaged as a single TAR file. For greater data density and integrity, most of the data was stored QR-encoded and compressed. A human-readable index and guide itemizes the location of each repository and explains how to recover the data.
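To make the packaging step concrete, here is a minimal Python sketch of writing one checked-out repository to a single TAR file while leaving out large binaries. It is an illustration only, not GitHub's actual pipeline: the function names, the NUL-byte test for "binary", and reading "100KB" as 100 * 1024 bytes are all assumptions.

```python
import tarfile
from pathlib import Path
from typing import List

# "100KB" comes from the article; treating it as 100 * 1024 bytes is an assumption.
MAX_BINARY_BYTES = 100 * 1024


def looks_binary(path: Path, probe: int = 8192) -> bool:
    """Heuristic (an assumption): a file is binary if its first bytes contain NUL."""
    with path.open("rb") as fh:
        return b"\x00" in fh.read(probe)


def package_repo(checkout: Path, out_tar: Path) -> List[str]:
    """Write the checked-out tree to one TAR file, skipping large binaries.

    `checkout` is a working tree at the HEAD of the default branch.
    `out_tar` should live outside `checkout` so it is not archived into itself.
    Returns the archived paths, relative to the checkout root.
    """
    archived = []
    with tarfile.open(out_tar, "w") as tar:
        for path in sorted(checkout.rglob("*")):
            rel = path.relative_to(checkout)
            # Only the snapshot of the files is kept, not the Git history.
            if not path.is_file() or ".git" in rel.parts:
                continue
            # Drop binaries above the size limit; large text files are kept.
            if path.stat().st_size > MAX_BINARY_BYTES and looks_binary(path):
                continue
            tar.add(path, arcname=rel.as_posix())
            archived.append(rel.as_posix())
    return archived
```

Calling `package_repo(Path("my-repo"), Path("my-repo.tar"))` would produce one archive per repository, which matches the "single TAR file" layout described above.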


The company further shared that every reel of the archive includes a copy of the “Guide to the GitHub Code Vault” in five languages, written with input from GitHub’s community and available at the Archive Program’s own GitHub repository.
The archive also includes a human-readable reel documenting the technical history and cultural context of the archive’s contents, which the company calls the Tech Tree. It consists primarily of existing works, selected to provide a detailed understanding of modern computing, open source and its applications, modern software development, popular programming languages, and more.

What is the reason for doing this?

This project aims to preserve open-source software for future generations by storing it in an archive built to last a thousand years.
GitHub hopes that one day historians or future civilizations can use this open-source data to understand the dawn of computing, which is to say the present.
In addition to the repositories, GitHub also saved a few classic works of humanity and an introductory letter in case the archive is discovered after an apocalypse, or by aliens, or by anyone who doesn’t know much about present-day humanity. In GitHub’s words: “This archive, the GitHub Code Vault, was established by the GitHub Archive Program, whose mission is to preserve open-source software for future generations.”

Who gets this badge?

The snapshot included any public repository that had at least 250 stars, that had at least one star and had been updated in the year before the snapshot, or that had no stars but had been updated in the roughly eighty days before it. If you pushed to a public GitHub repository in that window, there is a good chance your name and your work are stored in the Arctic. Clicking on the Arctic Code Vault Contributor badge in the highlights section of a profile reveals which of a user’s projects were saved in this snapshot.
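The three inclusion rules can be read as a simple predicate. The following is a minimal Python sketch, not GitHub's actual selection code: the `Repo` class, its fields, and the `in_snapshot` function are hypothetical names, and the dates are the ones given for the snapshot earlier in this post.

```python
from dataclasses import dataclass
from datetime import date

# Dates quoted in this post; the constant names are hypothetical.
ANNOUNCEMENT = date(2019, 11, 13)      # GitHub Universe announcement
YEAR_WINDOW_START = date(2019, 2, 3)   # start of the year before the snapshot
SNAPSHOT = date(2020, 2, 2)            # snapshot day


@dataclass
class Repo:
    stars: int
    last_commit: date   # most recent commit on or before the snapshot day
    public: bool = True


def in_snapshot(repo: Repo) -> bool:
    """Return True if the repo meets any of the three published rules."""
    if not repo.public:
        return False
    # Rule 1: at least 250 stars, even if the repo is dormant.
    if repo.stars >= 250:
        return True
    # Rule 2: at least one star and a commit in the year before the snapshot.
    if repo.stars >= 1 and YEAR_WINDOW_START <= repo.last_commit <= SNAPSHOT:
        return True
    # Rule 3: any commit between the announcement and the snapshot.
    return ANNOUNCEMENT <= repo.last_commit <= SNAPSHOT
```

For example, `in_snapshot(Repo(stars=0, last_commit=date(2019, 12, 25)))` is true under rule 3, while the same unstarred repo with its last commit in June 2019 falls outside all three rules.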
GitHub created the Arctic Code Vault Contributor badge to honor the millions of developers worldwide who contributed to the open-source projects in the archive. The badge is displayed in the highlights section of the developer’s GitHub profile.


So if you have the Arctic Code Vault Contributor badge, congratulations: your code should be safe for at least 1,000 years, and hopefully someone in that future will find it useful.

Have a look at this video to see where your code is stored and how:
