Git is indeed a database, and GitHub is a remote database powered by Git.
I needed a way to save information about certain important events in my code for later analysis. What could be better than committing them to a VCS? Every commit comes with a timestamp, an author and a description.
I started with a local Git repository and then switched to GitHub, which exposes its functionality through an API.
The short script below demonstrates how this approach works.
It needs two things to be set: the GITHUB_TOKEN environment variable, holding a token that can be generated in your GitHub account, and the repo
variable, holding the repository name in the form <YOUR_GITHUB_NAME>/<REPO_NAME>.
It upserts a file that does not exist yet, which creates it. Then it upserts the same file again, which modifies it. Finally, it deletes the file.
The repository log nicely keeps all these actions in the commit history.
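To connect this back to the events mentioned at the start, here is a minimal sketch of how one event could be turned into the name, body and message arguments that upsert_file in the script takes. The event_to_file helper and the events/<kind>/<timestamp>.json layout are assumptions made for illustration; neither GitHub nor PyGithub requires them.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional, Tuple


def event_to_file(
    kind: str,
    payload: Dict[str, Any],
    when: Optional[datetime] = None,
) -> Tuple[str, str, str]:
    """Turn one event into (name, body, message) for a single commit.

    Hypothetical helper: the path layout is just one possible convention.
    """
    # Normalise to UTC so the "Z" suffix in the file name is accurate
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    name = f"events/{kind}/{stamp}.json"
    # sort_keys keeps the body stable, so equal payloads give equal files
    body = json.dumps(payload, indent=2, sort_keys=True)
    message = f"{kind} at {when.isoformat()}"
    return name, body, message


name, body, message = event_to_file(
    "deploy",
    {"version": "1.4.2", "ok": True},
    datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
)
print(name)     # events/deploy/20240115T093000Z.json
print(message)  # deploy at 2024-01-15T09:30:00+00:00
```

The three values can then be passed on as upsert_file(name, body, message, repo=repo), so every event becomes one commit with its own description.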
Note: The PyGithub package needs to be installed first:
pip install pygithub
Code:
import os
from typing import Optional, Union

import github
from github.Repository import Repository


def get_repo(repo: str) -> Repository:
    """Open a repository given as "owner/name", authenticating with GITHUB_TOKEN."""
    assert repo, 'repository name is missing'
    g = github.Github(os.environ['GITHUB_TOKEN'])
    return g.get_repo(repo)


def upsert_file(
    name: str,
    body: str,
    message: Optional[str] = None,
    *,
    repo: Optional[Union[Repository, str]] = None,
    branch: str = "main",
    verbose: bool = False,
):
    """Create the file if it is missing, otherwise update it. Each call is one commit."""
    r = repo if isinstance(repo, Repository) else get_repo(repo)
    try:
        current = r.get_contents(name, ref=branch)
    except github.UnknownObjectException:
        # 404: the file does not exist on this branch yet, so create it.
        # Other errors (bad token, conflicts) are no longer swallowed here.
        result = r.create_file(name, message or f'Create {name}', body, branch=branch)
    else:
        # The file exists: update it, passing the SHA of its current content
        result = r.update_file(
            current.path,
            message or f'Update {name}',
            body,
            current.sha,
            branch=branch,
        )
    if verbose:
        print(result)


def delete_file(
    name: str,
    message: Optional[str] = None,
    *,
    repo: Optional[Union[Repository, str]] = None,
    branch: str = "main",
    verbose: bool = False,
):
    """Delete the file in a single commit; raises if the file does not exist."""
    r = repo if isinstance(repo, Repository) else get_repo(repo)
    message = message or f'Delete {name}'
    current = r.get_contents(name, ref=branch)
    deleted = r.delete_file(
        current.path,
        message,
        current.sha,
        branch=branch,
    )
    if verbose:
        print(deleted)


assert os.getenv('GITHUB_TOKEN'), 'Set GITHUB_TOKEN variable'
repo = "<YOUR_GITHUB_NAME>/<REPO_NAME>"

upsert_file("README.md", "NEW BODY", repo=repo, verbose=True)      # creates the file
upsert_file("README.md", "UPDATED BODY", repo=repo, verbose=True)  # modifies it
delete_file("README.md", repo=repo, verbose=True)                  # deletes it
Save the code as main.py and execute it:
python main.py
It prints something like:
{'content': ContentFile(path="README.md"), 'commit': Commit(sha="a6c540fec9b1b02e21acbb0ddd790efb6b7cb33f")}
{'commit': Commit(sha="2436e7ff2692a9af398dabd9eb9d1eee0f821954"), 'content': ContentFile(path="README.md")}
{'commit': Commit(sha="31fefb51e3510071777e4f4c8a0971de0a184f78"), 'content': NotSet}
Go to your GitHub repository and check the commits.
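The same history can also be read back from code for later analysis. The sketch below is one possible way to do it, assuming PyGithub's Repository.get_commits(path=...) and the same GITHUB_TOKEN variable; the summarize and fetch_history names are made up for this example, and the commit object at the bottom is an offline stand-in with an invented date, used only to show the output shape.

```python
import os
from datetime import datetime, timezone
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def summarize(commits: Iterable[Any]) -> List[Tuple[str, str, str]]:
    """Reduce commit objects to (short sha, ISO timestamp, first message line)."""
    rows = []
    for c in commits:
        git_commit = c.commit  # underlying Git commit: message, author, date
        first_line = (git_commit.message.splitlines() or [""])[0]
        rows.append((c.sha[:7], git_commit.author.date.isoformat(), first_line))
    return rows


def fetch_history(repo_name: str, path: str) -> List[Tuple[str, str, str]]:
    """List the commits that touched `path` (needs network access and a token)."""
    import github  # imported here so summarize() also works without PyGithub

    g = github.Github(os.environ["GITHUB_TOKEN"])
    return summarize(g.get_repo(repo_name).get_commits(path=path))


# Offline stand-in shaped like a PyGithub Commit; the date is invented
fake = SimpleNamespace(
    sha="a6c540fec9b1b02e21acbb0ddd790efb6b7cb33f",
    commit=SimpleNamespace(
        message="Create README.md",
        author=SimpleNamespace(
            date=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
        ),
    ),
)
rows = summarize([fake])
print(rows)
```

Against a real repository, fetch_history("<YOUR_GITHUB_NAME>/<REPO_NAME>", "README.md") should return one row for each of the create, update and delete commits made by the script.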
Top comments (1)
Amusing (ab)use of GitHub 😁
I would probably look at a source/version control system as an event store rather than a database (where I would expect to see explicit relationships between data items/tables/documents), but you certainly get atomicity, transactional processing and recovery features!