
Migrating from GitHub to Gitea

Joseph D. Marhee

I recently made the move, for a variety of reasons, from a public GitHub profile to private git infrastructure. One of the solutions I tried was Gitea, an open-source, self-hosted solution with a GitHub-like UI and a very accessible API.

Gitea has facilities in the UI to import repositories, but I had a lot of repos to recreate and wanted to migrate quickly, so I wrote a script to consume both ends' APIs.

Import the GitHub Python client (PyGithub), along with the os, requests, and json packages, into your script, and plug in your GitHub and Gitea credentials and endpoint (the server address) information:

from github import Github
import os
import json
import requests

#Github
GH_ACCESS_TOKEN = os.environ['GH_ACCESS_TOKEN']
#Gitea
GITEA_ACCESS_TOKEN = os.environ['GITEA_ACCESS_TOKEN']
GITEA_USER = ""
GITEA_PASS = ""
TARGET_HOST = "https://git.yourserver.co"
MIGRATE_URI = "/api/v1/repos/migrate"
ENDPOINT = "%s%s" % (TARGET_HOST, MIGRATE_URI)

g = Github(GH_ACCESS_TOKEN)

EXCLUDE = []

In EXCLUDE, put a comma-delimited list of the GitHub repo names you don't want to migrate, e.g.:

EXCLUDE = ["repo1", "repo2"...]

Our first function just builds a list of your repos, along with the information Gitea requires to create a new repo on its end (mostly just the name, description, whether or not it's private, and the URL to clone from):

def getRepos(g):
    repos = []
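    for repo in g.get_user().get_repos():
        r = {}
        r['name'] = repo.name
        # clone_url is the HTTPS git URL; repo.url is the REST API URL, which can't be cloned
        r['url'] = repo.clone_url
        # description can be None on GitHub; send an empty string rather than "None"
        r['description'] = repo.description or ""
        # keep this a real boolean so it serializes as JSON true/false
        r['private'] = repo.private
        repos.append(r)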
    return repos

This returns a list of dictionaries structured like:

{"name": "repo_name", "url": "https://github.com/...", ... }
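To see the shape of that list without calling the GitHub API, here is a minimal sketch that runs the same field extraction over stand-in objects. FakeRepo and repo_to_dict are hypothetical names for illustration, the URLs are placeholders, and the sketch assumes the clone URL is read from the repository object's clone_url attribute:

```python
from collections import namedtuple

# Hypothetical stand-in for the repository objects PyGithub returns;
# only the attributes this sketch reads are modelled.
FakeRepo = namedtuple("FakeRepo", ["name", "clone_url", "description", "private"])

def repo_to_dict(repo):
    """Pick out the fields needed to recreate the repo on the Gitea side."""
    return {
        "name": repo.name,
        "url": repo.clone_url,
        # A missing description becomes an empty string instead of None
        "description": repo.description or "",
        "private": repo.private,
    }

# Placeholder data, not real repositories
sample = [
    FakeRepo("dotfiles", "https://github.com/example/dotfiles.git", None, False),
    FakeRepo("secret-project", "https://github.com/example/secret-project.git", "WIP", True),
]
repos = [repo_to_dict(r) for r in sample]
print(repos)
```

Keeping private as a boolean and the description as a plain string means the dictionary can be passed to json.dumps later without any further conversion.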

The next function, which we'll start here, takes the fields of an object like the one above and creates a new repository:

def createRepo(source_url,name,description,private):
    headers = { "accept": "application/json", "content-type": "application/json" }
    headers["Authorization"] = "token %s" % (GITEA_ACCESS_TOKEN)

    # Use a real boolean so json.dumps emits false rather than the string "false"
    migrate_data = { "mirror": False,  "uid": 1 }
    migrate_data["auth_password"] = GITEA_PASS
    migrate_data["auth_username"] = GITEA_USER
    migrate_data["description"] = description
    migrate_data["repo_name"] = name
    migrate_data["private"] = private
    migrate_data["clone_url"] = source_url
...

The above creates the required authentication headers, and then the request body from the repo data we gathered in the getRepos function. We'll finish this function by sending the request:

...

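    try:
        # requests expects headers as a dict, not a JSON string
        r = requests.post(url=ENDPOINT, data=json.dumps(migrate_data), headers=headers)

        # Gitea answers a successful migration with 201 Created, so treat any non-error status as success
        if not r.ok:
            return "Non-OK Response: %s" % (r.status_code)
        else:
            return "Done: %s" % (source_url)
    except Exception as e:
        return "Request failed: %s" % (e)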
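Before pointing the script at a live server, it can help to look at exactly what createRepo would send. The sketch below builds the same headers and body but makes no network call. build_migrate_request is a hypothetical helper, and the token and clone URL are placeholders:

```python
import json

def build_migrate_request(source_url, name, description, private,
                          token, user="", password="", uid=1):
    """Return (headers, body) for a POST to /api/v1/repos/migrate without sending it."""
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "Authorization": "token %s" % token,
    }
    payload = {
        "mirror": False,
        "uid": uid,
        "auth_username": user,
        "auth_password": password,
        "description": description,
        "repo_name": name,
        "private": private,
        "clone_url": source_url,
    }
    # sort_keys only makes the printed body stable and easier to compare
    return headers, json.dumps(payload, sort_keys=True)

headers, body = build_migrate_request(
    "https://github.com/example/dotfiles.git", "dotfiles", "", False,
    token="PLACEHOLDER_TOKEN")
print(headers["Authorization"])
print(body)
```

Printing the body this way is a cheap check that the booleans come out as JSON true/false rather than quoted strings.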

Now, to connect these two tasks, we'll create a handler that takes the repository list and calls our second function to create a new repo for each entry:

def runMigration(r,x):
    exclude_repos = x
    for repo in r:
        # Compare by name: repo is a dict, while the exclude list holds plain name strings
        if repo['name'] not in exclude_repos:
            print("Working on %s" % (repo['name']))
            print(createRepo(repo['url'],repo['name'],repo['description'],repo['private']))
        else:
            print("Excluding %s" % (repo['name']))
    return "Done"

You'll see this iterates over the repo list, checks that you have not excluded each repo, and then creates the new repo.
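If you'd like to preview which repos will be migrated before anything is created, the exclusion check can be pulled out on its own. This is just a sketch: split_by_exclusion is a hypothetical helper, and it assumes each repo is a dictionary with a name key, like the ones getRepos builds:

```python
def split_by_exclusion(repos, exclude_names):
    """Partition repo dicts into (to_migrate, skipped) by comparing names."""
    excluded = set(exclude_names)
    to_migrate = [r for r in repos if r["name"] not in excluded]
    skipped = [r for r in repos if r["name"] in excluded]
    return to_migrate, skipped

repos = [{"name": "repo1"}, {"name": "repo2"}, {"name": "repo3"}]
to_migrate, skipped = split_by_exclusion(repos, ["repo2"])
print([r["name"] for r in to_migrate])  # ['repo1', 'repo3']
print([r["name"] for r in skipped])     # ['repo2']
```

The important detail is that the comparison is on the name string; a whole dictionary compared against a list of names would never match.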

Putting it all together, it will look like this, and be ready to run:

Github Migration Script

Whatever git solution you choose, a script like this can similarly process data from GitHub and create new repos on the new host. If the host has an API, a similar flow for your createRepo function will likely suffice! Alternatively, you can take the same repo data and use a library like GitPython to create the new clones locally. Either way, this can help make a move to a decentralized platform that much more painless!
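As a rough sketch of the local-clone route, assuming GitPython is installed and each repo dictionary carries the name and url keys from getRepos, something like the following could work. clone_all is a hypothetical helper, and the clone argument exists only so the loop can be tried out without touching the network:

```python
import os

def clone_all(repos, base_dir, clone=None):
    """Clone each repo dict into base_dir/<name> and return the target paths."""
    if clone is None:
        # Imported lazily so the sketch can be exercised without GitPython installed
        from git import Repo
        clone = Repo.clone_from
    paths = []
    for repo in repos:
        target = os.path.join(base_dir, repo["name"])
        clone(repo["url"], target)
        paths.append(target)
    return paths

# Dry run with a stand-in clone function that only records what would be cloned
recorded = []
paths = clone_all(
    [{"name": "dotfiles", "url": "https://github.com/example/dotfiles.git"}],
    "backups",
    clone=lambda url, target: recorded.append((url, target)))
print(paths)
```

Called without the clone argument, the function falls back to GitPython's Repo.clone_from, which takes the source URL and the destination path.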

Discussion

Dian Fay

Interesting! I've been starting to consider the alternatives myself, for similar reasons. How would someone else authenticate to file an issue or pull request? Can they log in via OAuth with Google, social media, or GitHub accounts?

Joseph D. Marhee Author

A lot of solutions do allow public registration (if this isn't an organizational project; most of mine are, for example). Alternatively, a solution like Gitea or GitLab supports mirroring a project on GitHub (a helpful pipeline in if you want the public to contribute), and you can aggregate issues and pull requests that way.

GitLab, in particular, does support Single Sign-On (via Google and GitHub, I believe!)

Vinay Hegde

The premise of setting up a self-hosted Git solution is intriguing (Gitea appears to be very fast too), but from an infrastructure perspective, will it not create additional overhead to keep it highly available, maintained, and upgraded?

PS: Just learned GitHub started offering private repos for up to 3 collaborators as per this

Joseph D. Marhee Author

Yes, it requires operational overhead, as does any self-hosted solution.
This is useful, for example, for organizations seeking to control data flow (i.e. making this code only available on a private network, or to and from a CI/CD pipeline or deployment platform), or just anyone seeking to cease doing business with SaaS providers (my main motivation, for example)--cases where the operational overhead is not a compelling drawback.

Vinay Hegde

Perfect! Concerns around privacy are very valid nowadays, since all SaaS providers snoop on user data in one way or another in the name of rules, regulations, and compliance.

Vitali Pomanitski

For developing Effectedkeyboard, I use Google Git Repository. Effected keyboard (@Effectedkey) is based on Anysoftkeyboard, so I keep two private repositories there, including one for Anysoftkeyboard. It's comfortable when I need to reference it. I pay only for network traffic ($0.05 a month!), and they have secure payment and two-step authentication, so only someone with my phone can get access. Great tool, by the way.

Alex

I've been using Gitea on my own server for about 2 weeks. I chose it because I like to keep my code private. In the future I plan to sell my software somehow. =)
And of course I'd love to hire coders to help write code for my projects.
So, they push to my git and I learn from their code. That's all.

Sajjad

another beautiful developer

robencom

So, can I install Gitea on my company's intranet servers and use it as we use Github?

Joseph D. Marhee Author

Yes. It's one of many excellent options for running your own git services and has compatibility with a lot of the same tooling and integrations (CI/CD, testing, etc.)

Erwan ROUSSEL

What do you use for unit testing?

Joseph D. Marhee Author

When I tried Gitea, I found I was able to connect my existing Drone instance pretty easily.

I ultimately ended up on Bitbucket Server, which like Gitlab, has built-in pipeline functionality for test automation, but found the experience of testing with Gitea pretty similar to past experiences with self-managed git infrastructure.