Zenika

How to backup your Firestore data automatically

By Julien Landuré ・ 2 min read

My team and I use many Firebase features, including Firestore.

But there is no simple way to back up the data regularly.

We created a tiny Docker image, zenika/alpine-firestore-backup, and this short tutorial to run backups automatically on Google Cloud Platform with serverless services such as Cloud Run and Cloud Scheduler.

Step1: Create a bucket on GCP

Create a GCP Coldline bucket and note its name.
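
If you prefer the command line to the console, the bucket can also be created with gsutil. This is a minimal sketch and not part of the original setup: the function name, the bucket name and the location are placeholders to adapt.

```shell
# Create a Coldline bucket for the Firestore exports.
# $1: bucket name (must be globally unique), $2: bucket location.
create_backup_bucket() {
  gsutil mb -c coldline -l "$2" "gs://$1"
}

# Example call (hypothetical names):
# create_backup_bucket my-firestore-backups europe-west1
```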

Step2: Create a service account

Create a GCP service account with the following roles:

  • Owner
  • Cloud Datastore Owner
  • Cloud Datastore Import Export Admin
  • Storage Admin

Then, download the JSON private key file.
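
The same step can be scripted with gcloud. The sketch below assumes the four roles listed above map to the role IDs roles/owner, roles/datastore.owner, roles/datastore.importExportAdmin and roles/storage.admin; the function name and the example names are hypothetical.

```shell
# Create the service account, grant it the four roles, and download a JSON key.
# $1: project ID, $2: service account name.
create_backup_service_account() {
  project_id=$1
  sa_name=$2
  sa_email="${sa_name}@${project_id}.iam.gserviceaccount.com"

  gcloud iam service-accounts create "$sa_name" --project "$project_id" || return 1

  # One policy binding per role.
  for role in roles/owner roles/datastore.owner \
              roles/datastore.importExportAdmin roles/storage.admin; do
    gcloud projects add-iam-policy-binding "$project_id" \
      --member "serviceAccount:${sa_email}" --role "$role" || return 1
  done

  # Writes the private key to key.json in the current directory.
  gcloud iam service-accounts keys create key.json \
    --iam-account "$sa_email" --project "$project_id"
}

# Example call (hypothetical names):
# create_backup_service_account my-project firebase-backup
```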

Step3: Create your env variables for Cloud Run

Please fill in the following information:

  • GCLOUD_PROJECT_ID
  • GCLOUD_BUCKET_NAME
  • GCLOUD_SERVICE_KEY

For the GCLOUD_SERVICE_KEY, make a base64 encoded string using this command:

cat key.json | base64
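
GNU base64 wraps its output at 76 columns by default, and a wrapped value is awkward to paste into an environment variable field. The sketch below is one way to get a single-line value and check that it decodes back to the original file. The function name is hypothetical, and it assumes a base64 that accepts -d (GNU coreutils, BusyBox, recent macOS).

```shell
# Encode a key file as a single-line base64 string and verify the round trip.
# $1: path to the JSON key file. Prints the encoded string on success.
encode_service_key() {
  encoded=$(base64 < "$1" | tr -d '\n') || return 1
  # Decode again and compare byte for byte with the original file.
  printf '%s' "$encoded" | base64 -d | cmp -s - "$1" || return 1
  printf '%s\n' "$encoded"
}

# Example call:
# GCLOUD_SERVICE_KEY=$(encode_service_key key.json)
```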

Step4: Set up Cloud Run

Cloud Run is a serverless service that automatically serves your containers over HTTP.

Create a Cloud Run service using the public image gcr.io/zenika-hub/alpine-firestore-backup.

Be careful to:

  • Choose the image with the latest tag
  • Choose "Cloud Run (fully managed)" and a location
  • Enter a service name
  • Select "Allow unauthenticated invocations"
  • In the "Show optional settings / Environment variables", set the 3 environment variables seen in the previous section
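
The console settings above roughly correspond to a single gcloud run deploy command. This is a sketch under assumptions: the function name, service name and region are placeholders, and the three variables come from Step3. Standard base64 output contains no commas, so the default comma delimiter of --set-env-vars is safe here.

```shell
# Deploy the public backup image to Cloud Run (fully managed).
# $1: service name, $2: region, $3: project ID, $4: bucket name,
# $5: base64-encoded service account key (single line).
deploy_backup_service() {
  gcloud run deploy "$1" \
    --project "$3" \
    --image gcr.io/zenika-hub/alpine-firestore-backup \
    --platform managed \
    --region "$2" \
    --allow-unauthenticated \
    --set-env-vars "GCLOUD_PROJECT_ID=$3,GCLOUD_BUCKET_NAME=$4,GCLOUD_SERVICE_KEY=$5"
}

# Example call (hypothetical names):
# deploy_backup_service alpine-firestore-backup europe-west1 my-project my-firestore-backups "$GCLOUD_SERVICE_KEY"
```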

You can test the service using your browser: https://alpine-firestore-backup-XXX-run.app/

Save the URL of your Cloud Run service; Cloud Scheduler will call it in the next step.
For example: https://alpine-firestore-backup-XXX-run.app/backup


Step5: Launch with Cloud Scheduler

Cloud Scheduler lets you schedule a cron job that calls an HTTPS endpoint at regular intervals.

Create a Cloud Scheduler job that sends a request to your Cloud Run service on the schedule you need.

For example, with the schedule 0 3 * * 1, a backup runs every Monday at 3:00 am and is stored in your bucket.
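
The job can also be created from the command line. This sketch assumes the /backup endpoint accepts a GET request, as the browser test suggests; the function name, job name and location are placeholders. Unless --time-zone is set, Cloud Scheduler interprets the schedule in UTC.

```shell
# Create a Cloud Scheduler job that calls the /backup endpoint.
# $1: job name, $2: Cloud Scheduler location, $3: cron schedule,
# $4: base URL of the Cloud Run service (a trailing slash is tolerated).
create_backup_schedule() {
  gcloud scheduler jobs create http "$1" \
    --location "$2" \
    --schedule "$3" \
    --uri "${4%/}/backup" \
    --http-method GET
}

# Example call: every Monday at 3:00 am
# create_backup_schedule firestore-backup europe-west1 "0 3 * * 1" https://alpine-firestore-backup-XXX-run.app
```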


Step6: Monitor the backup operations

You can also check the current status of each backup operation at the following URL: https://alpine-firestore-backup-XXX-run.app/list
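
The /backup and /list endpoints can be called from a terminal as well as from a browser. A small sketch with curl follows; the function name is hypothetical, and the format of the response is not assumed here.

```shell
# Call an endpoint of the backup service and fail on HTTP errors.
# $1: base URL of the Cloud Run service, $2: path such as /backup or /list.
call_backup_service() {
  curl --fail --silent --show-error "${1%/}$2"
}

# Example calls:
# call_backup_service https://alpine-firestore-backup-XXX-run.app /backup
# call_backup_service https://alpine-firestore-backup-XXX-run.app /list
```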

Conclusion

Feel free to have a look at the Docker image we created for this operation here on GitHub or here on Docker Hub.

We also maintain other images, such as a popular and small headless Chromium image called zenika/alpine-chrome.

Julien Landuré

@jlandure

CTO at Zenika, Julien is also a Google Developer Expert for Cloud. He loves using cloud technologies and managed services to get results quickly. He founded GDG Nantes and organized DevFest Nantes.

Zenika

We are a software development company whose mission is to drive change via IT innovation. Many of our consultants have written books, contribute to open source, teach classes and speak at popular meet-ups and conferences.

Discussion

 

Thanks for this great idea!

However, I'm skeptical about security management:

  • Does the service account really need the Owner role? Isn't that too much?
  • Why add the other roles if the service account is already Owner?
  • The key for this highly privileged account is merely base64 encoded and left in the clear in an environment variable!

Why not simply consider this:

  • Rely on the Cloud Run identity and grant it only the roles it needs
  • If you really need an additional/external service account, you could consider berglas. If your code is in Go or Python, you can use it easily (I wrote the Python lib for reading secrets from a bucket)
 

Hi Guillaume 👋

Thanks for your feedback. 👍
I invite you to report the error on the project's GitHub here.

For the service account and the Owner role, I just followed the documentation here.
Perhaps we could use the Cloud Run service account.

This first tutorial on "how to backup your Firestore data" is meant to show a simple use case. I understand your advice on security management. Your idea to use KMS is interesting.

Thank you.

 

I created issue #7 and pull request #8.

Security could be tightened further with a private Cloud Run service and a Cloud Scheduler job that uses a service account identity granted the run.invoker role to call Cloud Run.

Unrelated comment: I hope you enjoy your GDE Summit weekend!

Best
Guillaume

 

I am getting an error when hitting the list page:
2019-08-20 13:21:49.832 SAST ERROR: (gcloud.beta.firestore.operations.list) PERMISSION_DENIED: The caller does not have permission

firebase-backup@xxxxxx.gserviceacc... firebase-backup
Cloud Datastore Import Export Admin
Cloud Datastore Owner
Owner
Storage Admin

 

Hi Daniel 👋

Thanks for your feedback. 👍
I invite you to report the error on the project's GitHub here.

Perhaps you didn't enable the APIs 🤔
Be careful: as the docs say, only GCP projects with billing enabled can use the export and import functionality. Are you using the Spark plan on Firebase? 😇

If you find a solution, please reply to share it with the community. 🙌

 

Which APIs are required? I don't see any mention of them in either the blog post or the GitHub repository.

 

Thanks Julien, I only followed this blog post.
I'll take a look at the GitHub link as well.

We are using the Blaze plan on the project.

 

What is the plan for the recover/import of these backups?

 

Hi Daniel 👋
You can manually run the following commands in Cloud Shell to trigger the import.