DEV Community

Julien Gabriel

Symfony DbToolsBundle - anonymize your data

Illustrations are taken directly from the bundle's documentation: https://dbtoolsbundle.readthedocs.io

About the Bundle

This bundle lets you back up, restore, and anonymize databases. It is compatible with MySQL, MariaDB, PostgreSQL, and SQLite. For more details, see the documentation: https://dbtoolsbundle.readthedocs.io

Set up a GDPR-friendly workflow

Any company that sets up databases for testing and development faces the same challenge: copies of production data contain personal information, and keeping those copies GDPR-compliant often takes a lot of time and effort.

Anonymization workflow example

Basic implementation of anonymization on an entity

Let's take a look at a concrete example:



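use Doctrine\ORM\Mapping as ORM;
use MakinaCorpus\DbToolsBundle\Attribute\Anonymize;
use Symfony\Component\Security\Core\User\UserInterface;

class User implements UserInterface
{
    #[ORM\Id]
    #[ORM\GeneratedValue]
    #[ORM\Column]
    private ?int $id = null;

    // Handled by the core email anonymizer.
    #[ORM\Column(length: 180, unique: true)]
    #[Anonymize(type: 'email')]
    private ?string $email = null;

    // Handled by the core password anonymizer.
    #[ORM\Column(nullable: true)]
    #[Anonymize(type: 'password')]
    private ?string $password = null;

    // [...] etc...
}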



By adding the #[Anonymize] PHP attribute to entity properties, you define what gets anonymized and how. Many core anonymizers are already available, such as:

  • EmailAnonymizer
  • PasswordAnonymizer
  • IntegerAnonymizer
  • FloatAnonymizer
  • DateAnonymizer
  • NullAnonymizer
  • ConstantAnonymizer
  • Md5Anonymizer
  • StringAnonymizer
  • LastnameAnonymizer
  • FirstnameAnonymizer
  • LoremIpsumAnonymizer
  • AddressAnonymizer

You can even develop your own Anonymizer class if you need more specific functionality.

Once the configuration is done, run the following command to anonymize your database:



php bin/console db-tools:anonymize [options]
# Real example for local database
# php bin/console db-tools:anonymize --local-database



To reproduce the workflow shown above, use your CI/CD tool of choice to run a pipeline that distributes an up-to-date, fully anonymized database. The result is GDPR-friendly, though not necessarily 100% compliant.
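As a rough illustration, the anonymization step of such a pipeline could be wrapped in a small shell function. This is only a sketch under a few assumptions: the job is connected to a disposable copy of the data (never production), the `CONSOLE` variable and the `anonymize_local_database` name are hypothetical helpers introduced here, and `--no-interaction` is the standard Symfony Console option for skipping prompts. Only `db-tools:anonymize --local-database` comes from the command shown earlier.

```shell
#!/bin/sh
# Sketch of a CI step that anonymizes the database the job is connected to.
# Assumption: the job's database is a disposable copy, never production.
set -e

# CONSOLE is a hypothetical override, so the step can be dry-run without PHP
# (for example CONSOLE=echo prints the command instead of executing it).
CONSOLE="${CONSOLE:-php bin/console}"

anonymize_local_database() {
    # --local-database is the option shown in the article;
    # --no-interaction is the generic Symfony Console flag that skips prompts.
    # $CONSOLE is left unquoted on purpose so "php bin/console" splits into words.
    $CONSOLE db-tools:anonymize --local-database --no-interaction
}
```

In a pipeline, you would call `anonymize_local_database` after loading a copy of the data into the job's database, then publish the result with whatever export step your tooling provides. Running it with `CONSOLE=echo` is a quick way to check the command line before wiring it into a real job.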
