Rodel E. Dagumampan
Seed your SqlServer, PostgreSql and MySql databases with CSV files using yuniql migrations


Let me start by saying, I am writing this both to create awareness of a tool I have built and to share my experiences that I hope help the community make software better, faster, and safer. :)

In the past 8 months, I have been building project yuniql (yuu-nee-kel), an open source schema versioning and database migration tool made with .NET Core. From the beginning, one of the primary use cases we tried to address is seeding our databases, both for environment-specific deployment and for supporting integration tests. Every fresh database carries an initial set of data such as lookup tables and baseline data. Running integration tests and regression tests requires snapshots of data, or samples, to support the scenario being tested.

While we can always script out the initial data, an efficient way is to just drop CSV files into a folder and have them imported into destination tables during deployment, test execution, or application startup. Using CSV files also allows our Test Engineers and Business SMEs to prepare data themselves and deliver it to development teams so it can be fed into regression tests.

Such is the birth of yuniql's bulk import support for CSV files as seed data.

Seeding a database with CSV files and yuniql

In this article, I am using a sample HR database. Let's prepare a baseline schema version of the database and seed the regions, countries, locations, departments, and jobs tables. Then we will create a new version for the employees and dependents tables.


Step 1: Install yuniql CLI 👉

We begin by getting the yuniql CLI installed. We can install the yuniql CLI in several ways: via Chocolatey, as a dotnet global tool, or directly from source. Choose what works for you.

Option 1: Install via .NET Core Global Tool. Requires .NET Core 3.0 SDK installed.

dotnet tool install -g yuniql.cli

Option 2: Install via Chocolatey package manager. Requires choco installed.

choco install yuniql

Option 3: Use Powershell and append to PATH

Invoke-WebRequest -Uri -OutFile  "c:\temp\"
Expand-Archive "c:\temp\" -DestinationPath "c:\temp\yuniql-cli"
$Env:Path += ";c:\temp\yuniql-cli"

Option 4: Download the zipped package containing the yuniql.exe file and extract it to our workspace directory.

Step 2: Initialize schema repository, call yuniql init 🍨

yuniql init 

When yuniql init is issued, a baseline directory structure is created. Each version directory represents an atomic version of our database, executed in an all-or-nothing fashion. Utility directories are also created to cover pre- and post-migration activities. A typical starter project workspace looks like this.
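Based on the default yuniql template, the generated workspace looks roughly like this (exact entries may vary between yuniql versions):

```
.
├── _init       # scripts executed once, on first run
├── _pre        # scripts executed before every migration run
├── v0.00       # baseline version: schema scripts + CSV seed files
├── _draft      # work-in-progress scripts, executed after versions
├── _post       # scripts executed after every migration run
├── _erase      # scripts to tear the database down
├── Dockerfile
└── README.md
```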


Step 3: Create baseline schema structure 🍭

Our next step is to prepare our Baseline version. A Baseline version is the v0.00 of our database schema and initial data. A baseline helps create full visibility of our database schema's evolution. Here, we prepare a set of CREATE TABLE scripts in a setup-tables.sql file.
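As a minimal sketch, the baseline script can be dropped straight into the v0.00 directory. The column definitions below are illustrative assumptions, not the exact schema of the sample database:

```shell
# Write a minimal setup-tables.sql into the baseline version directory.
# The regions/countries columns are assumptions for illustration only.
mkdir -p v0.00
cat > v0.00/setup-tables.sql <<'EOF'
CREATE TABLE dbo.regions (
    region_id INT PRIMARY KEY,
    region_name VARCHAR(50) NOT NULL
);

CREATE TABLE dbo.countries (
    country_id CHAR(2) PRIMARY KEY,
    country_name VARCHAR(60) NOT NULL,
    region_id INT NOT NULL REFERENCES dbo.regions (region_id)
);
EOF
```

Every .sql file inside a version directory runs as part of that version's all-or-nothing migration, so related tables can be split across several files if that reads better.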


Step 4: Prepare and add CSV files in the baseline 🍹

Let's now prepare the CSV files. Each CSV file name represents the fully qualified name of the destination table. The file dbo.regions.csv will be bulk loaded into the dbo.regions table. We may use another schema, or keep it simple with regions.csv, and it will use dbo as the default schema.
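A seed file sits next to the scripts in the same version directory. The region values below mirror the common HR sample data, but treat them as placeholder rows:

```shell
# Drop a CSV seed file into the baseline version directory.
# First row is the header; column names must match the destination table.
mkdir -p v0.00
cat > v0.00/dbo.regions.csv <<'EOF'
region_id,region_name
1,Europe
2,Americas
3,Asia
4,Middle East and Africa
EOF
```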


Step 5: Let's do this, call yuniql run 🏃

Let's verify by deploying our database locally. Here, we can use a SQL Server container to speed up the process. In principle, we just need a database server we can connect to.

docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=P@ssw0rd!" -p 1400:1433 -d mcr.microsoft.com/mssql/server:2019-latest
SETX YUNIQL_CONNECTION_STRING "Server=localhost,1400;Database=helloyuniql;User Id=SA;Password=P@ssw0rd!"

yuniql run -a

When yuniql run is issued the first time, it inspects the target database and creates the required table to track the versions applied. The -a flag tells yuniql to also create the database. All script files in the _init directory are executed only this first time. The order of execution is as follows: _init, _pre, vx.xx up to the latest version, _draft, and finally _post.

All CSV files are bulk loaded after the script files are executed.


Let's verify how the data looks now in SQL Server. Notice that yuniql created dbo.__yuniqldbversions to keep a record of versions already applied. This ensures the CSV files will not be imported again the next time we call yuniql run.


Where do we go from here? 🚧

This is one of the first steps toward continuous delivery of changes to our databases. We can take this further by committing the workspace into a git repository and establishing CI/CD pipelines to run the migrations against different environments. Yuniql is also released as a free Azure DevOps Task and as a Docker image.

What about other database platforms? ❄️

PostgreSql- and MySql-compatible scripts can be executed in the same way with the --platform parameter.

yuniql run -a --platform postgresql
yuniql run -a --platform mysql
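Each platform also needs its own connection string in YUNIQL_CONNECTION_STRING. The hosts, ports, and credentials below are placeholders (an Npgsql-style string for PostgreSql, a MySql-style string for MySql); use SETX on Windows, export on Linux/macOS shells:

```shell
# Placeholder connection strings — adjust host, port, and credentials
# for your environment before running yuniql.
export YUNIQL_CONNECTION_STRING="Host=localhost;Port=5432;Username=app;Password=P@ssw0rd!;Database=helloyuniql"   # PostgreSql
# export YUNIQL_CONNECTION_STRING="Server=localhost;Port=3306;Database=helloyuniql;Uid=root;Pwd=P@ssw0rd!"        # MySql
echo "$YUNIQL_CONNECTION_STRING"
```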

The latest release of yuniql supports CSV import for SQL Server, PostgreSql and MySql. I am now maturing the implementation for Snowflake DW, Azure Synapse Analytics, and CockroachDB which are equally exciting data platforms. Contributors are welcome! :)

Parting words ㊗️

Using CSV files to bulk import baseline data and test samples is an effective approach to seeding databases. CSV files also allow non-SQL users to help developers prepare datasets to support the software development process. As yuniql leverages the native bulk import APIs of the target platform, such as SqlBulkCopy for SqlServer, we can import large CSV files pretty fast.

Thanks if you have reached this far!!! 🍻 This is my first DEV post and I guess I had fun with emojis :D The source for this article can be cloned here.

P.S. Please support yuniql by clicking GitHub Star! For a young project like this, a star can help capture more user experiences and improve the tool in its early stage.

Thanks thanks!
