DEV Community

nchika
nao1215/csv - Read CSV with validation in Go

What is the nao1215/csv package

The nao1215/csv package is a library that performs validation when loading CSV files. Validation is executed based on the rules specified in the struct tags.

The nao1215/csv package was developed with inspiration from go-playground/validator and shogo82148/go-header-csv. I would like to express my gratitude here.

How to use

Attach a "validate:" tag to each field of your struct and write the validation rules after it. The order of the columns in the CSV must match the order of the field definitions in the struct; the csv package does not reorder them automatically.

Call Decode on the value returned by csv.NewCSV and pass it a pointer to a slice of the tagged struct. The csv package validates each row against the struct tags and, if there are no errors, stores the results in the slice. If there are errors, it returns them as []error.

package main

import (
    "bytes"
    "fmt"

    "github.com/nao1215/csv"
)

func main() {
    input := `id,name,age
1,Gina,23
a,Yulia,25
3,Den1s,30
`
    buf := bytes.NewBufferString(input)
    c, err := csv.NewCSV(buf)
    if err != nil {
        panic(err)
    }

    type person struct {
        ID   int    `validate:"numeric"`
        Name string `validate:"alpha"`
        Age  int    `validate:"gt=24"`
    }
    people := make([]person, 0)

    errs := c.Decode(&people)
    if len(errs) != 0 {
        for _, err := range errs {
            fmt.Println(err.Error())
        }
    }

    // Output:
    // line:2 column age: target is not greater than the threshold value: threshold=24.000000, value=23.000000
    // line:3 column id: target is not a numeric character: value=a
    // line:4 column name: target is not an alphabetic character: value=Den1s
}
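
The requirement that column order match field order can be pictured with a small standard-library sketch. This is not how nao1215/csv is implemented internally; decodeByPosition is a hypothetical helper, written only for illustration, that copies the i-th CSV column into the i-th struct field and never looks at the header names.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"reflect"
	"strings"
)

// person mirrors the struct in the example above, with string fields
// and no validation tags to keep the sketch short.
type person struct {
	ID   string
	Name string
	Age  string
}

// decodeByPosition is a hypothetical helper: it skips the header row and
// copies the i-th column of each record into the i-th field of person.
// The header names are never consulted, so only column order matters.
func decodeByPosition(input string) ([]person, error) {
	records, err := csv.NewReader(strings.NewReader(input)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	people := make([]person, 0, len(records)-1)
	for _, rec := range records[1:] { // records[0] is the header
		var p person
		v := reflect.ValueOf(&p).Elem()
		for i := 0; i < v.NumField() && i < len(rec); i++ {
			v.Field(i).SetString(rec[i])
		}
		people = append(people, p)
	}
	return people, nil
}

func main() {
	// The name and id columns are swapped relative to the struct definition.
	people, err := decodeByPosition("name,id,age\nGina,1,23\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", people[0]) // ID now holds "Gina"
}
```

With a purely positional mapping like this, a file whose columns are reordered puts values into the wrong fields, and a validator would then report errors against the wrong columns.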

Struct tags

Write the validation rules after "validate:" in the struct tag, using the rules in the tables below. To apply multiple rules, list them separated by commas.
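
As a rough illustration of the comma-separated format, the sketch below reads a "validate" tag with the standard reflect package and splits it into individual rules. The rule type and parseRules function are hypothetical names made up for this example; the library's own parser may work differently.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// rule is a hypothetical representation of one validation rule.
type rule struct {
	Name string // e.g. "gte"
	Arg  string // e.g. "0"; empty for rules without an argument
}

// parseRules splits a tag value such as "gte=0,lte=120" on commas and
// then splits each rule on the first "=".
func parseRules(tag string) []rule {
	var rules []rule
	for _, part := range strings.Split(tag, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		name, arg, _ := strings.Cut(part, "=")
		rules = append(rules, rule{Name: name, Arg: arg})
	}
	return rules
}

func main() {
	type person struct {
		Age int `validate:"gte=0,lte=120"`
	}
	field, _ := reflect.TypeOf(person{}).FieldByName("Age")
	for _, r := range parseRules(field.Tag.Get("validate")) {
		fmt.Printf("rule=%s arg=%q\n", r.Name, r.Arg)
	}
}
```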

Validation rule without arguments

Tag Name | Description
boolean | Check whether the value is a boolean
alpha | Check whether the value is alphabetic
numeric | Check whether the value is numeric
alphanumeric | Check whether the value is alphanumeric
required | Check that the value is not empty
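
To make these rules concrete, here is a minimal sketch of what such checks could look like using only the standard library. The exact semantics are assumptions on my part: ASCII-only letters and digits, "true"/"false" as the only boolean spellings, and the empty string failing every character check. The checks map and its helper functions are hypothetical and are not taken from the library.

```go
package main

import (
	"fmt"
	"strings"
)

// all reports whether s is non-empty and every rune satisfies ok.
// Treating the empty string as a failure is an assumption of this sketch.
func all(s string, ok func(rune) bool) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !ok(r) {
			return false
		}
	}
	return true
}

func isLetter(r rune) bool { return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') }
func isDigit(r rune) bool  { return r >= '0' && r <= '9' }

// checks maps each tag name from the table to a hypothetical predicate.
var checks = map[string]func(string) bool{
	"boolean": func(s string) bool {
		s = strings.ToLower(s)
		return s == "true" || s == "false" // assumed spellings
	},
	"alpha":   func(s string) bool { return all(s, isLetter) },
	"numeric": func(s string) bool { return all(s, isDigit) },
	"alphanumeric": func(s string) bool {
		return all(s, func(r rune) bool { return isLetter(r) || isDigit(r) })
	},
	"required": func(s string) bool { return s != "" },
}

func main() {
	fmt.Println(checks["alpha"]("Den1s"))        // false: contains a digit
	fmt.Println(checks["numeric"]("a"))          // false: not a digit
	fmt.Println(checks["alphanumeric"]("Den1s")) // true
}
```

The first two calls correspond to the name and id errors reported in the earlier example output.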

Validation rule with argument

Tag Name | Description
eq | Check whether the value is equal to the specified value, e.g. validate:"eq=1"
ne | Check whether the value is not equal to the specified value, e.g. validate:"ne=1"
gt | Check whether the value is greater than the specified value, e.g. validate:"gt=1"
gte | Check whether the value is greater than or equal to the specified value, e.g. validate:"gte=1"
lt | Check whether the value is less than the specified value, e.g. validate:"lt=1"
lte | Check whether the value is less than or equal to the specified value, e.g. validate:"lte=1"
min | Check whether the value is greater than or equal to the specified value, e.g. validate:"min=1"
max | Check whether the value is less than or equal to the specified value, e.g. validate:"max=100"
len | Check whether the length of the value is equal to the specified value, e.g. validate:"len=10"
oneof | Check whether the value is included in the specified values, e.g. validate:"oneof=male female prefer_not_to"
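
The sketch below shows how two of these rules, gt and oneof, might be evaluated against a raw CSV value. It is an illustration under assumptions, not the library's code: checkGt and checkOneOf are hypothetical helpers, values are compared as float64, and oneof candidates are split on whitespace. The gt error text copies the message format from the example output earlier in this article; the other messages are invented for the sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// checkGt returns an error unless value, parsed as a float, is strictly
// greater than threshold. Comparing as float64 is an assumption.
func checkGt(value string, threshold float64) error {
	v, err := strconv.ParseFloat(value, 64)
	if err != nil {
		return fmt.Errorf("target is not a number: value=%s", value)
	}
	if v <= threshold {
		return fmt.Errorf("target is not greater than the threshold value: threshold=%f, value=%f", threshold, v)
	}
	return nil
}

// checkOneOf returns an error unless value equals one of the
// whitespace-separated candidates in arg.
func checkOneOf(value, arg string) error {
	for _, candidate := range strings.Fields(arg) {
		if value == candidate {
			return nil
		}
	}
	return fmt.Errorf("target is not one of the values: oneof=%s, value=%s", arg, value)
}

func main() {
	fmt.Println(checkGt("23", 24)) // fails: 23 is not greater than 24
	fmt.Println(checkGt("25", 24)) // passes, prints <nil>
	fmt.Println(checkOneOf("female", "male female prefer_not_to"))
}
```

The other comparison rules (eq, ne, gte, lt, lte, min, max) would differ only in the comparison operator.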

Why I wrote nao1215/csv

It's the beginning of a long tale of memories.

Have you ever had the task of finding 10 errors within a CSV containing tens of thousands of rows and over 300 columns? I've experienced that several times.

I developed nao1215/sqly (covered in a separate article), which allows executing SQL on CSV files, because I found correcting CSVs to be challenging. However, sqly alone was insufficient.

In Japan, it's common to manage master data using Excel. However, it's not practical for apps to read data directly from Excel, so it's sometimes necessary to convert Excel to CSV and create apps that import the data into multiple database tables. Sometimes the CSV contains incorrect data (it's strange to have incorrect data in master data, isn't it?), which prevents it from being imported into the database.

[Figure: workflow]

I despised this task.

I just wanted to make it a little easier. I had someone else implement a validation process to check the CSV columns and identify where the errors were (I didn't do it myself). However, I believed the validation process could be written more generically.

Over a year has passed since I started contemplating this. To bid farewell to this challenge, I secluded myself in my room for a day and created nao1215/csv. It took over a year to accomplish what could have been done in just one day.

Next work

I plan to add more kinds of validation and make validation faster. As a side note, I am not currently working with CSVs; I spent that day to resolve lingering regrets from past work. Therefore, development of nao1215/csv may be slow.

If everyone could give it GitHub Stars, I believe development would go faster.

Thank you for reading this article.

Top comments (2)

Christophe Colombier

Interesting project, I'll have a look and review it if you are down for it

nchika

Thank you for your interest.
nao1215/csv is a library I wrote in a day, so I believe there are many bugs. I would appreciate it if you could try it and report any problems you encounter by opening an Issue.