Le Vuong

gen_data: A Powerful Tool for Test Data Generation

Introduction

When testing software, developers and testers often need large, realistic datasets to simulate real-world scenarios. gen_data addresses this need: it is a command-line tool that generates large amounts of test data in CSV format. Its help output summarizes the available options:

usage: gen_data [-h] -r ROWS [-c COLUMNS] [-t TITLES] csvfile

Generate CSV file with specfied number of rows, and column types.

positional arguments:
  csvfile

options:
  -h, --help            show this help message and exit
  -r ROWS, --rows ROWS  number of rows
  -c COLUMNS, --columns COLUMNS
                        List of colume type, in this format: "t t t:n ..." Where t is type (number), n is column length.
  -t TITLES, --titles TITLES
                        List of column titles

List of supported types: BOOL=1 INT=2 STRING=3 FLOAT=4 DATE=5 DATETIME=6
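
To make the column format concrete, here is a minimal Python sketch of how a column specification could be interpreted. It is not gen_data's actual implementation: `ColumnType`, `ColumnSpec`, and `parse_columns` are hypothetical names introduced for illustration. Only the type codes come from the help text. The help text shows space-separated items while the sample command further down uses commas, so this sketch assumes either separator is acceptable.

```python
import re
from enum import IntEnum
from typing import List, NamedTuple, Optional


class ColumnType(IntEnum):
    """Type codes as listed in the help text."""
    BOOL = 1
    INT = 2
    STRING = 3
    FLOAT = 4
    DATE = 5
    DATETIME = 6


class ColumnSpec(NamedTuple):
    """Hypothetical description of one column: its type and optional length."""
    kind: ColumnType
    length: Optional[int]  # None when the item has no ":n" suffix


def parse_columns(spec: str) -> List[ColumnSpec]:
    """Parse a spec such as "1 3 3:20 2" into column descriptions.

    Assumption: items may be separated by whitespace or commas.
    """
    columns = []
    for item in re.split(r"[,\s]+", spec.strip()):
        if not item:
            continue  # skip the empty string produced by an empty spec
        type_code, _, length = item.partition(":")
        columns.append(
            ColumnSpec(ColumnType(int(type_code)), int(length) if length else None)
        )
    return columns


if __name__ == "__main__":
    for column in parse_columns("1 3 3:20 2"):
        print(column.kind.name, column.length)
```

An unknown type code such as `9` raises a `ValueError` from the `ColumnType` lookup, which is one simple way to reject invalid specifications early.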

How to set up?

# Clone the repo
$ git clone git@github.com:patfinder/gen_data.git

# Move into the `gen_data` folder, then run the command below to install the tool.
# After this, `gen_data` is available as a command that you can run directly.
$ cd gen_data
$ pip install -e .

# Show information about the installed package
$ pip show gen_data
Name: gen-data
Version: 0.0.1
Summary: A convenient tool for generating big test data.
Home-page: https://github.com/patfinder/gen_data/
Author: Le Vuong Nguyen
Author-email: vuong.se@gmail.com
License: UNKNOWN
Location: ~/myrepos/gen_data
Requires: 
Required-by: 

Usage

# Show help for the command
$ gen_data --help

# Run a sample command to generate a CSV file with 5 rows,
# columns of (bool, string, string with length of 20, int), and the given column titles.
# List of supported types: BOOL=1 INT=2 STRING=3 FLOAT=4 DATE=5 DATETIME=6

$ gen_data f1.csv -r 5 -c"1,3,3:20,2" -t"is_active,name,job_desc,score"

# The command above writes f1.csv with content like the following
# (note the extra leading `id` column)
$ cat f1.csv
id,is_active,name,job_desc,score
1,False,Louis Martinez,Really tonight we.,559
2,True,Larry Williams,Moment word camera.,845
3,True,Brandon Williams,Wear your consumer.,677
4,False,Chelsea Zamora,Identify itself let.,384
5,True,Jonathan Collier MD,Offer popular.,502
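
Once the file exists, a test can load it with Python's standard `csv` module. The sketch below is an illustration, not part of gen_data: `load_rows` is a hypothetical helper, and the column names and conversions assume the titles and types used in the sample command above. To keep the example self-contained, it reads the sample output from a string instead of from disk.

```python
import csv
import io

# Sample content copied from the f1.csv output above.
SAMPLE = """\
id,is_active,name,job_desc,score
1,False,Louis Martinez,Really tonight we.,559
2,True,Larry Williams,Moment word camera.,845
3,True,Brandon Williams,Wear your consumer.,677
4,False,Chelsea Zamora,Identify itself let.,384
5,True,Jonathan Collier MD,Offer popular.,502
"""


def load_rows(handle):
    """Read CSV rows and convert each field to a Python type.

    Assumption: the columns match the sample command
    (id, is_active, name, job_desc, score).
    """
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "id": int(raw["id"]),
            "is_active": raw["is_active"] == "True",  # booleans are written as True/False
            "name": raw["name"],
            "job_desc": raw["job_desc"],
            "score": int(raw["score"]),
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE))
print(len(rows), "rows loaded; first name:", rows[0]["name"])
```

To read the real file instead, pass an open file handle, for example `with open("f1.csv", newline="") as f: rows = load_rows(f)`.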