DEV Community

Pavlo Romashchenko

Fake data generator for .NET with Bogus

A bit of background

Recently, we decided to move several of our Elasticsearch projects to the latest version. One of them is a client-facing search tool with more than one hundred searchable fields, running on Elasticsearch version 5. The new version changes how indexes work, along with a lot of other things, so we couldn't simply move from version 5 to version 7. We changed the architecture of our indexes, rewrote a lot of code and faced one question - how do we test our improvements?
I came up with the idea of creating a job that populates Elasticsearch with fake data. Brilliant! 💡

In search of a fake data tool

Searching the internet brought me to a good library, Bogus. It has a lot going for it:

  • Up-to-date library
  • More than fifty contributors
  • Few open issues (folks try to resolve them as soon as possible)
  • Wide API to use in your code
  • Great documentation and samples
  • Ease of use
  • Supports different locales
  • Amazing community extensions (also has premium extensions)
  • and many others...

Note
You may be surprised to find out that Elasticsearch, Microsoft and FluentValidation use this library. You can check it. 🤔

Let's try!

In this tutorial, I will show different examples and techniques for using this library by generating fake data for vehicles. We will also create unit tests and a few load tests to measure how fast the data is generated, and we will try different locales. Let's dive deep into the code!

Installation

To add the NuGet package from the CLI, run dotnet add package Bogus. In the Package Manager Console, run Install-Package Bogus. Either command installs the latest version.

Simple example

Here is my Car model:

    public class Car
    {
        public string VinCode { get; set; }
        public string RegistrationNumber { get; set; }
        public ColorEnum Color { get; set; }
        public DateTime Year { get; set; }
        public double Price { get; set; }
        public double EngineVolume { get; set; }
        public VehicleTypeEnum VehicleType { get; set; }
        public BrandEnum Brand { get; set; }
        public bool IsAllWheelDrive { get; set; }
        public Tire Tire { get; set; }
        public Transmission Transmission { get; set; }
    }

I also created FakeDataMapper - a mapper for properties, because each property needs its own rule for creating fake data.

// Default mapping rules for the Car model
public static Faker<Car> DefaultCar = new Faker<Car>()
    .RuleFor(x => x.Brand, f => f.PickRandom<BrandEnum>())
    .RuleFor(x => x.Color, f => f.Random.Enum<ColorEnum>())
    .RuleFor(x => x.EngineVolume, f => f.Random.Double(0, 5))
    .RuleFor(x => x.IsAllWheelDrive, f => f.Random.Bool())
    .RuleFor(x => x.Price, f => f.Random.Double(1000, 10000))
    .RuleFor(x => x.VinCode, f => f.Vehicle.Vin())
    .RuleFor(x => x.VehicleType, f => f.Random.Enum<VehicleTypeEnum>())
    .RuleFor(x => x.Year, f => f.Date.Past())
    .RuleFor(x => x.RegistrationNumber, f => f.Random.Guid().ToString());

As you can see, we use the library's wide API, and it is cool!
A few things are worth noting:

  1. We can use the library's data sets, for example the Vehicle class: f.Vehicle.Vin() returns a VIN code as a string
  2. We can use the Random API, which provides a lot of methods
  3. There are two ways to get a random enum value - f.PickRandom<BrandEnum>() or f.Random.Enum<ColorEnum>()
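To see such rules produce objects, here is a minimal sketch. The DemoCar model, DemoColor enum and SimpleUsageDemo class are hypothetical names made up for this illustration (a trimmed-down stand-in for the Car model above). It also calls UseSeed, which the mapper above does not use, so that repeated runs replay the same random sequence.

```csharp
using System;
using System.Collections.Generic;
using Bogus;

// Hypothetical trimmed-down model, used only for this sketch.
public enum DemoColor { Black, White, Red }

public class DemoCar
{
    public string VinCode { get; set; }
    public DemoColor Color { get; set; }
    public double Price { get; set; }
}

public static class SimpleUsageDemo
{
    // Builds a fresh generator and returns `count` fake cars.
    public static List<DemoCar> Generate(int count, int seed)
    {
        var faker = new Faker<DemoCar>()
            // Local seed: the same seed replays the same random sequence.
            .UseSeed(seed)
            .RuleFor(x => x.VinCode, f => f.Vehicle.Vin())
            .RuleFor(x => x.Color, f => f.PickRandom<DemoColor>())
            .RuleFor(x => x.Price, f => f.Random.Double(1000, 10000));

        return faker.Generate(count);
    }

    // Prints a few generated cars to the console.
    public static void PrintSample()
    {
        foreach (var car in Generate(3, seed: 42))
        {
            Console.WriteLine($"{car.VinCode} {car.Color} {car.Price:F2}");
        }
    }
}
```

Calling SimpleUsageDemo.PrintSample() prints three cars. A fixed seed is mostly useful in tests, where you want the same fake objects on every run.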

Advanced example

public static Faker<Car> AdvancedCar = new Faker<Car>()
    // StrictMode(true) ensures that every property has a rule.
    // By default, StrictMode is false
    .StrictMode(true)
    .RuleFor(x => x.VinCode, f => f.Vehicle.Vin())
    .RuleFor(x => x.RegistrationNumber, f => f.Random.Guid().ToString())
    // Pick a random value of the enum
    .RuleFor(x => x.Brand, f => f.PickRandom<BrandEnum>())
    .RuleFor(x => x.Color, f => f.Random.Enum<ColorEnum>())
    .RuleFor(x => x.EngineVolume, f => f.Random.Double(0, 5))
    .RuleFor(x => x.IsAllWheelDrive, f => f.Random.Bool())
    .RuleFor(x => x.Price, f => f.Random.Double(1000, 10000))
    // Another way to pick a random value of the enum
    .RuleFor(x => x.VehicleType, f => f.Random.Enum<VehicleTypeEnum>())
    .RuleFor(x => x.Year, f => f.Date.Past())
    .RuleFor(x => x.Tire, f => new Tire
    {
        Brand = f.PickRandom<TireBrandEnum>(),
        Diameter = f.Random.Number(10, 100)
    })
    .RuleFor(x => x.Transmission, f => new Transmission 
    {
        Brand = f.Company.CompanyName(),
        Number = f.Commerce.ProductName()
    })
    // Optional: After all rules are applied finish with the following action
    // Sometimes can be logging or other helpful logic
    .FinishWith((faker, car) => 
    {
        Console.WriteLine($"New car {car.VinCode} was generated successfully");
    });

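In this example we can see:

  1. StrictMode(true) - checks that every property of Car has a rule. Properties of nested objects created with new (Tire, Transmission) are not checked
  2. FinishWith - an optional action that runs after all rules have been applied to a generated object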
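To make the StrictMode check concrete, here is a small sketch of what happens when a rule is missing. DemoTire and StrictModeDemo are hypothetical names invented for this illustration. It relies on AssertConfigurationIsValid throwing Bogus.ValidationException for an incomplete strict configuration.

```csharp
using System;
using Bogus;

// Hypothetical model, used only for this sketch.
public class DemoTire
{
    public int Diameter { get; set; }
    public int Width { get; set; }
}

public static class StrictModeDemo
{
    // Strict generator that forgets the rule for Width.
    public static Faker<DemoTire> Incomplete() => new Faker<DemoTire>()
        .StrictMode(true)
        .RuleFor(x => x.Diameter, f => f.Random.Number(10, 100));

    // Same generator with the missing rule added.
    public static Faker<DemoTire> Complete() => Incomplete()
        .RuleFor(x => x.Width, f => f.Random.Number(100, 400));

    // Returns true when every property has a rule, false otherwise.
    public static bool IsValid(Faker<DemoTire> faker)
    {
        try
        {
            faker.AssertConfigurationIsValid();
            return true;
        }
        catch (ValidationException)
        {
            // Thrown by Bogus when a strict configuration has properties without rules.
            return false;
        }
    }
}
```

Here StrictModeDemo.IsValid(StrictModeDemo.Incomplete()) returns false because Width has no rule, while the Complete() generator passes the check.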

Locales

If you want data in a specific language, you can use locales! You just need to set the Locale property.

public static List<Car> GenerateAdvancedCar(int count, string locale = "en")
{
    var generator = FakeDataMapper.AdvancedCar;
    generator.Locale = locale;

    // Check configuration
    generator.AssertConfigurationIsValid();

    return generator.Generate(count);
}
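The method above changes Locale on the shared static FakeDataMapper.AdvancedCar instance. An alternative, sketched here under the assumption that building a fresh generator per locale is acceptable, is to pass the locale to the Faker<T> constructor. DemoTransmission and LocaleDemo are hypothetical names for this illustration.

```csharp
using System.Collections.Generic;
using Bogus;

// Hypothetical model, used only for this sketch.
public class DemoTransmission
{
    public string Brand { get; set; }
    public string Number { get; set; }
}

public static class LocaleDemo
{
    // Creates a generator bound to one locale, e.g. "en" or "de".
    public static Faker<DemoTransmission> CreateFaker(string locale) =>
        new Faker<DemoTransmission>(locale)
            .RuleFor(x => x.Brand, f => f.Company.CompanyName())
            .RuleFor(x => x.Number, f => f.Commerce.ProductName());

    // Generates `count` objects with data taken from the given locale.
    public static List<DemoTransmission> Generate(int count, string locale = "en") =>
        CreateFaker(locale).Generate(count);
}
```

With this shape, each call such as LocaleDemo.Generate(5, "de") works on its own generator, so concurrent callers asking for different locales do not affect each other.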

Here are my results:

[
   {
      "VinCode":"77BS77F3WSR911082",
      "RegistrationNumber":"2e09f4a8-61ca-448e-4b93-4e36f700c418",
      "Color":1,
      "Year":"2021-10-06T03:36:00.5403703+03:00",
      "Price":9891.462153238925,
      "EngineVolume":4.046757858734,
      "VehicleType":2,
      "Brand":22,
      "IsAllWheelDrive":false,
      "Tire":{
         "Brand":6,
         "Width":0.0,
         "Profile":0.0,
         "Diameter":70
      },
      "Transmission":{
         "Brand":"Wilkinson - Fritsch",
         "Number":"Handcrafted Metal Bacon"
      }
   }
]

Unit tests

It is simple to cover the generator with unit tests.

        private static IEnumerable<TestCaseData> Data
        {
            get
            {
                yield return new TestCaseData(100);
                yield return new TestCaseData(1000);
                yield return new TestCaseData(10000);
                yield return new TestCaseData(100000);
            }
        }

        [Test]
        public void TestAssertConfiguration_ShouldSuccess()
        {
            var generator = new FakeGenerator(1, "en");
            generator.AssertConfiguration();
        }

        [Test]
        [TestCaseSource(nameof(Data))]
        public void GenerateSimpleCar_ShouldSuccess(int count)
        {
            var generator = new FakeGenerator(count, "en");
            var data = generator.GenerateDefaultCars();

            Assert.NotNull(data);
            Assert.True(data.Count == count);
        }

Benchmarks

I think this is the most exciting chapter. How fast is the library? We will find out!

[Image: benchmark results]

I think these are okay results for the library. Probably, my PC is not so fast :)

The library also has its own benchmarks
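If you want a rough number for your own machine, a simple timing loop is enough to get started. The sketch below is an assumption-laden illustration, not the setup used for the results above: it uses Stopwatch rather than a dedicated benchmarking tool, and BenchCar and TimingDemo are hypothetical names.

```csharp
using System;
using System.Diagnostics;
using Bogus;

// Hypothetical model, used only for this sketch.
public class BenchCar
{
    public string VinCode { get; set; }
    public double Price { get; set; }
}

public static class TimingDemo
{
    // Generates `count` cars and reports how many were created and how long it took.
    public static (int Count, TimeSpan Elapsed) Measure(int count)
    {
        var faker = new Faker<BenchCar>()
            .RuleFor(x => x.VinCode, f => f.Vehicle.Vin())
            .RuleFor(x => x.Price, f => f.Random.Double(1000, 10000));

        // Warm-up run so one-time startup costs are not part of the measurement.
        faker.Generate(10);

        var stopwatch = Stopwatch.StartNew();
        var cars = faker.Generate(count);
        stopwatch.Stop();

        return (cars.Count, stopwatch.Elapsed);
    }

    // Times the same batch sizes as the unit tests in this article.
    public static void PrintTimings()
    {
        foreach (var count in new[] { 100, 1000, 10000, 100000 })
        {
            var (generated, elapsed) = Measure(count);
            Console.WriteLine($"{generated} cars: {elapsed.TotalMilliseconds:F1} ms");
        }
    }
}
```

Numbers from a loop like this vary between runs and machines, so treat them as an order-of-magnitude check rather than a precise benchmark.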

Summary

I found this library great and valuable in my projects whenever I needed to generate some data. Its wide API helps create fake data for different objects in different areas.

For example, say you need to test database performance with a large amount of data. How do you fill the database? You can use this library and create a simple console app to fill it up.

Another example: you want to create a simple Web API project with CRUD operations, but you need some data to work with. How do you generate it? You can use this library and store the data in some storage - a database, a cache or maybe a JSON file.
As you can see, this library has a great number of use cases.
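The JSON file option could look roughly like the sketch below. It assumes System.Text.Json is available (.NET Core 3.0 or later). StoredCar and JsonStoreDemo are hypothetical names for this illustration, and the file path is up to the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using Bogus;

// Hypothetical model, used only for this sketch.
public class StoredCar
{
    public string VinCode { get; set; }
    public double Price { get; set; }
    public bool IsAllWheelDrive { get; set; }
}

public static class JsonStoreDemo
{
    // Generates `count` fake cars and writes them to `path` as indented JSON.
    public static void Save(string path, int count)
    {
        var faker = new Faker<StoredCar>()
            .RuleFor(x => x.VinCode, f => f.Vehicle.Vin())
            .RuleFor(x => x.Price, f => f.Random.Double(1000, 10000))
            .RuleFor(x => x.IsAllWheelDrive, f => f.Random.Bool());

        var options = new JsonSerializerOptions { WriteIndented = true };
        File.WriteAllText(path, JsonSerializer.Serialize(faker.Generate(count), options));
    }

    // Reads the cars back, for example to seed an in-memory CRUD store.
    public static List<StoredCar> Load(string path) =>
        JsonSerializer.Deserialize<List<StoredCar>>(File.ReadAllText(path));
}
```

A Web API project could call JsonStoreDemo.Save("cars.json", 100) once and then serve the loaded list from its CRUD endpoints.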

I hope this article was helpful, and that you can use this library in your projects!
Write in the comments which library you use for fake data generation.
Also, if you have any questions about the library, you can ask me in the comments.
Thank you for reading ❤️ and have a nice day 😊

Useful Links

  • The code sample for this article on my GitHub
  • Full API of Bogus
  • Locales supported by Bogus
  • Community extensions
  • Premium extensions
  • Documentation
  • Good article on dev.to about Bogus
