Matt Mayer

Leveling up your custom fake data with Faker.js

Faker.js is a very useful library that lets you generate large amounts of fake data in JavaScript.

Need a name?

faker.person.fullName() //Oliver Gorczany

A street address?

faker.location.streetAddress() //'5097 Hill Road'

Even a dog breed?

faker.animal.dog() //'Saarloos Wolfdog'

If you need to fill a database with dummy data, or create an API with mock data, Faker can quickly generate masses of data.

Of course, Faker's built-in methods will never provide all the fake data you might need for a project. So what's the best strategy when you need some fake data which Faker doesn't yet provide?

Let's suppose we're making a website containing data about video game reviews. Each game has a title, publisher, reviewer and star rating.

const publisher = faker.company.name()
const reviewer = faker.person.fullName()
const rating = faker.number.int({min:1, max:5})
const title = "FIXME"

The first three are easy. But looking through the API documentation, while there are APIs for books and music, there's no obviously suitable API for a video game title.

Level 1 - random selection

As a first attempt, we might just make a short list of titles by hand, and pick one at random.

const titles = ["Call of Duty", "EA Sports FC", "Minecraft", "Tetris", "Grand Theft Auto V"]
const title = titles[Math.floor(Math.random()*titles.length)]

This works, but doesn't take advantage of any of Faker's built-in features. For example, Faker allows setting a random seed for reproducible results.

faker.seed(1234)
const publisher = faker.company.name()
const reviewer = faker.person.fullName()
const rating = faker.number.int({min:1, max:5})
const titles = ["Call of Duty", "EA Sports FC", "Minecraft", "Tetris", "Grand Theft Auto V"]
const title = titles[Math.floor(Math.random()*titles.length)]
console.dir({publisher, reviewer, rating, title})

Running this code will always output a consistent publisher, reviewer and rating for the fixed seed, but the title can change from run to run, because Math.random() knows nothing about Faker's seed.

{
  publisher: 'Kuphal LLC',
  reviewer: 'Dr. Brad Schaden',
  rating: 5,
  title: 'Call of Duty'
}
{
  publisher: 'Kuphal LLC',
  reviewer: 'Dr. Brad Schaden',
  rating: 5,
  title: 'Tetris'
}

Level 2 - using a helper function

To fix this, we use the faker.helpers.arrayElement() method which picks a random element from an array, while respecting the seed value.

faker.seed(1234)
const publisher = faker.company.name()
const reviewer = faker.person.fullName()
const rating = faker.number.int({min:1, max:5})
const titles = ["Call of Duty", "EA Sports FC", "Minecraft", "Tetris", "Grand Theft Auto V"]
const title = faker.helpers.arrayElement(titles)
console.dir({publisher, reviewer, rating, title})

Now we get consistent values each time:

{
  publisher: 'Kuphal LLC',
  reviewer: 'Dr. Brad Schaden',
  rating: 5,
  title: 'EA Sports FC' //always the same
}

This is better, but we're still limited to our handcrafted short list of video games.

Level 3: [citation needed]

Heading over to Wikipedia, starting at https://en.wikipedia.org/wiki/Lists_of_video_games we can see lots of lists of video games.

With some judicious copy-pasting we can easily build a list of a few thousand video games. But we're still limited by the data we have available. What if we wanted 10,000 unique video game names? Or 1,000,000?
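
Copy-pasted lists tend to arrive with stray whitespace, blank lines and duplicates. A minimal clean-up sketch, assuming the pasted text has one title per line (the `pasted` sample below is made up for illustration):

```javascript
// Hypothetical sample of pasted text; a real paste would be much longer.
const pasted = `
Tetris
Minecraft
  Tetris
Pac-Man
`;

const titles = [
  ...new Set(
    pasted
      .split('\n')
      .map((line) => line.trim())        // drop leading/trailing whitespace
      .filter((line) => line.length > 0) // drop blank lines
  ),
];

console.log(titles); // [ 'Tetris', 'Minecraft', 'Pac-Man' ]
```

The resulting array can be passed straight to `faker.helpers.arrayElement`.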

Level 4: Digging for fake rubies

Faker was originally written in Perl and is also available as a library for Ruby, Java, Python and several other languages.

The Ruby version of Faker in particular has an even more varied set of data by default than the JS version. A little digging around the GitHub repository for faker-ruby reveals a data source for video game names located at lib/locales/en/game.yml.

This is a YAML file, so you can use an npm YAML parsing package or an online tool to convert it to a JSON array.

But still, we only end up with a few hundred entries.

Level 5: It's 2024, let's unnecessarily use AI

Let's ask ChatGPT to generate some more video game titles.

Now we simply paste our generated array into the previous code:

const { faker } = require("@faker-js/faker")
faker.seed(1234)
const titles = [
  "Super Mario Bros.",
  "The Legend of Zelda: Breath of the Wild",
  "Grand Theft Auto V",
  "Minecraft",
  // ... 161 entries in total
]

That's still only 161 games. How can we get more? ChatGPT is lazy and won't generate 10,000 entries for us.

Level 6: Using fake patterns

This is where we can use Faker's patterns. Instead of picking from a static list of real games, we can use strings with placeholders which refer to other Faker methods.

There are lots of helpful methods in other Faker modules which are well suited to being used as placeholders, for example:

  • animal.type (e.g. dog)
  • color.human (e.g. red)
  • person.firstName (e.g. Ericka)
  • person.zodiacSign (e.g. Pisces)
  • number.int (e.g. 10)
  • vehicle.type (e.g. minivan)

And there are several generic parts of speech:

  • word.noun (e.g. flour)
  • word.verb (e.g. outlive)
  • word.adjective (e.g. loathsome)

For example, faker.helpers.fake('Clash of {{word.noun}}s') might return 'Clash of gongs', 'Clash of ice-creams' or 'Clash of foams'.

This becomes especially powerful when you realise each fake pattern can have multiple placeholders: for example in {{word.adjective}} {{animal.type}} Adventures there are 1328 possible values for adjective and 13 possible values for animal type, leading to 17,264 possible fake video game names from this pattern alone (anyone for a quick game of Loathsome Fish Adventures or Polished Cow Adventures?)
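
The arithmetic behind that figure is just a product of pool sizes. The sketch below works it out with a hypothetical `countCombinations` helper; the pool sizes are the ones quoted above and may differ between Faker versions.

```javascript
// Pool sizes quoted in the text above; they may differ between Faker versions.
const poolSizes = { 'word.adjective': 1328, 'animal.type': 13 };

// Hypothetical helper: multiply the pool size of every placeholder in a pattern.
function countCombinations(pattern, sizes) {
  const placeholders = [...pattern.matchAll(/\{\{([\w.]+)\}\}/g)].map((m) => m[1]);
  return placeholders.reduce((total, name) => total * sizes[name], 1);
}

console.log(
  countCombinations('{{word.adjective}} {{animal.type}} Adventures', poolSizes)
); // 17264
```

The helper only handles placeholders without arguments, which is enough for this pattern.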

If you pass multiple patterns to faker.helpers.fake in an array, it will pick a pattern at random, and then substitute in all the placeholders.

By choosing a variety of patterns which imitate the typical phrasing used in video game titles, we can easily create thousands or millions of possible titles.

Even with as few as 10 patterns, we can make something fun:

const patterns = [
  'Clash of {{word.noun}}s',
  'Pokemon {{color.human}}',
  'Cyberpunk {{number.int({"min": 2020, "max": 3000})}}',
  '{{word.noun}} of war',
  '{{person.zodiacSign}} {{word.noun}}',
  '{{word.noun}} {{number.int(10)}}',
  '{{word.adjective}} {{word.noun}}: {{word.verb}} Edition',
  "{{person.firstName}}'s {{word.noun}}",
  '{{word.adjective}} {{animal.type}} Adventures',
  '{{vehicle.type}} Simulator {{number.int({"min": 2000, "max": 2024})}}'
];
console.log(faker.helpers.fake(patterns))

Note how it's even possible to pass parameters to methods inside a faker pattern, for example {{number.int({"min": 2000, "max": 2024})}} will provide a year between 2000 and 2024.

You may need a little post-processing to tidy up your data before use. For example, video game titles usually follow title case (Clash Of Bananas, not Clash of bananas). This can easily be fixed with a simple toTitleCase function.

Level 7: Putting it all together

Let's put it all together and use the faker.helpers.multiple method to generate 100 fake video games:

const { faker } = require('@faker-js/faker');

const titles = [
  'Clash of {{word.noun}}s',
  'Pokemon {{color.human}}',
  'Cyberpunk {{number.int({"min": 2020, "max": 3000})}}',
  '{{word.noun}} of war',
  '{{person.zodiacSign}} {{word.noun}}',
  '{{word.noun}} {{number.int(10)}}',
  '{{word.adjective}} {{word.noun}}: {{word.verb}} Edition',
  "{{person.firstName}}'s {{word.noun}}",
  '{{word.adjective}} {{animal.type}} Adventures',
  '{{vehicle.type}} Simulator {{number.int({"min": 2000, "max": 2024})}}'
];

const toTitleCase = (str) =>
  str.replace(
    /\w\S*/g,
    (txt) => txt.charAt(0).toUpperCase() + txt.slice(1).toLowerCase()
  );
console.dir(
  faker.helpers.multiple(() => toTitleCase(faker.helpers.fake(titles)), {
    count: 100,
  })
);

And here's a sample of the output:

[
  'Capricorn Marimba',
  'Minivan Simulator 2011',
  'Clash Of Routers',
  'Cyberpunk 2155',
  'Scorpio Share',
  'Accomplished Expansionism: Buff Edition',
  'Pokemon Orange',
  'Pokemon Turquoise',
  'Capricorn Semiconductor',
  "Brandy's Mozzarella",
  'Aries Management',
  'Mediocre Bird Adventures',
  'Grit 8',
  "Bessie's Wedge",
  'Urban Urge: Haul Edition',
  'Pokemon Silver',
  'Cyberpunk 2817',
  'Negative Plum: Disrupt Edition',
  'Ritual 10',
  'Cyberpunk 2813',
  "Jarret's Wriggler",
  'Tweet 9',
  'Clash Of Plentys',
  'Hatchback Simulator 2020',
  'Able Swan: Disinter Edition',
  'Aquarius Duration',
  'Mild Intentionality: Hire Edition',
  'Superficial Step-grandmother: Waft Edition',
  'Leo First',
  "Ericka's Hive",
  "Rick's Boy",
  'Pokemon Purple',
  'Leo Jack',
  'Advanced Bear Adventures',
  'Cyberpunk 2829',
  'Pokemon Purple',
  'Secondary Fish Adventures',
  'Cluttered Bear Adventures',
  'Waist 8',
  'Gloomy Pleat: Prosecute Edition',
  'Aquarius Terminology',
  'Scary Snake Adventures',
  'Leo Linguist',
  'Clash Of Gladioluss',
  'Bold Exaggeration: Outlive Edition',
  'Weird 2',
  'Better Flour: Cheapen Edition',
  'Cargo 5',
  'Choir Of War',
  'Vivacious Fish Adventures'
]

Level 8: Contribute to Faker

If you think you have something which other Faker users would find useful, you can contribute it to Faker! See the contributing guide, then create an issue or open a pull request.

Top comments (1)

Jordan Spencer

This is great! I hadn't heard of Faker before. And I would totally play Negative Plum: Disrupt Edition 🤣