<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Erick Müller</title>
    <description>The latest articles on DEV Community by Erick Müller (@rqmlr).</description>
    <link>https://dev.to/rqmlr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F235681%2F1cbb80d7-a7e1-4481-bb11-794ffcce75b3.jpeg</url>
      <title>DEV Community: Erick Müller</title>
      <link>https://dev.to/rqmlr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rqmlr"/>
    <language>en</language>
    <item>
      <title>Building fake data for tests using python</title>
      <dc:creator>Erick Müller</dc:creator>
      <pubDate>Wed, 08 Jul 2020 01:58:11 +0000</pubDate>
      <link>https://dev.to/rqmlr/building-fake-data-for-tests-using-python-3k9a</link>
      <guid>https://dev.to/rqmlr/building-fake-data-for-tests-using-python-3k9a</guid>
      <description>&lt;p&gt;Sometimes you start a project from the scratch and have that empty feeling of having no data to help you to test your code. Or you will receive a "gift" of a legacy project that has code signaled as "done" but has no data that really proves this code really works.&lt;/p&gt;

&lt;p&gt;We’ve all been there, even you, novice programmer. And every time, we do the same: inserts some data manually, in a tedious work that is always done using headphones, listening to that song that makes the time passes SLOWER.&lt;/p&gt;

&lt;p&gt;So, to help you to avoid these problems, I'll show you some techniques that I've been using in the last years to generate test data, or fake data, if you want to use this name. And using Python, because is the best language to do this kind of thing. &lt;/p&gt;

&lt;h1&gt;
  
  
  Using the random package
&lt;/h1&gt;

&lt;p&gt;Python comes bundled with a nice package to handle with randomness, called &lt;code&gt;random&lt;/code&gt;, and it can help us to generate some random data based on a existing sample. And, yes, you can build this sample, and have &lt;code&gt;random&lt;/code&gt; work with it.&lt;/p&gt;

&lt;p&gt;To use it, let's start by importing the package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;

&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; The &lt;code&gt;random.seed()&lt;/code&gt; initializes the randomizer. If you want to your randomizer always return the same data, pass the same value to this command (eg: &lt;code&gt;random.seed(42)&lt;/code&gt;). This is specially useful for reproducible tests. If you do not inform nothing, it will use the current system's datetime to generate the seed, and then for each run you will receive aleatory data.&lt;/p&gt;

&lt;p&gt;Ok. So now, for the first example, we want to generate a list of products and prices. We know some things: the product has a product_id, a category_id and a price. The product_id has at least 4 numbers, the category is always one, but based in a known list, and the price is between a known range, from 0.01 to 100.00. And we need exactly 273 itens in this list.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;

&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;known_categories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'DRY'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'FRESH'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'LIQUID'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;    &lt;span class="c1"&gt;# my data sample
&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;"product_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;9999&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; 
        &lt;span class="s"&gt;"category_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;known_categories&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
        &lt;span class="s"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; 
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;273&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After running this code, we'll have the &lt;code&gt;items&lt;/code&gt; list made of items as expected.&lt;/p&gt;

&lt;p&gt;I've used a &lt;a href="https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions"&gt;list comprehension&lt;/a&gt;,  that is a expression that generates a list. &lt;/p&gt;

&lt;p&gt;The three used functions from &lt;code&gt;random&lt;/code&gt; are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;random.sample&lt;/code&gt;, that selects a item from the range. &lt;code&gt;sample&lt;/code&gt; assures that there will be no repeated items. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;random.choice&lt;/code&gt; that select one item from the list &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;random.randrange&lt;/code&gt; that select one item from the range. Given that &lt;code&gt;randrange&lt;/code&gt; only returns an integer, we divide the returned number by 100 to have our price.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Using the faker package
&lt;/h1&gt;

&lt;p&gt;For other data formats, specific cases, or other scenarios, we can use the &lt;a href="https://github.com/joke2k/faker"&gt;&lt;em&gt;faker&lt;/em&gt;&lt;/a&gt; package. &lt;/p&gt;

&lt;p&gt;This package can be used to generate data, based some common concepts, like "name" and "address".&lt;/p&gt;

&lt;p&gt;To install it, use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;faker
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;On your code, import and use it. For our second example, we will generate a list of users, with full name, address and phone number.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;faker&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Faker&lt;/span&gt;

&lt;span class="n"&gt;fake&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Faker&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;user_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; 
        &lt;span class="s"&gt;"full_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"{} {}"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;&lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
        &lt;span class="s"&gt;"address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s"&gt;"phone"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;phone_number&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;

    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The code returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="s"&gt;'address'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'3005 Sydney Isle&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Combsmouth, FL 14919'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'full_name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Rachel Weaver'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'phone'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'691-992-2752'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
 &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'address'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'290 Rich Walk&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Quinntown, MT 05852'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'full_name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Elijah Hood'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'phone'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'9308368606'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
 &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'address'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'787 Andrea Valley&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Woodfurt, VA 86763'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'full_name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Janice Jones'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'phone'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'(889)096-4636x1433'&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;If you specify the language code when instancing the &lt;code&gt;Faker&lt;/code&gt; object, the same functions returns data specific for this language. If you want data tailored for Germany, you can use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;fake&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Faker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'de_DE'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Updating the localization parameter, the same example now will return&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="s"&gt;'address'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Scheelplatz 863&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;68737 Pinneberg'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'full_name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Tim Jessel'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'phone'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'+49(0)5917 36668'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
 &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'address'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Bertold-Dussen van-Weg 78&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;89881 Hohenstein-Ernstthal'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'full_name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Sophia Lorch'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'phone'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'(01072) 277426'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
 &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'address'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Herrmannstr. 4/6&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;98896 Lübeck'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'full_name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Elke Bohlander'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;'phone'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'(03509) 08874'&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Even the localizations have their own special functions. The full list can  be found &lt;a href="https://faker.readthedocs.io/en/master/locales.html"&gt;here&lt;/a&gt;, and the  special functions related to each localization can be found inside the  provider's documentation to each "locale".&lt;/p&gt;

&lt;p&gt;As a last note, you can use the &lt;code&gt;seed&lt;/code&gt; method the &lt;code&gt;Faker&lt;/code&gt; class, exactly as you can use on Python's &lt;code&gt;random&lt;/code&gt;, and it yields the same results, given the same &lt;em&gt;seed&lt;/em&gt;. &lt;/p&gt;

&lt;h1&gt;
  
  
  And now...
&lt;/h1&gt;

&lt;p&gt;Using this two tools, you can generate any data you need to test your code. From this starting point, some options are available: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;you can send this generated data to a database, and regenerate the data to each test. If you set the seed to the same value, you always have the same results, and  then is assured that the tests always run with the same test data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;you write a file with the data, using a format like &lt;code&gt;csv&lt;/code&gt;, and share with users that can validate the test data you're using on your tests. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
    </item>
  </channel>
</rss>
