<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: ugo landini</title>
    <description>The latest articles on DEV Community by ugo landini (@ugol).</description>
    <link>https://dev.to/ugol</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1076139%2F731ef72d-768a-4f53-9f34-5f38f1907fcd.jpeg</url>
      <title>DEV Community: ugo landini</title>
      <link>https://dev.to/ugol</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ugol"/>
    <language>en</language>
    <item>
      <title>JR, quality Random Data from the Command line, part II</title>
      <dc:creator>ugo landini</dc:creator>
      <pubDate>Wed, 31 May 2023 17:36:34 +0000</pubDate>
      <link>https://dev.to/ugol/jr-quality-random-data-from-the-command-line-part-ii-3nb3</link>
      <guid>https://dev.to/ugol/jr-quality-random-data-from-the-command-line-part-ii-3nb3</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/ugol/jr-quality-random-data-from-the-command-line-part-i-5e90"&gt;first part&lt;/a&gt; of this series, we have seen how to use &lt;a href="https://jrnd.io"&gt;JR&lt;/a&gt; in simple use cases to stream random data from predefined templates to standard out and &lt;a href="https://kafka.apache.org/"&gt;Apache Kafka&lt;/a&gt; on &lt;a href="https://confluent.cloud/"&gt;Confluent Cloud&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this follow-up, we'll take a closer look at the JR data generation process and at how you can use it to generate data that streaming applications can actually use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Smart functions
&lt;/h2&gt;

&lt;p&gt;We defined quality data across 2 dimensions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;things that must be realistic "in themselves", like an IP address, or a credit card number&lt;/li&gt;
&lt;li&gt;things that are realistic if coherent to other data, like names, companies, emails, cities, zip codes, mobile phones, locale, etc.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Some JR template functions are “smart”, so let's talk a bit about &lt;strong&gt;type 2&lt;/strong&gt; data. Let's look at the predefined user template for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; jr template show user

{
  "guid": "{{uuid}}",
  "isActive": {{bool}},
  "balance": "{{amount 100 10000 "€"}}",
  "picture": "http://placehold.it/32x32",
  "age": {{integer 20 60}},
  "eyeColor": "{{randoms "blue|brown|green"}}",
  "name": "{{name}} {{surname}}",
  "gender": "{{gender}}",
  "company": "{{company}}",
  "work_email": "{{email_work}}",
  "email": "{{email}}",
  "about": "{{lorem 20}}",
  "country": "{{country}}",
  "address": "{{city}}, {{street}} {{building 2}}, {{zip}}",
  "phone_number": "{{phone}}",
  "mobile": "{{mobile_phone}}",
  "latitude": {{latitude}},
  "longitude": {{longitude}}
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;user&lt;/code&gt; template doesn't contain any logic to correlate type 2 data, but if you run it, you'll see that everything works as expected. Let's run the template with the IT locale, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr run --locale IT user

{
  "guid": "3c37f1d2-c4d4-4a10-ac9e-eefa0d0a4fc1",
  "isActive": false,
  "balance": "€8106.36",
  "picture": "http://placehold.it/32x32",
  "age": 21,
  "eyeColor": "green",
  "name": "Maria Rizzo",
  "gender": "F",
  "company": "Evil Partners",
  "work_email": "maria.rizzo@evilpartners.com",
  "email": "maria.rizzo@hotmail.com",
  "about": "Lorem ipsum dolor sit amet, laoreet ligula. Curabitur id nisl ut Lorem sit amet justo pulvinar aliquet accumsan sit amet",
  "country": "IT",
  "address": "Lodi, Piazza dei Miracoli 80, 26900",
  "phone_number": "0371 95903936",
  "mobile": "3899578232",
  "latitude": -22.4702,
  "longitude": -4.6067
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, name, gender, email, country, address, zip code and phones are all coherent. That's because JR, under the hood, keeps track of everything and reuses data previously generated in the template. So, if you generate a work_email, the function will reuse name, surname and company. &lt;br&gt;
The zip code is generated by reversing a regex pattern that is valid for the city, the mobile phone is valid for the country, and so on. At the moment some JR localisations are still in progress, so please &lt;a href="https://github.com/ugol/jr/issues"&gt;contribute&lt;/a&gt; if you want to help us!&lt;/p&gt;
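
&lt;p&gt;To make the idea concrete, here is a minimal Go sketch of that kind of bookkeeping. It is not JR's actual implementation: the &lt;code&gt;record&lt;/code&gt; type and its methods are hypothetical names, and the sample values are just the ones that appear in this series. Each value is generated once, cached, and reused by any function that depends on it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// record is a hypothetical per-record context: each value is generated once
// and then reused by every function that depends on it.
type record struct {
	values map[string]string
}

func newRecord() record {
	return record{values: map[string]string{}}
}

// pick returns the cached value for key, or chooses one at random and caches it.
func (r record) pick(key string, choices []string) string {
	if v, ok := r.values[key]; ok {
		return v
	}
	v := choices[rand.Intn(len(choices))]
	r.values[key] = v
	return v
}

func (r record) name() string    { return r.pick("name", []string{"Maria", "Laura"}) }
func (r record) surname() string { return r.pick("surname", []string{"Rizzo", "Kim"}) }
func (r record) company() string {
	return r.pick("company", []string{"Evil Partners", "Boston Static"})
}

// workEmail derives the address from values already stored in the record,
// so it always agrees with name, surname and company.
func (r record) workEmail() string {
	domain := strings.ToLower(strings.ReplaceAll(r.company(), " ", ""))
	return strings.ToLower(r.name()+"."+r.surname()) + "@" + domain + ".com"
}

func main() {
	r := newRecord()
	fmt.Println(r.name(), r.surname(), "-", r.company())
	fmt.Println(r.workEmail())
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Whatever order the functions are called in, the email can never disagree with the name or the company, because it is built from the cached values rather than from fresh random ones.&lt;/p&gt;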

&lt;p&gt;This is pretty simple and straightforward, so let's look now at relations between data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Emitters
&lt;/h2&gt;

&lt;p&gt;So far we have seen simple generation use cases. If you need to generate related data, you need more tools. JR comes preconfigured with some example emitters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr emitter list

List of JR emitters:

shoe
shoe_customer
shoe_order
shoe_clickstream
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What's an &lt;strong&gt;emitter&lt;/strong&gt;? It's basically a preconfigured JR job, and it's really helpful when you have to generate different entities, each with its own generation parameters, and with relations between them. &lt;/p&gt;

&lt;p&gt;Let's study the preconfigured shoe example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr emitter show shoe

Name:shoe
Locale: us
Num: 0
Frequency: 0s
Duration: 0s
Preload: 100
Output: stdout
Topic: shoes
Kcat: false
Oneline: false
Key Template: null
Value Template: shoe
Output Template: {{.V}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will generate just &lt;strong&gt;100&lt;/strong&gt; shoes in the preload phase (i.e. before the generation phase), and no more: &lt;code&gt;frequency&lt;/code&gt; and &lt;code&gt;duration&lt;/code&gt; are both &lt;strong&gt;0&lt;/strong&gt;. So this is useful for more static "table-like" data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr emitter show shoe_customer

Name:shoe_customer
Locale: us
Num: 1
Frequency: 1s
Duration: 10s
Preload: 20
Output: stdout
Topic: shoe_customers
Kcat: false
Oneline: false
Key Template: null
Value Template: shoe_customer
Output Template: {{.V}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For &lt;code&gt;shoe_customer&lt;/code&gt; we have a &lt;code&gt;preload&lt;/code&gt; of &lt;strong&gt;20&lt;/strong&gt;, but it will also generate one customer per second for &lt;strong&gt;10&lt;/strong&gt; seconds. So it's still fairly static, but less so than the shoes, which is reasonable: you don't have a new product to sell every second, but you may have new customers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr emitter show shoe_clickstream

Name:shoe_clickstream
Locale: us
Num: 1
Frequency: 100ms
Duration: 10s
Preload: 0
Output: stdout
Topic: shoe_clickstream
Kcat: false
Oneline: false
Key Template: null
Value Template: shoe_clickstream
Output Template: {{.V}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;shoe_clickstream&lt;/code&gt; is much more dynamic: it emits &lt;strong&gt;1&lt;/strong&gt; click every &lt;strong&gt;100ms&lt;/strong&gt;, with no &lt;code&gt;preload&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr emitter show shoe_order

Name:shoe_order
Locale: us
Num: 1
Frequency: 500ms
Duration: 10s
Preload: 0
Output: stdout
Topic: shoe_orders
Kcat: false
Oneline: false
Key Template: null
Value Template: shoe_order
Output Template: {{.V}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;shoe_order&lt;/code&gt; is similar, no &lt;code&gt;preload&lt;/code&gt; and a lower &lt;code&gt;frequency&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But wait, this is just a way to simplify the command line and differentiate frequency, duration, preload and other parameters for every template: where are the relations?&lt;/p&gt;

&lt;p&gt;Let's look at the shoe template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr template show shoe

{{$id:=uuid}}{{add_v_to_list "shoes_id_list" $id}}{
  "id": "{{$id}}",
  "sale_price": "{{amount 200 2000 ""}}",
  "brand": "{{from "sport_brand"}}",
  "name": "{{randoms "Pro|Cool|Soft|Air|Perf"}} {{from "cool_name"}} {{integer 1 20}}",
  "rating": "{{format_float "%.2f" (floating 1 5)}}"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here you can see that a random &lt;strong&gt;uuid&lt;/strong&gt; is assigned to a &lt;strong&gt;$id&lt;/strong&gt; variable, and then added to a &lt;code&gt;shoes_id_list&lt;/code&gt; with the &lt;code&gt;add_v_to_list&lt;/code&gt; command.&lt;br&gt;
The list is automatically shared with all the running templates, so to have a working relationship you just need to get random ids from this list instead of generating them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr template show shoe_clickstream
{
  "product_id": "{{random_v_from_list "shoes_id_list"}}",
  "user_id": "{{random_v_from_list "customers_id_list"}}",
  "view_time": {{integer 10 120}},
  "page_url": "https://www.acme.com/product/{{random_string 4 5}}",
  "ip": "{{ip "10.1.0.0/16"}}",
  "ts": {{counter "ts" 1609459200000 10000 }}
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the &lt;code&gt;shoe_clickstream&lt;/code&gt; template that's pretty clear: &lt;code&gt;product_id&lt;/code&gt; and &lt;code&gt;user_id&lt;/code&gt; are not random but come from &lt;code&gt;shoes_id_list&lt;/code&gt; and &lt;code&gt;customers_id_list&lt;/code&gt;, so there is &lt;strong&gt;full referential integrity&lt;/strong&gt;.&lt;/p&gt;
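
&lt;p&gt;The mechanism can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions, not JR's code: &lt;code&gt;addVToList&lt;/code&gt; and &lt;code&gt;randomVFromList&lt;/code&gt; are hypothetical Go counterparts of the two template functions, the ids are made up, and the sketch is single-threaded (a list shared between concurrently running emitters would also need locking):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```go
package main

import (
	"fmt"
	"math/rand"
)

// lists is a hypothetical stand-in for the lists shared by all templates.
var lists = map[string][]string{}

// addVToList appends a value to the named list, creating it if needed.
func addVToList(name, v string) {
	lists[name] = append(lists[name], v)
}

// randomVFromList returns a random value from the named list,
// or an empty string if the list has no values yet.
func randomVFromList(name string) string {
	l := lists[name]
	if len(l) == 0 {
		return ""
	}
	return l[rand.Intn(len(l))]
}

func main() {
	// Preload phase: create the shoes and remember their ids.
	for i := 1; i != 4; i++ {
		addVToList("shoes_id_list", fmt.Sprintf("shoe-%d", i))
	}
	// Generation phase: every click references a shoe that really exists.
	for i := 0; i != 3; i++ {
		fmt.Printf("{\"product_id\": %q}\n", randomVFromList("shoes_id_list"))
	}
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This also suggests why the shoe emitter relies on &lt;code&gt;preload&lt;/code&gt;: the list has to contain some ids before the click and order templates start drawing from it.&lt;/p&gt;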

&lt;p&gt;If you need more than one value from a list, you can use the &lt;code&gt;random_n_v_from_list&lt;/code&gt; function instead of &lt;code&gt;random_v_from_list&lt;/code&gt;. This function is guaranteed to pick &lt;strong&gt;n&lt;/strong&gt; different values from the list, so it is ideal for &lt;strong&gt;1:many&lt;/strong&gt; relationships.&lt;/p&gt;
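
&lt;p&gt;One common way to pick several different values is to walk a random permutation of the list indexes and stop after &lt;strong&gt;n&lt;/strong&gt; of them. The Go sketch below shows that technique; &lt;code&gt;randomNFromList&lt;/code&gt; is a hypothetical helper, and whether JR does it this way internally is an assumption:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```go
package main

import (
	"fmt"
	"math/rand"
)

// randomNFromList returns n elements taken from distinct positions of list,
// in random order. If n exceeds the list length, every element is returned.
func randomNFromList(list []string, n int) []string {
	out := []string{}
	// rand.Perm yields each index exactly once, so no position repeats.
	for _, i := range rand.Perm(len(list)) {
		if len(out) == n {
			break
		}
		out = append(out, list[i])
	}
	return out
}

func main() {
	shoes := []string{"shoe-1", "shoe-2", "shoe-3", "shoe-4", "shoe-5"}
	// One order referencing three different shoes.
	fmt.Println(randomNFromList(shoes, 3))
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that the values are different only if the list itself has no duplicates, which holds for a list of generated uuids.&lt;/p&gt;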

&lt;p&gt;To start all the emitters, just type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr emitter run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A goroutine per emitter will start producing random data, but not too random: coherency and integrity are important for your streaming applications! &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;We have seen how to use &lt;a href="https://jrnd.io"&gt;JR&lt;/a&gt; in more advanced use cases, streaming quality random data with referential integrity. &lt;br&gt;
In the next part of this series, we will see how to use &lt;strong&gt;REST APIs&lt;/strong&gt; with JR.&lt;br&gt;
In the meantime, happy streaming!&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>datagen</category>
      <category>streaming</category>
      <category>cli</category>
    </item>
    <item>
      <title>JR, quality Random Data from the Command line, part I</title>
      <dc:creator>ugo landini</dc:creator>
      <pubDate>Sun, 07 May 2023 18:11:48 +0000</pubDate>
      <link>https://dev.to/ugol/jr-quality-random-data-from-the-command-line-part-i-5e90</link>
      <guid>https://dev.to/ugol/jr-quality-random-data-from-the-command-line-part-i-5e90</guid>
      <description>&lt;h1&gt;
  
  
  What is JR?
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://jrnd.io"&gt;JR&lt;/a&gt; is a CLI tool that helps you stream quality random data. We all know what streaming is and why it is so important nowadays. Now, let's try to define what "quality random data" is. &lt;br&gt;
A simple, and not too scientific, way of defining it is: any data that is good enough to look real.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is 1.2.3.4 a good IP? And 10.2.138.203?&lt;/li&gt;
&lt;li&gt;Is
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  {
    "ID": "ABCDEFG1234",
    "name": "Ugo Landini",
    "gender": "F",
    "company": "Confluent",
    "email": "john.wayne@ibm.com"
  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;a good random user? &lt;/p&gt;

&lt;p&gt;What about this one instead?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  {
    "ID": "69167997-0253-4165-a17d-9ef896124426",
    "name": "Laura Kim",
    "gender": "F",
    "company": "Boston Static",
    "email": "laura.kim@bostonstatic.com"
  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Defining quality data
&lt;/h2&gt;

&lt;p&gt;There are essentially two different dimensions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;things that must be realistic "in themselves", like an IP address, or a credit card number&lt;/li&gt;
&lt;li&gt;things that are realistic if coherent to other data, like names, companies, emails, cities, zip codes, mobile phones, locale, etc.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Sometimes we may need to generate random data of type 2 in different streams, so the "coherency" must also extend across different entities: think, for example, of referential integrity in databases. If I am generating users, products and orders to three different Kafka topics and I want to create a streaming application with &lt;a href="https://flink.apache.org/"&gt;Apache Flink&lt;/a&gt;, I definitely need the data to be coherent across topics.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is JR?
&lt;/h2&gt;

&lt;p&gt;So, is &lt;a href="https://jrnd.io"&gt;JR&lt;/a&gt; yet another faking library written in Go? Yes and no. &lt;a href="https://jrnd.io"&gt;JR&lt;/a&gt; indeed implements most of the APIs in &lt;a href="https://fakerjs.dev/api"&gt;fakerjs&lt;/a&gt; and &lt;a href="https://github.com/brianvoe/gofakeit"&gt;Go fake it&lt;/a&gt;, but it's also able to stream data directly to stdout, &lt;a href="https://kafka.apache.org/"&gt;Kafka&lt;/a&gt;, &lt;a href="https://redis.io/"&gt;Redis&lt;/a&gt; and more (Elastic and MongoDB are coming). JR can talk directly to &lt;a href="https://github.com/confluentinc/schema-registry"&gt;Confluent Schema Registry&lt;/a&gt;, manage json-schema and Avro schemas, and easily maintain coherence and referential integrity. If you need more than what JR offers out of the box, it is flexible enough that you can easily pipe your data streams to other CLI tools like &lt;a href="https://github.com/edenhill/kcat"&gt;kcat&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why is it called JR?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;J&lt;/strong&gt;ust a &lt;strong&gt;R&lt;/strong&gt;andom generator, &lt;strong&gt;J&lt;/strong&gt;son &lt;strong&gt;R&lt;/strong&gt;andom generator, or, better, just &lt;strong&gt;JR&lt;/strong&gt; from the famous 80's Dallas &lt;a href="https://en.wikipedia.org/wiki/J._R._Ewing"&gt;character&lt;/a&gt;: these are all valid answers. JR can generate anything, not only JSON, so I definitely prefer the last one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The use case that generated the generator.
&lt;/h2&gt;

&lt;p&gt;I work as a Staff Solutions Engineer at &lt;a href="https://www.confluent.io/"&gt;Confluent&lt;/a&gt;: some weeks ago I was talking with a prospective customer, and he told me that they needed to send JSON documents like this one (among others) to &lt;a href="https://confluent.cloud/"&gt;Confluent Cloud&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
"VLAN": "DELTA",
"IPV4_SRC_ADDR": "10.1.41.98",
"IPV4_DST_ADDR": "10.1.137.141",
"IN_BYTES": 1220,
"FIRST_SWITCHED": 1681984281,
"LAST_SWITCHED": 1682975009,
"L4_SRC_PORT": 81,
"L4_DST_PORT": 80,
"TCP_FLAGS": 0,
"PROTOCOL": 1,
"SRC_TOS": 211,
"SRC_AS": 4,
"DST_AS": 1,
"L7_PROTO": 443,
"L7_PROTO_NAME": "ICMP",
"L7_PROTO_CATEGORY": "Application"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;They needed to send many of these (and similar) documents to Kafka, so it was important to measure how effective &lt;a href="https://developer.ibm.com/articles/benefits-compression-kafka-messaging/"&gt;Kafka client compression&lt;/a&gt; would be at &lt;strong&gt;their&lt;/strong&gt; rate and with &lt;strong&gt;their&lt;/strong&gt; data. &lt;/p&gt;

&lt;p&gt;When you use fully managed services like &lt;a href="https://confluent.cloud/"&gt;Confluent Cloud&lt;/a&gt;, it's very important to understand how much your data will be compressed: price is directly proportional to throughput, and Kafka batches messages in the producers, so producer compression can easily save you a lot of bandwidth and therefore a lot of money. Now, producing data in real time and analysing it in real time is pretty easy with &lt;a href="https://confluent.cloud/"&gt;Confluent Cloud&lt;/a&gt;. But answering the prospect's question in real time (i.e. during the conference call) wasn't as easy as it should have been. Which compression algorithm is better? Would it be fast enough? Is &lt;a href="https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html#batch-size"&gt;batch size&lt;/a&gt; important for the compression?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developer.confluent.io/tutorials/kafka-connect-datagen/confluent.html"&gt;Datagen&lt;/a&gt; is the de-facto standard to generate random data for Kafka. But customising what's generated is not something you can do in 30 seconds, and enabling compression is currently not an option with the managed connectors. So I decided to write a tool which you could use to easily start streaming random data to Kafka in seconds, and that's why JR &lt;a href="https://github.io/ugol/jr"&gt;was born&lt;/a&gt;. With the help of some &lt;a href="https://github.com/ugol/jr/graphs/contributors"&gt;friends and colleagues&lt;/a&gt; we packed JR with a lot of features (and many more are coming!).&lt;/p&gt;

&lt;h2&gt;
  
  
  Basic JR usage
&lt;/h2&gt;

&lt;p&gt;JR is very straightforward to use. Let's look at all the preinstalled templates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr template list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All the templates should be green: that means that their syntax is correct and they &lt;em&gt;compile&lt;/em&gt;. &lt;/p&gt;
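
&lt;p&gt;JR templates use the same &lt;code&gt;{{...}}&lt;/code&gt; syntax as Go's &lt;code&gt;text/template&lt;/code&gt; package. Assuming they behave like standard Go templates, "compiling" means parsing the text against a table of known functions, so a typo in a function name is caught before any data is generated. In the sketch below, &lt;code&gt;compile&lt;/code&gt; and the two entries in &lt;code&gt;funcs&lt;/code&gt; are hypothetical re-implementations written for illustration, not JR's own functions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```go
package main

import (
	"fmt"
	"math/rand"
	"os"
	"strings"
	"text/template"
)

// funcs holds two hypothetical re-implementations of template functions.
var funcs = template.FuncMap{
	// integer returns a random integer in [lo, hi).
	"integer": func(lo, hi int) int { return lo + rand.Intn(hi-lo) },
	// randoms picks one of the "|"-separated alternatives.
	"randoms": func(alternatives string) string {
		parts := strings.Split(alternatives, "|")
		return parts[rand.Intn(len(parts))]
	},
}

// compile parses a template; bad syntax or unknown functions fail here,
// before anything is executed.
func compile(text string) (*template.Template, error) {
	return template.New("t").Funcs(funcs).Parse(text)
}

func main() {
	t, err := compile(`{"VLAN": "{{randoms "ALPHA|BETA"}}", "IN_BYTES": {{integer 1000 2000}}}` + "\n")
	if err != nil {
		panic(err)
	}
	if err := t.Execute(os.Stdout, nil); err != nil {
		panic(err)
	}

	// A template that uses an undefined function does not compile.
	_, err = compile(`{{no_such_function}}`)
	fmt.Println("compile error:", err)
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;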

&lt;p&gt;Let's see the &lt;code&gt;net_device&lt;/code&gt; template, which is what I would have written to randomise the sample they gave me, if I had had JR during the conference call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; jr template show net_device

{
"VLAN": "{{randoms "ALPHA|BETA|GAMMA|DELTA"}}",
"IPV4_SRC_ADDR": "{{ip "10.1.0.0/16"}}",
"IPV4_DST_ADDR": "{{ip "10.1.0.0/16"}}",
"IN_BYTES": {{integer 1000 2000}},
"FIRST_SWITCHED": {{unix_time_stamp 60}},
"LAST_SWITCHED": {{unix_time_stamp 10}},
"L4_SRC_PORT": {{ip_known_port}},
"L4_DST_PORT": {{ip_known_port}},
"TCP_FLAGS": 0,
"PROTOCOL": {{integer 0 5}},
"SRC_TOS": {{integer 128 255}},
"SRC_AS": {{integer 0 5}},
"DST_AS": {{integer 0 2}},
"L7_PROTO": {{ip_known_port}},
"L7_PROTO_NAME": "{{ip_known_protocol}}",
"L7_PROTO_CATEGORY": "{{randoms "Network|Application|Transport|Session"}}"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;net_device&lt;/code&gt; template is pretty easy to write: these are all "Type 1" fields with no relations. You can easily generate a good IP starting from its CIDR with the &lt;code&gt;ip&lt;/code&gt; function. The other functions used in this template are all pretty straightforward, like &lt;code&gt;ip_known_port&lt;/code&gt;, &lt;code&gt;integer&lt;/code&gt; and &lt;code&gt;unix_time_stamp&lt;/code&gt;. Running this template is just a matter of typing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; jr template run net_device

{
"VLAN": "DELTA",
"IPV4_SRC_ADDR": "10.1.175.220",
"IPV4_DST_ADDR": "10.1.148.210",
"IN_BYTES": 1553,
"FIRST_SWITCHED": 1680183839,
"LAST_SWITCHED": 1682746947,
"L4_SRC_PORT": 443,
"L4_DST_PORT": 81,
"TCP_FLAGS": 0,
"PROTOCOL": 0,
"SRC_TOS": 195,
"SRC_AS": 0,
"DST_AS": 0,
"L7_PROTO": 22,
"L7_PROTO_NAME": "SFTP",
"L7_PROTO_CATEGORY": "Network"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
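
&lt;p&gt;For the curious, generating an address from a CIDR like the &lt;code&gt;ip&lt;/code&gt; function does takes only a little arithmetic: start from the network address and add a random offset smaller than the block size. The Go sketch below shows the idea for IPv4 only; &lt;code&gt;randomIPv4&lt;/code&gt; is a hypothetical helper, not the JR implementation, and it does not exclude the network and broadcast addresses:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/rand"
	"net/netip"
)

// randomIPv4 returns a random IPv4 address inside the given CIDR block.
func randomIPv4(cidr string) (netip.Addr, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return netip.Addr{}, err
	}
	if !prefix.Addr().Is4() {
		return netip.Addr{}, fmt.Errorf("not an IPv4 prefix: %s", cidr)
	}
	// Number of addresses in the block: 2^(32 - prefix length).
	size := int64(1)
	for i := prefix.Bits(); i != 32; i++ {
		size *= 2
	}
	// Network address as an integer, plus a random offset within the block.
	base := prefix.Masked().Addr().As4()
	n := binary.BigEndian.Uint32(base[:]) + uint32(rand.Int63n(size))
	var out [4]byte
	binary.BigEndian.PutUint32(out[:], n)
	return netip.AddrFrom4(out), nil
}

func main() {
	ip, err := randomIPv4("10.1.0.0/16")
	if err != nil {
		panic(err)
	}
	fmt.Println(ip)
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With &lt;code&gt;10.1.0.0/16&lt;/code&gt; the first two octets are fixed and the last two are random, which is the pattern visible in the output above.&lt;/p&gt;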



&lt;p&gt;When you write your own templates you'll probably need to look at all the available functions. Let's see for example how to ask JR which networking functions are available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; jr man -c network

...

Name: ip_known_protocol
Category: network
Description: returns a random known protocol
Parameters:
Localizable: false
Return: string
Example: jr run --template '{{ip_known_protocol}}'
Output: tcp

Name: http_method
Category: network
Description: returns a random http method
Parameters:
Localizable: false
Return: string
Example: jr run --template '{{http_method}}'
Output: GET

Name: mac
Category: network
Description: returns a random mac Address
Parameters:
Localizable: false
Return: string
Example: jr run --template '{{mac}}'
Output: 7e:8e:75:a5:0a:85
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also test a function immediately, without writing a template, directly from &lt;code&gt;jr man&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; jr man ip --run

Name: ip
Category: network
Description: returns a random Ip Address matching the given cidr
Parameters: cidr string
Localizable: false
Return: string
Example: jr run --template '{{ip "10.2.0.0/16"}}'
Output: 10.2.55.217

10.2.240.243

Elapsed time: 0s
Data Generated (Objects): 1
Data Generated (bytes): 12
Number of templates (Objects): 5
Throughput (bytes per second):       118
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Create more random data
&lt;/h2&gt;

&lt;p&gt;With the &lt;code&gt;-n&lt;/code&gt; option you can create more data in each pass. You can use &lt;code&gt;jr run&lt;/code&gt; or &lt;code&gt;jr template run&lt;/code&gt;: they are equivalent.&lt;br&gt;
This example creates 3 net_device objects at once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr run net_device -n 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the &lt;code&gt;--frequency&lt;/code&gt; option you can repeat the whole creation pass at a fixed interval.&lt;/p&gt;

&lt;p&gt;This example creates 2 net_device objects every second, forever:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr run net_device -n 2 -f 1s 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the &lt;code&gt;--duration&lt;/code&gt; option you can put a time limit on the whole object creation.&lt;br&gt;
This example creates 2 net_device objects every 100ms for 1 minute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr run net_device -n 2 -f 100ms -d 1m 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Results are written to standard out by default (&lt;code&gt;--output "stdout"&lt;/code&gt;), but streaming to Kafka is just as simple.&lt;/p&gt;

&lt;p&gt;If you have &lt;a href="https://confluent.cloud/"&gt;Confluent Cloud&lt;/a&gt;, you can just download the client configuration, put the file in a &lt;code&gt;kafka&lt;/code&gt; dir and start streaming. If you don't have &lt;a href="https://confluent.cloud/"&gt;Confluent Cloud&lt;/a&gt;, give it a &lt;a href="https://www.confluent.io/confluent-cloud/tryfree/"&gt;try&lt;/a&gt;: no credit card is needed, a basic cluster to test JR is super cheap, and you'll also get $400 of traffic included.  &lt;/p&gt;

&lt;p&gt;Anyway, here is the configuration template if you need to configure it manually. It's just a standard &lt;a href="https://github.com/confluentinc/librdkafka/blob/master/CONFIGURATION.md"&gt;librdkafka configuration&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Kafka configuration
# https://github.com/confluentinc/librdkafka/blob/master/CONFIGURATION.md

bootstrap.servers=
security.protocol=SASL_SSL
sasl.mechanisms=PLAIN
sasl.username=
sasl.password=
compression.type=gzip
compression.level=9
statistics.interval.ms=1000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Streaming to Kafka
&lt;/h2&gt;

&lt;p&gt;Once Kafka is configured, streaming to it with JR is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jr run -n 5 -f 500ms -d 5s net_device -o kafka
2023/05/07 20:03:07         0 bytes produced to Kafka
2023/05/07 20:03:08      5250 bytes produced to Kafka
2023/05/07 20:03:09      8765 bytes produced to Kafka
2023/05/07 20:03:10     12260 bytes produced to Kafka
2023/05/07 20:03:11     15763 bytes produced to Kafka

Elapsed time: 5s
Data Generated (Objects): 50
Data Generated (bytes): 17364
Number of templates (Objects): 1
Throughput (bytes per second):      3172
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
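
&lt;p&gt;The object count in the summary can be checked by hand: &lt;code&gt;-n&lt;/code&gt; objects per pass, one pass per &lt;code&gt;-f&lt;/code&gt; interval, for the whole &lt;code&gt;-d&lt;/code&gt; duration. The small Go sketch below does that arithmetic; &lt;code&gt;totalObjects&lt;/code&gt; is a hypothetical helper that only models bounded, periodic runs, not a description of JR's scheduler:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// totalObjects returns how many objects a bounded, periodic run emits:
// num objects on every pass, one pass per frequency interval, for duration.
func totalObjects(num int, frequency, duration time.Duration) (int, error) {
	if frequency == 0 || duration == 0 {
		return 0, errors.New("frequency and duration must both be non-zero")
	}
	return num * int(duration/frequency), nil
}

func main() {
	// Same values as: jr run -n 5 -f 500ms -d 5s
	n, err := totalObjects(5, 500*time.Millisecond, 5*time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // 50
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;5 objects times 10 passes gives the 50 objects reported above.&lt;/p&gt;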



&lt;p&gt;By default, JR writes to a topic named &lt;code&gt;test&lt;/code&gt;, but you can change that with the &lt;code&gt;-t&lt;/code&gt; option. &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;We have seen how to use &lt;a href="https://jrnd.io"&gt;JR&lt;/a&gt; in simple use cases, streaming quality random data from predefined templates to standard out and Kafka on &lt;a href="https://confluent.cloud/"&gt;Confluent Cloud&lt;/a&gt;. &lt;br&gt;
In the &lt;a href="https://dev.to/ugol/jr-quality-random-data-from-the-command-line-part-ii-3nb3"&gt;second part&lt;/a&gt; of this series, we will see how to produce your own templates and manage integrity of generated data.&lt;br&gt;
In the meantime, happy streaming!&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>datagen</category>
      <category>cli</category>
      <category>streaming</category>
    </item>
  </channel>
</rss>
