<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vlad L</title>
    <description>The latest articles on DEV Community by Vlad L (@frizzled).</description>
    <link>https://dev.to/frizzled</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F747133%2Faead535d-a13b-4e70-8b59-b16469540224.png</url>
      <title>DEV Community: Vlad L</title>
      <link>https://dev.to/frizzled</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/frizzled"/>
    <language>en</language>
    <item>
      <title>De-identify SQL: Transforming Production Data</title>
      <dc:creator>Vlad L</dc:creator>
      <pubDate>Mon, 06 Dec 2021 15:35:11 +0000</pubDate>
      <link>https://dev.to/frizzled/de-identify-sql-transforming-production-data-19kd</link>
      <guid>https://dev.to/frizzled/de-identify-sql-transforming-production-data-19kd</guid>
      <description>&lt;p&gt;Perhaps you've noticed similar comments where you work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Sales:&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This sales demo needs updated data to demonstrate a new feature."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;QA:&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Regression testing wasn't possible because the database is stale."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Development:&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Our staging environment doesn't contain the necessary data." &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With data-driven applications, our production environment often contains the answer to Sales, QA &amp;amp; Development needs but crossing our fingers and hoping it all works can be problematic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftm2thkxmj17aubdbcs19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftm2thkxmj17aubdbcs19.png" alt="Testing in Production?"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I need a tool that transforms a production database, modifies identifiable fields and creates de-identified data to share with prospective customers, create a development database or populate QA environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  De-identifying SQL: Creating Data
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.npmjs.com/package/de-identify-sql" rel="noopener noreferrer"&gt;De-identify SQL&lt;/a&gt; is a Node.js command-line tool that modifies SQL statements using a customizable strategy file for each table.  For any database column I can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Replace data with a fabricated value.&lt;/li&gt;
&lt;li&gt;Redact data with a constant value.&lt;/li&gt;
&lt;li&gt;Generate new data with JavaScript.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;De-identify SQL accepts a passed-in file or piped input and creates an output .sql or pipes data to another process.  So I can wire this into an automated process or manually create data as needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example Ahead: Creating Fake Users
&lt;/h2&gt;

&lt;p&gt;To begin I need a JSON file for each table I want to modify.  Each strategy file is named after the table it's applied to.  Here's example input SQL for a user table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="nv"&gt;`user`&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nv"&gt;`id`&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'0'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nv"&gt;`email`&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nv"&gt;`full_name`&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nv"&gt;`birth_date`&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nv"&gt;`gov_id`&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;`id`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="nv"&gt;`user`&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Florida.Gibson@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Florida Gibson'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'2000-12-10'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'111-22-3333'&lt;/span&gt;&lt;span class="p"&gt;),(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'godfreymo44@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Godfrey Mosciski'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1980-01-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'333-22-1111'&lt;/span&gt;&lt;span class="p"&gt;),(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Tommie1Quitzon@gmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Tommie Quitzon'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1960-06-07'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'444-55-6666'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any field I list in my strategy JSON file will be modified, all other fields will pass-through.  To modify the above SQL I can use the following file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"columns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"columnKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"full_name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"redactWith"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"NAME REMOVED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tracked"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"columnKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"redactWith"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"internet.email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tracked"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"columnKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"birth_date"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"redactWith"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"createBirthday"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tracked"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"columnKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gov_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"redactWith"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{{datatype.number({&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;min&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:100,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;max&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:999})}}-{{datatype.number({&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;min&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:10,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;max&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:99})}}-{{datatype.number({&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;min&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:1000,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;max&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:9999})}}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tracked"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running De-identify SQL with the above input and JSON strategy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;user.sql | de-identify-sql &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; user-de-identified.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'll receive back (output generated randomly, your results may vary):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="nv"&gt;`user`&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; 
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Titus1@yahoo.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'NAME REMOVED'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1976-05-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'420-15-1747'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Jeremy.Bechtelar@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'NAME REMOVED'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1961-10-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'327-49-5054'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Emiliano.Grimes@gmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'NAME REMOVED'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'2002-11-10'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'106-57-2546'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Under the Hood
&lt;/h2&gt;

&lt;p&gt;I have a lot of flexibility when modifying SQL statements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the above example, &lt;code&gt;full_name&lt;/code&gt; is redacted with a constant: &lt;code&gt;NAME_REMOVED&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;For the &lt;code&gt;email&lt;/code&gt; column, we're calling a &lt;a href="https://rawgit.com/Marak/faker.js/master/examples/browser/index.html" rel="noopener noreferrer"&gt;faker function&lt;/a&gt; to generate a new email address.&lt;/li&gt;
&lt;li&gt;To generate a &lt;code&gt;gov_id&lt;/code&gt;, we use a &lt;a href="https://mustache.github.io/" rel="noopener noreferrer"&gt;mustache&lt;/a&gt;-like template of faker functions.&lt;/li&gt;
&lt;li&gt;Lastly, to create birth_date, we're calling custom JavaScript. This is coming from &lt;code&gt;user.js&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;faker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;faker&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;createBirthday&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;faker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;past&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Preserving Data Relationships
&lt;/h3&gt;

&lt;p&gt;I may also need to work with denormalized or repeating data which needs to be consistently processed to retained the relationships between that data while removing identifying details.  Here I can change the &lt;code&gt;tracked&lt;/code&gt; parameter in the JSON strategy file from &lt;code&gt;false&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt; this will replace an repeated fields with the same fabricated values.  For example, this order SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="nv"&gt;`order`&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; 
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Dorthy.OKeefe@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Belle10@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1002&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Dorthy.OKeefe@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1003&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Garfield.Renner89@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;542&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1004&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Garfield.Renner89@hotmail.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;67&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;52&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this JSON strategy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"columns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"columnKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"order_email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"redactWith"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"internet.email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tracked"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Could become:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="nv"&gt;`order`&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; 
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Demario26@yahoo.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Audra_Stroman@yahoo.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1002&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Demario26@yahoo.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1003&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Rogelio_OHara1@yahoo.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;542&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1004&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'Rogelio_OHara1@yahoo.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;67&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;52&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows us to remove identifying fields while preserving relationships between data elements that the application may require.  This also makes our fabricated data appear more realistic: in certain tables we expect to see fields repeat.&lt;/p&gt;

&lt;h2&gt;
  
  
  Disclaimers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;De-identify-SQL is currently compatible with MySQL and MariaDB.  Feel free to &lt;a href="https://gitlab.com/dbash-public/de-identify-sql/-/issues" rel="noopener noreferrer"&gt;request a different database&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mysqldump&lt;/code&gt; should be used to create input SQL.&lt;/li&gt;
&lt;li&gt;I'm a contributor for &lt;a href="https://www.npmjs.com/package/de-identify-sql" rel="noopener noreferrer"&gt;De-identify SQL&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>sql</category>
      <category>privacy</category>
      <category>devops</category>
    </item>
    <item>
      <title>Remove Protected Data with No-code</title>
      <dc:creator>Vlad L</dc:creator>
      <pubDate>Mon, 08 Nov 2021 17:08:26 +0000</pubDate>
      <link>https://dev.to/frizzled/remove-protected-data-with-no-code-2cng</link>
      <guid>https://dev.to/frizzled/remove-protected-data-with-no-code-2cng</guid>
      <description>&lt;p&gt;&lt;em&gt;&lt;a href="//remove-pii.com"&gt;remove-pii&lt;/a&gt; is a no-code solution that creates shareable CSVs while removing protected data, PII and PHI with just a few clicks.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Working with third parties, contractors and partners who need access to files containing protected data means walking a tightrope between providing useful data without violating privacy, HIPAA or GDPR. &lt;/p&gt;

&lt;h2&gt;
  
  
  Start Simple
&lt;/h2&gt;

&lt;p&gt;Let's remove protected data from a CSV without writing any code using &lt;a href="//gitlab.com/dbash-public/redact-phi/-/raw/main/example/remove_pii_demo.csv?inline=false"&gt;an example file&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jqJnxZnb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/uuzbhtrco3nrt74wz32d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jqJnxZnb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/uuzbhtrco3nrt74wz32d.jpg" alt="Remove protected data with no code" width="872" height="487"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This walk-through will use &lt;a href="//remove-pii.com"&gt;remove-pii.com&lt;/a&gt;. Although we're working with a website, no data is transmitted to any server; all processing is handled in the browser. An offline version is &lt;a href="//www.npmjs.com/package/redact-phi"&gt;also available&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, we'll provide our CSV to the website. This will parse the headers and allow us to replace any data within the file.  Any unset columns pass through to the final file. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--95VtgOe8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8sxlbd2h9znm0uyu1l0w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--95VtgOe8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8sxlbd2h9znm0uyu1l0w.png" alt="De-identifying a file with protected data" width="880" height="651"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  De-identification Examples
&lt;/h2&gt;

&lt;p&gt;Let's set a data-replacement strategy for a column by clicking on "Set Strategy ...".  This will replace data in our document with the function we select. We can use a text search to find data-replacers or scroll through all available approaches:&lt;/p&gt;

&lt;h3&gt;
  
  
  Email address:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--cxsSKrOT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/021wuw6qy268e7a0maol.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cxsSKrOT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/021wuw6qy268e7a0maol.png" alt="De-identify email address" width="762" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Phone number:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8r6r4ffh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3zqmbvy42950xxawawqt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8r6r4ffh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3zqmbvy42950xxawawqt.png" alt="De-identify phone number" width="759" height="263"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Credit card number:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--U7hxqdpa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h47wstao4ugla9wwtuly.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--U7hxqdpa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h47wstao4ugla9wwtuly.png" alt="De-identify credit card number" width="759" height="261"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Bitcoin address:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rVmeqRLB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n26odlrmqqhclc7xjv1h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rVmeqRLB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n26odlrmqqhclc7xjv1h.png" alt="De-identify credit bitcoin address" width="762" height="264"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Order date:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qZ42SHfm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rj1ilbw7zn0ar28yo1fr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qZ42SHfm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rj1ilbw7zn0ar28yo1fr.png" alt="De-identify order date" width="760" height="264"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we have strategies to replace data in our file we can &lt;strong&gt;Save&lt;/strong&gt; the updated file and view the result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--w-vvkX7i--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6h5yha6ezptfet82t6gk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--w-vvkX7i--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6h5yha6ezptfet82t6gk.png" alt="Protected data replaced in CSV" width="880" height="235"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Done!&lt;/strong&gt; With just a few clicks we've replaced all protected data within the example file.  If we need to repeat this process on multiple files we can download the replacement strategy by clicking &lt;strong&gt;Download Strategy&lt;/strong&gt; and use &lt;a href="//www.npmjs.com/package/redact-phi"&gt;redact PII&lt;/a&gt; offline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preserve Data Relationships
&lt;/h2&gt;

&lt;p&gt;What if we need to preserve the relationships within the data while replacing values? Notice that email repeats in our example file (e.g. &lt;em&gt;Nora.OReilly21 and Preston1&lt;/em&gt;) but is replaced with new, different values in the final file. If we want to replace this data with the same new value we'll enable "Preserve" on this column. Now, when we save the updated file any repeated email addresses will be replaced with the same, new value.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--66PLFEAX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/we1ufafvlwm5wtbtfhda.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--66PLFEAX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/we1ufafvlwm5wtbtfhda.png" alt="Preserve relationships in replaced data" width="880" height="182"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Need More?
&lt;/h2&gt;

&lt;p&gt;For those who are (understandably) cautious: just the headers in a CSV can be provided to create a removal strategy which can be downloaded separately.  Then the &lt;a href="//www.npmjs.com/package/redact-phi"&gt;offline application&lt;/a&gt; can be used to remove protected data, PII or PHI.  If necessary, the replacement strategy can be created manually in JSON. For those interested, a dedicated version of remove-pii.com exclusive to your organization can be provided.&lt;/p&gt;

&lt;p&gt;Please contact us at &lt;a href="//dba.sh/contact"&gt;dba.sh&lt;/a&gt; with any questions. &lt;/p&gt;

</description>
      <category>privacy</category>
      <category>productivity</category>
      <category>tutorial</category>
      <category>node</category>
    </item>
  </channel>
</rss>
