<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Santhoshkumar. P</title>
    <description>The latest articles on DEV Community by Santhoshkumar. P (@sann3).</description>
    <link>https://dev.to/sann3</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F304924%2F867f1d04-167f-4e9e-9c94-e94bd405d79c.jpg</url>
      <title>DEV Community: Santhoshkumar. P</title>
      <link>https://dev.to/sann3</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sann3"/>
    <language>en</language>
    <item>
      <title>BigQuery dynamic SQL and managing temp tables</title>
      <dc:creator>Santhoshkumar. P</dc:creator>
      <pubDate>Fri, 23 Apr 2021 14:13:24 +0000</pubDate>
      <link>https://dev.to/sann3/bigquery-dynamic-sql-and-managing-temp-tables-5c43</link>
      <guid>https://dev.to/sann3/bigquery-dynamic-sql-and-managing-temp-tables-5c43</guid>
      <description>&lt;p&gt;Google introduced support for dynamic SQL in BigQuery. Developers working particularly in Oracle must have some liking for EXECUTE IMMEDIATE, the way you execute dynamic SQL queries. Such a feature in BigQuery was missing for a long time, and now that it is here, I can't wait to use it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choosing a problem statement
&lt;/h3&gt;

&lt;p&gt;Let's choose a problem that easily resonates with every developer working with the Google BigQuery world. Who isn't noticing the large volume of temporary tables churned by the client drivers and large datasets. This is particularly true where downstream products implement a version of BigQuery driver and fail to leverage nice features like auto expiration of tables. Not so good part is the hygiene of the dataset, these tables stay forever until explicitly cleared.  What is important for this blog is a problem statement to demonstrate the utility of dynamic SQL.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lets address it using Dynamic SQL
&lt;/h3&gt;

&lt;p&gt;Temporary tables do offer the convenience of caching large result sets. With data rapidly changing on BigQuery dataset, let us target the old temporary tables and remove those from the datasets. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Our primary goal is to clear all temporary tables older than 24 hours. &lt;/li&gt;
&lt;li&gt;Achieving this goal needs some more information. We need to identify when a table was created. This is when INFORMATION_SCHEMA of BigQuery is helpful. &lt;/li&gt;
&lt;li&gt;Last step is that I want this to be scheduled every day, without my intervention. Yes, you can schedule SQL statements using the BigQuery scheduled query feature. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To clear temporary tables across all datasets, let's write code employing dynamic SQL, iterate all the dataset using the INFORMATION_SCHEMA and delete the temp table using the timestamp and the name starting with temp_table_. And schedule the SQL code using the BigQuery scheduled query option. With this, all the temp tables that are older than 1 day should get automatically cleared at a daily cadence.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


</description>
    </item>
    <item>
      <title>Data Platform zones
</title>
      <dc:creator>Santhoshkumar. P</dc:creator>
      <pubDate>Wed, 01 Jan 2020 02:43:53 +0000</pubDate>
      <link>https://dev.to/sann3/data-platform-zone-names-a8e</link>
      <guid>https://dev.to/sann3/data-platform-zone-names-a8e</guid>
      <description>&lt;p&gt;I was in search of suitable names for zones in a data platform, and this is what I have until now.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access zone&lt;/li&gt;
&lt;li&gt;Additional zone&lt;/li&gt;
&lt;li&gt;Analytics zone&lt;/li&gt;
&lt;li&gt;Archive data zone&lt;/li&gt;
&lt;li&gt;Canonical data zone&lt;/li&gt;
&lt;li&gt;Certified zone&lt;/li&gt;
&lt;li&gt;Clean zone&lt;/li&gt;
&lt;li&gt;Cleansing zone&lt;/li&gt;
&lt;li&gt;Consumer zone&lt;/li&gt;
&lt;li&gt;Consumption zone&lt;/li&gt;
&lt;li&gt;Curated zone&lt;/li&gt;
&lt;li&gt;Dev zone&lt;/li&gt;
&lt;li&gt;Exploration zone&lt;/li&gt;
&lt;li&gt;Gold zone&lt;/li&gt;
&lt;li&gt;Insights zone&lt;/li&gt;
&lt;li&gt;Landing zone&lt;/li&gt;
&lt;li&gt;Master data zone&lt;/li&gt;
&lt;li&gt;Operationalization zone&lt;/li&gt;
&lt;li&gt;Persisted zone&lt;/li&gt;
&lt;li&gt;Process zone&lt;/li&gt;
&lt;li&gt;Production zone&lt;/li&gt;
&lt;li&gt;Published zone&lt;/li&gt;
&lt;li&gt;Raw zone&lt;/li&gt;
&lt;li&gt;Refined zone&lt;/li&gt;
&lt;li&gt;Refinery zone&lt;/li&gt;
&lt;li&gt;Reporting zone&lt;/li&gt;
&lt;li&gt;Sandbox zone&lt;/li&gt;
&lt;li&gt;Sensitive zone&lt;/li&gt;
&lt;li&gt;Silver zone&lt;/li&gt;
&lt;li&gt;Staging zone&lt;/li&gt;
&lt;li&gt;Standard zone&lt;/li&gt;
&lt;li&gt;Structured zone&lt;/li&gt;
&lt;li&gt;Temporal Zone&lt;/li&gt;
&lt;li&gt;Transformed zone&lt;/li&gt;
&lt;li&gt;Transient zone&lt;/li&gt;
&lt;li&gt;Trusted zone&lt;/li&gt;
&lt;li&gt;User Drop zone&lt;/li&gt;
&lt;li&gt;Work zone&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt;&lt;br&gt;
Public blogs and images. &lt;/p&gt;

</description>
      <category>dataplatform</category>
      <category>zone</category>
      <category>data</category>
      <category>platform</category>
    </item>
  </channel>
</rss>
