<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Raj Das</title>
    <description>The latest articles on DEV Community by Raj Das (@rajdas2412).</description>
    <link>https://dev.to/rajdas2412</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1042003%2F3690da7a-d0df-4b17-a89c-165ee6f8aa5d.jpg</url>
      <title>DEV Community: Raj Das</title>
      <link>https://dev.to/rajdas2412</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rajdas2412"/>
    <language>en</language>
    <item>
      <title>Data Ingestion in Snowflake with Google Cloud Storage - Part I</title>
      <dc:creator>Raj Das</dc:creator>
      <pubDate>Fri, 03 May 2024 18:26:11 +0000</pubDate>
      <link>https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-i-1mip</link>
      <guid>https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-i-1mip</guid>
      <description>&lt;p&gt;In this article we will go through the process to ingest data into Snowflake from Google Cloud Storage (GCS). Along the way we will understand the required concepts and tasks involved in a step by step manner.&lt;/p&gt;

&lt;p&gt;You can refer to the official docs &lt;a href="https://docs.snowflake.com/en/user-guide/data-load-gcs-config" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Assumptions&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have an active Google Cloud account with permissions to create IAM roles, GCS buckets, and Pub/Sub topics.&lt;/li&gt;
&lt;li&gt;You have an active Snowflake account with permissions to create Database, Schema, Stage, and Integration objects, and to run queries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7f33cjtkfze6pf7y42l0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7f33cjtkfze6pf7y42l0.png" alt="Data Ingestion Flow"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Data Source: &lt;a href="https://open.canada.ca/en" rel="noopener noreferrer"&gt;Open Data&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a bucket in GCS&lt;/strong&gt; (GCP)&lt;br&gt;
We create a GCS bucket &lt;code&gt;snowflake_gcs-stage-bucket&lt;/code&gt; with a folder &lt;code&gt;raw_csv&lt;/code&gt; inside it, to stage the files that we will later load into Snowflake.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cnidxe9z6rlfd6egn22.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0cnidxe9z6rlfd6egn22.png" alt="GCS Bucket and Folder"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the steps described here, the &lt;strong&gt;ACCOUNTADMIN&lt;/strong&gt; role and the &lt;strong&gt;COMPUTE_WH&lt;/strong&gt; warehouse were used in the Snowflake account.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcftj98kbi3hw4d7whi0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcftj98kbi3hw4d7whi0y.png" alt="Snowflake Account and Warehouse"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a Database&lt;/strong&gt; (Snowflake)&lt;br&gt;
Here we create a database to hold the manufacturing details of various breweries.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE DATABASE manuf_db;

USE manuf_db;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Create Storage Integration Object&lt;/strong&gt; (Snowflake)&lt;br&gt;
A storage integration is a Snowflake object that stores a generated identity and access management (IAM) entity for an external cloud storage location, in our case Google Cloud Storage (GCS). The IAM entity is then granted permissions to access the GCS location within the GCP project.&lt;/p&gt;

&lt;p&gt;In the SQL below, we create a storage integration object that is allowed to access two locations: &lt;code&gt;snowflake_gcs-stage-bucket/&lt;/code&gt; and &lt;code&gt;snowflake_gcs-stage-bucket/raw_csv/&lt;/code&gt;. This object will let us create a stage that references files located externally, i.e. outside of Snowflake.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE STORAGE INTEGRATION snowflake_gcp_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'GCS'
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://snowflake_gcs-stage-bucket/', 'gcs://snowflake_gcs-stage-bucket/raw_csv/');


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you get the error below, make sure the &lt;code&gt;gcs://&lt;/code&gt; URL prefix precedes the GCS bucket/folder location.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaxu2nea684ckim9zinp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaxu2nea684ckim9zinp.png" alt="Integration Error"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

DESC STORAGE INTEGRATION snowflake_gcp_int;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjn2vnjs1cjfgzhcfjhbu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjn2vnjs1cjfgzhcfjhbu.png" alt="Storage Integration Object"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Make a note of the &lt;code&gt;STORAGE_GCP_SERVICE_ACCOUNT&lt;/code&gt; from the output above. This is a service account created by Snowflake, and it needs to be granted appropriate permissions in Google Cloud so that it can access GCS.&lt;br&gt;
You may notice that the naming of the service account looks like it belongs to an AWS account rather than GCP. This is because my Snowflake account is hosted on AWS.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;STORAGE_GCP_SERVICE_ACCOUNT&lt;/code&gt; needs to be granted the following permissions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;storage.buckets.get&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;storage.objects.create&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;storage.objects.delete&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;storage.objects.get&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;storage.objects.list&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Create an IAM Role&lt;/strong&gt; (GCP)&lt;br&gt;
Create a role in Google Cloud and add the required permissions. This step can be performed using the GCP web UI or the CLI. The gcloud CLI command below creates a new role &lt;code&gt;snowflake_gcp_role&lt;/code&gt; and adds the required permissions to it. Replace &lt;strong&gt;GCP-PROJECT-ID&lt;/strong&gt; with your own GCP project ID.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

gcloud iam roles create snowflake_gcp_role --project=GCP-PROJECT-ID \
  --description="This role will be assigned to the Snowflake account to enable GCS access" \
  --permissions=storage.buckets.get,storage.objects.create,storage.objects.delete,storage.objects.get,storage.objects.list


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95oyieg61ybgyupygxyy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95oyieg61ybgyupygxyy.png" alt="GCP Role Creation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Attach Principal and assign Role&lt;/strong&gt; (GCP)&lt;br&gt;
In this step we add the &lt;code&gt;STORAGE_GCP_SERVICE_ACCOUNT&lt;/code&gt; to the GCS bucket and assign the role to it. In the GCP web UI, go to Cloud Storage and open the GCS bucket we created for staging our files. Click on the Permissions tab.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbhj01hlfcldw3diihn3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbhj01hlfcldw3diihn3.png" alt="Add Permissions to GCS"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now click on Grant Access.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcpu8wxypueod4znwtjra.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcpu8wxypueod4znwtjra.png" alt="Grant Access"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Under Add Principals, enter the &lt;code&gt;STORAGE_GCP_SERVICE_ACCOUNT&lt;/code&gt; name, and under Role, select the newly created IAM role.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko47ysy1vwaf9lz55jtu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko47ysy1vwaf9lz55jtu.png" alt="Add Principal and Role"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create an external stage&lt;/strong&gt; (Snowflake)&lt;br&gt;
Here we create an external stage in Snowflake that references data files stored in a GCS location. We provide the name of the &lt;code&gt;STORAGE_INTEGRATION&lt;/code&gt; object created earlier, which has the required access to the external GCS location.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE SCHEMA stages;

CREATE STAGE manuf_db.stages.snowflake_gcp_stage
  URL = ('gcs://snowflake_gcs-stage-bucket/raw_csv/')
  STORAGE_INTEGRATION = snowflake_gcp_int;

SHOW STAGES IN SCHEMA manuf_db.stages;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;List files from the external stage&lt;/strong&gt; (Snowflake)&lt;br&gt;
The LIST command may return empty results if no files have been uploaded to the designated GCS bucket/folder.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiipqk41rfs2fqo97b7c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiipqk41rfs2fqo97b7c.png" alt="List Output"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

LIST @manuf_db.stages.snowflake_gcp_stage;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;LIST&lt;/code&gt; command may throw an error like the one below if the &lt;code&gt;STORAGE_GCP_SERVICE_ACCOUNT&lt;/code&gt; was not granted the appropriate permissions on the GCS bucket/folder.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffdeovd7lbn1cfn9bqd6i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffdeovd7lbn1cfn9bqd6i.png" alt="List Error"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you upload files to the GCS bucket/folder, you will see output similar to the following.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F04e2h0j7zdonayemhovs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F04e2h0j7zdonayemhovs.png" alt="GCS Bucket Folder"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsrcbl92tzi8hlvtua8k5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsrcbl92tzi8hlvtua8k5.png" alt="List Output"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can query a file in the external stage using the SQL below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

SELECT $1, $2, $3 
FROM @manuf_db.stages.snowflake_gcp_stage/beer-manufacturers-microbrewers-brands.csv;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1974c8y9ujtwljngl8jk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1974c8y9ujtwljngl8jk.png" alt="Stage File Output"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice that the header row shows up as part of the content of the staged file. We need to skip the header to avoid this. We can create and reuse a file format object for this purpose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a file format&lt;/strong&gt; (Snowflake)&lt;br&gt;
A file format is a named Snowflake object that describes a particular file type, such as CSV or JSON, along with attributes associated with that format. It comes in handy for describing the staged files when loading them into Snowflake tables.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE SCHEMA file_formats;

CREATE OR REPLACE FILE FORMAT manuf_db.file_formats.file_format_csv
TYPE = CSV
FIELD_DELIMITER = ','
SKIP_HEADER = 1
FIELD_OPTIONALLY_ENCLOSED_BY = '"';

SHOW FILE FORMATS IN SCHEMA manuf_db.file_formats;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
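To make the two CSV options concrete, here is a local command-line sketch (not Snowflake) of what SKIP_HEADER = 1 does: the loader ignores the first line and starts reading data from line 2, and FIELD_OPTIONALLY_ENCLOSED_BY = '"' tells it to strip the surrounding double quotes from quoted fields. The file contents below are made-up sample rows, not from the real dataset.

```shell
# Hypothetical two-line CSV: a header row plus one quoted data row.
printf 'type,name,brand\nMicrobrewer,"Ace Brewing","Ace Lager"\n' > sample.csv

# SKIP_HEADER = 1 is equivalent to starting from line 2 of the file:
tail -n +2 sample.csv
```

Running the last command prints only the data row, with the quotes still present; stripping those quotes is what FIELD_OPTIONALLY_ENCLOSED_BY handles on the Snowflake side.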

&lt;p&gt;&lt;strong&gt;Create Table&lt;/strong&gt; (Snowflake)&lt;br&gt;
We create a table in &lt;code&gt;manuf_db&lt;/code&gt;. The &lt;code&gt;public&lt;/code&gt; schema is the default schema in Snowflake.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE TABLE manuf_db.public.manuf_tbl(
    manuf_type  varchar(20),
    manuf_name  varchar(100),
    manuf_brand varchar(100)
);

SELECT * FROM manuf_db.public.manuf_tbl;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Loading data into Table&lt;/strong&gt; (Snowflake)&lt;br&gt;
The COPY INTO command loads the data from the staged file &lt;code&gt;beer-manufacturers-microbrewers-brands.csv&lt;/code&gt; into the existing table &lt;code&gt;manuf_tbl&lt;/code&gt;, referencing the stage &lt;code&gt;snowflake_gcp_stage&lt;/code&gt; created earlier. Running COPY INTO multiple times &lt;strong&gt;will not&lt;/strong&gt; duplicate the data unless the optional &lt;code&gt;FORCE = TRUE&lt;/code&gt; clause is specified.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

COPY INTO manuf_db.public.manuf_tbl
FROM @manuf_db.stages.snowflake_gcp_stage
FILE_FORMAT = manuf_db.file_formats.file_format_csv
FILES = ('beer-manufacturers-microbrewers-brands.csv');

SELECT COUNT(*) FROM manuf_db.public.manuf_tbl;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61znrfnvfov3c1tyne0j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61znrfnvfov3c1tyne0j.png" alt="Table Output"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These are all the Snowflake objects that we created in our account.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38c21cijla8zk1e8usxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38c21cijla8zk1e8usxl.png" alt="Snowflake Objects"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clean Up&lt;/strong&gt; (Snowflake)&lt;br&gt;
Clean up by deleting the data from the table and also removing the staged file from GCS.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

DELETE FROM manuf_db.public.manuf_tbl;

REMOVE @manuf_db.stages.snowflake_gcp_stage PATTERN='.*.csv';


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F233qvwpwoz5biwto09zq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F233qvwpwoz5biwto09zq.png" alt="Delete Stage File"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-ii-3l96"&gt;Part-II&lt;/a&gt; of Data Ingestion, we will use Snowpipe to auto-load data into Snowflake.&lt;/p&gt;

&lt;p&gt;Kindly let me know if this article was helpful. Your feedback is highly appreciated.&lt;/p&gt;

</description>
      <category>snowflake</category>
      <category>gcp</category>
      <category>dataengineering</category>
      <category>data</category>
    </item>
    <item>
      <title>Data Ingestion in Snowflake with Google Cloud Storage - Part II</title>
      <dc:creator>Raj Das</dc:creator>
      <pubDate>Fri, 03 May 2024 18:23:57 +0000</pubDate>
      <link>https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-ii-3l96</link>
      <guid>https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-ii-3l96</guid>
      <description>&lt;p&gt;&lt;a href="https://dev.tourl"&gt;&lt;/a&gt;In this article we will go through the process to auto ingest data into Snowflake from Google Cloud Storage (GCS) using a Snow Pipe. Along the way we will understand the required concepts and tasks involved in a step by step manner.&lt;/p&gt;

&lt;p&gt;You can refer to the official docs &lt;a href="https://docs.snowflake.com/en/user-guide/data-load-snowpipe-auto-gcs#cloud-platform-support" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Assumptions&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have an active Google Cloud account with permission to create IAM roles, GCS buckets, and Pub/Sub topics.&lt;/li&gt;
&lt;li&gt;You have an active Snowflake account with permission to create Database, Schema, Stage, and Integration objects.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Data Source:&lt;/strong&gt; &lt;a href="https://open.canada.ca/en" rel="noopener noreferrer"&gt;Open Data&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Snowpipe&lt;/strong&gt; is an event-based data ingestion service that loads data from files as soon as they arrive at a designated staging location. We will use Google Cloud Storage (GCS) in this example to load data into Snowflake; Google Pub/Sub will notify the pipe as soon as a file is placed in GCS.&lt;/p&gt;

&lt;p&gt;As a pre-requisite, refer to the &lt;a href="https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-i-1mip"&gt;Part-I&lt;/a&gt; article and create all the objects mentioned there before continuing further.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a Topic&lt;/strong&gt; (GCP)&lt;br&gt;
We need to create a topic in Google Pub/Sub. A message will be published to this topic whenever a new file is uploaded to the GCS bucket/folder.&lt;/p&gt;

&lt;p&gt;Here we are creating a topic &lt;code&gt;snowpipe_gcs_topic&lt;/code&gt;. The gcloud CLI command below will create the topic if it does not already exist. It will also verify that the GCS bucket has permission to publish events to this topic and grant the required permission if necessary. The &lt;code&gt;OBJECT_FINALIZE&lt;/code&gt; event is sent out when a new object is successfully created in the bucket.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

gcloud storage buckets notifications create gs://snowflake_gcs-stage-bucket \
  --topic=snowpipe_gcs_topic --event-types=OBJECT_FINALIZE --payload-format=json

gcloud pubsub topics describe snowpipe_gcs_topic


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;PubSub Topic created&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3n8efco2sdegn5i3h8j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3n8efco2sdegn5i3h8j.png" alt="PubSub Topic"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a Subscription&lt;/strong&gt; (GCP)&lt;br&gt;
Next we need to create a subscription to the topic so that we get notified whenever a new message is published to it. Here we create a subscription &lt;code&gt;snowpipe_gcs_sub&lt;/code&gt; to the topic &lt;code&gt;snowpipe_gcs_topic&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

gcloud pubsub subscriptions create snowpipe_gcs_sub --topic=snowpipe_gcs_topic

gcloud pubsub subscriptions describe snowpipe_gcs_sub


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkftnmo9gp3vuu996a87.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkftnmo9gp3vuu996a87.png" alt="Subscription Details"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Make sure that the subscription state is active.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuviw7xx4y88red6w98yu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuviw7xx4y88red6w98yu.png" alt="Subscription Active"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service Account Permission&lt;/strong&gt; (GCP)&lt;br&gt;
A service account is created by Snowflake when a storage integration object is created. This service account needs to be granted the Pub/Sub Subscriber role.&lt;/p&gt;

&lt;p&gt;Please refer to &lt;a href="https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-i-1mip"&gt;Part-I&lt;/a&gt; for more details on how to create a Storage Integration Object and provide appropriate permissions.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjn2vnjs1cjfgzhcfjhbu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjn2vnjs1cjfgzhcfjhbu.png" alt="Storage Integration Object"&gt;&lt;/a&gt;&lt;br&gt;
Select the Subscription name in PubSub UI. Click on View Permissions.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgensxc5laq8mvbw6e9j9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgensxc5laq8mvbw6e9j9.png" alt="Select Subscription"&gt;&lt;/a&gt;&lt;br&gt;
Click on Add Principal.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8era2gbcbo7q2irdfo0b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8era2gbcbo7q2irdfo0b.png" alt="Add Principal"&gt;&lt;/a&gt;&lt;br&gt;
Input the Service Account name and add the PubSub Subscriber Role to the Service Account.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7q0gvxk4fiv66jgmgek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh7q0gvxk4fiv66jgmgek.png" alt="PubSub Permission"&gt;&lt;/a&gt;&lt;br&gt;
Navigate to the Dashboard page in the Cloud Console, and select your project from the dropdown list.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fznrrg0e2mpq1pl5rdpht.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fznrrg0e2mpq1pl5rdpht.png" alt="Dashboard"&gt;&lt;/a&gt;&lt;br&gt;
Click the ADD PEOPLE TO THIS PROJECT button.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi46l3s09n4d9pszr7kan.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi46l3s09n4d9pszr7kan.png" alt="Add People to Project"&gt;&lt;/a&gt;&lt;br&gt;
Add the service account name. From the Select a role dropdown, select Monitoring Viewer.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpz8nj32wu7viz4ggvdx4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpz8nj32wu7viz4ggvdx4.png" alt="Monitoring Viewer"&gt;&lt;/a&gt;&lt;br&gt;
Click the Save button. The service account name is added to the Monitoring Viewer role.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a Notification Integration&lt;/strong&gt; (Snowflake)&lt;br&gt;
The notification integration references the Pub/Sub subscription in Google Cloud. Snowflake associates the notification integration with a GCP service account created for your Snowflake account; this service account was created when we created the storage integration object.&lt;/p&gt;

&lt;p&gt;Replace the &lt;code&gt;GCP-PROJECT-ID&lt;/code&gt; in the below command with your own project ID.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE NOTIFICATION INTEGRATION snowflake_gcp_pubsub_int
  TYPE = QUEUE
  NOTIFICATION_PROVIDER = GCP_PUBSUB
  ENABLED = true
  GCP_PUBSUB_SUBSCRIPTION_NAME = 'projects/GCP-PROJECT-ID/subscriptions/snowpipe_gcs_sub';

DESC NOTIFICATION INTEGRATION snowflake_gcp_pubsub_int;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhildg29tch4grqax5dcp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhildg29tch4grqax5dcp.png" alt="Notification Integration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a Snowpipe&lt;/strong&gt; (Snowflake)&lt;br&gt;
Let's create a pipe that will auto-ingest data into the Snowflake table as files arrive in the GCS location.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

CREATE SCHEMA manuf_db.pipes;

CREATE PIPE manuf_db.pipes.snow_gcs_pipe
AUTO_INGEST = true
INTEGRATION = 'SNOWFLAKE_GCP_PUBSUB_INT'
AS
COPY INTO manuf_db.public.manuf_tbl
FROM @manuf_db.stages.snowflake_gcp_stage
PATTERN = '.*manufacturers.*'
FILE_FORMAT = manuf_db.file_formats.file_format_csv;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

DESC PIPE manuf_db.pipes.snow_gcs_pipe;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7jscf19v78aahwml0twh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7jscf19v78aahwml0twh.png" alt="Pipe Defination"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

SHOW PIPES IN SCHEMA manuf_db.pipes;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Check the status of the Snowpipe. Its &lt;code&gt;executionState&lt;/code&gt; should be RUNNING for it to work correctly. Make sure the service account has been granted the correct permissions in GCP (as described above).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

SELECT SYSTEM$PIPE_STATUS('manuf_db.pipes.snow_gcs_pipe');


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga7u1idv5yue9xfxhrtf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fga7u1idv5yue9xfxhrtf.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Upload files in GCS&lt;/strong&gt; (GCP)&lt;br&gt;
Upload a file to the stage bucket/folder. As soon as the file lands, a message is published to the topic &lt;code&gt;snowpipe_gcs_topic&lt;/code&gt;.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1esyz3n61586gf4a2i63.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1esyz3n61586gf4a2i63.png" alt="GCS File Upload"&gt;&lt;/a&gt;&lt;br&gt;
Open the Pub/Sub subscription and switch to the Metrics tab to view the published messages.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2mrdkonktky99livqlvp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2mrdkonktky99livqlvp.png" alt="Pub/Sub Message"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The file should now be loaded into &lt;code&gt;manuf_tbl&lt;/code&gt; in Snowflake. The ingestion typically takes a minute or two.&lt;/p&gt;
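&lt;p&gt;Note that Snowpipe only acts on event notifications for files uploaded after the pipe is created. If files were already sitting in the bucket beforehand, you can queue them manually with a refresh, which picks up files staged within the last 7 days:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

ALTER PIPE manuf_db.pipes.snow_gcs_pipe REFRESH;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;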

&lt;p&gt;You can upload multiple files to the GCS bucket and Snowpipe will ingest all of them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41xbuwawxxq8zy6e7mvc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41xbuwawxxq8zy6e7mvc.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftkw9pvo5qnh8cce3o2l0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftkw9pvo5qnh8cce3o2l0.png" alt="Subscription Metrics"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verify Ingested Data&lt;/strong&gt; (Snowflake)&lt;br&gt;
Run the SQL below to verify that the data has been loaded into the table.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

SELECT * FROM manuf_db.public.manuf_tbl;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhlnptfb9mxdhhqye6jt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhlnptfb9mxdhhqye6jt.png" alt="Table Data"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

SELECT count(*) FROM manuf_db.public.manuf_tbl;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrn18yzs4xtyax35v4za.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frrn18yzs4xtyax35v4za.png" alt="Count"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The copy history of the loaded files can also be viewed in Snowflake. &lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fna1hrke89nlgsde31let.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fna1hrke89nlgsde31let.png" alt="Copy History"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Below is the output of &lt;code&gt;SYSTEM$PIPE_STATUS&lt;/code&gt;. This can be very handy for debugging: it shows the last file that was ingested along with the timestamp.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

SELECT SYSTEM$PIPE_STATUS('manuf_db.pipes.snow_gcs_pipe');


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau58ol1rj7lzqujhfhek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau58ol1rj7lzqujhfhek.png" alt="SnowPipe Status"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All objects created in the Snowflake account (&lt;a href="https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-i-1mip"&gt;Part-I&lt;/a&gt; and &lt;a href="https://dev.to/rajdas2412/data-ingestion-in-snowflake-with-google-cloud-storage-part-ii-3l96"&gt;Part-II&lt;/a&gt;)&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiza2c2856tcyfrrpkf6j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiza2c2856tcyfrrpkf6j.png" alt="Snowflake Objects"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fun Fact&lt;/strong&gt;&lt;br&gt;
Based on the dataset used, as of 2020-11-01 there were 543 beer manufacturers in Ontario (ON) producing 1,403 different kinds of beer.&lt;/p&gt;
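&lt;p&gt;These numbers can be reproduced with a query along the following lines. The column names here are illustrative; substitute the actual manufacturer and beer-name columns of &lt;code&gt;manuf_tbl&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

-- manufacturer_name and beer_name are assumed column names
SELECT COUNT(DISTINCT manufacturer_name) AS manufacturers,
       COUNT(DISTINCT beer_name)         AS beers
FROM manuf_db.public.manuf_tbl;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;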

&lt;p&gt;Kindly let me know if this article was helpful. Your feedback is highly appreciated.&lt;/p&gt;

</description>
      <category>snowflake</category>
      <category>gcp</category>
      <category>dataengineering</category>
      <category>data</category>
    </item>
  </channel>
</rss>
