<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sohil Shah</title>
    <description>The latest articles on DEV Community by Sohil Shah (@bugsbunnyshah).</description>
    <link>https://dev.to/bugsbunnyshah</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1423431%2F93ec97d3-0e34-48a8-8090-d8fe976e06c9.jpeg</url>
      <title>DEV Community: Sohil Shah</title>
      <link>https://dev.to/bugsbunnyshah</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bugsbunnyshah"/>
    <language>en</language>
    <item>
      <title>Data Pipeline Techniques in Action</title>
      <dc:creator>Sohil Shah</dc:creator>
      <pubDate>Sun, 25 Aug 2024 17:40:11 +0000</pubDate>
      <link>https://dev.to/bugsbunnyshah/data-pipeline-techniques-in-action-4m9o</link>
      <guid>https://dev.to/bugsbunnyshah/data-pipeline-techniques-in-action-4m9o</guid>
      <description>&lt;p&gt;Take a deep dive into the architectural concepts of data pipelines, followed by a hands-on implementation tutorial that demonstrates those concepts in action.&lt;/p&gt;

&lt;p&gt;The topics covered are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Data pipeline architecture&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;High-scale data ingestion&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data transformation and processing&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data storage&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Staging data delivery&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Operational data&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hands-on exercise&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Article: &lt;a href="https://dzone.com/articles/data-pipeline-techniques-in-action" rel="noopener noreferrer"&gt;https://dzone.com/articles/data-pipeline-techniques-in-action&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1fpd1jnq77jv68blzgkk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1fpd1jnq77jv68blzgkk.png" alt="Data Pipeline Architecture" width="771" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>datascience</category>
      <category>softwaredevelopment</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Open Source High-Scale Data Pipeline Platform for Enterprise Data, Analytics, and Machine Learning Applications</title>
      <dc:creator>Sohil Shah</dc:creator>
      <pubDate>Sat, 13 Apr 2024 22:41:21 +0000</pubDate>
      <link>https://dev.to/bugsbunnyshah/open-source-high-scale-data-pipeline-platform-for-enterprise-data-analytics-and-machine-learning-applications-ld0</link>
      <guid>https://dev.to/bugsbunnyshah/open-source-high-scale-data-pipeline-platform-for-enterprise-data-analytics-and-machine-learning-applications-ld0</guid>
      <description>&lt;p&gt;Braineous is designed for an optimal out-of-the-box experience for developers focused on &lt;strong&gt;ETL&lt;/strong&gt;, &lt;strong&gt;ELT&lt;/strong&gt;, &lt;strong&gt;Analytics&lt;/strong&gt; and &lt;strong&gt;Machine Learning&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documentation&lt;/strong&gt;: &lt;a href="https://bugsbunnyshah.github.io/braineous/guides/developer-guide"&gt;https://bugsbunnyshah.github.io/braineous/guides/developer-guide&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get Started&lt;/strong&gt;: &lt;a href="https://bugsbunnyshah.github.io/braineous/get-started/"&gt;https://bugsbunnyshah.github.io/braineous/get-started/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/bugsbunnyshah/braineous_dataplatform"&gt;https://github.com/bugsbunnyshah/braineous_dataplatform&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;License&lt;/strong&gt;: &lt;a href="https://github.com/bugsbunnyshah/braineous_dataplatform/blob/main/LICENSE"&gt;https://github.com/bugsbunnyshah/braineous_dataplatform/blob/main/LICENSE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Roadmap&lt;/strong&gt;: &lt;a href="https://bugsbunnyshah.github.io/braineous/about/"&gt;https://bugsbunnyshah.github.io/braineous/about/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apache Kafka&lt;/strong&gt; is the backbone for high-scale data ingestion and for maintaining the source of truth for the data. In the future, it will also be used for CDC (change data capture), for time travel to a past state of the system, and for training AI models for predictive analytics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;More details&lt;/strong&gt;: &lt;a href="https://bugsbunnyshah.github.io/braineous/container-first/"&gt;https://bugsbunnyshah.github.io/braineous/container-first/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The downstream engine is &lt;strong&gt;Apache Flink&lt;/strong&gt;. To use a biological analogy: if &lt;strong&gt;Apache Flink&lt;/strong&gt; is the brain, then &lt;strong&gt;Apache Kafka&lt;/strong&gt; is the spinal cord.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;More details&lt;/strong&gt;: &lt;a href="https://bugsbunnyshah.github.io/braineous/about/"&gt;https://bugsbunnyshah.github.io/braineous/about/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Braineous is built on &lt;strong&gt;Apache Flink&lt;/strong&gt; as its data processing engine and supports &lt;strong&gt;Apache Hive&lt;/strong&gt;-based data lakes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Future releases&lt;/strong&gt; of Braineous will include a &lt;strong&gt;Data Lake Connector framework&lt;/strong&gt; that can support custom data lakes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;More details&lt;/strong&gt;: &lt;a href="https://bugsbunnyshah.github.io/braineous/data-lake/"&gt;https://bugsbunnyshah.github.io/braineous/data-lake/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Braineous bridges unstructured datasets to structured datasets on the fly, so your data lake evolves with the dataset. This matters because &lt;strong&gt;Analytics&lt;/strong&gt; and &lt;strong&gt;Machine Learning&lt;/strong&gt; need structured queries, for example when training an AI model.&lt;/p&gt;

&lt;p&gt;Braineous bridges these two worlds on the fly. Downtime is entirely unacceptable for Braineous.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;More details:&lt;/strong&gt; &lt;a href="https://bugsbunnyshah.github.io/braineous/developer-joy/"&gt;https://bugsbunnyshah.github.io/braineous/developer-joy/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We would love your feedback on developer experience, ease of use, and the ability to go from 0 to 60 in 15 minutes with data processing.&lt;/p&gt;

&lt;p&gt;Developer input would be valuable in shaping the roadmap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Feedback&lt;/strong&gt;: &lt;a href="https://github.com/bugsbunnyshah/braineous_dataplatform/discussions/16"&gt;https://github.com/bugsbunnyshah/braineous_dataplatform/discussions/16&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Sohil&lt;/p&gt;

</description>
      <category>etl</category>
      <category>elt</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
