<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: John Barku</title>
    <description>The latest articles on DEV Community by John Barku (@john_barku).</description>
    <link>https://dev.to/john_barku</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1863257%2Fd2c325df-c682-46b2-ac88-38a455da914a.png</url>
      <title>DEV Community: John Barku</title>
      <link>https://dev.to/john_barku</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/john_barku"/>
    <language>en</language>
    <item>
      <title>Data Science</title>
      <dc:creator>John Barku</dc:creator>
      <pubDate>Sun, 04 Aug 2024 23:58:06 +0000</pubDate>
      <link>https://dev.to/john_barku/data-science-5906</link>
      <guid>https://dev.to/john_barku/data-science-5906</guid>
      <description>&lt;h2&gt;Data Science&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is Data Science?&lt;/strong&gt;&lt;br&gt;
Data Science is an interdisciplinary field focused on extracting knowledge from data: manipulating and analyzing it, and using it to answer questions or make recommendations.&lt;br&gt;
A data scientist is a professional who writes code and combines it with statistical knowledge to draw insights from data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Categories of Data Science:&lt;/strong&gt;&lt;br&gt;
Data Management: Collecting, persisting, and retrieving data securely, efficiently, and cost-effectively.&lt;br&gt;
Data Integration and Transformation: Extracting, transforming, and loading data (ETL). Data is often distributed across multiple repositories, such as databases.&lt;/p&gt;
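&lt;p&gt;The extract, transform, and load steps can be sketched with Python's standard library alone. This is a minimal illustration, not a production pipeline: the sample data, the &lt;code&gt;sales&lt;/code&gt; table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real repository.&lt;/p&gt;

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or another database.
RAW_CSV = """name,city,amount
alice,Nairobi,120.50
bob,Accra,
carol,Lagos,75.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, capitalize names, convert amounts to float."""
    return [
        (row["name"].title(), row["city"], float(row["amount"]))
        for row in rows
        if row["amount"]
    ]


def load(records, connection):
    """Load: write the cleaned records into a relational table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (name TEXT, city TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    connection.commit()


connection = sqlite3.connect(":memory:")  # in-memory database, for illustration only
load(transform(extract(RAW_CSV)), connection)

# Retrieval: read the persisted rows back out.
stored = connection.execute(
    "SELECT name, city, amount FROM sales ORDER BY name"
).fetchall()
print(stored)
```

&lt;p&gt;Keeping each step in its own function makes it easy to swap the source or the destination later without touching the cleaning logic.&lt;/p&gt;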

&lt;p&gt;Data Visualization: Graphical representation of data and information in charts, plots, maps, and animations. It conveys data more effectively.&lt;/p&gt;

&lt;p&gt;Model Building: You train a model on the data so that it learns the underlying patterns, using a suitable machine learning algorithm.&lt;/p&gt;
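&lt;p&gt;As a rough sketch of what training means, the example below fits a straight line to a few points using ordinary least squares, written in plain Python. The data and the names &lt;code&gt;fit_line&lt;/code&gt; and &lt;code&gt;predict&lt;/code&gt; are made up for illustration; real projects would normally use a library such as Scikit-Learn.&lt;/p&gt;

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours studied versus exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

model = fit_line(hours, scores)  # the "training" step
print(model)
print(predict(model, 6.0))  # use the learned pattern on unseen input
```

&lt;p&gt;The fitted slope and intercept are the model: they summarize the pattern found in the training data and can be reused on new inputs.&lt;/p&gt;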

&lt;p&gt;Model Deployment: Integrating a model into a production environment. The machine learning model is made available to third-party applications through APIs, so they can make data-driven decisions.&lt;/p&gt;
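&lt;p&gt;One way to picture exposing a model through an API is a small HTTP endpoint that accepts JSON and returns a prediction. The sketch below uses only Python's standard library; the &lt;code&gt;MODEL&lt;/code&gt; coefficients, the request shape, the port, and the function names are assumptions for illustration, and a real deployment would use a dedicated serving tool.&lt;/p&gt;

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical trained model: coefficients of a line y = slope * x + intercept.
MODEL = {"slope": 2.0, "intercept": 50.0}


def predict_json(request_body):
    """Turn a JSON request such as {"x": 6} into a JSON prediction."""
    features = json.loads(request_body)
    prediction = MODEL["slope"] * features["x"] + MODEL["intercept"]
    return json.dumps({"prediction": prediction})


class PredictionHandler(BaseHTTPRequestHandler):
    """Expose the model over HTTP so other applications can call it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        response = predict_json(body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Start the server; the port is an arbitrary choice for local testing."""
    HTTPServer(("localhost", port), PredictionHandler).serve_forever()


# The prediction logic can be exercised directly, without starting the server.
print(predict_json('{"x": 6}'))
```

&lt;p&gt;Calling &lt;code&gt;serve()&lt;/code&gt; starts the endpoint, after which any client that can send an HTTP POST request can obtain predictions without knowing how the model works internally.&lt;/p&gt;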

&lt;p&gt;Model Monitoring and Assessment: Monitoring tracks the performance of deployed models, while assessment checks them for accuracy, fairness, and robustness.&lt;/p&gt;
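&lt;p&gt;A very simple form of monitoring is to recompute accuracy on each new batch of labeled predictions and flag the model when it drops too low. The sketch below assumes a threshold of 0.8 and invented batches; it covers accuracy only, and fairness or robustness checks would need their own metrics.&lt;/p&gt;

```python
def accuracy(predictions, labels):
    """Share of predictions that match the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def check_model(predictions, labels, threshold=0.8):
    """Return the accuracy and whether it meets the (assumed) threshold."""
    score = accuracy(predictions, labels)
    return {"accuracy": score, "healthy": score >= threshold}


# Hypothetical batches of live predictions, with true labels that arrived later.
week_1 = check_model([1, 0, 1, 1, 0], [1, 0, 1, 1, 0])
week_2 = check_model([1, 1, 1, 0, 0], [1, 0, 0, 1, 0])
print(week_1)
print(week_2)
```

&lt;p&gt;A batch that falls below the threshold is a signal to investigate the data or retrain the model.&lt;/p&gt;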

&lt;p&gt;&lt;strong&gt;Open Source Tools for Data Science&lt;/strong&gt;&lt;br&gt;
Data Management:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Relational Databases:&lt;br&gt;
a.  MySQL&lt;br&gt;
b.  PostgreSQL&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;NoSQL Databases:&lt;br&gt;
a.  MongoDB&lt;br&gt;
b.  Apache CouchDB&lt;br&gt;
c.  Apache Cassandra&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;File-Based Tools:&lt;br&gt;
a.  Hadoop Distributed File System (HDFS)&lt;br&gt;
b.  Cloud File System&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Data Integration and Transformation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Apache Airflow &lt;/li&gt;
&lt;li&gt;Kubeflow&lt;/li&gt;
&lt;li&gt;Apache Spark SQL&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Data Visualization:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;PixieDust&lt;/li&gt;
&lt;li&gt;Hue &lt;/li&gt;
&lt;li&gt;Kibana&lt;/li&gt;
&lt;li&gt;Apache Superset&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Model Deployment:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Apache PredictionIO&lt;/li&gt;
&lt;li&gt;Seldon&lt;/li&gt;
&lt;li&gt;MLeap&lt;/li&gt;
&lt;li&gt;TensorFlow Serving&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Model Monitoring and Assessment:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;ModelDB&lt;/li&gt;
&lt;li&gt;Prometheus&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Libraries for Data Science&lt;/strong&gt;&lt;br&gt;
Scientific Computing Libraries:&lt;br&gt;
• Pandas&lt;br&gt;
• NumPy&lt;/p&gt;

&lt;p&gt;Visualization Libraries:&lt;br&gt;
• Matplotlib&lt;br&gt;
• Seaborn&lt;/p&gt;

&lt;p&gt;Machine Learning and Deep Learning Libraries:&lt;br&gt;
• Scikit-Learn&lt;br&gt;
• Keras&lt;br&gt;
• TensorFlow&lt;br&gt;
• PyTorch&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
